Full Observability Stack on k3s: Prometheus, Loki, Jaeger, Grafana, and Cloudflare Logpush
A complete guide to building a full observability stack on a 4-node ARM64 k3s homelab cluster. No Helm — everything is raw Kustomize manifests. The stack covers metrics (Prometheus + Alertmanager), logging (Loki + Alloy), tracing (Jaeger), and visualization (Grafana with 14 dashboards). On top of the standard LGTM stack, Cloudflare Logpush feeds HTTP request logs, firewall events, and Workers traces through a custom Traefik decompression plugin into Loki for security analytics and performance monitoring.
The guide is structured as a linear build-up: Prometheus Operator and core metrics first, then Loki and log collection, then Jaeger tracing, then Grafana with dashboards and SSO, then the Cloudflare Logpush pipeline with its custom Traefik plugin. Each section includes the actual manifests used in production.
Architecture Overview
The cluster runs on 4x ARM64 Rock boards (rock1-rock4) on a 10.0.71.x LAN behind a VyOS router with a PPPoE WAN link. All HTTP traffic enters via Cloudflare Tunnel through Traefik. The monitoring stack runs entirely in the monitoring namespace.
Component versions
| Component | Version | Image |
|---|---|---|
| Prometheus Operator | v0.89.0 | quay.io/prometheus-operator/prometheus-operator:v0.89.0 |
| Prometheus | v3.9.1 | quay.io/prometheus/prometheus:v3.9.1 |
| Alertmanager | v0.31.1 | quay.io/prometheus/alertmanager:v0.31.1 |
| kube-state-metrics | v2.18.0 | registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0 |
| Node Exporter | v1.10.2 | quay.io/prometheus/node-exporter:v1.10.2 |
| Blackbox Exporter | v0.28.0 | quay.io/prometheus/blackbox-exporter:v0.28.0 |
| Grafana | 12.3.3 | docker.io/grafana/grafana:12.3.3 |
| k8s-sidecar | 2.5.0 | quay.io/kiwigrid/k8s-sidecar:2.5.0 |
| Loki | 3.6.5 | docker.io/grafana/loki:3.6.5 |
| Grafana Alloy | v1.13.0 | docker.io/grafana/alloy:v1.13.0 |
| Jaeger | 2.15.1 | docker.io/jaegertracing/jaeger:2.15.1 |
External access
All monitoring UIs are exposed via Cloudflare Tunnel through Traefik IngressRoutes:
| Service | URL | IngressRoute |
|---|---|---|
| Grafana | https://grafana-k3s.example.io | ingressroutes/grafana-ingress.yaml |
| Prometheus | https://prom-k3s.example.io | ingressroutes/prometheus-ingress.yaml |
| Alertmanager | https://alertmanager-k3s.example.io | ingressroutes/alertmanager-ingress.yaml |
| Jaeger | https://jaeger-k3s.example.io | ingressroutes/jaeger-ingress.yaml |
DNS CNAME records and Cloudflare tunnel ingress rules are managed by OpenTofu in cloudflare-tunnel-tf/.
Part 1: Prometheus Operator
Why raw manifests instead of Helm
The entire stack is deployed as raw Kustomize manifests. No Helm. This gives full visibility into every resource, avoids Helm’s template abstraction layer, and makes it straightforward to patch individual fields. The trade-off is manual version bumps, which is acceptable for a homelab.
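For orientation, the top-level kustomization that composes the per-component directories looks roughly like this. This is a sketch assembled from the file reference at the end of this guide, not the verbatim file:

```yaml
# monitoring/kustomization.yaml -- illustrative sketch
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring

resources:
  - namespace.yaml
  - operator/
  - prometheus/
  - alertmanager/
  - grafana/
  - loki/
  - alloy/
  - alloy-logpush/
  - jaeger/
  - kube-state-metrics/
  - node-exporter/
  - blackbox-exporter/
  - servicemonitors/
  - probes/
```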
Operator CRDs
The Prometheus Operator provides 10 CRDs totaling ~3.7 MB:
```yaml
resources:
  # CRDs (must be applied before operator)
  - crd-alertmanagerconfigs.yaml
  - crd-alertmanagers.yaml
  - crd-podmonitors.yaml
  - crd-probes.yaml
  - crd-prometheusagents.yaml
  - crd-prometheuses.yaml
  - crd-prometheusrules.yaml
  - crd-scrapeconfigs.yaml
  - crd-servicemonitors.yaml
  - crd-thanosrulers.yaml
  # Operator RBAC and workload
  - serviceaccount.yaml
  - clusterrole.yaml
  - clusterrolebinding.yaml
  - deployment.yaml
  - service.yaml
  - servicemonitor.yaml
  - webhook.yaml
```

The webhook cert-gen Jobs must complete before the operator Deployment starts. Kustomize handles ordering if everything is in the same kustomization.
Prometheus CR
The operator manages Prometheus via a Prometheus custom resource. It creates a StatefulSet (prometheus-prometheus), a config-reloader sidecar, and handles all ServiceMonitor/PrometheusRule reconciliation:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  version: v3.9.1
  image: quay.io/prometheus/prometheus:v3.9.1
  replicas: 1
  serviceAccountName: prometheus

  retention: 7d
  retentionSize: 8GB

  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: nfs-client
        accessModes: [ReadWriteMany]
        resources:
          requests:
            storage: 10Gi

  # Config reloader sidecar resources -- uses strategic merge patch
  containers:
    - name: config-reloader
      resources:
        requests:
          cpu: 10m
          memory: 25Mi
        limits:
          cpu: 50m
          memory: 50Mi

  walCompression: true
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 2Gi

  # Selectors -- all match `release: prometheus` label
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  serviceMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: prometheus
  podMonitorNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: prometheus
  probeNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      release: prometheus
  ruleNamespaceSelector: {}
  scrapeConfigSelector:
    matchLabels:
      release: prometheus
  scrapeConfigNamespaceSelector: {}

  alerting:
    alertmanagers:
      - namespace: monitoring
        name: alertmanager
        port: http-web
        apiVersion: v2

  securityContext:
    fsGroup: 65534
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
    seccompProfile:
      type: RuntimeDefault

  externalUrl: https://prom-k3s.example.io
```

ServiceMonitors
18 ServiceMonitors scrape targets across the cluster. The release: prometheus label is the common selector:
| ServiceMonitor | Namespace | Target |
|---|---|---|
| prometheus-operator | monitoring | Operator metrics |
| prometheus | monitoring | Prometheus self-metrics |
| alertmanager | monitoring | Alertmanager metrics |
| grafana | monitoring | Grafana metrics |
| kube-state-metrics | monitoring | kube-state-metrics |
| node-exporter | monitoring | Node Exporter (all nodes) |
| blackbox-exporter | monitoring | Blackbox Exporter |
| loki | monitoring | Loki metrics |
| alloy | monitoring | Grafana Alloy (DaemonSet) |
| alloy-logpush | monitoring | Alloy Logpush receiver |
| jaeger | monitoring | Jaeger metrics |
| traefik | traefik | Traefik ingress controller |
| cloudflared | cloudflared | Cloudflare tunnel daemon |
| authentik-metrics | authentik | Authentik server |
| revista | revista | Revista app |
| kubelet | kube-system | Kubelet + cAdvisor |
| coredns | kube-system | CoreDNS |
| apiserver | default | Kubernetes API server |
Cross-namespace ServiceMonitors (traefik, cloudflared, authentik, revista, kubelet, coredns, apiserver) live in monitoring/servicemonitors/ and use namespaceSelector.matchNames to reach across namespaces.
The kubelet ServiceMonitor scrapes three endpoints from the same port: /metrics (kubelet), /metrics/cadvisor (container metrics), and /metrics/probes (probe metrics). All use bearer token auth against the k8s API server CA.
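A sketch of that kubelet ServiceMonitor is below. The three paths, bearer-token auth, and the `release: prometheus` label come from the description above; the port name and `k8s-app: kubelet` selector are assumptions based on the usual kube-prometheus layout:

```yaml
# Sketch only -- port name and selector label are assumptions
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet
  namespace: monitoring
  labels:
    release: prometheus
spec:
  namespaceSelector:
    matchNames: [kube-system]
  selector:
    matchLabels:
      k8s-app: kubelet
  endpoints:
    - port: https-metrics
      scheme: https
      path: /metrics
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecureSkipVerify: true
    - port: https-metrics
      scheme: https
      path: /metrics/cadvisor
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true
    - port: https-metrics
      scheme: https
      path: /metrics/probes
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true
```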
Alert rules
Six PrometheusRule CRs provide alerting and recording rules:
| Rule file | Coverage |
|---|---|
| `general-rules.yaml` | Watchdog, InfoInhibitor, TargetDown |
| `kubernetes-apps.yaml` | Pod CrashLoopBackOff, container restarts, Deployment/StatefulSet failures |
| `kubernetes-resources.yaml` | CPU/memory quota overcommit, namespace resource limits |
| `node-rules.yaml` | Node filesystem, memory, CPU, network, clock skew |
| `k8s-recording-rules.yaml` | Pre-computed recording rules for dashboards |
| `traefik-rules.yaml` | Traefik-specific alerting rules |
KEDA autoscaling
Prometheus and Grafana use KEDA ScaledObjects for autoscaling:
```yaml
# Prometheus: targets the operator-created StatefulSet
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-keda
  namespace: monitoring
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: prometheus-prometheus
  minReplicaCount: 1
  maxReplicaCount: 8
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: "50"
    - type: memory
      metadata:
        type: Utilization
        value: "50"
```

Part 2: Alertmanager
Managed by the Prometheus Operator via the Alertmanager CR:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: alertmanager
  namespace: monitoring
spec:
  version: v0.31.1
  image: quay.io/prometheus/alertmanager:v0.31.1
  replicas: 1
  serviceAccountName: prometheus

  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: nfs-client
        accessModes: [ReadWriteMany]
        resources:
          requests:
            storage: 1Gi

  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 256Mi

  securityContext:
    fsGroup: 65534
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
    seccompProfile:
      type: RuntimeDefault

  externalUrl: https://alertmanager-k3s.example.io
```

The Alertmanager config (routing rules, SMTP credentials) lives in alertmanager/secret.yaml and must be SOPS-encrypted before committing:

```bash
sops --encrypt --age <YOUR_AGE_PUBLIC_KEY> \
  --encrypted-regex '^(data|stringData)$' \
  --in-place monitoring/alertmanager/secret.yaml
```

Part 3: Loki
Loki runs in monolithic mode (-target=all) as a single-replica StatefulSet with filesystem storage on NFS.
Configuration
Section titled “Configuration”data: loki.yaml: | target: all auth_enabled: false server: http_listen_port: 3100 grpc_listen_port: 9095 log_level: info common: path_prefix: /loki ring: instance_addr: 0.0.0.0 kvstore: store: inmemory replication_factor: 1 schema_config: configs: - from: "2024-01-01" store: tsdb object_store: filesystem schema: v13 index: prefix: index_ period: 24h storage_config: filesystem: directory: /loki/chunks tsdb_shipper: active_index_directory: /loki/index cache_location: /loki/index_cache compactor: working_directory: /loki/compactor compaction_interval: 5m retention_enabled: true delete_request_store: filesystem retention_delete_delay: 2h retention_delete_worker_count: 150 limits_config: retention_period: 744h # 31 days reject_old_samples: true reject_old_samples_max_age: 168h # 7 days ingestion_rate_mb: 10 ingestion_burst_size_mb: 20 max_query_parallelism: 2 max_query_series: 5000 allow_structured_metadata: true volume_enabled: trueKey settings:
| Setting | Value | Why |
|---|---|---|
| `schema: v13` | TSDB | Latest Loki schema, required for structured metadata |
| `retention_period: 744h` | 31 days | Matches Cloudflare’s Logpush retention |
| `max_query_series: 5000` | High | Required for topk queries on high-cardinality Logpush data (see Part 8) |
| `ingestion_rate_mb: 10` | 10 MB/s | Logpush batches can be large; default was too low |
| `max_query_parallelism: 2` | Conservative | ARM64 nodes have limited resources |
| `delete_request_store: filesystem` | Required | Must be set when `retention_enabled: true`, otherwise Loki fails to start |
StatefulSet
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki
  namespace: monitoring
spec:
  replicas: 1
  serviceName: loki-headless
  template:
    spec:
      securityContext:
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        runAsNonRoot: true
      containers:
        - name: loki
          image: docker.io/grafana/loki:3.6.5
          args:
            - -config.file=/etc/loki/loki.yaml
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          volumeMounts:
            - name: config
              mountPath: /etc/loki
            - name: data
              mountPath: /loki
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: nfs-client
        accessModes: [ReadWriteMany]
        resources:
          requests:
            storage: 20Gi
```

Part 4: Grafana Alloy
Grafana Alloy serves two roles in this stack:
- DaemonSet (`alloy/`) — runs on every node, collects pod logs and forwards OTLP traces
- Deployment (`alloy-logpush/`) — single instance, receives Cloudflare Logpush data (covered in Part 7)
DaemonSet configuration
The DaemonSet Alloy discovers pods on its node, tails their log files, and forwards to Loki. It also receives OTLP traces and batches them to Jaeger:
```alloy
logging {
  level  = "info"
  format = "logfmt"
}

// Pod discovery and log collection
discovery.kubernetes "pods" {
  role = "pod"
  selectors {
    role  = "pod"
    field = "spec.nodeName=" + coalesce(env("HOSTNAME"), "")
  }
}

discovery.relabel "pod_logs" {
  targets = discovery.kubernetes.pods.targets

  rule {
    source_labels = ["__meta_kubernetes_pod_phase"]
    regex         = "Pending|Succeeded|Failed|Unknown"
    action        = "drop"
  }
  rule {
    source_labels = ["__meta_kubernetes_namespace"]
    target_label  = "namespace"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_name"]
    target_label  = "pod"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_container_name"]
    target_label  = "container"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_uid", "__meta_kubernetes_pod_container_name"]
    separator     = "/"
    target_label  = "__path__"
    replacement   = "/var/log/pods/*$1/*.log"
  }
}

local.file_match "pod_logs" {
  path_targets = discovery.relabel.pod_logs.output
}

loki.source.file "pod_logs" {
  targets    = local.file_match.pod_logs.targets
  forward_to = [loki.process.pod_logs.receiver]
}

loki.process "pod_logs" {
  stage.cri {}
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push"
  }
}

// OTLP trace receiver -> Jaeger
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    traces = [otelcol.processor.batch.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    traces = [otelcol.exporter.otlp.jaeger.input]
  }
}

otelcol.exporter.otlp "jaeger" {
  client {
    endpoint = "jaeger-collector.monitoring.svc.cluster.local:4317"
    tls {
      insecure = true
    }
  }
}
```

The pipeline:
1. `discovery.kubernetes` discovers pods on the current node (filtered by the `HOSTNAME` env var)
2. `discovery.relabel` extracts namespace/pod/container labels and constructs the log file path
3. `loki.source.file` tails the CRI log files under `/var/log/pods/`
4. `loki.process` applies the `stage.cri {}` pipeline to parse CRI-format log lines
5. `loki.write` pushes to Loki
6. `otelcol.receiver.otlp` receives traces from applications on gRPC 4317 / HTTP 4318
7. `otelcol.processor.batch` batches traces for efficiency
8. `otelcol.exporter.otlp` forwards to Jaeger’s collector
Part 5: Jaeger
Jaeger v2 uses the OpenTelemetry Collector config format. It runs as an all-in-one Deployment with Badger embedded storage on an NFS PVC (10Gi).
Configuration
Section titled “Configuration”data: config.yaml: | service: extensions: - jaeger_storage - jaeger_query - healthcheckv2 pipelines: traces: receivers: [otlp] processors: [batch] exporters: [jaeger_storage_exporter] telemetry: resource: service.name: jaeger metrics: level: detailed readers: - pull: exporter: prometheus: host: 0.0.0.0 port: 8888 logs: level: info
extensions: healthcheckv2: use_v2: true http: endpoint: 0.0.0.0:13133 jaeger_query: storage: traces: badger_main jaeger_storage: backends: badger_main: badger: directories: keys: /badger/data/keys values: /badger/data/values ephemeral: false ttl: spans: 168h
receivers: otlp: protocols: grpc: { endpoint: 0.0.0.0:4317 } http: { endpoint: 0.0.0.0:4318 }
processors: batch: send_batch_size: 10000 timeout: 5s
exporters: jaeger_storage_exporter: trace_storage: badger_mainThe Deployment uses strategy: Recreate since Badger uses file locking and cannot run multiple instances:
spec: replicas: 1 strategy: type: Recreate template: spec: containers: - name: jaeger image: docker.io/jaegertracing/jaeger:2.15.1 args: [--config, /etc/jaeger/config.yaml] ports: - name: otlp-grpc containerPort: 4317 - name: otlp-http containerPort: 4318 - name: query-http containerPort: 16686 - name: metrics containerPort: 8888 - name: health containerPort: 13133 resources: requests: cpu: 250m memory: 512Mi limits: cpu: 1000m memory: 2GiLog-to-trace correlation
Loki’s datasource config includes derivedFields that extract trace IDs from log lines and link them to Jaeger:
```yaml
# In grafana/datasources.yaml
- name: Loki
  type: loki
  uid: loki
  url: http://loki.monitoring.svc:3100
  jsonData:
    derivedFields:
      - datasourceUid: jaeger
        matcherRegex: '"traceID":"(\w+)"'
        name: traceID
        url: "$${__value.raw}"
```

When a log line contains a traceID field, Grafana renders it as a clickable link that opens the trace in Jaeger.
Part 6: Grafana
Authentication with Authentik SSO
Grafana uses Authentik as an OAuth2/OIDC provider:
```ini
[auth]
oauth_allow_insecure_email_lookup = true

[auth.generic_oauth]
enabled = true
name = Authentik
allow_sign_up = true
auto_login = false
scopes = openid email profile
auth_url = https://authentik.example.io/application/o/authorize/
token_url = https://authentik.example.io/application/o/token/
api_url = https://authentik.example.io/application/o/userinfo/
signout_redirect_url = https://authentik.example.io/application/o/grafana/end-session/
role_attribute_path = contains(groups, 'Grafana Admins') && 'Admin' || contains(groups, 'Grafana Editors') && 'Editor' || 'Viewer'
groups_attribute_path = groups
login_attribute_path = preferred_username
name_attribute_path = name
email_attribute_path = email
use_pkce = true
use_refresh_token = true
```

Role mapping via Authentik groups:
| Authentik Group | Grafana Role |
|---|---|
| `Grafana Admins` | Admin |
| `Grafana Editors` | Editor |
| (everyone else) | Viewer |
Credentials (oauth-client-id, oauth-client-secret) are stored in grafana-secret and injected as env vars. The secret must be SOPS-encrypted.
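For reference, the unencrypted shape of that Secret before SOPS encryption is roughly as follows; the key names are those referenced above, and the values are placeholders:

```yaml
# Sketch of monitoring/grafana/secret.yaml prior to `sops --encrypt`
apiVersion: v1
kind: Secret
metadata:
  name: grafana-secret
  namespace: monitoring
stringData:
  admin-user: admin
  admin-password: <bootstrap-password>
  oauth-client-id: <client id from the Authentik provider>
  oauth-client-secret: <client secret from the Authentik provider>
```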
Datasources
Four datasources are provisioned via a directly-mounted ConfigMap (not the sidecar):
```yaml
datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus
    url: http://prometheus.monitoring.svc:9090
    isDefault: true
    jsonData:
      httpMethod: POST
      timeInterval: 30s

  - name: Alertmanager
    type: alertmanager
    uid: alertmanager
    url: http://alertmanager.monitoring.svc:9093
    jsonData:
      implementation: prometheus

  - name: Loki
    type: loki
    uid: loki
    url: http://loki.monitoring.svc:3100
    jsonData:
      derivedFields:
        - datasourceUid: jaeger
          matcherRegex: '"traceID":"(\w+)"'
          name: traceID
          url: "$${__value.raw}"

  - name: Jaeger
    type: jaeger
    uid: jaeger
    url: http://jaeger-query.monitoring.svc:16686
```

Deployment
The Grafana Deployment has two containers: the k8s-sidecar for dashboard provisioning and Grafana itself:
```yaml
containers:
  - name: grafana-sc-dashboard
    image: quay.io/kiwigrid/k8s-sidecar:2.5.0
    env:
      - name: LABEL
        value: grafana_dashboard
      - name: LABEL_VALUE
        value: "1"
      - name: METHOD
        value: WATCH
      - name: FOLDER
        value: /tmp/dashboards
      - name: NAMESPACE
        value: ALL
      - name: RESOURCE
        value: configmap
    resources:
      requests:
        cpu: 50m
        memory: 64Mi

  - name: grafana
    image: docker.io/grafana/grafana:12.3.3
    env:
      - name: GF_SECURITY_ADMIN_USER
        valueFrom:
          secretKeyRef:
            name: grafana-secret
            key: admin-user
      - name: GF_SECURITY_ADMIN_PASSWORD
        valueFrom:
          secretKeyRef:
            name: grafana-secret
            key: admin-password
      - name: GF_AUTH_GENERIC_OAUTH_CLIENT_ID
        valueFrom:
          secretKeyRef:
            name: grafana-secret
            key: oauth-client-id
      - name: GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET
        valueFrom:
          secretKeyRef:
            name: grafana-secret
            key: oauth-client-secret
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
```

The sidecar in WATCH mode detects ConfigMaps with grafana_dashboard: "1" across all namespaces and writes them to /tmp/dashboards. Grafana’s dashboard provider reads from that directory.
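A dashboard provider entry pointing at that directory looks roughly like this; the provider name and every option other than the path are assumptions, not the exact provisioning file:

```yaml
# Grafana dashboards provisioning -- illustrative sketch
apiVersion: 1
providers:
  - name: sidecar-dashboards
    folder: ""
    type: file
    disableDeletion: false
    allowUiUpdates: false
    options:
      path: /tmp/dashboards
      foldersFromFilesStructure: true
```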
Dashboard management
All 14 dashboards are standalone .json files managed by kustomize configMapGenerator:
```yaml
generatorOptions:
  disableNameSuffixHash: true
  labels:
    grafana_dashboard: "1"

configMapGenerator:
  - name: alertmanager-dashboard
    files:
      - alertmanager.json
  - name: cloudflare-logpush-dashboard
    files:
      - cloudflare-logpush.json
  # ... 12 more entries
```

This replaced the previous approach of inlining dashboard JSON inside YAML ConfigMaps. The benefits:
- JSON files get proper syntax highlighting in editors
- No YAML escaping issues with special characters in JSON
- Files can be imported/exported directly from Grafana’s UI
- Easy to diff and review in git
| Dashboard | Source | Panels |
|---|---|---|
| Alertmanager | grafana.com | ~6 |
| Alloy | grafana.com | ~30 |
| Authentik | grafana.com | ~20 |
| Blackbox Exporter | grafana.com | ~12 |
| Cloudflare Logpush | custom gen script | 85 |
| Cloudflare Tunnel | custom gen script | 41 |
| CoreDNS | grafana.com | ~15 |
| Grafana Stats | grafana.com | ~8 |
| Jaeger | grafana.com | ~20 |
| K8s Cluster | grafana.com | ~15 |
| Loki | grafana.com | ~40 |
| Node Exporter | grafana.com | ~40 |
| Prometheus | grafana.com | ~35 |
| Traefik | grafana.com | ~25 |
Adding upstream dashboards from grafana.com:
```bash
cd monitoring/grafana/dashboards/
./add-dashboard.sh <gnet-id> <name> [revision]

# Example:
./add-dashboard.sh 1860 node-exporter 37
```

The script downloads the JSON, replaces all datasource template variables with hardcoded UIDs (`prometheus`, `loki`), strips `__inputs`/`__requires`, fixes deprecated panel types (`grafana-piechart-panel` -> `piechart`), writes a standalone .json file, and adds a configMapGenerator entry.
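The core of that transformation could be expressed with jq along these lines. This is a sketch of the idea, not the actual script; the download URL is grafana.com's public revision endpoint, and the string-valued `datasource` check is an assumption about how upstream dashboards reference their datasource placeholders:

```bash
# Illustrative only -- not add-dashboard.sh itself
curl -sL "https://grafana.com/api/dashboards/1860/revisions/37/download" -o raw.json

jq 'del(.__inputs, .__requires)
    | walk(if type == "object" and (.datasource? | type) == "string"
           then .datasource = {"type": "prometheus", "uid": "prometheus"}
           else . end)' raw.json > node-exporter.json
```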
Regenerating custom dashboards:
```bash
python3 gen-cloudflare-logpush.py   # 85 panels
python3 gen-cloudflared.py          # 41 panels
```

Python dashboard generators
Custom dashboards are generated by Python scripts rather than hand-edited JSON. An 85-panel dashboard is ~5000 lines of JSON but only ~500 lines of Python with reusable helper functions:
```python
#!/usr/bin/env python3
"""Generate the Cloudflare Logpush Grafana dashboard JSON."""
import json

DS = {"type": "loki", "uid": "loki"}

def stat_panel(id, title, expr, legend, x, y, w=6, unit="short", thresholds=None, instant=True):
    """Stat panel with threshold colors."""
    # ... returns panel dict

def ts_panel(id, title, targets, x, y, w=12, h=8, unit="short", stack=True, overrides=None, fill=20):
    """Time series panel with stacking."""
    # ... returns panel dict

def table_panel(id, title, expr, legend, x, y, w=8, h=8):
    """Table panel for topk queries."""
    # ... returns panel dict

def pie_panel(id, title, expr, legend, x, y, w=6, h=8):
    """Pie chart with donut style and right legend."""
    # ... returns panel dict

# Shared query fragments with template variable filters
HTTP = ('{job="cloudflare-logpush", dataset="http_requests"} | json'
        ' | ClientRequestHost =~ "$host" | ClientCountry =~ "$country"')
FW = ('{job="cloudflare-logpush", dataset="firewall_events"} | json'
      ' | ClientRequestHost =~ "$host"')
WK = '{job="cloudflare-logpush", dataset="workers_trace_events"} | json'
```

The gen-cloudflared.py script follows the same pattern but uses `DS = {"type": "prometheus", "uid": "prometheus"}` since cloudflared exports Prometheus metrics natively.
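To make the generator pattern concrete, here is a stripped-down sketch of how such helpers can assemble into the final dashboard file. It is not the actual generator, and panel options are heavily trimmed:

```python
#!/usr/bin/env python3
"""Minimal sketch of the dashboard-generator pattern (not the real gen-* scripts)."""
import json

DS = {"type": "loki", "uid": "loki"}

def stat_panel(id, title, expr, x, y, w=6, h=4):
    """Return a stat panel dict with a single instant Loki query."""
    return {
        "id": id, "type": "stat", "title": title,
        "datasource": DS,
        "gridPos": {"x": x, "y": y, "w": w, "h": h},
        "targets": [{"expr": expr, "queryType": "instant", "instant": True, "refId": "A"}],
    }

panels = [
    stat_panel(1, "Requests (5m)",
               'sum(count_over_time({job="cloudflare-logpush", dataset="http_requests"} | json [5m]))',
               x=0, y=0),
]

dashboard = {
    "title": "Cloudflare Logpush",
    "uid": "cloudflare-logpush",
    "schemaVersion": 39,
    "panels": panels,
    "templating": {"list": []},
    "time": {"from": "now-6h", "to": "now"},
}

with open("cloudflare-logpush.json", "w") as f:
    json.dump(dashboard, f, indent=2)
```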
Part 7: Cloudflare Logpush Pipeline
This is the most complex part of the stack. Cloudflare Logpush pushes HTTP request logs, firewall events, and Workers trace events as gzip-compressed NDJSON to an HTTPS endpoint on the cluster. The challenge: Alloy’s /loki/api/v1/raw endpoint does not handle gzip, and Traefik has no built-in request body decompression.
The compression problem
When Cloudflare Logpush sends data to an HTTP destination:
- Logpush always gzip-compresses HTTP payloads — no way to disable this
- Alloy’s `loki.source.api` `/loki/api/v1/raw` does not handle `Content-Encoding: gzip` — confirmed by reading Alloy source. Only `/loki/api/v1/push` (protobuf/JSON) handles gzip
- Traefik’s `compress` middleware only handles response compression, not request body decompression
This means a decompression layer is needed between Cloudflare and Alloy.
The Traefik decompress plugin
I wrote a Traefik Yaegi (Go interpreter) local plugin that intercepts Content-Encoding: gzip requests, decompresses the body, and passes through to the next handler:
```go
package decompress

import (
    "bytes"
    "compress/gzip"
    "context"
    "fmt"
    "io"
    "net/http"
    "strconv"
    "strings"
)

type Config struct{}

func CreateConfig() *Config { return &Config{} }

type Decompress struct {
    next http.Handler
    name string
}

func New(ctx context.Context, next http.Handler, config *Config, name string) (http.Handler, error) {
    return &Decompress{next: next, name: name}, nil
}

func (d *Decompress) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
    encoding := strings.ToLower(req.Header.Get("Content-Encoding"))
    if encoding != "gzip" {
        d.next.ServeHTTP(rw, req)
        return
    }

    gzReader, err := gzip.NewReader(req.Body)
    if err != nil {
        http.Error(rw, fmt.Sprintf("failed to create gzip reader: %v", err), http.StatusBadRequest)
        return
    }
    defer gzReader.Close()

    decompressed, err := io.ReadAll(gzReader)
    if err != nil {
        http.Error(rw, fmt.Sprintf("failed to decompress body: %v", err), http.StatusBadRequest)
        return
    }

    req.Body = io.NopCloser(bytes.NewReader(decompressed))
    req.ContentLength = int64(len(decompressed))
    req.Header.Set("Content-Length", strconv.Itoa(len(decompressed)))
    req.Header.Del("Content-Encoding")

    d.next.ServeHTTP(rw, req)
}
```

Published at github.com/erfianugrah/decompress, tagged v0.1.0.
Deploying the plugin on k3s
Traefik loads local plugins from /plugins-local/src/<moduleName>/. Since Traefik runs with readOnlyRootFilesystem: true, the plugin files are packaged as a ConfigMap and mounted:
Step 1: ConfigMap in traefik namespace containing decompress.go, go.mod, .traefik.yml:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-plugin-decompress
  namespace: traefik
data:
  decompress.go: |
    package decompress
    // ... (full Go source)
  go.mod: |
    module github.com/erfianugrah/decompress

    go 1.22
  .traefik.yml: |
    displayName: Decompress Request Body
    type: middleware
    import: github.com/erfianugrah/decompress
    summary: Decompresses gzip-encoded request bodies for upstream services.
    testData: {}
```

Step 2: Volume mount in Traefik Deployment:
```yaml
volumeMounts:
  - name: plugin-decompress
    mountPath: /plugins-local/src/github.com/erfianugrah/decompress
    readOnly: true
volumes:
  - name: plugin-decompress
    configMap:
      name: traefik-plugin-decompress
```

Step 3: Traefik arg to enable the plugin:
args: - "--experimental.localPlugins.decompress.moduleName=github.com/erfianugrah/decompress"Step 4: Middleware CRD (must be in same namespace as IngressRoute):
```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: decompress
  namespace: monitoring
spec:
  plugin:
    decompress: {}
```

Dataset-agnostic Alloy receiver
The Alloy Logpush receiver runs as a separate Deployment. The key design: it knows nothing about individual Logpush datasets. Each job injects a _dataset field via output_options.record_prefix, and Alloy extracts only that as a label:
loki.source.api "cloudflare" { http { listen_address = "0.0.0.0" listen_port = 3500 } labels = { job = "cloudflare-logpush", } forward_to = [loki.process.cloudflare.receiver]}
loki.process "cloudflare" { stage.json { expressions = { dataset = "_dataset" } } stage.labels { values = { dataset = "dataset" } } forward_to = [loki.write.default.receiver]}
loki.write "default" { endpoint { url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push" }}Adding a new Logpush dataset requires zero Alloy changes — just create the job with the right record_prefix and data flows automatically.
DNS and tunnel routing
The Logpush endpoint needs a public HTTPS URL. This is provided by the Cloudflare Tunnel:
resource "cloudflare_record" "logpush-k3s" { zone_id = var.cloudflare_secondary_zone_id name = "logpush-k3s" type = "CNAME" content = cloudflare_zero_trust_tunnel_cloudflared.k3s.cname proxied = true tags = ["k3s", "monitoring"]}
# cloudflare-tunnel-tf/tunnel_config.tfingress_rule { hostname = "logpush-k3s.${var.secondary_domain_name}" service = "https://traefik.traefik.svc.cluster.local" origin_request { origin_server_name = "logpush-k3s.${var.secondary_domain_name}" http2_origin = true no_tls_verify = true }}The IngressRoute ties hostname, middleware, and backend together:
```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: alloy-logpush
  namespace: monitoring
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`logpush-k3s.example.io`)
      middlewares:
        - name: decompress
          namespace: monitoring
      services:
        - kind: Service
          name: alloy-logpush
          port: 3500
```

OpenTofu Logpush jobs
Seven Logpush jobs are managed in OpenTofu. Shared config uses locals:
```hcl
logpush_loki_dest = "https://logpush-k3s.example.io/loki/api/v1/raw?header_Content-Type=application%2Fjson&header_X-Logpush-Secret=${var.logpush_secret}"

zone_ids = {
  example_com = var.cloudflare_zone_id
  example_dev = var.secondary_cloudflare_zone_id
  example_io  = var.thirdary_cloudflare_zone_id
}
```

The destination URL uses Logpush’s header_ query parameter syntax to inject Content-Type and a shared secret header.
HTTP requests (one per zone, using for_each):
resource "cloudflare_logpush_job" "http_loki" { for_each = local.zone_ids
dataset = "http_requests" destination_conf = local.logpush_loki_dest enabled = true max_upload_interval_seconds = 30
output_options { output_type = "ndjson" record_prefix = "{\"_dataset\":\"http_requests\"," field_names = local.http_requests_fields timestamp_format = "rfc3339" cve20214428 = false }
zone_id = each.value}Firewall events (same pattern, for_each over zones):
resource "cloudflare_logpush_job" "firewall_loki" { for_each = local.zone_ids
dataset = "firewall_events" destination_conf = local.logpush_loki_dest enabled = true
output_options { output_type = "ndjson" record_prefix = "{\"_dataset\":\"firewall_events\"," field_names = local.firewall_events_fields }
zone_id = each.value}Workers trace events (account-scoped, single job):
resource "cloudflare_logpush_job" "workers_loki" { dataset = "workers_trace_events" destination_conf = local.logpush_loki_dest enabled = true
output_options { output_type = "ndjson" record_prefix = "{\"_dataset\":\"workers_trace_events\"," field_names = local.workers_trace_events_fields }
account_id = var.cloudflare_account_id}The record_prefix trick prepends {"_dataset":"http_requests", to every JSON line, producing:
{"_dataset":"http_requests","ClientIP":"1.2.3.4","RayID":"abc123",...}Alloy extracts _dataset as a label; everything else stays in the log line for LogQL | json.
| Dataset | Scope | Jobs | Zones |
|---|---|---|---|
| `http_requests` | Zone | 3 | example.com, example.dev, example.io |
| `firewall_events` | Zone | 3 | example.com, example.dev, example.io |
| `workers_trace_events` | Account | 1 | (all Workers) |
| Total | | 7 | |
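With the `dataset` label in place, each feed can be queried directly. For example, firewall events grouped by mitigation action over the last hour; the `Action` field name comes from Cloudflare's firewall_events schema:

```logql
# Firewall events in the last hour, grouped by rule action
sum by (Action) (
  count_over_time(
    {job="cloudflare-logpush", dataset="firewall_events"} | json [1h]
  )
)
```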
Cloudflare Logpush dashboard
The custom dashboard has 85 panels across 11 sections:
| Section | Panels | Key visualizations |
|---|---|---|
| Overview | 8 stats | Request count, 5xx error rate, cache hit ratio, WAF attacks, bot traffic %, leaked credentials |
| HTTP Requests | 8 | By host/status/method, top paths, suspicious user agents (BotScore < 30) |
| Performance | 10 | Edge TTFB (avg/p95/p99), origin timing breakdown (DNS/TCP/TLS stacked area) |
| Cache | 9 | Cache status, hit ratio trend, tiered fill, compression ratio |
| Security & Firewall | 5 | Firewall events by action/source/host, top rules |
| WAF Attack Analysis | 6 | Attack score buckets, SQLi/XSS/RCE breakdown, unmitigated attacks |
| Threat Intelligence | 9 | Leaked credentials, IP classification, geo anomaly on sensitive paths |
| Bot Analysis | 6 | Bot score distribution, JA4/JA3 fingerprints, verified bot categories |
| Geography | 4 | Top countries, edge colos, ASNs, device types |
| SSL/TLS | 4 | Client SSL versions/ciphers, mTLS status, origin SSL |
| Workers | 6 | CPU/wall time by script, outcomes, subrequest count |
Template variables are textbox type with .* default (matches everything). Grafana’s label_values() only works for indexed Loki labels, not JSON-extracted fields — since all fields are in the JSON body, textbox is the only practical option.
Cloudflared tunnel dashboard
Expanded from 18 to 41 panels using gen-cloudflared.py:
| Section | Key panels |
|---|---|
| Tunnel Overview | Requests/sec, errors/sec, active connections, uptime |
| Response Status | Status code distribution, error rate by origin |
| Latency | Request/response duration (avg/p95/p99) |
| Throughput | Bytes sent/received, request body size percentiles |
| Edge Locations | Top colo codes, request distribution by edge |
| RPC Operations | Register, unregister, reconnect calls |
| Process Resources | CPU, memory, goroutines, open FDs |
Part 8: LogQL Pitfalls
Working with high-cardinality Cloudflare Logpush data in Loki exposed nine specific traps. These cost real debugging time — the error messages are often unhelpful.
1. count_over_time without sum() explodes series
```logql
# BAD: one series per unique log line
count_over_time({job="cloudflare-logpush"} | json [5m])

# GOOD: single aggregated count
sum(count_over_time({job="cloudflare-logpush"} | json [5m]))
```

After | json, every extracted field becomes a potential label. Without sum(), count_over_time returns one series per unique label combination — easily hitting max_query_series.
2. unwrap aggregations don’t support by ()
```logql
# BAD: parse error
avg_over_time(... | unwrap EdgeTimeToFirstByteMs [5m]) by (Host)

# GOOD: outer aggregation for grouping
sum by (Host) (avg_over_time(... | unwrap EdgeTimeToFirstByteMs [5m]))
```

3. Stat panels need instant: true
Without instant: true, Loki returns a range result. The stat panel picks lastNotNull which may not reflect the full window. Set "queryType": "instant", "instant": true on stat panel targets.
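In the generated panel JSON, such a target looks roughly like this (a trimmed sketch):

```json
{
  "datasource": { "type": "loki", "uid": "loki" },
  "expr": "sum(count_over_time({job=\"cloudflare-logpush\"} | json [5m]))",
  "queryType": "instant",
  "instant": true,
  "refId": "A"
}
```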
4. $__auto vs fixed intervals
- Time series panels: `[$__auto]` — adapts to the visible time range
- Table panels: `[5m]` fixed — `$__auto` creates too many evaluation windows
- Stat panels: `[5m]` with `instant: true`
5. Cannot compare two extracted fields
```logql
# IMPOSSIBLE: compare two extracted fields
{...} | json | OriginResponseStatus != EdgeResponseStatus
```

LogQL can only compare extracted fields to literal values. Use two queries or dashboard transformations.
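One workaround sketch: run two queries that each compare against literals, then combine them with a panel transformation (e.g. a binary-operation calculation). The field names below are from the http_requests dataset:

```logql
# Query A: requests where the edge returned 5xx
sum(count_over_time({job="cloudflare-logpush", dataset="http_requests"}
  | json | EdgeResponseStatus >= 500 [5m]))

# Query B: requests where the origin returned 5xx
sum(count_over_time({job="cloudflare-logpush", dataset="http_requests"}
  | json | OriginResponseStatus >= 500 [5m]))
```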
6. unwrap produces one series per stream
Always wrap unwrap aggregations in an outer sum() or avg by ():
```logql
# BAD: one series per label combination
avg_over_time(... | unwrap EdgeTimeToFirstByteMs [$__auto])

# GOOD: collapsed
sum(avg_over_time(... | unwrap EdgeTimeToFirstByteMs [$__auto]))
```

7. max_query_series applies to inner cardinality
```logql
topk(10, sum by (Path) (count_over_time(... | json [5m])))
```

Loki evaluates sum by (Path) first. If there are thousands of unique paths (bots/scanners), it exceeds max_query_series before topk ever runs. Reducing the time window does not help — the cardinality is inherent in the data.
8. High-cardinality topk requires high max_query_series
Even a 1-second scan window can have 1500+ unique paths due to bots. Raised max_query_series to 5000:
```yaml
limits_config:
  max_query_series: 5000
```

The memory impact on single-instance homelab Loki is negligible for instant queries.
9. Table panels with [$__auto] hit series limits
Combines pitfalls 4, 7, and 8. Over a 24h range, $__auto might resolve to 15-second intervals, creating many evaluation windows. Use [5m] fixed for all table instant queries.
Part 9: VyOS Metrics
Section titled “Part 9: VyOS Metrics”Probe vs ScrapeConfig
The VyOS router runs node_exporter on port 9100 (HTTPS, self-signed cert). Initially we used the Prometheus Probe CRD, but it routes through blackbox exporter, producing only probe_* metrics — not the actual node_* metrics. VyOS never appeared in the Node Exporter dashboard.
The fix: ScrapeConfig CRD for direct scraping:
```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: vyos-nl
  namespace: monitoring
  labels:
    release: prometheus
spec:
  metricsPath: /metrics
  scheme: HTTPS
  tlsConfig:
    insecureSkipVerify: true
  staticConfigs:
    - targets:
        - prom-vyos.example.com
      labels:
        job: node-exporter
        instance: prom-vyos.example.com
  scrapeInterval: 30s
```

| Aspect | Probe CRD | ScrapeConfig CRD |
|---|---|---|
| Path | Prometheus -> blackbox -> target | Prometheus -> target (direct) |
| Metrics | probe_* only | All target metrics |
| Use case | Endpoint availability | Actual metric scraping |
With job: node-exporter, VyOS appears in the Node Exporter dashboard alongside cluster nodes.
Secrets Management
Two Secret files require SOPS encryption before committing:
| File | Contents |
|---|---|
| `monitoring/grafana/secret.yaml` | admin-user, admin-password, oauth-client-id, oauth-client-secret |
| `monitoring/alertmanager/secret.yaml` | Alertmanager config with SMTP credentials |
```bash
sops --encrypt --age <YOUR_AGE_PUBLIC_KEY> \
  --encrypted-regex '^(data|stringData)$' \
  --in-place monitoring/grafana/secret.yaml

sops --encrypt --age <YOUR_AGE_PUBLIC_KEY> \
  --encrypted-regex '^(data|stringData)$' \
  --in-place monitoring/alertmanager/secret.yaml
```

OpenTofu secrets (logpush_secret, zone IDs, API tokens) live in SOPS-encrypted secrets.tfvars:
```bash
sops -d secrets.tfvars > /tmp/secrets.tfvars
tofu plan -var-file=/tmp/secrets.tfvars
tofu apply -var-file=/tmp/secrets.tfvars
rm /tmp/secrets.tfvars
```

Deployment
Full stack deploy
```bash
# 1. Deploy the entire monitoring stack (includes all components)
kubectl apply -k monitoring/ --server-side --force-conflicts

# 2. Deploy decompress plugin + middleware (separate from monitoring kustomization)
kubectl apply -f middleware/decompress-configmap.yaml
kubectl apply -f middleware/decompress-middleware.yaml

# 3. Deploy updated Traefik with plugin enabled
kubectl apply -f services/traefik.yaml

# 4. Deploy ingress routes
kubectl apply -f ingressroutes/grafana-ingress.yaml
kubectl apply -f ingressroutes/prometheus-ingress.yaml
kubectl apply -f ingressroutes/alertmanager-ingress.yaml
kubectl apply -f ingressroutes/jaeger-ingress.yaml
kubectl apply -f ingressroutes/alloy-logpush-ingress.yaml

# 5. Deploy KEDA autoscaling
kubectl apply -f hpa/grafana-keda-autoscaling.yaml
kubectl apply -f hpa/prom-keda-autoscaling.yaml

# 6. Apply OpenTofu for DNS + tunnel config
cd cloudflare-tunnel-tf/ && tofu apply

# 7. Apply OpenTofu for Logpush jobs
cd ../cloudflare-tf/main_zone/
tofu apply -var-file=secrets.tfvars
```

--server-side is required because the Prometheus Operator CRDs and Node Exporter dashboard exceed the 262144-byte annotation limit. IngressRoutes and KEDA ScaledObjects are outside the monitoring/ kustomization directory because Kustomize cannot reference files outside its root.
Verification
```bash
# All pods running
kubectl get pods -n monitoring

# Prometheus targets
kubectl port-forward svc/prometheus 9090 -n monitoring
# Visit http://localhost:9090/targets -- all should be UP

# Loki receiving data
kubectl logs deploy/alloy-logpush -n monitoring --tail=20

# Logpush data flowing
# In Grafana Explore with Loki:
# {job="cloudflare-logpush"} | json

# Dashboard ConfigMaps
kubectl get cm -n monitoring -l grafana_dashboard=1
# Should show 14 ConfigMaps
```

Resource Budget
| Component | Instances | CPU Req | Mem Req | Storage |
|---|---|---|---|---|
| Prometheus Operator | 1 | 100m | 128Mi | — |
| Prometheus | 1 | 200m | 512Mi | 10Gi NFS |
| Alertmanager | 1 | 50m | 64Mi | 1Gi NFS |
| Grafana | 1 | 100m | 128Mi | 1Gi NFS |
| kube-state-metrics | 1 | 50m | 64Mi | — |
| Node Exporter | 4 (DaemonSet) | 50m x4 | 32Mi x4 | — |
| Blackbox Exporter | 1 | 25m | 32Mi | — |
| Loki | 1 | 250m | 512Mi | 20Gi NFS |
| Grafana Alloy | 4 (DaemonSet) | 100m x4 | 128Mi x4 | — |
| Alloy Logpush | 1 | 50m | 64Mi | — |
| Jaeger | 1 | 250m | 512Mi | 10Gi NFS |
| Total | | ~1.68 cores | ~2.59Gi | ~42Gi |
File Reference
```
monitoring/
  kustomization.yaml              # Top-level: composes all components
  namespace.yaml

  operator/                       # Prometheus Operator v0.89.0
    kustomization.yaml
    crd-*.yaml                    # 10 CRDs (~3.7 MB)
    serviceaccount.yaml
    clusterrole.yaml / clusterrolebinding.yaml
    deployment.yaml / service.yaml / servicemonitor.yaml
    webhook.yaml                  # Cert-gen Jobs + webhook configs

  prometheus/                     # Prometheus v3.9.1 (operator-managed)
    kustomization.yaml
    prometheus.yaml               # Prometheus CR
    serviceaccount.yaml / clusterrole.yaml / clusterrolebinding.yaml
    service.yaml / servicemonitor.yaml
    rules/
      general-rules.yaml          # Watchdog, TargetDown
      kubernetes-apps.yaml        # CrashLoopBackOff, restarts
      kubernetes-resources.yaml   # CPU/memory quota
      node-rules.yaml             # Filesystem, memory, CPU
      k8s-recording-rules.yaml    # Pre-computed recording rules
      traefik-rules.yaml          # Traefik alerts

  alertmanager/                   # Alertmanager v0.31.1
    alertmanager.yaml             # Alertmanager CR
    secret.yaml                   # SOPS-encrypted SMTP config

  grafana/                        # Grafana 12.3.3
    configmap.yaml                # grafana.ini + dashboard provider
    datasources.yaml              # Prometheus, Loki, Jaeger, Alertmanager
    deployment.yaml               # Grafana + k8s-sidecar
    secret.yaml                   # SOPS-encrypted credentials
    dashboards/
      kustomization.yaml          # configMapGenerator
      add-dashboard.sh            # Download from grafana.com
      gen-cloudflare-logpush.py   # 85-panel dashboard generator
      gen-cloudflared.py          # 41-panel dashboard generator
      *.json                      # 14 dashboard files

  loki/                           # Loki 3.6.5 (monolithic)
    configmap.yaml / statefulset.yaml / service.yaml / servicemonitor.yaml

  alloy/                          # Grafana Alloy v1.13.0 (DaemonSet)
    configmap.yaml / daemonset.yaml / service.yaml / servicemonitor.yaml

  alloy-logpush/                  # Alloy Logpush receiver (Deployment)
    configmap.yaml / deployment.yaml / service.yaml / servicemonitor.yaml

  jaeger/                         # Jaeger 2.15.1 (all-in-one)
    configmap.yaml / deployment.yaml / service.yaml / servicemonitor.yaml

  kube-state-metrics/             # v2.18.0
  node-exporter/                  # v1.10.2 (DaemonSet)
  blackbox-exporter/              # v0.28.0

  servicemonitors/                # Cross-namespace ServiceMonitors
    apiserver.yaml / authentik.yaml / cloudflared.yaml
    coredns.yaml / kubelet.yaml / revista.yaml / traefik.yaml

  probes/
    vyos-scrape.yaml              # ScrapeConfig for VyOS node_exporter

middleware/                       # Traefik decompress plugin
  decompress-plugin/
    decompress.go / go.mod / .traefik.yml
  decompress-configmap.yaml       # ConfigMap for k8s
  decompress-middleware.yaml      # Middleware CRD

ingressroutes/
  grafana-ingress.yaml / prometheus-ingress.yaml
  alertmanager-ingress.yaml / jaeger-ingress.yaml
  alloy-logpush-ingress.yaml

hpa/
  grafana-keda-autoscaling.yaml   # maxReplicas: 1 (SQLite limitation)
  prom-keda-autoscaling.yaml      # maxReplicas: 8

cloudflare-tunnel-tf/             # OpenTofu: DNS + tunnel ingress rules
  records.tf / tunnel_config.tf

cloudflare-tf/main_zone/          # OpenTofu: Logpush jobs
  zone_logpush_job.tf / locals.tf / variables.tf
```

Lessons Learned
- Server-side apply is mandatory. Prometheus Operator CRDs (~3.7 MB) and the Node Exporter dashboard (~472KB) exceed the 262144-byte annotation limit. Use `kubectl apply --server-side` for everything.
- Prometheus v2 -> v3 is a major breaking change. Removed flags (`--storage.tsdb.retention`, `--alertmanager.timeout`), removed feature flags (now default), PromQL renames (`holt_winters` -> `double_exponential_smoothing`), regex `.` now matches newlines.
- Grafana 10 -> 12 removed Angular. `grafana-piechart-panel` is now the built-in `piechart` type. The `add-dashboard.sh` script handles this automatically.
- SQLite + NFS + multiple replicas = data corruption. Keep Grafana at 1 replica or migrate to PostgreSQL.
- Mount datasources directly, not via sidecar. The k8s-sidecar in LIST mode has a startup race condition. Direct ConfigMap mount is reliable.
- Jaeger v2 telemetry config is nested. `pull.exporter.prometheus.host`/`port`, not a flat `address`.
- Loki requires `delete_request_store: filesystem` when retention is enabled. Without it, Loki fails to start.
- `configMapGenerator` with `disableNameSuffixHash: true` is the right dashboard pattern. Without `disableNameSuffixHash`, every edit creates a new ConfigMap name and orphans the old one.
- Cloudflare Logpush always gzip-compresses HTTP payloads. No option to disable. Alloy’s `/loki/api/v1/raw` doesn’t handle gzip. Traefik has no built-in request decompression. Requires a custom plugin.
- `record_prefix` enables dataset-agnostic ingestion. By prepending `{"_dataset":"<name>",` to every line, Alloy extracts a label without knowing the dataset schema. Zero cluster changes for new datasets.
- LogQL `unwrap` aggregations always need an outer `sum()`. Without it, one series per unique label combination hits `max_query_series`.
- `topk()` does not reduce inner cardinality. Loki evaluates the inner aggregation fully before applying `topk`. Raise `max_query_series` instead.
- Grafana template variables with Loki only work for indexed labels. JSON-extracted fields cannot use `label_values()`. Use `textbox` or `custom` type with `.*` default.
- Prometheus `Probe` CRD routes through blackbox exporter. It only returns `probe_*` metrics, not the target’s actual metrics. Use `ScrapeConfig` for real metric scraping.
- Python generators >> hand-editing dashboard JSON. 85 panels = ~5000 lines of JSON, but only ~500 lines of Python with reusable helpers.
- Grafana admin password with special characters gets mangled. `$`, `&`, and other shell metacharacters in `GF_SECURITY_ADMIN_PASSWORD` are unreliable through env vars. Use alphanumeric passwords.
- Grafana OAuth email matching. Bootstrap admin has `email: admin@localhost`. Update it via the API before first SSO login, or set `email_verified: true` in Authentik.
- Pie charts with many slices need `legend.placement: "right"`. Bottom placement shrinks the chart to nothing when there are 10+ slices.