Multi-Site Vaultwarden: PostgreSQL, R2 Sync, and Cloudflare Load Balancer Failover
A complete guide to migrating a single-site Vaultwarden instance (SQLite on Unraid) to a multi-site active/passive setup with PostgreSQL as the database backend, Cloudflare R2 as the cross-site sync mechanism, and Cloudflare Load Balancer for automatic failover. The primary site runs on an Unraid server (“Servarr”) with Docker Compose, and the warm standby runs on a k3s ARM64 homelab cluster in a different country.
The guide is structured as a linear build-up: PostgreSQL migration on the primary site first, then R2 backup automation, then the k3s standby deployment, then the Cloudflare Load Balancer with tunnel-based health checks, and finally the DNS cutover. Each section includes the actual manifests, compose files, and Terraform config used in production.
Architecture Overview
Servarr (Unraid, Site A) is the primary site — all user writes go here. The k3s cluster (ARM64, Site B) is a warm standby that restores from R2 every 15 minutes. Cloudflare Load Balancer owns the vault.example.com DNS record and routes traffic to Servarr by default. If Servarr’s health check fails, the LB fails over to k3s automatically. Both sites connect to Cloudflare via their own tunnels (no VPN between them).
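The moving parts at a glance (a rough sketch of the same topology described above, nothing added):

```text
            Cloudflare Load Balancer (vault.example.com)
              │ default pool              │ fallback pool
              ▼                           ▼
        Servarr tunnel               k3s tunnel
              │                           │
   Vaultwarden + PostgreSQL    Vaultwarden + PostgreSQL
      (Site A, primary)        (Site B, warm standby)
              │                           ▲
              └── pg_dump → R2 (15 min) ──┘ restore CronJob (15 min)
```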
Why Active/Passive (Not Active/Active)
Both sites have independent PostgreSQL databases synced via R2 dump/restore. There is no real-time replication between them. If a user writes to k3s while Servarr is still alive, the next R2 restore will overwrite that write. Active/active would require streaming replication or a shared database, neither of which is feasible across two sites without a VPN.
The active/passive model is safe: Servarr handles all writes, k3s stays in sync via periodic restores, and the LB only sends traffic to k3s when Servarr is confirmed down.
Component Versions
| Component | Version |
|---|---|
| Vaultwarden | 1.35.3 |
| PostgreSQL | 17 |
| aws-cli (K8s jobs) | 2.27.31 |
| Cloudflare Provider (TF) | 4.52.5 |
| SOPS | 3.11.0 |
| age | (key: age1xxxx...xxxx) |
Part 1: SQLite to PostgreSQL Migration (Servarr)
Vaultwarden ships with SQLite by default. Switching to PostgreSQL requires: (1) bootstrapping the PG schema by letting Vaultwarden start once against an empty PG database, and (2) migrating data from SQLite using pgloader.
1.1 Start PostgreSQL
Add a PostgreSQL 17 container to the Vaultwarden Docker Compose stack on Servarr:
```yaml
# /mnt/user/data/dockge/stacks/vaultwarden/compose.yaml (excerpt)
postgres_vaultwarden:
  image: postgres:17
  container_name: postgres_vaultwarden
  hostname: postgres_vaultwarden
  restart: always
  deploy:
    resources:
      limits:
        cpus: "2"
        memory: 512M
  expose:
    - 5432
  environment:
    - POSTGRES_USER=vaultwarden
    - POSTGRES_PASSWORD=${PG_PASSWORD}
    - POSTGRES_DB=vaultwarden
    - TZ=Your/Timezone
  volumes:
    - /mnt/user/data/postgres-vaultwarden/data:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U vaultwarden"]
    interval: 10s
    timeout: 5s
    retries: 5
  networks:
    warden:
      ipv4_address: 172.20.0.4
```

Start it alone first to initialize the data directory:
```bash
docker compose up -d postgres_vaultwarden
```

1.2 Bootstrap the Vaultwarden Schema
Vaultwarden auto-creates its schema (via Diesel migrations) when it connects to an empty PostgreSQL database. Temporarily point Vaultwarden at PG without mounting /data:
```bash
docker run --rm \
  --network vaultwarden_warden \
  -e DATABASE_URL="postgresql://vaultwarden:${PG_PASSWORD}@postgres_vaultwarden:5432/vaultwarden" \
  -e I_REALLY_WANT_VOLATILE_STORAGE=true \
  vaultwarden/server:1.35.3
```

Wait for the log line `[INFO] Starting Vaultwarden`, then Ctrl+C. The schema is now created in PostgreSQL.
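A quick way to confirm the bootstrap worked is to list the tables Diesel just created (an optional sanity check, not part of the original procedure):

```bash
docker exec postgres_vaultwarden psql -U vaultwarden -d vaultwarden -c "\dt"
# Expect the Vaultwarden tables (users, ciphers, devices, folders, ...) plus __diesel_schema_migrations
```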
1.3 Disable WAL on the SQLite Database
Before migration, disable Write-Ahead Logging on the SQLite database. pgloader can have issues with WAL-enabled databases:
```bash
sqlite3 /mnt/user/data/vaultwarden/vw-data/db.sqlite3 "PRAGMA journal_mode=delete;"
```

1.4 Migrate Data with pgloader
pgloader handles the SQLite → PostgreSQL data migration. It requires a load command file (not CLI flags) for advanced options like table exclusions. Create a file called `vaultwarden.load`:
```text
load database
  from sqlite:///data/db.sqlite3
  into postgresql://vaultwarden:YOUR_PG_PASSWORD@postgres_vaultwarden:5432/vaultwarden

WITH data only, include no drop, reset sequences

EXCLUDING TABLE NAMES LIKE '__diesel_schema_migrations'

ALTER SCHEMA 'bitwarden' RENAME TO 'public';
```

The key options:
- `data only` — the schema already exists from step 1.2
- `include no drop` — don’t drop tables that Vaultwarden created
- `reset sequences` — fix auto-increment sequences after bulk insert
- `EXCLUDING TABLE NAMES LIKE '__diesel_schema_migrations'` — Vaultwarden already ran its Diesel migrations in step 1.2; re-importing this table would cause duplicate key errors
- `ALTER SCHEMA 'bitwarden' RENAME TO 'public'` — pgloader may create tables in a schema named after the SQLite file; this remaps them to the `public` schema where Vaultwarden expects them
Run pgloader with the load file:
```bash
docker run --rm \
  --network vaultwarden_warden \
  -v /mnt/user/data/vaultwarden/vw-data:/data:ro \
  -v $(pwd)/vaultwarden.load:/vaultwarden.load:ro \
  dimitri/pgloader:latest \
  pgloader /vaultwarden.load
```
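Before switching Vaultwarden over, it is worth spot-checking that row counts match between the two databases. A minimal check using the `ciphers` table (any Vaultwarden table works the same way):

```bash
# Count rows in the SQLite source
sqlite3 /mnt/user/data/vaultwarden/vw-data/db.sqlite3 "SELECT COUNT(*) FROM ciphers;"

# Count rows in the PostgreSQL target — the numbers should match
docker exec postgres_vaultwarden psql -U vaultwarden -d vaultwarden -t -c "SELECT COUNT(*) FROM ciphers;"
```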
1.5 Switch Vaultwarden to PostgreSQL
Update the Vaultwarden service in `compose.yaml` to use the `DATABASE_URL` environment variable:
```yaml
vaultwarden:
  image: vaultwarden/server:1.35.3
  container_name: vaultwarden
  hostname: vaultwarden
  restart: always
  deploy:
    resources:
      limits:
        cpus: "4"
        memory: 512M
  expose:
    - 80
  environment:
    - SIGNUPS_ALLOWED=false
    - INVITATIONS_ALLOWED=false
    - DOMAIN=${DOMAIN}
    - DATABASE_URL=postgresql://vaultwarden:${PG_PASSWORD}@postgres_vaultwarden:5432/vaultwarden
    - SMTP_HOST=smtp.gmail.com
    - SMTP_FROM=${GMAIL}
    - SMTP_PORT=587
    - SMTP_SECURITY=starttls
    - SMTP_USERNAME=${GMAIL}
    - SMTP_PASSWORD=${PASSWORD}
    - PUID=1000
    - PGID=100
    - UMASK=0000
    - TZ=Your/Timezone
  volumes:
    - /mnt/user/data/vaultwarden/vw-data:/data
  depends_on:
    postgres_vaultwarden:
      condition: service_healthy
  networks:
    warden:
      ipv4_address: 172.20.0.2
```

1.6 Environment Variables
```bash
# Environment variables referenced by the compose stack
GMAIL=you@gmail.com
PASSWORD=<gmail-app-password>
DOMAIN=https://vault.example.com
PG_PASSWORD=<postgres-password>
R2_ACCESS_KEY_ID=<r2-access-key>
R2_SECRET_ACCESS_KEY=<r2-secret-key>
```

Part 2: R2 Backup Pipeline (Servarr → R2)
A sidecar container runs alongside Vaultwarden on Servarr. It dumps PostgreSQL every 15 minutes, packages the dump with RSA keys and attachments into a tarball, and uploads both a timestamped copy and a latest copy to Cloudflare R2. Old backups are pruned after 30 days.
2.1 pg_backup_r2 Container
```yaml
# /mnt/user/data/dockge/stacks/vaultwarden/compose.yaml (excerpt)
pg_backup_r2:
  image: postgres:17
  container_name: pg_backup_r2
  hostname: pg_backup_r2
  restart: always
  deploy:
    resources:
      limits:
        cpus: "1"
        memory: 256M
  environment:
    - PGHOST=postgres_vaultwarden
    - PGUSER=vaultwarden
    - PGPASSWORD=${PG_PASSWORD}
    - PGDATABASE=vaultwarden
    - AWS_ACCESS_KEY_ID=${R2_ACCESS_KEY_ID}
    - AWS_SECRET_ACCESS_KEY=${R2_SECRET_ACCESS_KEY}
    - AWS_DEFAULT_REGION=auto
    - R2_ENDPOINT=https://<your-account-id>.r2.cloudflarestorage.com
    - TZ=Your/Timezone
  entrypoint: ["/bin/bash", "-c"]
  command:
    - |
      set -e

      echo "Installing aws-cli and tar..."
      apt-get update -qq && apt-get install -y -qq awscli tar gzip > /dev/null 2>&1
      echo "Starting pg_dump -> R2 backup loop (every 15 minutes)..."

      backup_and_upload() {
        TIMESTAMP=$$(date +%Y%m%d-%H%M%S)
        BACKUP_DIR=/tmp/backup-$$TIMESTAMP
        mkdir -p $$BACKUP_DIR

        echo "[$$TIMESTAMP] Running pg_dump..."
        pg_dump -Fc --no-owner --no-acl -f $$BACKUP_DIR/vaultwarden.pgdump

        # Copy VW data files (RSA keys, icons, attachments)
        [ -d /data/attachments ] && cp -r /data/attachments $$BACKUP_DIR/ || true
        [ -d /data/sends ] && cp -r /data/sends $$BACKUP_DIR/ || true
        [ -d /data/icon_cache ] && cp -r /data/icon_cache $$BACKUP_DIR/ || true
        [ -f /data/config.json ] && cp /data/config.json $$BACKUP_DIR/ || true
        [ -f /data/rsa_key.pem ] && cp /data/rsa_key.pem $$BACKUP_DIR/ || true
        [ -f /data/rsa_key.pub.pem ] && cp /data/rsa_key.pub.pem $$BACKUP_DIR/ || true

        TARBALL=/tmp/vaultwarden-pg-$$TIMESTAMP.tar.gz
        tar -czf $$TARBALL -C $$BACKUP_DIR .
        rm -rf $$BACKUP_DIR

        echo "[$$TIMESTAMP] Uploading to R2..."
        aws s3 cp $$TARBALL s3://vault/pg-dumps/ --endpoint-url $$R2_ENDPOINT
        rm -f $$TARBALL

        # Also upload a "latest" copy for easy restore
        LATEST_DIR=/tmp/latest-backup
        mkdir -p $$LATEST_DIR
        pg_dump -Fc --no-owner --no-acl -f $$LATEST_DIR/vaultwarden.pgdump
        [ -f /data/rsa_key.pem ] && cp /data/rsa_key.pem $$LATEST_DIR/ || true
        [ -f /data/rsa_key.pub.pem ] && cp /data/rsa_key.pub.pem $$LATEST_DIR/ || true
        [ -d /data/attachments ] && cp -r /data/attachments $$LATEST_DIR/ || true
        [ -d /data/sends ] && cp -r /data/sends $$LATEST_DIR/ || true
        tar -czf /tmp/vaultwarden-pg-latest.tar.gz -C $$LATEST_DIR .
        aws s3 cp /tmp/vaultwarden-pg-latest.tar.gz \
          s3://vault/pg-dumps/vaultwarden-pg-latest.tar.gz \
          --endpoint-url $$R2_ENDPOINT
        rm -rf $$LATEST_DIR /tmp/vaultwarden-pg-latest.tar.gz

        # Prune R2: keep last 30 days of timestamped backups
        CUTOFF=$$(date -d '30 days ago' +%Y%m%d)
        aws s3 ls s3://vault/pg-dumps/ --endpoint-url $$R2_ENDPOINT | while read -r line; do
          FILE=$$(echo "$$line" | awk '{print $$4}')
          case "$$FILE" in
            vaultwarden-pg-2[0-9][0-9][0-9]*)
              FILE_DATE=$$(echo "$$FILE" | sed 's/vaultwarden-pg-\([0-9]\{8\}\).*/\1/')
              if [ "$$FILE_DATE" -lt "$$CUTOFF" ] 2>/dev/null; then
                echo "Pruning old backup: $$FILE"
                aws s3 rm "s3://vault/pg-dumps/$$FILE" --endpoint-url $$R2_ENDPOINT
              fi
              ;;
          esac
        done

        echo "[$$TIMESTAMP] Backup complete."
      }

      # Run immediately on start
      backup_and_upload

      # Then every 15 minutes
      while true; do
        sleep 900
        backup_and_upload
      done
  volumes:
    - /mnt/user/data/vaultwarden/vw-data:/data:ro
    - /mnt/user/data/vaultwarden/vaultwarden_backup/backups:/backups
  depends_on:
    postgres_vaultwarden:
      condition: service_healthy
  networks:
    warden:
      ipv4_address: 172.20.0.3
```

2.2 R2 Bucket Structure
After the first run, the R2 bucket looks like:
```text
s3://vault/pg-dumps/
├── vaultwarden-pg-20260221-080000.tar.gz   # timestamped
├── vaultwarden-pg-20260221-081500.tar.gz
├── vaultwarden-pg-20260221-083000.tar.gz
├── ...
└── vaultwarden-pg-latest.tar.gz            # always the newest
```

Each tarball contains:
```text
.
├── vaultwarden.pgdump    # pg_dump custom format
├── rsa_key.pem           # Vaultwarden RSA private key
├── rsa_key.pub.pem       # Vaultwarden RSA public key
├── attachments/          # file attachments (if any)
└── sends/                # Bitwarden Send files (if any)
```
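To verify a backup actually contains what the tree above shows, you can pull the latest tarball and list it without extracting (an optional check, run from any machine with aws-cli configured):

```bash
aws s3 cp s3://vault/pg-dumps/vaultwarden-pg-latest.tar.gz /tmp/ \
  --endpoint-url https://<your-account-id>.r2.cloudflarestorage.com
tar -tzf /tmp/vaultwarden-pg-latest.tar.gz
```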
2.3 Docker Network
The stack uses a dedicated bridge network with static IPs:
```yaml
networks:
  warden:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/24
          gateway: 172.20.0.1
```

| Container | IP | Port |
|---|---|---|
| vaultwarden | 172.20.0.2 | 80 |
| pg_backup_r2 | 172.20.0.3 | — |
| postgres_vaultwarden | 172.20.0.4 | 5432 |
Part 3: k3s Standby Deployment
The k3s site runs a full Vaultwarden + PostgreSQL stack that stays in sync with Servarr via R2 restores every 15 minutes. All manifests live in `services/vaultwarden/` and are managed with Kustomize.
3.1 Kustomization
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - namespace.yaml
  - secret.yaml
  - pvc.yaml
  - postgres-deployment.yaml
  - postgres-service.yaml
  - deployment.yaml
  - service.yaml
  - ingressroute.yaml
  - backup-cronjob.yaml
  - restore-cronjob.yaml
```

3.2 Namespace
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: vaultwarden
```

3.3 SOPS-Encrypted Secret
The secret contains all sensitive values encrypted with SOPS + age:
```yaml
# services/vaultwarden/secret.yaml (decrypted view)
apiVersion: v1
kind: Secret
metadata:
  name: vaultwarden-secrets
  namespace: vaultwarden
type: Opaque
stringData:
  DOMAIN: https://vault.example.com
  SMTP_FROM: you@gmail.com
  SMTP_USERNAME: you@gmail.com
  SMTP_PASSWORD: <gmail-app-password>
  R2_ACCESS_KEY_ID: <r2-access-key>
  R2_SECRET_ACCESS_KEY: <r2-secret-key>
  PG_PASSWORD: <postgres-password>
  DATABASE_URL: postgresql://vaultwarden:<postgres-password>@postgres-vaultwarden:5432/vaultwarden
```

To encrypt with SOPS:
```bash
sops --encrypt \
  --age age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  --unencrypted-suffix _unencrypted \
  secret.yaml > secret.enc.yaml
mv secret.enc.yaml secret.yaml
```
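The deploy tooling itself isn't covered here; one way to apply the kustomization with the secret decrypted, assuming the matching age key is available via `SOPS_AGE_KEY_FILE`, is to decrypt in place, apply, and re-encrypt:

```bash
# Decrypt in place, apply the whole kustomization, then re-encrypt with the same flags as above
sops --decrypt --in-place secret.yaml
kubectl apply -k services/vaultwarden/
sops --encrypt --in-place --age age1xxxx...xxxx --unencrypted-suffix _unencrypted secret.yaml
```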
3.4 Persistent Volume Claims
Three PVCs on the `nfs-client` storageClass:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vaultwarden-data
  namespace: vaultwarden
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vaultwarden-backups
  namespace: vaultwarden
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-vaultwarden-data
  namespace: vaultwarden
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: nfs-client
  resources:
    requests:
      storage: 5Gi
```

| PVC | Size | Access Mode | Purpose |
|---|---|---|---|
| `vaultwarden-data` | 5Gi | RWX | VW /data (RSA keys, attachments, icons) |
| `vaultwarden-backups` | 10Gi | RWX | Local backup tarballs |
| `postgres-vaultwarden-data` | 5Gi | RWO | PostgreSQL data directory |
3.5 PostgreSQL Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-vaultwarden
  namespace: vaultwarden
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: postgres-vaultwarden
  template:
    metadata:
      labels:
        app: postgres-vaultwarden
    spec:
      containers:
        - name: postgres
          image: postgres:17
          ports:
            - containerPort: 5432
              name: postgres
          env:
            - name: POSTGRES_USER
              value: vaultwarden
            - name: POSTGRES_DB
              value: vaultwarden
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: PG_PASSWORD
            - name: TZ
              value: Your/Timezone
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          resources:
            limits:
              cpu: "2"
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          livenessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - vaultwarden
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - vaultwarden
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: postgres-data
          persistentVolumeClaim:
            claimName: postgres-vaultwarden-data
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-vaultwarden
  namespace: vaultwarden
spec:
  selector:
    app: postgres-vaultwarden
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP
  type: ClusterIP
```

3.6 Vaultwarden Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vaultwarden
  namespace: vaultwarden
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: vaultwarden
  template:
    metadata:
      labels:
        app: vaultwarden
    spec:
      securityContext:
        fsGroup: 100
      containers:
        - name: vaultwarden
          image: vaultwarden/server:1.35.3
          ports:
            - containerPort: 80
              name: http
          env:
            - name: SIGNUPS_ALLOWED
              value: "false"
            - name: INVITATIONS_ALLOWED
              value: "false"
            - name: TZ
              value: Your/Timezone
            - name: DOMAIN
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: DOMAIN
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: DATABASE_URL
            - name: SMTP_HOST
              value: smtp.gmail.com
            - name: SMTP_PORT
              value: "587"
            - name: SMTP_SECURITY
              value: starttls
            - name: SMTP_FROM
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: SMTP_FROM
            - name: SMTP_USERNAME
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: SMTP_USERNAME
            - name: SMTP_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: vaultwarden-secrets
                  key: SMTP_PASSWORD
          resources:
            limits:
              cpu: "4"
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
          volumeMounts:
            - name: data
              mountPath: /data
          livenessProbe:
            httpGet:
              path: /alive
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /alive
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: vaultwarden-data
```

The `/alive` endpoint returns HTTP 200 when Vaultwarden is healthy. It also verifies the database connection (the handler takes a `DbConn` parameter), so it will fail if PostgreSQL is unreachable. This is used by both the K8s probes and the Cloudflare LB health monitor.
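To hit the same endpoint the probes (and later the LB monitor) use, a port-forward plus curl is enough (a quick manual check, not part of the manifests):

```bash
kubectl -n vaultwarden port-forward deploy/vaultwarden 8080:80 &
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/alive   # expect 200
kill %1   # stop the port-forward
```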
3.7 Service and IngressRoute
```yaml
apiVersion: v1
kind: Service
metadata:
  name: vaultwarden
  namespace: vaultwarden
spec:
  selector:
    app: vaultwarden
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: vaultwarden
  namespace: vaultwarden
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`vault.example.com`)
      middlewares:
        - name: inflight-req
          namespace: traefik
        - name: retry
          namespace: traefik
      services:
        - kind: Service
          name: vaultwarden
          port: 80
```

The IngressRoute uses the `websecure` entrypoint (port 443 with TLS via ACME DNS challenge). The global sentinel + security-headers middlewares are applied automatically by Traefik’s entrypoint config. Per-route middlewares add `inflight-req` (100 concurrent connections per IP) and `retry` (3 attempts).
3.8 Restore CronJob (R2 → k3s)
This CronJob runs every 15 minutes and is the core of the standby sync:
- initContainer (`amazon/aws-cli`): downloads `vaultwarden-pg-latest.tar.gz` from R2
- main container (`postgres:17`): extracts the tarball, runs `pg_restore --clean --if-exists` into the local PG, and copies RSA keys + attachments to the Vaultwarden data PVC
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vaultwarden-restore
  namespace: vaultwarden
spec:
  schedule: "*/15 * * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: vaultwarden-restore
        spec:
          initContainers:
            - name: download
              image: amazon/aws-cli:2.27.31
              command:
                - sh
                - -c
                - |
                  echo "Downloading latest PG backup from R2..."
                  aws s3 cp s3://vault/pg-dumps/vaultwarden-pg-latest.tar.gz /restore/vaultwarden-pg-latest.tar.gz \
                    --endpoint-url https://<your-account-id>.r2.cloudflarestorage.com
                  echo "Download complete."
              env:
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: R2_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: R2_SECRET_ACCESS_KEY
                - name: AWS_DEFAULT_REGION
                  value: auto
              resources:
                limits:
                  cpu: 500m
                  memory: 128Mi
                requests:
                  cpu: 50m
                  memory: 64Mi
              volumeMounts:
                - name: restore-tmp
                  mountPath: /restore
          containers:
            - name: restore
              image: postgres:17
              command:
                - bash
                - -c
                - |
                  set -e
                  echo "Extracting backup..."
                  cd /restore
                  tar -xzf vaultwarden-pg-latest.tar.gz

                  echo "Restoring PG dump..."
                  # Drop and recreate all tables, then restore
                  pg_restore --clean --if-exists --no-owner --no-acl \
                    -d "$DATABASE_URL" \
                    /restore/vaultwarden.pgdump 2>&1 || true

                  # Copy RSA keys and data files to VW data volume
                  [ -f /restore/rsa_key.pem ] && cp /restore/rsa_key.pem /data/ && echo "Restored rsa_key.pem"
                  [ -f /restore/rsa_key.pub.pem ] && cp /restore/rsa_key.pub.pem /data/ && echo "Restored rsa_key.pub.pem"
                  [ -d /restore/attachments ] && cp -r /restore/attachments /data/ && echo "Restored attachments"
                  [ -d /restore/sends ] && cp -r /restore/sends /data/ && echo "Restored sends"

                  # Cleanup
                  rm -rf /restore/*

                  echo "Restore complete at $(date)"
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: DATABASE_URL
              resources:
                limits:
                  cpu: "1"
                  memory: 256Mi
                requests:
                  cpu: 100m
                  memory: 128Mi
              volumeMounts:
                - name: restore-tmp
                  mountPath: /restore
                - name: data
                  mountPath: /data
          restartPolicy: OnFailure
          volumes:
            - name: restore-tmp
              emptyDir: {}
            - name: data
              persistentVolumeClaim:
                claimName: vaultwarden-data
```

3.9 Backup CronJob (k3s → R2)
This CronJob runs daily and exists for failover scenarios: if Servarr goes down and k3s becomes the primary, this ensures k3s writes are backed up to R2.
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vaultwarden-backup
  namespace: vaultwarden
spec:
  schedule: "0 5 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: vaultwarden-backup
        spec:
          securityContext:
            fsGroup: 100
          initContainers:
            - name: backup
              image: postgres:17
              command:
                - bash
                - -c
                - |
                  set -e
                  TIMESTAMP=$(date +%Y%m%d-%H%M%S)
                  BACKUP_DIR=/tmp/backup

                  mkdir -p ${BACKUP_DIR}

                  echo "Running pg_dump..."
                  pg_dump -Fc --no-owner --no-acl -f ${BACKUP_DIR}/vaultwarden.pgdump "$DATABASE_URL"

                  # Copy additional data
                  [ -d /data/attachments ] && cp -r /data/attachments ${BACKUP_DIR}/
                  [ -d /data/sends ] && cp -r /data/sends ${BACKUP_DIR}/
                  [ -d /data/icon_cache ] && cp -r /data/icon_cache ${BACKUP_DIR}/
                  [ -f /data/config.json ] && cp /data/config.json ${BACKUP_DIR}/
                  [ -f /data/rsa_key.pem ] && cp /data/rsa_key.pem ${BACKUP_DIR}/
                  [ -f /data/rsa_key.pub.pem ] && cp /data/rsa_key.pub.pem ${BACKUP_DIR}/

                  # Create tarball
                  tar -czf /backups/vaultwarden-pg-${TIMESTAMP}.tar.gz -C ${BACKUP_DIR} .
                  rm -rf ${BACKUP_DIR}

                  # Prune local backups older than 30 days
                  find /backups -name "vaultwarden-pg-*.tar.gz" -mtime +30 -delete

                  echo "Backup created: vaultwarden-pg-${TIMESTAMP}.tar.gz"
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: DATABASE_URL
              resources:
                limits:
                  cpu: "1"
                  memory: 256Mi
                requests:
                  cpu: 50m
                  memory: 64Mi
              volumeMounts:
                - name: data
                  mountPath: /data
                  readOnly: true
                - name: backups
                  mountPath: /backups
          containers:
            - name: upload-r2
              image: amazon/aws-cli:2.27.31
              command:
                - sh
                - -c
                - |
                  LATEST=$(ls -t /backups/vaultwarden-pg-*.tar.gz 2>/dev/null | head -1)
                  if [ -z "$LATEST" ]; then
                    echo "No backup files found"
                    exit 1
                  fi
                  echo "Uploading $LATEST to R2..."
                  aws s3 cp "$LATEST" s3://vault/pg-dumps/ \
                    --endpoint-url https://<your-account-id>.r2.cloudflarestorage.com
                  echo "Upload complete"
              env:
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: R2_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: vaultwarden-secrets
                      key: R2_SECRET_ACCESS_KEY
                - name: AWS_DEFAULT_REGION
                  value: auto
                - name: TZ
                  value: Your/Timezone
              resources:
                limits:
                  cpu: 500m
                  memory: 128Mi
                requests:
                  cpu: 50m
                  memory: 64Mi
              volumeMounts:
                - name: backups
                  mountPath: /backups
                  readOnly: true
          restartPolicy: OnFailure
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: vaultwarden-data
            - name: backups
              persistentVolumeClaim:
                claimName: vaultwarden-backups
```

Part 4: Cloudflare Load Balancer
The Cloudflare Load Balancer owns the vault.example.com DNS record and routes traffic to the healthy origin. Both origins are Cloudflare Tunnel CNAMEs — not direct IPs — because neither site has a stable public IP.
4.1 How Tunnel-Based LB Origins Work
When using tunnels as LB origins, the pool origin address is the tunnel’s CNAME (format: `<UUID>.cfargotunnel.com`). The Host header must be set to the application hostname (vault.example.com) so the tunnel knows which ingress rule to match. The health monitor also needs this Host header.
```text
Client → CF LB (vault.example.com) → CF Tunnel CNAME → cloudflared → Traefik/Docker → Vaultwarden
```

4.2 Health Monitor
The monitor checks GET `/alive` on port 443 (HTTPS) every 60 seconds. The Host header is required because the tunnel origin is a shared CNAME that serves multiple hostnames.
resource "cloudflare_load_balancer_monitor" "vaultwarden" { account_id = var.cloudflare_account_id allow_insecure = false consecutive_down = 3 consecutive_up = 2 description = "vaultwarden" expected_codes = "200" follow_redirects = false interval = 60 method = "GET" path = "/alive" port = 443 retries = 2 timeout = 5 type = "https" header { header = "Host" values = ["vault.${var.domain_name}"] }}| Parameter | Value | Why |
|---|---|---|
| `interval` | 60s | Password manager — fast failover matters |
| `consecutive_down` | 3 | 3 failures (3 min) before marking unhealthy |
| `consecutive_up` | 2 | 2 successes before marking healthy again |
| `path` | /alive | Vaultwarden’s built-in health endpoint |
| `header.Host` | vault.example.com | Required for tunnel origin routing |
4.3 Origin Pools
Two pools: Servarr (default) and k3s (fallback).
resource "cloudflare_load_balancer_pool" "vault_servarr" { account_id = var.cloudflare_account_id check_regions = ["ALL_REGIONS"] enabled = true minimum_origins = 1 monitor = cloudflare_load_balancer_monitor.vaultwarden.id name = "vault_servarr" origins { address = module.tunnel_servarr.cname enabled = true header { header = "Host" values = ["vault.${var.domain_name}"] } name = "vault_servarr" weight = 1 }}
resource "cloudflare_load_balancer_pool" "vault_k3s" { account_id = var.cloudflare_account_id check_regions = ["ALL_REGIONS"] enabled = true minimum_origins = 1 monitor = cloudflare_load_balancer_monitor.vaultwarden.id name = "vault_k3s" origins { address = var.k3s_tunnel enabled = true header { header = "Host" values = ["vault.${var.domain_name}"] } name = "vault_k3s" weight = 1 }}The k3s_tunnel variable must contain the correct tunnel CNAME. In this setup it’s aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.cfargotunnel.com. A stale UUID here will cause the pool to show “Unknown host” / Critical in the Cloudflare dashboard.
4.4 Load Balancer Resource
```hcl
resource "cloudflare_load_balancer" "vault" {
  default_pool_ids = [cloudflare_load_balancer_pool.vault_servarr.id]
  enabled          = true
  fallback_pool_id = cloudflare_load_balancer_pool.vault_k3s.id
  name             = "vault.${var.domain_name}"
  proxied          = true
  session_affinity = "cookie"
  steering_policy  = "off"
  zone_id          = var.cloudflare_zone_id

  adaptive_routing {
    failover_across_pools = true
  }

  location_strategy {
    mode       = "pop"
    prefer_ecs = "proximity"
  }

  random_steering {
    default_weight = 1
  }

  session_affinity_attributes {
    samesite               = "Auto"
    secure                 = "Auto"
    zero_downtime_failover = "temporary"
  }
}
```

| Setting | Value | Why |
|---|---|---|
| `steering_policy` | off | No geographic steering — always Servarr unless it’s down |
| `session_affinity` | cookie | Sticky sessions prevent mid-session failover |
| `failover_across_pools` | true | If Servarr pool is unhealthy, fail over to k3s pool |
| `zero_downtime_failover` | temporary | During failover, temporarily route to fallback without waiting for full health check cycle |
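Rolling the monitor, pools, and load balancer out is a normal plan/apply in the main zone directory (assuming the same `secrets.tfvars` workflow used elsewhere in this repo):

```bash
cd cloudflare-tf/main_zone
tofu plan -var-file=secrets.tfvars
tofu apply -var-file=secrets.tfvars
```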
Part 5: Tunnel Configuration
Both sites need a tunnel ingress rule for vault.example.com. The DNS CNAME record that previously pointed to a single tunnel must be removed — the LB now owns that DNS record.
5.1 Servarr Tunnel
The Servarr tunnel module in `cloudflare-tf/main_zone/tunnels.tf` includes the vault.example.com ingress rule pointing at the Vaultwarden container’s Docker IP:
```hcl
# cloudflare-tf/main_zone/tunnels.tf (excerpt)
module "tunnel_servarr" {
  source     = "./modules/tunnel"
  account_id = var.cloudflare_account_id
  name       = "servarr"
  secret     = var.tunnel_secret

  ingress_rules = [
    # ... other services ...
    { hostname = "vault.${var.domain_name}", service = "http://172.20.0.2:80" },
    # ... other services ...
    { service = "http_status:404" },
  ]

  vnet_name = "servarr_vnet"
  route     = { network = "172.20.0.0/16" }
}
```

5.2 k3s Tunnel
The k3s tunnel in `k3s/cloudflare-tunnel-tf/tunnel_config.tf` routes vault.example.com through Traefik:
```hcl
# k3s/cloudflare-tunnel-tf/tunnel_config.tf (excerpt)
ingress_rule {
  hostname = "vault.${var.domain_name}"
  service  = "https://traefik.traefik.svc.cluster.local"

  origin_request {
    origin_server_name = "vault.${var.domain_name}"
    http2_origin       = true
  }
}
```

5.3 DNS CNAME Removal
The vault CNAME record in the k3s tunnel TF was commented out because the LB now creates and owns the DNS record:
```hcl
# vault.example.com DNS is now managed by Cloudflare Load Balancer in cloudflare-tf/main_zone
# resource "cloudflare_record" "vault" {
#   zone_id = var.cloudflare_zone_id
#   name    = "vault"
#   type    = "CNAME"
#   content = cloudflare_zero_trust_tunnel_cloudflared.k3s.cname
#   proxied = true
#   tags    = ["k3s", "vaultwarden"]
# }
```

5.4 Tunnel UUID Fix
The `k3s_tunnel` variable in `secrets.tfvars` was stale — it referenced an old tunnel UUID instead of the current one. This caused the k3s LB pool to show “Critical / Unknown host” because Cloudflare couldn’t route to a nonexistent tunnel.
```hcl
# Before (broken):
k3s_tunnel = "11111111-2222-3333-4444-555555555555.cfargotunnel.com"

# After (fixed):
k3s_tunnel = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.cfargotunnel.com"
```

To find the correct UUID, check the k3s tunnel TF state or the Cloudflare dashboard under Zero Trust → Networks → Tunnels.
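If cloudflared is installed and authenticated locally, it can also list every tunnel's UUID directly:

```bash
cloudflared tunnel list
# The ID column is the UUID that belongs in k3s_tunnel (with .cfargotunnel.com appended)
```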
Part 6: Failover Procedure
6.1 Automatic Failover (Servarr Down)
When Servarr goes down:
- LB monitor detects 3 consecutive `/alive` failures (3 min)
- Servarr pool marked unhealthy
- `failover_across_pools` kicks in, routes to the k3s fallback pool
- k3s Vaultwarden serves traffic from its last R2 restore (at most 15 min stale)
- Bitwarden clients reconnect transparently (cookie session affinity resets)
No manual intervention needed. Data staleness is bounded by the 15-minute restore interval.
6.2 Manual Failover (Planned Maintenance)
For planned Servarr maintenance:
- Trigger one final backup on Servarr: `docker exec pg_backup_r2 bash -c "backup_and_upload"`
- Wait for the k3s restore CronJob to run (or trigger it manually: `kubectl create job --from=cronjob/vaultwarden-restore manual-restore -n vaultwarden`)
- Disable the Servarr pool in the LB (set `enabled = false` in TF, or toggle it in the dashboard)
- Verify traffic is flowing through k3s: check the Traefik access logs for `vault.example.com` requests (see the log-watch sketch below)
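One way to watch those requests, assuming Traefik runs as a deployment in the `traefik` namespace with access logging enabled:

```bash
kubectl logs -n traefik deploy/traefik -f | grep vault.example.com
```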
6.3 Failback (Servarr Recovery)
When Servarr comes back:
- If k3s received writes during failover, you need to export from k3s first:
  - Trigger the k3s backup CronJob: `kubectl create job --from=cronjob/vaultwarden-backup manual-backup -n vaultwarden`
  - On Servarr, restore from the k3s-uploaded backup (a sketch follows below)
- Start the Servarr containers: `docker compose up -d`
- LB monitor detects 2 consecutive successes, re-enables the Servarr pool
- Traffic automatically returns to Servarr (it’s the default pool)
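The "restore from the k3s-uploaded backup" step isn't spelled out elsewhere in the guide. A minimal sketch, reusing the pg_backup_r2 container (which already has PG credentials, aws-cli, and the R2 endpoint in its environment), run with PostgreSQL up but Vaultwarden itself still stopped, and substituting the tarball name the k3s job just uploaded; attachments and sends from the tarball would still need to be copied into vw-data separately:

```bash
docker exec pg_backup_r2 bash -c '
  aws s3 cp s3://vault/pg-dumps/vaultwarden-pg-<k3s-timestamp>.tar.gz /tmp/failback.tar.gz \
    --endpoint-url $R2_ENDPOINT
  mkdir -p /tmp/failback && tar -xzf /tmp/failback.tar.gz -C /tmp/failback
  pg_restore --clean --if-exists --no-owner --no-acl -d vaultwarden /tmp/failback/vaultwarden.pgdump
'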
Part 7: Verification
7.1 Check R2 Backups
```bash
# From Servarr (via the pg_backup_r2 container)
docker exec pg_backup_r2 aws s3 ls s3://vault/pg-dumps/ \
  --endpoint-url https://<your-account-id>.r2.cloudflarestorage.com

# Or from any machine with aws-cli configured
aws s3 ls s3://vault/pg-dumps/ \
  --endpoint-url https://<your-account-id>.r2.cloudflarestorage.com
```

7.2 Check k3s Restore
```bash
# Check recent restore job logs
kubectl logs -n vaultwarden -l app=vaultwarden-restore --tail=50

# Verify data in the k3s PG
kubectl exec -n vaultwarden deploy/postgres-vaultwarden -- \
  psql -U vaultwarden -c "SELECT COUNT(*) FROM ciphers;"
```

7.3 Check LB Health
```bash
# Cloudflare API (or check the dashboard under Traffic → Load Balancing)
curl -s -H "Authorization: Bearer $CF_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/load_balancers" \
  | jq '.result[] | select(.name | contains("vault"))'
```

Both pools should show `healthy: true`. If the k3s pool shows “Unknown host”, the tunnel UUID in `secrets.tfvars` is wrong.
7.4 Test Failover
Temporarily stop Vaultwarden on Servarr:
```bash
ssh servarr docker stop vaultwarden
```

Wait 3-4 minutes for the LB monitor to detect the failure. Then:
```bash
curl -sI https://vault.example.com/alive
```

Should return HTTP 200, served by k3s. Restart Servarr after testing:
```bash
ssh servarr docker start vaultwarden
```

Gotchas and Lessons Learned
WebSocket Port 3012 Was Removed in v1.31.0
Prior to v1.29.0, Vaultwarden served WebSocket notifications on a separate port (3012). Reverse proxy configs often had a dedicated route for /notifications/hub → :3012. Since v1.29.0, WebSocket notifications are served natively on the main HTTP port (80), and the WEBSOCKET_ENABLED / WEBSOCKET_PORT env vars were deprecated. In v1.31.0, port 3012 support was fully removed. If you’re upgrading from an older version, remove the port 3012 mapping and any separate WebSocket proxy rules.
bruceforce/vaultwarden-backup Is a Daemon, Not a Job
The popular bruceforce/vaultwarden-backup Docker image runs cron internally as a long-lived daemon. It’s designed for docker run --restart always, not as a Kubernetes CronJob or initContainer. Trying to use it as an initContainer will either hang (waiting for the internal cron) or exit before backup completes. Building a custom backup script with pg_dump + aws s3 cp is more predictable in both Docker and K8s contexts.
pgloader and __diesel_schema_migrations
Vaultwarden uses Diesel ORM for database migrations. The __diesel_schema_migrations table tracks which migrations have been applied. When you bootstrap the PG schema by starting Vaultwarden (step 1.2), Diesel writes its migration records. If pgloader then imports the SQLite version of this table, you get duplicate keys and Vaultwarden may refuse to start or re-run migrations incorrectly. Always exclude this table via the `EXCLUDING TABLE NAMES LIKE '__diesel_schema_migrations'` clause in the load file (step 1.4).
I_REALLY_WANT_VOLATILE_STORAGE
Vaultwarden checks for a persistent /data volume at startup. If it doesn’t find one (e.g., no volume mount), it refuses to start to protect against data loss. Setting I_REALLY_WANT_VOLATILE_STORAGE=true overrides this check. Only use it for the one-time schema bootstrap — never in production.
Tunnel CNAME Staleness
Cloudflare Tunnel UUIDs are assigned at tunnel creation and never change. But if you recreate a tunnel (delete + create), the UUID changes. Any TF variable or config referencing the old UUID will silently break — the LB pool will show “Critical” because it can’t route to a tunnel that no longer exists. Always verify the UUID matches the actual tunnel:
```bash
# In the k3s tunnel TF directory:
tofu state show cloudflare_zero_trust_tunnel_cloudflared.k3s | grep cname
```

LB Monitor Needs Host Header for Tunnel Origins
When the LB pool origin is a tunnel CNAME, the health check request arrives at the tunnel with no Host header by default. The tunnel doesn’t know which ingress rule to match, so it returns 404. Adding `header { header = "Host" values = ["vault.example.com"] }` to the monitor fixes this.