# Running PostgreSQL on Kubernetes: Operators, Storage, and Production Pitfalls
The "should I run databases on Kubernetes?" debate is over. Zalando runs 4,000+ PostgreSQL clusters on Kubernetes. Bloomberg, Apple, and dozens of CNCF end-user companies do the same. The question is no longer whether it works -- it is whether your team should do it, and if so, how to do it without losing data or sleep.
This guide covers the three major PostgreSQL operators, the storage decisions that will make or break your deployment, backup strategies, connection pooling, monitoring, and a production checklist you can use before going live. I will be honest about when Kubernetes is overkill for PostgreSQL and when it genuinely earns its complexity.
## Why Run PostgreSQL on Kubernetes

The pitch is simple: manage your database the same way you manage everything else.

**GitOps for databases.** Your PostgreSQL cluster is defined in YAML, version-controlled, reviewed in pull requests. A one-line change to replica count triggers automatic provisioning. No SSH, no runbooks, no "that one person who knows how to set up replication."

**Dev/prod parity.** Every developer spins up an identical PostgreSQL cluster with `kubectl apply`. No more "works on my machine" because dev uses SQLite and prod uses PostgreSQL 16 with custom extensions.

**Automated failover.** Operators detect primary failure and promote a replica in seconds. Patroni-based operators (Zalando) achieve sub-10-second failover. CloudNativePG uses native Kubernetes primitives to do the same. Compare this to manually promoting a standby at 3 AM.

**Scaling.** Adding a read replica is a one-field change. Operators handle base backup, replication setup, and endpoint configuration automatically.

**Consistent operations.** Backups, minor version upgrades, certificate rotation, and connection pooling are declared in the same manifest as the cluster itself. One tool, one workflow.
## Why NOT to Run PostgreSQL on Kubernetes

I need to be direct here. Kubernetes adds real complexity, and not every team benefits.

**Storage is the hard part.** Databases need persistent, high-performance storage. Kubernetes was designed for stateless workloads. PersistentVolumeClaims, StorageClasses, CSI drivers, and volume topology constraints add layers between your database and its disk. A misconfigured StorageClass can silently give you network-attached storage with 10x the latency of local NVMe.

**The "pets vs cattle" mismatch.** Kubernetes treats pods as disposable cattle. Your primary database is a pet -- it has unique state, and losing it without a proper handoff means downtime or data loss. Operators bridge this gap, but they add their own complexity.

**Networking overhead.** Service meshes, network policies, and CNI plugins introduce latency and failure modes that do not exist on bare metal. A 0.5ms network hop per query adds up at 10,000 queries per second.

**Operational complexity ceiling.** When something goes wrong -- a stuck failover, a PVC that will not bind, a WAL segment that did not archive -- you need to debug both Kubernetes and PostgreSQL simultaneously. That requires two skill sets.

**Small teams should think twice.** If you have fewer than five engineers and a single PostgreSQL instance, a managed database (RDS, Cloud SQL, Supabase) will serve you better. You will pay more per month and save hundreds of hours per year.

The honest answer: run PostgreSQL on Kubernetes if your team already operates Kubernetes confidently and you need multiple PostgreSQL clusters. Otherwise, start with managed and migrate later when you hit its limits.
## The Three Major Operators
A Kubernetes operator is a controller that understands how to run a specific application. For PostgreSQL, operators handle replication, failover, backups, and connection pooling. Three operators dominate the ecosystem.
### CloudNativePG
CloudNativePG is a CNCF Sandbox project and the most actively developed PostgreSQL operator. It takes a Kubernetes-native approach: no external dependencies for high availability, no Patroni, no etcd. Failover uses Kubernetes primitives (pod readiness, leader election) directly.
Key strengths:
- Cleanest Kubernetes integration -- feels like a native resource
- WAL archiving and backup to S3/GCS/Azure built in (via Barman Cloud)
- Declarative tablespace support
- Plugin architecture for extensions
- No sidecar containers needed for HA
Minimal cluster definition:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-app-db
spec:
  instances: 3
  storage:
    size: 50Gi
    storageClass: fast-ssd
  postgresql:
    parameters:
      shared_buffers: "256MB"
      effective_cache_size: "768MB"
      work_mem: "8MB"
      max_connections: "200"
  bootstrap:
    initdb:
      database: myapp
      owner: myapp
  backup:
    barmanObjectStore:
      destinationPath: "s3://my-backups/cnpg/"
      s3Credentials:
        accessKeyId:
          name: aws-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: SECRET_ACCESS_KEY
      wal:
        compression: gzip
    retentionPolicy: "14d"
```
That is a three-node PostgreSQL cluster with streaming replication, automated failover, continuous WAL archiving to S3, and 14-day backup retention. In roughly 30 lines of YAML.
Install:
```shell
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.25/releases/cnpg-1.25.0.yaml
```
### Zalando Postgres Operator

Battle-tested at Zalando, where it manages 4,000+ PostgreSQL clusters serving one of Europe's largest e-commerce platforms. It uses Patroni for high availability -- the same tool that powers PostgreSQL HA at GitLab and dozens of other organizations outside Kubernetes.
Key strengths:
- Patroni HA is the most proven PostgreSQL failover tool
- Spilo container image bundles PostgreSQL + Patroni + WAL-G + PgBouncer
- Connection pooling built in
- Team-based access control (maps Kubernetes namespaces to PostgreSQL roles)
- Clone from S3 backup or existing cluster
Cluster definition:
```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: my-app-db
spec:
  teamId: "myteam"
  numberOfInstances: 3
  volume:
    size: 50Gi
    storageClass: fast-ssd
  postgresql:
    version: "16"
    parameters:
      shared_buffers: "256MB"
      effective_cache_size: "768MB"
      work_mem: "8MB"
  users:
    myapp:
      - superuser
      - createdb
  databases:
    myapp: myapp
  patroni:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    synchronous_mode: false
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 4Gi
```
Install:
```shell
kubectl create namespace postgres-operator
helm repo add postgres-operator-charts \
  https://opensource.zalando.com/postgres-operator/charts/postgres-operator
helm install postgres-operator postgres-operator-charts/postgres-operator \
  -n postgres-operator
```
### Crunchy PGO (v5)
CrunchyData's operator, commercially backed with enterprise support available. It integrates deeply with pgBackRest for backup and pgMonitor for Prometheus-based monitoring.
Key strengths:
- pgBackRest integration is the most mature -- differential backups, multi-repository, backup verification
- Built-in Prometheus exporter (pgMonitor/postgres_exporter)
- PgBouncer sidecar managed declaratively
- Enterprise support option
- Extensive documentation
Cluster definition:
```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: my-app-db
spec:
  postgresVersion: 16
  instances:
    - name: pgha
      replicas: 3
      dataVolumeClaimSpec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
  backups:
    pgbackrest:
      repos:
        - name: repo1
          s3:
            bucket: my-backups
            endpoint: s3.amazonaws.com
            region: us-east-1
          schedules:
            full: "0 1 * * 0"           # Weekly full (Sunday 01:00)
            differential: "0 1 * * 1-6" # Daily differential (Mon-Sat 01:00)
  proxy:
    pgBouncer:
      replicas: 2
      config:
        global:
          pool_mode: transaction
          default_pool_size: "20"
```
Install:
```shell
kubectl apply -k \
  github.com/CrunchyData/postgres-operator-examples/kustomize/install/namespace
kubectl apply --server-side -k \
  github.com/CrunchyData/postgres-operator-examples/kustomize/install/default
```
## Storage: The Decision That Matters Most
Storage is where PostgreSQL on Kubernetes succeeds or fails. Get this wrong and no amount of operator tuning will save you.
### StorageClass selection
Your StorageClass determines the actual disk type underneath PersistentVolumeClaims. The defaults on most cloud providers are network-attached disks (EBS gp3, GCE pd-ssd). These work, but understand the tradeoffs:
| Storage Type | Latency | Throughput | Survives Node Failure | Cost |
|---|---|---|---|---|
| Local NVMe | ~0.1ms | 3+ GB/s | No | Lowest |
| EBS io2 | ~0.5ms | 4 GB/s | Yes | High |
| EBS gp3 | ~1ms | 1 GB/s | Yes | Medium |
| GCE pd-ssd | ~1ms | 1.2 GB/s | Yes | Medium |
Local NVMe gives the best performance but does not survive node failure. Use it only if your operator handles re-cloning from backup automatically (CloudNativePG and Crunchy PGO both support this).
Network-attached SSD (gp3, pd-ssd) is the safe default. The latency overhead is real but acceptable for most workloads. Enable volumeBindingMode: WaitForFirstConsumer on your StorageClass to ensure volumes are provisioned in the same zone as the pod.
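A minimal StorageClass with this binding mode might look like the following sketch. It assumes the AWS EBS CSI driver and gp3 volumes; the name `fast-ssd` and the IOPS/throughput figures are illustrative, not recommendations:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd            # matches the storageClass in the cluster examples
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "6000"              # gp3 lets you provision IOPS independently of size
  throughput: "500"         # MB/s
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true  # lets you grow PVCs without recreating the cluster
```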
### WAL on a separate volume
Write-ahead log (WAL) writes are sequential and latency-sensitive. Data writes are random. Mixing them on a single volume means they compete for IOPS. CloudNativePG supports a dedicated WAL volume:
```yaml
spec:
  storage:
    size: 50Gi
    storageClass: fast-ssd
  walStorage:
    size: 10Gi
    storageClass: fast-ssd
```
This is not mandatory, but for write-heavy workloads it can reduce commit latency by 30-50%.
### Benchmark before you commit
Run pgbench on your actual StorageClass before committing to production:
```shell
# Initialize pgbench
kubectl exec -it my-app-db-1 -- pgbench -i -s 100 myapp

# Run a mixed read-write benchmark
kubectl exec -it my-app-db-1 -- pgbench -c 20 -j 4 -T 300 myapp
```
Compare the transactions-per-second and latency numbers against your application requirements. If your p99 query time is 50ms and storage adds 5ms of latency, that is acceptable. If your p99 is 5ms and storage adds 5ms, you have a problem.
## Backup and Recovery
Losing a database on Kubernetes is no different from losing it anywhere else -- catastrophic. Every operator supports automated backups, but the implementations differ.
pgBackRest (Crunchy PGO) is the most feature-rich: full, differential, and incremental backups, parallel backup and restore, backup verification, multi-repository support. It can back up to S3, GCS, Azure, or a local volume simultaneously.
Barman Cloud (CloudNativePG) handles continuous WAL archiving and base backups to object storage. Simpler than pgBackRest but covers the critical path: point-in-time recovery from any moment in the retention window.
WAL-G (Zalando) is integrated into the Spilo image. Handles WAL archiving and base backups to S3/GCS. Proven at Zalando's scale.
Regardless of which operator you use, verify these are working:
- WAL archiving is continuous. Check for gaps. A single missing WAL segment breaks point-in-time recovery.
- Test restores regularly. A backup you have never restored is not a backup. Schedule monthly restore tests.
- Monitor backup age. If your last successful backup is older than your RPO, you have a problem right now.
```shell
# CloudNativePG: check backup status
kubectl get backups -n my-namespace

# Zalando: check WAL-G backup list
kubectl exec my-app-db-0 -- envdir /run/etc/wal-e.d/env wal-g backup-list

# Crunchy PGO: check pgBackRest info
kubectl exec my-app-db-pgha-0 -- pgbackrest info
```
## Connection Pooling
Kubernetes amplifies the connection problem. Pod restarts, rolling deployments, and horizontal scaling all churn connections. You need a pooler between your application pods and PostgreSQL.
Crunchy PGO manages PgBouncer as a first-class resource. Declare it in the PostgresCluster spec and the operator handles deployment, configuration, and TLS certificates. This is the cleanest integration.
Zalando Operator includes PgBouncer in the Spilo image and can deploy a separate connection pooler resource. Configuration is via the connectionPooler spec.
CloudNativePG supports PgBouncer through its pooler resource:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: my-app-db-pooler-rw
spec:
  cluster:
    name: my-app-db
  instances: 2
  type: rw
  pgbouncer:
    poolMode: transaction
    parameters:
      default_pool_size: "25"
      max_client_conn: "200"
```
Transaction pooling mode is the right default for most applications. It multiplexes many client connections over fewer PostgreSQL backend connections by reassigning backends between transactions. Session pooling is only needed if your application uses prepared statements, session-level advisory locks, or temporary tables outside explicit transactions.
## Monitoring PostgreSQL on Kubernetes
Kubernetes adds a layer of complexity to monitoring. You need to monitor both the PostgreSQL instance and the Kubernetes resources, and the failure modes of each can look identical from the application's perspective -- a slow query could be a missing index or a throttled pod hitting its CPU limit.
Your Kubernetes monitoring stack (Prometheus, Grafana, Datadog) handles pod health, resource utilization, PVC capacity, and node pressure. But these tools have a blind spot: they see PostgreSQL as a black box. They can tell you a pod is using 90% of its CPU limit but not whether that is caused by sequential scans on a table missing an index.
Tools like myDBA.dev focus on the PostgreSQL layer -- health checks that catch configuration drift and anti-patterns, query performance analysis, index recommendations, EXPLAIN plan tracking and regression detection, and replication monitoring. This is the kind of insight you need regardless of whether PostgreSQL runs on Kubernetes, bare metal, or a managed service.
The monitoring setup that works in practice is layered:
- Kubernetes layer: Prometheus + node-exporter + kube-state-metrics for pod, node, and PVC metrics
- PostgreSQL metrics: postgres_exporter (Crunchy PGO includes this) for pg_stat_statements, replication lag, connection counts
- PostgreSQL intelligence: myDBA.dev for health checks, index advisor, EXPLAIN plan analysis, and query-level performance tracking
Each operator also exposes its own metrics. CloudNativePG has a built-in Prometheus endpoint. Zalando's operator exposes metrics for cluster state and failover events. Set up alerts for:
- Replication lag exceeding 30 seconds
- WAL archiving failure (backup is broken)
- Connection utilization above 80%
- PVC usage above 85%
- Pod restart count increasing (crash loop)
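The replication-lag alert can be expressed as a PrometheusRule. This is a sketch assuming the Prometheus Operator is installed and CloudNativePG's metrics endpoint is scraped; verify the metric name (`cnpg_pg_replication_lag`, reported in seconds) against your exporter's actual output before relying on it:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: postgres-alerts
spec:
  groups:
    - name: postgresql
      rules:
        - alert: PGReplicationLagHigh
          expr: cnpg_pg_replication_lag > 30
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Replication lag above 30s on {{ $labels.pod }}"
```

The same pattern extends to WAL-archiving failures and connection saturation with the corresponding metrics.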
## Operator Comparison
| Feature | CloudNativePG | Zalando | Crunchy PGO |
|---|---|---|---|
| HA mechanism | Native K8s primitives | Patroni + DCS | Patroni-based |
| Backup tool | Barman Cloud | WAL-G | pgBackRest |
| Connection pooler | PgBouncer (Pooler CRD) | PgBouncer (built-in) | PgBouncer (managed) |
| PG versions | 12-17 | 12-16 | 13-16 |
| CNCF status | Sandbox project | Community | Community |
| Commercial support | EDB | Zalando (internal) | CrunchyData |
| Monitoring | Prometheus endpoint | Prometheus endpoint | pgMonitor built-in |
| Declarative users | Yes (via secrets) | Yes (team-based) | Yes (via secrets) |
| Tablespace support | Yes | No | Yes |
| WAL volume | Dedicated volume | Shared volume | Dedicated volume |
| Sidecar approach | Minimal (no HA sidecar) | Patroni + WAL-G | Patroni + pgBackRest |
| Learning curve | Moderate | Moderate | Steeper |
My recommendation: CloudNativePG for new deployments. It has the most active development, the cleanest Kubernetes integration, and CNCF backing gives confidence in long-term viability. Zalando's operator is the right choice if you are already invested in Patroni or need its team-based access model. Crunchy PGO wins if pgBackRest's advanced backup features are critical or you need commercial support.
## Getting Started: Minimal CloudNativePG Setup
The fastest path from zero to a running PostgreSQL cluster on Kubernetes:
```shell
# 1. Install the operator
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.25/releases/cnpg-1.25.0.yaml

# 2. Wait for the operator to be ready
kubectl wait --for=condition=Available deployment/cnpg-controller-manager \
  -n cnpg-system --timeout=120s
```

Save the following as `cluster.yaml`:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-first-cluster
spec:
  instances: 3
  storage:
    size: 10Gi
  bootstrap:
    initdb:
      database: app
      owner: app
  postgresql:
    parameters:
      shared_buffers: "128MB"
      log_statement: "ddl"
      log_min_duration_statement: "1000"
```

```shell
# 3. Apply and wait
kubectl apply -f cluster.yaml
kubectl wait --for=condition=Ready cluster/my-first-cluster --timeout=300s

# 4. Connect
kubectl exec -it my-first-cluster-1 -- psql -U app -d app
```
That gives you a three-instance PostgreSQL cluster with streaming replication and automated failover. No Helm charts, no complex configuration.
To expose it to your application pods:
```yaml
# The operator creates these services automatically:
#   my-first-cluster-rw -> always points to the primary
#   my-first-cluster-ro -> load-balances across replicas
#   my-first-cluster-r  -> load-balances across all instances

# In your application deployment:
env:
  - name: DATABASE_URL
    value: "postgresql://app:$(PASSWORD)@my-first-cluster-rw:5432/app"
```
## Production Checklist
Before going live with PostgreSQL on Kubernetes, verify every item:
### Storage

- [ ] StorageClass uses SSD-backed volumes (not default HDD)
- [ ] `volumeBindingMode: WaitForFirstConsumer` is set
- [ ] PVC size accounts for 2x expected data (WAL, temp files, bloat)
- [ ] WAL on separate volume for write-heavy workloads
- [ ] Benchmarked with pgbench on actual StorageClass
### Resources

- [ ] CPU and memory requests set (not just limits)
- [ ] Memory limit is at least `shared_buffers + work_mem * max_connections + 512MB`
- [ ] CPU requests reflect actual steady-state usage
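As a quick sanity check of that memory formula, using the parameter values from the CloudNativePG example earlier in the article:

```python
# Sanity-check the checklist's memory formula with the values from the
# CloudNativePG example: shared_buffers=256MB, work_mem=8MB, max_connections=200.
MB = 1024 * 1024

shared_buffers = 256 * MB
work_mem = 8 * MB
max_connections = 200
overhead = 512 * MB  # OS, WAL buffers, autovacuum, background workers

# Worst case: every connection uses a full work_mem allocation at once.
memory_floor = shared_buffers + work_mem * max_connections + overhead
print(f"memory limit should be at least {memory_floor // MB}MB")  # 2368MB
```

Which explains the 4Gi limit in the Zalando example: the formula gives a floor, and queries with multiple sort or hash nodes can use several `work_mem` allocations each, so round up generously.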
### High Availability

- [ ] At least 3 instances (primary + 2 replicas)
- [ ] PodDisruptionBudget allows at most 1 unavailable
- [ ] Pod anti-affinity spreads instances across nodes
- [ ] Tested failover by deleting the primary pod

```yaml
# PodDisruptionBudget example (CloudNativePG creates this automatically)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-db
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      cnpg.io/cluster: my-app-db
```
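For the anti-affinity item, CloudNativePG declares it directly on the Cluster spec. A sketch; note that `required` means the scheduler refuses to co-locate two instances, which can leave pods Pending on clusters with fewer nodes than instances:

```yaml
spec:
  affinity:
    enablePodAntiAffinity: true
    topologyKey: kubernetes.io/hostname
    podAntiAffinityType: required  # use "preferred" on small clusters
```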
### Backup
- [ ] WAL archiving to object storage (S3/GCS) is active
- [ ] Base backup schedule is configured (daily minimum)
- [ ] Backup retention meets your compliance requirements
- [ ] Tested point-in-time recovery from backup
- [ ] Backup monitoring alerts are in place
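With CloudNativePG, the base backup schedule from this checklist is a `ScheduledBackup` resource referencing the cluster's configured object store; note the six-field cron format (seconds first):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: my-app-db-daily
spec:
  schedule: "0 0 2 * * *"  # six-field cron: daily at 02:00
  backupOwnerReference: self
  cluster:
    name: my-app-db
```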
### Security

- [ ] TLS enabled for client connections
- [ ] Database credentials stored in Kubernetes Secrets (or external secret manager)
- [ ] Network policies restrict PostgreSQL port access to application namespaces
- [ ] `pg_hba.conf` restricts authentication methods
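A network policy for the third item might look like this sketch, assuming CloudNativePG's pod labels and a hypothetical `my-app` application namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-postgres
spec:
  podSelector:
    matchLabels:
      cnpg.io/cluster: my-app-db
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: my-app  # hypothetical app namespace
      ports:
        - protocol: TCP
          port: 5432
```

Be careful: the operator and the instances themselves also need to reach port 5432 (health checks, replication), so add rules for the operator namespace and intra-cluster traffic or the cluster will stop reconciling.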
### Monitoring
- [ ] PostgreSQL metrics exported to Prometheus
- [ ] Alerts for replication lag, connection saturation, PVC usage
- [ ] PostgreSQL-level monitoring for query performance, health checks, index recommendations (myDBA.dev works with any PostgreSQL deployment including Kubernetes)
- [ ] Log aggregation from PostgreSQL pods
### Connection Pooling

- [ ] PgBouncer deployed (operator-managed or sidecar)
- [ ] Pool mode set to `transaction` (unless session features required)
- [ ] Application connects through pooler, not directly to PostgreSQL
- [ ] Pool size tuned: `default_pool_size * pooler_replicas < max_connections`
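The pool-size rule is worth checking in the worst case, where every pooler replica fills its pool simultaneously. The reserved-connection margin below is an assumption (PostgreSQL reserves a few superuser connections, and operators hold some for health checks):

```python
def pool_fits(default_pool_size: int, pooler_replicas: int,
              max_connections: int, reserved: int = 10) -> bool:
    """True if all pooler replicas at full capacity stay under
    max_connections minus a safety margin for superuser/operator use."""
    return default_pool_size * pooler_replicas < max_connections - reserved

# Values from the CloudNativePG Pooler example: 25 * 2 = 50 backends.
print(pool_fits(25, 2, 200))   # True
print(pool_fits(100, 3, 200))  # False: 300 backends would exhaust the server
```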
## Final Thoughts
Running PostgreSQL on Kubernetes is not inherently better or worse than running it on VMs or using a managed service. It is a tradeoff: you gain consistency and automation, you pay with storage complexity and a steeper debugging surface.
If your team already runs Kubernetes and manages multiple PostgreSQL instances, an operator will reduce your operational burden. If you are a small team with one database, use RDS or Cloud SQL and spend your engineering time on your product.
For teams that do take the Kubernetes path, start with CloudNativePG. It is the simplest to set up, the most actively maintained, and it does not carry the dependency weight of a full Patroni stack. Get your storage right, test your backups, and layer your monitoring so you can see both the Kubernetes and PostgreSQL perspectives when something goes wrong.
The operators have matured to the point where "can I run PostgreSQL on Kubernetes?" is no longer the right question. The right question is: "does my team have the Kubernetes expertise to debug a stuck PVC at 3 AM?" If yes, welcome aboard.