I'm starting a series of blog posts to explore CloudNativePG (CNPG), a Kubernetes operator for PostgreSQL that automates high availability in containerized environments.
PostgreSQL itself supports physical streaming replication, but doesn’t provide orchestration logic — no automatic promotion, scaling, or failover. Tools like Patroni fill that gap by implementing consensus (etcd, Consul, ZooKeeper, Kubernetes, or Raft) for cluster state management. In Kubernetes, databases are often deployed with StatefulSets, which provide stable network identities and persistent storage per instance. CloudNativePG instead defines PostgreSQL‑specific CustomResourceDefinitions (CRDs), which introduce the following resources:
- ImageCatalog: PostgreSQL image catalogs
- Cluster: PostgreSQL cluster definition
- Database: Declarative database management
- Pooler: PgBouncer connection pooling
- Backup: On-demand backup requests
- ScheduledBackup: Automated backup scheduling
- Publication: Logical replication publications
- Subscription: Logical replication subscriptions
Install: control plane for PostgreSQL
Here I’m using CNPG 1.28, the first release to support quorum-based failover. Prior versions promoted the most recently available standby without a guarantee against data loss (fine for disaster recovery, but not for strict high availability).
Install the operator’s components:
kubectl apply --server-side -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.28/releases/cnpg-1.28.0.yaml
The manifest installs the CRDs and deploys the controller into the cnpg-system namespace. Check the rollout status:
kubectl rollout status deployment -n cnpg-system cnpg-controller-manager
deployment "cnpg-controller-manager" successfully rolled out
This Deployment defines the CloudNativePG Controller Manager — the control plane component — which runs as a single pod and continuously reconciles PostgreSQL cluster resources with their desired state via the Kubernetes API:
kubectl get deployments -n cnpg-system -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
cnpg-controller-manager 1/1 1 1 11d manager ghcr.io/cloudnative-pg/cloudnative-pg:1.28.0 app.kubernetes.io/name=cloudnative-pg
The pod’s containers listen on ports for metrics (8080/TCP) and webhook configuration (9443/TCP), and interact with CNPG’s CRDs during the reconciliation loop:
kubectl describe deploy -n cnpg-system cnpg-controller-manager
Name: cnpg-controller-manager
Namespace: cnpg-system
CreationTimestamp: Thu, 15 Jan 2026 21:04:25 +0100
Labels: app.kubernetes.io/name=cloudnative-pg
Annotations: deployment.kubernetes.io/revision: 1
Selector: app.kubernetes.io/name=cloudnative-pg
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app.kubernetes.io/name=cloudnative-pg
Service Account: cnpg-manager
Containers:
manager:
Image: ghcr.io/cloudnative-pg/cloudnative-pg:1.28.0
Ports: 8080/TCP (metrics), 9443/TCP (webhook-server)
Host Ports: 0/TCP (metrics), 0/TCP (webhook-server)
SeccompProfile: RuntimeDefault
Command:
/manager
Args:
controller
--leader-elect
--max-concurrent-reconciles=10
--config-map-name=cnpg-controller-manager-config
--secret-name=cnpg-controller-manager-config
--webhook-port=9443
Limits:
cpu: 100m
memory: 200Mi
Requests:
cpu: 100m
memory: 100Mi
Liveness: http-get https://:9443/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:9443/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
Startup: http-get https://:9443/readyz delay=0s timeout=1s period=5s #success=1 #failure=6
Environment:
OPERATOR_IMAGE_NAME: ghcr.io/cloudnative-pg/cloudnative-pg:1.28.0
OPERATOR_NAMESPACE: (v1:metadata.namespace)
MONITORING_QUERIES_CONFIGMAP: cnpg-default-monitoring
Mounts:
/controller from scratch-data (rw)
/run/secrets/cnpg.io/webhook from webhook-certificates (rw)
Volumes:
scratch-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
webhook-certificates:
Type: Secret (a volume populated by a Secret)
SecretName: cnpg-webhook-cert
Optional: true
Node-Selectors: <none>
Tolerations: <none>
Conditions:
Type Status Reason
---- ------ ------
Progressing True NewReplicaSetAvailable
Available True MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet: cnpg-controller-manager-6b9f78f594 (1/1 replicas created)
Events: <none>
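As a quick check of the metrics port listed above, you can port-forward to the deployment and read the Prometheus endpoint. This is a sketch (background the port-forward or run it in a separate terminal, and expect metric names to vary across versions):
kubectl -n cnpg-system port-forward deploy/cnpg-controller-manager 8080:8080 &
curl -s http://localhost:8080/metrics | head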
Deploy: data plane (PostgreSQL cluster)
The control plane handles orchestration logic. The actual PostgreSQL instances — the data plane — are managed via CNPG’s Cluster custom resource.
Create a dedicated namespace:
kubectl delete namespace lab
kubectl create namespace lab
namespace/lab created
Here’s a minimal high-availability cluster spec:
- 3 instances: 1 primary, 2 hot standby replicas
- Synchronous commit to 1 replica
- Quorum-based failover enabled
cat > lab-cluster-rf3.yaml <<'YAML'
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cnpg
spec:
  instances: 3
  postgresql:
    synchronous:
      method: any
      number: 1
      failoverQuorum: true
  storage:
    size: 1Gi
YAML
kubectl -n lab apply -f lab-cluster-rf3.yaml
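While the operator bootstraps the instances, you can watch the cluster converge; the optional cnpg kubectl plugin, if installed, gives a more detailed view:
kubectl -n lab get cluster cnpg -w
kubectl cnpg status cnpg -n lab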
CNPG provisions Pods with stateful semantics, using PersistentVolumeClaims for storage:
kubectl -n lab get pvc -o wide
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE VOLUMEMODE
cnpg-1 Bound pvc-76754ba4-e8bd-4218-837f-36aa0010940f 1Gi RWO hostpath <unset> 42s Filesystem
cnpg-2 Bound pvc-3b231dcc-b973-43f8-a429-80222bd51420 1Gi RWO hostpath <unset> 26s Filesystem
cnpg-3 Bound pvc-b8e4c6a0-bbcb-445d-9267-ffe38a1a8685 1Gi RWO hostpath <unset> 10s Filesystem
These PVCs bind to PersistentVolumes provided by their storage class:
kubectl -n lab get pv -o wide
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE VOLUMEMODE
pvc-3b231dcc-b973-43f8-a429-80222bd51420 1Gi RWO Delete Bound lab/cnpg-2 hostpath <unset> 53s Filesystem
pvc-76754ba4-e8bd-4218-837f-36aa0010940f 1Gi RWO Delete Bound lab/cnpg-1 hostpath <unset> 69s Filesystem
pvc-b8e4c6a0-bbcb-445d-9267-ffe38a1a8685 1Gi RWO Delete Bound lab/cnpg-3 hostpath <unset> 37s Filesystem
The PostgreSQL instances run in pods:
kubectl -n lab get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cnpg-1 1/1 Running 0 3m46s 10.1.0.141 docker-desktop <none> <none>
cnpg-2 1/1 Running 0 3m29s 10.1.0.143 docker-desktop <none> <none>
cnpg-3 1/1 Running 0 3m13s 10.1.0.145 docker-desktop <none> <none>
In Kubernetes, pods are typically interchangeable, but PostgreSQL has a single read-write primary while the other instances serve as read replicas. CNPG identifies which pod runs the primary instance:
kubectl -n lab get cluster
NAME AGE INSTANCES READY STATUS PRIMARY
cnpg 4m 3 3 Cluster in healthy state cnpg-1
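The role is also exposed as a pod label (the same one the services below select on), so it can be displayed directly:
kubectl -n lab get pods -L cnpg.io/instanceRole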
As pod roles can change with a switchover or failover, applications access the database through Services that always route to the right instances:
kubectl -n lab get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
cnpg-r ClusterIP 10.97.182.192 <none> 5432/TCP 4m13s cnpg.io/cluster=cnpg,cnpg.io/podRole=instance
cnpg-ro ClusterIP 10.111.116.164 <none> 5432/TCP 4m13s cnpg.io/cluster=cnpg,cnpg.io/instanceRole=replica
cnpg-rw ClusterIP 10.108.19.85 <none> 5432/TCP 4m13s cnpg.io/cluster=cnpg,cnpg.io/instanceRole=primary
Those are the endpoints used to connect to PostgreSQL:
- cnpg-rw connects to the primary for consistent reads and writes
- cnpg-ro connects to one of the standbys for possibly stale reads
- cnpg-r connects to the primary or a standby for possibly stale reads
Load balancing of read workloads across the matching pods is round-robin, like a client-side host list, so the same workload runs on all replicas.
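For an application, these services are ordinary Kubernetes DNS names. For example, a client in another namespace would use a connection string like this, with the app user and database that CNPG creates by default (the password comes from the cnpg-app secret shown in the next section):
postgresql://app@cnpg-rw.lab.svc.cluster.local:5432/app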
Client access setup
CNPG generated credentials in a Kubernetes Secret named cnpg-app for the user app:
kubectl -n lab get secrets
NAME TYPE DATA AGE
cnpg-app kubernetes.io/basic-auth 11 8m48s
cnpg-ca Opaque 2 8m48s
cnpg-replication kubernetes.io/tls 2 8m48s
cnpg-server kubernetes.io/tls 2 8m48s
When needed, the password can be retrieved with kubectl -n lab get secret cnpg-app -o jsonpath='{.data.password}' | base64 -d.
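The same secret also carries ready-made connection strings (assuming the uri key that recent CNPG versions include among those 11 data entries):
kubectl -n lab get secret cnpg-app -o jsonpath='{.data.uri}' | base64 -d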
Define a shell alias to launch a PostgreSQL client pod with these credentials:
alias pgrw='kubectl -n lab run client --rm -it --restart=Never \
--env PGHOST="cnpg-rw" \
--env PGUSER="app" \
--env PGPASSWORD="$(kubectl -n lab get secret cnpg-app -o jsonpath='{.data.password}' | base64 -d)" \
--image=postgres:18 --'
Use the alias pgrw to run a PostgreSQL client connected to the primary.
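A similar alias pointing at cnpg-ro (a variant I'll call pgro) makes it easy to verify that reads land on a standby:
alias pgro='kubectl -n lab run client --rm -it --restart=Never \
 --env PGHOST="cnpg-ro" \
 --env PGUSER="app" \
 --env PGPASSWORD="$(kubectl -n lab get secret cnpg-app -o jsonpath='{.data.password}' | base64 -d)" \
 --image=postgres:18 --'
pgro psql -c 'select pg_is_in_recovery()'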
PgBench default workload
With the previous alias defined, initialize PgBench tables:
pgrw pgbench -i
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 0.10 s (drop tables 0.02 s, create tables 0.01 s, client-side generate 0.04 s, vacuum 0.01 s, primary keys 0.01 s).
pod "client" deleted from lab namespace
Run for 10 minutes with progress every 5 seconds:
pgrw pgbench -T 600 -P 5
progress: 5.0 s, 1541.4 tps, lat 0.648 ms stddev 0.358, 0 failed
progress: 10.0 s, 1648.6 tps, lat 0.606 ms stddev 0.154, 0 failed
progress: 15.0 s, 1432.7 tps, lat 0.698 ms stddev 0.218, 0 failed
progress: 20.0 s, 1581.3 tps, lat 0.632 ms stddev 0.169, 0 failed
progress: 25.0 s, 1448.2 tps, lat 0.690 ms stddev 0.315, 0 failed
progress: 30.0 s, 1640.6 tps, lat 0.609 ms stddev 0.155, 0 failed
progress: 35.0 s, 1609.9 tps, lat 0.621 ms stddev 0.223, 0 failed
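While the bench is running, the synchronous/quorum state is visible from the primary; for example, executing psql inside the primary pod (a sketch, assuming local peer authentication in the container):
kubectl -n lab exec cnpg-1 -- psql -c 'select application_name, state, sync_state from pg_stat_replication'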
Simulated failure
In another terminal, I checked which pod is the primary:
kubectl -n lab get cluster
NAME AGE INSTANCES READY STATUS PRIMARY
cnpg 40m 3 3 Cluster in healthy state cnpg-1
From the Docker Desktop GUI, I paused the container in the primary's pod.
PgBench queries hung because the primary they were connected to stopped replying.
The pod then recovered, and PgBench continued without being disconnected.
Kubernetes monitors pod health with liveness/readiness probes and restarts containers when those probes fail. In this case, Kubernetes—not CNPG—restored the service.
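Those probes are visible on the instance pods; for example, to dump the liveness probe of the primary (jsonpath output, first container assumed):
kubectl -n lab get pod cnpg-1 -o jsonpath='{.spec.containers[0].livenessProbe}'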
Meanwhile, CNPG, which monitors PostgreSQL independently, had triggered a failover before Kubernetes restarted the pod:
kubectl -n lab get cluster
NAME AGE INSTANCES READY STATUS PRIMARY
cnpg 3m6s 3 2 Failing over cnpg-1
Kubernetes brought the service back in about 30 seconds, but CNPG had already initiated a failover, so another outage was coming.
A few minutes later, cnpg-1 restarted and PgBench exited with:
WARNING: canceling the wait for synchronous replication and terminating connection due to administrator command
DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.
pgbench: error: client 0 aborted in command 10 (SQL) of script 0; perhaps the backend died while processing
Because cnpg-1 was still there and healthy, it remained the primary, but all connections were terminated.
Observations
This test shows how PostgreSQL and Kubernetes interact under CloudNativePG. Kubernetes pod health checks and CloudNativePG’s failover logic each run their own control loop:
- Kubernetes restarts containers when liveness or readiness probes fail.
- CloudNativePG (CNPG) evaluates database health using replication state, quorum, and instance manager connectivity.
Pausing the container briefly triggered CNPG’s primary isolation check. When the primary loses contact with both the Kubernetes API and other cluster members, CNPG shuts it down to prevent split-brain. Timeline:
- T+0s — Primary paused. CNPG detects isolation.
- T+30s — Kubernetes restarts the container.
- T+180s — CNPG triggers failover.
- T+275s — Primary shutdown terminates client connections.
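This sequence can be reconstructed after the fact from the cluster events and the operator log; the grep filter below is only illustrative:
kubectl -n lab get events --sort-by=.lastTimestamp
kubectl -n cnpg-system logs deploy/cnpg-controller-manager | grep -i failover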
Because CNPG and Kubernetes act on different timelines, the original pod restarted as primary (“self-failover”) when no replica was a better promotion candidate. CNPG prioritizes data integrity over fast recovery and, without a consensus protocol like Raft, relies on:
- Kubernetes API state
- PostgreSQL streaming replication
- Instance manager health checks
This can cause false positives under transient faults but protects against split-brain. Reproducible steps:
https://github.com/cloudnative-pg/cloudnative-pg/discussions/9814
Cloud systems can fail in many ways. In this test, I used docker pause to freeze processes and simulate a primary that stops responding to clients and health checks. This mirrors a previous test I did with Yugabyte: YugabyteDB Recovery Time Objective (RTO) with PgBench: continuous availability with max. 15s latency on infrastructure failure
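For reference, the same pause can be scripted instead of using the GUI; a sketch assuming Docker Desktop's single node, where the pod's container is visible to the host docker CLI:
docker ps --filter name=cnpg-1 --format '{{.ID}} {{.Names}}'
docker pause <container-id>   # freeze all processes in the primary's container
docker unpause <container-id> # resume it later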
This post starts a CNPG series where I will also cover failures like network partitions and storage issues, and the connection pooler.


