Running PostgreSQL in containers is one of the smartest infrastructure decisions a team can make — until the day you need to upgrade across a major version.
Then it becomes one of the most painful ones.
This post walks through why major PostgreSQL upgrades are uniquely hard in containerized environments, the common approaches teams reach for (and why they hurt), and how pg-upgrade — a Docker-native upgrade toolkit — turns a weekend-long migration into a reproducible, CI-validated three-step process.
## Why Containerized PostgreSQL in the First Place?
Before diving into the upgrade problem, it's worth understanding why teams choose self-managed containerized PostgreSQL over managed services like Amazon RDS, Aurora, or Google Cloud SQL.
The cost difference is stark:
| Setup | Monthly cost (50 GB, 4 vCPU, 16 GB RAM) |
|---|---|
| Amazon RDS PostgreSQL (db.r6g.xlarge) | ~$400–600/month |
| Aurora PostgreSQL (db.r6g.xlarge) | ~$500–700/month |
| Self-managed PostgreSQL on EKS (m6g.xlarge node) | ~$120–180/month |
That's a 3–5x cost difference, which compounds quickly as your data grows. Beyond cost, containerized PostgreSQL gives you:
- **Full control** over PostgreSQL configuration, extensions, and versions
- **Portability** — the same `docker-compose.yml` works in dev, staging, and production
- **No vendor lock-in** — you're not tied to a cloud provider's upgrade schedule or supported version matrix
- **Extension freedom** — install PostGIS, TimescaleDB, pgvector, or any community extension without waiting for a managed service to support it
The tradeoff? You own the operational complexity. And nowhere does that complexity bite harder than major version upgrades.
## The Problem: Major Version Upgrades Are Not Like Container Restarts
Upgrading a PostgreSQL minor version (e.g. 15.3 → 15.7) is trivial — swap the image tag and restart. The data directory format doesn't change.
Major version upgrades (e.g. 13 → 16) are a different beast entirely. PostgreSQL's internal data format changes between major versions. You cannot simply point a PostgreSQL 16 binary at a data directory written by PostgreSQL 13 and expect it to work. It will refuse to start.
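If you try it, the failure is immediate. Something like this, with an illustrative volume name (the exact minor version in the DETAIL line will vary):

```bash
# Point a PG 16 image at a data directory written by PG 13:
docker run --rm -v pg13-data:/var/lib/postgresql/data postgres:16
# FATAL:  database files are incompatible with server
# DETAIL:  The data directory was initialized by PostgreSQL version 13,
#          which is not compatible with this version 16.4 ...
```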
The official tool for major version upgrades is pg_upgrade. It migrates the data directory in-place from the old format to the new one. But pg_upgrade has a constraint that makes it awkward in containerized environments:
> Both the old and new PostgreSQL binaries must be present on the same machine, with access to the same data directories.
In a container world, this is an unnatural requirement. Your PostgreSQL 13 container has PG 13 binaries. Your PostgreSQL 16 image has PG 16 binaries. They don't share a filesystem, and they were never designed to.
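Concretely, a `pg_upgrade` run needs all four locations reachable at once. A sketch with Debian-style paths, assuming the new data directory has already been initdb'ed:

```bash
# Old and new binaries, old and new data directories: all on one
# filesystem, all in one invocation:
/usr/lib/postgresql/16/bin/pg_upgrade \
  --old-bindir=/usr/lib/postgresql/13/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/var/lib/postgresql/13/main \
  --new-datadir=/var/lib/postgresql/16/main \
  --check   # dry run; drop --check to perform the real upgrade
```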
## What Teams Actually Do (And Why It Hurts)
Let's look at the common approaches teams reach for when they need to upgrade, and the hidden costs of each.
### Option 1: pg_dumpall / pg_dump + Restore
The most common approach. Dump the entire database with pg_dumpall, spin up a new PostgreSQL 16 container, and restore.
```bash
# Dump from old container
docker exec pg13 pg_dumpall -U postgres > full_dump.sql

# Restore into new container
docker exec -i pg16 psql -U postgres < full_dump.sql
```
The problems:

- **Downtime scales with database size.** A 100 GB database takes 1–3 hours to dump and another 2–5 hours to restore (indexes are rebuilt from scratch). Your application is down for all of it.
- **No rollback.** Once you've promoted the new cluster and your application is writing to it, you can't go back to the dump — it's already stale.
- **No integrity proof.** Did every row make it? Did all sequences reset correctly? Did materialized views survive? You're trusting the process.
- **Memory and disk pressure.** The dump file itself can be enormous. A 500 GB database produces a 200–400 GB SQL file that needs to live somewhere during the migration.
For a production database that your business depends on, multi-hour downtime windows are often simply not acceptable.
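If dump/restore is still the right call for your size and downtime budget, a directory-format dump with parallel jobs is usually faster than a plain `pg_dumpall`. A sketch, where `appdb` and the shared `/dump` mount are illustrative (both containers must see the same path):

```bash
# Parallel dump/restore shrinks the window, but doesn't remove it.
# Directory format (-Fd) is the only format that supports parallel dump:
docker exec pg13 pg_dump -U postgres -Fd -j 4 -f /dump/appdb appdb
docker exec pg13 sh -c 'pg_dumpall -U postgres --globals-only > /dump/globals.sql'

# Globals (roles, tablespaces) first, then the parallel restore.
# --create connects to the maintenance DB and recreates appdb from the dump:
docker exec pg16 psql -U postgres -f /dump/globals.sql
docker exec pg16 pg_restore -U postgres -j 4 -d postgres --create /dump/appdb
```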
### Option 2: Logical Replication with pglogical or Built-in Slots
A more sophisticated approach: set up logical replication from the old cluster to a new PG 16 cluster, let it catch up, then cut over.
```
PG 13 (primary) ──logical replication──► PG 16 (replica)
                                              │
                               [catch up, then promote]
```
The problems:

- **DDL is not replicated.** Logical replication does not replicate schema changes. Any schema migrations running during the replication period must be manually applied to both clusters.
- **Setup complexity is high.** You need to configure `wal_level = logical`, create replication slots, manage slot lag, handle large objects (which logical replication doesn't support), and deal with sequences (which are not replicated). A minimal setup sketch follows this list.
- **Slots can accumulate WAL indefinitely.** If the replica falls behind, the primary holds WAL segments in reserve. On a busy write-heavy database, this can fill your disk within hours.
- **Not all data types replicate cleanly.** `pg_largeobject` (lo/bytea blobs stored via lo functions), unlogged tables, and some partitioning configurations don't replicate through logical slots without extra work.
- **Still requires a cutover window.** Even with replication in place, you need a moment where writes stop on the old cluster, you confirm the replica is caught up, and you flip the application over. That window is shorter than a dump/restore — but it's still a window.
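For the built-in-slots variant, the moving parts look roughly like this. Container and database names are illustrative, and the schema must be copied to the new cluster separately (e.g. with `pg_dump --schema-only`):

```bash
# Minimal built-in logical replication setup (PG 10+).
# On the old cluster: logical WAL (needs a restart) and a publication:
docker exec pg13 psql -U postgres -c "ALTER SYSTEM SET wal_level = logical;"
docker restart pg13
docker exec pg13 psql -U postgres -d appdb \
  -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"

# On the new cluster: subscribe (this creates a replication slot on pg13):
docker exec pg16 psql -U postgres -d appdb -c "CREATE SUBSCRIPTION upgrade_sub \
  CONNECTION 'host=pg13 dbname=appdb user=postgres' PUBLICATION upgrade_pub;"

# Watch slot lag on the primary; this is the WAL that piles up if the
# replica falls behind:
docker exec pg13 psql -U postgres -c "SELECT slot_name, \
  pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal \
  FROM pg_replication_slots;"
```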
### Option 3: Snapshot + Restore on a New Node
Cloud-native teams sometimes use volume snapshots: snapshot the EBS/PD volume backing the old cluster, mount it on a new instance with PostgreSQL 16 installed, and run pg_upgrade there.
The problems:

- **PostgreSQL 16 isn't installed on the new node.** You're back to the original problem: `pg_upgrade` needs both binary sets on the same machine. You either have to install both versions manually or find a way to get them there.
- **No reproducibility.** The upgrade is a manual, unrepeatable operation. If something goes wrong, you start over from the snapshot — but you have no record of what commands were run, in what order, or what state the system was in when they failed.
- **No integrity check.** After `pg_upgrade` finishes, how do you know the data is intact? You might spot-check a few tables manually, but there's no systematic verification.
- **Hard to test in staging.** The snapshot approach is environment-specific. The commands that worked on production won't necessarily work in your staging environment because the volume setup is different.
### Option 4: Manual pg_upgrade Inside a Container

Some teams install both PostgreSQL versions inside a single container and run pg_upgrade manually:

```dockerfile
FROM ubuntu:22.04
# Needs the PGDG apt repo configured first — stock Ubuntu doesn't ship both versions
RUN apt-get update && apt-get install -y postgresql-13 postgresql-16
```
The problems:

- **No standard image for this.** Every team builds their own one-off image, with different assumptions about data directory paths, binary locations, and user permissions.
- **Permissions are a minefield.** `pg_upgrade` must be run as the `postgres` OS user, not root. The data directories must be owned by `postgres`. Getting this right inside a custom container is non-trivial (see the sketch after this list).
- **No CI validation.** The upgrade is run once, manually, against production. There's no prior test run to prove it would work.
- **Old Debian/Ubuntu base images hit EOL.** PostgreSQL 9.6 was only available on Debian Stretch and Buster, both of which reached end-of-life. Their `apt` mirrors were moved off the main servers, so `apt-get install postgresql-9.6` fails on a fresh install — producing cryptic 404 errors with no obvious fix.
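For reference, here is roughly what that manual dance looks like inside such a container (paths assume Debian-style packaging):

```bash
# The new cluster must be initdb'ed first, and everything must run as
# the postgres OS user, against directories that user owns:
chown -R postgres:postgres /var/lib/postgresql
su postgres -c "/usr/lib/postgresql/16/bin/initdb -D /var/lib/postgresql/16/main"
# pg_upgrade writes its logs to the current directory, so cd somewhere
# the postgres user can write:
su postgres -c "cd /tmp && /usr/lib/postgresql/16/bin/pg_upgrade \
  --old-bindir=/usr/lib/postgresql/13/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/var/lib/postgresql/13/main \
  --new-datadir=/var/lib/postgresql/16/main"
```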
## How pg-upgrade Solves This
pg-upgrade is a set of Docker images that package both the old and new PostgreSQL binaries, along with three coordinated container steps — init-old, upgrade, and verify — connected by Docker volumes.
The core insight: instead of trying to install two PostgreSQL versions in a running container at migration time, bake both binary sets into a purpose-built upgrade image at build time. Then the upgrade is just docker run.
### The Three-Step Pipeline
```
┌──────────────────────────────────────────────────────────────────┐
│ Step 1 — init-old                                                │
│ Seeds the old cluster with real-world schema: tables, indexes,   │
│ views, sequences, materialized views, FK constraints, sample     │
│ data across multiple databases.                                  │
└──────────────────────┬───────────────────────────────────────────┘
                       │ pg-old-data volume
┌──────────────────────▼───────────────────────────────────────────┐
│ Step 2 — upgrade                                                 │
│ Runs pg_upgrade --check (dry run first), then the real upgrade.  │
│ Prints before/after directory snapshots, file sizes, structural  │
│ renames, and wall-clock duration.                                │
└──────────────────────┬───────────────────────────────────────────┘
                       │ pg-new-data volume
┌──────────────────────▼───────────────────────────────────────────┐
│ Step 3 — verify                                                  │
│ Starts the upgraded cluster and asserts: databases exist, row    │
│ counts match, indexes are intact, views work, sequences are      │
│ preserved, foreign keys survive.                                 │
└──────────────────────────────────────────────────────────────────┘
```
For a production upgrade, you skip init-old and mount your existing data PVC directly into the upgrade step. The old cluster must be scaled to zero first (your downtime window) — but the upgrade itself runs in seconds to minutes, not hours.
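A production run might look like the following; the host paths are illustrative, and the container-side mount points match the quick start below:

```bash
# Old cluster is stopped; mount the real data directory in place of the
# seeded test volume. Host paths (/srv/...) are illustrative:
docker run --rm \
  -v /srv/postgres/13/main:/var/lib/postgresql/13/main \
  -v /srv/postgres/16/main:/var/lib/postgresql/16/main \
  abhsss/pg-upgrade:13-to-16 upgrade
```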
### Downtime Comparison

| Approach | 10 GB DB | 100 GB DB | 1 TB DB |
|---|---|---|---|
| `pg_dump` + restore | 30–90 min | 3–8 hours | 30+ hours |
| Logical replication cutover | 5–30 min | 5–30 min | 5–30 min |
| pg-upgrade (copy mode) | ~45 sec | ~7 min | ~70 min |
| pg-upgrade (link mode `-k`) | < 5 sec | < 5 sec | < 5 sec |
Link mode (-k) uses hard links instead of copying data files. The upgrade completes in seconds regardless of cluster size, because no bytes are moved — the old and new clusters share the same underlying files. The tradeoff is that the old data directory is no longer independently valid after the upgrade, so you delete it only after confirming the new cluster is healthy in production.
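If you want to convince yourself that link mode really shares files: data files hard-linked from the old cluster report a link count of 2. A quick check, with paths assumed:

```bash
# %h in GNU stat prints the hard-link count; linked relation files show 2:
find /var/lib/postgresql/16/main/base -type f \
  | head -5 | xargs stat -c '%h %n'
```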
### What the Upgrade Output Looks Like
After the upgrade step, you get a structured report directly in your CI log:
```
──────────────────────────────────────────────────────────────────────
 Old cluster — PostgreSQL 9.6
──────────────────────────────────────────────────────────────────────
 Path:        /var/lib/postgresql/9.6/main
 Total size:  47M

 Notable structural changes applied during this upgrade:
   pg_xlog/  →  pg_wal/   (WAL directory, renamed in PG 10)
   pg_clog/  →  pg_xact/  (transaction status, renamed in PG 10)
   pg_log/   →  log/      (server log directory, renamed in PG 10)

──────────────────────────────────────────────────────────────────────
 Upgrade complete
──────────────────────────────────────────────────────────────────────
 Cluster size:        47M → 49M  (+4%)
 Upgrade duration:    8s
 PostgreSQL version:  9.6 → 16
──────────────────────────────────────────────────────────────────────
```
And after the verify step:
```
──────────────────────────────────────────────────────────────────────
 Verification result — PostgreSQL 16
──────────────────────────────────────────────────────────────────────
 Passed: 9
 Failed: 0
──────────────────────────────────────────────────────────────────────
```
This is the piece every other approach lacks: a systematic, scripted assertion that the data survived intact.
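You can run the same kind of assertion by hand as a sanity check; the database and table names here are illustrative:

```bash
# Capture a count before the upgrade, assert it after:
before=$(docker exec pg13 psql -U postgres -d appdb -tAc "SELECT count(*) FROM orders")
# ... run the upgrade, start the new cluster ...
after=$(docker exec pg16 psql -U postgres -d appdb -tAc "SELECT count(*) FROM orders")
[ "$before" = "$after" ] && echo "row count OK" || echo "MISMATCH: $before vs $after"
```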
### No Credentials Required
One of the underrated advantages of pg_upgrade over dump/restore or logical replication: it never connects to a running database server. It reads and writes data files directly on disk as the postgres OS user. No database password is passed around, no pg_hba.conf changes are needed, and application credentials (stored in pg_authid) are migrated automatically along with everything else.
### CI-Validated Upgrade Paths
Every supported upgrade path — from 9.6→12 all the way through 15→16 — is validated in GitHub Actions on every commit. The matrix runs the full three-step pipeline (init → upgrade → verify) and fails the build if any integrity check doesn't pass.
This means when you pull abhsss/pg-upgrade:13-to-16 and run it against your production data, you're not running an untested script. You're running the same pipeline that passed CI.
### Kubernetes Ready, No Changes Required
The same image runs on Kubernetes without modification. Replace Docker volumes with PersistentVolumeClaims and docker run with Jobs:
```bash
# Apply the upgrade Job
kubectl apply -f pg-upgrade-job.yaml

# Wait for it to complete
kubectl wait --for=condition=complete job/pg-upgrade-run --timeout=30m

# Check the output
kubectl logs job/pg-upgrade-run
```
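The `pg-upgrade-job.yaml` itself isn't shown in the post; a minimal sketch might look like this, where the PVC names are assumptions and the mount paths mirror the Docker quick start below:

```bash
# Hypothetical sketch of pg-upgrade-job.yaml, applied inline:
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: pg-upgrade-run
spec:
  backoffLimit: 0              # never retry a half-finished upgrade
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: upgrade
          image: abhsss/pg-upgrade:13-to-16
          args: ["upgrade"]
          volumeMounts:
            - name: old-data
              mountPath: /var/lib/postgresql/13/main
            - name: new-data
              mountPath: /var/lib/postgresql/16/main
      volumes:
        - name: old-data
          persistentVolumeClaim:
            claimName: postgres-data       # the existing PVC (assumed name)
        - name: new-data
          persistentVolumeClaim:
            claimName: postgres-data-16    # fresh PVC for the new cluster
EOF
```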
For a production cluster running as a StatefulSet, the workflow is:

1. Scale the StatefulSet to zero (`kubectl scale statefulset postgres --replicas=0`)
2. Run the upgrade Job, mounting the existing PVC
3. Update the StatefulSet's image tag to PostgreSQL 16
4. Run the verify Job
5. Scale the StatefulSet back up
The downtime window is steps 1–5. With link mode, steps 2–4 take under a minute for any database size.
## Quick Start

```bash
# Pull the image for your upgrade path
docker pull abhsss/pg-upgrade:13-to-16

# Create volumes
docker volume create pg-old-data
docker volume create pg-new-data

# Step 1 — seed test data (skip this in production; mount your existing volume)
docker run --rm \
  -v pg-old-data:/var/lib/postgresql/13/main \
  abhsss/pg-upgrade:13-to-16 init-old

# Step 2 — upgrade
docker run --rm \
  -v pg-old-data:/var/lib/postgresql/13/main \
  -v pg-new-data:/var/lib/postgresql/16/main \
  abhsss/pg-upgrade:13-to-16 upgrade

# Step 3 — verify
docker run --rm \
  -v pg-new-data:/var/lib/postgresql/16/main \
  abhsss/pg-upgrade:13-to-16 verify

# Cleanup
docker volume rm pg-old-data pg-new-data
```
Supported upgrade paths are published on DockerHub. Images exist for every common path from PG 9.6 to PG 16.
## When to Use What

| Situation | Recommended approach |
|---|---|
| Small DB, extended downtime window acceptable | `pg_dump` / restore — simple, no extra tooling |
| Zero downtime requirement, complex schema with DDL changes during cutover | Logical replication + careful cutover scripting |
| Containerized PostgreSQL, predictable downtime window | pg-upgrade — reproducible, CI-validated, fast |
| Containerized PostgreSQL, absolute minimum downtime | pg-upgrade with link mode `-k` |
## Want to Contribute?
The project is open source and there are good entry points for contributors at every level. If something above resonated with you, the easiest way to get involved is to pick up one of the open issues:
**Good first issues — low scope, well-defined:**

- **#1 — Add pgvector CI matrix entries and fixtures.** Add test fixtures that exercise `pgvector` embeddings through an upgrade, so vector similarity queries survive the migration intact.
- **#4 — Support `--jobs N` parallelism for pg_upgrade.** `pg_upgrade` supports parallel catalog processing via `--jobs`. Wire it through as an environment variable so large clusters with many tables upgrade faster.
- **#6 — Add a `delete-old` entrypoint command.** After a successful verify, the generated `delete_old_cluster.sh` needs to be run. Wrapping it as a first-class `delete-old` command keeps the interface consistent.
- **#7 — Update the README.** Documentation fixes: stale quick-start commands, outdated repo structure, and missing extension docs. Good for a first PR if you want to understand the codebase before touching scripts.
- **#8 — Fill in missing upgrade matrix paths.** Paths like `10→12`, `11→13`, and `12→13` are absent from the matrix. Adding one requires a new Dockerfile and two lines in the CI workflow — no script changes needed.
**Larger contributions — help wanted:**

- **#2 — CI coverage for `pg_partman`, `pg_cron`, and `pgaudit`.** Extensions that touch background workers and cron scheduling have different upgrade behavior than data extensions. This adds matrix entries and verify assertions for each.
- **#3 — Add PG 17 as an upgrade target.** PG 17 ships on Debian Bookworm (OpenSSL 3), while older source versions were compiled against OpenSSL 1.1. Getting both to coexist in one image requires a careful `COPY --from` and `ldconfig` dance.
- **#5 — Support `--link` and `--clone` upgrade modes.** Link mode (`-k`) reduces upgrade time to under 5 seconds regardless of cluster size. Clone mode offers a middle ground. This wires both through as options without breaking the default copy-mode behavior.
No contribution is too small. Open an issue first if you want to discuss scope before sending a pull request.
## Conclusion
Containerized PostgreSQL is a compelling alternative to managed cloud databases — significant cost savings, full control, and no vendor lock-in. But that control comes with responsibility, and major version upgrades are where that responsibility shows up most clearly.
The conventional approaches — dump/restore, logical replication, manual pg_upgrade — all work, but they carry hidden costs: hours of downtime, unrepeatable procedures, no systematic integrity verification, and no way to prove the upgrade would succeed before running it against production.
pg-upgrade addresses each of these. It's a reproducible, Docker-native upgrade pipeline that runs the same three steps in CI that you'll run in production. The upgrade is fast (seconds to minutes, not hours), the output is structured and machine-readable, and the verify step gives you a signed-off assertion that the data survived intact — before you promote the new cluster and scale your application back up.
Source code and contribution guidelines: github.com/abhsss96/postgres-upgrade-kit
Docker images: hub.docker.com/r/abhsss/pg-upgrade