---
title: "Zero-Downtime Schema Migrations in Production PostgreSQL"
published: true
description: "A hands-on guide to ghost table swaps, advisory locks, and batched backfills — ALTER TABLE on massive PostgreSQL databases without a maintenance window."
tags: postgresql, devops, architecture, api
canonical_url: https://blog.mvpfactory.co/zero-downtime-schema-migrations-postgresql
---
## What We Are Building
Let me show you the migration pipeline I use on every production PostgreSQL deployment. By the end of this tutorial, you will understand how tools like `pg_osc` and `pgroll` perform online schema changes using ghost table copy-and-swap, and how to wire the whole thing into your Ktor or Spring Boot CI/CD flow — no maintenance window required.
## Prerequisites
- PostgreSQL 11+ in production
- Familiarity with `ALTER TABLE` and basic locking concepts
- A Ktor or Spring Boot service with a CI/CD pipeline
- `pg_osc` or `pgroll` installed in your migration toolchain
## Step 1: Know Which Operations Are Dangerous
Not every `ALTER TABLE` hurts. Here is the minimal reference to keep nearby:
| Operation | Rewrites Table? | Blocks Reads/Writes? |
|---|---|---|
| `ADD COLUMN` (nullable, no default) | No | Sub-second lock |
| `ADD COLUMN ... DEFAULT val` (PG 11+, non-volatile default) | No | Sub-second, catalog-only |
| `ALTER COLUMN TYPE` | **Yes** | **Yes — full rewrite** |
| `ADD CONSTRAINT` (without `NOT VALID`) | Scans all rows | Blocks writes while scanning |
When you run `ALTER TABLE orders ALTER COLUMN id TYPE bigint` on a table with hundreds of millions of rows, PostgreSQL rewrites every row under an `ACCESS EXCLUSIVE` lock. Every concurrent `SELECT`, `INSERT`, and `UPDATE` queues behind it. Connection pool exhaustion hits within seconds, your API returns 503s, health checks fail, and Kubernetes starts cycling pods.
The operations that genuinely hurt — column type changes, single-step constraint additions, pre-PG 11 defaults — are exactly where ghost table tooling pays off. (Note that `VALIDATE CONSTRAINT` on its own only takes a `SHARE UPDATE EXCLUSIVE` lock on modern PostgreSQL, which is why the `NOT VALID` / `VALIDATE` two-step exists.)
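A cheap way to act on this table is to lint migration files in CI before they ever reach production. Here is a minimal sketch of such a guard; the regexes, enum, and function name are my own illustration, not part of `pg_osc` or `pgroll`, and a real check would use a SQL parser rather than pattern matching:

```kotlin
// Hypothetical CI lint: flag DDL that forces a full table rewrite or a
// blocking validation scan. A rough sketch, not a SQL parser.
enum class DdlRisk { SAFE, FULL_REWRITE, BLOCKING_SCAN }

fun classifyDdl(statement: String): DdlRisk {
    // Normalize whitespace and case so the patterns below are simple
    val sql = statement.trim().uppercase().replace(Regex("\\s+"), " ")
    return when {
        // ALTER COLUMN ... TYPE rewrites every row under ACCESS EXCLUSIVE
        Regex("ALTER COLUMN \\w+ (SET DATA )?TYPE").containsMatchIn(sql) ->
            DdlRisk.FULL_REWRITE
        // ADD CONSTRAINT without NOT VALID scans the table while holding
        // a lock that blocks writes
        sql.contains("ADD CONSTRAINT") && !sql.contains("NOT VALID") ->
            DdlRisk.BLOCKING_SCAN
        else -> DdlRisk.SAFE
    }
}
```

Wire this into the pipeline step that collects pending migrations, and fail the build on anything that is not `SAFE` unless the migration is routed through the ghost table path.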
## Step 2: The Ghost Table Strategy
The core pattern that `pg_osc` and `pgroll` use is straightforward:
1. **Create a shadow table** mirroring the original schema plus your changes.
2. **Install a trigger** on the original table replicating every `INSERT`, `UPDATE`, `DELETE` to the ghost in real time.
3. **Backfill existing rows** in small batches.
4. **Swap tables** via `ALTER TABLE ... RENAME` inside a brief transaction.
5. **Drop the old table** once connections have drained.
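The five phases above can be sketched as the ordered DDL a coordinator would emit. The naming (`orders_ghost`, the `ghost_sync` trigger function) is illustrative; the real tools generate their own identifiers:

```kotlin
// Sketch of the DDL a ghost-table coordinator emits, in order.
// Table and trigger names are illustrative, not pg_osc's actual naming.
fun ghostMigrationPlan(table: String, schemaChangeDdl: String): List<String> {
    val ghost = "${table}_ghost"
    return listOf(
        // 1. Shadow table: original schema plus the desired change
        "CREATE TABLE $ghost (LIKE $table INCLUDING ALL);",
        "ALTER TABLE $ghost $schemaChangeDdl;",
        // 2. Trigger replicating live writes (function body defined elsewhere)
        "CREATE TRIGGER ${table}_sync AFTER INSERT OR UPDATE OR DELETE ON $table " +
            "FOR EACH ROW EXECUTE FUNCTION ghost_sync();",
        // 3. Backfill runs in batches, outside this plan
        // 4. Atomic swap inside one short transaction
        "BEGIN; ALTER TABLE $table RENAME TO ${table}_old; " +
            "ALTER TABLE $ghost RENAME TO $table; COMMIT;",
        // 5. Deferred cleanup once old connections drain
        "DROP TABLE ${table}_old;"
    )
}
```

The swap in phase 4 is the only moment that takes an `ACCESS EXCLUSIVE` lock, and it holds it only for the duration of two renames, which is why the pattern stays invisible to users.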
## Step 3: Advisory Lock Coordination
Here is the gotcha that will save you hours. Running two migration workers simultaneously against the same table **will corrupt data**. Both `pg_osc` and `pgroll` solve this with advisory locks:
```sql
-- Two-key form: (namespace hash, table OID); blocks until the lock is free
SELECT pg_advisory_lock(hashtext('migrations'), 'orders'::regclass::oid::int);
```
Advisory locks exist in a separate namespace from table locks — they are non-blocking to application queries. A second migration worker blocks on the advisory lock, not on your table.
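To make the coordination semantics concrete, here is a toy in-memory model of the two-key advisory lock from the application's point of view. This is my own illustration of the behavior, not a PostgreSQL client; `String.hashCode()` stands in for `hashtext`, which is a server-side function:

```kotlin
import java.util.concurrent.ConcurrentHashMap

// In-memory model of two-key advisory lock semantics: two migration
// workers race for the same (namespace, table) key pair and only one
// wins, while ordinary table access is never involved at all.
class AdvisoryLocks {
    private val held = ConcurrentHashMap.newKeySet<Pair<Int, Int>>()

    // Mirrors pg_try_advisory_lock(key1, key2): non-blocking,
    // returns false if the key pair is already held
    fun tryLock(key1: Int, key2: Int): Boolean = held.add(key1 to key2)

    // Mirrors pg_advisory_unlock(key1, key2)
    fun unlock(key1: Int, key2: Int): Boolean = held.remove(key1 to key2)
}
```

In production you would of course call the real `pg_advisory_lock` / `pg_try_advisory_lock` functions over your connection pool; the point of the model is that the second worker contends on the key pair, never on `orders` itself.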
## Step 4: Trigger-Based Row Sync
During backfill, a trigger captures concurrent writes. Here is a simplified version for adding a `region_code` column:
```sql
CREATE FUNCTION ghost_sync() RETURNS trigger AS $$
BEGIN
  -- Deletes must be replicated too, or the ghost table drifts
  IF TG_OP = 'DELETE' THEN
    DELETE FROM orders_ghost WHERE id = OLD.id;
    RETURN OLD;
  END IF;
  INSERT INTO orders_ghost SELECT (NEW).*
  ON CONFLICT (id) DO UPDATE
    SET region_code = EXCLUDED.region_code,
        updated_at  = EXCLUDED.updated_at,
        amount      = EXCLUDED.amount;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_ghost_sync
  AFTER INSERT OR UPDATE OR DELETE ON orders
  FOR EACH ROW EXECUTE FUNCTION ghost_sync();
```
The `ON CONFLICT ... DO UPDATE` clause ensures that a row modified after the backfill batch has already copied it still ends up carrying the latest state in the ghost table.
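The race this resolves is easy to see in miniature. In this toy model (my own illustration, not tool code) the ghost table is a map keyed by `id`, and both the backfill and the trigger write through the same upsert, so whichever runs last leaves the latest state:

```kotlin
// Toy model of the backfill/trigger race. upsert() has the same
// last-writer-wins semantics as INSERT ... ON CONFLICT (id) DO UPDATE.
data class OrderRow(val id: Long, val amount: Int, val regionCode: String?)

class GhostTableModel {
    private val rows = HashMap<Long, OrderRow>()
    fun upsert(row: OrderRow) { rows[row.id] = row }
    fun get(id: Long): OrderRow? = rows[id]
}
```

If the backfill copies a stale snapshot of a row and a concurrent `UPDATE` then fires the trigger, the trigger's upsert overwrites the stale copy, so the ghost table converges on the live state.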
## Step 5: Batched Backfill and CI/CD Integration
Backfilling 500 million rows in one transaction would blow out WAL and memory. Process rows in configurable batches with throttle delays:
```kotlin
launch(Dispatchers.IO) {
    migrationService.backfillInBatches(
        sourceTable = "orders",
        ghostTable = "orders_ghost",
        batchSize = 25_000,
        throttleMs = 100
    )
}
```
Expose a status endpoint so your CI/CD pipeline can gate deployments on migration completion:
```kotlin
routing {
    get("/migrations/status") {
        val status = migrationService.currentStatus()
        call.respond(status) // { "table": "orders", "progress": 0.73, "phase": "backfill" }
    }
}
```
The deploy flow is: ship backward-compatible code → trigger migration → poll progress → deploy cleanup code after swap completes.
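The pipeline side of that gate might look like the following sketch. The phase names match the status endpoint shown earlier; the retry budget and `fetchPhase` helper (standing in for an HTTP GET against `/migrations/status`) are illustrative assumptions:

```kotlin
// Sketch of the CI gate: poll the migration status until the swap
// completes, or give up after maxAttempts. fetchPhase stands in for an
// HTTP GET against the /migrations/status endpoint.
fun awaitSwapComplete(maxAttempts: Int, fetchPhase: () -> String): Boolean {
    repeat(maxAttempts) {
        if (fetchPhase() == "complete") return true // safe to ship cleanup code
        // a real pipeline would sleep between polls
    }
    return false // gate fails closed: the cleanup deploy stays blocked
}
```

Failing closed is the important design choice: if the migration stalls, the backward-compatible code simply keeps running.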
If the backfill encounters errors or replication lag exceeds a threshold, the pipeline drops the ghost table and releases the advisory lock. The original table is untouched. Failure is always safe.
## Gotchas
- **Two workers, one table = corruption.** Always use advisory locks. The docs do not mention this prominently, but I have seen it bite teams in production.
- **Throttle aggressively.** A 100ms pause between 25K-row batches adds minutes but prevents replication lag spikes that cascade into replica failovers.
- **Wall time vs. availability.** A raw `ALTER TABLE` on 500M rows locks for 8-12 minutes. A ghost table swap takes 2-4 hours but your p99 latency only increases 3-5% above baseline. Your users never notice.
- **Do not forget cleanup deploys.** Your backward-compatibility shims need to be removed after the swap completes. Gate that second deploy on the progress endpoint.
- **Long migrations demand long focus sessions.** When I am monitoring a multi-hour backfill, I rely on [HealthyDesk](https://play.google.com/store/apps/details?id=com.healthydesk) to nudge me into breaks — staring at progress bars for hours without moving is a recipe for burnout.
## Wrapping Up
You trade wall-clock time for continuous availability. On any production system serving real users, that tradeoff is not even close. Start by identifying which of your pending migrations actually rewrite the table, wire up `pg_osc` or `pgroll`, and expose a progress endpoint for your pipeline. Zero-downtime migrations are not magic — they are a pattern you can adopt today.
**Resources:** [pg_osc GitHub](https://github.com/shayonj/pg_osc) · [pgroll docs](https://github.com/xataio/pgroll) · [PostgreSQL ALTER TABLE documentation](https://www.postgresql.org/docs/current/sql-altertable.html)