---
title: "Zero Downtime Database Migrations: A Practical Guide for PostgreSQL"
published: true
description: "Step-by-step guide to performing schema migrations in PostgreSQL without downtime — covering ALTER TABLE tricks, CREATE INDEX CONCURRENTLY, safe renames, and migration tooling."
tags: postgresql, architecture, devops, cloud
canonical_url: https://blog.mvpfactory.co/zero-downtime-database-migrations-postgresql
---
## What We're Building
In this workshop, I'll walk you through the exact patterns I use in production to run PostgreSQL schema migrations with zero downtime. By the end, you'll know which DDL operations are safe, how to restructure dangerous ones, and how to configure golang-migrate and Flyway to support all of it.
Let me show you a pattern I use in every project — because schema migrations are the #1 cause of unplanned downtime in PostgreSQL-backed services, and most teams don't realize it until they're already paged.
## Prerequisites
- PostgreSQL 11+ (some patterns differ on older versions)
- A migration tool: [golang-migrate](https://github.com/golang-migrate/migrate) or [Flyway](https://flywaydb.org/)
- Familiarity with basic SQL DDL (`ALTER TABLE`, `CREATE INDEX`)
## Step 1: Understand the Lock Problem
PostgreSQL uses an `AccessExclusiveLock` for many DDL operations. This lock blocks **all** concurrent queries — reads and writes.
| Operation | Blocks Reads? | Blocks Writes? | Safe at Scale? |
|---|---|---|---|
| `ADD COLUMN` (no default) | Yes (briefly) | Yes (briefly) | **Yes** |
| `ADD COLUMN ... DEFAULT x` (PG 11+) | Yes (briefly) | Yes (briefly) | **Yes** |
| `ADD COLUMN ... DEFAULT x` (PG <11) | Yes (long) | Yes (long) | **No** |
| `CREATE INDEX` | No | Yes | **No** |
| `CREATE INDEX CONCURRENTLY` | No | No | **Yes** |
| `ALTER COLUMN TYPE` | Yes (long) | Yes (long) | **No** |
The rule is straightforward: any operation that rewrites the table or scans it fully under an exclusive lock is dangerous.
## Step 2: Add Columns Safely
PostgreSQL 11+ stores defaults in the catalog instead of rewriting rows, making this nearly instant:
sql
-- Safe on PostgreSQL 11+
ALTER TABLE orders ADD COLUMN status TEXT NOT NULL DEFAULT 'pending';
On versions below 11, split it:
sql
ALTER TABLE orders ADD COLUMN status TEXT;
UPDATE orders SET status = 'pending' WHERE status IS NULL AND id BETWEEN 1 AND 100000;
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'pending';
## Step 3: Create Indexes Concurrently
Standard `CREATE INDEX` blocks writes for the entire build. On a large table, that's minutes or hours.
sql
-- Non-blocking -- use this instead
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
Here's the minimal setup to get this working with golang-migrate — annotate the file so it runs outside a transaction:
sql
-- +goose NO TRANSACTION
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
For Flyway, use the `executeInTransaction=false` flag in your versioned migration.
## Step 4: Rename Columns via Expand-Contract
Renaming directly breaks every query using the old name. Use views instead:
sql
-- Phase 1: Add + backfill
ALTER TABLE users ADD COLUMN full_name TEXT;
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- Phase 2: Backward-compatible view
CREATE VIEW users_v AS SELECT *, name AS legacy_name FROM users;
-- Phase 3: Migrate app code to full_name
-- Phase 4: Drop old column
ALTER TABLE users DROP COLUMN name;
## Step 5: Change Column Types Safely
Changing a type (e.g., `INT` to `BIGINT`) rewrites the entire table. The safe path uses a trigger for dual writes:
sql
ALTER TABLE events ADD COLUMN event_id_new BIGINT;
CREATE OR REPLACE FUNCTION sync_event_id() RETURNS TRIGGER AS $$
BEGIN
NEW.event_id_new := NEW.event_id;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_event_id BEFORE INSERT OR UPDATE ON events
FOR EACH ROW EXECUTE FUNCTION sync_event_id();
-- Then: backfill in batches, swap columns, drop old
## Step 6: Set a Lock Timeout
Always guard your migrations. I've seen a single unguarded DDL take down an entire API.
sql
SET lock_timeout = '5s';
ALTER TABLE orders ADD COLUMN priority INT DEFAULT 0;
Five seconds is a cheap insurance policy. If the lock can't be acquired, the statement fails cleanly instead of cascading into an outage.
## Gotchas
- **`CREATE INDEX CONCURRENTLY` cannot run inside a transaction block.** If your migration tool wraps everything in a transaction by default, the statement will fail. Both golang-migrate and Flyway need explicit non-transactional configuration — the docs do not mention this prominently, but it will save you hours.
- **Backfills must be batched.** Running a single `UPDATE` on 50M rows holds a lock just as long as the DDL you're trying to avoid. Process in chunks of 100K or less.
- **Flyway rollback (`undo migrations`) requires the Teams edition.** golang-migrate supports down migrations out of the box.
- **Never change column types in place.** The expand-contract pattern (new column → trigger → backfill → swap) is more work, but nobody gets paged.
## Wrapping Up
The key insight is treating migrations as multi-phase rollouts, not atomic events. Use `CREATE INDEX CONCURRENTLY`, add columns with catalog-stored defaults on PG 11+, rename via views, and change types through expand-contract. Set `lock_timeout` on every statement. Make sure your migration tool — whether golang-migrate or Flyway — supports non-transactional execution.
Get these patterns right once and you'll stop dreading deploy day.
Top comments (0)