You're doing a rolling deployment. Half your instances are running v1, half are
running v2. Then v2 runs a migration that renames a column. Suddenly v1
instances start panicking, your phone blows up with PagerDuty alerts for the
service you just deployed, and downstream services start failing, because the
column v1 expects is gone.
The expand and contract pattern prevents this.
The Problem
Traditional migrations assume all application instances update simultaneously.
That's fine for recreate deployments with downtime, but breaks badly with
rolling updates where both versions run concurrently.
-- This breaks rolling deployments
ALTER TABLE users RENAME COLUMN name TO full_name;
The moment this runs, every v1 box querying name fails.
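Concretely, a v1 read path like this (hypothetical, but typical of the repos below) starts returning errors:
func (r *UserRepo) GetByID(ctx context.Context, id int) (*User, error) {
    // v1 still selects the old column. After the rename, this query fails:
    // column "name" does not exist (Postgres error 42703).
    row := r.db.QueryRowContext(ctx, `
        SELECT id, name, email FROM users WHERE id = $1
    `, id)
    var user User
    err := row.Scan(&user.ID, &user.Name, &user.Email)
    return &user, err
}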
The Pattern
Expand and contract splits breaking changes into backward-compatible steps:
- Expand: Add the new structure alongside the old one
- Migrate: Dual-write in application code, backfill existing data
- Contract: Remove the old structure once nothing uses it
Each step is a separate deployment. Each deployment is safe to run while both
application versions are live.
Example: Renaming a Column
Let's rename users.name to users.full_name.
Step 1: Expand
Add the new column. That's it.
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
This is backward-compatible. v1 keeps using name, completely unaware of
full_name. (In Postgres, adding a nullable column with no default is a quick
metadata-only change, so this is safe even on large tables.)
Step 2: Migrate
Deploy application code that writes to both columns:
func (r *UserRepo) Create(ctx context.Context, user *User) error {
    // Write the same value into both the old and the new column.
    _, err := r.db.ExecContext(ctx, `
        INSERT INTO users (name, full_name, email)
        VALUES ($1, $1, $2)
    `, user.FullName, user.Email)
    return err
}

func (r *UserRepo) UpdateName(ctx context.Context, id int, name string) error {
    _, err := r.db.ExecContext(ctx, `
        UPDATE users SET name = $1, full_name = $1 WHERE id = $2
    `, name, id)
    return err
}
Read from the new column, falling back to the old one:
func (r *UserRepo) GetByID(ctx context.Context, id int) (*User, error) {
    // Prefer full_name, but fall back to name for rows not yet backfilled.
    row := r.db.QueryRowContext(ctx, `
        SELECT id, COALESCE(full_name, name), email FROM users WHERE id = $1
    `, id)
    var user User
    err := row.Scan(&user.ID, &user.FullName, &user.Email)
    return &user, err
}
Backfill existing rows:
UPDATE users SET full_name = name WHERE full_name IS NULL;
At this point both columns stay in sync through your application code. v1
instances still work because name is always populated.
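One caveat: on a big table, that single UPDATE touches every row in one transaction, which can mean long lock waits. A common refinement is to backfill in small batches — a sketch, assuming a one-off job with its own database handle (backfillFullName is not part of the example repos):
func backfillFullName(ctx context.Context, db *sql.DB) error {
    for {
        // Copy name into full_name for up to 1,000 rows at a time,
        // keeping each transaction short.
        res, err := db.ExecContext(ctx, `
            UPDATE users SET full_name = name
            WHERE id IN (
                SELECT id FROM users WHERE full_name IS NULL LIMIT 1000
            )
        `)
        if err != nil {
            return err
        }
        n, err := res.RowsAffected()
        if err != nil {
            return err
        }
        if n == 0 {
            return nil // nothing left to backfill
        }
    }
}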
Step 3: Contract
Once all instances are on the new code and you're confident no rollback is
needed, clean up:
ALTER TABLE users DROP COLUMN name;
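Before actually running that drop, it's cheap to confirm the two columns agree — a suggested sanity check, not part of the migration itself; the count should be zero:
SELECT COUNT(*) FROM users WHERE full_name IS DISTINCT FROM name;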
Update your application code to remove the dual-write logic:
func (r *UserRepo) Create(ctx context.Context, user *User) error {
    _, err := r.db.ExecContext(ctx, `
        INSERT INTO users (full_name, email) VALUES ($1, $2)
    `, user.FullName, user.Email)
    return err
}
Example: Splitting a Table
You need to extract users.address_* columns into a separate addresses table.
Step 1: Expand
CREATE TABLE addresses (
    id SERIAL PRIMARY KEY,
    -- UNIQUE is what lets the dual-write below upsert one row per user
    user_id INTEGER REFERENCES users(id) UNIQUE,
    street VARCHAR(255),
    city VARCHAR(100),
    postal_code VARCHAR(20)
);
Step 2: Migrate
Dual-write to both locations:
func (r *UserRepo) UpdateAddress(ctx context.Context, userID int, addr Address) error {
    tx, err := r.db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    defer tx.Rollback()

    // Write to old columns (for v1 instances)
    _, err = tx.ExecContext(ctx, `
        UPDATE users
        SET address_street = $1, address_city = $2, address_postal_code = $3
        WHERE id = $4
    `, addr.Street, addr.City, addr.PostalCode, userID)
    if err != nil {
        return err
    }

    // Write to new table (for v2 instances)
    _, err = tx.ExecContext(ctx, `
        INSERT INTO addresses (user_id, street, city, postal_code)
        VALUES ($1, $2, $3, $4)
        ON CONFLICT (user_id) DO UPDATE SET
            street = EXCLUDED.street,
            city = EXCLUDED.city,
            postal_code = EXCLUDED.postal_code
    `, userID, addr.Street, addr.City, addr.PostalCode)
    if err != nil {
        return err
    }

    return tx.Commit()
}
Read from the new table with fallback:
func (r *UserRepo) GetAddress(ctx context.Context, userID int) (*Address, error) {
    // Try new table first
    row := r.db.QueryRowContext(ctx, `
        SELECT street, city, postal_code FROM addresses WHERE user_id = $1
    `, userID)
    var addr Address
    err := row.Scan(&addr.Street, &addr.City, &addr.PostalCode)
    if err == nil {
        return &addr, nil
    }
    if err != sql.ErrNoRows {
        return nil, err
    }

    // Fall back to old columns
    row = r.db.QueryRowContext(ctx, `
        SELECT address_street, address_city, address_postal_code
        FROM users WHERE id = $1
    `, userID)
    err = row.Scan(&addr.Street, &addr.City, &addr.PostalCode)
    return &addr, err
}
Backfill, skipping rows the dual-write has already upserted:
INSERT INTO addresses (user_id, street, city, postal_code)
SELECT id, address_street, address_city, address_postal_code
FROM users
WHERE address_street IS NOT NULL
ON CONFLICT (user_id) DO NOTHING;
Step 3: Contract
ALTER TABLE users
    DROP COLUMN address_street,
    DROP COLUMN address_city,
    DROP COLUMN address_postal_code;
Remove the dual-write code and the fallback logic.
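For example, GetAddress collapses to a single query once the fallback is gone — a sketch of the post-contract code:
func (r *UserRepo) GetAddress(ctx context.Context, userID int) (*Address, error) {
    row := r.db.QueryRowContext(ctx, `
        SELECT street, city, postal_code FROM addresses WHERE user_id = $1
    `, userID)
    var addr Address
    err := row.Scan(&addr.Street, &addr.City, &addr.PostalCode)
    return &addr, err
}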
Example: Changing a Column Type
Changing price from INTEGER (cents) to DECIMAL (dollars).
Step 1: Expand
ALTER TABLE products ADD COLUMN price_decimal DECIMAL(10,2);
Step 2: Migrate
Dual-write with conversion:
func (r *ProductRepo) UpdatePrice(ctx context.Context, id int, cents int) error {
    dollars := float64(cents) / 100.0
    _, err := r.db.ExecContext(ctx, `
        UPDATE products SET price = $1, price_decimal = $2 WHERE id = $3
    `, cents, dollars, id)
    return err
}
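Reads fall back the same way as in the rename example. A sketch — GetPrice and the float64-dollars representation are assumptions, not part of the original repo:
func (r *ProductRepo) GetPrice(ctx context.Context, id int) (float64, error) {
    // Prefer the new column; convert the old cents value for rows
    // that haven't been backfilled yet.
    row := r.db.QueryRowContext(ctx, `
        SELECT COALESCE(price_decimal, price / 100.0) FROM products WHERE id = $1
    `, id)
    var dollars float64
    err := row.Scan(&dollars)
    return dollars, err
}
(For real money handling you'd likely scan into a decimal type rather than a float64, but that's beside the pattern.)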
Backfill:
UPDATE products SET price_decimal = price / 100.0 WHERE price_decimal IS NULL;
Step 3: Contract
ALTER TABLE products DROP COLUMN price;
ALTER TABLE products RENAME COLUMN price_decimal TO price;
One caveat: the RENAME is itself a breaking change, because running instances
still read price_decimal. Either treat the rename as its own
expand-and-contract round, or keep the price_decimal name and skip the second
statement.
When to Use This Pattern
This pattern is a requirement for any deployment strategy other than big-bang,
because rolling, blue-green, and canary rollouts all run both application
versions at the same time while the new version rolls out.
If your deployment pipeline allows for maintenance windows, or you're making
non-breaking changes, skip this pattern.
Triggers vs Application Code
The examples above use application-level dual-writes, but you can also use
database triggers to keep columns in sync. Triggers simplify application code
and guarantee consistency even for writes that bypass the application, like
ad-hoc scripts.
Beware, though: triggers are harder to test, version, and maintain alongside
your application, and they add vendor-specific logic to your schema. If all
writes go through your application anyway, keeping the sync logic in code is
usually cleaner.
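For the rename example, a trigger-based version might look like this in Postgres — a sketch; it covers v1 writes to name, and a mirror-image trigger would cover v2 writes to full_name:
-- Copy name into full_name whenever v1 code writes the old column.
CREATE OR REPLACE FUNCTION sync_full_name() RETURNS trigger AS $$
BEGIN
    NEW.full_name := NEW.name;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_sync_full_name
    BEFORE INSERT OR UPDATE OF name ON users
    FOR EACH ROW
    EXECUTE FUNCTION sync_full_name();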
This pattern pairs well with zero-downtime deployment strategies. Check out Deployment Strategies Visualized for a visual guide to rolling, blue-green, canary, and progressive rollouts.