Handshake deployed what looked like a routine migration during one of their regular multiple-times-a-day releases:
ALTER TABLE table_name
ADD CONSTRAINT fk_user_id
FOREIGN KEY (user_id) REFERENCES users(id);
60 seconds later, their entire site was down.
The problem? Adding a foreign key constraint requires an ACCESS EXCLUSIVE lock on the referenced table. Postgres grants locks first-come, first-served. A long-running query on the users table was holding an ACCESS SHARE lock, so the migration queued up waiting for it. But here's the kicker: since ACCESS EXCLUSIVE conflicts with everything, all the regular SELECT queries that came after started queuing behind the migration. The lock queue grew. The site stopped responding. They had to abort the migration to bring the site back up.
GoCardless hit a similar issue. They were recreating foreign key constraints on renamed tables. The tables were empty, so it seemed safe. But adding the constraints required locks on the parent tables, which were heavily used. API timeouts across the board for 15 seconds.
Both incidents came from database migrations that looked completely normal. Both ran fine in staging. Both only showed their true colors in production.
The Postgres Lock Problem
Postgres uses different lock levels to keep your data consistent. The worst one is ACCESS EXCLUSIVE. When something holds this lock, nothing else can touch that table. Not SELECT, not INSERT, nothing.
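To see what that means in practice, here's a hedged two-session sketch (the nickname column is just for illustration). Holding the ALTER open in a transaction in one session blocks even a plain read in another:
-- Session 1: ALTER TABLE takes ACCESS EXCLUSIVE, held until COMMIT or ROLLBACK
BEGIN;
ALTER TABLE users ADD COLUMN nickname TEXT;
-- Session 2: even a simple read now waits for session 1 to finish
SELECT count(*) FROM users;
-- Session 1: release the lock
COMMIT;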
A bunch of common migration operations take ACCESS EXCLUSIVE or other heavyweight locks:
Creating Indexes
CREATE INDEX idx_users_email ON users(email);
This takes a SHARE lock, which blocks all writes (INSERT, UPDATE, DELETE) while the index builds. On a table with millions of rows, this can take minutes.
Adding NOT NULL
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
Postgres has to verify every single row has a non-null value. ACCESS EXCLUSIVE lock the entire time.
Changing Column Types
ALTER TABLE products ALTER COLUMN price TYPE NUMERIC(10,2);
Complete table rewrite. Every row converted to the new type. ACCESS EXCLUSIVE lock throughout.
The Lock Queue Makes It Worse
Here's what makes this really nasty: Postgres's lock queue is first-come, first-served. If a migration is waiting for a lock, all subsequent queries queue behind it, even if they don't conflict with the original lock holder.
So this happens:
- A long-running query holds an ACCESS SHARE lock on the users table.
- The migration tries to get an ACCESS EXCLUSIVE lock and queues up behind it.
- A new SELECT query comes in; it would normally be fine alongside ACCESS SHARE.
- But since the migration is waiting, this SELECT goes to the back of the queue.
- Another SELECT comes in, also queued.
- Your application starts timing out.
The migration hasn't even started yet, and you're already down.
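You can watch a queue like this form while it's happening. A minimal sketch, assuming Postgres 9.6+ for pg_blocking_pids:
-- Which sessions are stuck waiting on a lock, and who is blocking them?
SELECT pid,
       state,
       wait_event_type,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 60)       AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';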
Why This Is Hard to Catch
These migrations look fine. They work great in development where you have 100 test records. They run instantly in staging with 10,000 rows. Then in production with 5 million rows, they lock the table for minutes.
You don't find out until it's running against real data with real traffic.
Enter diesel-guard
diesel-guard was built to catch these problems before they hit production. It's a static analysis tool that scans your Diesel migration files for dangerous operations.
Install it:
cargo install diesel-guard
Run it:
diesel-guard check migrations/
When it finds something risky:
❌ Unsafe migration detected in migrations/2024_01_01_add_fk/up.sql
❌ ADD COLUMN with DEFAULT
Problem:
Adding column 'status' with DEFAULT on table 'orders' requires a full
table rewrite on PostgreSQL < 11, which acquires an ACCESS EXCLUSIVE
lock. On large tables, this can take significant time and block all
operations.
Safe alternative:
1. Add the column without a default:
ALTER TABLE orders ADD COLUMN status TEXT;
2. Backfill data in batches (outside migration):
UPDATE orders SET status = 'pending' WHERE status IS NULL;
3. Add default for new rows only:
ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'pending';
Note: For Postgres 11+, this is safe if the default is a constant.
You get:
- What's dangerous about it
- What lock it takes
- Step-by-step fix with SQL
Safe Alternatives for Common Operations
Creating Indexes
Instead of:
CREATE INDEX idx_orders_created_at ON orders(created_at);
Do this:
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders(created_at);
CONCURRENTLY means writes can continue while the index builds. The catch is that CREATE INDEX CONCURRENTLY can't run inside a transaction block, and Diesel wraps each migration in one by default, so you need to add a metadata.toml next to the migration:
# migrations/2024_01_01_add_order_index/metadata.toml
run_in_transaction = false
Adding NOT NULL
Instead of:
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
Do this:
-- Add CHECK constraint without validating existing rows
ALTER TABLE users
ADD CONSTRAINT users_email_not_null_check
CHECK (email IS NOT NULL) NOT VALID;
-- Validate separately (lighter lock)
ALTER TABLE users VALIDATE CONSTRAINT users_email_not_null_check;
-- Add NOT NULL (fast since we validated)
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
-- Clean up
ALTER TABLE users DROP CONSTRAINT users_email_not_null_check;
The VALIDATE step uses a SHARE UPDATE EXCLUSIVE lock, which allows reads and writes to continue. On Postgres 12+, the final SET NOT NULL sees the validated CHECK constraint and skips the full-table scan, which is why it's fast.
Adding UNIQUE Constraints
Instead of:
ALTER TABLE users ADD CONSTRAINT users_email_key UNIQUE (email);
Do this:
-- Build index concurrently
CREATE UNIQUE INDEX CONCURRENTLY users_email_idx ON users(email);
-- Add constraint using existing index (instant)
ALTER TABLE users
ADD CONSTRAINT users_email_key
UNIQUE USING INDEX users_email_idx;
Adding Foreign Keys
This is what bit Handshake. Instead of:
ALTER TABLE posts
ADD CONSTRAINT posts_user_id_fkey
FOREIGN KEY (user_id) REFERENCES users(id);
Do this:
-- Add constraint without validating existing rows
ALTER TABLE posts
ADD CONSTRAINT posts_user_id_fkey
FOREIGN KEY (user_id) REFERENCES users(id)
NOT VALID;
-- Validate separately (lighter lock)
ALTER TABLE posts VALIDATE CONSTRAINT posts_user_id_fkey;
NOT VALID means the constraint is added without scanning existing rows. VALIDATE happens separately and takes only a SHARE UPDATE EXCLUSIVE lock on posts (plus a ROW SHARE lock on users), so normal traffic keeps flowing.
Setting Timeouts
One thing both Handshake and GoCardless learned: set lock_timeout in your migrations.
-- At the top of your migration
SET lock_timeout = '2s';
ALTER TABLE users ADD COLUMN email TEXT;
If the migration can't get a lock within 2 seconds, it fails instead of queuing indefinitely. Your app stays up, and you can retry during lower traffic.
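If you'd rather not repeat that SET line in every migration file, lock_timeout is a regular Postgres setting, so you can also set a default at the role or database level. A minimal sketch; migration_user and app_production are placeholder names:
-- Applies to every session opened by the role your migrations run as
ALTER ROLE migration_user SET lock_timeout = '2s';
-- Or set a default for the whole database
ALTER DATABASE app_production SET lock_timeout = '2s';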
When You Know It's Safe
Sometimes you know a migration is safe (tiny table, maintenance window, etc.):
-- safety-assured:start
-- Safe because: table has 50 rows, deploying during maintenance window
ALTER TABLE countries ADD COLUMN flag_emoji TEXT DEFAULT '🏳️';
-- safety-assured:end
diesel-guard will skip anything in these blocks.
Configuration
Put a diesel-guard.toml in your project:
# Skip checking migrations before this date
start_after = "2024_01_01_000000"
# Check down.sql too
check_down = true
# Disable specific checks if needed
disable_checks = ["CreateExtensionCheck"]
CI/CD Integration
Add to GitHub Actions:
name: Check Migrations
on: [pull_request]
jobs:
  check-migrations:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ayarotsky/diesel-guard@v0.3.0
Now dangerous migrations get caught in PR review, not production.
What It Checks
diesel-guard currently detects 19 different issues:
- ADD COLUMN with DEFAULT
- CREATE INDEX without CONCURRENTLY
- ADD NOT NULL
- ADD UNIQUE constraint
- ADD FOREIGN KEY without NOT VALID
- ADD PRIMARY KEY to existing tables
- ADD SERIAL columns
- ALTER COLUMN TYPE
- CREATE EXTENSION
- DROP COLUMN
- DROP INDEX without CONCURRENTLY
- DROP PRIMARY KEY
- RENAME COLUMN
- RENAME TABLE
- Short integer primary keys (SMALLINT/INT)
- TRUNCATE TABLE
- Unnamed constraints
- Using JSON instead of JSONB
- Wide indexes (4+ columns)
More are coming. The target is 40 checks covering the most dangerous Postgres operations.
Why This Matters for Rust
The Rust ecosystem has great tooling. clippy lints your code. cargo audit catches security problems. But we didn't have anything for database migrations.
I've seen too many production incidents from migrations. The fix is usually obvious in hindsight, but you only find out when it's causing downtime.
diesel-guard brings the fix forward to development time.
Should You Use This?
Maybe you're thinking "my tables are small."
Tables grow. The users table with 100 rows today might have a million rows next year. The migration that's instant now might take minutes then.
Building safe migrations from the start is way easier than fixing them during an incident. And diesel-guard takes a couple seconds to run.
Wrapping Up
Database migrations can be tricky. Operations that look perfectly safe can cause serious production issues. The gap between how they behave in staging versus production can be huge, and often you only find out when it's too late.
Static analysis can catch these problems early. Whether you use diesel-guard or build your own checks, having something review your migrations before they reach production is worth it. The patterns are well-documented and you just need tooling to enforce them.
Building safety into your database migration workflow pays off.