DEV Community

Cover image for The “Traffic Jam” That Kills Your Database (And How to Fix It)
Fabio Hiroki
Fabio Hiroki

Posted on • Originally published at Medium

The “Traffic Jam” That Kills Your Database (And How to Fix It)

You write a perfect banking API. The logic is sound. The math is flawless. The unit tests pass with green checkmarks every time.

Then, you launch.

Traffic spikes. Two users try to send money to each other simultaneously. Suddenly, your logs explode with red text.

Deadlock detected.

The database didn’t crash because of a bug in your math. It crashed because of a bug in your timing.

Here is how deadlocks happen, how to simulate them with Python, and the one-line logic change that fixes them forever.

TL;DR: You can reproduce this deadlock locally using Docker. Get the full code, including the API and the attack.py script, here: [Link to GitHub Repository]

The “Mexican Standoff”

A database deadlock is exactly like a standoff in a western movie.

Imagine two friends, Alice and Bob. They decide to send $10 to each other at the exact same millisecond.

Here is what happens in the database:

  • Transaction A (Alice paying Bob): Locks Alice’s row. It now needs to lock Bob’s row to deposit the money.
  • Transaction B (Bob paying Alice): Locks Bob’s row. It now needs to lock Alice’s row to deposit the money.

Transaction A waits for Bob. Transaction B waits for Alice.

Neither can move. Neither will quit. They will wait for eternity.

Fortunately, modern database engines are smart. They see this circle, realize it will never end, and play the role of the referee. The database brutally kills one of the transactions to let the other survive.

But for your user, that looks like a generic 500 Internal Server Error.

A picture simulating a database traffic jam

The Crime Scene (The Unsafe Code)

Let’s look at the Python code that causes this disaster.

We are using FastAPI and psycopg to talk to our database. In our “unsafe” transfer endpoint, we lock the sender first, then the receiver.

This feels logical. If I am sending money, I should check my balance first.

@app.post("/transfer/unsafe")
def transfer_unsafe(req: TransferRequest):
    with pool.connection() as conn:
        with conn.cursor() as cur:
            # 1. Lock the sender's account
            # We MUST lock here to safely check if they have funds.
            cur.execute(
                "SELECT balance FROM accounts WHERE id = %s FOR UPDATE", 
                (req.from_account,)
            )
            current_balance = cur.fetchone()[0]

            if current_balance < req.amount:
                raise HTTPException(status_code=400, detail="Insufficient funds")

            # ARTIFICIAL DELAY (Simulating latency)
            time.sleep(0.1) 

            # 2. Perform the Transfer
            cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s", (req.amount, req.from_account))

            # --- THE TRAP IS HERE ---
            # This UPDATE statement implicitly tries to lock the receiver's row.
            # If Bob is the receiver, and Bob is running a transaction... deadlock.
            cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s", (req.amount, req.to_account))

            conn.commit()
Enter fullscreen mode Exit fullscreen mode

The critical command here is FOR UPDATE.

This tells the database: “I am reading this row, but don’t let anyone else touch it until I’m done.”

If Alice runs this (Locking 1, wanting 2) and Bob runs this (Locking 2, wanting 1) at the same time, the application breaks.

The Attack (Proving It)

You should never trust theory until you break it in practice.
Become a member

I wrote a script, attack.py, that uses ThreadPoolExecutor to launch concurrent requests. It forces Alice to pay Bob and Bob to pay Alice simultaneously.

Here is the result of the attack on the unsafe endpoint:

--- Starting Attack ---
Mode: UNSAFE (Deadlock prone)
Users: 8

[1] ✅ Success: Alice -> Bob
[3] ❌ DEADLOCK: Bob -> Alice
[2] ✅ Success: Alice -> Bob
[5] ❌ DEADLOCK: Bob -> Alice
[4] ✅ Success: Alice -> Bob
...
Success:   6
Deadlocks: 14
Enter fullscreen mode Exit fullscreen mode

The database throws a DeadlockDetected error. The transaction is rolled back. The money doesn’t move. The user is angry.

You cannot fix this with faster hardware.

You cannot fix this by optimizing your SQL queries.

You fix this with Geometry.

A mobile phone showing a baking application

The Solution: Lock Ordering

To prevent a cycle, you must ensure that everyone acquires locks in the exact same order.

It doesn’t matter who is paying whom.

You must always lock the account with the smaller ID before the account with the larger ID.

  • Scenario A (Alice pays Bob): Lock ID 1 (Alice), then Lock ID 2 (Bob).
  • Scenario B (Bob pays Alice): Lock ID 1 (Alice), then Lock ID 2 (Bob).

Notice the difference? In Scenario B, Bob wants to send money. But our code forces him to lock Alice’s account (ID 1) before he locks his own account (ID 2).

If Alice and Bob click “Send” at the same time, they both race to lock ID 1.

One wins. The other waits.

The winner finishes, releases the lock, and the loser proceeds. No circle. No deadlock.

The Safe Code

Here is the fix. It is surprisingly simple. We sort the IDs before we touch the database.

@app.post("/transfer/safe")
def transfer_safe(req: TransferRequest):
    # Determine lock order: Low ID first, High ID second
    first_lock_id = min(req.from_account, req.to_account)
    second_lock_id = max(req.from_account, req.to_account)

    with pool.connection() as conn:
        with conn.cursor() as cur:
            # 1. Lock accounts in a fixed, consistent order
            cur.execute("SELECT balance FROM accounts WHERE id = %s FOR UPDATE", (first_lock_id,))

            # Even with a delay, the second transaction is just WAITING for the first lock.
            # It hasn't locked anything yet, so no deadlock is possible.
            time.sleep(0.1) 

            cur.execute("SELECT balance FROM accounts WHERE id = %s FOR UPDATE", (second_lock_id,))

            # 2. Now that we have both locks, we check balances and transfer...
            # (Logic proceeds as normal)
Enter fullscreen mode Exit fullscreen mode

By enforcing a global standard for locking (Lowest -> Highest), we physically prevent the circular dependency that causes the crash.

The Aftermath

I ran the exact same attack.py script against the /transfer/safe endpoint.

The latency was slightly higher (because processes had to wait in line), but look at the reliability:

--- Starting Attack ---
Mode: SAFE (Fixed)
Users: 8

[0] ✅ Success: Bob -> Alice
[1] ✅ Success: Alice -> Bob
[2] ✅ Success: Bob -> Alice
[3] ✅ Success: Alice -> Bob
...
Success:   20
Deadlocks: 0
Enter fullscreen mode Exit fullscreen mode

Zero crashes. Perfect consistency.

Summary

Deadlocks aren’t random bad luck. They are predictable consequences of inconsistent locking.

If you touch multiple rows in a single transaction, always touch them in the same order.

Sort your IDs. Save your database.

Top comments (0)