DEV Community

scubaDEV


Your Hand-Rolled Two-Phase Commit Between Two Databases Isn't Atomic

I had a write that spanned two physically separate databases: a rename that had to propagate across several tables in one database and a couple of tables in another, with the two kept consistent. No distributed transaction coordinator was available to me. So I did the obvious thing: opened a transaction on each, did the work, and committed them one after the other inside a try/catch with rollbacks on both sides. It felt safe. It compiled. It passed tests.

Then I drew the failure on a whiteboard, and the safety evaporated.

The window that ruins everything

Here's the structure, simplified:

await using var txA = await dbA.Database.BeginTransactionAsync();
await using var txB = await dbB.Database.BeginTransactionAsync();

await DoWorkOnA(dbA);
await DoWorkOnB(dbB);

await txA.CommitAsync();   // <-- succeeds
await txB.CommitAsync();   // <-- what if this throws?

Two transactions do not make one atomic operation. CommitAsync is a point of no return, and there are two of them. From the moment the first commit returns until the second one has durably completed, there is a window. If anything fails in that window (the connection drops, the process is killed, the database hiccups), then A is permanently committed and B never happens. The rollback in your catch block is useless: you can still roll back txB, but you can't roll back txA, because it's already durable. The two databases now disagree, and nothing in your code will heal that on its own.

This is the dual-write problem, and it's not a bug you can fix by being more careful with try/catch. The atomicity you want simply isn't available from two independent commits. Ordering them, nesting them, wrapping them — none of it closes the window, because the window is inherent to having two commit points.
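
To make the window concrete, here is a minimal in-memory sketch of the failure. Nothing in it touches a real database: the dictionaries and the Commit helper are hypothetical stand-ins I made up for illustration, with one dictionary playing each database's durable state and another playing the pending writes of its open transaction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins: dbA/dbB hold durable state, txA/txB hold uncommitted writes.
var dbA = new Dictionary<string, string> { ["name"] = "old" };
var dbB = new Dictionary<string, string> { ["name"] = "old" };
var txA = new Dictionary<string, string> { ["name"] = "new" };
var txB = new Dictionary<string, string> { ["name"] = "new" };

try
{
    Commit(dbA, txA, fail: false); // first point of no return: A is now durable
    Commit(dbB, txB, fail: true);  // simulated dropped connection on the second commit
}
catch (InvalidOperationException)
{
    txA.Clear(); // "rollback" of A: a no-op, its writes were already committed
    txB.Clear(); // rollback of B: discards the pending write
}

var a = dbA["name"];
var b = dbB["name"];
Console.WriteLine($"A: {a}, B: {b}"); // prints "A: new, B: old"

// Applies pending writes to the durable state, or throws before applying anything.
void Commit(Dictionary<string, string> db, Dictionary<string, string> tx, bool fail)
{
    if (fail) throw new InvalidOperationException("connection dropped during commit");
    foreach (var (key, value) in tx) db[key] = value;
    tx.Clear();
}
```

The catch block runs exactly as written and still leaves the two stores disagreeing, which is the whole problem in miniature.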

Why "it's never failed" isn't reassurance

The seductive thing about this pattern is that the window is small, so in practice it almost never triggers. You can run it for a year and never see an inconsistency. That's exactly what makes it dangerous: it trains you to trust it, and then it fails during the one incident — a deploy, a failover, an OOM kill — when you're least able to notice a quietly desynced record. Rare and catastrophic beats frequent and visible, from the bug's point of view.

What actually addresses it

None of these are as convenient as two commits in a row. That's the point — the convenience was the illusion.

Make one database the source of truth. Write atomically to a single primary, and treat the second database as a projection you update asynchronously and idempotently afterward. The primary is always correct; the secondary catches up. You trade immediate consistency across both databases for eventual consistency, and in return you can actually guarantee the important write.
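
What "idempotently" can look like in practice: a rough sketch, assuming each primary row carries a monotonically increasing version number (my assumption, not something the setup above requires). The secondary dictionary and the Project function are hypothetical names standing in for the second database and its update routine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical secondary store: id -> (name, version of the primary row it reflects).
var secondary = new Dictionary<int, (string Name, long Version)>();

// Applies a change only if it is newer than what the secondary already holds.
bool Project(int id, string name, long version)
{
    if (secondary.TryGetValue(id, out var current) && current.Version >= version)
        return false; // duplicate or stale delivery: safe to ignore
    secondary[id] = (name, version);
    return true;
}

Console.WriteLine(Project(1, "alpha", 1)); // True
Console.WriteLine(Project(1, "beta", 2));  // True
Console.WriteLine(Project(1, "beta", 2));  // False (retry of the same change)
Console.WriteLine(Project(1, "alpha", 1)); // False (late, out-of-order delivery)
Console.WriteLine(secondary[1].Name);      // beta
```

With a check like this, the updater can retry or replay changes as often as it likes without pushing the secondary backward.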

Use the outbox pattern. In the same transaction as your primary write, insert a row describing the change into an "outbox" table. A separate process reads the outbox and applies the change to the second database, retrying until it succeeds. Because the outbox row is committed atomically with the real work, you can never lose the intent; you can only delay it. The retries mean a change may be applied more than once, so the apply step has to be idempotent. This is the standard answer for a reason.
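
Here is the shape of an outbox relay as an in-memory sketch. Everything in it is a hypothetical stand-in: the collections play the two databases, Rename plays the single primary transaction, and failuresLeft simulates the second database being down for two attempts. In real code the two writes inside Rename would sit between one BeginTransactionAsync and one CommitAsync on the primary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the two databases.
var primaryNames = new Dictionary<int, string>();
var outbox = new List<(long Id, int EntityId, string NewName)>(); // lives in the primary
var secondaryNames = new Dictionary<int, string>();
var applied = new HashSet<long>(); // outbox ids the secondary has already applied
long nextId = 1;
int failuresLeft = 2; // simulate the secondary being unavailable for two attempts

// Stands in for ONE primary transaction: data write and outbox row commit together.
void Rename(int entityId, string newName)
{
    primaryNames[entityId] = newName;
    outbox.Add((nextId++, entityId, newName));
}

// Applies one message to the secondary; skipping known ids makes redelivery harmless.
void ApplyToSecondary((long Id, int EntityId, string NewName) msg)
{
    if (failuresLeft > 0)
    {
        failuresLeft--;
        throw new InvalidOperationException("secondary unavailable");
    }
    if (!applied.Add(msg.Id)) return; // already applied once
    secondaryNames[msg.EntityId] = msg.NewName;
}

// One relay pass: deliver in order and remove a row only after it was applied.
int RunRelayOnce()
{
    int delivered = 0;
    while (outbox.Count > 0)
    {
        try { ApplyToSecondary(outbox[0]); }
        catch (InvalidOperationException) { break; } // keep the row, retry next pass
        outbox.RemoveAt(0);
        delivered++;
    }
    return delivered;
}

Rename(7, "new-name");
int passes = 0;
while (outbox.Count > 0) { RunRelayOnce(); passes++; }

var result = secondaryNames[7];
Console.WriteLine($"secondary: {result}, passes: {passes}"); // secondary: new-name, passes: 3
```

The change arrives late, on the third pass, but it arrives. The relay removes a row only after applying it, so a crash between those two steps causes a redelivery rather than a loss, and the applied-id check absorbs the duplicate.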

Detect and reconcile. If you truly can't restructure, at least stop pretending the write is atomic. Make the second write idempotent and retryable, and run a reconciliation pass that finds and repairs divergence. It's a patch over the gap, not a closing of it — but an honest patch beats a false guarantee.
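
A reconciliation pass can be as plain as a diff and repair. The sketch below assumes the first database wins every disagreement, which is my assumption and a decision you would have to make for your own data. The two dictionaries and the Reconcile function are hypothetical stand-ins for the same logical rows read from each database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical snapshots of the same logical rows in each database.
var primary = new SortedDictionary<int, string>
{
    [1] = "alpha", [2] = "beta-renamed", [3] = "gamma"
};
var secondary = new Dictionary<int, string>
{
    [1] = "alpha", [2] = "beta", [4] = "orphan"
};

// Brings the secondary in line with the primary and returns what it repaired.
List<string> Reconcile()
{
    var repairs = new List<string>();
    foreach (var (id, name) in primary)
    {
        if (!secondary.TryGetValue(id, out var existing) || existing != name)
        {
            secondary[id] = name; // missing or stale row: copy from the primary
            repairs.Add($"upsert {id}");
        }
    }
    foreach (var id in new List<int>(secondary.Keys)) // copy keys: we remove while scanning
    {
        if (!primary.ContainsKey(id))
        {
            secondary.Remove(id); // row the primary no longer has
            repairs.Add($"delete {id}");
        }
    }
    return repairs;
}

Console.WriteLine(string.Join(", ", Reconcile())); // upsert 2, upsert 3, delete 4
Console.WriteLine(Reconcile().Count);              // 0
```

The second call repairing nothing is the property that matters: the pass is idempotent, so it can run on a schedule without fear of making things worse.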

The mindset shift

The real fix isn't a code change, it's dropping a belief: that committing two transactions back-to-back is "basically" atomic. It isn't, and no arrangement of them will be. The moment a write spans two systems, you've left the world of database transactions and entered distributed systems, where the tools are outboxes, sagas, idempotency, and reconciliation — not a second CommitAsync and some hope. Recognizing which world you're in is most of the battle. The pseudo-2PC is sometimes good enough; just never mistake "good enough" for "atomic."
