In Part 1, we explored the four pillars of ACID: Atomicity
, Consistency
, Isolation
, and Durability
. But let's be honest — concepts like these can feel a bit textbook-ish until you hit a real problem and think, "Wait, how does this work under the hood?"
That's exactly what happened to me.
I found myself wondering:
How does the database make sure I don't end up with half-finished data?
That question led me to discover one of the most fascinating pieces of database architecture:
Atomicity, and its behind-the-scenes hero — the Write-Ahead Log (WAL).
💥 The Banking Disaster That Never Happened
Let's re-look at a simple example: you're transferring ₹100 from Alice to Bob.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE name = 'Alice';
UPDATE accounts SET balance = balance + 100 WHERE name = 'Bob';
COMMIT;
Looks clean, right?
But what if the server crashes right after debiting Alice, before Bob gets anything?
😱 Alice loses money. Bob gets nothing. Nobody wins.
That's where atomicity steps in — the database's promise that either everything happens or nothing does. No half-cooked data, no side effects.
But how?
🔍 Peeking Behind the Curtain: Write-Ahead Logging
One might think that databases simply update data files directly—just write to disk and it's done.
Turns out, that would be a disaster during a crash. It's like scribbling on a paper while someone's shaking the table — good luck keeping it neat.
Instead, databases use a strategy that's equal parts paranoid and brilliant:
Write down your plans first, then act.
That's the Write-Ahead Log (WAL). It's like the database's diary:
Heres exactly what I'm going to do. If I get interrupted, I'll just retrace my steps.
⚒️ How It Works?
Here's what really happens when you run a transaction:
- You start the transaction.
- The database logs every change — not to the table, but to the WAL.
- When all the changes are logged, the WAL is flushed to disk.
- Only then does the database respond with COMMIT successful.
- The actual table data? That might get updated later.
Yes — the real data might still be unchanged after your transaction commits. Mind-bending, right? This completely flipped my understanding. I always thought COMMIT meant "okay, it's definitely written to the database now." But really, it means "I've made an unbreakable promise to write it, and I have the receipt to prove it."
💡 When Does the Actual Data Get Updated?
This puzzled me at first. I assumed "commit" meant the database had written the final result.
But really, the WAL is the source of truth. As long as those logs are safe, the database can recreate the transaction later.
If the server crashes before it updates the real data? No problem. On restart, it reads the WAL and says:
Hmm, looks like I promised to credit Bob ₹100. Let me go finish that now.
It's like waking up and checking your to-do list: "Oh right, I never did step two."
🔄 Flushing the WAL – What Does That Really Mean?
Here's where I hit another wall.
Everyone says "WAL is flushed to disk" but what does that actually mean?
Turns out, it's a three-step process:
- The WAL is written to an in-memory buffer.
- The OS queues that buffer to be written to the disk.
- The database calls
fsync()
(or equivalent) to say:
"No, don't just promise to write it. Physically write it now."
That last step is critical. Without it, your data could still be floating around in memory buffers, and a power cut would wipe it out.
🗂️ What Happens After the WAL is Applied?
Okay, so the transaction is written to the WAL file… and later, it's applied to the actual database. But then what?
Does the WAL file get deleted?
Is the entry removed after the transaction is committed?
Not immediately.
In most databases—like PostgreSQL—the WAL (Write-Ahead Log) files are retained even after the changes they describe have been applied to the data files.
Why?
Because they serve more purposes than just crash recovery:
- 🔄 They support replication to standby servers
- 📂 They're used for point-in-time recovery (PITR)
- 📦 They can help with logical decoding and incremental backups
The database maintains checkpoints to track which changes have been fully flushed to disk. Once it's sure that:
- All data changes described in the WAL are safely written to disk, and
- The WAL is no longer needed for replication or recovery...
...then, and only then, the old WAL segments are marked for removal or archiving, depending on your settings.
So yes, even after a transaction is committed, the WAL entry sticks around for a while, unless you're in a minimalist config.
🤔 Okay, Hold On — If WAL is kept around, won't it apply the same transaction twice if the DB crashes again and again?
This was my next big question.
If WAL is replayed after a crash, how does the database avoid applying the same changes more than once?
The answer: LSNs — Log Sequence Numbers.
Each WAL record has an LSN. Every data page also stores the LSN of the last WAL entry it applied.
During recovery:
- If the WAL's LSN > page's LSN → apply
- If not → skip it, it's already done
It's like the database saying, "Oh, I already covered this. Moving on."
🔁 The Crash Recovery Sequence
Let's put it all together:
- You run transaction #101. WAL gets written.
-
COMMIT
is issued. WAL is flushed to disk. - 💥 Crash! Data pages never got updated.
- Database restarts. Sees WAL for TX 101.
- Applies the changes — based on LSNs.
- Done. The system is back where it was, no data lost.
That's atomicity in action. Clean, recoverable, foolproof.
❌ What WAL Doesn't Do
Here's something I misunderstood for a while:
WAL doesn't enforce constraints or business logic.
If you try to transfer ₹10,000 from an account that only has ₹5, the database rejects the transaction — before it ever touches the WAL.
WAL's job starts after the transaction passes all checks.
So:
- Atomicity is: "Apply all-or-nothing"
- Consistency is: "Only valid transactions are allowed"
Mixing those two up is easy — but important to separate.
🧪 PostgreSQL vs MySQL: WAL in the Wild
Different databases use different flavors of WAL:
✅ PostgreSQL
- Redo log only — applies committed changes during recovery
- Background process updates actual data files
- Controlled with
synchronous_commit
setting
✅ MySQL (InnoDB)
- Uses both redo and undo logs
- Redo = for committed transactions (like PostgreSQL)
- Undo = for rolling back partial changes
- Controlled with
innodb_flush_log_at_trx_commit
🧠 Why This Actually Matters (No, Really)
If you think COMMIT
is just a magic spell that makes your data safe — think again. Understanding WAL flips the script:
- 💣 Stop with the micro-transactions — every
COMMIT
slams a WAL flush. Your database isn’t a vending machine. - 🧩 Bundle your writes — batch related changes into one clean, atomic move. It’s not just tidy — it’s smart.
- ⚠️ Delayed durability? That’s code for “please lose my data if something crashes.” Use with extreme caution.
- 🔍 Know what
COMMIT
actually does — it's not fairy dust. It’s a disk write, a sync, and a promise. Respect it.
Treat your transactions like they matter — because they do.
📌 TL;DR
Question | Answer |
---|---|
What is WAL? | A durable log of all pending transaction ops |
When is a transaction "committed"? | After WAL is flushed to disk (fsync ) |
What if we crash after commit? | WAL is replayed — data is recovered |
Can WAL be applied twice? | No — LSNs prevent duplicate application |
Does WAL enforce data validity? | No — it only logs valid transactions |
🕒 Next Up: Isolation
Now that we've uncovered the "A" in ACID, it's time to dive into the most slippery one — Isolation.
In Part 3, we'll explore:
- What really happens when two users update the same data at the same time
- The difference between repeatable read and serializable
- And what a "phantom read" actually looks like
I promise — it's weirder and more fun than it sounds.
Top comments (0)