DEV Community

Cover image for Lost Updates, Dirty Reads, and Phantom Rows: A Simple Guide to SQL Transaction Problems
hinlocaesar
hinlocaesar

Posted on

Lost Updates, Dirty Reads, and Phantom Rows: A Simple Guide to SQL Transaction Problems

Ever wondered why your banking app doesn't let two people withdraw the same money twice? Or why sometimes database queries give you inconsistent results? Welcome to the world of transaction isolation problems!

When multiple users access a database simultaneously, things can get messy. Let's explore the four classic concurrency problems that keep database engineers up at night. πŸŒ™

πŸ”„ Lost Updates

What happens: Two transactions update the same data, but one overwrites the other's changesβ€”causing the first update to vanish into thin air.

Real-world example:

Imagine two people trying to add money to the same bank account at the exact same time:

Timeline:
1. Alice reads balance: $100
2. Bob reads balance: $100
3. Alice adds $50 β†’ writes $150 to database
4. Bob adds $30 β†’ writes $130 to database

Final balance: $130 ❌
Expected balance: $180 βœ…
Enter fullscreen mode Exit fullscreen mode

What went wrong: Alice's $50 deposit was completely lost! Bob read the balance before Alice's transaction committed, so he had stale data. When Bob wrote his result, he overwrapped Alice's update.

SQL example:

-- Transaction A
BEGIN;
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
COMMIT;

-- Transaction B (running simultaneously)
BEGIN;
UPDATE accounts SET balance = balance + 30 WHERE id = 1;
COMMIT;
Enter fullscreen mode Exit fullscreen mode

The fix: Use proper locking or higher isolation levels to ensure transactions see each other's changes.


🧹 Dirty Reads

What happens: You read data that another transaction has changed but hasn't committed yet. If that transaction rolls back, you've read data that never officially existed!

Real-world example:

An online store temporarily changes a product's price:

Timeline:
1. Transaction A: Changes iPhone price from $999 to $1 (not committed)
2. Transaction B: Reads price β†’ sees $1
3. Transaction A: Realizes mistake and ROLLBACK
4. Transaction B: Shows customer the $1 price
5. Customer tries to buy at $1, but actual price is still $999 😑
Enter fullscreen mode Exit fullscreen mode

SQL example:

-- Transaction A
BEGIN;
UPDATE products SET price = 1 WHERE id = 42;
-- Oops, that was a mistake!
ROLLBACK;

-- Transaction B (running at the same time)
BEGIN;
SELECT price FROM products WHERE id = 42;  -- Sees $1!
-- Uses this price for calculations...
COMMIT;
Enter fullscreen mode Exit fullscreen mode

The fix: Use READ COMMITTED or higher isolation level to prevent reading uncommitted data.


πŸ”€ Non-Repeating Reads (Fuzzy Reads)

What happens: You read the same data twice within one transaction and get different results because another transaction modified it in between.

Real-world example:

Checking your bank balance while someone else is transferring money:

Timeline:
1. You start checking account summary
2. You read checking account: $1,000
3. Someone transfers $500 from checking to savings (commits)
4. You read savings account: $1,500
5. Total shown: $2,500 (actual total: $2,000) ❌
Enter fullscreen mode Exit fullscreen mode

Same transaction, same query, different results each time!

SQL example:

-- Your transaction
BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- Returns $1,000

-- Someone else's transaction (completes)
BEGIN;
UPDATE accounts SET balance = 500 WHERE id = 1;
COMMIT;

-- Back to your transaction
SELECT balance FROM accounts WHERE id = 1;  -- Now returns $500!
COMMIT;
Enter fullscreen mode Exit fullscreen mode

The fix: Use REPEATABLE READ isolation level to lock rows you've read.


πŸ‘» Phantom Reads

What happens: New rows appear (or disappear) between queries in the same transaction, like ghosts materializing out of nowhere.

Real-world example:

Generating a report on employee count:

Timeline:
1. Transaction A: SELECT COUNT(*) FROM employees WHERE dept = 'Sales'
   Result: 10 employees

2. Transaction B: INSERT INTO employees (name, dept) VALUES ('Alice', 'Sales')
   COMMIT;

3. Transaction A: SELECT COUNT(*) FROM employees WHERE dept = 'Sales'
   Result: 11 employees (Wait, where did this person come from? πŸ‘»)

4. Transaction A: COMMIT
Enter fullscreen mode Exit fullscreen mode

SQL example:

-- Transaction A (generating a report)
BEGIN;
SELECT COUNT(*) FROM employees WHERE dept_id = 5;  -- Returns 10
-- ... doing other calculations ...

-- Transaction B (runs and completes)
BEGIN;
INSERT INTO employees (name, dept_id) VALUES ('New Hire', 5);
COMMIT;

-- Back to Transaction A
SELECT COUNT(*) FROM employees WHERE dept_id = 5;  -- Now returns 11!
-- Report is now inconsistent!
COMMIT;
Enter fullscreen mode Exit fullscreen mode

The fix: Use SERIALIZABLE isolation level to prevent phantom reads.


πŸ›‘οΈ The Solution: Isolation Levels

SQL databases offer different isolation levels that trade off between consistency and performance:

Isolation Level Lost Updates Dirty Reads Non-Repeating Reads Phantom Reads
READ UNCOMMITTED ❌ Possible ❌ Possible ❌ Possible ❌ Possible
READ COMMITTED ❌ Possible βœ… Prevented ❌ Possible ❌ Possible
REPEATABLE READ βœ… Prevented βœ… Prevented βœ… Prevented ❌ Possible
SERIALIZABLE βœ… Prevented βœ… Prevented βœ… Prevented βœ… Prevented

Setting isolation level in SQL:

-- PostgreSQL/MySQL
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
-- Your queries here
COMMIT;

-- SQL Server
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- Your queries here
COMMIT;
Enter fullscreen mode Exit fullscreen mode

🎯 Quick Decision Guide

Use READ COMMITTED (default in most DBs) when:

  • Building typical web applications
  • You need good performance
  • Slight inconsistencies are acceptable

Use REPEATABLE READ when:

  • Generating financial reports
  • Processing batch operations
  • Reading data multiple times in one transaction

Use SERIALIZABLE when:

  • Money transfers between accounts
  • Inventory management (prevent overselling)
  • Anything where absolute consistency matters

πŸ’‘ Key Takeaways

  1. Lost Updates = One transaction overwrites another's changes
  2. Dirty Reads = Reading data that might be rolled back
  3. Non-Repeating Reads = Same query, different results within one transaction
  4. Phantom Reads = Rows appearing or disappearing mid-transaction

Higher isolation levels give you more consistency but slower performance. Choose wisely based on your application's needs!

Have you encountered any of these problems in your applications? Share your war stories in the comments below! πŸ‘‡


Want to learn more? Check out your database's documentation on transaction isolation levels. Each database (PostgreSQL, MySQL, SQL Server) implements these slightly differently!

sql #database #transactions #concurrency #programming

Top comments (0)