Table of Contents
- 1. Overview: Key Architectural Differences
- 2. Isolation Levels and Concurrency Control
- 3. MVCC and Transaction Isolation
- 4. SERIALIZABLE Isolation: Pessimistic vs Optimistic Strategies
Transaction processing reveals fundamental architectural differences between MySQL and PostgreSQL. MySQL prioritizes performance and predictability through pessimistic locking. PostgreSQL prioritizes consistency and concurrency through optimistic conflict detection. Understanding these differences will help you write better applications and avoid subtle data integrity issues.
1. Overview: Key Architectural Differences
This section provides a high-level comparison of how MySQL and PostgreSQL handle transactions. We'll cover three fundamental areas where they differ: ACID compliance, default isolation levels, and MVCC implementation.
Core Architecture Comparison
Aspect | MySQL 8.4 | PostgreSQL 17 |
---|---|---|
ACID Compliance | Engine-dependent - InnoDB provides full ACID, MyISAM does not | Built-in - ACID compliance is part of core architecture |
Default Isolation Level | REPEATABLE READ - Stricter by default | READ COMMITTED - More concurrent by default |
MVCC Implementation | Undo log - Automatic cleanup, zero maintenance | Tuple versioning - Requires VACUUM maintenance |
Phantom Read Prevention | Gap locking (at REPEATABLE READ) | Snapshot isolation (at REPEATABLE READ) |
What Each Difference Means
ACID Compliance:
MySQL's ACID support depends on which storage engine you use. InnoDB (the default) provides full ACID compliance, but MyISAM does not. PostgreSQL has ACID compliance built into its core architecture—every storage mechanism supports it.
Default Isolation Levels:
MySQL defaults to REPEATABLE READ, providing stronger isolation out of the box. PostgreSQL defaults to READ COMMITTED, favoring higher concurrency. Both prevent phantom reads at REPEATABLE READ level, but through different mechanisms (gap locking vs snapshot isolation).
MVCC Maintenance:
MySQL's undo log approach means old row versions are automatically cleaned up by a background purge thread—zero operational overhead. PostgreSQL's tuple versioning approach stores multiple row versions in the table itself, requiring VACUUM to reclaim space from dead tuples.
2. Isolation Levels and Concurrency Control
💡 Key Insight: Both MySQL and PostgreSQL are overachievers—they prevent phantom reads at REPEATABLE READ even though the SQL standard doesn't require it! 🎓 But here's the plot twist: MySQL is like that friend who locks ALL the doors (gap locking), while PostgreSQL takes a snapshot and says "trust, but verify" (snapshot isolation). Different philosophies, different trade-offs!
2.1 Isolation Levels Comparison
The SQL standard defines four isolation levels, but MySQL and PostgreSQL interpret them quite differently. The key difference? Their default choices reveal their priorities.
Isolation Level | MySQL 8.4 Behavior | PostgreSQL 17 Behavior |
---|---|---|
READ UNCOMMITTED | Rarely Used - In practice behaves like READ COMMITTED in InnoDB | Not truly supported - Behaves exactly like READ COMMITTED |
READ COMMITTED | Available - Allows non-repeatable reads and phantom reads | Default - Allows non-repeatable reads and phantom reads |
REPEATABLE READ | Default - Gap locking + MVCC prevents phantoms, allows write-skew | Available - Snapshot isolation prevents phantoms, allows write-skew |
SERIALIZABLE | Pessimistic - Converts plain SELECTs to FOR SHARE, prevents conflicts via locking | Optimistic - SSI detects conflicts at commit, can abort transactions |
Notice the defaults? MySQL ships with REPEATABLE READ—it's saying "I'll protect you even if it means blocking more." PostgreSQL ships with READ COMMITTED—it's saying "Let's be fast and concurrent; upgrade isolation when you need it."
Here's the fascinating part: Both databases go beyond the SQL standard at REPEATABLE READ by preventing phantom reads (when new rows appear in repeated queries). The standard doesn't require this! But they achieve it in completely opposite ways:
- MySQL's approach: Use gap locks to physically block inserts that would create phantoms. It's like putting a "Reserved" sign on every empty seat at a restaurant.
- PostgreSQL's approach: Take a snapshot and ignore any new rows added after. It's like taking a photo of the restaurant—even if new people arrive, your photo stays the same.
Critical insight for your app: Neither prevents write-skew at REPEATABLE READ (we'll explain this next). If you need protection against sophisticated anomalies, both databases require SERIALIZABLE—but they implement it VERY differently.
2.2 Write-Skew Anomaly
What is Write-Skew?
Write-skew is the sneakiest of database anomalies—it's like two people independently making perfectly reasonable decisions that somehow create chaos when combined. 🤔
Let me explain with a real-world scenario that'll make this crystal clear.
The Doctor On-Call Scheduling Problem
Picture this: You're running a hospital with one iron-clad rule—at least 2 doctors must be on-call at all times. Right now, Alice, Bob, and Charlie are all on-call, so you're safely above the minimum. 🏥
It's Friday evening, and both Alice and Bob are exhausted. They each want to go off-call, but they're responsible professionals—they'll only leave if there are still enough doctors remaining.
Here's what happens at REPEATABLE READ isolation level:
Timeline:
Time Transaction 1 (Alice) Transaction 2 (Bob)
---- --------------------------- ---------------------------
T1 BEGIN BEGIN
T2 SELECT COUNT(*) FROM doctors
WHERE on_call = true;
→ Returns 3 ✅
T3 SELECT COUNT(*) FROM doctors
WHERE on_call = true;
→ Returns 3 ✅
T4 "Great! 3 doctors on-call,
safe for me to leave."
UPDATE doctors
SET on_call = false
WHERE name = 'Alice';
T5 "Great! 3 doctors on-call,
safe for me to leave."
UPDATE doctors
SET on_call = false
WHERE name = 'Bob';
T6 COMMIT ✅ COMMIT ✅
The Result: Both transactions succeed! But now only Charlie is on-call. 😱 The "at least 2" constraint is violated!
Why did this happen?
This is write-skew. Here's what makes it so tricky:
- Both transactions read the same data (count of on-call doctors = 3)
- Both made reasonable decisions based on what they saw
- Both updated DIFFERENT rows (Alice's record vs Bob's record)
- No direct conflict occurred—traditional locks see no problem!
- The constraint was violated because neither transaction saw the other's changes
In database terms: Write-skew occurs when two concurrent transactions read overlapping data sets, make decisions based on what they read, then write to disjoint (non-overlapping) sets of rows in a way that violates an integrity constraint.
Why don't traditional locks catch this?
Because there's no write-write conflict! Alice updates Alice's row. Bob updates Bob's row. Different rows = no lock conflict. The database doesn't know that these two independent updates, when combined, violate a business rule. 🤷♂️
It's like two people checking the fridge, seeing 3 beers, each taking one—nobody's stealing the other's beer, but somehow the third roommate ends up with nothing. The reads overlap (both see the same count), but the writes don't (different rows).
The Key Difference:
Aspect | MySQL REPEATABLE READ | PostgreSQL REPEATABLE READ |
---|---|---|
Write-Skew Protection | ❌ Not prevented | ❌ Not prevented |
Why It Occurs | Gap locks only prevent phantom reads, not cross-row constraint violations | Snapshot Isolation doesn't detect conflicts on different rows |
Solution | Must use SERIALIZABLE or explicit locking (SELECT ... FOR UPDATE ) |
Must use SERIALIZABLE (SSI detects and aborts) or explicit locking |
Bottom Line: Both databases allow write-skew at REPEATABLE READ. The real difference appears at SERIALIZABLE level—PostgreSQL's SSI can detect and abort write-skew patterns automatically, while MySQL's gap locking only prevents conflicts through blocking, not intelligent detection.
2.3 Gap Locking vs Predicate Locks
Remember how we said both MySQL and PostgreSQL prevent phantom reads at REPEATABLE READ? This is how they do it—and the approaches couldn't be more different. Understanding this difference is crucial because it directly impacts your application's concurrency under load.
The Core Problem: Phantom Reads
Imagine you're counting inventory:
SELECT COUNT(*) FROM products WHERE category = 'Electronics';
-- Returns 100
During your transaction, someone inserts a new product with category = 'Electronics'
. If you run the same query again and suddenly get 101, that's a phantom read—a row that "appeared" out of nowhere. 👻
Both databases prevent this at REPEATABLE READ, but with radically different strategies.
MySQL's Gap Locking: The Pessimistic Gatekeeper
MySQL uses gap locks—it literally locks the "gaps" (empty spaces) in the index between existing records. Think of it like reserving not just the occupied tables at a restaurant, but also the empty spaces between them.
How it works:
When you run a query with FOR UPDATE
, MySQL doesn't just lock the rows that match—it locks the index ranges those rows occupy, including the gaps. This physically prevents anyone from inserting new rows that would fall into those gaps.
Example: Lock products with IDs between 100-200:
SELECT * FROM products WHERE id BETWEEN 100 AND 200 FOR UPDATE;
MySQL locks:
- All existing rows with IDs 100-200 (obviously)
- The gap before 100 (e.g., 95-99)
- The gap after 200 (e.g., 201-205)
- All gaps between existing rows in the range
The catch: Gap locks are range-based, not logic-based. They don't understand your WHERE clause's full meaning—they just lock index ranges. This can block inserts that are logically unrelated to your query.
PostgreSQL's Predicate Locks: The Precision Specialist
PostgreSQL uses predicate locks (also called SIREAD locks)—it remembers the actual predicate (WHERE clause) you used and only blocks operations that would violate that specific predicate.
How it works:
Instead of locking physical gaps in an index, PostgreSQL tracks the logical condition of your query. It says "I'm watching for any inserts that match WHERE category = 'Electronics'
"—and only those get blocked.
Example: Lock electronics products:
SELECT * FROM products WHERE category = 'Electronics' FOR UPDATE;
PostgreSQL blocks:
- Only inserts where
category = 'Electronics'
- Inserts with other categories proceed freely
The advantage: Higher concurrency because only actual predicate matches get blocked. If your query was WHERE category = 'Electronics' AND price > 100
, PostgreSQL only blocks inserts matching both conditions.
The Key Difference:
Aspect | MySQL (Gap Locking) | PostgreSQL (Predicate Locks) |
---|---|---|
Locking Scope | 🔴 Range-based - Locks index gaps, may block unrelated inserts | 🟢 Predicate-based - Only locks rows matching WHERE clause |
Blocking Behavior | Blocks inserts in or near locked range, even outside query conditions | Blocks only inserts that match the exact query predicate |
Concurrency Impact | Lower - Can block logically unrelated operations | Higher - Only blocks actual conflicts |
Example | Query WHERE date BETWEEN '2024-02-01' AND '2024-06-30' may block insert at 2024-01-31 (adjacent gap) |
Same query only blocks inserts matching both date range AND other conditions |
Example Scenario:
-- Lock contracts: WHERE start_date BETWEEN '2024-02-01' AND '2024-06-30' AND office_id = 1
Insert Attempt | MySQL Gap Locking | PostgreSQL Predicate Locks |
---|---|---|
office_id=1, date='2024-03-15' |
❌ BLOCKS (in range) | ❌ BLOCKS (matches predicate) |
office_id=1, date='2024-01-31' |
❌ BLOCKS (adjacent gap) | ✅ SUCCEEDS (outside date range) |
office_id=2, date='2024-03-15' |
✅ SUCCEEDS (different office) | ✅ SUCCEEDS (different office) |
Bottom Line: MySQL's gap locks are pessimistic and broader (blocking adjacent ranges), while PostgreSQL's predicate locks are precise (only blocking exact predicate matches). This makes PostgreSQL more concurrent at REPEATABLE READ level.
3. MVCC and Transaction Isolation
Note: For detailed MVCC storage implementation, see Part 2: Storage Architecture, Section 3.
Both databases use MVCC (Multi-Version Concurrency Control) to enable concurrent transactions, but their different MVCC implementations directly impact transaction isolation behavior. Understanding how MVCC affects transactions helps explain why certain isolation anomalies occur.
How MVCC Enables Isolation
MVCC allows readers to see consistent snapshots of data without blocking writers. The key question: When does a transaction see changes made by other transactions?
MySQL's Approach:
- Uses undo log to reconstruct old row versions
- At REPEATABLE READ: Creates a consistent snapshot at first read
- Readers see the snapshot, even if other transactions commit changes
- Writers acquire locks, creating potential blocking
PostgreSQL's Approach:
- Uses tuple versioning with transaction IDs
- At REPEATABLE READ: Creates a snapshot at transaction start
- Readers see the snapshot, completely isolated from concurrent changes
- Writers don't block readers (true snapshot isolation)
Transaction Behavior Implications
Scenario: Two concurrent transactions updating the same row
Event | MySQL (REPEATABLE READ) | PostgreSQL (REPEATABLE READ) |
---|---|---|
T1: BEGIN | Snapshot created on first read | Snapshot created at BEGIN |
T1: SELECT balance | Sees 1000, creates snapshot | Sees 1000 |
T2: UPDATE balance = 900 | T2 acquires row lock | T2 proceeds |
T2: COMMIT | Lock released | Commits successfully |
T1: SELECT balance again | Still sees 1000 (snapshot) | Still sees 1000 (snapshot) |
T1: UPDATE balance = 800 | Must wait if T2 holds lock | Proceeds, creates new version |
Key Difference: PostgreSQL's tuple versioning allows both transactions to proceed without blocking, potentially leading to lost updates. MySQL's locking approach blocks T1's UPDATE until T2 commits.
Why This Matters for Isolation Levels
The MVCC implementation explains why:
- Both prevent phantom reads at REPEATABLE READ - MySQL uses gap locks, PostgreSQL uses snapshots
- Both allow write-skew at REPEATABLE READ - Neither detects cross-row constraint violations
- PostgreSQL's SSI at SERIALIZABLE is more powerful - Tuple versioning enables dependency tracking
- MySQL requires more explicit locking - Undo log approach is coupled with pessimistic locking
Bottom Line: MVCC isn't just about storage—it fundamentally shapes how transactions interact. MySQL's undo log approach favors predictability through locking. PostgreSQL's tuple versioning favors concurrency but requires VACUUM maintenance.
4. SERIALIZABLE Isolation: Pessimistic vs Optimistic Strategies
💡 Key Insight: This is the ultimate showdown! 🥊 At SERIALIZABLE isolation level, the philosophical differences between MySQL and PostgreSQL reach their peak. MySQL is the bouncer who doesn't let anyone suspicious near the door (pessimistic locking). PostgreSQL is the cool host who lets everyone in, then kicks out troublemakers at the end (optimistic SSI). MySQL says "better safe than sorry," while PostgreSQL says "let's roll the dice and see what happens!" Both approaches work—just depends on whether you prefer blocking or retrying.
SERIALIZABLE is the strictest isolation level—it guarantees that concurrent transactions produce the same result as if they ran one at a time, in some serial order. Sounds great, right? The catch is how you achieve this. Both databases get there, but the experience is totally different.
Note: For detailed locking mechanisms, see Section 2.3.
The Key Differences
Aspect | MySQL (Pessimistic) | PostgreSQL (Optimistic - SSI) |
---|---|---|
Strategy | Block conflicts before they happen | Detect conflicts at commit time |
SELECT Behavior | Plain SELECTs become SELECT ... FOR SHARE (acquire shared locks) |
Plain SELECTs don't acquire locks |
Blocking Point | Early - On reads (FOR UPDATE/FOR SHARE) | Late - On commit (conflict detection) |
Concurrency | ❌ Lower - Extensive blocking | ✅ Higher - Concurrent execution allowed |
Transaction Failures | ✅ Rare (blocking prevents conflicts) | ❌ More common (serialization errors) |
Write-Skew Detection | ❌ No - Only prevents through blocking | ✅ Yes - SSI detects and aborts |
Application Code | ✅ Simpler (no retry logic) | ❌ More complex (must handle retries) |
What this means in practice:
When you run transactions at SERIALIZABLE in MySQL, even plain SELECT
statements automatically acquire shared locks (as if you wrote SELECT ... FOR SHARE
). This means Transaction 2 trying to read what Transaction 1 is reading? Blocked. Transaction 2 trying to update what Transaction 1 read? Blocked. Everything waits politely in line.
PostgreSQL does the opposite. Transactions run freely, reading and writing concurrently. PostgreSQL's SSI (Serializable Snapshot Isolation) tracks dependencies between transactions. Only at commit time does PostgreSQL ask: "Would this ordering violate serializability?" If yes, one transaction gets aborted with a serialization error. If no, everyone commits happily.
This is why PostgreSQL can detect write-skew anomalies (remember the doctor scheduling problem?) while MySQL can't—SSI is smart enough to see the dangerous pattern. MySQL just blocks aggressively and hopes for the best.
Behavior Comparison
Scenario: Two concurrent transactions reading and updating different rows
Event | MySQL SERIALIZABLE | PostgreSQL SERIALIZABLE |
---|---|---|
T1: SELECT | Acquires shared locks | No locks, proceeds |
T2: SELECT FOR UPDATE | ❌ BLOCKS waiting for T1 | ✅ Proceeds concurrently |
T1: UPDATE + COMMIT | Completes, T2 unblocks | Completes |
T2: UPDATE + COMMIT | Completes after wait | May ABORT if SSI detects conflict |
The Trade-off
Choose MySQL When | Choose PostgreSQL When |
---|---|
✅ Need predictable behavior (rare failures) | ✅ Need maximum concurrency |
✅ Want simpler application code (no retries) | ✅ Need write-skew detection |
✅ Low contention workload | ✅ Have retry logic implemented |
✅ Can tolerate blocking delays | ✅ High-contention workload benefits from optimism |
Bottom Line: MySQL = fewer failures but more blocking. PostgreSQL = more failures but higher throughput.
Top comments (0)