In a previous guide, we discussed the high-level concepts of how databases manage multiple users—touching on transactions, locks, and connection pools. But if you're a backend engineer or preparing for a system design interview, you need to know what’s actually happening under the hood.
When 10,000 users hit a database simultaneously, how does the engine ensure data integrity without grinding to a halt? Let's dive deep into the architecture of database concurrency.
1. Concurrency Control Mechanisms
At the core of any relational database (like PostgreSQL or MySQL) is its Concurrency Control system. There are three primary strategies databases use to manage simultaneous access.
Pessimistic Concurrency Control (Locking)
Pessimistic concurrency assumes that conflicts will happen frequently. Therefore, it locks resources before modifying them.
- 2PL (Two-Phase Locking): A strict protocol where a transaction acquires all the locks it needs (Expanding Phase) and then releases them (Shrinking Phase).
- The Problem: Deadlocks. If Transaction A locks Row 1 and needs Row 2, while Transaction B locks Row 2 and needs Row 1, both freeze forever. The database handles this using a background Deadlock Detector that periodically scans for circular waits and forcefully kills one of the transactions (rolling it back).
Optimistic Concurrency Control (OCC)
Optimistic concurrency assumes conflicts are rare. It doesn't use locks when reading or updating. Instead, it reads data, modifies it locally, and right before committing, it checks a version number or timestamp.
- If the version hasn't changed since it was read, the commit succeeds.
- If the version has changed (someone else modified it), the transaction is aborted and must be retried.
- Use case: Great for read-heavy workloads (like Elasticsearch), but terrible for high-contention systems (like a ticket booking app) because the CPU wastes time rolling back and retrying.
MVCC (Multi-Version Concurrency Control)
This is the holy grail used by modern databases like PostgreSQL and MySQL (InnoDB).
The core philosophy of MVCC is: "Readers do not block writers, and writers do not block readers."
Instead of updating a row in place, MVCC creates a new version of the row.
- Every transaction is assigned a unique, monotonically increasing Transaction ID (XID).
- When a row is updated, the database writes the new data as a completely new tuple (row version) and marks the old tuple with an expiration XID.
- When User A runs a
SELECTquery, the database looks at User A's XID and only shows them row versions that existed before their transaction started. They see a completely consistent "snapshot" of the database, even if User B is currently modifying those exact rows!
(Note: Because MVCC keeps old versions around, databases need a background process—like PostgreSQL's **Vacuum—to eventually clean up dead rows to prevent storage bloat.)
2. The Buffer Pool and Latches (Not Locks!)
Databases do not read and write directly to the hard drive for every query—that would be catastrophically slow. Instead, they use a massive block of RAM called the Buffer Pool.
When a user requests data, the database loads an entire "Page" (usually 8KB of data) from the disk into the Buffer Pool. But what happens if two threads try to modify the exact same 8KB page in RAM at the exact same microsecond?
They use Latches.
- Locks protect logical database concepts (Rows, Tables) for the duration of a transaction.
- Latches protect physical memory structures (Pages, B-Trees) for the duration of a CPU instruction.
Latches are incredibly lightweight and fast. To handle thousands of users, the database relies on read/write latches (shared vs. exclusive) to ensure that in-memory data structures don't get corrupted by multi-threading.
3. Write-Ahead Logging (WAL)
If thousands of users are modifying data purely in the in-memory Buffer Pool, what happens if the server loses power? All that data is gone, right?
Enter the Write-Ahead Log (WAL).
Before the database modifies any page in the Buffer Pool, it must write a record of the intended change to a sequential log file on the disk (the WAL).
- Writing sequentially to a disk is blazing fast compared to random I/O.
- If the power fails, upon reboot, the database simply replays the WAL to reconstruct the exact state of the Buffer Pool before the crash.
This is how databases achieve extreme concurrent throughput while still guaranteeing Durability (the 'D' in ACID).
4. Transaction Isolation Levels Explained
When users run concurrently, they can experience "Phenomena" (anomalies). The SQL standard defines Isolation Levels to combat these:
- Read Uncommitted: You can see data that another transaction hasn't saved yet (Dirty Reads). Extremely fast, but dangerous. Rarely used.
- Read Committed: You only see saved data. However, if you query a row, wait, and query it again, the data might have changed (Non-Repeatable Reads). This is the default in PostgreSQL.
-
Repeatable Read: If you read a row, it is guaranteed to look the exact same for the duration of your transaction. However, a concurrent user might insert new rows that suddenly appear in your
COUNT(*)query (Phantom Reads). This is the default in MySQL InnoDB. - Serializable: The strictest level. The database uses complex locking (or Serializable Snapshot Isolation) to mathematically guarantee that concurrent transactions yield the exact same result as if they were executed one-by-one in a single-file line. It prevents all anomalies but severely limits concurrent throughput.
5. Threading and Connection Architectures
Finally, how does the database server OS handle the network connections of thousands of users?
- Process-per-connection (PostgreSQL): Every new connection forks a completely new OS process. This provides great memory isolation but uses a lot of RAM. This is why PostgreSQL highly recommends using a connection pooler like PgBouncer to multiplex 10,000 client connections onto a small pool of 100 actual database processes.
- Thread-per-connection (MySQL): Uses a single process but spawns a lightweight thread for every connection. It consumes less memory per connection than Postgres but still suffers from context-switching overhead at high scale.
🏁 Summary
Behind the scenes, a modern relational database is a marvel of concurrent engineering.
- It uses MVCC to ensure readers and writers don't block each other.
- It protects memory structures with microsecond Latches while protecting logical rows with Locks.
- It ensures durability without sacrificing speed via Write-Ahead Logging.
- It balances speed vs. correctness using configurable Isolation Levels.
Understanding these concepts is the key to tuning databases, debugging deadlocks, and architecting systems that scale gracefully to millions of users.
Top comments (0)