Vipul Gaikwad

Posted on Jun 28

How Postgres Writes Data: WAL, Shared Buffers, and the Illusion of Speed

#architecture #database #performance #postgres

As developers, we interact with databases every single day. We write a line of code like UPDATE users SET status = 'active' WHERE id = 42;, hit enter, and within a fraction of a millisecond, our terminal or API backend says Success! It feels like magic. But if you stop and think about how physical hardware works, that speed is actually a massive engineering paradox.

Your server's hard drive or SSD is relatively slow at doing "random writes"—which means jumping around to update data inside specific, scattered tables. If Postgres forced your physical hard drive to find and rewrite User #42's data block the exact millisecond you sent the query, your application would grind to a halt under high traffic.

So, how does PostgreSQL return a success message instantly while guaranteeing that your data won't be lost if the power suddenly cuts out?

Postgres is a master illusionist. It tricks your application into thinking the data has been organized into your tables on disk, while it actually hid it in two strategic places: Fast Memory (Shared Buffers) and an Append-Only Insurance Log (The WAL).

Let's lift the hood and see exactly how Postgres pulls off this performance trick.

1. The Fast Lane: Shared Buffers (RAM Cache)
PostgreSQL doesn't look at your data as individual rows. Internally, it breaks your tables down into Pages—which are standard 8KB blocks of memory.

When you boot up PostgreSQL, it claims a large chunk of your server's RAM. This playground in memory is called the Shared Buffers.

When you execute that UPDATE query for User #42, Postgres calculates exactly which 8KB page that user lives on. It then executes a simple choice:

The Cache Hit: If someone recently read or updated that user, the 8KB page is already sitting in the Shared Buffers RAM. Postgres updates the data bits right there in memory.
The Cache Miss: If the page isn't in RAM, Postgres runs down to the slow storage disk, copies that entire 8KB page up into the Shared Buffers RAM, and then modifies it in memory.

Once a page is modified in RAM but hasn't been written back to the permanent table files on disk yet, Postgres marks it as a Dirty Page.

Because modifying RAM takes nanoseconds, your server gets an almost instant response. But this leaves us with a massive problem: RAM is volatile. If someone pulls the power plug right now, that "dirty page" vanishes, and your data modification is gone forever.

2. The Insurance Policy: The Write-Ahead Log (WAL)
To protect your data without slowing down your queries, Postgres uses a core design pattern called the Write-Ahead Log (WAL).

The WAL is a continuous, append-only binary file saved directly onto your physical disk. Before the Postgres engine is allowed to tell your application "Success!", it hands the transaction details to a background worker called the WAL Writer.

The WAL Writer appends the raw transaction bytes to the very end of the WAL file on disk.

Sequential vs. Random I/O
This is the ultimate engineering trade-off that makes databases work:

Modifying actual table storage files is Random I/O. Because data rows are scattered, the disk has to jump around to different locations. This is incredibly slow.
Appending to the WAL is Sequential I/O. The disk arm or SSD controller doesn't have to jump; it just writes data to the very end of a continuous log file. This is blindingly fast.

Postgres guarantees durability by ensuring the log hit the physical disk before it tells your backend app that the query succeeded.

What if the database crashes?
If the server loses power while your data is sitting as a "dirty page" in RAM, Postgres doesn't panic. When it boots back up, it triggers a Crash Recovery process. It opens the WAL file on disk, reads the recent transactions, and replays them sequentially (a process called Redo). This perfectly rebuilds the state of the database up to the exact millisecond of the crash.

3. The Clean-Up Crew: The Checkpointer
So far, our updated data is safely written to the sequential WAL file on disk, and the live application is reading the fast, updated "dirty page" in the RAM Shared Buffers. But the actual, permanent table files (the .data files) still contain the old data.

When does the real table storage get updated? Enter The Checkpointer.

The Checkpointer is a background process that acts like a cleaning crew. Periodically, it wakes up, scans the Shared Buffers for all "dirty pages," and flushes them in a batch down to the permanent .data storage files on disk.

By default, Postgres triggers a checkpoint based on two configurations:

Time Limit (checkpoint_timeout): Every 5 minutes.
Data Volume Limit (max_wal_size): If your app is writing so much data that your unwritten WAL files hit a threshold (like 1 GB), Postgres runs an early checkpoint to free up disk space.

Once the Checkpointer finishes writing the dirty pages to the table files and verifies the disk has safely locked them in, those old WAL files are no longer needed for crash recovery. Postgres can now safely delete or recycle them.

** 4. The Concurrency Strategy: Why Postgres Uses Processes, Not Threads**
If you compare PostgreSQL to other databases like MySQL, you will notice a radical architectural difference: Postgres chooses Operating System Processes over Threads.

When an application server connects to Postgres, the database doesn't spin up a lightweight thread. Instead, it triggers an operating system fork() command to clone a completely independent, dedicated OS process for that specific connection.

Every single database connection gets its own completely isolated memory sandbox.

Why choose heavy processes over light threads?

Absolute Fault Isolation: In a threaded database, all threads share the same memory space. If a single thread encounters a catastrophic memory bug or a corrupted extension, it can accidentally overwrite a neighboring thread's memory or crash the entire database instance. In Postgres, if a connection process crashes, only that single user's connection dies. The rest of the database remains completely online.
No Memory Leaks: Complex queries require temporary memory allocations to sort and join data. When a Postgres connection process closes, the operating system instantly wipes that entire process sandbox clean, leaving zero chance for insidious memory leaks. How do isolated processes share the RAM cache? If processes are completely sandboxed by the operating system, how do they all look at the same tables inside the Shared Buffers?

Postgres solves this using POSIX Shared Memory. When Postgres starts up, it requests a giant block of shared memory from the operating system. Every single isolated connection process maps this shared memory segment into its own address space, allowing them to collaborate on the same data cache securely.

5. The Scaling Solution: Connection Pooling
Because operating system processes are resource-heavy, running a process-per-connection design means Postgres has a hard ceiling. If you try to open 5,000 or 10,000 direct active processes on a standard database server, the CPU will buckle under intense context-switching overhead.

How do applications with millions of active users scale on Postgres without melting the server? They decouple users from database processes using a Connection Pooler.

In a production environment, you place an incredibly lightweight proxy like PgBouncer right in front of your database.

The Edge: Your thousands of application servers open 5,000 lightweight network connections to PgBouncer. Because PgBouncer uses an asynchronous event-loop (similar to Node.js or Nginx), it can hold tens of thousands of connections open using almost zero CPU.
The Core: PgBouncer connects to PostgreSQL using a tiny, fixed pool of heavy OS processes—typically just 50 to 100 processes.
The Multiplexing: When a user triggers an API request, PgBouncer grabs the query, assigns it to one of the 100 idle Postgres processes for a couple of milliseconds, completes it, and instantly hands that process to the next query waiting in line.

The database processes never multiply by the number of users. The database operates at peak mathematical efficiency because it only ever handles 100 things at a time, while servicing millions of users at the edge of your architecture.

Conclusion: The Beautiful Balance
PostgreSQL's architecture is a masterclass in elegant system design. By separating its concerns across specialized layers:

Shared Buffers handle lightning-fast execution in memory.
The WAL handles unbreakable durability on disk using sequential writes.
The Checkpointer manages background table organization without hurting live query performance.
Process Isolation guarantees absolute stability, while Connection Pooling provides unlimited scale.

The next time you write a basic SQL query, you can appreciate the beautiful dance of operating system memory and hardware optimization happening beneath your code.

DEV Community

How Postgres Writes Data: WAL, Shared Buffers, and the Illusion of Speed

Top comments (0)