Athreya aka Maneshwar

Posted on Jan 31

Page Cache and Pager State: Where Concurrency Becomes Concrete

#webdev #programming #database #architecture

Hello, I'm Maneshwar. I'm working on FreeDevTools online currently building "one place for all dev tools, cheat codes, and TLDRs" — a free, open-source hub where developers can quickly find and use tools without any hassle of searching all over the internet.

Up to now, we’ve treated the pager mostly as an abstract transaction manager, something that enforces journaling, locking, and atomicity behind the scenes.

Today’s learning grounds that abstraction in memory and state.

This is the layer where SQLite’s clean design choice becomes obvious:
each process lives in its own reality, yet correctness still holds.

Page Cache Lives in User Space — Not the OS

SQLite’s page cache lives inside the address space of each application process.

Yes, the operating system also has its own page cache, but SQLite doesn’t rely on it for correctness. When a process reads from disk:

The OS may cache disk blocks
SQLite copies the data into its own page cache
All database logic works on that in-memory copy

This separation is intentional. Two different processes opening the same database file do not share page cache or in-memory pages and do not coordinate through memory

Even if the same database is opened twice inside the same process or across threads

SQLite allocates separate page caches, each owned by its own Pager object.

The only shared surface is the database file itself, coordinated via locks.

Why This Matters

This design avoids:

Cross-process memory synchronization
Shared-memory corruption risks
Subtle cache coherency bugs

Instead, SQLite pushes all coordination to file locks, journals and pager state transitions

It’s slower than shared memory but vastly simpler and safer.

Cache State Is Pager State

A page cache doesn’t exist independently.

Its behavior is entirely driven by two pager variables:

Pager.eState → what the pager is doing
Pager.eLock → what the pager is allowed to do

Together, they form a small but strict state machine.

Pager Execution States (`eState`)

The pager (and its cache) is always in exactly one of these states:

1. `PAGER_OPEN`

The initial, neutral state.

No pages cached
No transaction active
Database may or may not be locked

This is where a pager is born and where it eventually returns.

2. `PAGER_READER`

At least one read transaction is active.

Pages may be read into cache
No page modification allowed
Multiple pagers may coexist here

This is the normal steady state for read-heavy workloads.

3. `PAGER_WRITERLOCKED`

A write transaction exists but no changes yet.

Reserved lock is held
Pages can still be read
Nothing is dirty

Think of this as intent declared, but not acted upon.

4. `PAGER_WRITER_CACHEMOD`

Cached pages are now being modified.

Tree module has permission to write
Journaling has already happened
Changes exist only in memory

Disk is still untouched.

5. `PAGER_WRITER_DBMOD`

Disk writes have begun.

Pages are being flushed to the database file
Exclusive lock is required
No other pager may read or write

This is the most dangerous phase, crashes matter here.

6. `PAGER_WRITER_FINISHED`

All modified pages are written.

No more updates allowed
Commit is imminent
Waiting for finalization

The data is durable, but the transaction isn’t complete yet.

7. `PAGER_ERROR`

Something went wrong.

I/O failure
Out-of-memory
Disk full

From here, only rollback or shutdown paths remain.

Lock States (`eLock`): What the Pager Is Allowed to Do

Separately from execution state, the pager tracks file lock strength.

There are only four possibilities:

`NO_LOCK`

Pager is idle
No reads or writes
Cache may still exist

`SHARED_LOCK`

Reading pages
Multiple pagers allowed
No modification permitted

This is the most common state in real systems.

`RESERVED_LOCK`

Pager intends to write
No disk changes yet
Other pagers may still read

Only one pager can hold this lock at a time.

`EXCLUSIVE_LOCK`

Database file is being modified
No other pager may read or write
Required for actual disk writes

This lock is held for the shortest time possible.

How State Transitions Actually Happen

Here’s the flow in practice:

Pager starts in NO_LOCK
First sqlite3PagerGet → SHARED_LOCK
All pages unref’d → back to NO_LOCK
First sqlite3PagerWrite → RESERVED_LOCK
First actual disk write → EXCLUSIVE_LOCK
Commit or rollback completes → NO_LOCK

The key point:

Locks follow usage, not intent.

SQLite only escalates when it absolutely must.

Special Case: Temporary & In-Memory Databases

Temporary and in-memory databases are different:

They cannot be accessed by other processes
There is no concurrency to manage

So their pager lock state is always:

EXCLUSIVE_LOCK

This avoids unnecessary checks and simplifies execution.

The Big Picture Takeaway

Today’s learning reveals something subtle but powerful:

Page caches are private
Pager state is global per connection
Locking is external, minimal, and deliberate
State transitions encode correctness rules

SQLite doesn’t rely on clever tricks, it relies on explicit, boring state machines and that’s exactly why it works.

My experiments and hands-on executions related to SQLite will live here: lovestaco/sqlite

References:

SQLite Database System: Design and Implementation. N.p.: Sibsankar Haldar, (n.d.).

👉 Check out: FreeDevTools

Any feedback or contributors are welcome!

It’s online, open-source, and ready for anyone to use.

⭐ Star it on GitHub: freedevtools

DEV Community

Page Cache and Pager State: Where Concurrency Becomes Concrete

Page Cache Lives in User Space — Not the OS

Why This Matters

Cache State Is Pager State