Athreya aka Maneshwar

Posted on Jan 28

The Pager Interface: How Higher Layers Touch Storage


Hello, I'm Maneshwar. I'm currently building FreeDevTools online: "one place for all dev tools, cheat codes, and TLDRs", a free, open-source hub where developers can quickly find and use tools without searching all over the internet.

In the last post, we zoomed out and understood what the pager is and why it exists.
Today, we zoom in and look at how other SQLite modules actually talk to it.

This is where things get interesting, because the pager doesn't just expose functions; it enforces a discipline. A very strict one.

A Strict Contract Between Pager and Tree

Everything above the pager, especially the B-tree (tree) module, is completely insulated from low-level chaos.

The tree module:

Does not know about file locks
Does not know about journal formats
Does not care how ACID is enforced

From its point of view, the world looks simple:

“I am inside a transaction.
Give me page N.
I might modify it.
I’ll tell you when I’m done.”

That’s it.

All the complexity (locking, logging, crash recovery, and so on) lives entirely inside the pager.

This separation keeps correctness centralized and prevents higher layers from accidentally violating invariants.

The Pager–Client Interaction Protocol

The interaction between the tree module and the pager follows a very clear protocol.

1. Page Request

The tree module asks for a page by page number.

The pager:

Locates the page in the cache (if present)
Otherwise reads it from the database file
Returns a pointer to in-memory page data

At this point, the tree module is working entirely in memory.
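To make the request step concrete, here is a minimal sketch of a cache-first page lookup. It is a toy, not SQLite's code: `ToyPager`, `get_page`, `disk_reads`, and the 512-byte `PAGE_SIZE` are all names and values assumed for illustration.

```python
import io

PAGE_SIZE = 512  # assumed page size for this toy example


class ToyPager:
    """Hypothetical pager that serves fixed-size pages from a file-like object."""

    def __init__(self, db_file):
        self.db_file = db_file
        self.cache = {}      # page number -> bytearray holding the page
        self.disk_reads = 0  # counts how often we had to touch the file

    def get_page(self, pgno):
        """Return the in-memory copy of page `pgno` (pages are numbered from 1)."""
        page = self.cache.get(pgno)
        if page is None:
            # Cache miss: read the page from the database file.
            self.db_file.seek((pgno - 1) * PAGE_SIZE)
            data = self.db_file.read(PAGE_SIZE)
            page = bytearray(data.ljust(PAGE_SIZE, b"\x00"))
            self.cache[pgno] = page
            self.disk_reads += 1
        return page


# A two-page "database file" held in memory.
db = io.BytesIO(b"A" * PAGE_SIZE + b"B" * PAGE_SIZE)
pager = ToyPager(db)

first = pager.get_page(2)   # miss: read from the file
second = pager.get_page(2)  # hit: served from the cache
print(first[:4], pager.disk_reads, first is second)
```

The caller gets the same in-memory object on both calls, which is the point: after the first request, the tree module never sees the file again for that page.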

2. Intent to Modify

Before the tree module modifies a page, it must notify the pager.

This is a critical rule.

Once notified, the pager may:

Save a before-image of the page into the rollback journal
Acquire the appropriate database file locks
Transition the pager into a write-capable state

The tree module never journals anything itself.
It simply declares intent.
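A rough sketch of what "declaring intent" buys you is below. The class and method names (`JournalingPager`, `begin_write`, `rollback`) are hypothetical, and the journal is just a dictionary in memory, but the ordering is the one described above: the before-image is saved before the caller can change anything.

```python
class JournalingPager:
    """Hypothetical pager that saves a before-image on first write intent."""

    def __init__(self, pages):
        self.pages = pages    # page number -> bytearray (stands in for the cache)
        self.journal = {}     # page number -> bytes, the saved before-images
        self.writing = False  # stands in for locks and write-capable state

    def begin_write(self, pgno):
        """Declare intent to modify page `pgno`; must precede any change."""
        self.writing = True
        if pgno not in self.journal:
            # Only the first declaration in a transaction saves the original.
            self.journal[pgno] = bytes(self.pages[pgno])
        return self.pages[pgno]

    def rollback(self):
        """Restore every journaled page to its before-image."""
        for pgno, before in self.journal.items():
            self.pages[pgno][:] = before
        self.journal.clear()
        self.writing = False


pager = JournalingPager({1: bytearray(b"old!")})
page = pager.begin_write(1)  # the tree module declares intent
page[:] = b"new!"            # ...and only then modifies the page
saved = pager.journal[1]
pager.rollback()
print(saved, bytes(pager.pages[1]))
```

Because the caller only declares intent, the journal format can change without touching any code above the pager.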

3. Page Usage

The tree module reads or modifies the page freely in memory.

No disk I/O happens here.
No locks are taken here.
Everything has already been prepared by the pager.

4. Page Release

When the tree module is done, it releases the page back to the pager.

If the page was modified, the pager:

Marks it dirty
Decides when and how it will be written back
Ensures ordering constraints required for correctness
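The release step can be sketched with a reference count and a dirty set. Everything here is an assumption made for illustration: the `modified` flag on `release`, the `flush` method, and writing back in page-number order are choices of this toy, not SQLite's actual API or ordering rules.

```python
class RefCountingPager:
    """Hypothetical pager that tracks page references and dirty pages."""

    def __init__(self):
        self.storage = {1: b"aaaa", 2: b"bbbb"}  # stands in for the database file
        self.cache = {}    # page number -> bytearray
        self.refs = {}     # page number -> outstanding references
        self.dirty = set() # pages changed since the last flush

    def acquire(self, pgno):
        if pgno not in self.cache:
            self.cache[pgno] = bytearray(self.storage[pgno])
        self.refs[pgno] = self.refs.get(pgno, 0) + 1
        return self.cache[pgno]

    def release(self, pgno, modified=False):
        """Give the page back; the pager remembers whether it changed."""
        if modified:
            self.dirty.add(pgno)
        self.refs[pgno] -= 1

    def flush(self):
        """Write dirty pages back at a time and in an order the pager picks."""
        written = []
        for pgno in sorted(self.dirty):
            self.storage[pgno] = bytes(self.cache[pgno])
            written.append(pgno)
        self.dirty.clear()
        return written


pager = RefCountingPager()
page = pager.acquire(2)
page[:] = b"BBBB"
pager.release(2, modified=True)
pending = set(pager.dirty)  # the change is still only in memory
written = pager.flush()     # the pager decides when it reaches storage
print(pending, written)
```

Note that releasing the page does not write it. The caller hands back control, and the write-back policy stays with the pager.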

Why This Protocol Matters

This protocol enforces a powerful invariant:

All persistent state transitions pass through the pager.

No page can be modified without:

Journaling being set up
Locks being acquired
Recovery guarantees being in place

This is why SQLite can afford to keep higher layers simple and fast.
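One way to picture the choke point is a pager that only hands out read-only views until a write is declared. This is purely an illustration in Python; SQLite is written in C and does not enforce the rule this way, and `GuardedPager`, `get`, and `write` are hypothetical names.

```python
class GuardedPager:
    """Hypothetical pager: pages are read-only unless write() was called."""

    def __init__(self, pages):
        self._pages = pages  # page number -> bytearray
        self.journal = {}    # page number -> before-image

    def get(self, pgno):
        """Hand out a read-only view; writing through it raises TypeError."""
        return memoryview(self._pages[pgno]).toreadonly()

    def write(self, pgno):
        """Journal the before-image first, then hand out a writable view."""
        self.journal.setdefault(pgno, bytes(self._pages[pgno]))
        return memoryview(self._pages[pgno])


pager = GuardedPager({1: bytearray(b"data")})

view = pager.get(1)
try:
    view[0] = ord("X")  # modifying without declaring intent
    blocked = False
except TypeError:
    blocked = True

writable = pager.write(1)  # the only path to a writable page
writable[0] = ord("X")
print(blocked, bytes(view), pager.journal[1])
```

The only route to a writable page runs through `write`, so the before-image always exists by the time a change is possible.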

The Pager Object: One Handle per Database

At the implementation level, everything revolves around a structure called Pager.

Each open database file is managed by exactly one Pager object.

Conceptually:

One Pager == one database file handle
The pager is the database, from SQLite’s point of view

When the tree module wants to use a database:

It creates a Pager object
It keeps that object as a handle
Every pager-level operation goes through it

Multiple Connections, Multiple Pagers

If a process opens the same database file multiple times:

Each connection gets its own Pager
Each Pager has its own cache
They are treated as independent

This avoids shared-state complexity by default.

(Shared-cache mode exists, but that’s a carefully controlled exception.)

In-memory databases follow the same rule: they also have Pager objects, just without a backing file.
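You can observe this independence from the outside using Python's built-in `sqlite3` module. The sketch below opens the same file twice and changes `PRAGMA cache_size` on one connection only. It does not inspect Pager objects directly; it just shows that a cache setting on one connection does not leak into the other, which is what you would expect if each has its own pager and cache. The file name is a throwaway chosen for the example.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Two connections to the same database file in one process.
conn_a = sqlite3.connect(path)
conn_b = sqlite3.connect(path)

default_size = conn_b.execute("PRAGMA cache_size").fetchone()[0]

# Change the cache size setting on connection A only.
conn_a.execute("PRAGMA cache_size = 123")

size_a = conn_a.execute("PRAGMA cache_size").fetchone()[0]
size_b = conn_b.execute("PRAGMA cache_size").fetchone()[0]
print(size_a, size_b)

conn_a.close()
conn_b.close()
```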

What the Pager Tracks Internally

The Pager object is not small.

It tracks:

Database file handle
Journal file handle(s)
Lock state
Transaction state
Page cache
Journal status
Database filename and journal filename
Savepoints

Immediately after the Pager structure in memory lives a variable-sized region that holds:

Page cache handlers
OS file handles (e.g., unixFile on Linux)
Journal metadata

The Pager object becomes the single source of truth for everything related to persistence.
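As a mental model only, the tracked state can be written down as a plain record. The field names below are invented for this sketch and do not match the members of SQLite's real `Pager` structure; the one detail taken from SQLite itself is that the rollback journal's name is the database filename with `-journal` appended.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyPagerState:
    """Hypothetical summary of what one pager keeps track of."""

    db_filename: str
    journal_filename: str
    db_file: Optional[object] = None       # database file handle
    journal_file: Optional[object] = None  # journal file handle
    lock_state: str = "unlocked"           # current lock on the database file
    in_transaction: bool = False           # transaction state
    journal_open: bool = False             # journal status
    page_cache: dict = field(default_factory=dict)  # page number -> page data
    savepoints: list = field(default_factory=list)  # active savepoints


state = ToyPagerState("app.db", "app.db-journal")
other = ToyPagerState("other.db", "other.db-journal")

state.page_cache[1] = bytearray(4)  # caching a page in one pager...
print(len(state.page_cache), len(other.page_cache))  # ...leaves the other untouched
```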

Savepoints: Nested Transactions, Pager-Style

SQLite executes each SQL update inside an implicit savepoint, even within a user transaction.
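The effect of that implicit savepoint is visible from SQL. In this small experiment with Python's `sqlite3` module, a two-row `INSERT` fails on a duplicate key inside an open transaction. Only that statement's work is undone; the earlier insert in the same transaction survives. The table name and values are made up for the demo.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
try:
    # One statement, two rows: the duplicate key 1 makes the statement fail.
    conn.execute("INSERT INTO t VALUES (2), (1)")
    failed = False
except sqlite3.IntegrityError:
    failed = True
conn.execute("COMMIT")  # the transaction itself is still usable

rows = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
print(failed, rows)
```

Row 2 does not appear in the result even though it belonged to the same statement as the failing row, while row 1 from the earlier statement is committed.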

On top of that, applications can define their own savepoints.

The pager tracks these using an array of savepoint objects.

Each savepoint records:

Where it began in the rollback journal
Whether new journal headers were written after it started
How far recovery should roll back if the savepoint is undone
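Here is a simplified sketch of the idea that a savepoint is mostly a remembered position in the journal. It is a toy under loud assumptions: the journal is a Python list, a before-image is logged on every write, and `SavepointPager` with its methods is hypothetical. The real pager is more involved, but the core trick of "remember an offset, undo everything after it" is the same.

```python
class SavepointPager:
    """Hypothetical pager where a savepoint is an offset into the journal."""

    def __init__(self, pages):
        self.pages = pages    # page number -> bytearray
        self.journal = []     # append-only list of (page number, before-image)
        self.savepoints = []  # journal offsets, innermost savepoint last

    def write(self, pgno, data):
        # Simplification: log a before-image on every write.
        self.journal.append((pgno, bytes(self.pages[pgno])))
        self.pages[pgno][:] = data

    def savepoint(self):
        """Remember where this savepoint begins in the journal."""
        self.savepoints.append(len(self.journal))
        return len(self.savepoints) - 1

    def rollback_to(self, index):
        offset = self.savepoints[index]
        # Undo newest-first so each page ends at its value as of `offset`.
        for pgno, before in reversed(self.journal[offset:]):
            self.pages[pgno][:] = before
        del self.journal[offset:]        # discard the undone journal entries
        del self.savepoints[index + 1:]  # drop savepoints nested inside this one


pager = SavepointPager({1: bytearray(b"v0")})
pager.write(1, b"v1")  # earlier work in the transaction
sp = pager.savepoint()
pager.write(1, b"v2")
pager.write(1, b"v3")
pager.rollback_to(sp)
print(bytes(pager.pages[1]), len(pager.journal))
```

The page returns to `v1`, not `v0`: work done before the savepoint is untouched, and its journal entry is still there in case the whole transaction rolls back later.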

When a savepoint is created:

Its header offset starts at zero, meaning no new journal header has been written since it began

If a new journal header is written while the savepoint is active:

The pager records that header's offset in the savepoint

This allows SQLite to roll back part of a transaction without disturbing earlier work, and without reopening files or redoing locking.

All of this logic lives squarely inside the pager.
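From the application side, this is what a partial rollback looks like. The snippet uses Python's `sqlite3` module with a made-up `log` table and a savepoint named `sp1`.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('before savepoint')")
conn.execute("SAVEPOINT sp1")
conn.execute("INSERT INTO log VALUES ('inside savepoint')")
conn.execute("ROLLBACK TO sp1")  # undo only the work done after sp1
conn.execute("INSERT INTO log VALUES ('after rollback')")
conn.execute("COMMIT")

msgs = [r[0] for r in conn.execute("SELECT msg FROM log ORDER BY rowid")]
print(msgs)
```

The row inserted before the savepoint and the row inserted after the rollback both commit. Only the work between `SAVEPOINT` and `ROLLBACK TO` disappears, and the transaction never had to be restarted.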

A Subtle but Important Pattern

Notice what’s missing in all of this:

The tree module never writes to disk
The tree module never locks files
The tree module never touches journals
The tree module never worries about crashes

Yet correctness still holds.

That’s because the pager enforces a single choke point for all persistent changes.

My experiments and hands-on executions related to SQLite will live here: lovestaco/sqlite

References:

Haldar, Sibsankar. SQLite Database System: Design and Implementation. N.p., n.d.

👉 Check out: FreeDevTools

Feedback and contributions are welcome!

It’s online, open-source, and ready for anyone to use.

⭐ Star it on GitHub: freedevtools
