Hello, I'm Maneshwar. I'm working on FreeDevTools online currently building "one place for all dev tools, cheat codes, and TLDRs" — a free, open-source hub where developers can quickly find and use tools without any hassle of searching all over the internet.
Databases exist in a hostile world. Disks may fail mid-write. Power cuts interrupt operations. Memory corrupts. Multiple processes access the same file simultaneously.
Yet somehow, SQLite manages to maintain correctness by ensuring committed transactions persist and incomplete ones vanish even when hardware behaves unpredictably.
This post explores how SQLite transforms unreliable hardware into a reliable database system.
It is a concise summary of what I’ve learned over the past month while diving into SQLite’s internals, distilled from a longer ongoing series (Understanding SQLite) and grounded primarily in my study of SQLite Database System: Design and Implementation.
We'll trace the journey from physical storage to the sophisticated mechanisms that guarantee ACID properties, revealing how engineering discipline overcome chaos.
The Foundation: Understanding Hardware Reality
To appreciate SQLite's correctness guarantees, we must first understand what it's working against.
Disks
At the physical level, persistent storage is fundamentally unreliable.
Storage devices expose a byte-addressable interface, but provide no guarantees that multi-byte writes are atomic or durable across power failures.
While filesystems often operate using blocks (commonly 4 KB, but not guaranteed), SQLite cannot rely on any specific block size or atomic-write behavior.
A write operation can fail midway during a power loss, leaving a page only partially updated.
This results in corrupted data that may not be detected by the operating system or hardware.
Consider what happens when a database tries to update data:
If power fails between steps 3 and 4, the data may be lost.
If it fails during step 3, the database page may be partially written, corrupting the database.
The Non-Atomic Write Problem
Database systems cannot ignore this reality. Disk writes are not atomic.
A write operation can fail midway, leaving persistent storage in an inconsistent state. This is why databases:
- Use write-ahead logging (WAL)
- Avoid overwriting data blindly
- Rely on journaling, recovery protocols, and defensive I/O ordering
- Treat disks as unreliable, even when hardware claims otherwise
Database designers must explicitly plan for partial writes and silent corruption.
Concurrency: Multiple Processes, One File
SQLite stores an entire database in a single native file.
This design choice heavily influences how concurrency is implemented.
Unlike server databases that can use row-level or table-level locking, SQLite coordinates concurrency using a database-level locking protocol built on top of operating-system file-locking primitives.
Multiple processes may try to:
- Read the database simultaneously
- Write to different tables at the same time
- Modify the same data concurrently
Without proper coordination, these operations would corrupt the database file.
SQLite's Defense Strategy: Layered Correctness
SQLite doesn't fight hardware chaos with complexity.
Instead, it uses a carefully designed set of mechanisms that work together to guarantee correctness:
- Journals for crash recovery
- File locking for concurrency control
- Transaction management for atomicity
- Checksums and validation for corruption detection
Let's examine each layer.
Layer 1: Journals: The Safety Net
SQLite's first line of defense against hardware failures is the journal.

A journal is a crash recovery file that records database changes so SQLite can roll back incomplete transactions.
Rollback Journal: Before Images for Recovery
For every database file, SQLite uses a corresponding rollback journal:
- Database:
emoji-db-v4.db - Journal:
emoji-db-v4.db-journal
The rollback journal stores before-images of database pages, complete copies of pages as they existed before modification.
This allows SQLite to restore the database to its previous state if something goes wrong.
How Rollback Journals Work
When a write transaction begins:
- SQLite creates a rollback journal file
- Before modifying any database page, SQLite writes the entire original page to the journal
- SQLite flushes the journal to stable storage
- SQLite modifies the database page in memory
- When the transaction commits, SQLite finalizes the journal (delete, truncate, or zero the header depending on journal mode)
Journal finalization (after database pages are synced) marks the commit point
If a crash occurs before commit:
- The journal file still exists
- On next startup, SQLite detects the journal
- SQLite validates the journal header and checksums
- SQLite reads the before-images from the journal
- SQLite restores all modified pages
- The database returns to its pre-transaction state
This is physical (age-based) logging: every log record stores the entire database page, even if only one byte is modified.

This approach is intentionally inefficient in space and I/O, but it dramatically simplifies recovery — restoring pages is deterministic and requires no logical replay.
Journal Structure: Segments and Headers
A rollback journal is divided into log segments. Each segment consists of:
- A journal header (designed to fit within a single disk sector)
- One or more log records, each containing a before-image page
The header begins with eight magic bytes:
D9 D5 05 F9 20 A1 63 D7
These provide sanity checking and help detect torn or invalid writes.
- Number of page records (
nRec) - Random seed used for checksum calculations
- Original database page count
- Disk sector size
- Database page size
All values are stored in big-endian format, ensuring cross-platform compatibility.
Statement Journal: Fine-Grained Rollback
A statement journal exists for one specific purpose: undoing the partial effects of a single SQL statement that fails midway, without aborting the surrounding transaction.
Example:
An INSERT statement attempts to insert 1000 rows. If a constraint violation occurs at row 500, the first 499 inserts must be undone, but the user transaction must remain active.
The statement journal:
- Is a separate rollback journal file
- Stored as a temporary file (random
etilqs_*name) - Lives in the system temp directory
- Exists only during statement execution
- Is deleted immediately after the statement completes
Like rollback journals, statement journals store before-image pages on disk.
However:
- They are not used for crash recovery
- They are ignored during database startup
- They exist only to support statement-level atomicity
SQLite tracks statement journal metadata in memory, allowing it to roll back a single statement without aborting the user transaction.
Master Journal: Multi-Database Atomicity
SQLite allows multiple databases to be attached to a single library connection:
A single transaction may modify multiple databases. Each database has its own rollback journal, but atomicity must be preserved across all of them.
To achieve this, SQLite introduces the master journal.
The master journal:
- Is created during commit preparation
- Exists only for multi-database write transactions
- Is deleted once commit completes
- Contains no page images
- Stores the names of participating rollback journals

Each participating rollback journal becomes a child journal. During commit, each child journal appends a master journal record containing.
If SQLite crashes during commit:
- Recovery examines child journals
- Child journals reference the master journal
- The master journal confirms which journals belong together
- Either all databases are rolled back, or all are committed
This guarantees global atomicity across attached databases.
Layer 2: File Locking
Journals protect against crashes, but they don't solve concurrency.
SQLite relies on file locking to coordinate access across processes.
DB Level Locking Model (Rollback-Journal Mode)
In rollback-journal mode, SQLite enforces strict rules:
- Multiple readers may read concurrently
- Only one writer may exist at a time
- Readers block writers
- Writers block both readers and other writers
This coarse-grained locking model simplifies correctness and eliminates deadlocks, at the cost of reduced concurrency.
Lock States and Transitions
- UNLOCKED: No locks held
- SHARED: Multiple readers allowed
- RESERVED: Writer intends to write
- PENDING: Writer waiting for readers to finish
- EXCLUSIVE: Writer actively modifying db
Typical transitions:
- Readers acquire SHARED
- Writers acquire SHARED → RESERVED → PENDING → EXCLUSIVE
- Write locks are released at transaction end
- Read locks may be released earlier, depending on execution flow
Continue reading the rest of article
My experiments and hands-on executions related to SQLite will live here: lovestaco/sqlite
References:
SQLite Database System: Design and Implementation. N.p.: Sibsankar Haldar, (n.d.).
👉 Check out: FreeDevTools
Any feedback or contributors are welcome!
It’s online, open-source, and ready for anyone to use.
⭐ Star it on GitHub: freedevtools








Top comments (0)