Athreya aka Maneshwar

Posted on Oct 31

Deep Dive into SQLite’s Internal Architecture

#webdev #programming #beginners #database

Hello, I'm Maneshwar. I'm working on FreeDevTools online currently building *one place for all dev tools, cheat codes, and TLDRs* — a free, open-source hub where developers can quickly find and use tools without any hassle of searching all over the internet.

SQLite is one of the most widely used embedded databases, powering everything from browsers to mobile apps.

Your program links SQLite and calls functions; SQLite translates SQL into bytecode and executes it inside a virtual machine that manipulates B-trees stored in a single database file.

The main logical components are:

SQL layer — parser, analyzer, and code generator (turns SQL into VDBE bytecode).
VDBE (Virtual DataBase Engine) — the virtual machine that executes the bytecode.
B-Tree layer — data storage abstraction: tables and indexes are B-trees stored in the file.
Pager — manages pages in memory, journal files, atomic commit/rollback, and crash-safety.
VFS (OS interface) — file-system abstraction layer that performs the low-level I/O and locking.
Cache and page cache — memory pages for the DB file; interacts with pager and B-tree.

From SQL to work: parser → bytecode → VM

When you execute SQL, SQLite first tokenizes and parses it, runs analysis/semantic checks, then emits portable bytecode (the VDBE program). The VDBE has opcodes for scanning B-trees, reading/writing columns, doing joins, sorting, aggregations, etc.
The VM is a stack/register machine implemented in vdbe.c and friends. Execution of this bytecode is how SELECT/INSERT/UPDATE/DELETE actually access the B-trees and pages.

Why this matters: Because SQLite is an embedded library, compiling SQL to a compact bytecode and using an in-process VM avoids IPC/serialization overhead and improves latency for small queries.

On-disk layout: pages and B-trees

Single file: an SQLite database (schema, tables, indices, data) is stored in one cross-platform file.
The file is split into fixed-size pages (default 4096 bytes but configurable). Each table/index is stored in a separate B-tree structure mapped onto those pages.
Interior pages store keys/ranges, leaf pages hold row payloads (SQLite uses B+tree-like organization).
The file format (header, page types, cell encodings, variable-length integers) is precisely defined to be stable across versions.
Variable-length records and compact storage are used to pack rows efficiently; overflow pages are used when payloads exceed one page.

Pager and crash-safety (journaling)

The pager is the subsystem responsible for reading/writing pages and ensuring atomic commit/rollback across crashes. Two primary journaling models:

Rollback journal (classic journaling)

When a transaction begins, the pager writes the original contents of pages that will be modified into a rollback journal file. If a crash happens before commit, the journal is replayed to restore the original DB. On successful commit, the journal is deleted (or overwritten). This provides atomicity and durability.

Write-Ahead Log (WAL) mode (recommended for many concurrent workloads)

Instead of writing old page images, changes are appended to a WAL file. Readers read the original DB file plus WAL frames to see the latest committed state; writers append to WAL and commit by marking a frame header. WAL allows concurrent readers and a writer (readers never block writers, and writers don’t block readers) which is a big concurrency improvement over the rollback journal. Periodically, a checkpoint copies WAL frames back into the main DB file and truncates the WAL. The WAL file format, index, and locking rules are explicitly specified.

Key practical differences:

Rollback journal: simpler, writer blocks readers during the write/commit phases.
WAL: better read/write concurrency, requires shared memory (on many OSes) for coordination and produces .wal and .shm files alongside the DB. WAL files grow until checkpoints run (auto or manual).

Locking & concurrency

SQLite coordinates multi-process/thread access via file locks implemented by the VFS layer.
Lock states like SHARED, PENDING, EXCLUSIVE govern who can read or write. In classic journal mode, a writer will obtain locks that prevent other writers and sometimes readers; in WAL mode, locks are different to allow concurrent readers during a write.
The docs describe the precise lock transitions and semantics. If a requested lock is unavailable, SQLite returns SQLITE_BUSY (or waits up to a timeout).
Shared-cache mode: an optional mode where multiple connections in the same process can share page cache; this changes locking semantics and is rarely needed for typical apps.

Page cache, memory management, and performance

Pages read from disk are cached in the page cache. The pager controls what stays in memory and how dirty pages are flushed. PRAGMA settings (e.g., cache_size, synchronous, journal_mode, locking_mode) affect durability vs. speed tradeoffs.
SQLite is designed to be efficient for small to medium databases and many embedded workloads. For heavy concurrent writes or huge datasets, a client/server RDBMS may scale better. (See pragmas & performance sections).

Transactions, atomicity, and isolation

Transactions are ACID:
- Atomic: via rollback journal or WAL+checkpoint.
- Consistent: via SQL constraints and transactional semantics.
- Isolated: default isolation level is serializable for separate connections (though read_uncommitted pragma and shared-cache can let a connection see uncommitted data). WAL mode lets readers see a consistent snapshot while a writer appends WAL frames.

Important nuance: because SQLite is file-based, multiple processes must coordinate via the filesystem; thus concurrency semantics depend on OS-level locking and the chosen journal mode.

Virtual tables, extensions, and VFS

Virtual tables (e.g., FTS, custom modules) let you expose non-standard backends (search indices, CSV files, remote data) via the SQL table abstraction.
VFS layer is pluggable; you can write a custom VFS to change how SQLite talks to the OS (useful to embed SQLite on exotic platforms, store DB on custom block devices, or implement custom locking semantics).

Putting It All Together

When you run a SQL command:

SQLite parses and compiles it into bytecode.
The VDBE executes the bytecode step-by-step.
It interacts with B-tree structures to locate or modify data.
The Pager ensures that changes are cached and journaled correctly.

Finally, the VFS writes data safely to the actual .sqlite file on disk.

I’ve been building for FreeDevTools.

A collection of UI/UX-focused tools crafted to simplify workflows, save time, and reduce friction in searching tools/materials.

Any feedback or contributors are welcome!

It’s online, open-source, and ready for anyone to use.

👉 Check it out: FreeDevTools
⭐ Star it on GitHub: freedevtools

DEV Community