DEV Community

Cover image for Inside SQLite Backend: Virtual Machine, Storage, and the Build Process
Athreya aka Maneshwar
Athreya aka Maneshwar

Posted on

Inside SQLite Backend: Virtual Machine, Storage, and the Build Process

Hello, I'm Maneshwar. I'm working on FreeDevTools online currently building **one place for all dev tools, cheat codes, and TLDRs* — a free, open-source hub where developers can quickly find and use tools without any hassle of searching all over the internet.*

In the previous post, I explored how SQLite’s frontend transforms SQL text into executable bytecode through the tokenizer, parser, and code generator.

That process ends with a prepared statement, but the real work begins only after that.

Today’s learning moves firmly into the backend of SQLite, where bytecode programs are executed, data is stored and retrieved, transactions are enforced, and the entire system is stitched together into a compact, embeddable library.

Image

The Virtual Machine (VDBE)

Once the frontend finishes compilation, it hands over a bytecode program to the Virtual Database Engine, commonly called the VM or VDBE.

A bytecode program is:

  • A linear sequence of instructions
  • Each instruction has an opcode and up to five operands
  • Executed sequentially, one instruction at a time

The VM behaves like a custom CPU, designed specifically for database operations scanning tables, comparing values, managing cursors, and enforcing transactional semantics.

The Tree Module

While the VM defines what to do, the tree module defines how data is organized.

SQLite stores:

  • Tables as B+ trees
  • Indexes as B-trees

Each table and index has its own independent tree structure.

Implementation details:

  • btree.c → tree logic
  • btree.h → public interface

The tree module supports searching, insertion, deletion, updates, and structural changes such as creating or dropping tables and indexes.

The Pager: SQLite’s Core Infrastructure

The pager is one of the most critical components in SQLite.

The tree module never accesses the database file directly. Instead, it works with fixed-size pages requested from the pager.

Responsibilities of the Pager

The pager:

  • Reads and writes database pages
  • Maintains an in-memory page cache
  • Handles file locking
  • Manages rollback journals
  • Enforces transaction boundaries

In effect, the pager acts as:

  • Data manager
  • Lock manager
  • Log manager
  • Transaction manager

All of this logic lives in:

  • pager.c
  • pager.h

The pager is the backbone that allows SQLite to deliver ACID guarantees using a single database file.

lovestaco@i3nux-mint:~/pers/sqlite$ ll /home/lovestaco/pers/sqlite/bld/sqlite3
-rwxrwxr-x 1 lovestaco lovestaco 6.9M Jan 11 17:14 /home/lovestaco/pers/sqlite/bld/sqlite3
Enter fullscreen mode Exit fullscreen mode

SQLite Build Process

SQLite’s build process reflects its philosophy of self-containment and reproducibility.

The build consists of six major steps:

  1. Generate sqlite3.h
  2. Build the SQL parser
  3. Generate VM opcodes
  4. Generate opcode names
  5. Generate SQL keyword tables
  6. Compile the library

Generated Files and Tools

During the build:

  • lemon.c generates parse.c and parse.h
  • mkkeywordhash.c generates keywordhash.h
  • awk and sed generate:
    • sqlite3.h
    • opcodes.h
    • opcodes.c

image

The opcodes.h file assigns numeric values to VM instructions, while opcodes.c maps opcodes to human readable names which areuseful for debugging and diagnostics.

Amalgamation Build

Modern SQLite releases provide a single amalgamation file, sqlite3.c, along with sqlite3.h.

Advantages of using the amalgamation:

  • 5–10% better performance
  • More aggressive compiler optimizations
  • Simplified build process
  • Easier embedding into applications

The SQLite team strongly recommends this approach. The command-line utility additionally requires shell.c.

Chapter-Level Summary

At this point, the SQLite landscape comes into full focus.

  • SQL is compiled into bytecode and executed by a purpose-built VM
  • SQLite ensures serializable execution using database-level locking
  • Journaling guarantees atomicity and recovery
  • Each database lives in a single native file anchored by sqlite_master
  • The architecture is modular and cleanly layered
  • The entire system is open source and in the public domain

This chapter has been a guided tour of SQLite’s internals, from API usage to execution, storage, recovery, and build mechanics.

In the next chapter, the focus shifts even deeper into database and journal file storage structures, where SQLite’s on-disk layout reveals how these abstractions are made real.

My experiments and hands-on executions related to SQLite will live here: lovestaco/sqlite

References:

SQLite Database System: Design and Implementation. N.p.: Sibsankar Haldar, (n.d.).

FreeDevTools

👉 Check out: FreeDevTools

Any feedback or contributors are welcome!

It’s online, open-source, and ready for anyone to use.

⭐ Star it on GitHub: freedevtools

Top comments (0)