DEV Community

Cover image for The Tree Module Functionalities in SQLite
Athreya aka Maneshwar
Athreya aka Maneshwar

Posted on

The Tree Module Functionalities in SQLite

Hello, I'm Maneshwar. I'm working on git-lrc: a Git hook for Checking AI generated code.
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

We now arrive at the structure that actually walks the tree.

We’ve seen:

  • Btree β†’ connection-level handle
  • BtShared β†’ shared database state
  • MemPage β†’ decoded page representation

Now we look at the structure that turns all of that into motion.

BtCursor Structure

To operate on a specific tree, the VM must first β€œopen” it.
It does this by invoking sqlite3BtreeCursor.

This creates a BtCursor object.

A cursor (sometimes called a scan) is an abstraction over tree traversal.

It represents a logical pointer to a specific entry inside a tree. Every SQL statement that interacts with a table or index uses a cursor under the hood.

If you think about it carefully:

A cursor is SQLite’s embodiment of β€œwhere we are” inside a tree.

Each open tree gets its own BtCursor object. It cannot be shared across database connections. It is tied to a specific Btree instance.

The Structural Anatomy of a Cursor

Figure 6.11 shows some of the key fields inside BtCursor:

fig 6.11

The most important members are:

  • apPage[] β†’ array of MemPage pointers
  • aiIdx[] β†’ index into cell pointer arrays
  • eState β†’ cursor state

The apPage[] array is particularly elegant.

It holds the chain of pages from the root, down through internal nodes to the current leaf page

In other words, it stores the entire traversal path.

Each corresponding entry in aiIdx[] tells us:

Which cell index we chose at that level of the tree.

So at any moment, a cursor doesn’t just point at a leaf entry.
It carries with it the entire path from root to leaf.

That’s why operations like search-next, splits, and merges can walk upward efficiently. The parent pages are already remembered.

Cursor States

The eState field indicates what the cursor currently represents:

  • Valid β†’ points to a valid entry
  • Invalid β†’ not currently pointing to an entry
  • RequiresSeek β†’ tree modified; must reposition
  • Fault β†’ error occurred

Read vs Write Cursors

A tree may have many cursors open at once β€” but with restrictions.

A cursor can be a read-cursor or a write-cursor but never both

Read-cursors:

  • Can only read cells
  • Guarantee repeatable reads
  • Act like logical read locks on the tree

Write-cursors:

  • Can both read and modify cells
  • Are used for insert, delete, update operations

SQLite does not allow read and write cursors to coexist on the same tree.

This is crucial.

It ensures that a read cursor performing a range scan does not suddenly observe inserted or deleted entries halfway through traversal.

That’s why read cursors provide repeatable read semantics even without requiring an exclusive database lock.

When the VM wants to execute a range query (like SELECT ... ORDER BY), it opens a read cursor. Most other operations use write cursors.

Key Comparison Logic

Each tree has an associated key comparison function.

  • Table B+-trees β†’ integer comparison (rowid-based)
  • Index B-trees β†’ memcmp-based comparison for binary keys

Every cursor on the same tree must use the same comparison logic. Otherwise ordering guarantees would break.

This is how SQLite maintains strict sorted order invariants across inserts, deletes, and scans.

Integrated Control Structures

Now that we’ve seen all four structures β€” Btree, BtShared, MemPage, and BtCursor β€” we can understand how they fit together.

Figure 6.12 captures the full integration:

fig 6.12

Let’s walk through it conceptually.

  • The VM holds a Btree handle.
  • The Btree references a BtShared.
  • The BtShared references the Pager.
  • The Pager manages page cache and disk I/O.
  • The tree module overlays MemPage objects on cached pages.
  • Each SQL operation uses a BtCursor to traverse MemPage objects.

This layering forms a clean stack:

VM
β†’ Btree
β†’ BtShared
β†’ Pager
β†’ Page Cache
β†’ MemPage
β†’ BtCursor

Each layer owns a distinct responsibility.

Btree summarizes connection state.
BtShared summarizes shared database state.
MemPage decodes raw page bytes.
BtCursor navigates the tree.

The separation is what makes SQLite both small and powerful.

Why This Matters

Now, when you think of a SQL statement like:

SELECT * FROM table WHERE id > 100 ORDER BY id;
Enter fullscreen mode Exit fullscreen mode

Under the hood:

  • The VM opens a read cursor.
  • The cursor searches down to the first qualifying leaf.
  • apPage[] stores the traversal path.
  • aiIdx[] tracks exact cell positions.
  • Sequential traversal uses in-memory structure without re-searching.
  • The pager ensures all pages are safe and consistent.

What feels like a simple query is actually an orchestration of layered control structures working in harmony.

What Comes Next

We’ve now fully explored:

  • Tree physical layout
  • Tree page organization
  • Cell structure
  • Control structures
  • Cursor traversal

The next natural step is:

Space management

Specifically:

  • Management of free pages
  • Page space management
  • Cell deallocation

That’s where structural cleanup and reuse happen.

git-lrc

πŸ‘‰ Check out: git-lrc
Any feedback or contributors are welcome! It’s online, source-available, and ready for anyone to use.
⭐ Star it on GitHub:

GitHub logo HexmosTech / git-lrc

Free, Unlimited AI Code Reviews That Run on Commit

git-lrc

Free, Unlimited AI Code Reviews That Run on Commit

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

See It In Action

See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements

git-lrc-intro-60s.mp4

Why

  • πŸ€– AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
  • πŸ” Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
  • πŸ” Build a habit, ship better code. Regular review β†’ fewer bugs β†’ more robust code β†’ better results in your team.
  • πŸ”— Why git? Git is universal. Every editor, every IDE, every AI…




Top comments (0)