Hello, I'm Maneshwar. I'm working on git-lrc: a Git hook for Checking AI generated code.
In the previous sections, we saw how a table’s B+-tree works:
- The key is an integer (
rowid). - The data is the full table record.
Indexes reverse this design.
For an index entry:
- The key is the combined value(s) of the indexed column(s).
- The data is a single integer — the
rowidof the corresponding table row.
This inversion is the foundation of how SQLite accelerates lookups.
How an Index Lookup Works
Suppose we have an index on column x of table t1.
When you execute:
SELECT y FROM t1 WHERE x = 456;
The VM does not scan the entire table.
Instead:
- It searches the index B-tree for entries where
x = 456. - For each matching entry, it retrieves the stored
rowid. - It uses that
rowidto search the table’s B+-tree. - It extracts column
yfrom the full table record.
So an indexed lookup typically involves two tree searches:
- One in the index B-tree
- One in the table B+-tree
Even though this means two searches instead of one, it is vastly faster than scanning every row of the table.
Indexes Are Separate B-Trees
SQLite implements every index as its own independent B-tree — essentially an “index-table.”
This B-tree maps:
(search key) → (rowid)
The index has its own key comparator function. The VM provides a pointer to the appropriate comparison logic (including collation rules) when opening the index cursor.
So ordering inside the index B-tree is determined by:
- Column types
- Collation sequences
- ASC/DESC rules
The tree module does not know SQL semantics. It just calls the comparison function supplied by the VM.
How Index Records Are Stored
Unlike table records, index records do not have a separate data section.
The entire record serves as the B-tree key.
The encoding is almost identical to table record encoding:
- Header (with serial type codes)
- Record image (column values)
But with one important difference:
The
rowidis appended at the end of the record.
And unlike normal columns, the rowid’s type does not appear in the header. It is always a signed integer encoded as a varint.
So structurally:
[indexed column values][rowid]
That entire byte sequence becomes the B-tree key.
There is no separate payload section in index B-trees.
Example: Single-Column Index
Consider an index on column x.
Its B-tree contents might look like this:
Each entry contains:
- The value of
x - The
rowidof the corresponding row
If multiple rows have the same x, the rowid acts as a tie-breaker to ensure uniqueness in the B-tree key space.
Multi-Column Indexes
SQLite also supports composite (multi-column) indexes.
For example, an index on (y, x) stores entries ordered first by y, then by x, and finally by rowid as a tie-breaker.
Illustration:
The ordering rule is simple:
- Compare first indexed column.
- If equal, compare second.
- Continue for additional columns.
- If all equal, compare rowid.
This lexicographic ordering ensures that:
- Range scans on the left-most column are efficient.
- Prefix matching works naturally.
- Duplicate column values remain uniquely identifiable.
Automatic and Explicit Index Creation
SQLite automatically creates indexes in certain cases:
- Every
UNIQUEconstraint - Every
PRIMARY KEY(except the special case below)
You cannot drop these implicit indexes without dropping the table itself.
You can also create indexes explicitly:
CREATE TABLE T1(a, b, c);
CREATE INDEX idx1 ON T1(a, b);
Or indirectly:
CREATE TABLE T1(a, b, c, UNIQUE(a, b));
Or:
CREATE TABLE T1(a, b, c, PRIMARY KEY(a, b));
All three approaches generate an index on (a, b).
Special Case: INTEGER PRIMARY KEY
If a column is declared as:
INTEGER PRIMARY KEY
SQLite does not create a separate index.
Why?
Because that column becomes an alias for rowid.
The table’s B+-tree is already ordered by rowid, so no extra index is required.
This is an elegant optimization — no duplication of structure.
Multiple Indexes on the Same Column
SQLite allows multiple indexes on the same column.
Consider:
CREATE TABLE T2(
x VARCHAR(5) UNIQUE PRIMARY KEY,
y BLOB
);
CREATE INDEX idx2 ON T2(x);
This creates three independent indexes on x:
- Implicit index from
UNIQUE - Implicit index from
PRIMARY KEY - Explicit index
idx2
All three are stored as separate B-trees.
This works — but it is inefficient.
Every INSERT, UPDATE, or DELETE must update all relevant indexes. More indexes mean:
- Slower writes
- Larger database files
- More disk I/O
Schema designers should be careful.
Transient Indexes
SQLite can also create temporary indexes on the fly.
These are used for:
ORDER BYGROUP BYDISTINCT-
UNION,EXCEPT,INTERSECT
When needed, SQLite builds a temporary B-tree in a temporary file to perform sorting or deduplication efficiently.
These indexes are not part of the schema. They exist only during query execution.
The Bigger Picture
With tables and indexes together, the storage model now becomes clear:
Table B+-tree:
rowid → full row recordIndex B-tree:
(indexed columns + rowid) → nothing
The VM uses indexes to narrow down candidate rowids.
It then uses those rowids to retrieve full rows from the table B+-tree.
It’s simple. Predictable. Uniform.
And because everything is just B-trees with slightly different key layouts, the tree module itself remains generic and reusable.
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.
Any feedback or contributors are welcome! It’s online, source-available, and ready for anyone to use.
⭐ Star it on GitHub:
HexmosTech
/
git-lrc
Free, Unlimited AI Code Reviews That Run on Commit
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.
See It In Action
See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements
git-lrc-intro-60s.mp4
Why
- 🤖 AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
- 🔍 Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
- 🔁 Build a habit, ship better code. Regular review → fewer bugs → more robust code → better results in your team.
- 🔗 Why git? Git is universal. Every editor, every IDE, every AI…




Top comments (0)