DEV Community

Cover image for Index Key Format in SQLite
Athreya aka Maneshwar
Athreya aka Maneshwar

Posted on

Index Key Format in SQLite

Hello, I'm Maneshwar. I'm working on git-lrc: a Git hook for Checking AI generated code.

In the previous sections, we saw how a table’s B+-tree works:

  • The key is an integer (rowid).
  • The data is the full table record.

Indexes reverse this design.

For an index entry:

  • The key is the combined value(s) of the indexed column(s).
  • The data is a single integer — the rowid of the corresponding table row.

This inversion is the foundation of how SQLite accelerates lookups.

How an Index Lookup Works

Suppose we have an index on column x of table t1.

When you execute:

SELECT y FROM t1 WHERE x = 456;
Enter fullscreen mode Exit fullscreen mode

The VM does not scan the entire table.

Instead:

  1. It searches the index B-tree for entries where x = 456.
  2. For each matching entry, it retrieves the stored rowid.
  3. It uses that rowid to search the table’s B+-tree.
  4. It extracts column y from the full table record.

So an indexed lookup typically involves two tree searches:

  • One in the index B-tree
  • One in the table B+-tree

Even though this means two searches instead of one, it is vastly faster than scanning every row of the table.

Indexes Are Separate B-Trees

SQLite implements every index as its own independent B-tree — essentially an “index-table.”

This B-tree maps:

(search key) → (rowid)
Enter fullscreen mode Exit fullscreen mode

The index has its own key comparator function. The VM provides a pointer to the appropriate comparison logic (including collation rules) when opening the index cursor.

So ordering inside the index B-tree is determined by:

  • Column types
  • Collation sequences
  • ASC/DESC rules

The tree module does not know SQL semantics. It just calls the comparison function supplied by the VM.

How Index Records Are Stored

Unlike table records, index records do not have a separate data section.

The entire record serves as the B-tree key.

Figure 7.9: Format of index record.

The encoding is almost identical to table record encoding:

  • Header (with serial type codes)
  • Record image (column values)

But with one important difference:

The rowid is appended at the end of the record.

And unlike normal columns, the rowid’s type does not appear in the header. It is always a signed integer encoded as a varint.

So structurally:

[indexed column values][rowid]
Enter fullscreen mode Exit fullscreen mode

That entire byte sequence becomes the B-tree key.

There is no separate payload section in index B-trees.

Example: Single-Column Index

Consider an index on column x.

Its B-tree contents might look like this:

Figure 7.10: An index on column x

Each entry contains:

  • The value of x
  • The rowid of the corresponding row

If multiple rows have the same x, the rowid acts as a tie-breaker to ensure uniqueness in the B-tree key space.

Multi-Column Indexes

SQLite also supports composite (multi-column) indexes.

For example, an index on (y, x) stores entries ordered first by y, then by x, and finally by rowid as a tie-breaker.

Illustration:

image

The ordering rule is simple:

  1. Compare first indexed column.
  2. If equal, compare second.
  3. Continue for additional columns.
  4. If all equal, compare rowid.

This lexicographic ordering ensures that:

  • Range scans on the left-most column are efficient.
  • Prefix matching works naturally.
  • Duplicate column values remain uniquely identifiable.

Automatic and Explicit Index Creation

SQLite automatically creates indexes in certain cases:

  • Every UNIQUE constraint
  • Every PRIMARY KEY (except the special case below)

You cannot drop these implicit indexes without dropping the table itself.

You can also create indexes explicitly:

CREATE TABLE T1(a, b, c);
CREATE INDEX idx1 ON T1(a, b);
Enter fullscreen mode Exit fullscreen mode

Or indirectly:

CREATE TABLE T1(a, b, c, UNIQUE(a, b));
Enter fullscreen mode Exit fullscreen mode

Or:

CREATE TABLE T1(a, b, c, PRIMARY KEY(a, b));
Enter fullscreen mode Exit fullscreen mode

All three approaches generate an index on (a, b).

Special Case: INTEGER PRIMARY KEY

If a column is declared as:

INTEGER PRIMARY KEY
Enter fullscreen mode Exit fullscreen mode

SQLite does not create a separate index.

Why?

Because that column becomes an alias for rowid.

The table’s B+-tree is already ordered by rowid, so no extra index is required.

This is an elegant optimization — no duplication of structure.

Multiple Indexes on the Same Column

SQLite allows multiple indexes on the same column.

Consider:

CREATE TABLE T2(
    x VARCHAR(5) UNIQUE PRIMARY KEY,
    y BLOB
);
CREATE INDEX idx2 ON T2(x);
Enter fullscreen mode Exit fullscreen mode

This creates three independent indexes on x:

  1. Implicit index from UNIQUE
  2. Implicit index from PRIMARY KEY
  3. Explicit index idx2

All three are stored as separate B-trees.

This works — but it is inefficient.

Every INSERT, UPDATE, or DELETE must update all relevant indexes. More indexes mean:

  • Slower writes
  • Larger database files
  • More disk I/O

Schema designers should be careful.

Transient Indexes

SQLite can also create temporary indexes on the fly.

These are used for:

  • ORDER BY
  • GROUP BY
  • DISTINCT
  • UNION, EXCEPT, INTERSECT

When needed, SQLite builds a temporary B-tree in a temporary file to perform sorting or deduplication efficiently.

These indexes are not part of the schema. They exist only during query execution.

The Bigger Picture

With tables and indexes together, the storage model now becomes clear:

  • Table B+-tree:
    rowid → full row record

  • Index B-tree:
    (indexed columns + rowid) → nothing

The VM uses indexes to narrow down candidate rowids.
It then uses those rowids to retrieve full rows from the table B+-tree.

It’s simple. Predictable. Uniform.

And because everything is just B-trees with slightly different key layouts, the tree module itself remains generic and reusable.

git-lrc

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

Any feedback or contributors are welcome! It’s online, source-available, and ready for anyone to use.
⭐ Star it on GitHub:

GitHub logo HexmosTech / git-lrc

Free, Unlimited AI Code Reviews That Run on Commit

git-lrc logo

git-lrc

Free, Unlimited AI Code Reviews That Run on Commit


git-lrc - Free, unlimited AI code reviews that run on commit | Product Hunt

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

See It In Action

See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements

git-lrc-intro-60s.mp4

Why

  • 🤖 AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
  • 🔍 Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
  • 🔁 Build a habit, ship better code. Regular review → fewer bugs → more robust code → better results in your team.
  • 🔗 Why git? Git is universal. Every editor, every IDE, every AI…




Top comments (0)