Hello, I'm Maneshwar. I'm working on git-lrc: a Git hook for Checking AI generated code.
Yesterday we discussed internal datatypes and how the VM composes values into records. Today we look at what those records actually look like on disk.
When SQLite stores a row inside a table B+-tree, it does not write columns one by one in some naive layout. It constructs a carefully structured byte sequence.
Table Record Format
A table record (row data) consists of two logical parts:
- Header
- Record Image (Data Section)
The header begins with a size field. This value tells us how many bytes exist before the first data item (Data1). Importantly, this size includes the bytes used to encode the size field itself.
The size is stored as a variable-length 64-bit integer using SQLiteβs varint encoding (a Huffman-style compact representation). In effect, the size field acts as a pointer to where the actual data begins.
After the size field, the header contains one type field per column, in the same order as declared in the CREATE TABLE statement. SQLite never rearranges column order internally.
Each Type i field is also stored as a variable-length unsigned integer. This value encodes the storage class and size of the corresponding Data i.
So the header is essentially:
[header size][type1][type2][type3]...
Then comes the record image:
[data1][data2][data3]...
The VM reads the header first, learns how many bytes to skip to reach the data section, and understands how to interpret each column based on its type code.
Zero-Length Data
Certain type codes represent values that occupy zero bytes in the data section.
For example, type values:
- 0
- 8
- 9
- 12
- 13
do not store any payload in the record image.
This is possible because the type code itself fully determines the value. For example:
- NULL requires no data bytes.
- Some integer constants have special encodings.
This design saves space and keeps records compact.
Why Column Order Matters
Since SQLite does not reorder columns internally, the physical order of columns in the record exactly matches the schema definition.
Because records may overflow onto additional pages, placing:
- Smaller
- Frequently accessed
columns earlier in the schema can reduce the need to chase overflow chains.
This is a practical storage optimization β one that many developers overlook.
Table Key Format
Every B+-tree in SQLite must have a unique key.
Relational theory says tables do not contain duplicate rows. But in practice, users may insert identical rows unless a UNIQUE constraint exists.
SQLite must still differentiate them internally.
To do this, every table has a unique primary key:
- Either explicitly defined by the schema
- Or automatically created by SQLite
If not specified, SQLite assigns a hidden 64-bit signed integer key called the rowid.
The rowid guarantees uniqueness for every row in the table B+-tree.
So even if two rows contain identical column values, their rowids will differ β and therefore their keys in the B+-tree will differ.
This rowid is not just a logical identifier. It is physically used as the key in the tableβs B+-tree structure.
What Comes Next
Weβve now seen:
- How table records are structured
- How headers encode type and size
- Why variable-length encoding keeps files compact
- Why every table must have a unique key
In the next post, we will dive into the rowid itself:
- The Rowid column
- How rowid values are generated
- How rowid is represented on disk
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.
Any feedback or contributors are welcome! Itβs online, source-available, and ready for anyone to use.
β Star it on GitHub:
HexmosTech
/
git-lrc
Free, Unlimited AI Code Reviews That Run on Commit
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.
See It In Action
See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements
git-lrc-intro-60s.mp4
Why
- π€ AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
- π Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
- π Build a habit, ship better code. Regular review β fewer bugs β more robust code β better results in your team.
- π Why git? Git is universal. Every editor, every IDE, every AIβ¦



Top comments (0)