Hello, I'm Maneshwar. I'm currently working on FreeDevTools, building **one place for all dev tools, cheat codes, and TLDRs** — a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.
In yesterday’s article, we looked at why both B-Trees and LSM-Trees struggle to fully exploit modern hardware: one weighed down by cache pollution, the other by compaction, write amplification, and operational complexity.
Today, we move past why these structures fall short and look at what replaces the page itself.
Bf-Tree’s answer is not another cache, buffer, or tuning trick, but a structural rethink of the leaf node.
At the center of this redesign is a new primitive: the mini-page.
A mini-page is a slim, in-memory version of a leaf page, tightly scoped and purposely built.
Unlike buffer pool pages or LSM memtables, a mini-page exists only for leaf nodes and only one mini-page can exist per leaf page.
This design choice is intentional. Mini-pages are not a general cache.
They are a precise extension of the leaf page, designed to absorb writes, cache hot records, and adapt dynamically to workload patterns.
At a high level, mini-pages serve two roles:
- Buffering recent updates
- Caching frequently accessed records
Crucially, they do this without pulling full pages into memory.
## Absorbing Writes Without Page-Level Amplification
When a write targets a leaf page, Bf-Tree does not immediately touch the on-disk page.
Instead, the write is inserted into the leaf’s mini-page:
- If no mini-page exists, a minimal one is created (as small as 64 bytes, aligned to a cache line).
- Records inside the mini-page are kept sorted, preserving spatial locality.
- Inserts use binary search, not pointer chasing.
As updates accumulate, the mini-page grows dynamically:
- It doubles in size when full
- Growth continues until it reaches an upper bound (up to 4 KB)
At that point, or when the mini-page must be evicted, the system batch-merges the mini-page into the base leaf page on disk.
This achieves two things simultaneously:
- Writes are absorbed and coalesced in memory
- Disk writes happen in larger, more efficient batches
There is no delta chain, no per-update logging into multiple structures, and no background compaction pipeline.
Updates stay local, ordered, and cache-friendly.
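The write path above can be sketched as a toy model in Python. The names and the `MIN_CAPACITY`/`MAX_CAPACITY` constants are illustrative stand-ins for the 64-byte starting size and 4 KB upper bound, not Bf-Tree's actual implementation:

```python
import bisect

MIN_CAPACITY = 4    # stand-in for the 64-byte initial mini-page
MAX_CAPACITY = 64   # stand-in for the 4 KB upper bound

class MiniPage:
    """Sorted in-memory write buffer for a single leaf page (sketch)."""
    def __init__(self):
        self.capacity = MIN_CAPACITY
        self.records = []  # sorted list of (key, value) pairs

    def insert(self, key, value):
        """Insert or update via binary search; return True when the
        mini-page is full and should be batch-merged into the leaf."""
        keys = [k for k, _ in self.records]
        i = bisect.bisect_left(keys, key)
        if i < len(self.records) and self.records[i][0] == key:
            self.records[i] = (key, value)       # in-place update
        else:
            self.records.insert(i, (key, value))
        if len(self.records) >= self.capacity:
            if self.capacity < MAX_CAPACITY:
                self.capacity *= 2               # grow by doubling
            else:
                return True                      # time to batch-merge
        return False

def batch_merge(mini, leaf_records):
    """Coalesce buffered writes into one large, sequential leaf write."""
    merged = dict(leaf_records)
    merged.update(mini.records)
    mini.records, mini.capacity = [], MIN_CAPACITY
    return sorted(merged.items())
```

Note that many small writes become one sorted merge and one large disk write, which is exactly where the amplification savings come from.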
## Caching Hot Records Without Caching Pages
Reads follow a similarly disciplined path.
Before touching disk, a lookup binary-searches the mini-page:
- If the record is found, the read completes immediately
- If not, the corresponding leaf page is read from disk
After loading the record, Bf-Tree may cache it back into the mini-page, but only probabilistically (for example, 1% of the time).
This avoids flooding the mini-page with cold data while allowing hot records to naturally accumulate.
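A rough sketch of that read path follows. The mini-page is modeled as a plain dict for brevity, and `CACHE_PROBABILITY`, `read_leaf_page`, and the injectable `rng` are illustrative, not Bf-Tree's real API:

```python
import random

CACHE_PROBABILITY = 0.01   # admit roughly 1% of disk-read records

def lookup(key, mini_page, read_leaf_page, rng=random.random):
    """Check the mini-page first; fall back to disk on a miss."""
    value = mini_page.get(key)
    if value is not None:
        return value                    # hot record: served from memory
    leaf = read_leaf_page()             # miss: one disk read
    value = leaf.get(key)
    # Probabilistic admission: one-off cold reads rarely pollute the
    # mini-page, while genuinely hot records accumulate over time.
    if value is not None and rng() < CACHE_PROBABILITY:
        mini_page[key] = value
    return value
```

A record read a hundred times has a near-certain chance of being admitted eventually; a record read once almost never is.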
The key distinction here is subtle but critical:
Mini-pages cache records, not pages.
Traditional B-Trees cache entire pages, even if only one record is hot.
LSM systems split caching into row caches and block caches, each with separate memory budgets and tuning knobs.
Mini-pages avoid both extremes.
## Adapting to Range Scans with Cached Gaps
Record-level caching alone is not enough.
Range scans depend on spatial locality, and caching isolated records actively works against that. Bf-Tree addresses this with an elegant pivot.
When a mini-page grows to full size, it can be merged and promoted into a full leaf page representation, mirroring the on-disk layout.
At that point:
- The mini-page effectively becomes a page-level cache
- Range scans regain spatial locality
- No separate cache tier is required
This means the same memory budget can fluidly support:
- Point lookups (via small mini-pages)
- Range scans (via full-sized pages)
Unlike systems that require static partitioning between row cache and block cache, Bf-Tree adapts automatically as workload patterns shift.
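One way to picture the promotion: merging a full mini-page with its leaf yields a single sorted run that then serves as a page-level cache. This is a hypothetical sketch; the real promoted mini-page mirrors the on-disk byte layout, which a Python list only approximates:

```python
import bisect

def promote(mini_records, leaf_records):
    """Merge a full mini-page with its leaf into one sorted run that
    acts as a page-level cache for subsequent range scans."""
    merged = dict(leaf_records)
    merged.update(mini_records)     # buffered writes win over old values
    return sorted(merged.items())

def range_scan(promoted, lo, hi):
    """Scan [lo, hi) over a single contiguous sorted run: there is no
    second structure to probe, so spatial locality is preserved."""
    keys = [k for k, _ in promoted]
    i = bisect.bisect_left(keys, lo)
    j = bisect.bisect_left(keys, hi)
    return promoted[i:j]
```

Before promotion, a scan would have to consult both the mini-page and the on-disk leaf; after promotion it walks one structure.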
## One Layout, Two Roles
Mini-pages and leaf pages share the exact same physical layout.
The only difference is size.
Both store sorted key–value pairs and support efficient lookups.
Mini-pages simply allow variable lengths, while leaf pages remain fixed-size on disk.
This unified layout:
- Reduces implementation complexity
- Avoids duplicated logic
- Allows optimizations to apply uniformly to memory and disk
Internally, each page begins with a compact node metadata header, followed by metadata entries and the actual key–value data packed from opposite ends.
Inserts shift metadata entries, not full records, keeping operations fast and cache-friendly.
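The layout described above resembles a classic slotted page. A minimal sketch follows; the separator byte, field names, and sizes are illustrative, not Bf-Tree's on-disk format:

```python
import bisect

class SlottedPage:
    """Shared layout sketch: slot metadata grows from the front of the
    buffer while record bytes are packed from the back."""
    def __init__(self, size=4096):
        self.buf = bytearray(size)
        self.slots = []        # sorted (key, offset, length) metadata
        self.free_end = size   # records fill buf[free_end:] downward

    def insert(self, key: bytes, value: bytes):
        rec = key + b"\x00" + value        # assumes keys contain no NUL
        self.free_end -= len(rec)
        self.buf[self.free_end:self.free_end + len(rec)] = rec
        # Only the small slot entries shift on insert; the record
        # bytes themselves never move.
        i = bisect.bisect_left([s[0] for s in self.slots], key)
        self.slots.insert(i, (key, self.free_end, len(rec)))

    def get(self, key: bytes):
        i = bisect.bisect_left([s[0] for s in self.slots], key)
        if i < len(self.slots) and self.slots[i][0] == key:
            _, off, length = self.slots[i]
            return bytes(self.buf[off:off + length]).split(b"\x00", 1)[1]
        return None
```

Because a mini-page and a leaf page share this shape, the same lookup and insert code runs against both; only the buffer size differs.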
## Mapping Pages Without Breaking Concurrency
Leaf pages always live on disk. Mini-pages live in memory. Both are referenced through the same logical page ID.
Bf-Tree uses an in-memory mapping table to translate this page ID to either:
- A memory address (mini-page), or
- A disk offset (leaf page)
The mapping table also embeds a reader–writer lock alongside the page address, packed into a single 64-bit word.
Because the mini-page and its leaf page share the same page ID:
- Locking one implicitly locks the other
- The concurrency model stays simple
- No extra coordination layer is required
This indirection decouples page location from page identity, enabling movement between memory and disk without rewriting pointers across the tree.
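The idea of packing lock state next to the address can be sketched with plain bit operations. The bit widths here are illustrative guesses, not Bf-Tree's actual encoding:

```python
# One 64-bit mapping-table word: location flag, writer bit,
# shared reader count, and a 48-bit address or disk offset.
IS_DISK   = 1 << 63             # 1 = disk offset, 0 = memory address
WRITE_BIT = 1 << 62             # an exclusive writer holds the entry
RC_SHIFT  = 48
RC_MASK   = 0x3FFF << RC_SHIFT  # 14-bit shared reader count
ADDR_MASK = (1 << 48) - 1       # low 48 bits: address / offset

def make_entry(addr, on_disk):
    return (IS_DISK if on_disk else 0) | (addr & ADDR_MASK)

def try_read_lock(word):
    """Return the updated word, or None if a writer holds the entry.
    Real code would apply this transition with one compare-and-swap."""
    if word & WRITE_BIT:
        return None
    return word + (1 << RC_SHIFT)   # bump the reader count

def try_write_lock(word):
    if word & (WRITE_BIT | RC_MASK):
        return None                 # readers or another writer present
    return word | WRITE_BIT
```

Because lock state and location live in the same word, acquiring the lock and resolving the page's address collapse into a single atomic step, which is what keeps the concurrency model simple.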
## Why Mini-Pages Matter
Mini-pages are not a cache tweak. They are a fundamental redefinition of what a page is.
They blur the boundary between memory and disk while preserving order, locality, and predictability.
Writes become local. Reads become selective. Range scans regain efficiency—without introducing background compaction, multi-level probing, or cache partitioning.
This is where Bf-Tree stops being a “better B-Tree” and starts looking like a new storage primitive, one that finally aligns data structures with modern hardware realities.
👉 Check out: FreeDevTools
Any feedback or contributors are welcome!
It’s online, open-source, and ready for anyone to use.
⭐ Star it on GitHub: freedevtools
