Athreya aka Maneshwar

How Bf-Tree Keeps Mini-Pages Small, Hot, and Cheap to Evict

Hello, I'm Maneshwar. I'm building git-lrc, a Micro AI code reviewer that runs on every commit. It is free and source-available on GitHub. Star git-lrc to help devs discover the project, give it a try, and share your feedback on how to improve it.

Mini-pages are central to Bf-Tree performance, but they cannot grow forever.

If every insert, update, and tombstone remained in memory indefinitely, insertion cost would increase and memory would fill with stale records.

Bf-Tree continuously reorganizes mini-pages through merges, copying, and eviction.

When Does a Mini-Page Merge?

The paper describes two situations where a mini-page is merged back into its leaf page:

  1. The mini-page becomes too large
  2. The mini-page becomes cold (rarely accessed)

Records inside a mini-page are kept sorted, so each insert has to make room among the existing entries. The larger the mini-page, the more an insert costs.

Mini-page size increases

512B → 1KB → 2KB → ...

Insertion overhead rises
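
To make the growth in cost concrete, here is a minimal sketch that counts how many existing records must shift when keys are inserted into a sorted array. A plain Python list stands in for the mini-page; the real Bf-Tree record layout differs, and `sorted_insert` and `total_shifts` are hypothetical names used only for illustration.

```python
import bisect

def sorted_insert(records, key):
    """Insert key into a sorted list and return how many records shifted."""
    pos = bisect.bisect_left(records, key)
    shifted = len(records) - pos   # everything after pos moves one slot
    records.insert(pos, key)
    return shifted

def total_shifts(n):
    """Worst case for a sorted array: insert n keys in descending order."""
    page = []
    return sum(sorted_insert(page, key) for key in range(n, 0, -1))

# Doubling the number of records more than doubles the total shifting work.
print(total_shifts(4), total_shifts(8))
```

In this worst case the total work grows quadratically with the number of records, which is one way to see why an unbounded mini-page would become expensive.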

When a mini-page exceeds roughly 2KB, Bf-Tree merges it with the leaf page and produces a larger 4KB mini-page mirroring the leaf.

The merge process is roughly:

Locate leaf page
      ↓
Check available space
      ↓
Split leaf if needed
      ↓
Merge mini-page records
      ↓
Discard old mini-page

The old memory can then be reused.

This keeps write buffering useful without letting mini-pages become expensive secondary indexes.
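
The merge steps above can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: dictionaries stand in for pages, `LEAF_CAPACITY` is a made-up tiny limit, a value of `None` is assumed to mark a tombstone, and the sketch combines records first and splits afterward rather than splitting the leaf before merging.

```python
LEAF_CAPACITY = 4  # hypothetical: max records per leaf, kept tiny for illustration

def merge_mini_page(leaf, mini_page):
    """Apply buffered records to a leaf and return one or two resulting leaves.

    leaf: dict of key -> value
    mini_page: dict of key -> value, where None marks a tombstone (assumption)
    """
    merged = dict(leaf)
    for key, value in mini_page.items():
        if value is None:
            merged.pop(key, None)   # tombstone removes the key from the leaf
        else:
            merged[key] = value     # insert or update
    mini_page.clear()               # old mini-page is discarded, memory is reusable

    items = sorted(merged.items())
    if len(items) <= LEAF_CAPACITY:
        return [dict(items)]
    mid = len(items) // 2           # leaf is over capacity: split in half
    return [dict(items[:mid]), dict(items[mid:])]

leaf = {1: "a", 2: "b", 3: "c"}
mini = {2: "B", 3: None, 4: "d", 5: "e", 6: "f"}
print(merge_mini_page(leaf, mini))
```

Here the leaf ends up with five live records, one more than the toy capacity, so the merge returns two leaves.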

Copy-On-Access Prevents Hot Data From Being Evicted

Bf-Tree stores mini-pages in a circular buffer.

Suppose a mini-page is close to eviction but suddenly becomes active again.

Instead of keeping it in place:

Old mini-page
      ↓
Copy to buffer tail
      ↓
Mark old copy as tombstone

The frequently accessed mini-page effectively receives a second life.

Hot data moves away from eviction boundaries.

Cold data moves toward disk.

This resembles a lightweight cache promotion strategy.
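
A toy model of copy-on-access might look like the sketch below. It assumes eviction happens at the head of the buffer and allocation at the tail. The `copy_zone` parameter (how many slots near the head count as "close to eviction") and the linear scan in `access` are illustrative assumptions; a real implementation would look pages up through a mapping table rather than scanning.

```python
from collections import deque

class MiniPageBuffer:
    """Toy circular buffer: evict from the head, allocate at the tail."""

    def __init__(self, copy_zone=2):
        self.slots = deque()        # each slot: [page_id, data, is_tombstone]
        self.copy_zone = copy_zone  # hypothetical size of the near-eviction region

    def allocate(self, page_id, data):
        self.slots.append([page_id, data, False])

    def access(self, page_id):
        """Return the page's data, copying it to the tail if it is near eviction."""
        for pos, slot in enumerate(self.slots):
            if slot[0] == page_id and not slot[2]:
                break
        else:
            return None             # page is not in the buffer
        if pos < self.copy_zone:
            slot[2] = True                    # old copy becomes a tombstone
            self.allocate(page_id, slot[1])   # live copy now sits at the tail
        return slot[1]

    def evict(self):
        """Pop from the head, skipping tombstones; return the evicted page id."""
        while self.slots:
            page_id, _, dead = self.slots.popleft()
            if not dead:
                return page_id
        return None

buf = MiniPageBuffer(copy_zone=2)
for name in "ABCD":
    buf.allocate(name, f"data-{name}")
buf.access("A")        # A was next in line for eviction, so it is copied to the tail
print(buf.evict())     # the tombstoned slot is skipped and B is evicted instead
```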

Evicting Records Instead of Evicting Entire Pages

One interesting detail in the paper is that Bf-Tree can evict individual cold records.

Each record maintains a reference bit:

Accessed recently → keep
Not accessed → evict

During copy-on-access:

  • Hot records remain in memory
  • Cold cache records are discarded immediately
  • Dirty records (insertions/tombstones) trigger merge with leaf pages

So a mini-page gradually evolves:

Before:
[A][B][C][D][E]

Only A and D are hot

After:
[A][D]

Memory becomes concentrated around active keys.

This is much finer-grained than traditional page caches, where an entire page often survives because one record is frequently accessed.
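
The three outcomes listed above can be sketched as a single pass over a mini-page's records. The `Record` fields and the `compact` function are hypothetical names, and clearing the reference bit on surviving records is an assumption borrowed from clock-style (second-chance) replacement, not something the article states.

```python
from dataclasses import dataclass

@dataclass
class Record:
    key: str
    referenced: bool = False   # set when the record is accessed
    dirty: bool = False        # insert or tombstone not yet merged into the leaf

def compact(mini_page):
    """Return (kept, to_merge); cold clean records are simply dropped."""
    kept, to_merge = [], []
    for rec in mini_page:
        if rec.referenced:
            rec.referenced = False   # assumption: reset the bit, clock-style
            kept.append(rec)         # hot record stays in memory
        elif rec.dirty:
            to_merge.append(rec)     # cold but dirty: must reach the leaf page
        # cold and clean: a cached copy that can be discarded immediately
    return kept, to_merge

page = [
    Record("A", referenced=True),
    Record("B", dirty=True),
    Record("C"),
    Record("D", referenced=True),
    Record("E"),
]
kept, to_merge = compact(page)
print([r.key for r in kept], [r.key for r in to_merge])
```

In this example A and D survive, B is handed off for merging, and C and E vanish without any disk write.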

Leaf Splits Still Exist

Despite the redesign, Bf-Tree has not eliminated classic B-Tree behavior.

Leaf pages still split when full.

The difference is where inserts land first:

Traditional B-Tree:

Insert → Leaf page → Possible split

Bf-Tree:

Insert → Mini-page
            ↓
       Merge/Evict
            ↓
       Possible leaf split

A split happens later, after buffering absorbs multiple writes.

Remaining mini-page records are compared against the split key and distributed across new leaf pages.

This delays structural modification and reduces immediate disk work.
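
As a small illustration of that distribution step, the sketch below splits a full leaf in half and routes buffered keys by comparing them against the split key. The function names are hypothetical, and the convention that keys equal to the split key go to the right-hand leaf is an assumption.

```python
def split_leaf(leaf_keys):
    """Split a sorted leaf in half; return (left, right, split_key)."""
    mid = len(leaf_keys) // 2
    return leaf_keys[:mid], leaf_keys[mid:], leaf_keys[mid]

def distribute(mini_page_keys, split_key):
    """Route buffered keys: below split_key go left, the rest go right."""
    left = [k for k in mini_page_keys if k < split_key]
    right = [k for k in mini_page_keys if k >= split_key]
    return left, right

left_leaf, right_leaf, split_key = split_leaf([10, 20, 30, 40])
buffered_left, buffered_right = distribute([5, 25, 30, 45], split_key)
print(split_key, buffered_left, buffered_right)
```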

The Bigger Pattern

Across merges, copy-on-access, record eviction, and splits, Bf-Tree repeatedly applies the same idea:

Keep hot data close, push cold data out, and postpone expensive disk operations as long as possible.

Mini-pages are not just a cache; they behave more like a continuously self-cleaning write buffer.

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.

⭐ Star it on GitHub:

HexmosTech / git-lrc

Free, Micro AI Code Reviews That Run on Commit



