DEV Community

Cover image for Negative Lookups in Bf-Tree: Caching Things That Don't Exist
Athreya aka Maneshwar
Athreya aka Maneshwar

Posted on

Negative Lookups in Bf-Tree: Caching Things That Don't Exist

Hello, I'm Maneshwar. I'm building git-lrc, a Micro AI code reviewer that runs on every commit. It is free and source-available on Github. Star git-lrc to help devs discover the project. Do give it a try and share your feedback for improving the project.

Most caching systems optimize for finding records quickly.

But what happens when a query repeatedly asks for a key that does not exist?

SELECT * WHERE id = 999999;
Enter fullscreen mode Exit fullscreen mode

Traditional record caches struggle here. If a record is missing from cache, the system cannot tell whether:

  • the record was never cached, or
  • the record truly does not exist

So it falls back to disk and checks the leaf page again.

Repeated negative searches become repeated I/O.

Bf-Tree Caches Missing Records Too

Bf-Tree treats a failed lookup as useful information.

When a key is searched and confirmed absent, it inserts a phantom record into the mini-page:

Search key=42
      ↓
Not found on disk
      ↓
Store phantom record
Enter fullscreen mode Exit fullscreen mode

Future lookups:

Search key=42
      ↓
Phantom record found
      ↓
Return "not found"
Enter fullscreen mode Exit fullscreen mode

No leaf-page access needed.

The absence of data becomes cached state.

Four Record Types Inside a Mini-Page

By this point, mini-pages can contain multiple record types:

Type Dirty Exists
Insert Yes Yes
Cache No Yes
Tombstone Yes No
Phantom No No

A phantom record is essentially:

"We already checked. This key doesn't exist."

This is surprisingly useful for workloads with frequent failed lookups.

Recovery Still Looks Familiar

Despite mini-pages and phantom records, durability remains conventional.

Bf-Tree uses:

  • Write-ahead logging (WAL) before commits
  • Checkpointing/snapshots to persist state
  • Recovery via WAL replay

Crash recovery roughly becomes:

Load snapshot
      ↓
Rebuild in-memory pages
      ↓
Replay WAL
      ↓
Restore latest state
Enter fullscreen mode Exit fullscreen mode

So while Bf-Tree changes how pages are cached and merged, persistence still resembles traditional database systems.

The unusual idea is not recovery.

The unusual idea is this:

Bf-Tree treats "record does not exist" as something worth caching.

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.*

Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.

⭐ Star it on GitHub:

GitHub logo HexmosTech / git-lrc

Free, Micro AI Code Reviews That Run on Commit




AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

See It In Action

See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements

git-lrc-intro-60s.mp4

Why

  • 🤖 AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
  • 🔍 Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
  • 🔁 Build a

Top comments (0)