Stephanie Bergamo

Posted on Jan 14 • Edited on Jan 16

Building a B-tree in Go: The gap between theory and implementation

#database #go #btree #internals

I've been obsessed with database internals lately 🤓. It started with Alex Petrov's "Database Internals": one of those books where every page makes you think "wait, THAT'S how it works?!"

But there's a gap between understanding the theory and really getting it. I could explain what a B-tree does to a 6yo, draw the diagrams and the likes. But could I actually build one? That itch needed scratching.

So I dedicated a bit of my time off coding a B-tree from scratch in Go. About 15 hours total, spread across a few days between coffee, sport, rest and debugging sessions. I definitely didn't need one for production (PostgreSQL does this infinitely better), but I wanted to really understand what happens when I type CREATE INDEX.

And honestly? It was one of the most fun coding challenges I've done in a while.

What's a B-tree, anyway?

If you've used a database, you've used a B-tree: they're the data structure behind most database indexes. The key insight: unlike binary trees where each node has 2 children, B-tree nodes can have many children (determined by the "order"). This makes them perfect for disk storage where reading a block is expensive.

I won't explain B-trees from scratch here: Alex Petrov's book does that brilliantly. If you're not familiar with the basics, start with chapter 2 of Database Internals. This article assumes you know what a B-tree is and focuses on what I learned building one.

The Implementation: Easy, Interesting, and "Wait, What?"

The Easy Part: Search

Binary search within nodes, recurse to children. With good tests, this was maybe 2 hours including edge cases. The satisfying kind of easy where you know exactly what to do.

The Interesting Part: Split

Understanding why nodes split (keep tree balanced) vs how to split them (median, two halves, promote) took some head-scratching. Drawing it out on paper helped.

The surprise: root splits are special. You create a brand new root with just the promoted key. The tree grows taller, not wider.

The "Wait, What?" Part: Cascade

Oh wow. This is where I learned the difference between "I understand B-trees" and "I can build a B-tree."

When a split cascades up, you're not just inserting a key: you're managing a whole graph of parent-child relationships. Split an internal node? Better redistribute its children too. And update every child's parent pointer. Miss one? Silent corruption. The debugging session was ...fun...

The full implementation is on GitLab, but the key insight was: when you split an internal node, you're not just moving one or two keys: you're rewiring an entire graph. Get one pointer wrong and your tree silently corrupts. Tests caught this. A lot. And when they didn't and I had to discover that by myself, I wrote new tests to confirm and made sure I fixed the bug.

What Building It Taught Me (That Reading Didn't)

B-tree vs B+tree: Not Just Academic

I knew the theory: B-trees store data everywhere, B+trees only in leaves. But implementing it made me feel the difference.

In my B-tree, search can stop at any level when it finds the key. Elegant! But range scans would be a nightmare (you'd have to traverse the entire tree structure). B+trees sacrifice that early-stop elegance for better range queries (linked leaves).

Now when I see PostgreSQL using B+trees for indexes, I get why, not just what.

Sequential inserts are... weird 🤔

Testing with INSERT 1, 2, 3, 4... created a hilariously unbalanced tree: 7 levels deep with just 50 keys, all growing to the right.

Random inserts? Beautiful 3-level balanced tree.

This is why databases have bulk loading strategies and why REINDEX exists. Seeing my skewed tree made me appreciate those optimizations viscerally.

TDD Isn't Optional (or at least not for me anymore)

I've been building software with TDD for so long now that coding without tests feels... naked? Uncomfortable? Like I'm missing something fundamental.

Could I have built this B-tree without tests? Sure. Would I have shipped a dozen subtle bugs? Absolutely.

Writing tests first isn't about following best practices anymore: it's how I think through implementation. The test tells me what "done" looks like before I start coding.

For this project, that discipline caught:

Duplicate key insertion (oops, forgot to check)
Index out of bounds on edge cases (classic off-by-one)
Children count mismatch after splits (n keys → n+1 children, remembered that in the test)
Parent pointers pointing to the wrong parent (caught by checking parent relationships in tests)

When cascade finally worked and all 20+ tests turned green simultaneously? That's the confidence TDD gives you. I wasn't hoping it worked: I knew it did. That's the payoff: shipping with certainty, not crossed fingers. (by now I guess you might have understand that I'm a strong believer in the increased productivity and overall quality TDD is bringing).

And yes, 100% coverage throughout. Some call it obsessive. I call it professional 😇.

The "It Just Clicked" Moment

There was this moment, probably 10 hours in, debugging why searches failed after cascade. I was staring at my tree visualization (can't help but thanks Copilot for that helper function!), and suddenly I saw it: the children weren't following their parent after the split.

That moment when the entire structure suddenly made sense (not just conceptually, but in memory, pointers, the whole graph of relationships) that's what hands-on gets you. No amount of reading gives you that.

The Numbers

Time: ~15 hours across 4 days
Code: ~200 lines of implementation
Tests: 20+ unit tests, 100% coverage
Bugs found: Too many to count (but TDD caught them all)
Coffee consumed: Significant

What's Next

This was way more fun than I expected. There's something deeply satisfying about implementing a data structure you've used for years without really understanding it.

Next up: LSM-trees (Log-Structured Merge trees). They represent the opposite tradeoff: B-trees optimize for reads (in-place updates), LSM-trees optimize for writes (append-only). Understanding both will show me why Cassandra chose LSM while PostgreSQL chose B+trees.

Reading about compaction strategies is one thing. Implementing them will be another level entirely 🤩.

If you're curious about database internals, start with "Database Internals" by Alex Petrov. Then pick something and build it. The gap between reading and implementing is where the real ~magic~ learning happens.

You can see the full implementation on my self-hosted GitLab (yes, I run my own instance: infra nerd life). Pull requests welcome if you spot ways to improve it! 😊

DEV Community