DEV Community

Zackery Sayers

I built an x86_64 kernel from scratch, and it made me hate AI documentation tools. So I built my own.

Six months ago, I started building TaterTOS64, an x86_64 kernel. As any systems dev knows, once you pass the 10,000-line mark across a mix of C, Assembly, and linker scripts, your brain starts to leak. I needed a way to document the architectural "why": how the interrupt vectors hand off to the scheduler, and how the paging logic relates to the physical memory map.

Naturally, I tried the modern approach: I fed the code to LLMs.

The Result Was a Disaster
Generic "AI Doc" tools failed me in three specific ways:

  1. The Context Amnesia: They'd understand a single .c file but completely hallucinate the #include chain. They had no idea where the paging.h constants were actually defined in my repo structure.
  2. The Hallucination Loop: They would confidently explain my scheduler's "logical flow" while citing functions that didn't exist or, worse, misreading raw Assembly entry points as high-level C signatures.
  3. The SaaS Tax: I'm building a local kernel. I don't want to pay $20/mo to a cloud service to "rent" access to my own local documentation pipeline.

Building the Solution: TaterBookBuilder
I decided to stop building the kernel for two weeks and build the documentation compiler I actually wanted. I call it TaterBookBuilder.

Instead of a simple "text-to-prompt" wrapper, I built a deterministic analysis engine first.

How it actually works:

  • Physical Inclusion Graphing: Before the LLM ever sees a prompt, the engine walks the repo and maps every #include (C) and %include (Assembly) to its canonical repository node. No more guessing where types come from.
  • AST-Aware Ingestion: Using Roslyn and custom regex parsers, it builds a logical hierarchy of your system. It identifies "Kernel Boundaries" vs "User Space" based on the directory topology and hot-path signals (like syscall entry points).
  • The "Evidence Map" (The Game Changer): I was tired of second-guessing the LLM. I implemented an Evidence Map system. Every claim the book makes is backed by a deterministic ID that points to a specific file and line range in the repo. If the book says "The scheduler uses a Round-Robin approach," there is a footnote pointing exactly to src/kernel/sched.c:L45-L120.
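
To make the first and third ideas concrete, here is a minimal sketch in Python. It is not TaterBookBuilder's actual code. It assumes a repo held in memory as a path-to-text mapping, resolves each #include / %include to a repo-relative node, and attaches an evidence record (file plus line range) to every edge it finds. The names Evidence, resolve, and build_include_graph, the "EV-" ID format, and every path except src/kernel/sched.c are hypothetical and exist only for illustration.

```python
import hashlib
import posixpath
import re
from dataclasses import dataclass

# Matches C `#include "x.h"` / `#include <x.h>` and NASM-style `%include "x.inc"`.
INCLUDE_RE = re.compile(r'^\s*[#%]\s*include\s*["<]([^">]+)[">]')


@dataclass(frozen=True)
class Evidence:
    """A claim's backing location: a repo-relative file and a line range."""
    path: str
    start: int
    end: int

    @property
    def evidence_id(self) -> str:
        # Deterministic: the same file and range always hash to the same ID.
        key = f"{self.path}:{self.start}:{self.end}"
        return "EV-" + hashlib.sha1(key.encode()).hexdigest()[:8]

    def footnote(self) -> str:
        return f"{self.path}:L{self.start}-L{self.end}"


def resolve(target, including_file, repo_files):
    """Map an include target to a repo-relative path, or None if unknown."""
    # 1. Try the path relative to the including file's directory.
    sibling = posixpath.normpath(
        posixpath.join(posixpath.dirname(including_file), target)
    )
    if sibling in repo_files:
        return sibling
    # 2. Otherwise accept a suffix match only if it is unique; never guess.
    matches = [p for p in repo_files if p == target or p.endswith("/" + target)]
    return matches[0] if len(matches) == 1 else None


def build_include_graph(sources):
    """Return {file: [(included_file, Evidence), ...]} for in-repo includes."""
    repo_files = set(sources)
    graph = {}
    for path, text in sources.items():
        edges = []
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = INCLUDE_RE.match(line)
            if not match:
                continue
            node = resolve(match.group(1), path, repo_files)
            if node is not None:  # system headers and ambiguous names are skipped
                edges.append((node, Evidence(path, lineno, lineno)))
        graph[path] = edges
    return graph


# Hypothetical repo contents, for illustration only.
sources = {
    "src/kernel/sched.c": '#include "paging.h"\n#include <stdint.h>\n',
    "src/kernel/entry.asm": '%include "macros.inc"\n',
    "src/kernel/macros.inc": "; shared macros\n",
    "include/paging.h": "#define PAGE_SIZE 4096\n",
}
for src, edges in build_include_graph(sources).items():
    for node, ev in edges:
        print(f"{src} -> {node}  [{ev.evidence_id}] {ev.footnote()}")
```

The design choice worth noting is that resolve returns None when a name is ambiguous instead of picking one, so any edge that reaches the LLM prompt is one the engine could actually prove. A claim about the scheduler would carry the same kind of record, for example Evidence("src/kernel/sched.c", 45, 120), which renders as the footnote src/kernel/sched.c:L45-L120.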

The Philosophy: Local-First and Perpetual
Documentation is a permanent asset. It shouldn't depend on a cloud subscription.

I'm shipping TaterBookBuilder as a 77MB Linux AppImage. It's completely turnkey—I even bundled a static binary of Pandoc inside it so you don't have to install a single dependency.

And for the pricing? I'm using the JetBrains Model. You buy it once, you own that version forever. You get a year of maintenance, and if you don't want to renew, your documentation pipeline keeps working exactly as it did on day one.

Documentation should be as rock-solid and local as the code it describes.

Check out the workbench and download the trial here:
https://taterlabs.shop/taterbook.html

I'd love to hear from other systems devs—how are you handling the "trust gap" with AI-generated architecture maps?

#C #Assembly #SystemProgramming #BuildInPublic #LocalFirst #DotNet
