How I Survived 7 Rebuilds of the Same SaaS by Building a Control Layer Around Claude Code

#agents #ai #claude #softwaredevelopment

I'm a solo founder. Over the past month I rebuilt the same B2B SaaS codebase seven times. Not because the code was bad — because the AI assistant kept fabricating completion reports, drifting in long sessions, and giving me three different answers to the same prompt.

After rebuild number seven I stopped writing product code for two days. I built a control framework around the assistant instead of around the codebase. It's been running every day since, and the difference is hard to overstate.

This post is the short version of what's inside.

The core problem

The failure mode wasn't "the model is dumb." It was that the model is convincing when it's wrong. Three patterns kept killing me:

Fabricated completion reports. "Task complete, all tests pass." No tests were run. No tests existed.
Long-session drift. At 60% context, the model started contradicting decisions it had made 20 turns earlier.
Prompt-layer guardrails get ignored. I tried system prompts, custom instructions, every prompt trick. The model agreed politely and then did whatever it wanted.

The insight that changed everything: if prompts get ignored, move the guardrails out of the prompt layer and into the protocol layer. Hooks. Not instructions.

What's in the kit

Single release. No subscription. Designed for solo developers who are 3+ rebuilds in and tired.

3 hooks that block fabricated reports at the protocol layer. Pre-tool-use, post-tool-use, subagent-stop. They don't ask the model to be honest. They refuse to accept output that fails an objective check.
17 sub-agent definitions in 4 teams: research, build, verify, ship. Each one has a single-line goal, a fixed output format, an allowed tool list, and an explicit boundary. If a sub-agent tries to step outside its boundary, the hook stops it.
5 SSOT files (single source of truth): wiki, wiki-detail, status, decisions, false-report log. The false-report log is the most important — it's where every "I lied" event gets recorded so the same lie can't slip through twice.
2 RAG embedding scripts for pulling project memory back into context without bloating the prompt.
13 case studies from the seven-rebuild retrospective. What I tried, what failed, what the fingerprint of each failure looks like.
Setup script with an explicit consent prompt before it touches anything.

A taste of what the hook layer looks like

This isn't the actual hook (the real ones are in the archive) — it's the shape. The point is that hooks live outside the model's reach:

#!/bin/bash
# subagent-stop hook: reject reports that claim completion
# without referencing any verification step

report="$1"

if grep -qEi "(complete|finished|done|100%)" "$report"; then
  if ! grep -qEi "(test|verified|checked|ran)" "$report"; then
    echo "blocked: completion claim with no verification reference"
    exit 2
  fi
fi

The model can be as creative as it wants inside the prompt. The hook doesn't read the prompt. It reads the output and checks structure. If the structure doesn't match the contract, exit code 2, and the report is rejected at the protocol layer.

Numbers from my own project

Unverified, single-author, on one codebase. Take with a grain of salt:

Metric	Before	After
Weekly task time	~40h	~8h
Output accuracy	~40%	~95%
Fabricated reports / week	~13	0-1

The "0-1" line matters more than the "95%." A loud failure I can catch. A silent fabrication is what burns a week.

Why I'm sharing

I was going to keep it private. Then I kept getting the same message from other solo devs hitting the same wall — "I'm on rebuild 4, what do I do." So I packaged the framework I was already using.

The Gumroad page is a free preview — full README, ICP, before/after, FAQ. Read everything before deciding. The paid archive (49 USD one-time) is the executable assets: hooks source, sub-agent definitions, SSOT templates, case studies, RAG scripts, setup script. No subscription, no updates, no support — single release. MIT for code, CC BY-NC for docs.

If you're 3+ rebuilds in and the assistant is still fabricating reports, I'm in the comments. Happy to talk specifics.