How I stopped Claude Code from hallucinating on Day 4 (The "Spec-Driven" Workflow)

#ai #testing #claudecode #productivity

We’ve all been there. You start a new project with Claude Code (or Cursor, or Windsurf), and Day 1 feels like magic. You ask for an Auth system? Boom, done. You ask for a database schema? Boom, perfect.

But then Day 4 hits.

You ask Claude to "update the user service," and suddenly it:

Deletes the error handling it wrote 2 days ago.

Re-implements a feature that already exists in another file.

Totally forgets that you decided to use UUID instead of Long for IDs.

I call this "Context Rot."

LLMs are brilliant at generation but terrible at state management. Chat sessions are ephemeral, but your project is permanent. Once your chat context window fills up, the "Plan" falls out of the LLM's brain, and it starts guessing.

I was working on a backend project (a 45-day estimate) and hit this wall hard. I realised I didn't need better prompts—I needed a State Management System for the AI itself.

Here is the workflow I built to fix it. It compressed that 45-day project into 5 days.

The Solution: Force "State" into the Repo
The mistake we make is keeping the "Plan" in the chat. The Plan needs to live in the Repository, just like the code.

I implemented a strict 3-Phase Spec System using Markdown files. Before writing a single line of code, Claude must update these files.

The Folder Structure I created a .claude/specs directory in my project root with three specific subfolders:

.claude/
  └── specs/
      ├── plans/           # (Pending) High-level architectural decisions
      ├── in-progress/     # (Active) The "Context Anchor" for right now
      └── plans-executed/  # (Done) A permanent history of WHY code exists

The "Context Anchor" (The Secret Sauce) Most people give Claude a task like "Build the User API." Instead, I force it to generate a Specification File first.

The Spec: Defines the API contract, the data model, and the edge cases in Markdown.

The Rule: I added a CLAUDE.md file to my root that tells Claude: "Before you write code, you must read the active spec in .claude/specs/in-progress/."

This file becomes an external memory bank. When the chat context fills up, Claude re-reads the markdown file and instantly "remembers" the plan.

The Execution Log When a task is done, I don't delete the plan. I move it to .claude/specs/plans-executed/.

Six months from now, when I wonder "Why did we split the User Service into two files?", I don't have to dig through 100 chat logs. I just open user-service-plan-executed.md and read the architectural decision log.

The Result
By moving "State" out of the Chat and into Markdown:

Zero Hallucinations: Claude stops guessing because the "Truth" is in the file.

Resumable Sessions: I can close my laptop, come back 3 days later, start a new chat, and say "Resume the spec in .claude/specs/in-progress/". Claude instantly catches up.

9x Speed: I stopped reviewing code for "logic errors" and started reviewing it for "spec compliance."

Try it yourself (Open Source)
I’ve open-sourced the methodology and the CLAUDE.md rule file that enforces this workflow. You can drop it into any project today.

📂 Get the Free Workflow Rules on GitHub

🔌 Automated Toolkit (Optional)
If you want to go faster, I also built a CLI Toolkit that automates this whole process. It includes:

/session-start and /session-end commands that auto-manage the markdown state.

Auto-Review Agents: A java-code-reviewer that wakes up when you edit .java files to check for security flaws.

Test Generator: A skill that reads your BaseTest pattern and writes JUnit 5 tests automatically.

You can grab the full automation toolkit here: Spec-Driven AI Toolkit ($49)

Happy coding! Let me know if "Context Rot" drives you crazy, too. 👇

DEV Community

How I stopped Claude Code from hallucinating on Day 4 (The "Spec-Driven" Workflow)

Top comments (0)