JDiz00

Posted on Jun 19

The No BS Claude Code Setup: The CLAUDE.md + Hooks I Run Every Session

#claude #ai #productivity #devtool

If you use Claude Code seriously, you've felt this: the model is brilliant, but the session fights you. It forgets what it did last week. It edits the wrong file. It declares a broken build "done" and moves on. You end up re-explaining your conventions every single morning.

Here's what took me a while to accept: the model was never the bottleneck. The harness around it was. A great model with a sloppy harness is a junior dev with amnesia.

So I stopped tweaking prompts and started building a system. This is the actual setup I run every session — not theory, the real thing.

1. CLAUDE.md is a contract, not a wish list

Most people's CLAUDE.md is a pile of polite suggestions the agent ignores under pressure. Mine reads like a contract with non-negotiables. A few of the rules that earn their place:

## Response style
- Answer first, explain only if asked. No filler.

## Decision forks
- At a real choice point: stop, present options, wait. Don't guess and proceed.

## Done means done
- Never say a task is complete without verifying it. Run it, test it, or
  show the output. If you can't verify, say so explicitly.

That last one — "done means done" — is the highest-leverage line in the whole file. But a rule in a doc is only a suggestion until something enforces it. That's where hooks come in.

2. The one hook that changed everything: verify-before-done

Claude Code lets you run a hook when the agent tries to end its turn (a "Stop" hook). I use one that does a simple, brutal check: if the agent modified code and claims it works, but ran no test/build/command that turn, block the stop and make it actually verify.

The logic is basically:

on Stop:
  if changed_code_this_turn and claimed_done and not ran_verification:
      block("You said it works but verified nothing. Run the feature,
             test it, or state plainly what you couldn't verify.")

It's a few dozen lines, but it killed almost all of my "it said it worked and it didn't" pain in one move. The agent can no longer hand me a green checkmark it didn't earn.

3. Model routing: cheap work to cheap models

Subagents inherit the main model unless you say otherwise — which means a top-tier model can quietly end up doing grep sweeps and log-summarizing at premium prices. I route by task: recall/inventory/summarization → a cheap fast model, implementation/tests → a mid model, deep review/diagnosis → the strong one. A small pre-spawn hook auto-pins the tier so I don't have to remember. Same quality on the work that matters, a fraction of the spend on the work that doesn't.

4. Persistent memory + a knowledge graph

The agent writes durable facts to a small file-based memory (one fact per file, with a one-line index it loads each session) and I keep a knowledge graph over my notes. The payoff: it stops re-researching my own past work. Before it builds or researches something, a recall step checks "have we already done this?" — and usually we have.

5. Pre-build recall: don't rebuild what exists

Tied to the memory above: a check on each build/research request that surfaces prior work first. The number of times this has caught me about to rebuild something I finished three weeks ago is genuinely embarrassing.

The point

None of these are clever individually. Together they turn the agent from an amnesiac genius into something that behaves like a senior engineer who remembers the project and refuses to lie about whether the build is green.

If you want a running start, I packaged the two highest-leverage pieces — the CLAUDE.md contract and the verify-before-done guardrail — into a small free starter kit. It's drop-in: copy the files, restart, done in about ten minutes. It's pay-what-you-want with a $0 option, no email wall to read it:

→ Free Claude Code Starter Kit: https://expressive446.gumroad.com/l/qlypgs

Built and run by someone who ships with this stack daily. Happy to answer setup questions in the comments.

Claude is a trademark of Anthropic PBC. This is an independent setup guide and is not affiliated with, endorsed by, or sponsored by Anthropic.

Top comments (1)

solemness • Jul 2

Solid post, especially the point that Claude Code's flakiness is a session-infrastructure problem, not a model problem. The verify-before-done piece is the one worth hardening though — "the agent should verify" as a CLAUDE.md instruction is a suggestion, not a contract. Models skip it under context pressure. You need a Stop hook that can actually refuse to let the turn end.

Mechanics: a Stop hook fires when Claude tries to finish. If it exits 2, the run doesn't stop — stderr gets fed back to the model as the reason, and it has to keep going. You can also return JSON {"decision": "block", "reason": "..."} on stdout for the same effect with more control.

Two things people get wrong building this:

Loop guard — check stop_hook_active in the hook's input JSON. If it's already true, your hook already fired once this turn; don't block again or you'll get an infinite "keep working" loop that never actually exits.
Fail open — if your test command itself errors (missing binary, no test script, whatever), don't block. Log it and let the turn end. A hook that hard-fails on its own tooling problems is worse than no hook — it locks the session.

Rough shape: track in a temp file whether any Edit/Write tool ran this turn (via a PreToolUse hook), then in the Stop hook, if that flag is set and no test/build command ran after it, exit 2 with "you modified files but didn't verify — run the test suite" as the message.

Full disclosure: I'm the author — I built and sell a pack with this exact pattern (claude-md-armor, $7): solemness.gumroad.com/l/claude-md-.... The mechanics above work standalone if you'd rather DIY it in five minutes.