Peter Huang

Posted on Jun 10

Writing a CLAUDE.md that Claude actually follows

#ai #productivity #claude #devtools

TL;DR — CLAUDE.md is powerful, but most people fill it with vague preferences that Claude acknowledges and then ignores. The instructions that stick are specific, verifiable, and binary — not stylistic. This post shows the difference, with examples.

The CLAUDE.md that got ignored

My first CLAUDE.md had 30 lines. Things like:

- Write clean, readable code.
- Keep functions small and focused.
- Follow best practices.
- Be concise in explanations.
- Don't over-engineer solutions.

Claude read it. Every session it would acknowledge the file. And then it would write 400-line explanations, functions that did four things, and solutions with three layers of abstraction nobody asked for.

I blamed Claude. But the real problem was me: I'd written instructions that I couldn't have enforced even if I were reviewing a junior engineer's PR. "Be concise" isn't actionable. It's an aspiration. And aspirations don't constrain LLM behavior — they just get absorbed into the same ambient politeness that produces confident, plausible, wrong output.

Why most CLAUDE.md instructions don't stick

An LLM follows instructions the way a very eager-to-please person follows vague advice: by interpreting them in whatever way is consistent with what they were already going to do. "Write clean code" is consistent with almost anything. "Don't over-engineer" depends entirely on what counts as over-engineering, which the model will assess using its own judgment, which is the thing you were trying to override.

The instructions that actually change behavior share a few properties:

1. They're binary, not scalar.

"Be concise" is scalar — there's a spectrum and the model decides where on it to land. "Do not write an introduction paragraph that summarizes what you're about to do" is binary. Either there's an intro paragraph or there isn't. Binary instructions are enforceable; scalar ones get interpreted.

2. They name a specific artifact or action.

"Keep functions small" is vague. "If a function exceeds 40 lines, split it before returning" names a specific thing Claude can check. The more you describe the output rather than the quality, the more Claude can follow it.

3. They have no wiggle room for judgment.

"Prefer composition over inheritance" invites Claude to decide when composition is preferable. "Do not use class inheritance in this repo — use composition" closes the judgment loop. Instructions that leave room for discretion will be filled with the model's discretion.

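To make "a specific thing Claude can check" concrete, here is a minimal sketch of the 40-line rule as a mechanical check. It assumes the code under review is Python and uses the standard `ast` module; the function name `functions_over_limit` is hypothetical, and the threshold comes from the example rule above.

```python
import ast

MAX_LINES = 40  # threshold taken from the example rule; adjust to taste


def functions_over_limit(source: str, max_lines: int = MAX_LINES) -> list[tuple[str, int]]:
    """Return (name, line_count) for every function longer than max_lines."""
    tree = ast.parse(source)
    offenders = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the `def` line to the last line of the body, inclusive.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                offenders.append((node.name, length))
    return offenders
```

The point is less the script than the property it demonstrates: a rule is binary when you could write a check like this for it. There is no equivalent check for "keep functions small".
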
A before/after on common instructions

Here's a CLAUDE.md rewrite that applies those three properties:

Before: - Write concise explanations.

After: - When I ask a question, answer in 3 sentences or fewer unless I explicitly ask you to elaborate. No bullet lists in conversational responses.

Before: - Follow the project's code style.

After: - Before writing any code, read the nearest existing file in that directory and match its import style, spacing, and naming exactly. Do not introduce new patterns.

Before: - Don't add features I didn't ask for.

After: - Do not modify any file I didn't mention in my request. If you think an adjacent change would help, describe it in a sentence at the end — do not make it.

Before: - Be careful with database changes.

After: - Never generate a migration without a rollback. Every migration file must have both an `up` and a `down` function before you show it to me.

The "After" versions aren't longer because they're more thorough. They're longer because specificity takes more words. But each "After" instruction has exactly one interpretation.
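
One way to see that the migration rule has a single interpretation is that it can be checked by a script. The sketch below is a rough heuristic under stated assumptions: it treats a migration as text and looks for definitions named up and down in a few common shapes (`def up(`, `function down(`, `exports.up =`). The helper names are hypothetical, and a real migration framework may define these functions differently.

```python
import re


def _defines(name: str, text: str) -> bool:
    # Assumption: a definition looks like `def name(`, `function name(`,
    # or `exports.name =`. Other styles are not detected.
    pattern = rf"\b(?:def|function)\s+{name}\s*\(|\bexports\.{name}\s*="
    return re.search(pattern, text) is not None


def has_rollback(migration_text: str) -> bool:
    """True only when both an up and a down definition are present."""
    return _defines("up", migration_text) and _defines("down", migration_text)
```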

The two categories of CLAUDE.md instructions

After rewriting mine a few times, I've found that instructions fall into two useful categories:

Constraints on output — things Claude must or must not produce. These work well when they're binary and verifiable at a glance. "Do not add comments to code unless I ask" is a constraint on output. So is "all functions must have return type annotations" or "never produce a file over 200 lines without splitting."

Procedural instructions — steps Claude must take before doing the main task. "Before writing a function, read the test file for that module" is procedural. So is "before answering a question about the codebase, run grep for the relevant symbol and cite the file:line." Procedural instructions work because they give Claude a concrete first step that shapes what it does next.

What doesn't work well: value statements, quality adjectives, and relative preferences ("prefer X", "try to Y", "consider Z"). These are absorbed. They don't constrain.
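
If you want a quick first pass over your own file, the soft phrasings can be flagged mechanically. This is a minimal sketch, not a real linter: the phrase list is an illustrative assumption drawn from the examples in this post, and `soft_instructions` is a hypothetical name.

```python
import re

# Illustrative list only: relative preferences and quality adjectives
# taken from the examples in this post. Extend it for your own file.
SOFT_PHRASES = ["prefer", "try to", "consider", "clean", "readable",
                "best practices", "concise", "careful"]

_SOFT_RE = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in SOFT_PHRASES) + r")\b",
    re.IGNORECASE,
)


def soft_instructions(claude_md: str) -> list[tuple[int, str, list[str]]]:
    """Return (line_number, line, matched_phrases) for each flagged line."""
    flagged = []
    for number, line in enumerate(claude_md.splitlines(), start=1):
        hits = [m.group(1).lower() for m in _SOFT_RE.finditer(line)]
        if hits:
            flagged.append((number, line.strip(), hits))
    return flagged
```

A flagged line isn't automatically wrong, since a word like "consider" can appear inside an otherwise binary rule. Treat the output as a list of candidates to rewrite, not a verdict.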

What to put at the top

CLAUDE.md instructions have an effective range. The further down the file an instruction sits, the more likely it is to be deprioritized during a long task. Put the most important constraints at the top, not as an introduction or preamble, but as the first actual rules Claude encounters.

My current CLAUDE.md opens with three lines that are binary, specific, and cover the mistakes I'd otherwise see every session:

# Rules (always follow)

1. Do not modify files I did not explicitly mention. If a related change seems useful, list it at the end and wait.
2. Do not add code comments unless I ask. Well-named variables explain themselves.
3. When showing diffs or edits, show only the changed lines plus 2 lines of context — no full-file reprints.

Those three lines have changed my sessions more than the other 40 instructions combined.
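
For reference, this is roughly what rule 3 asks for, expressed with Python's standard `difflib`. It is only a sketch of the target output shape (changed lines plus 2 lines of context). The function name `minimal_diff` and the "before"/"after" labels are my own placeholders, and nothing here reflects how Claude produces its diffs.

```python
import difflib


def minimal_diff(before: str, after: str, context: int = 2) -> str:
    """Unified diff with only `context` unchanged lines around each change."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",   # placeholder label, not a real path
        tofile="after",      # placeholder label, not a real path
        n=context,           # number of context lines per hunk
        lineterm="",
    )
    return "\n".join(lines)
```

Changing one line in a ten-line file yields a single hunk of five lines on each side, instead of a full reprint of the file.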

Testing whether it works

You'll know your CLAUDE.md is working when you stop seeing the behavior you wrote it to prevent. That sounds obvious, but most people write CLAUDE.md once, never check whether it changed anything, and assume it did.

A practical test: after updating CLAUDE.md, trigger the exact scenario the new instruction was meant to handle. Write a request that would previously have produced the unwanted behavior. If it still appears, the instruction isn't binary enough, isn't specific enough, or is too far down the file. Revise until it's gone.

CLAUDE.md is not documentation. It's configuration. Treat it like a test suite: if the behavior you're trying to prevent still appears, the config is wrong.
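
Taking the test-suite analogy literally, a binary instruction can be turned into an assertion you run against a saved response. The sketch below checks the "3 sentences or fewer, no bullet lists" rule from earlier. The sentence splitter is deliberately naive, and the function name `violations` is hypothetical, so treat it as a starting point rather than a reliable grader.

```python
import re


def violations(response: str, max_sentences: int = 3) -> list[str]:
    """Return a description of each rule the response breaks (empty if none)."""
    problems = []
    # Naive split on whitespace that follows terminal punctuation.
    # Abbreviations such as "e.g." will be over-counted.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if len(sentences) > max_sentences:
        problems.append(f"{len(sentences)} sentences (limit {max_sentences})")
    # A line that starts with -, *, a bullet character, or "1." counts as a list item.
    if re.search(r"^\s*(?:[-*•]|\d+\.)\s+", response, flags=re.MULTILINE):
        problems.append("contains a bullet or numbered list")
    return problems
```

Paste in the response you got after updating CLAUDE.md. An empty list means the instruction held for that scenario, and anything else tells you which half of the rule slipped.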

The free agent + where to go deeper

If you're building a real CLAUDE.md-driven workflow, the same design philosophy applies to subagents: narrow scope, binary constraints, explicit refusal rules. The free shipping-coach agent in this repo is a working example of that approach — it's a 50-line .md file worth reading for its structure, not just its functionality:

👉 github.com/allcanprophesy-ops/claude-code-shipping-coach

A few more agents built the same way (pr-surgeon, regression-sentinel) are linked from the README if you want to see how the pattern scales beyond a single agent. But start by reading your own CLAUDE.md and asking: which of these instructions is binary? Which ones leave room for the model to decide? Fix the second category first.

What's the one instruction in your CLAUDE.md that you've rephrased several times and that still doesn't work? Drop it in the comments and I'll suggest a binary rewrite.

Top comments (1)

Alex Shev • Jun 11

A good CLAUDE.md is really a repo contract. It should tell the agent where truth lives, which commands prove a change, what files are dangerous, and how to stop when the evidence is weak.

The more terminal workflows become agent-driven, the more these files need to behave like executable operating rules, not just onboarding notes.