Brad Kinnard

Posted on May 10

How to Write a CLAUDE.md Rule That Actually Gets Enforced

#ai #programming #productivity #tutorial

Open a CLAUDE.md file at random and you'll find build commands, architecture notes, and rules. The rules tend to be the unenforceable kind. "Write clean code." "Be careful with types." "Follow our conventions." The author meant every word. The agent reads them. And nothing checks whether the agent followed them, because nothing can.

In a corpus of 580 CLAUDE.md, AGENTS.md, and .cursorrules files from public GitHub repos with 10+ stars, 74% contained zero machine-extractable rules. Not because the authors didn't care about rules. Because most rules were written in a form no parser could pull out as a deterministic check.

This post is about the difference. Specifically: how to phrase a rule so a parser can extract it and a verifier can check it, without sacrificing what you actually meant.

The principle

Enforceability comes from a verifiable surface. A rule is verifiable when there's a concrete pattern in code that either matches it or doesn't. "Use camelCase for function names" is verifiable: read the AST, list the function names, check the casing. "Name things consistently" is not: there's no concrete pattern to check, only a judgment to make.

The gap between the two is the gap between intent and enforcement. You meant the same thing in both cases. But only one of them survives translation into a check.

Here's the heuristic I use: could a junior engineer with no context mechanically check whether code follows this rule, just by reading the rule and looking at the code? If yes, the rule is enforceable. If they'd have to ask "what does 'consistent' mean here?", it isn't.

Three pairs, walked through

Take a few common intents and look at how they fail or succeed.

Type safety. You want strong typing.

Bad:  Be careful with types.
Good: No `any` types in `src/`. Async functions require explicit return types.

The bad version has no surface. "Careful" isn't a check. The good version has two: a forbidden token (any) and a structural property (return type annotation on async function declarations). Both check directly against the AST.

Module structure. You want predictable imports and exports.

Bad:  Prefer clean module structure.
Good: Named exports only. No default exports. Filenames in kebab-case.

"Clean" is meaningless to a parser. Named-only and kebab-case are both binary properties of code that exist or don't. The first version sounds like more guidance because it's broader, but breadth is the problem: it covers everything and enforces nothing.

Preferences over alternatives. You want React functional components, not class components.

Bad:  Write modern React.
Good: Prefer functional components over class components.

"Modern" is a moving target with no fixed surface. The "prefer X over Y" pattern, on the other hand, has a clean check: count instances of each, compute a ratio, score against a threshold. This is one of the most useful patterns in instruction files because it captures real-world preference (not absolute prohibition) in a measurable way.

The reference table

Twelve common intents, paired:

Intent	Unenforceable	Enforceable
Naming functions	Name things consistently	Use `camelCase` for function names
Filenames	Pick reasonable filenames	All filenames in kebab-case
Type safety	Be careful with types	No `any` types
Async return types	Make types clear	Async functions require explicit return types
Module exports	Prefer clean module structure	Named exports only, no default exports
File size	Keep files manageable	Maximum 300 lines per file
Logging	Be mindful of logging	Never use `console.log`; use `src/logger.ts`
Component style	Write modern React	Prefer functional components over class components
Package manager	Use the right package manager	Use `pnpm`, not `npm`
Test files	Keep tests organized	All test files end with `.test.ts`
Error handling	Handle errors properly	Async functions must use `try/catch` or return a `Result` type
Commit format	Write clear commit messages	Use conventional commits (`feat:`, `fix:`, `chore:`)

Every right-hand cell points at a concrete check: a token, a casing rule, a count, a file pattern, a configured tool. Every left-hand cell points at a judgment.

Real-world rules usually carry scope: "no any in src/," "named exports outside index.ts files," "no console.log in production code paths." Scope makes a rule narrower and more accurate without making it less enforceable. The interop layer that genuinely needs any keeps it; the rest of the codebase doesn't.

What kinds of checks exist

Worth knowing what's available, because it shapes what's writable. Static analysis tools targeting AI instruction files generally support a few classes of check:

AST-level: function names, type annotations, import patterns, forbidden tokens, structural properties
Filesystem: file existence, naming conventions, directory layout, file size limits
Regex: literal strings, content patterns, conventional formats
Tooling: presence and configuration of linters, formatters, package managers, test runners
Config-file: contents of .eslintrc, tsconfig.json, .prettierrc, etc.
Git-history: commit message formats, branch naming conventions
Preference ratios: "prefer X over Y" with a compliance percentage instead of a binary verdict

If your rule maps to one of these classes, it's enforceable. If it doesn't, it isn't. The trick when writing instruction files is to keep that map in mind: when you're about to write "be careful with X", ask which of these classes "carefulness with X" lives in. Usually the answer points at a concrete reformulation.

The unenforceable rules aren't worthless

Here's a real tension: most of what makes a CLAUDE.md useful isn't enforceable at all. Project context (what the repo does, where the architecture lives), agent behavior directives (be succinct, ask before deleting, don't touch /legacy), and onboarding instructions are all valuable. None of them extract as rules.

Don't try to make them enforceable. They're a different kind of content with a different purpose. Project context grounds the agent. Behavior directives shape its style. Neither is supposed to be checked against output; they're checked against the agent's process, which is a different problem.

The mistake worth avoiding is letting unenforceable prose crowd out enforceable rules. Anthropic's Claude Code best practices doc recommends deleting any instruction the model already follows correctly without it. Most "write clean code" style rules fail that test: the model already does its version of clean code, so the line is taking up attention budget your agent could be spending on the specific, verifiable rules that actually distinguish your codebase from a generic project. Cut what the model already does. Keep the checks.

A test for your own files

Pull up your CLAUDE.md or AGENTS.md right now. For each line that looks like a rule, ask:

Could a junior engineer check this without asking clarifying questions?
Does it name a specific pattern, file, token, casing, or value?
Would 5 different reviewers all agree on whether a piece of code passes this rule?

If a rule fails 1 or 2, it's not a rule, it's a wish. If it fails 3, it's ambiguous. Rewrite or delete.

If you want a mechanical version of this test, RuleProbe parses CLAUDE.md, AGENTS.md, .cursorrules, .windsurfrules, GEMINI.md, and copilot-instructions.md against 102 matchers and tells you which lines extracted as rules and which didn't:

npm install -g ruleprobe
ruleprobe parse ./CLAUDE.md --show-unparseable

The --show-unparseable flag is the interesting one. It surfaces every line that looked rule-shaped but didn't map to a check. That list is your rewrite queue.

RuleProbe on GitHub

What this leaves out

The hardest case is rules like "follow the existing error handling pattern in this codebase." That's enforceable in principle (compare new code's structural shape against the codebase's dominant pattern), but not by simple AST or regex matching. It needs codebase-aware analysis. Some tools handle that; most don't. If you find yourself writing those kinds of rules, know that they'll either need a tool that does pattern profiling or they'll stay aspirational.

The other thing enforceability doesn't catch: an agent that follows every rule and still writes broken code. Static rules reduce variance, they don't eliminate it. A function with any removed and an explicit return type can still have wrong logic. Treat passing rule checks as a floor, not a ceiling.

Top comments (1)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.