Mike Dolan

Posted on Apr 7

I Wrote 500 Lines of Rules for Claude Code. Here's How I Made It Actually Follow Them.

#python #claudecode #ai #productivity

CLAUDE.md is supposed to be the operating manual for Claude Code. You write your rules, preferences, project context, and workflow instructions. Claude reads it at session start and follows your instructions.

Except it doesn't. Not reliably.

If you have used Claude Code for more than a few sessions, you have seen it ignore your CLAUDE.md rules. Not maliciously. It just prioritizes answering your question over following a protocol. It skips steps, forgets constraints, and does things you explicitly told it not to do.

I have 500+ lines of rules across CLAUDE.md, SESSION_PROTOCOLS.md, and MEMORY.md. After months of Claude ignoring them, I stopped writing more rules and started enforcing the ones I had. The fix was not better rules. It was hooks.

The Problem: Rules Without Enforcement

Here is what happens with CLAUDE.md rules in practice:

Claude skips the session start protocol. I have a 7-step checklist that Claude must complete before doing anything else: search the brain database, read the project tracker, review unfinished items, show me the last session's notes. Claude skipped this in sessions 35, 38, and 39. It jumped straight into whatever I asked and ignored the checklist entirely.

Claude codes without permission. I have a rule that says "NEVER write, edit, or modify code unless Mike says GO with a specific step or fix." Claude violated this in sessions 17, 19, 21, and 25. Each time it started coding before I approved the plan, creating complexity and wasted work.

Claude forgets rules after compaction. When the context window fills up and Claude Code auto-compacts, the CLAUDE.md gets re-read but the nuance is gone. Claude follows the broad rules but skips the specific ones buried deeper in the file.

Claude prioritizes helpfulness over compliance. If you ask a question, Claude wants to answer it immediately. The idea of stopping to run a checklist first feels counterproductive to the AI, so it skips it. Your rules say "do this first." Claude's instinct says "answer the human."

This is not a bug. It is a fundamental limitation of instruction-following. Rules in a text file are suggestions. The AI decides whether to follow them based on how important they seem relative to the current task. Your carefully written protocol loses to "the user asked me a question and I should answer it."

Why More Rules Don't Help

The instinct is to write more detailed rules. Add more context. Be more explicit. Bold the important parts. Add warnings about past violations.

I tried all of this. My CLAUDE.md grew from 50 lines to 200 to 500+. I added violation history ("Violated in sessions 17, 19, 21, and 25. Each violation caused complexity and wasted time. ZERO tolerance."). I added bold warnings. I added all-caps headers.

It helped marginally. Claude followed the rules more often. But "more often" is not "always." And for rules that matter, "usually" is not good enough. If "do not code without permission" works 90% of the time, the 10% where it fails costs hours of rework.

The problem is not clarity. Claude understands the rules. It just doesn't always prioritize them over the immediate task. No amount of rewriting fixes that.

The Fix: Hooks That Enforce Rules Automatically

Claude Code has a hook system that runs scripts at specific points in a session. These hooks fire automatically. Claude cannot skip them, ignore them, or decide they are not important enough to follow.

This changes rule enforcement from "please follow this" to "this runs whether you follow it or not."

Here is how I enforce my most important rules with hooks:

Rule: Always Run the Session Start Protocol

The rule in CLAUDE.md:

When a session starts, complete ALL steps and output the checklist table:
1. Search the brain
2. Read PROJECT_TRACKER.md
3. Review session-start hook output
4. Present unfinished items to user
5. Present next-session notes to user
6. Output verified checklist (every item must show DONE)

Why CLAUDE.md alone fails: Claude jumps straight into answering whatever the user types first. The checklist gets skipped because Claude prioritizes responsiveness over protocol.

The hook enforcement: The session-start.py hook fires automatically before Claude sees your first message. It reads NEXT_SESSION.md, loads the last session's notes, flags unfinished items, and injects everything into Claude's context via additionalContext. Claude receives the checklist requirements as part of its initial context, not as a rule it might skip. The hook also injects the verification checklist template so Claude knows exactly what format to output.

The combination of hook injection plus CLAUDE.md instruction makes it reliable. The hook provides the data. CLAUDE.md provides the instruction to present it. Neither works as well alone.

Rule: Search Memory Before Every Response

The rule in CLAUDE.md:

Search the brain AND the web BEFORE you do ANYTHING substantive.
Before proposing a plan, before debugging, before recommending tools.

Why CLAUDE.md alone fails: Claude wants to answer immediately. Searching first feels like a delay, so it skips the search and goes straight to a response based on what it already knows.

The hook enforcement: The user-prompt-submit.py hook fires on every single message before Claude processes it. It extracts keywords from your prompt, searches the full transcript database (69,000+ messages), and injects the top 5 relevant matches into Claude's context. Claude does not decide whether to search. The search happens automatically and the results appear in context alongside your message.

This means Claude always has relevant history available, even when it would not have thought to search on its own. Past decisions, prior discussions, rejected approaches, all surfaced automatically.

Rule: Do Not Code Without Permission

The rule in CLAUDE.md:

NEVER write, edit, or modify code unless Mike says "GO" 
with a specific step or fix.

Why CLAUDE.md alone fails: Claude sees a problem and wants to fix it. The impulse to be helpful overrides the instruction to wait. Four separate sessions violated this rule before I added enforcement.

The hook enforcement: The same user-prompt-submit.py hook that searches memory also checks for the GO pattern. If the conversation contains words like "thoughts?" or "what do you think?" without an explicit "GO," the hook injects a reminder: "STOP. Present the plan and wait for GO before coding." Claude receives this reminder in context on every message, making it much harder to forget.

Rule: Never Lose Context to Compaction

The rule in CLAUDE.md:

Before compaction, save all context to the database.
After compaction, restore relevant context.

Why CLAUDE.md alone fails: Auto-compaction fires without warning. Claude cannot follow a rule to "save before compaction" because it does not know compaction is about to happen. By the time compaction fires, it is too late to save anything.

The hook enforcement: pre-compact.py fires automatically before compaction starts and captures the full conversation to a local SQLite database. post-compact.py fires after compaction finishes and re-injects relevant context (project summary, recent decisions, last session notes). Claude does not need to follow a rule. The hooks handle it without Claude's involvement.

Rule: Stop and Reassess When Frustrated

The rule in CLAUDE.md:

If the user is frustrated, stop iterating on the current approach.
Search the brain for what was discussed and reassess.

Why CLAUDE.md alone fails: Claude does not reliably detect frustration, and even when it does, it tends to apologize and continue the same approach rather than stopping to reassess.

The hook enforcement: The user-prompt-submit.py hook detects frustration signals in your message (all caps, repeated punctuation, anger keywords). When triggered, it searches the brain for the current topic and injects a STOP directive plus relevant context from past sessions. Claude receives "STOP. Reassess. Here is what was discussed about this topic before." This breaks the loop of Claude continuing down a wrong path while the user gets increasingly frustrated.

The Pattern: If a Rule Matters, It Needs a Hook

After months of running this system across 1,300+ sessions, the pattern is clear:

Rules that work in CLAUDE.md alone: broad preferences, style guidelines, general instructions. Things where occasional non-compliance is acceptable. "Use lists and frameworks when starting a project." "Be precise and complete." These work fine as text because the cost of ignoring them once is low.

Rules that need hook enforcement: anything where a single violation costs significant time or loses data. Session protocols, memory search, coding permission, compaction protection, frustration detection. These cannot rely on Claude choosing to follow them. They must run automatically.

The progression I have seen repeatedly:

Write the rule in CLAUDE.md
Claude violates it
Add bold warnings and violation history
Claude violates it less often but still sometimes
Build a hook that enforces it automatically
Rule is never violated again

If you are at step 3 or 4 with any of your rules, skip straight to step 5. The hook is always the answer for rules that matter.

What This Looks Like in Practice

Six Python hooks enforce my rules across the full session lifecycle:

session-start.py      injects last session's notes, unfinished items,
                      verification checklist. Claude cannot start
                      without context.

user-prompt-submit.py searches memory on every prompt, checks for GO
                      permission, detects frustration, injects relevant
                      history. Runs before Claude sees your message.

stop.py               captures every response to the database, triggers
                      backup if older than 12 hours. Runs after every
                      Claude response.

session-end.py        triggers sync and backup when the session closes.

pre-compact.py        captures the full conversation to SQLite before
                      compaction fires. Automatic, no Claude involvement.

post-compact.py       re-injects relevant context after compaction.
                      Claude picks up where it left off.

None of these depend on Claude deciding to follow a rule. They run automatically at the right lifecycle point. Claude receives their output as injected context, not as an instruction it might skip.

The Simple Version (No Hooks Required)

If you are not ready to set up hooks, you can still improve CLAUDE.md compliance with these patterns:

Front-load your rules. Put the most important rules at the top of CLAUDE.md. Claude reads from top to bottom and gives more weight to what it sees first.

Use verification outputs. Instead of "follow the checklist," write "output this checklist table with DONE or MISSING for each row." Making Claude produce a visible output forces it to actually check each item. If a row shows MISSING, you catch it immediately.

Reference past violations. "Violated in sessions 17, 19, 21, 25. Each caused hours of rework." This makes the rule feel consequential, not optional.

Use NEXT_SESSION.md for session continuity. At the end of every session, have Claude write a summary and your forward instructions to a file. At the start of the next session, have Claude read it first. This does not require hooks. It just requires adding "read NEXT_SESSION.md before doing anything else" to your CLAUDE.md.

These help. But they do not guarantee compliance the way hooks do.

Real Numbers

This system has been running across 1,300+ sessions and 9 projects:

Session start protocol skipped: 3 times before hook enforcement, 0 times after
Coding without permission: 4 violations before enforcement, 0 after
Context lost to compaction: dozens of times before hooks, 0 after
Frustration loops (Claude repeating failed approaches): frequent before circuit breaker, rare after

The difference is not better rules. It is automated enforcement.

Try It

The CLAUDE.md patterns (front-loading, verification outputs, violation history, NEXT_SESSION.md) work immediately with no install.

For the full hook enforcement system with automated memory search, compaction protection, and frustration detection:

curl -fsSL https://raw.githubusercontent.com/mikeadolan/claude-brain/main/install.sh | bash

Free, open source, works on macOS, Linux, and Windows.

GitHub: github.com/mikeadolan/claude-brain
Video: claude-brain in 85 seconds
Article 1: How I Built Persistent Memory for Claude Code
Article 2: Why I Use Claude Code for Everything
Article 3: The Session Protocol That Fixed Claude Code's Amnesia Problem
Article 4: Claude Code Compaction Kept Destroying My Work. I Built Hooks That Fixed It.

Top comments (4)

Benjamin Eckstein • Apr 7

Hey Mike,

500 lines is the symptom, not the disease. I had the same problem — then a colleague asked one question that collapsed my whole architecture.

I'd built 18 specialized agents, each with its own instruction file and evolution history. The system worked until I realized I was doing exactly what you describe: loading everything into context whether it's needed or not, then trying to enforce what gets ignored. Wrote about it — "agents define the workflow, capabilities are stackable units."

Skills changed the enforcement question entirely. Instead of a "search memory before coding" rule that Claude might skip, the coding skill loads that behavior into its own context by design. The rule doesn't need enforcement because it isn't a rule — it's just what happens when the skill runs. Your hooks approach is still "rules plus enforcement." Skills are "no rules, just structure."

I think hooks are the right answer if you're stuck with a monolithic CLAUDE.md. But if you can restructure, the problem dissolves.

Mike Dolan • Apr 7

Good point. Skills and hooks solve different layers of the problem.

You're right that skills eliminate certain rules by design. If the coding skill always searches memory before coding, you don't need a rule saying "search memory before coding." The behavior is structural, not instructional. That's cleaner for task-level workflows.

Where hooks still win: lifecycle events. A PreCompact hook fires before compaction to capture the full conversation to a database. A user-prompt-submit hook searches your history on every message and injects matches. A session-start hook loads your last session's notes automatically. These aren't task behaviors that a skill can own. They're system-level events that fire regardless of what task you're working on.

The real answer is probably both. Skills for task-level structure (your coding skill always loads the right context for coding). Hooks for system-level enforcement (every session starts with context, every prompt gets memory search, every compaction is captured). One handles what Claude does. The other handles what happens around Claude.

The 500 lines is still real though. Even with skills, you need somewhere to define the rules that cross all skills: decision numbering, commit protocols, session handoffs. Those live in CLAUDE.md because they apply everywhere, not just inside one skill.

Tom Lemmons • Apr 9

Good thread. I hit the same wall running several agents on a production codebase and solved it with a different layer — behavioral guidelines pushed from an MCP memory server at session start, plus authority elevation via --append-system-prompt-file on launch. But after reading this, I think Benjamin is right that skills are a better task-level workflows, and Mike is right that hooks handle lifecycle events that skills can't. I was not using either.

All three approaches solve different problems: MCP for shared tools, memory, and data, skills for when/how to use them, hooks for lifecycle enforcement. I'm now thinking about adding skills to my MCP server so agents get the workflow knowledge progressively instead of all at session start. Thanks to both of you — this shifted my thinking.

Mike Dolan • Apr 9

That's the right stack. MCP for shared data and tools, skills for task-level workflow, hooks for lifecycle enforcement. Each layer handles what the others can't.

The progressive loading idea is smart. Instead of dumping everything at session start, let skills pull what they need when they need it. Keeps the context lean and the agent focused on the current task. Same principle as our user-prompt-submit hook: inject relevant history per message instead of loading the full history upfront.

Glad the thread was useful.