There are at least five Claude Code starter kits already published. They range from 27 commands with 4 npm packages to 47 commands with 4 MCP servers. Every single one of them is over-engineered.
This isn't speculation. Arize AI's research on coding agent rules shows that optimized rulesets contain 20–50 rules and improve SWE-bench accuracy by 10–15% — without changing the model, architecture, or tools. HumanLayer's analysis is more pointed: as instruction count increases, instruction-following quality decreases uniformly. The model doesn't ignore your new instructions — it starts ignoring all of them. Claude Code's system prompt already consumes ~50 instructions of your ~150–200 instruction budget before your CLAUDE.md even loads.
The kit that works best says the least.
After building 42 skills across 13 repositories — Rails, React, Elixir, vanilla JS — I distilled everything down to 8 workflow rules, 8 skills, 2 agents, and 1 knowledge base. The base CLAUDE.md is 99 lines. With a stack appended, it's ~160. Here's what survived.
## Principles Over Rules
Every other kit I've seen includes rules like "functions under 20 lines" or "always use TypeScript interfaces." These are rules. They tell you what. They don't tell you when or why.
The 8 workflow rules encode principles instead:
### 1. Plan Mode Default
Enter plan mode for any task requiring 3+ steps.
For simple fixes (typo, one-line change), skip planning and just do it.
### 2. Subagent Strategy
Use subagents only for atomic, well-defined tasks.
Keep all reasoning and decision-making in the main session.
### 3. Self-Improvement Loop
When the user corrects you, update CLAUDE.md with the lesson.
Don't repeat the same mistake twice.
### 4. Verification Before Done
Never say "done" without proving it works:
- Code change → run tests
- Bug fix → reproduce before/after
- Refactor → full test suite passes
### 5. Demand Elegance (Balanced)
For non-trivial changes: pause, ask "is there a simpler way?"
For obvious fixes: just do it, don't overthink.
### 6. Autonomous Bug Fixing
Read the error, read the source, fix the root cause.
Don't ask — investigate and fix. Only ask if genuinely stuck.
### 7. Task Management
Plan → verify plan → track progress → explain decisions → document.
### 8. Core Principles
- Simplicity first — minimum complexity needed
- No laziness — never skip tests, never leave TODOs
- Minimal impact — change only what's necessary
Notice what's missing: no line limits, no naming conventions, no style preferences. Those belong in your linter config. These 8 rules shape behavior — how Claude thinks, plans, and self-corrects. The underlying principles (Clean Code, SRP, YAGNI, KISS) are taught through a code review knowledge base that explains when to apply each one and when to relax it.
DRY enforcement uses the Rule of Three: don't extract until you've seen three repetitions. YAGNI is enforced at four levels simultaneously — as a review lens, a new-feature gate, a test quality standard, and an anti-pattern catalog. One principle, four enforcement points, zero hardcoded opinions.
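The Rule of Three can be made concrete with a small sketch. The method names below are hypothetical, invented purely for illustration; the point is the timing of the extraction, not the formatting logic:

```ruby
# Two call sites duplicating the same formatting: under the Rule of
# Three, tolerate the repetition and leave them alone.
def cart_total_label(total)
  format("Cart total: $%.2f", total)
end

def invoice_total_label(total)
  format("Invoice total: $%.2f", total)
end

# A third repetition appears: only now extract the shared helper.
def money(amount)
  format("$%.2f", amount)
end

def receipt_total_label(total)
  "Receipt total: #{money(total)}"
end
```

Extracting at two occurrences often guesses the wrong abstraction; the third occurrence is what reveals which parts actually vary.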
Rules tell you what. Principles tell you when and why.
## Smell-Driven, Not Pattern-Driven
Most starter kits either ignore design patterns entirely or prescribe them upfront: "use the Repository pattern for data access," "all services should follow the Command pattern." Both approaches are wrong.
The /review-my-code skill uses an 8-lens review system. After applying all lenses, it checks whether any detected smells have a well-known pattern fix — but only when the smell is real:
| Smell You Found | Pattern to Suggest | When It's Worth It |
|---|---|---|
| Long case/if switching on type | Strategy | 3+ branches that will grow |
| Object creation scattered across conditionals | Factory | 2+ creation paths, likely to add more |
| Multiple objects reacting to state changes | Observer/Pub-Sub | 3+ listeners, or listeners will grow |
| Complex object with many optional params | Builder | 4+ optional params, or complex validation |
| Adding behavior without modifying class | Decorator | Composable features (logging, caching, auth) |
| Step-by-step process with varying steps | Template Method | Shared skeleton, varying details |
The rules that govern this table:
- Only suggest when the code already smells — never preemptively
- Prefer the framework's built-in pattern (Rails concerns, Phoenix behaviours) over rolling your own
- If the simple version works and won't grow, skip the pattern — three `if` branches is fine
That third rule is the key. Every pattern adds indirection. Indirection is only justified when the smell is real and the code is likely to grow. A three-branch if statement that handles all known cases doesn't need a Strategy pattern. It needs to be left alone.
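A hypothetical sketch of both sides of that trade-off, with invented shipping methods and rates. The `case` version handles all known cases and should be left alone; the table-of-lambdas version is one minimal Strategy-style response for when carriers keep being added:

```ruby
# Three branches, all known cases covered: this needs no pattern.
def shipping_cost(method, weight)
  case method
  when :standard  then 5.0
  when :express   then 12.0
  when :overnight then 25.0 + weight * 0.5
  end
end

# If the branch count keeps growing, the smell is real. A Strategy-style
# lookup lets new carriers slot in without touching any caller.
SHIPPING_STRATEGIES = {
  standard:  ->(weight) { 5.0 },
  express:   ->(weight) { 12.0 },
  overnight: ->(weight) { 25.0 + weight * 0.5 },
  drone:     ->(weight) { 40.0 + weight * 2.0 }
}.freeze

def shipping_cost_strategy(method, weight)
  SHIPPING_STRATEGIES.fetch(method).call(weight)
end
```

Note the indirection cost: the second version is harder to read at a glance, which is exactly why it's only worth paying once growth is likely.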
No existing starter kit encodes this smell-driven approach. Most either force patterns or ignore them entirely. Smells are the signal; patterns are the response.
## Safety as a First-Class Citizen
Of the five major starter kits I've reviewed, none include destructive operation guards. Zero. This is wild. An AI agent with rm -rf access and no guardrails is a P0 incident waiting to happen.
The kit's safety section is short and non-negotiable:
### Destructive Operations — ALWAYS Ask First
These actions require explicit user confirmation every time:
- **Database:** dropping tables, removing columns, changing column types
- **Schema:** present a rollback strategy before executing
- **Files:** deleting files, rm -rf, overwriting uncommitted changes
- **Secrets:** never stage or commit .env, credentials, or API tokens
- **Git:** force push, reset --hard, amending published commits
- **Dependencies:** never upgrade a major version without asking
- **External APIs:** warn before calls that could cost money or hit rate limits
The migration rollback strategy requirement deserves attention. Before executing any schema change that could destroy data, Claude must present a rollback plan: add the new column → migrate data → remove the old column. Not the other way around. This single rule prevents the most common category of production data loss in Rails applications.
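The safe ordering can be sketched as three separate Rails migrations. Everything here is hypothetical (the `users` table, the `name`-to-`full_name` rename); the shape is the standard expand-migrate-contract sequence, where rolling back at any step loses no data:

```ruby
# Step 1: expand. Purely additive, trivially reversible.
class AddFullNameToUsers < ActiveRecord::Migration[7.1]
  def change
    add_column :users, :full_name, :string
  end
end

# Step 2: migrate data. The old column still holds the source of truth,
# so rolling back this step has nothing to undo.
class BackfillFullName < ActiveRecord::Migration[7.1]
  def up
    execute "UPDATE users SET full_name = name WHERE full_name IS NULL"
  end

  def down
    # Intentionally empty: the old name column was never touched.
  end
end

# Step 3: contract. Run only after the backfill is verified in
# production. Passing the type keeps the removal reversible.
class RemoveNameFromUsers < ActiveRecord::Migration[7.1]
  def change
    remove_column :users, :name, :string
  end
end
```

The dangerous version collapses all three into one migration that drops the old column immediately, leaving no rollback path once it runs.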
The philosophy is simple arithmetic: the cost of asking is 5 seconds. The cost of not asking is a production incident. There is no scenario where skipping confirmation is worth the risk.
## Verification Loops
AI-generated documentation has a specific failure mode: confident statements that are subtly wrong. The model doesn't say "I think payments might use Sidekiq." It says "Payments are processed asynchronously via Sidekiq" — and if that's wrong, you won't catch it unless you check.
The /explain-system skill solves this with a claims table. Every factual claim from the analysis is extracted and verified against source code:
| # | Claim | Source File(s) | Confidence | Status |
|---|---|---|---|---|
| 1 | Uses PostgreSQL as primary DB | Gemfile:8, database.yml:3 | VERIFIED | Confirmed |
| 2 | Payments processed async via Sidekiq | app/jobs/payment_job.rb:1 | VERIFIED | Confirmed |
| 3 | Failed jobs retry 3 times | (Sidekiq default) | INFERRED | Needs confirmation |
| 4 | Chose Rails for rapid prototyping | (no code evidence) | UNCERTAIN | Ask user |
Three confidence levels:
- VERIFIED — directly observed in source code with file:line reference
- INFERRED — reasonable conclusion from evidence, not directly stated
- UNCERTAIN — cannot confirm from code alone
The gate: zero UNCERTAIN claims in the final document. This gate is not optional — even if the user asks to skip it. Unresolvable items get an explicit callout: "[Not confirmed from code]". INFERRED claims are allowed but tagged.
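The gate itself is simple enough to express in a few lines. This is an illustrative sketch, not code from the kit — the `Claim` struct and field names are invented:

```ruby
# A claim from the verification table: text, file:line evidence,
# and one of :verified, :inferred, :uncertain.
Claim = Struct.new(:text, :source, :confidence)

# The gate: the document ships only when zero claims are UNCERTAIN.
def gate_passes?(claims)
  claims.none? { |c| c.confidence == :uncertain }
end

claims = [
  Claim.new("Uses PostgreSQL as primary DB", "Gemfile:8",         :verified),
  Claim.new("Failed jobs retry 3 times",     "(Sidekiq default)", :inferred),
  Claim.new("Chose Rails for prototyping",   nil,                 :uncertain)
]

gate_passes?(claims)  # => false: resolve the UNCERTAIN claim or call it out
```

INFERRED claims pass the gate but keep their tag in the final table, so a reader knows which statements rest on convention rather than observed code.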
The /walkthrough skill takes a different approach to the same problem. Instead of a verification loop, it requires a mandatory file:line reference on every single step:
### Step 3: Controller — Validates input and delegates to service
**File:** `app/controllers/orders_controller.rb:42`
The controller validates strong params, then delegates to OrderService.
If validation fails, it returns a 422 with error details.
**Data in:** Raw params from request
**Data out:** Validated order attributes
No file:line, no step. This is the trace-based equivalent of verification — every claim is anchored to a specific location in the codebase. You can verify any step by opening the file.
Verification is cheaper than correction. A 30-claim table takes 10 minutes to verify. Finding a subtle error in published documentation takes hours — if it's found at all.
## The Kit
The complete configuration:
- 8 workflow rules that shape behavior, not style
- 8 base skills — QA, code review, testing, rule updates, context audit, onboarding, system explanation, walkthroughs
- 2 agents — test analyzer and codebase explorer (both read-only)
- 1 knowledge base — code review standards with 8 lenses, smell-to-pattern mapping, and severity definitions
- 4 stack configs — Rails, React, Elixir, static sites
Base CLAUDE.md: 99 lines. With a stack appended: ~160 lines.
Setup:
```bash
bash <(curl -s https://codeberg.org/gatleon/claude-starter-kit/raw/branch/main/setup.sh)
```
The setup script auto-detects your stack (Gemfile → Rails, mix.exs → Elixir, package.json + react → React), finds your test framework, and fills in the template. No manual configuration. No npm packages. No MCP servers.
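The detection order matters: a Rails app often has a package.json too, so the more specific markers must win. A hypothetical re-implementation of that logic in Ruby (the real script is shell; only the marker files are from the article):

```ruby
require "json"

# Detect the project's stack from marker files, most specific first.
def detect_stack(dir)
  return :rails  if File.exist?(File.join(dir, "Gemfile"))
  return :elixir if File.exist?(File.join(dir, "mix.exs"))

  pkg = File.join(dir, "package.json")
  if File.exist?(pkg)
    deps = JSON.parse(File.read(pkg))
               .values_at("dependencies", "devDependencies")
               .compact
    return :react if deps.any? { |d| d.key?("react") }
  end

  :static  # fall back to the static-site config
end
```

Checking Gemfile before package.json means a Rails app with a JS build pipeline still gets the Rails config.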
The best CLAUDE.md is the one you actually read. 160 lines gets read. 2,000 lines gets ignored — and the research confirms it.
Source code on Codeberg · Made with ❤️ by Anjan · Built with Claude Code