NongdyZ

Posted on Jun 27

How I Set Up Claude Code with 26 Production Subagents (CLAUDE.md, MCP, Hooks)

#ai #devtools #productivity #claudecode

When I first opened Claude Code on a real project, it felt like hiring a brilliant junior who had read every book but never shipped anything. It could write code fast, but it would forget my conventions halfway through a session, run a migration without asking, and "fix" a bug by deleting the failing test.

After a few weeks of tuning, it now behaves like a small senior team: it plans before it codes, reviews its own work, writes tests, and refuses to run anything destructive. The difference wasn't a better model. It was a better setup — a CLAUDE.md, a set of focused subagents, a couple of MCP servers, and two small hooks.

Here is exactly how that setup works, so you can build your own.

1. CLAUDE.md is the contract, not a wishlist

CLAUDE.md is the file Claude Code reads automatically at the start of every session. Most people treat it like a README. It works far better as a contract: short, specific, and written in always/never terms the agent can't misread.

The two rules that gave me the biggest jump in quality:

## Build & test (run these, do not guess)
- Install:  pnpm install
- Dev:      pnpm dev
- Test:     pnpm test -- --run
- Lint:     pnpm lint
- Typecheck: pnpm typecheck

## Always
- Run `pnpm typecheck && pnpm test -- --run` before saying a task is done.
- Use parameterized queries. Never build SQL with string concatenation.
- When you change a public function, update its tests in the same edit.

## Never
- Never run `git push`, `git reset --hard`, or any DB migration without me confirming.
- Never add a dependency to fix something a few lines of code can solve.
- Never disable a failing test to make the suite pass.

The reason the "Build & test" block matters: without the exact commands, the agent guesses them (npm test, yarn test, make test) and wastes turns failing. Give it the real commands once and it stops guessing.

Keep this file under ~50 lines. A long CLAUDE.md gets diluted in the context and the agent starts ignoring the middle of it. Be ruthless.

2. Subagents: split one generalist into many specialists

A single agent juggling "review this, also write tests, also refactor" produces mediocre output at every task. Subagents fix this by giving each job its own focused instructions and its own context window. In Claude Code these live in .claude/agents/*.md and you invoke them by name.

A subagent is just a Markdown file with frontmatter and a system prompt. Here's a real one — my code-reviewer:

---
name: code-reviewer
description: "Reviews a diff for bugs, security issues, and convention violations. Use after writing or changing code."
tools: Read, Grep, Bash
---

You are a senior code reviewer. Review ONLY the changed lines plus their immediate context.

Output exactly three sections:
1. **Blocking** — bugs, security holes, data loss risks. Each with file:line and a fix.
2. **Should fix** — correctness or clarity issues that aren't blocking.
3. **Nits** — style/naming. Keep this short.

Rules:
- If there are no blocking issues, say so on line one. Do not invent problems to look thorough.
- Check: injection, missing error handling, N+1 queries, unhandled null, race conditions.
- Quote the exact line you're flagging. No vague "consider improving error handling".

The two lines that make it useful are "Do not invent problems to look thorough" and "Quote the exact line." Without them, reviewers either rubber-stamp everything or generate a wall of generic advice. With them, you get actionable output.

I run a roster of focused specialists this way — a debugger that reproduces before it theorizes, a test-writer that targets edge cases instead of happy paths, a security-auditor that thinks in OWASP categories, an api-designer, a refactor-architect, and so on. Each one is a small file with a tight output format. The pattern is always the same: one job, one output format, one anti-fluff constraint.

3. MCP: give the agent eyes and memory

MCP (Model Context Protocol) servers are how Claude Code reaches outside the chat — the filesystem, a browser, a persistent memory store. You configure them in .mcp.json:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./"]
    },
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
    }
  }
}

The one I'd add first is sequential-thinking. It gives the agent a scratchpad to break a problem into steps before acting, which noticeably reduces the "confidently wrong" answers on multi-step tasks. A browser MCP (Playwright) is the second I reach for, because it lets the agent actually load the page and read the console instead of guessing why the UI broke.

One caution: MCP servers move fast and some get abandoned. Pin to maintained ones and check they still install before you rely on them.

4. Hooks: the safety net that runs without asking

Hooks are shell commands Claude Code runs on events you choose. Two are worth setting up on day one.

Auto-format on save — so you never review a diff full of whitespace noise:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{ "type": "command", "command": "pnpm prettier --write \"$CLAUDE_FILE_PATHS\"" }]
      }
    ]
  }
}

Block dangerous commands — a pre-execution guard that refuses destructive shell calls even if the agent tries them:

#!/usr/bin/env bash
# .claude/hooks/guard.sh — exit non-zero to block the command
if echo "$CLAUDE_TOOL_INPUT" | grep -qE 'rm -rf /|git reset --hard|DROP TABLE'; then
  echo "Blocked: destructive command refused by guard hook." >&2
  exit 1
fi

This is the difference between trusting an autonomous agent and babysitting it. The model can be wrong; the hook is deterministic.

Putting it together: a feature, start to finish

With the setup above, my actual workflow on a new feature is short:

Ask the agent to plan — it writes the steps before touching code.
It scaffolds and implements, formatting on save via the hook.
I invoke code-reviewer on the diff, then test-writer for the gaps it found.
pnpm typecheck && pnpm test runs because CLAUDE.md says it must.
It drafts the commit and PR description.

The agent does the typing. I do the deciding. That's the whole point.

Want the whole thing as a drop-in?

Building all of this from scratch is a real time sink — the 26 subagents alone took me weeks of tuning to get the output formats right. If you'd rather skip that and drop a finished setup into any repo, I packaged my exact configuration as Claude Code Agent OS: 26 specialized subagents, 12 slash commands (/plan, /review, /test, /ship...), 5 ready-to-edit CLAUDE.md templates, the MCP configs, and the safety + format hooks (bash and PowerShell) — with a START HERE quickstart that gets you running in about 5 minutes.

But you don't need it to get value from this post. Start with a tight CLAUDE.md and one code-reviewer subagent today; that pair alone will change how your next commit feels.

If you build your own subagent roster, I'd genuinely like to see which specialists you find most useful — drop them in the comments.

Top comments (1)

Edu Peralta • Jul 4

I run several coding agents at once for a review tool I'm building, and the thing that breaks first once you're past a handful of subagents isn't the orchestration, it's trusting the merged diff. Each subagent tends to pick its own conventions for naming and error handling, and the top level summary makes the result look coherent even when two of them solved the same edge case two different ways. A green test suite stopped being enough proof for me, I read the actual hunks before anything ships. Do your hooks catch that kind of drift, or are you still reading every diff by hand once the subagents merge?