I stopped thinking of Claude Code as an AI assistant about three months ago. Now I think of it as a team of five engineers I have to manage.
The moment I started treating it as a team, with actual roles and permissions and reporting lines, everything got better. I used to blow through my Claude 20x Max weekly limit every single week. Now I hover around 40%.
The default Claude Code experience has the same failure mode as a solo developer doing everything in one sitting. The planning is contaminated by the urge to start coding. The review is soft because the reviewer wrote the code. The documentation is an afterthought.
It's strange that we have to treat AI like a human team to get the most out of it, but we do.
Context is the bottleneck
LLMs have a finite context window. Every token of planning, implementation, debugging, testing, and review that happens in one session is competing for space in the same window. The longer the session runs, the more the model loses track of earlier decisions. This is a well-documented phenomenon called context rot. Research shows an average 39% performance drop in multi-turn conversations, with models making premature assumptions and failing to course-correct. When models take a wrong turn in a conversation, they get lost and don't recover.
Humans have the same problem. We call it context switching. A developer who just spent two hours deep in implementation can't instantly flip to doing a rigorous code review on their own work. The implementation context is still in their head, biasing every judgment. That's why teams exist. The reviewer walks in cold. They only see what's on the page. Fresh eyes, fresh context.
Subagents work the same way. Each one spins up in a fresh context window. The reviewer has never seen the implementation process. It gets the diff, the affected files, and nothing else. Its entire context budget goes toward reviewing, not remembering how the code got written. The architect doesn't carry implementation details because it runs before any code exists.
You're giving each task a clean slate of attention.
Building the team
Subagents
Claude Code supports custom subagents that you define and drop into your project. You don't need to write these from scratch. Just tell Claude Code what kind of agent you want, describe the role and constraints, and it'll generate the agent file for you. The agents live in .claude/agents/ for project scope (committed to the repo, shared with the team) or ~/.claude/agents/ for personal scope.
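On disk, the team from the next section ends up looking like this (the file names are my convention, not something Claude Code requires):

```text
.claude/agents/        # project scope, committed with the repo
  architect.md
  reviewer.md
  engineer.md
  debugger.md
  researcher.md
~/.claude/agents/      # personal scope, just for you
```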
A few design decisions that matter more than the prompt itself:
Tool restriction is the highest leverage knob. A reviewer agent with only Read, Grep, Glob cannot break anything. It can't edit a file. It can't run a destructive command. Give each agent the minimum tools it needs. Read-only agents are safe to run liberally. Write-capable agents need more supervision.
Model routing saves you from yourself. Not every task needs your most expensive model. I run Opus for the architect and reviewer because those are high leverage decisions. Sonnet for the debugger and engineer. Haiku for the researcher because it's just fetching and reading docs, not doing deep reasoning.
Project scope vs. user scope. Agents in .claude/agents/ get committed to the repo. Every engineer on the team gets them when they clone. Agents in ~/.claude/agents/ are personal.
The five
Architect
```yaml
model: opus
tools: Read, Grep, Glob
```
Produces plans, never code. Restates the goal, identifies affected files, lists concrete steps with decision points, flags risks. It pushes back on ambiguity. "Add validation" is a bad step. "Reject empty strings in parseInput() at line 42 and return a 400 with message 'name required'" is a good one. No write tools. If it can't edit files, it can't skip ahead. Pair it with your project's design patterns skill.
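For reference, here's roughly what that file looks like on disk, using Claude Code's agent format (markdown with YAML frontmatter). The prompt body is a sketch, not a drop-in:

```markdown
---
name: architect
description: Produces an implementation plan before any code is written. Use for any non-trivial feature or refactor.
tools: Read, Grep, Glob
model: opus
---

You are a software architect. You produce plans, never code.

For every request:
1. Restate the goal in one sentence. Push back if it's ambiguous.
2. Identify the affected files and why each is involved.
3. List concrete steps with decision points. Each step must name
   the file, the function, and the expected behavior.
4. Flag risks and open questions.

You have no write tools. Do not propose to edit files yourself.
```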
Reviewer
```yaml
model: opus
tools: Read, Grep, Glob, Bash
```
Never modifies code. Reads the diff plus the full affected files, then reports findings by severity: blockers, issues, nits, positives. Cites file and line number for every finding. If a category is empty, says so instead of padding with filler. This is where Opus earns its cost. A review that catches a race condition, a missing null check, or an auth bypass before production pays for itself. Load your testing and security conventions skill here.
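The heart of the reviewer is its output contract. Mine reads roughly like this; the wording is mine, adapt it:

```markdown
Report findings in four sections, ordered by severity:

- **Blockers**: must fix before merge (bugs, security holes, data loss)
- **Issues**: should fix (missing error handling, logic gaps)
- **Nits**: style, naming, minor cleanups
- **Positives**: things worth keeping and repeating

Cite file and line for every finding, e.g. `src/auth.ts:87`.
If a section has no findings, write "none". Do not pad.
Never modify code.
```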
Engineer
```yaml
model: sonnet
tools: Read, Grep, Glob, Bash, Edit, Write
```
Writes implementation code and tests. Follows existing conventions and patterns in the repo. Covers edge cases and failure paths, not just the happy path. Doesn't refactor code it wasn't asked to touch. This is the agent that benefits the most from framework-specific skills. More on that below.
Debugger
```yaml
model: sonnet
tools: Read, Grep, Glob, Bash, Edit
```
Reproduces, isolates, fixes. Critical instruction: keep asking "why" until you hit the root cause, not a symptom. Does not fix unrelated issues. Does not refactor while debugging. If the bug can't be reproduced, says so and stops.
Researcher
```yaml
model: haiku
tools: Read, WebSearch
```
The researcher doesn't write code. When Claude encounters an unfamiliar library, a framework API it's not confident about, or a pattern it might hallucinate on, the researcher goes and looks it up. It searches the web, reads the documentation, and brings back facts instead of guesses. Cheap to run, and it pays for itself every time it prevents a subtle API misuse that would've taken you an hour to debug.
Orchestrating with CLAUDE.md
To make these agents work as a team, you orchestrate them in your project's CLAUDE.md (or your user-level one at ~/.claude/CLAUDE.md).
Out of the box, Claude Code will try to do everything itself in the main session. You need to instruct it to delegate and run agents in parallel. Tell it to always use subagents for tasks, to run the architect first, then fan out engineer tasks in parallel, and always run the reviewer after implementation completes.
This changes Claude from sequential (grind through everything in one thread) to parallel (delegate and fan out). On large tasks touching 10+ files, parallel agents cut the total time in half or more. The main session becomes a coordinator, not a worker.
Your CLAUDE.md is also where you put all your project-related context. Conventions, patterns, architectural decisions, stack choices. Keep it short and opinionated.
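Here's a sketch of the orchestration rules from mine. The wording is loose on purpose; tune it to your project:

```markdown
## Orchestration

- Delegate all non-trivial work to subagents. The main session coordinates; it does not implement.
- Always run the architect first. No code before a plan exists.
- Fan out independent engineer tasks in parallel, one subagent per task.
- After implementation completes, always run the reviewer on the full diff.
- Send reviewer blockers back to an engineer. Send failing repros to the debugger.
- On unfamiliar libraries or APIs, dispatch the researcher before writing code.
```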
Skills
Skills are the other half of the system and most people skip them.
They're folders of instructions, scripts, and reference materials that Claude loads on demand. They live in .claude/skills/ and follow the Agent Skills open standard, which means they work across Claude Code, Codex, Gemini CLI, Cursor, and others.
The key design principle is progressive disclosure. At startup, Claude only sees each skill's name and description, roughly 30 to 50 tokens per skill. The full instructions only load when the agent decides the skill is relevant to the current task. You can have dozens of skills installed and pay zero context cost on unrelated tasks.
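Concretely, a skill is a folder with a SKILL.md whose frontmatter carries exactly that name and description. The example below is a made-up skill, not one from a real set:

```markdown
---
name: api-error-handling
description: Conventions for returning and logging errors from HTTP handlers. Use when writing or reviewing handler code.
---

# API error handling

- Wrap errors with context at every layer; never swallow them.
- Handlers return a structured body: code, message, details.
- 4xx for caller mistakes; 5xx only for genuine server faults.
```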
Why does this matter? Because there's real research backing it up. The SkillsBench paper tested 7 model configurations across 7,308 task trajectories and found that curated skills raise average pass rates by 16.2 percentage points. The headline finding: Claude Haiku with skills (27.7%) outperformed Claude Opus without skills (22.0%). A smaller, cheaper model with the right procedural knowledge beats a bigger, more expensive model running blind.
The ecosystem has exploded. There are now thousands of community skills you can install and start using immediately:
- samber/cc-skills-golang: A comprehensive Golang skill set covering idiomatic patterns, security, testing, and project structure. Supports company-level overrides so you can layer your team's conventions on top of community defaults.
- Flutter official skills: 43 skills from the Flutter team itself, covering layouts, state management, navigation, native interop, platform setup, and testing. When the framework maintainers write the skills, Claude gets the same guidance a Flutter team lead would give.
- Anthropic's document skills: The docx, pdf, pptx, and xlsx skills that power Claude's document editing capabilities. Open source and a great reference for writing complex skills with scripts and supporting files.
You don't have to build everything from scratch. Install community skills for your stack, and each of your agents will pick up and use the ones relevant to their task.
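Installation is usually nothing fancier than getting the skill folders into .claude/skills/, by cloning, copying, or whatever installer a skill set ships with. Folder names here are illustrative:

```text
.claude/skills/
  golang-idioms/
    SKILL.md
  golang-security/
    SKILL.md
    references/
```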
Putting it all together
Here's what the full workflow looks like in practice:
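In outline, with the sequencing coming straight from the CLAUDE.md rules above:

```text
you → main session (coordinator)
        ├─ architect  (opus, read-only)   produces the plan
        ├─ engineers  (sonnet, parallel)  implement plan steps + tests
        │    └─ researcher (haiku)        looks up unfamiliar APIs on demand
        ├─ reviewer   (opus, read-only)   reports findings by severity
        └─ debugger   (sonnet)            root-causes anything that fails
```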
Same design constraints as a human engineering team: role clarity, scoped authority, separation of concerns. Five agents, the right skills, and a CLAUDE.md that ties them together.
