Claude Code quietly shipped agent teams as an experimental feature. The idea is simple: instead of one AI agent working through your task sequentially, you spin up multiple agents that work in parallel, communicate with each other, and coordinate through a shared task list.
I've been running team sessions for the past week. Some went well. Some burned through tokens and produced nothing useful. Here's what I figured out.
You need to enable it manually
Agent teams are off by default. Add this to your settings:
```json
// .claude/settings.json or ~/.claude/settings.json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```
Or export it in your shell: `export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1`
Subagents and agent teams are different things
This tripped me up at first and I've seen the same confusion everywhere. Claude Code has two parallelism mechanisms, and they work nothing alike.
Subagents use the Task tool. You spawn a focused worker, it does a thing, and returns results to the parent. Workers can't talk to each other. The parent manages everything. Token cost is lower because results get summarized back into the parent's context.
Agent teams are fully independent Claude Code sessions. Each teammate has its own context window, loads project context (CLAUDE.md, MCP servers) automatically, and can message other teammates directly. They coordinate through a shared task list with file-locked claiming and dependency tracking.
The practical difference: subagents are like sending someone to fetch an answer. Agent teams are like putting three people in a room to work on a problem together.
Here's when each one makes sense:
| Use subagents when... | Use agent teams when... |
|---|---|
| You need quick, focused results | Teammates need to discuss and challenge each other |
| Tasks are independent — no cross-talk needed | Work requires coordination across multiple layers |
| You want lower token costs | Parallel exploration adds real value |
| The parent can manage all coordination | Tasks have clear file boundaries but need collaboration |
I'd estimate 70-80% of tasks that feel like they need a team actually work fine with subagents. The Task tool with `subagent_type` set to `Explore`, `Plan`, or `general-purpose` handles most research and implementation tasks without the coordination overhead.
The cost math
Each teammate is a full Claude session with its own context window. A 3-teammate team uses roughly 3-4x the tokens of sequential work.
That's not a rounding error. For a task that takes 20 minutes sequentially, spinning up a team to finish it in 8 minutes costs 3-4x more. Whether that trade-off makes sense depends on how much your time is worth relative to API spend.
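That trade-off can be put in rough numbers. Here's a minimal sketch of the break-even calculation; the token multiplier, costs, and hourly rate are illustrative assumptions, not measured values:

```python
def team_breakeven(seq_minutes, team_minutes, seq_cost_usd,
                   token_multiplier=3.5, hourly_rate_usd=100):
    """Rough sketch: is a team's extra API spend worth the time saved?

    All parameters are illustrative assumptions -- plug in your own numbers.
    """
    team_cost = seq_cost_usd * token_multiplier      # teams use ~3-4x the tokens
    extra_spend = team_cost - seq_cost_usd           # added API cost
    time_saved_hours = (seq_minutes - team_minutes) / 60
    time_value = time_saved_hours * hourly_rate_usd  # value of your saved time
    return time_value - extra_spend                  # > 0 means the team pays off

# The article's example: 20 minutes sequentially vs 8 minutes with a team.
# Assuming $2 of sequential API spend, the team costs ~$7 but saves 12 minutes.
print(team_breakeven(20, 8, 2.0))   # positive -> team pays off here
print(team_breakeven(10, 9, 1.0))   # quick fix: negative -> not worth it
```

Running it both ways makes the point: the same multiplier that's fine for a 20-minute feature kills the economics of a 10-minute bug fix.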
The tasks where teams clearly pay off:
- Multi-file features with clean boundaries (frontend + backend + tests)
- Large refactors where multiple directories can be worked on simultaneously
- Adversarial review patterns (one agent implements, another tries to break it)
The tasks where they don't:
- Anything touching fewer than 4-5 files
- Sequential work where step 2 depends on step 1's output
- Quick bug fixes (the coordination overhead exceeds the time saved)
File conflicts are the #1 failure mode
Two teammates editing the same file doesn't produce a merge conflict. It produces data loss. One agent's changes silently overwrite the other's.
This was my most expensive lesson. Before any parallel coding, you need to map out exactly which agent owns which files. No shared files. If two tasks need to touch the same file, they run sequentially, not in parallel.
The pattern that works: decompose your task into pieces where each piece owns a distinct set of files. If you can't draw clean boundaries, the task isn't a good candidate for a team.
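The ownership map is worth checking mechanically before you spawn anything. A minimal pre-flight sketch (the agent names and file lists below are made-up examples):

```python
# Pre-flight check: every file must be owned by exactly one agent.
# Agent names and file paths are illustrative, not a Claude Code format.
ownership = {
    "agent_frontend": {"src/components/Login.tsx", "src/hooks/useAuth.ts"},
    "agent_backend":  {"src/api/auth.ts", "src/middleware/session.ts"},
    "agent_tests":    {"tests/auth.test.ts", "src/api/auth.ts"},  # overlap!
}

def find_overlaps(ownership):
    """Return pairs of agents that claim the same file.

    Any pair reported here must run sequentially, not in parallel.
    """
    agents = list(ownership)
    conflicts = []
    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            shared = ownership[a] & ownership[b]
            if shared:
                conflicts.append((a, b, sorted(shared)))
    return conflicts

for a, b, files in find_overlaps(ownership):
    print(f"CONFLICT: {a} and {b} both claim {files}")
```

If this prints anything, either reassign the shared files to one agent or serialize the two tasks.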
Teammates don't know what you know
This is counterintuitive. Your main Claude Code session (the "lead") has your full conversation history — the bug you described, the architecture you discussed, the constraints you mentioned. When you spawn a teammate, they start with a blank conversation. They load CLAUDE.md and MCP servers, but they don't inherit any of your context.
This means your spawn prompts need to be detailed. Not "implement the auth module" but "implement an auth module using JWT tokens stored in httpOnly cookies, following the patterns in src/middleware/auth.ts, compatible with the Express 5 router setup in src/app.ts, with unit tests in tests/auth.test.ts."
The more context you front-load into the spawn prompt, the less the teammate drifts. I started writing spawn prompts that were 200-300 words and the quality of output went up immediately.
The workflow that actually produces results
After several failed attempts, here's the sequence I settled on:
1. Decide if you need a team at all.
Ask yourself: can I decompose this into 3+ parallel tasks with zero file overlap? If no, use sequential work or subagents.
2. Decompose into agent-sized pieces.
Each piece needs: a clear goal, explicit file ownership, a definition of done, and an integration contract (how this piece connects to the others). Vague task descriptions produce vague results.
3. Map file boundaries.
Write it down. Agent A owns these files. Agent B owns these files. No overlap. If you skip this step, you will lose work.
4. Plan first, then swarm.
Spawn one agent in plan mode to design the architecture. Review and approve the plan. Then spawn parallel agents to implement against the approved plan. This prevents agents from making conflicting architectural decisions.
5. Run a retrospective.
After the team finishes, look at what each agent actually did, how many tokens each consumed, and where time was wasted. The cost breakdown alone will change how you structure the next run.
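The decomposition in steps 2 and 3 is easier to enforce if each piece is written down as a structured spec before spawning. A minimal sketch; the field names and example tasks are my own, not any official Claude Code schema:

```python
from dataclasses import dataclass, field

@dataclass
class AgentTask:
    """One agent-sized piece of work. Fields are illustrative, not official."""
    goal: str                      # what success looks like, in one sentence
    owns: set[str]                 # files this agent alone may edit
    definition_of_done: list[str]  # checkable completion criteria
    contract: str                  # how this piece connects to the others
    depends_on: list[str] = field(default_factory=list)

backend = AgentTask(
    goal="Implement JWT auth endpoints",
    owns={"src/api/auth.ts", "src/middleware/session.ts"},
    definition_of_done=["POST /login sets an httpOnly cookie",
                        "existing tests still pass"],
    contract="Exports authMiddleware consumed by the frontend task",
)
tests = AgentTask(
    goal="Unit tests for the auth module",
    owns={"tests/auth.test.ts"},
    definition_of_done=["covers token expiry and refresh paths"],
    contract="Tests only the public API of src/api/auth.ts",
    depends_on=["backend"],        # sequential edge made explicit
)

# The two checks that matter: no shared files, dependencies written down.
assert not (backend.owns & tests.owns)
```

Writing specs like this takes a few minutes and doubles as the raw material for your spawn prompts.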
Things that will bite you
A few gotchas that aren't in the docs:
No session resumption. If your terminal session drops, `/resume` and `/rewind` don't restore teammates. The team is gone. This matters for long-running tasks — save checkpoints or break work into shorter team sessions.
Tasks get stuck. Teammates sometimes forget to mark tasks as completed, which blocks any dependent tasks downstream. If something looks frozen, check the task list manually.
Shutdown is slow. Teammates finish their current API request before stopping. If an agent is mid-generation on a long response, you'll wait.
Drift happens. Letting a team run unattended for 30+ minutes increases the chance of wasted work. Agents can go down rabbit holes, misinterpret requirements, or produce code that conflicts with another agent's approach. Check in periodically.
One team per session. Clean up the current team (shut down all teammates, then delete the team) before starting a new one.
Start with non-code tasks
If you haven't used agent teams before, don't start with parallel implementation. Start with tasks where parallel exploration is valuable but file conflicts aren't a risk:
- Multi-angle PR review: one agent checks logic correctness, another checks test coverage, a third checks security and error handling
- Competing hypotheses debugging: agent A investigates "this is a race condition," agent B investigates "this is a stale cache," they report findings and you decide
- Research from multiple angles: one agent reads the library docs, another searches for known issues on GitHub, a third looks at alternative libraries
These tasks let you learn how messaging, task claiming, and coordination work without the stress of parallel file edits.
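To make the first example concrete, here's one way to phrase the spawn prompts for a three-lane PR review. The wording and the PR identifier are hypothetical; the point is that each lane gets an explicit scope and a no-edit rule, so there's no risk of parallel writes:

```python
PR = "#1234"  # hypothetical PR identifier -- use your own

# Illustrative spawn prompts for a three-lane PR review team.
# Each teammate reviews one dimension and edits nothing.
review_lanes = {
    "logic": (
        f"Review {PR} for logic correctness only. Trace each changed "
        "function, check edge cases and off-by-one errors, and report "
        "findings as a bullet list. Do not edit any files."
    ),
    "tests": (
        f"Review {PR} for test coverage only. List changed code paths "
        "that lack tests and suggest specific test cases. Do not edit "
        "any files."
    ),
    "security": (
        f"Review {PR} for security and error handling only. Look for "
        "unvalidated input, leaked secrets, and swallowed exceptions. "
        "Do not edit any files."
    ),
}

for lane, prompt in review_lanes.items():
    print(f"--- {lane} ---\n{prompt}\n")
```

The same shape works for the debugging and research examples: one prompt per hypothesis or per source, each with a hard scope boundary.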
A 19-prompt playbook
After running enough team sessions to see the patterns, I built a set of reusable prompts covering the full lifecycle:
- Decide (prompts 1-5): Should I use a team? How do I decompose the task? What model for each agent? How do I map file boundaries?
- Orchestrate (prompts 6-12): Plan-then-swarm coordination, delegate mode, adversarial code review, parallel TDD, handoff reports, real-time steering
- Quality (prompts 13-15): Hooks that auto-validate work before an agent goes idle or marks a task complete, plan approval checklists
- Recover (prompts 16-19): Unsticking looping agents, optimizing token costs, retrospectives, anti-pattern detection
The full playbook is free: Claude Code Agent Teams Playbook on NerdyChefs.ai
The honest take
Agent teams are a power tool, not a default workflow. The coordination overhead — context duplication, file boundary management, spawn prompt writing, monitoring for drift — means they only pay off for genuinely parallel work with clean boundaries.
For a solo developer working on a focused feature, sequential Claude Code with occasional subagent calls is cheaper and more predictable. For a large cross-layer feature where three agents can work on frontend, backend, and tests simultaneously without touching each other's files, teams save real time.
The key is being honest about which category your task falls into before you spawn anything.
What patterns have you found with agent teams? I'm particularly interested in hearing about non-obvious use cases where the parallel coordination actually paid off.