How parallel helpers can speed up your refactor — and drain your usage allowance twice as fast
✨ Introduction
Claude Code v1.0.60 introduced the new /agents command. With it you can spin up sub‑agents — lightweight Claude workers that run under one top‑level session. Each gets its own context window, system prompt, and tool permissions. In other words, you now have a mini‑fleet of AIs that can tackle parts of a job in parallel. (docs.anthropic.com)
I’m knee‑deep in a large‑scale refactor that touches a few hundred files, so it felt like the perfect test‑drive for sub‑agents. Here’s what I learned.
🧐 Sub Agents at a Glance
- 🔹 What they are – Pre‑configured AI personalities you delegate work to. Think of them as coding experts with a very focused remit. (docs.anthropic.com)
- 🔹 Separate context history – Each sub‑agent maintains its own conversation history, which never pollutes the parent’s or its siblings’.
- 🔹 True concurrency – Claude may schedule several sub‑agent jobs at once. If your prompts are I/O‑light, they finish in parallel and feel much faster than sequential runs.
- 🔹 Same token pool – All prompts and completions still count against your plan’s 5‑hour rolling limit. More speed therefore ⇒ more tokens burned per wall‑clock minute.
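Under the hood, a sub‑agent is a Markdown file with YAML front matter that Claude Code loads from `.claude/agents/` (project‑level) or `~/.claude/agents/` (user‑level). A minimal sketch for a per‑file refactor worker (the name, description, and tool list below are my own illustration, not from any real project) could look like:

```markdown
---
name: refactor-worker
description: Applies the project refactor rules to a single source file. Use for per-file refactor tasks.
tools: Read, Edit, Grep
---

You are a focused refactoring assistant. You receive exactly one file path.
Apply the rules in coding-style.md to that file only; never touch other files.
```

The `description` matters: it is what the parent session reads when deciding whether to delegate a task to this agent.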
🎯 Why Use Sub Agents? Key Benefits
| 🎁 Benefit | 💡 Why it matters |
|---|---|
| Clean context separation | No accidental "cross‑talk" between tasks. Each agent sees only what it needs, so your refactor instructions don’t collide with your test‑generation instructions. |
| Longer effective history | Because contexts don’t mingle, every agent can use the full 200k‑token window (Claude 3.5) for its own thread. |
| Caching between runs | The parent session can hold shared project docs (architecture.md, coding‑style.md). Each sub‑agent can reference that shared memory without re‑uploading it every time. |
| Failure isolation | If one agent hits a hallucination or error, the others keep going. You just retry the failed shard. |
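The "caching between runs" row depends on the parent session loading shared docs once. In Claude Code the natural home for that is the project’s CLAUDE.md memory file; a hypothetical minimal version (file names are illustrative):

```markdown
# CLAUDE.md (project memory, read once by the parent session)

- Architecture overview: architecture.md
- Follow coding-style.md for every edit
- Refactor scope: only files listed in all-files.txt
```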
🛠️ Real‑World Case — Batch‑Updating Hundreds of Files
For this large refactor, I loop through a list of more than 300 files. My prompt logic updates a finished‑files list, so when I hit the usage limit and the session breaks, I can rerun the same prompt file and it resumes from the first unfinished file. I now restart the refactor roughly every five hours, as soon as the limit resets.
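The resume bookkeeping can be sketched in a few lines of PowerShell (the file names here are hypothetical, not my actual setup):

```powershell
# Hypothetical resume bookkeeping: skip files already marked as finished
$done    = @(Get-Content .\finished-files.txt -ErrorAction SilentlyContinue)
$pending = Get-Content .\all-files.txt | Where-Object { $_ -notin $done }

foreach ($file in $pending) {
    # ...run the refactor prompt for $file here...

    # On success, record the file so a rerun resumes after it
    Add-Content .\finished-files.txt $file
}
```

The key property is idempotence: rerunning the same script after a session break picks up exactly where the last run stopped.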
⏳ Previous Attempt 1 — One Huge Session
- 🐌 Looping through files in one conversation quickly ballooned the context. After ~40 files Claude hit the 200k‑token window and terminated.
🔄 Previous Attempt 2 — PowerShell Outer Loop
```powershell
# Gather the files to refactor and the prompt template
# (adjust the path and filter to your own project)
$filesToProcess = Get-ChildItem -Path .\src -Recurse -File
$promptTemplate = Get-Content -Raw .\refactor-prompt.md

foreach ($file in $filesToProcess) {
    # Replace the {file-name} placeholder with the actual file path
    $customizedPrompt = $promptTemplate -replace '\{file-name\}', $file.FullName

    # Run Claude Code non-interactively with the customized prompt
    claude -p $customizedPrompt --dangerously-skip-permissions
}
```
- ⚙️ Each invocation got a fresh context, so the conversation history per call stayed minimal.
- ⚠️ Downside: every call had to resend project structure, style guide, and helper functions—zero cache, lots of latency.
⚡ New Approach — One Parent, Many Sub‑Agents
- 🚀 Create a single high‑level session that loops over all files and delegates each one to a sub‑agent.
- 📂 Isolated histories: each file runs in its own sub‑agent session, so histories never mix.
- ✅ Advantage: tasks stay isolated, sessions don’t collide, and you avoid the context‑length limit.
- 🤯 Unexpected discovery: without me asking, Claude Code spun up five sub‑agent sessions in parallel. That sped up processing—but also burned tokens much faster.
- ⏱️ On the Pro plan I hit the usage limit in about 15 minutes. Sequential processing took roughly 30 minutes.
- 💸 Even with the $100 Max plan (5× tokens), I’d still exhaust the window in around 1 hour 15 minutes.
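Putting it together, the parent session’s prompt might read roughly like this (my own sketch with hypothetical file names; your delegation wording will differ):

```markdown
For each file in all-files.txt that is not yet in finished-files.txt:
1. Hand the file to a refactor sub-agent as an independent task.
2. When the sub-agent finishes, append the file name to finished-files.txt.
Process files independently; do not share per-file history between tasks.
```

Note that nothing in this prompt asks for parallelism; Claude Code decided on its own to run five sub‑agents concurrently.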
💸 Token Economics: Hidden Costs of Parallelism
- ⚖️ Parallelism ≠ free – The per‑file token cost is unchanged, but you pay for each session simultaneously. Your quota drains proportionally to concurrency.
- 📊 Plan sizing – A larger quota only buys linear runway: on the $100 Max plan (≈ 5× Pro tokens), the same job would still exhaust the window in about 75 minutes.
🏁 Conclusion
Sub‑agents are a genuine force multiplier. For large‑scale, slice‑able tasks they feel like hiring a small AI team for the price of one. Just remember: speed kills (your quota).
If you’re a heavy user, set up guardrails before unleashing a parallel swarm — or be ready to smash that Upgrade button.
Happy token‑burning! 🔥🔥🔥