brian austin

Posted on Apr 1

Claude Code subagents: how to run parallel tasks without hitting rate limits

#ai #programming #productivity #claude

Claude Code subagents: how to run parallel tasks without hitting rate limits

One of the least-documented features of Claude Code is its ability to spin up subagents — separate Claude instances that handle isolated tasks in parallel. If you've been running everything sequentially and wondering why your agent workflow feels slow, this is the missing piece.

What is a subagent in Claude Code?

When Claude Code encounters a task that can be broken into independent pieces, it can launch child processes that each have their own context window, their own tool access, and their own conversation thread. The parent agent coordinates; the subagents execute.

This is different from just having a long conversation. Each subagent starts fresh — clean context, no accumulated tokens from previous steps.

The rate limit problem

Here's the frustrating reality: Claude's API has rate limits based on tokens per minute. When you run a long sequential workflow, you often hit the ceiling — especially if you're processing multiple files, running test suites, or doing research that requires many tool calls.

Subagents help because:

Each subagent uses its own token budget
They can be staggered so they don't all fire simultaneously
Failed subagents don't kill the parent task

How to trigger parallel subagents

In your CLAUDE.md, you can hint at parallelism:

## Workflow preferences
- For tasks involving multiple independent files, use parallel subagents
- Each subagent should handle one file or one logical unit
- Report results back to parent before merging

For more direct control, structure your prompts explicitly:

Task: Refactor these 4 service files to use the new error handling pattern.

Run these as parallel subagents:
1. Subagent A: src/services/auth.js
2. Subagent B: src/services/billing.js  
3. Subagent C: src/services/email.js
4. Subagent D: src/services/webhook.js

Each subagent: read the file, apply the pattern, write the result.
Parent: wait for all 4, then run tests.

Real example: running a test suite in parallel

I have a project with 12 test files. Sequential testing takes ~8 minutes because Claude has to read each file, understand the context, run the tests, interpret results, and move to the next. With subagents:

Run the test suite using 4 parallel subagents:
- Subagent 1: tests/auth/*.test.js
- Subagent 2: tests/billing/*.test.js
- Subagent 3: tests/api/*.test.js
- Subagent 4: tests/utils/*.test.js

Each subagent reports: pass/fail count + any error messages.
Only surface failures to me — I don't need to see passing tests.

Time dropped from 8 minutes to under 3.

Controlling subagent scope

The biggest risk with subagents is scope creep — a subagent that's supposed to refactor one file starts touching dependencies, which touches other files, which creates conflicts with another subagent working in parallel.

Prevention via CLAUDE.md:

## Subagent rules
- Subagents must not modify files outside their assigned scope
- If a subagent needs to touch a shared file, it must report back first
- No subagent should run git commands
- No subagent should modify package.json

You can also use settings.json deny rules to enforce this at the tool level:

{
  "permissions": {
    "deny": [
      "Bash(git commit:*)",
      "Bash(git push:*)",
      "Write(package.json)",
      "Write(package-lock.json)"
    ]
  }
}

This means no subagent — regardless of instructions — can commit or modify your package files.

The cost equation

Each subagent uses tokens. More parallelism = more tokens = higher cost if you're on pay-per-token pricing.

This is where flat-rate API proxies become genuinely useful. I've been running Claude Code through SimplyLouie ($2/month flat) by setting:

export ANTHROPIC_BASE_URL=https://simplylouie.com/api/proxy

With per-token pricing, running 4 parallel subagents costs 4x what a sequential run costs. With flat-rate, it's the same monthly fee regardless. For heavy parallel workloads, this math matters.

Debugging when subagents go wrong

Subagents can fail silently. The parent might report success even if a subagent partially failed. Build explicit reporting into your workflow:

After each subagent completes, output a status block:

[SUBAGENT-A] status: SUCCESS|FAILURE
[SUBAGENT-A] files_modified: [list]
[SUBAGENT-A] issues: [any problems encountered]

Parent: aggregate all status blocks before proceeding.

This gives you a clear audit trail and makes it obvious when something needs manual review.

When NOT to use subagents

Subagents aren't always the right tool:

Order-dependent tasks: If step 2 requires the output of step 1, run sequentially
Small tasks: Spinning up a subagent for a 10-line change has overhead that's not worth it
Shared state: If all your tasks write to the same file or database, parallelism causes conflicts
Debugging sessions: When you're investigating a bug, you want to follow one thread of reasoning

Summary

Subagents are Claude Code's answer to the "I have too much work for one context window" problem. Use them for:

Processing multiple independent files
Running test suites
Research tasks (each subagent investigates one question)
Any workflow where tasks don't depend on each other

The rate limit avoidance is a side effect — the real value is keeping each subagent focused and context-clean, which produces better output than a single agent trying to hold everything in one window.

Running Claude Code with a flat-rate API? Set ANTHROPIC_BASE_URL to skip per-token billing. Details at simplylouie.com/developers

DEV Community

Claude Code subagents: how to run parallel tasks without hitting rate limits

Claude Code subagents: how to run parallel tasks without hitting rate limits

What is a subagent in Claude Code?

The rate limit problem

How to trigger parallel subagents

Real example: running a test suite in parallel

Controlling subagent scope

The cost equation

Debugging when subagents go wrong

When NOT to use subagents

Summary

Top comments (0)