Hassann

Originally published at apidog.com

Claude Code Dynamic Workflows: Running Hundreds of Parallel Subagents with Opus 4.8

Claude Opus 4.8 shipped with a major Claude Code capability: Dynamic Workflows. In one session, an orchestrating agent can launch many parallel subagents for large, branching tasks like refactoring across dozens of files, running broad test matrices, or comparing multiple implementation paths.

This guide breaks down how Dynamic Workflows work, when to use them, and how to implement the same orchestration pattern through the raw API. For model context, see what is Claude Opus 4.8. For agent architecture background, read the Claude Code agent harness breakdown.

What Dynamic Workflows actually are

In Claude Code, Dynamic Workflows appear as ultracode in the effort menu. The important detail: ultracode is not a new API effort level.

It combines two Opus 4.8 capabilities:

  1. xhigh effort
  2. Mid-conversation system messages

Together, these give an orchestrator agent enough reasoning budget to plan a large job and the ability to gain permission mid-run to launch worker agents. The rest is Claude Code orchestration logic.

Ingredient 1: xhigh effort

The effort parameter controls how many tokens Opus 4.8 can spend across a response, including tool calls.

For Dynamic Workflows, xhigh matters because the orchestrator needs to:

  • Break a large task into independent units
  • Decide which units can run in parallel
  • Dispatch workers
  • Merge worker results back into one coherent output

Lower effort levels generally scope work down and make fewer tool calls, which is not ideal for orchestration-heavy workflows.

A practical starting point:

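import anthropic

# Reads ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

orchestrator = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,  # room for planning and coordination
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Plan a refactor of the auth module across all 14 services."
        }
    ],
)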

Use a large max_tokens value so the orchestrator has enough room to plan and coordinate. 64K is a reasonable starting point for large coding tasks.

Ingredient 2: mid-conversation system messages

Mid-conversation system messages let you insert a system entry partway through the messages array instead of only at the beginning of the conversation.

That means an agent can receive new instructions, constraints, or permissions after the task has started.

For Dynamic Workflows, this is what allows the orchestrator to gain standing permission to launch multi-agent workflows during a run.

Anthropic documents the mechanism here: mid-conversation system messages.

Conceptually, the flow looks like this:

messages = [
    {
        "role": "user",
        "content": "Analyze the repository and identify independent refactor tasks."
    },
    {
        "role": "assistant",
        "content": "I found several independent areas: auth, billing, logging, and tests."
    },
    {
        "role": "system",
        "content": "You may now dispatch scoped worker agents in parallel. Each worker must receive only the files and instructions relevant to its assigned task."
    },
    {
        "role": "user",
        "content": "Dispatch workers and merge their results into one implementation plan."
    }
]

The key implementation idea is simple: the orchestrator can gain new operational permissions based on what it discovers.
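A minimal sketch of that idea, assuming the message shape from the conceptual example above (a `{"role": "system", ...}` entry inside `messages`). The helper name `grant_dispatch_permission` and the permission wording are hypothetical; check Anthropic's mid-conversation system message documentation for the exact accepted format.

```python
from typing import Dict, List

Message = Dict[str, str]


def grant_dispatch_permission(messages: List[Message], work_units: List[str]) -> List[Message]:
    """Return a new message list with a mid-conversation system entry appended.

    Assumption: the API accepts a {"role": "system"} entry inside `messages`,
    as in the conceptual example above. Nothing is appended when fewer than
    two units were discovered, because there is nothing to parallelize.
    """
    if len(work_units) < 2:
        return list(messages)

    permission = (
        "You may now dispatch scoped worker agents in parallel for these units: "
        + ", ".join(work_units)
        + ". Each worker must receive only the files and instructions relevant to its unit."
    )
    # Build a new list so the caller's history is left unchanged.
    return [*messages, {"role": "system", "content": permission}]


history = [
    {"role": "user", "content": "Analyze the repository and identify independent refactor tasks."},
    {"role": "assistant", "content": "I found several independent areas: auth, billing, logging, and tests."},
]
updated = grant_dispatch_permission(history, ["auth", "billing", "logging", "tests"])
print(updated[-1]["role"])
```

The permission is granted only after the discovery step reports more than one unit, which mirrors "new operational permissions based on what it discovers".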

Turning it on in Claude Code

In Claude Code, Dynamic Workflows are available through the ultracode option in the effort menu.

Selecting it does two things:

  • Sets effort to xhigh
  • Allows the session to spawn parallel subagents through mid-conversation system messages

After that, give Claude Code a task that is actually wide enough to parallelize.

Good prompt shape:

Refactor the auth module across all services.

1. Identify independent service-level changes.
2. Launch parallel workers where tasks do not overlap.
3. Run or update tests for each service.
4. Merge the results into one final summary with changed files and remaining risks.

Claude Code then handles the orchestration loop:

  • Plans the task
  • Splits it into worker-sized units
  • Runs workers in parallel
  • Streams results back
  • Merges the results in the main session

If you are configuring Claude Code with a plan, see the Claude Agent SDK with Claude plan setup guide.

When to use Dynamic Workflows

Use Dynamic Workflows when the task is wide and parallelizable.

Good fits:

  • Refactoring the same pattern across many files
  • Running a large test matrix
  • Exploring multiple implementation approaches in parallel
  • Auditing a large codebase by module or package
  • Splitting independent migration work across services

Example:

Analyze all packages in this monorepo for deprecated API usage.

Assign one worker per package.
Each worker should:
- Find deprecated calls
- Propose replacements
- Identify tests that need updates
- Return a concise patch plan

Merge all worker outputs into a single migration checklist.

When not to use Dynamic Workflows

Avoid Dynamic Workflows for narrow or sequential tasks.

Poor fits:

  • A one-file bug fix
  • A task where each step depends on the previous step
  • Small changes that fit in one normal model response
  • Work where parallel edits are likely to conflict heavily

Spawning hundreds of subagents for a small task burns tokens without improving the result. The cost is real: a few hundred xhigh workers can quickly consume millions of tokens.

Use the pattern only when parallelism matches the shape of the work.
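One rough way to check the shape of the work before fanning out is to group tasks into "waves" by their dependencies. This is an illustrative sketch, not part of Claude Code: the function names and the example task names are hypothetical, and it relies only on Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parallel_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; every task in a wave has all its dependencies done.

    `dependencies` maps each task to the set of tasks it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def max_parallelism(dependencies: Dict[str, Set[str]]) -> int:
    """Largest number of tasks that could run at the same time."""
    return max((len(wave) for wave in parallel_waves(dependencies)), default=0)


# Wide job: three independent services, then one summary step.
wide = {
    "auth-api": set(),
    "billing-api": set(),
    "search-api": set(),
    "summary": {"auth-api", "billing-api", "search-api"},
}
# Sequential job: each step depends on the previous one.
chain = {"migrate": set(), "backfill": {"migrate"}, "verify": {"backfill"}}

print(max_parallelism(wide))   # 3
print(max_parallelism(chain))  # 1
```

A maximum wave width of 1 means the job is strictly sequential, so parallel workers would add cost without adding speed.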

Building the same orchestration pattern through the API

You do not need Claude Code to build this pattern. The same ingredients are available through the Messages API.

Anthropic also provides a worked example here: build an orchestration mode.

The implementation shape:

  1. Run an orchestrator call with xhigh effort
  2. Ask it to produce a task breakdown
  3. Use a mid-conversation system message to grant dispatch permissions
  4. Fan out worker calls in parallel
  5. Feed worker results back to the orchestrator
  6. Ask the orchestrator to merge the final result

Step 1: create the orchestrator request

import anthropic

client = anthropic.Anthropic()

orchestrator = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": """
Plan a refactor of the auth module across all 14 services.

Return:
- independent work units
- files or services assigned to each unit
- expected worker output format
- merge strategy
"""
        }
    ],
)

Step 2: define worker scope

Each worker should receive a narrow task. Do not send the entire repo context unless the worker needs it.

Example worker prompt:

worker_prompt = """
You are Worker 3.

Scope:
- Service: billing-api
- Files: src/auth/*, tests/auth/*
- Task: update usage of the shared auth middleware based on the new interface

Return:
- changed files
- summary of edits
- tests updated or recommended
- risks or unresolved questions
"""
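Writing each worker prompt by hand does not scale past a few workers. As a sketch, the same prompt can be generated from structured work units. The `WorkUnit` class, the `build_worker_prompt` helper, and the `search-api` unit are hypothetical names introduced for illustration; the prompt text follows the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkUnit:
    service: str
    files: List[str]
    task: str


def build_worker_prompt(index: int, unit: WorkUnit) -> str:
    """Render one narrowly scoped worker prompt in the format shown above."""
    return (
        f"You are Worker {index}.\n\n"
        "Scope:\n"
        f"- Service: {unit.service}\n"
        f"- Files: {', '.join(unit.files)}\n"
        f"- Task: {unit.task}\n\n"
        "Return:\n"
        "- changed files\n"
        "- summary of edits\n"
        "- tests updated or recommended\n"
        "- risks or unresolved questions\n"
    )


task = "update usage of the shared auth middleware based on the new interface"
units = [
    WorkUnit("billing-api", ["src/auth/*", "tests/auth/*"], task),
    WorkUnit("search-api", ["src/auth/*", "tests/auth/*"], task),  # hypothetical second unit
]
worker_prompts = [build_worker_prompt(i, unit) for i, unit in enumerate(units, start=1)]
print(worker_prompts[0])
```

In practice the list of units would come from the orchestrator's task breakdown in Step 1 rather than being hard-coded.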

Step 3: run workers concurrently

Workers are separate Messages API calls. Since each task is narrower, you can often run workers at a lower effort level than the orchestrator.

from concurrent.futures import ThreadPoolExecutor

def run_worker(prompt: str):
    return client.messages.create(
        model="claude-opus-4-8",
        max_tokens=16000,
        output_config={"effort": "medium"},
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
    )

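# worker_prompt_1..3 are placeholders for scoped prompts built like the Step 2 example.
worker_prompts = [
    worker_prompt_1,
    worker_prompt_2,
    worker_prompt_3,
]

# executor.map returns results in input order, so worker_results[i]
# corresponds to worker_prompts[i].
with ThreadPoolExecutor(max_workers=3) as executor:
    worker_results = list(executor.map(run_worker, worker_prompts))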
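With `executor.map`, a single failed worker raises when its result is read, and the remaining results are lost to the caller. For larger fan-outs, one option is to collect successes and failures separately. The sketch below is an assumption about how you might structure this, not something the article's source prescribes: `fan_out` is a hypothetical helper, and `fake_worker` stands in for a real `run_worker` call so the example runs without API access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def fan_out(
    prompts: List[str],
    worker: Callable[[str], str],
    max_workers: int = 8,
) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Run `worker` on every prompt concurrently.

    Returns (results, errors), both keyed by the prompt's index, so one
    failing worker does not discard the output of the others.
    """
    results: Dict[int, str] = {}
    errors: Dict[int, str] = {}
    if not prompts:
        return results, errors

    with ThreadPoolExecutor(max_workers=min(max_workers, len(prompts))) as executor:
        futures = {executor.submit(worker, p): i for i, p in enumerate(prompts)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[index] = f"{type(exc).__name__}: {exc}"
    return results, errors


def fake_worker(prompt: str) -> str:
    """Stand-in for a real API call; fails when the prompt contains 'fail'."""
    if "fail" in prompt:
        raise RuntimeError("simulated API error")
    return f"done: {prompt}"


results, errors = fan_out(["unit-a", "unit-b fail", "unit-c"], fake_worker)
print(results)
print(errors)
```

Failed indices can then be retried or reported to the merge step as gaps instead of aborting the whole run.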

Step 4: merge worker results

Feed the worker outputs back to the orchestrator and ask it to reconcile conflicts, deduplicate findings, and produce the final implementation plan.

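# Keep only the text blocks from each worker response.
# str(worker_results) would embed raw Message object reprs in the prompt.
worker_summaries = "\n\n".join(
    f"Worker {i} result:\n"
    + "".join(block.text for block in result.content if block.type == "text")
    for i, result in enumerate(worker_results, start=1)
)

merge_response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Merge these worker results into one final refactor plan.\n\n"
            + worker_summaries
        }
    ],
)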

If you are comparing this approach with Anthropic-hosted agent infrastructure, the managed agents vs Agent SDK guide explains the trade-offs.

Cost and control

Parallel subagents multiply token usage quickly.

A workflow with 200 workers, each spending tens of thousands of tokens, can become expensive fast. Control the blast radius before you run it.
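A back-of-the-envelope estimate makes the scale concrete. The figures below are illustrative assumptions, not measurements: 30,000 tokens stands in for "tens of thousands" per worker, and 64,000 is the orchestrator's max_tokens cap used earlier in this guide. The function name is hypothetical.

```python
def estimate_workflow_tokens(
    workers: int,
    tokens_per_worker: int,
    orchestrator_tokens: int = 0,
) -> int:
    """Rough upper-level token estimate for one fan-out run."""
    if workers < 0 or tokens_per_worker < 0 or orchestrator_tokens < 0:
        raise ValueError("token and worker counts must be non-negative")
    return workers * tokens_per_worker + orchestrator_tokens


# 200 workers at an assumed 30,000 tokens each, plus a 64,000-token orchestrator budget.
total = estimate_workflow_tokens(200, 30_000, orchestrator_tokens=64_000)
print(f"{total:,} tokens")  # 6,064,000 tokens
```

Multiply the total by your current per-token rates to turn it into a budget figure before launching the run.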

Use these guardrails:

  • Scope workers tightly: one module, service, package, or test group per worker
  • Lower worker effort where possible: use medium or low for narrow subtasks
  • Cap max_tokens per worker: prevent runaway calls
  • Cache shared context: avoid paying repeatedly for the same long instructions
  • Validate with a small batch first: run 3–5 workers before launching hundreds
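Several of these guardrails can live in one request builder. The sketch below assumes Anthropic's prompt caching syntax (a `cache_control` marker on a system text block) applies to this model, and it reuses the `output_config` effort parameter shown earlier in this guide. `build_worker_request` is a hypothetical helper. Check the prompt caching documentation for minimum cacheable lengths and for how caching behaves when many requests start at the same moment.

```python
from typing import Any, Dict


def build_worker_request(
    shared_context: str,
    scoped_prompt: str,
    max_tokens: int = 8000,
    effort: str = "medium",
) -> Dict[str, Any]:
    """Build keyword arguments for one capped, lower-effort worker call.

    The long shared instructions go in a cacheable system block; only the
    short scoped prompt differs between workers.
    """
    return {
        "model": "claude-opus-4-8",
        "max_tokens": max_tokens,  # cap to prevent runaway calls
        "output_config": {"effort": effort},
        "system": [
            {
                "type": "text",
                "text": shared_context,
                "cache_control": {"type": "ephemeral"},  # assumed prompt caching marker
            }
        ],
        "messages": [{"role": "user", "content": scoped_prompt}],
    }


request = build_worker_request(
    shared_context="Refactor conventions and the new auth middleware interface...",
    scoped_prompt="Service: billing-api. Update usage of the shared auth middleware.",
)
print(request["max_tokens"], request["output_config"]["effort"])
```

The resulting dictionary could be passed as `client.messages.create(**request)` once a client is configured.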

Example worker budget:

worker = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=8000,
    output_config={"effort": "medium"},
    messages=[
        {
            "role": "user",
            "content": scoped_worker_prompt
        }
    ],
)

The Opus 4.8 pricing breakdown covers effort levels and caching in more detail.

The short version: orchestration is powerful, but the bill scales with the number of agents. Treat parallelism as an explicit design choice.

Testing your orchestration with Apidog

When you build orchestration through the API, the hardest part to debug is fan-out and merge behavior.

Before launching hundreds of live worker calls, verify:

  • Workers receive the correct scoped context
  • Worker responses follow the expected shape
  • The merge step can handle all worker outputs
  • The mid-conversation system message lands where expected
  • Your token caps and effort levels are configured correctly
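The response-shape check is easy to automate locally. This sketch assumes you instruct workers to reply with a JSON object; the field names (`changed_files`, `summary`, `tests`, `risks`) are a hypothetical schema that mirrors the "Return:" list in the worker prompt, not a format defined by Anthropic or Apidog.

```python
import json
from typing import List

# Hypothetical schema mirroring the worker prompt's "Return:" section.
REQUIRED_FIELDS = {
    "changed_files": list,
    "summary": str,
    "tests": list,
    "risks": list,
}


def validate_worker_output(raw: str) -> List[str]:
    """Return a list of problems found in one worker reply; empty means valid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"{field} must be {expected_type.__name__}")
    return problems


good = json.dumps({
    "changed_files": ["src/auth/middleware.py"],
    "summary": "Updated middleware calls to the new interface.",
    "tests": ["tests/auth/test_middleware.py"],
    "risks": [],
})
print(validate_worker_output(good))              # []
print(validate_worker_output('{"summary": 3}'))
```

Running this over mocked replies first shows whether the merge step can rely on the shape before any live worker is called.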

Apidog helps you test those pieces in isolation:

  • Save the orchestrator request and inspect the task breakdown before dispatching workers
  • Mock worker endpoints to test fan-out and merge logic without spending tokens on real calls
  • Add assertions for worker response shape
  • Replay a single worker call at different effort levels to tune cost
  • Validate requests against https://api.anthropic.com/v1/messages

A practical testing loop:

  1. Build the orchestrator request
  2. Confirm it returns a clean task breakdown
  3. Mock several worker responses
  4. Test your merge logic against the mock payloads
  5. Run one live worker call
  6. Tune effort and token caps
  7. Scale to a small batch
  8. Only then run the full workflow
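The last steps of that loop can be enforced in code so a full run never starts on an unvalidated setup. This is a sketch under stated assumptions: `run_staged`, `mock_worker`, and the acceptance check are hypothetical, and the workers run sequentially here for clarity (a concurrent fan-out could replace the list comprehensions).

```python
from typing import Callable, List, Sequence


def run_staged(
    prompts: Sequence[str],
    worker: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
    pilot_size: int = 3,
) -> List[str]:
    """Run a small pilot batch first; continue only if every pilot result passes."""
    pilot = [worker(p) for p in prompts[:pilot_size]]
    rejected = [r for r in pilot if not is_acceptable(r)]
    if rejected:
        raise RuntimeError(
            f"{len(rejected)} of {len(pilot)} pilot results failed validation; "
            "aborting before the full fan-out"
        )
    # Pilot passed: process the remaining prompts.
    rest = [worker(p) for p in prompts[pilot_size:]]
    return pilot + rest


calls: List[str] = []


def mock_worker(prompt: str) -> str:
    """Mocked worker: records the call and returns a canned reply, spending no tokens."""
    calls.append(prompt)
    return f"ok:{prompt}"


all_prompts = [f"unit-{i}" for i in range(10)]
results = run_staged(all_prompts, mock_worker, lambda r: r.startswith("ok:"))
print(len(results), len(calls))  # 10 10
```

Swapping `mock_worker` for a live call, and the lambda for a real shape check, gives the same gate for production runs.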

The Opus 4.8 API guide has the base request to start from.

FAQ

What are Dynamic Workflows in Claude Code?

Dynamic Workflows let one Claude Code session launch many parallel subagents for large, branching tasks. They are powered by xhigh effort plus mid-conversation system messages on Opus 4.8.

Is ultracode a separate effort level?

No. ultracode is Claude Code’s name for xhigh effort combined with permission to launch multi-agent workflows.

The API effort levels remain:

  • low
  • medium
  • high
  • xhigh
  • max

What are mid-conversation system messages?

They are system messages inserted partway through a conversation. They let you add instructions or permissions after the task has started.

For Dynamic Workflows, they allow the orchestrator to gain permission to spawn workers mid-run.

Can I build Dynamic Workflows without Claude Code?

Yes. Use xhigh effort plus mid-conversation system messages on the raw Messages API. Then implement the orchestration loop yourself:

  1. Plan
  2. Dispatch workers
  3. Collect results
  4. Merge

Do Dynamic Workflows cost a lot?

They can. Hundreds of xhigh subagents can add up to millions of tokens.

Control cost by scoping workers tightly, lowering worker effort where possible, capping max_tokens, and caching shared context.

When should I avoid Dynamic Workflows?

Avoid them for narrow or strictly sequential tasks. Parallel workers add little value when each step depends on the previous one, and they waste tokens on small jobs.
