Hassann

Posted on May 29 • Originally published at apidog.com

Claude Code Dynamic Workflows: Running Hundreds of Parallel Subagents with Opus 4.8

Claude Opus 4.8 shipped with a major Claude Code capability: Dynamic Workflows. In one session, an orchestrating agent can launch many parallel subagents for large, branching tasks like refactoring across dozens of files, running broad test matrices, or comparing multiple implementation paths.

This guide breaks down how Dynamic Workflows work, when to use them, and how to implement the same orchestration pattern through the raw API. For model context, see what is Claude Opus 4.8. For agent architecture background, read the Claude Code agent harness breakdown.

What Dynamic Workflows actually are

In Claude Code, Dynamic Workflows appear as ultracode in the effort menu. The important detail: ultracode is not a new API effort level.

It combines two Opus 4.8 capabilities:

xhigh effort
Mid-conversation system messages

Together, these give the orchestrator enough reasoning budget to plan a large job, plus a way to receive permission mid-run to launch worker agents. The rest is Claude Code orchestration logic.

Ingredient 1: `xhigh` effort

The effort parameter controls how many tokens Opus 4.8 can spend across a response, including tool calls.

For Dynamic Workflows, xhigh matters because the orchestrator needs to:

Break a large task into independent units
Decide which units can run in parallel
Dispatch workers
Merge worker results back into one coherent output

Lower effort levels generally scope work down and make fewer tool calls, which is not ideal for orchestration-heavy workflows.
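To make "decide which units can run in parallel" concrete, here is a minimal sketch that groups work units into waves using the standard library's graphlib. It is an illustration only, not how Claude Code plans internally; the unit names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group work units into waves.

    Every unit in a wave has all of its dependencies finished, so the
    units inside one wave can be dispatched to workers at the same time.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical units: three services are independent, but the shared
# integration tests must wait for all of them.
units = {
    "auth-service": set(),
    "billing-service": set(),
    "logging-service": set(),
    "integration-tests": {"auth-service", "billing-service", "logging-service"},
}
waves = parallel_waves(units)
```

A plan that collapses into many single-unit waves is mostly sequential, which is a sign that parallel workers will not help much.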

A practical starting point:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

orchestrator = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Plan a refactor of the auth module across all 14 services."
        }
    ],
)

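Use a large max_tokens value so the orchestrator has enough room to plan and coordinate. 64K is a reasonable starting point for large coding tasks. With a cap this large, the Python SDK may ask you to stream the response or set an explicit timeout instead of making a plain blocking call.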

Ingredient 2: mid-conversation system messages

Mid-conversation system messages let you insert a system entry partway through the messages array instead of only at the beginning of the conversation.

That means an agent can receive new instructions, constraints, or permissions after the task has started.

For Dynamic Workflows, this is what allows the orchestrator to gain standing permission to launch multi-agent workflows during a run.

Anthropic documents the mechanism here: mid-conversation system messages.

Conceptually, the flow looks like this:

messages = [
    {
        "role": "user",
        "content": "Analyze the repository and identify independent refactor tasks."
    },
    {
        "role": "assistant",
        "content": "I found several independent areas: auth, billing, logging, and tests."
    },
    {
        "role": "system",
        "content": "You may now dispatch scoped worker agents in parallel. Each worker must receive only the files and instructions relevant to its assigned task."
    },
    {
        "role": "user",
        "content": "Dispatch workers and merge their results into one implementation plan."
    }
]

The key implementation idea is simple: the orchestrator can gain new operational permissions based on what it discovers.
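A small sketch of that idea in plain Python: a helper that appends a permission grant only after the orchestrator has reported what it found. The helper name and the rule "grant only after an assistant turn" are assumptions made for illustration; the entry shape follows the conceptual example above rather than any confirmed API contract.

```python
def grant_permission(messages: list[dict], permission: str) -> list[dict]:
    """Return a copy of the history with a system entry appended.

    Hypothetical helper: it assumes the {"role": "system", ...} entry
    shape from the conceptual flow above, and it refuses to grant
    anything before the orchestrator has replied at least once.
    """
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("grant permissions only after the orchestrator has reported a plan")
    return [*messages, {"role": "system", "content": permission}]


history = [
    {"role": "user", "content": "Analyze the repository and identify independent refactor tasks."},
    {"role": "assistant", "content": "I found several independent areas: auth, billing, logging, and tests."},
]
history = grant_permission(history, "You may now dispatch scoped worker agents in parallel.")
history.append({"role": "user", "content": "Dispatch workers and merge their results."})
```

Returning a new list keeps the pre-grant history intact, which makes it easier to replay the run with and without the permission.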

Turning it on in Claude Code

In Claude Code, Dynamic Workflows are available through the ultracode option in the effort menu.

Selecting it does two things:

Sets effort to xhigh
Allows the session to spawn parallel subagents through mid-conversation system messages

After that, give Claude Code a task that is actually wide enough to parallelize.

Good prompt shape:

Refactor the auth module across all services.

1. Identify independent service-level changes.
2. Launch parallel workers where tasks do not overlap.
3. Run or update tests for each service.
4. Merge the results into one final summary with changed files and remaining risks.

Claude Code then handles the orchestration loop:

Plans the task
Splits it into worker-sized units
Runs workers in parallel
Streams results back
Merges the results in the main session

If you are configuring Claude Code with a plan, see the Claude Agent SDK with Claude plan setup guide.

When to use Dynamic Workflows

Use Dynamic Workflows when the task is wide and parallelizable.

Good fits:

Refactoring the same pattern across many files
Running a large test matrix
Exploring multiple implementation approaches in parallel
Auditing a large codebase by module or package
Splitting independent migration work across services

Example:

Analyze all packages in this monorepo for deprecated API usage.

Assign one worker per package.
Each worker should:
- Find deprecated calls
- Propose replacements
- Identify tests that need updates
- Return a concise patch plan

Merge all worker outputs into a single migration checklist.
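If you drive this fan-out yourself, "one worker per package" can be sketched as grouping file paths by package directory and rendering one prompt per group. The packages/<name>/... layout, the function names, and the sample paths are assumptions for illustration; adjust them to your monorepo.

```python
from collections import defaultdict
from pathlib import PurePosixPath

PROMPT_TEMPLATE = """Analyze the package "{package}" for deprecated API usage.

Files in scope:
{file_list}

- Find deprecated calls
- Propose replacements
- Identify tests that need updates
- Return a concise patch plan"""


def group_by_package(paths: list[str], root: str = "packages") -> dict[str, list[str]]:
    """Group paths of the form <root>/<package>/... by package name."""
    groups = defaultdict(list)
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if len(parts) >= 3 and parts[0] == root:
            groups[parts[1]].append(raw)
    return dict(groups)


def build_package_prompts(paths: list[str]) -> dict[str, str]:
    """Render one scoped worker prompt per package."""
    return {
        package: PROMPT_TEMPLATE.format(package=package, file_list="\n".join(sorted(files)))
        for package, files in sorted(group_by_package(paths).items())
    }


prompts = build_package_prompts([
    "packages/ui/src/button.ts",
    "packages/ui/src/modal.ts",
    "packages/api/src/client.ts",
    "README.md",  # not inside a package, so no worker is assigned
])
```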

When not to use Dynamic Workflows

Avoid Dynamic Workflows for narrow or sequential tasks.

Poor fits:

A one-file bug fix
A task where each step depends on the previous step
Small changes that fit in one normal model response
Work where parallel edits are likely to conflict heavily

Spawning hundreds of subagents for a small task burns tokens without improving the result. The cost is real: a large fleet of xhigh workers can consume millions of tokens in a single run.

Use the pattern only when parallelism matches the shape of the work.
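One way to check for the "parallel edits are likely to conflict" case before fanning out is to compare the file sets assigned to each worker. This is a rough sketch with hypothetical worker names and paths; it only catches exact path overlaps, not semantic conflicts.

```python
from itertools import combinations


def find_conflicts(scopes: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return every pair of workers whose assigned files overlap."""
    conflicts = {}
    for first, second in combinations(sorted(scopes), 2):
        shared = scopes[first] & scopes[second]
        if shared:
            conflicts[(first, second)] = shared
    return conflicts


# Hypothetical assignment: two workers both want to edit the shared middleware.
scopes = {
    "billing-worker": {"billing/auth.py", "shared/middleware.py"},
    "orders-worker": {"orders/auth.py", "shared/middleware.py"},
    "docs-worker": {"docs/auth.md"},
}
conflicts = find_conflicts(scopes)
```

If the result is non-empty, either move the shared file into its own unit that runs first, or handle those workers sequentially.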

Building the same orchestration pattern through the API

You do not need Claude Code to build this pattern. The same ingredients are available through the Messages API.

Anthropic also provides a worked example here: build an orchestration mode.

The implementation shape:

Run an orchestrator call with xhigh effort
Ask it to produce a task breakdown
Use a mid-conversation system message to grant dispatch permissions
Fan out worker calls in parallel
Feed worker results back to the orchestrator
Ask the orchestrator to merge the final result
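Before wiring in real model calls, the overall shape can be sketched with the three stages passed in as plain functions. Everything here is hypothetical scaffolding: plan, work, and merge stand in for the orchestrator and worker requests shown in the steps below, and the permission message is conversation state that would live inside those callables.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def orchestrate(
    task: str,
    plan: Callable[[str], list[str]],
    work: Callable[[str], str],
    merge: Callable[[str, dict[str, str]], str],
    max_parallel: int = 4,
) -> str:
    """Plan, fan out, and merge, with the model calls injected as callables."""
    units = plan(task)  # orchestrator produces the task breakdown
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        outputs = list(pool.map(work, units))  # map() keeps results in unit order
    return merge(task, dict(zip(units, outputs)))  # orchestrator reconciles results


# Stand-ins so the control flow can be exercised without spending tokens.
def fake_plan(task: str) -> list[str]:
    return ["auth", "billing", "logging"]


def fake_work(unit: str) -> str:
    return f"{unit}: updated middleware calls"


def fake_merge(task: str, outputs: dict[str, str]) -> str:
    return "\n".join(f"[{unit}] {text}" for unit, text in outputs.items())


summary = orchestrate("Refactor the auth module", fake_plan, fake_work, fake_merge)
```

Injecting the callables keeps the fan-out and merge logic testable on its own, separate from prompt tuning.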

Step 1: create the orchestrator request

import anthropic

client = anthropic.Anthropic()

orchestrator = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": """
Plan a refactor of the auth module across all 14 services.

Return:
- independent work units
- files or services assigned to each unit
- expected worker output format
- merge strategy
"""
        }
    ],
)
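To dispatch workers from code, the breakdown has to be machine-readable. The sketch below assumes you additionally ask the orchestrator to return the work units as a JSON array with id, files, and task keys; that schema is an assumption of this example and is not part of the prompt above.

```python
import json
import re

REQUIRED_KEYS = {"id", "files", "task"}  # hypothetical schema for one work unit


def parse_work_units(reply_text: str) -> list[dict]:
    """Extract the JSON array from an orchestrator reply and check its shape."""
    match = re.search(r"\[.*\]", reply_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in orchestrator reply")
    units = json.loads(match.group(0))
    for unit in units:
        missing = REQUIRED_KEYS - set(unit)
        if missing:
            raise ValueError(f"work unit missing keys: {sorted(missing)}")
    return units


sample_reply = """Here is the breakdown:
[
  {"id": "billing-api", "files": ["src/auth/*", "tests/auth/*"], "task": "update middleware usage"},
  {"id": "orders-api", "files": ["src/auth/*"], "task": "update middleware usage"}
]
Merge strategy: apply per-service patches, then run the shared test suite."""
units = parse_work_units(sample_reply)
```

The regular expression is deliberately simple and will fail if the surrounding prose contains square brackets; failing loudly here is preferable to dispatching workers from a half-parsed plan.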

Step 2: define worker scope

Each worker should receive a narrow task. Do not send the entire repo context unless the worker needs it.

Example worker prompt:

worker_prompt = """
You are Worker 3.

Scope:
- Service: billing-api
- Files: src/auth/*, tests/auth/*
- Task: update usage of the shared auth middleware based on the new interface

Return:
- changed files
- summary of edits
- tests updated or recommended
- risks or unresolved questions
"""
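When there are many workers, writing each prompt by hand does not scale. A minimal sketch, assuming a hypothetical WorkerScope record, renders the same prompt layout from structured fields:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerScope:
    """Hypothetical record describing what one worker may touch."""

    worker_id: int
    service: str
    files: tuple[str, ...]
    task: str

    def to_prompt(self) -> str:
        """Render the scope in the same layout as the example prompt above."""
        return (
            f"You are Worker {self.worker_id}.\n\n"
            "Scope:\n"
            f"- Service: {self.service}\n"
            f"- Files: {', '.join(self.files)}\n"
            f"- Task: {self.task}\n\n"
            "Return:\n"
            "- changed files\n"
            "- summary of edits\n"
            "- tests updated or recommended\n"
            "- risks or unresolved questions\n"
        )


scope = WorkerScope(
    worker_id=3,
    service="billing-api",
    files=("src/auth/*", "tests/auth/*"),
    task="update usage of the shared auth middleware based on the new interface",
)
prompt = scope.to_prompt()
```

Keeping the scope as data also lets you log exactly which files each worker was given.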

Step 3: run workers concurrently

Workers are separate Messages API calls. Since each task is narrower, you can often run workers at a lower effort level than the orchestrator.

from concurrent.futures import ThreadPoolExecutor

def run_worker(prompt: str):
    return client.messages.create(
        model="claude-opus-4-8",
        max_tokens=16000,
        output_config={"effort": "medium"},
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
    )

# Placeholder names: each entry is a scoped prompt string like worker_prompt in Step 2.
worker_prompts = [
    worker_prompt_1,
    worker_prompt_2,
    worker_prompt_3,
]

with ThreadPoolExecutor(max_workers=3) as executor:
    worker_results = list(executor.map(run_worker, worker_prompts))
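One caveat with executor.map is that the first worker exception surfaces when you iterate the results, so the other workers' outputs are lost. A hedged alternative using submit and as_completed keeps successes and failures separate; run_all and flaky_worker are hypothetical names, and the stub stands in for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_all(
    prompts: dict[str, str],
    run_worker: Callable[[str], str],
    max_parallel: int = 8,
) -> tuple[dict[str, str], dict[str, str]]:
    """Run every worker and return (results, failures), both keyed by worker name."""
    results: dict[str, str] = {}
    failures: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(run_worker, prompt): name for name, prompt in prompts.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # keep going; record the failure for a retry
                failures[name] = repr(exc)
    return results, failures


def flaky_worker(prompt: str) -> str:
    """Stub worker that simulates one failing call."""
    if "orders" in prompt:
        raise TimeoutError("simulated timeout")
    return f"done: {prompt}"


results, failures = run_all(
    {"billing": "refactor billing", "orders": "refactor orders"},
    flaky_worker,
)
```

The max_parallel cap also doubles as a simple rate limiter when the worker list grows into the hundreds.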

Step 4: merge worker results

Feed the worker outputs back to the orchestrator and ask it to reconcile conflicts, deduplicate findings, and produce the final implementation plan.

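# Collect the text blocks from each worker response. Passing str(worker_results)
# would send Python object reprs instead of the workers' actual answers.
worker_texts = [
    "".join(block.text for block in result.content if block.type == "text")
    for result in worker_results
]

merge_response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Merge these worker results into one final refactor plan.\n\n"
            + "\n\n---\n\n".join(worker_texts)
        }
    ],
)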
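Reconciling conflicts is easier when the orchestrator can tell which worker said what. A small sketch that labels each output before merging; the tag format and the function name are arbitrary choices for this example.

```python
def build_merge_prompt(outputs: dict[str, str]) -> str:
    """Wrap each worker's text in a labeled section and prepend the merge request."""
    sections = [
        f'<worker name="{name}">\n{text.strip()}\n</worker>'
        for name, text in sorted(outputs.items())
    ]
    return (
        "Merge these worker results into one final refactor plan.\n\n"
        + "\n\n".join(sections)
    )


merge_prompt = build_merge_prompt({
    "orders-api": "Changed files: src/auth/session.py\n",
    "billing-api": "Changed files: src/auth/middleware.py\n",
})
```

Sorting by worker name keeps the prompt stable across runs, which helps when comparing merge outputs.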

If you are comparing this approach with Anthropic-hosted agent infrastructure, the managed agents vs Agent SDK guide explains the trade-offs.

Cost and control

Parallel subagents multiply token usage quickly.

A workflow with 200 workers, each spending tens of thousands of tokens, lands in the millions of tokens overall: 200 workers at 30,000 tokens each is already 6 million. Control the blast radius before you run it.
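To put a number on a run before launching it, a back-of-the-envelope estimate is enough. The per-worker token figure and the blended price below are placeholders, not Opus 4.8 pricing; real bills also price input and output tokens differently.

```python
def estimate_tokens(workers: int, tokens_per_worker: int, orchestrator_tokens: int = 0) -> int:
    """Total tokens for a fan-out: every worker plus the orchestrator's own calls."""
    return workers * tokens_per_worker + orchestrator_tokens


def estimate_cost(total_tokens: int, usd_per_million_tokens: float) -> float:
    """Rough cost using a single blended rate (a simplification)."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical run: 200 workers at 30,000 tokens each, plus two 64K orchestrator calls.
total = estimate_tokens(workers=200, tokens_per_worker=30_000, orchestrator_tokens=128_000)
cost = estimate_cost(total, usd_per_million_tokens=10.0)  # placeholder rate
```

Re-running the estimate with a lower tokens_per_worker shows how much the per-worker cap matters compared with the orchestrator's own budget.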

Use these guardrails:

Scope workers tightly: one module, service, package, or test group per worker
Lower worker effort where possible: use medium or low for narrow subtasks
Cap max_tokens per worker: prevent runaway calls
Cache shared context: avoid paying repeatedly for the same long instructions
Validate with a small batch first: run 3–5 workers before launching hundreds

Example worker budget:

worker = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=8000,
    output_config={"effort": "medium"},
    messages=[
        {
            "role": "user",
            "content": scoped_worker_prompt
        }
    ],
)

The Opus 4.8 pricing breakdown covers effort levels and caching in more detail.

The short version: orchestration is powerful, but the bill scales with the number of agents. Treat parallelism as an explicit design choice.
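The "validate with a small batch first" guardrail can be sketched as a staged runner that refuses to scale if the pilot looks wrong. run_batch and is_valid are hypothetical callables you would supply; the stubs below only exercise the control flow.

```python
from typing import Callable


def run_staged(
    prompts: list[str],
    run_batch: Callable[[list[str]], list[str]],
    is_valid: Callable[[str], bool],
    pilot_size: int = 3,
) -> list[str]:
    """Run a small pilot batch, validate it, and only then run the rest."""
    pilot, rest = prompts[:pilot_size], prompts[pilot_size:]
    pilot_results = run_batch(pilot)
    bad = [prompt for prompt, result in zip(pilot, pilot_results) if not is_valid(result)]
    if bad:
        raise RuntimeError(
            f"{len(bad)} of {len(pilot)} pilot workers failed validation; fix prompts before scaling"
        )
    return pilot_results + (run_batch(rest) if rest else [])


batch_sizes = []


def stub_batch(batch: list[str]) -> list[str]:
    """Stand-in for a concurrent fan-out; records how many workers each call ran."""
    batch_sizes.append(len(batch))
    return [f"ok: {prompt}" for prompt in batch]


staged_results = run_staged(
    [f"unit-{i}" for i in range(5)],
    stub_batch,
    is_valid=lambda text: text.startswith("ok"),
)
```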

Testing your orchestration with Apidog

When you build orchestration through the API, the hardest part to debug is fan-out and merge behavior.

Before launching hundreds of live worker calls, verify:

Workers receive the correct scoped context
Worker responses follow the expected shape
The merge step can handle all worker outputs
The mid-conversation system message lands where expected
Your token caps and effort levels are configured correctly

Apidog helps you test those pieces in isolation:

Save the orchestrator request and inspect the task breakdown before dispatching workers
Mock worker endpoints to test fan-out and merge logic without spending tokens on real calls
Add assertions for worker response shape
Replay a single worker call at different effort levels to tune cost
Validate requests against https://api.anthropic.com/v1/messages

A practical testing loop:

Build the orchestrator request
Confirm it returns a clean task breakdown
Mock several worker responses
Test your merge logic against the mock payloads
Run one live worker call
Tune effort and token caps
Scale to a small batch
Only then run the full workflow
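Whatever tool you use for mocking, the shape check itself can be a few lines. This sketch assumes workers are asked for the four "Return" sections from the Step 2 prompt and simply looks for those headings; the mock payload is invented for the example.

```python
REQUIRED_SECTIONS = (
    "changed files",
    "summary of edits",
    "tests updated or recommended",
    "risks or unresolved questions",
)


def missing_sections(response_text: str) -> list[str]:
    """Return the required section names that do not appear in a worker response."""
    lowered = response_text.lower()
    return [section for section in REQUIRED_SECTIONS if section not in lowered]


# Invented mock payload for exercising merge logic without live calls.
MOCK_WORKER_RESPONSE = """Changed files:
- src/auth/middleware.py

Summary of edits:
- Switched to the new interface.

Tests updated or recommended:
- tests/auth/test_middleware.py

Risks or unresolved questions:
- None found.
"""

complete = missing_sections(MOCK_WORKER_RESPONSE)
incomplete = missing_sections("Changed files:\n- src/auth/middleware.py\n")
```

A substring check is crude, but it is enough to catch a worker that ignored the output format before its response reaches the merge step.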

The Opus 4.8 API guide has the base request to start from.

FAQ

What are Dynamic Workflows in Claude Code?

Dynamic Workflows let one Claude Code session launch many parallel subagents for large, branching tasks. They are powered by xhigh effort plus mid-conversation system messages on Opus 4.8.

Is `ultracode` a separate effort level?

No. ultracode is Claude Code’s name for xhigh effort combined with permission to launch multi-agent workflows.

The API effort levels remain:

low
medium
high
xhigh
max

What are mid-conversation system messages?

They are system messages inserted partway through a conversation. They let you add instructions or permissions after the task has started.

For Dynamic Workflows, they allow the orchestrator to gain permission to spawn workers mid-run.

Can I build Dynamic Workflows without Claude Code?

Yes. Use xhigh effort plus mid-conversation system messages on the raw Messages API. Then implement the orchestration loop yourself:

Plan
Dispatch workers
Collect results
Merge

Do Dynamic Workflows cost a lot?

They can. Hundreds of xhigh subagents can add up to millions of tokens.

Control cost by scoping workers tightly, lowering worker effort where possible, capping max_tokens, and caching shared context.

When should I avoid Dynamic Workflows?

Avoid them for narrow or strictly sequential tasks. Parallel workers add little value when each step depends on the previous one, and they waste tokens on small jobs.
