DEV Community

netsi1964 🙏🏻

A Harness for Every Task: Dynamic Workflows in Claude Code – Master Custom Multi-Agent Systems

How to Solve Complex Tasks with Tailor-Made Multi-Agent Systems

As an AI-native builder, you already know that the real bottleneck is no longer raw model intelligence. It is context, persistence, and the ability to orchestrate work across time and parallel streams. Anthropic’s new dynamic workflows target exactly this: Claude writes and runs a custom harness (a tailored orchestration layer that defines how agents are spawned, coordinated, verified, and concluded) for your specific task.

This deep dive is written for experienced AI-native developers, architects, and power users who already live in tools like Claude Code, Cursor, or Aider. We’ll break down the concept from Anthropic’s blog post “A Harness for Every Task,” explain the mechanics, showcase powerful patterns, deliver practical examples, and end with key concepts you can immediately apply to your own work.

Background: Why the Default Harness Isn’t Always Enough

Claude Code ships with a strong default harness optimized for coding — handling files, tools, iterations, and reasoning in one coherent session. It works incredibly well for most daily tasks because many jobs feel like coding: structured, iterative, and with clear outputs.

But with longer, more ambitious, or high-stakes projects, classic failure modes appear:

  • Agentic laziness: the tendency to stop halfway and declare the task complete (e.g., fixing only 35 out of 50 security issues).
  • Self-preferential bias: the model prefers its own outputs when it has to verify or judge them.
  • Goal drift: important nuances, edge cases, and “do not do X” instructions fade across many turns and summarizations.

Static custom harnesses built via SDKs or scripts were the old workaround — but they’re generic and maintenance-heavy. Dynamic workflows flip the script: Claude generates the precise orchestration for your exact task, on the fly.

Core Concept: What Is a Harness and How Do Dynamic Workflows Work?

A harness (orchestration layer) is everything around the model: how the task is broken down, which sub-agents run, what tools they have, how outputs are verified, and when the job is truly done. It turns a single model into a coordinated team.

With dynamic workflows:

  • Claude writes a JavaScript file using special functions to spawn sub-agents.
  • Each sub-agent gets its own context window and an isolated worktree (a clean workspace), and can run on a different model (Sonnet for speed, Opus for depth).
  • Workflows can run tens or even hundreds of parallel sub-agents in a single session.
  • They are resumable (pause and continue later) and can be saved, shared, and reused (e.g., via skills or ~/.claude/workflows).

This isn’t just “more compute.” It’s structural resistance against the common failure modes of long-running agent sessions.
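To make the shape of a harness concrete, here is a minimal sketch. The special functions Claude Code actually uses to spawn sub-agents are not named in this post, so `spawnAgent` and `runHarness` below are hypothetical stand-ins, and the model strings are only labels. The point is the structure: every sub-agent starts from its own context, workers run in parallel, and a separate agent does the final pass.

```javascript
// Hypothetical stand-in for the sub-agent spawning functions mentioned above.
// The real API is not shown in this post; this stub only echoes its input.
async function spawnAgent({ role, model, prompt }) {
  // Each call starts from a fresh context: nothing is shared between agents.
  const context = [prompt];
  return {
    role,
    model,
    contextSize: context.length,
    output: `[${model}] ${role}: ${prompt}`,
  };
}

// A harness is ordinary code around the model: split, spawn, collect.
async function runHarness(task, parts) {
  const workers = parts.map((part) =>
    spawnAgent({ role: "worker", model: "sonnet", prompt: `${task}: ${part}` })
  );
  // Workers run concurrently; results come back in input order.
  const results = await Promise.all(workers);
  // A second agent (labeled with a deeper model) reviews the collected outputs.
  const summary = await spawnAgent({
    role: "synthesizer",
    model: "opus",
    prompt: results.map((r) => r.output).join(" | "),
  });
  return { results, summary };
}

runHarness("audit", ["auth", "billing"]).then(({ results, summary }) => {
  console.log(results.length, summary.model);
});
```

In a real workflow the stub body would be replaced by an actual sub-agent call. The surrounding control flow is the part you can reason about and reuse.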

Powerful Patterns – Your Building Blocks

Claude composes workflows from these reusable patterns. Learn them so you can guide the model more effectively:

  1. Classify-and-Act (classify and route)

    A classifier agent determines the task type and routes it to the right behavior or specialized sub-agent.

  2. Fan-out-and-Synthesize (split wide and merge)

    Break a big task into many small ones, run them in parallel in clean contexts, then let a synthesizer agent combine the results. The clean contexts keep one subtask’s assumptions from leaking into another.

  3. Adversarial Verification (opponent-based checking)

    Every producer agent gets a dedicated “adversary” agent that checks output against a rubric. This counters self-preferential bias, because the producer never grades its own work.

  4. Generate-and-Filter / Tournament (generate, filter, compete)

    Create many ideas or solutions, deduplicate, and let them compete through pairwise judging. Excellent for creative or ranking tasks.

  5. Loop-until-Done (iterate until success criteria)

    Keep spawning agents until a stop condition is met (no new findings, no remaining errors). Perfect for debugging, research, and exhaustive analysis.
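Two of the patterns above, Classify-and-Act and Loop-until-Done, are easy to sketch in plain JavaScript. Everything here is an assumption made for illustration: the keyword classifier, the handler names, and the `scan` stub all stand in for real sub-agent calls. The loop is the interesting part, since it is the structural answer to an agent that stops at 35 of 50 issues.

```javascript
// Hypothetical stubs: a real workflow would call sub-agents here.
function classify(ticket) {
  // Stand-in classifier: keyword matching instead of a model call.
  if (/crash|error|fail/i.test(ticket)) return "bug";
  if (/how do i|\?/i.test(ticket)) return "question";
  return "other";
}

const handlers = {
  bug: (t) => `debug-agent handles: ${t}`,
  question: (t) => `docs-agent handles: ${t}`,
  other: (t) => `triage-agent handles: ${t}`,
};

// Classify-and-Act: route each input to the handler for its class.
function classifyAndAct(ticket) {
  return handlers[classify(ticket)](ticket);
}

// Loop-until-Done: keep running passes until one finds nothing new.
// maxRounds is a safety cap so a noisy agent cannot loop forever.
function loopUntilDone(findIssues, maxRounds = 10) {
  const found = new Set();
  let rounds = 0;
  while (rounds < maxRounds) {
    rounds += 1;
    const fresh = findIssues(found).filter((issue) => !found.has(issue));
    if (fresh.length === 0) break; // stop condition: no new findings
    fresh.forEach((issue) => found.add(issue));
  }
  return { issues: [...found], rounds };
}

// Stub scanner: reports at most two not-yet-seen issues per pass.
const allIssues = ["sql-injection", "xss", "open-redirect", "weak-hash", "csrf"];
const scan = (seen) => allIssues.filter((i) => !seen.has(i)).slice(0, 2);

console.log(classifyAndAct("App crashes on login"));
console.log(loopUntilDone(scan));
```

Note that the loop only exits after a pass that comes back empty, so "done" is decided by the stop condition and not by the agent's own sense of completion.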

These patterns combine beautifully — for example, fan-out + adversarial verification + synthesize.
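Here is one way that combination could look, as a rough sketch. The `producer` and `adversary` functions are hypothetical stubs (a real workflow would back them with sub-agents and a proper rubric), and the file names are made up. One file is deliberately left unfinished so the adversary has something to catch.

```javascript
// Hypothetical stubs standing in for model-backed sub-agents.
async function producer(file) {
  // Pretend to migrate one file; one file is deliberately left incomplete.
  const done = file !== "legacy.js";
  return { file, patch: done ? `migrated ${file}` : `TODO ${file}` };
}

async function adversary(result) {
  // A separate checker applies a rubric instead of trusting the producer.
  const passed = !result.patch.startsWith("TODO");
  return { ...result, passed };
}

async function fanOutVerifySynthesize(files) {
  // Fan-out: one producer per file, each paired with its own adversary.
  const checked = await Promise.all(
    files.map(async (file) => adversary(await producer(file)))
  );
  // Synthesize: merge only verified work and surface the rejects.
  return {
    merged: checked.filter((r) => r.passed).map((r) => r.patch),
    rejected: checked.filter((r) => !r.passed).map((r) => r.file),
  };
}

fanOutVerifySynthesize(["a.js", "legacy.js", "b.js"]).then((report) => {
  console.log(report);
});
```

The design choice worth copying is that rejected work is reported explicitly. A synthesizer that silently drops failures would reintroduce the "declare it complete" problem.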

Real-World Examples You Can Use Today

Coding & Engineering:

  • Large-scale refactors or migrations: Fan-out across files/callsites/tests → adversarial review → merge.
  • Bug hunting: Generate competing hypotheses from logs/code/data → verify with a panel of agents.
  • Documentation or blog verification: Identify claims → spawn fact-check sub-agents → adversarial validation.

Non-Coding Work (often where the biggest wins are):

  • Review 50 conversation histories and extract rules into CLAUDE.md.
  • Analyze Slack incidents for true root causes.
  • Evaluate 80 CVs: Rank them, then double-check the top 10 with dedicated verifiers.
  • Stress-test a business plan with agents playing investor, customer, and competitor roles.
  • Deep research: Fan-out searches, fetch sources, adversarial verification, then synthesize with citations.
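The CV ranking idea maps naturally onto the Generate-and-Filter / Tournament pattern. The sketch below is illustrative only: `judge` is a hypothetical stub that compares an invented `score` field, where a real workflow would ask a sub-agent to compare two candidates against a rubric.

```javascript
// Hypothetical judge: a real workflow would ask a sub-agent to compare two
// candidates against a rubric. Here it just compares a made-up score field.
function judge(a, b) {
  return a.score >= b.score ? a : b;
}

// Generate-and-Filter: drop duplicates before spending judge calls on them.
function dedupe(candidates) {
  const seen = new Map();
  for (const c of candidates) {
    if (!seen.has(c.id)) seen.set(c.id, c);
  }
  return [...seen.values()];
}

// Round-robin tournament: every pair is judged once, wins are counted.
function rankByTournament(candidates) {
  const pool = dedupe(candidates);
  const wins = new Map(pool.map((c) => [c.id, 0]));
  for (let i = 0; i < pool.length; i += 1) {
    for (let j = i + 1; j < pool.length; j += 1) {
      const winner = judge(pool[i], pool[j]);
      wins.set(winner.id, wins.get(winner.id) + 1);
    }
  }
  // Most wins first.
  return [...pool].sort((a, b) => wins.get(b.id) - wins.get(a.id));
}

const cvs = [
  { id: "cv-1", score: 6 },
  { id: "cv-2", score: 9 },
  { id: "cv-2", score: 9 }, // duplicate submission
  { id: "cv-3", score: 7 },
];
console.log(rankByTournament(cvs).map((c) => c.id));
```

A full round-robin needs n(n-1)/2 judge calls, so 80 CVs would mean 3,160 comparisons. That is one reason to filter first, or to run the tournament only on a shortlist.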

Concrete Prompt Example (inspired by Anthropic):

“This test fails maybe 1 in 50 runs. Set up a workflow to reproduce it. Form competing theories about the race condition, and don’t stop until one theory survives all the evidence.”

The resulting workflow: Classifier → Fan-out hypothesis generation → Adversarial verifiers/refuters → Loop-until-done.
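A stripped-down sketch of the last three stages (competing theories, adversarial refuters, loop until one survives) might look like this. Nothing here touches a real test runner or agent API: the theory names and observations are invented for illustration, and the classifier stage is left out to keep it short.

```javascript
// Everything here is a hypothetical stub; no real test runner or agent API is used.
const theories = ["shared-cache", "unawaited-write", "clock-skew"];

// Stub evidence: each reproduction round yields one observation that
// contradicts some theory (names and data are invented for illustration).
const observations = [
  { note: "fails with cache disabled", refutes: "shared-cache" },
  { note: "fails with frozen clock", refutes: "clock-skew" },
];

// Adversarial refuter: a theory survives only if no evidence contradicts it.
function survives(theory, evidence) {
  return !evidence.some((e) => e.refutes === theory);
}

function investigate(maxRounds = 10) {
  const evidence = [];
  let alive = [...theories];
  let round = 0;
  // Loop-until-done: stop once at most one theory is left standing,
  // or when no further evidence can be gathered.
  while (alive.length > 1 && round < maxRounds && round < observations.length) {
    evidence.push(observations[round]); // one reproduction attempt per round
    round += 1;
    alive = alive.filter((t) => survives(t, evidence));
  }
  return { alive, rounds: round, evidence: evidence.map((e) => e.note) };
}

console.log(investigate());
```

The stop condition is stated in terms of surviving theories, which mirrors the “don’t stop until one theory survives all the evidence” instruction in the prompt.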

Benefits, Trade-offs, and My Analysis

Benefits:

  • Superior context isolation dramatically reduces drift and bias.
  • Parallelization + specialization boosts both speed and quality.
  • Workflows become reusable assets — you gradually build your personal agent team.
  • Scales beautifully beyond coding into research, analysis, triage, and decision-making.

This represents a natural evolution from prompt engineering to workflow engineering. You no longer design the full system — you design the prompt that makes the system design itself.

Trade-offs:

  • Higher token usage — reserve for high-value tasks.
  • More complexity when debugging multi-agent setups.
  • Still dependent on the underlying model’s ability to design good orchestrations.

Key Concepts – Take These With You

  • Custom harness per task is the new paradigm. Stop forcing every job into one default structure.
  • Structure beats raw intelligence on long horizons. Separate contexts and dedicated roles (producer, verifier, synthesizer, judge) reduce hallucinations and drift.
  • Parallelism + Verification is the golden combo. Fan-out gives scale; adversarial checks give trust.
  • Workflows are both output and capital. Save the best ones — you’re building your own agent organization over time.
  • Use strategically. They cost more tokens, so deploy them where quality and thoroughness matter most.

Dynamic workflows mark a shift from “better single agent” to orchestrated agent teams on demand. It’s not just a feature in Claude Code — it’s a new way to think about reliable agentic work.

Try it today: Take a complex task you’re struggling with and add “Use a dynamic workflow…” or the trigger “ultracode”. Watch how Claude designs the perfect harness. Iterate on the prompt, save what works, and level up.

The world isn’t getting simpler. But now we have a harness for every task.

Use it wisely.


Further Reading / Primary Source

“A harness for every task: dynamic workflows in Claude Code” by Anthropic (June 2, 2026)

Highly Recommended Video

For a practical, hands-on breakdown of Ultracode and dynamic workflows (including live demos and pro tips), watch this excellent video from Chase AI:

“the most POWERFUL claude code feature in months”

It complements the Anthropic blog perfectly and shows exactly how to start using these capabilities right away.

What complex task are you going to throw at a dynamic workflow first? Drop it in the comments — I’d love to see what you build.
