Evan-dong

Claude Code Dynamic Workflows: What I Learned From a 111-Agent Research Run

#ai

I tried Claude Code's new dynamic workflows with Opus 4.8 on a real research task.


The short version: the agent count was interesting, but the failure handling was more useful.

What I ran

I used the built-in workflow:

/deep-research <research question>

The question was broad enough to need multiple angles: understand an AI project related to medical imaging, compare possible use cases, look at competitors, and identify product weak spots.

After launching the workflow, I inspected it with:

/workflows

That view shows the phase structure, agent counts, token totals, and runtime.

The phase breakdown

My run split into five phases:

Phase        Agents   Purpose
Scope        1        Define the research frame
Search       6        Search from several angles
Fetch        28       Retrieve and read sources
Verify       75       Check key claims
Synthesize   1        Produce the final report

Total: 111 agents.

Important caveat: this does not mean 111 agents ran at the same time. The point is that the workflow had a script-like structure that coordinated many subagents across phases.
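As a quick sanity check on the table, here is a minimal Python sketch that adds up the per-phase counts and shows how much of the run went to verification. The names `PHASES`, `total_agents`, and `phase_share` are mine, for illustration only. They are not part of Claude Code.

```python
# Agent counts per phase, copied from the table above.
PHASES = [
    ("Scope", 1),
    ("Search", 6),
    ("Fetch", 28),
    ("Verify", 75),
    ("Synthesize", 1),
]


def total_agents(phases):
    """Sum the agents used across all phases (not a concurrency figure)."""
    return sum(count for _, count in phases)


def phase_share(phases):
    """Return each phase's fraction of the total agent count."""
    total = total_agents(phases)
    return {name: count / total for name, count in phases}


print(total_agents(PHASES))  # 111
```

The total is a count of agents across the whole run. It says nothing about how many were alive at once.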

The useful failure

The Verify phase extracted 25 claims and tried to check each claim with three independent agents.

Then the run partially broke.

In my case, 17 of the 25 claims ended up with a killed status. The issue looked like a StructuredOutput failure.

That sounds bad, but the workflow's interpretation was the useful part.

It sorted the claims into three outcomes:

  • confirmed;
  • contradicted;
  • not verified.

Only 2 of the 17 killed claims were actually contradicted. The other 15 were incomplete checks, not false claims.

That distinction matters.

A killed status should not automatically mean "wrong." It means the check did not finish. If your agent turns an unfinished check into a confident conclusion, the report becomes dangerous.
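To make the distinction concrete, here is a small Python sketch of how I would keep those outcomes apart when aggregating checks. It is my own illustration, not the workflow's actual logic. The `Check` and `Verdict` enums, the `claim_verdict` rule, and the `summarize` helper are all assumptions. The rule I assume: a claim is contradicted only if a finished check contradicts it, confirmed only if every check finished and agreed, and not verified otherwise.

```python
from collections import Counter
from enum import Enum


class Check(Enum):
    """Outcome of one verification agent (hypothetical labels)."""
    CONFIRMED = "confirmed"
    CONTRADICTED = "contradicted"
    KILLED = "killed"  # the check never finished


class Verdict(Enum):
    """Outcome for a whole claim."""
    CONFIRMED = "confirmed"
    CONTRADICTED = "contradicted"
    NOT_VERIFIED = "not verified"


def claim_verdict(checks):
    """Combine per-agent checks without treating KILLED as a refutation."""
    if Check.CONTRADICTED in checks:
        return Verdict.CONTRADICTED
    if checks and all(c is Check.CONFIRMED for c in checks):
        return Verdict.CONFIRMED
    # Killed or missing checks leave the claim open, not false.
    return Verdict.NOT_VERIFIED


def summarize(claims):
    """Count verdicts across a mapping of claim id -> list of checks."""
    return Counter(claim_verdict(checks) for checks in claims.values())
```

With this rule, a claim whose three checks were all killed lands in "not verified" and never in "contradicted", which is the behavior I want the final report to preserve.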

Why workflows feel different from long chats

In a normal long chat, intermediate state gets compressed into the final answer. That is convenient, but it can hide weak spots.

With a workflow, the plan moves into a script. The runtime can hold phase state, branch logic, repeated checks, and failed agents without forcing everything into the chat context.
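Here is a rough Python sketch of what "the plan moves into a script" means to me. It is a toy model under my own assumptions, not how Claude Code implements workflows. `PhaseRecord`, `run_phase`, and `run_workflow` are hypothetical names. The point is only that each phase keeps its own results and failures, so a failed task stays on the record instead of disappearing into a summary.

```python
from dataclasses import dataclass, field


@dataclass
class PhaseRecord:
    """State kept for one phase: what finished and what failed."""
    name: str
    results: list = field(default_factory=list)
    failures: list = field(default_factory=list)


def run_phase(name, tasks):
    """Run every (label, callable) task and record failures without hiding them."""
    record = PhaseRecord(name)
    for label, task in tasks:
        try:
            record.results.append((label, task()))
        except Exception as exc:
            # Keep the failure visible rather than aborting the whole phase.
            record.failures.append((label, repr(exc)))
    return record


def run_workflow(phases):
    """Run phases in order and return one record per phase."""
    return [run_phase(name, tasks) for name, tasks in phases]
```

A caller can then inspect `record.failures` per phase afterwards, which is roughly what I was doing by hand in the /workflows view.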

That makes it better for tasks where process quality matters:

  • codebase audits;
  • large migrations;
  • cross-checked research;
  • repeated claim verification;
  • multi-angle planning.

It is not a good fit for every task. For small edits or quick debugging, it is overkill.

My checklist

Before running a dynamic workflow, I would do this:

  1. Narrow the question.
  2. Run a small version first.
  3. Open /workflows, not just the final report.
  4. Inspect failed, killed, contradicted, and not-verified states separately.
  5. Do not treat killed as a refutation.
  6. Save the workflow only after it proves useful.
  7. Watch token usage.

Takeaway

Dynamic workflows are not just "more subagents."

They are a way to turn a long task into an inspectable process.

My run partially failed, but that failure was visible. For automated research and coding-agent work, visible failure is much better than a clean answer that hides uncertainty.

If you are testing Opus 4.8 for Claude Code, coding agents, or long-running workflows, I keep the model notes and API routing details here: Claude Opus 4.8 API for Coding Agents.