DEV Community

Verivus OSS Releases


Multiple Agents, Multiple Workstreams, and the Parts That Still Break


I think the current debate around coding agents gets flattened too quickly.

One side says multiple agents are already here. Separate worktrees, specialized roles, parallel streams of work, and a measurable boost in throughput. The other side says a lot of these systems still over-promise, stall, and leave too much coordination work on the human operator.

After looking at our own repo activity and fixing a real compatibility break in grokrs, I think both sides are seeing something real.

The leverage is real.

The fragility is real too.

The weak version of the debate is already over

The weakest version of the debate is whether multiple agents or multiple workstreams can run at the same time at all. In our environment, they clearly can.

I checked recent repo activity across /srv/repos/internal/verivusai-labs and /srv/repos/public and looked for same-hour overlap in both repository work and agent-specific metadata directories.

Here is the short version:

  • 41 git repos scanned
  • 25 with activity in the last four weeks
  • busiest days: 2026-03-21 and 2026-04-05, both with 7 active repos

There were also clear same-hour overlaps between agents and repos:

  • Ghost on 2026-03-30: Claude, Codex, and Cursor active in the same hour
  • arctos on 2026-04-02: Claude and AIVCS active in the same hour
  • GitNexus on 2026-04-05: Claude and Cursor active in the same hour

That is not a vibes-based claim. It is timestamped concurrent work.

So I do not think “multiple workstreams are fake” is a serious position anymore. The better question is whether multiple agents can work in parallel in a way that is reliable, observable, and cheap to integrate.

That is where things get more interesting.

What actually seems to break first

From what I can see, the first failures usually happen around the agent system, not inside the basic idea of parallelism itself.

1. Isolation

This is why worktrees keep coming up in the strongest pro-agent posts.

If multiple agents share mutable state carelessly, they interfere with each other. They overwrite assumptions, pollute local context, and turn parallel work into a race condition.

The useful claim is not “I launched a bunch of agents.” The useful claim is “I gave them isolated execution surfaces, so they could run without stepping on each other.”
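The principle behind worktrees can be sketched in a few lines, assuming nothing about any particular agent framework: every agent gets its own directory as its execution surface, and collisions fail loudly instead of silently merging state. The agent names and layout here are illustrative, not taken from the repos above.

```python
import tempfile
from pathlib import Path

def allocate_workspaces(agents, root):
    """Give each agent its own isolated directory under `root`.

    `exist_ok=False` makes a workspace collision an immediate error
    rather than a silent shared surface.
    """
    workspaces = {}
    for name in agents:
        ws = Path(root) / name
        ws.mkdir(parents=True, exist_ok=False)
        workspaces[name] = ws
    return workspaces

root = tempfile.mkdtemp()
workspaces = allocate_workspaces(["claude", "codex", "cursor"], root)

# Each agent writes only inside its own surface; there is no shared
# mutable state for a second agent to overwrite.
for name, ws in workspaces.items():
    (ws / "notes.txt").write_text(f"{name} was here\n")
```

In a real setup the directories would be `git worktree` checkouts of the same repo, but the invariant is the same: one mutable surface per agent.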

2. Visibility

One of the better skeptical complaints I saw was from Demir Bülbüloğlu on 2026-02-22. The complaint was not just that a system failed. It was that the system claimed to be running multiple agents and then stalled instead of finishing.

That matters because it points to a gap between claimed concurrency and observable concurrency.

Once a system says it is running multiple agents, the operator needs answers to a few basic questions:

  • which task is active
  • which agent owns which workspace
  • whether a tool call finished, failed, or retried
  • whether output was actually produced or quietly dropped

Without that, “multiple workstreams” is not really a workflow model. It is hidden state with a strong marketing wrapper.
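A hypothetical operator-side model of that state, with made-up agent and task names, might look like this: every claimed stream maps to inspectable fields that answer those four questions, and "output quietly dropped" becomes a query instead of a mystery.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ToolCall:
    name: str
    status: str = "running"      # running | finished | failed | retried
    output: Optional[str] = None # None means nothing was actually produced

@dataclass
class AgentStream:
    agent: str
    workspace: str    # which agent owns which workspace
    active_task: str  # which task is active
    tool_calls: list = field(default_factory=list)

# Illustrative operator view: every claimed stream is observable state.
streams = [
    AgentStream("claude", "wt/claude", "fix parser"),
    AgentStream("cursor", "wt/cursor", "update docs"),
]
streams[0].tool_calls.append(
    ToolCall("x_search", status="finished", output="3 results")
)

def dropped_outputs(streams):
    """Tool calls that claim to have finished but produced nothing."""
    return [c for s in streams for c in s.tool_calls
            if c.status == "finished" and c.output is None]
```

None of this is sophisticated; the point is that a system claiming concurrency should be able to populate a structure like this on demand.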

3. Protocol drift

I got a very practical reminder of this while repairing grokrs --x-search.

The break had nothing to do with whether X search was conceptually possible. It was a compatibility problem:

  • grokrs still emitted top-level search_parameters
  • xAI now expects search configuration on the tool objects themselves
  • the old shape fell onto the deprecated Live Search path and returned HTTP 410

After fixing that request shape, more drift showed up:

  • newer Responses payloads included output_text
  • newer tool-backed responses also carried server-side tool usage in a shape our parser did not yet accept

That is a normal systems problem, but I think it is exactly the kind of problem that gets misread in agent discourse. A workflow can be conceptually valid and still be operationally brittle because its boundaries are stale.

4. Human coordination load

This is where the skepticism from priyanka’s 2026-03-11 post lands for me. Even if multiple workstreams are real, the human often still carries the most expensive parts of the workflow:

  • decomposing the work
  • deciding which stream matters more
  • reviewing partial outputs
  • merging conflicting changes
  • deciding what to retry and what to discard

If that burden remains too high, then the system has not really achieved delegation. It has achieved assisted supervision.

What I changed in grokrs

To get grokrs ... --x-search working again, I made a few targeted compatibility fixes:

  1. Stop sending deprecated top-level search_parameters for tool-backed search
  2. Move X search filters onto the x_search tool object
  3. Accept newer response shapes like output_text
  4. Make the usage parser more tolerant of current server-side tool usage payloads

After that, this worked again:

grokrs --profile dev agent --headless --approval-mode allow --x-search \
  --max-iterations 2 "Summarize what people are saying about xAI on X in one sentence."

I also reran the package test suites:

  • grokrs-api: 906 tests passed
  • grokrs-cli: 291 tests passed

I think this is the more useful lesson from the repair: multi-agent systems often degrade first at the boundaries. Not in the screenshot. Not in the prompt demo. At the boundaries.

The synthesis that seems most honest

I do not think the right conclusion is that the optimists are wrong or the skeptics are wrong.

The optimistic posts are right that parallel work is already useful. The skeptical posts are right that a system which merely claims parallelism is not enough.

The synthesis I keep coming back to is:

  • multiple agents are real
  • multiple workstreams are useful
  • neither is self-validating
  • the hard part is shifting from generation to coordination

That is why the interesting engineering work now seems to be moving toward:

  • isolated worktrees and workspace boundaries
  • explicit ownership of subtasks
  • event and progress visibility
  • parsers and clients that survive upstream churn
  • review and merge loops that handle partial failure well

I think that is the part of the story that matters most. “Can an agent write code?” is no longer the whole question. “Can the system around several agents make their work dependable?” is the real one.

Closing thought

Saying “we run multiple agents” is easy.

What matters is whether those agents can work in parallel without corrupting state, whether the operator can see what is happening, whether the system survives interface drift, and whether the outputs are cheap to review and integrate.

That is the line between a screenshot and an operating model.

The X discourse feels like it is converging on that distinction, even when the posts sound like they disagree. One side is seeing the leverage. The other side is seeing the fragility.

I think both are describing the same transition from different angles.

