DEV Community

Verivus OSS Releases


Multiple Agents, Multiple Workstreams, and the Parts That Still Break


I think the current debate around coding agents gets flattened too quickly.

One side says multiple agents are already here. Separate worktrees, specialized roles, parallel streams of work, and a measurable boost in throughput. The other side says a lot of these systems still over-promise, stall, and leave too much coordination work on the human operator.

After looking at our own repo activity and fixing a real compatibility break in grokrs, I think both sides are seeing something real.

The leverage is real.

The fragility is real too.

The weak version of the debate is already over

The weakest version of the debate is whether multiple agents or multiple workstreams can run at the same time at all. In our environment, they clearly can.

I checked recent repo activity across /srv/repos/internal/verivusai-labs and /srv/repos/public and looked for same-hour overlap in both repository work and agent-specific metadata directories.

Here is the short version:

  • 41 git repos scanned
  • 25 with activity in the last four weeks
  • busiest days: 2026-03-21 and 2026-04-05, both with 7 active repos

There were also clear same-hour overlaps between agents and repos:

  • Ghost on 2026-03-30: Claude, Codex, and Cursor active in the same hour
  • arctos on 2026-04-02: Claude and AIVCS active in the same hour
  • GitNexus on 2026-04-05: Claude and Cursor active in the same hour

That is not a vibes-based claim. It is timestamped concurrent work.

So I do not think “multiple workstreams are fake” is a serious position anymore. The better question is whether multiple agents can work in parallel in a way that is reliable, observable, and cheap to integrate.

That is where things get more interesting.

What actually seems to break first

From what I can see, the first failures usually happen around the agent system, not inside the basic idea of parallelism itself.

1. Isolation

This is why worktrees keep coming up in the strongest pro-agent posts.

If multiple agents share mutable state carelessly, they interfere with each other. They overwrite assumptions, pollute local context, and turn parallel work into a race condition.

The useful claim is not “I launched a bunch of agents.” The useful claim is “I gave them isolated execution surfaces, so they could run without stepping on each other.”
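The principle behind worktrees can be sketched in a few lines, assuming nothing about any particular agent framework: every agent gets its own directory as its execution surface, and collisions fail loudly instead of silently merging state. The agent names and layout here are illustrative, not taken from the repos above.

```python
import tempfile
from pathlib import Path

def allocate_workspaces(agents, root):
    """Give each agent its own isolated directory under `root`.

    `exist_ok=False` makes a workspace collision an immediate error
    rather than a silent shared surface.
    """
    workspaces = {}
    for name in agents:
        ws = Path(root) / name
        ws.mkdir(parents=True, exist_ok=False)
        workspaces[name] = ws
    return workspaces

root = tempfile.mkdtemp()
workspaces = allocate_workspaces(["claude", "codex", "cursor"], root)

# Each agent writes only inside its own surface; there is no shared
# mutable state for a second agent to overwrite.
for name, ws in workspaces.items():
    (ws / "notes.txt").write_text(f"{name} was here\n")
```

In a real setup the directories would be `git worktree` checkouts of the same repo, but the invariant is the same: one mutable surface per agent.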

2. Visibility

One of the better skeptical complaints I saw was from Demir Bülbüloğlu on 2026-02-22. The complaint was not just that a system failed. It was that the system claimed to be running multiple agents and then stalled instead of finishing.

That matters because it points to a gap between claimed concurrency and observable concurrency.

Once a system says it is running multiple agents, the operator needs answers to a few basic questions:

  • which task is active
  • which agent owns which workspace
  • whether a tool call finished, failed, or retried
  • whether output was actually produced or quietly dropped

Without that, “multiple workstreams” is not really a workflow model. It is hidden state with a strong marketing wrapper.
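A hypothetical operator-side model of that state, with made-up agent and task names, might look like this: every claimed stream maps to inspectable fields that answer those four questions, and "output quietly dropped" becomes a query instead of a mystery.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ToolCall:
    name: str
    status: str = "running"      # running | finished | failed | retried
    output: Optional[str] = None # None means nothing was actually produced

@dataclass
class AgentStream:
    agent: str
    workspace: str    # which agent owns which workspace
    active_task: str  # which task is active
    tool_calls: list = field(default_factory=list)

# Illustrative operator view: every claimed stream is observable state.
streams = [
    AgentStream("claude", "wt/claude", "fix parser"),
    AgentStream("cursor", "wt/cursor", "update docs"),
]
streams[0].tool_calls.append(
    ToolCall("x_search", status="finished", output="3 results")
)

def dropped_outputs(streams):
    """Tool calls that claim to have finished but produced nothing."""
    return [c for s in streams for c in s.tool_calls
            if c.status == "finished" and c.output is None]
```

None of this is sophisticated; the point is that a system claiming concurrency should be able to populate a structure like this on demand.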

3. Protocol drift

I got a very practical reminder of this while repairing grokrs --x-search.

The break had nothing to do with whether X search was conceptually possible. It was a compatibility problem:

  • grokrs still emitted top-level search_parameters
  • xAI now expects search configuration on the tool objects themselves
  • the old shape fell onto the deprecated Live Search path and returned HTTP 410

After fixing that request shape, more drift showed up:

  • newer Responses payloads included output_text
  • newer tool-backed responses also carried server-side tool usage in a shape our parser did not yet accept

That is a normal systems problem, but I think it is exactly the kind of problem that gets misread in agent discourse. A workflow can be conceptually valid and still be operationally brittle because its boundaries are stale.

4. Human coordination load

This is where the skepticism from priyanka’s 2026-03-11 post lands for me. Even if multiple workstreams are real, the human often still carries the most expensive parts of the workflow:

  • decomposing the work
  • deciding which stream matters more
  • reviewing partial outputs
  • merging conflicting changes
  • deciding what to retry and what to discard

If that burden remains too high, then the system has not really achieved delegation. It has achieved assisted supervision.

What I changed in grokrs

To get grokrs ... --x-search working again, I made a few targeted compatibility fixes:

  1. Stop sending deprecated top-level search_parameters for tool-backed search
  2. Move X search filters onto the x_search tool object
  3. Accept newer response shapes like output_text
  4. Make the usage parser more tolerant of current server-side tool usage payloads

After that, this worked again:

grokrs --profile dev agent --headless --approval-mode allow --x-search \
  --max-iterations 2 "Summarize what people are saying about xAI on X in one sentence."

I also reran the package test suites:

  • grokrs-api: 906 tests passed
  • grokrs-cli: 291 tests passed

I think this is the more useful lesson from the repair: multi-agent systems often degrade first at the boundaries. Not in the screenshot. Not in the prompt demo. At the boundaries.

The synthesis that seems most honest

I do not think the right conclusion is that the optimists are wrong or the skeptics are wrong.

The optimistic posts are right that parallel work is already useful. The skeptical posts are right that a system which merely claims parallelism is not enough.

The synthesis I keep coming back to is:

  • multiple agents are real
  • multiple workstreams are useful
  • neither is self-validating
  • the hard part is shifting from generation to coordination

That is why the interesting engineering work now seems to be moving toward:

  • isolated worktrees and workspace boundaries
  • explicit ownership of subtasks
  • event and progress visibility
  • parsers and clients that survive upstream churn
  • review and merge loops that handle partial failure well

I think that is the part of the story that matters most. “Can an agent write code?” is no longer the whole question. “Can the system around several agents make their work dependable?” is the real one.

Closing thought

Saying “we run multiple agents” is easy.

What matters is whether those agents can work in parallel without corrupting state, whether the operator can see what is happening, whether the system survives interface drift, and whether the outputs are cheap to review and integrate.

That is the line between a screenshot and an operating model.

The X discourse feels like it is converging on that distinction, even when the posts sound like they disagree. One side is seeing the leverage. The other side is seeing the fragility.

I think both are describing the same transition from different angles.

