Parallel agent demos look great right up until two tasks both try to use the same mouse.
One task is logging into a site. Another opens a browser window. A third is only supposed to answer a status question, but now the whole system is clicking in the wrong place, cancelling the wrong run, or reporting a failure that is really just resource contention.
That was one of the more useful reliability lessons I learned while building CliGate, my local control plane for a resident assistant, Claude Code, Codex, channels, scheduled work, and desktop automation.
The real bug was not "parallelism"
Parallelism was fine.
The bug was pretending every task could use every resource at the same time.
Code tasks can run in parallel. A weather lookup can happen while a runtime session is still working. A background summary does not need to block anything.
But the desktop is different.
There is only one physical keyboard, one mouse, and one visible screen. If two agent runs both think they own that surface, the result is not intelligent multitasking. It is sabotage.
I first made the classic bad fix
My first instinct was the lazy one:
if a new task shows up and something else is already running, cancel the old run and let the new one take over.
That sounds simple. It is also wrong.
A user asking "how far did it get?" should not cancel a login flow. A user asking for the weather should not kill a desktop task. And the worst version of this bug is when an agent sees another active run and accidentally cancels itself.
That was the moment I stopped treating concurrency as a prompt problem.
It was a resource problem.
The fix was to separate task concurrency from resource ownership
The rule I wanted turned out to be much simpler than the behavior I had before:
- independent tasks should run in parallel
- tasks that need the desktop should queue
- cancellation should only happen when the user clearly asks for it
That meant I needed the assistant to know something more concrete than "there are active runs."
It needed to know:
- which run currently holds the desktop
- whether the new request is only a status query
- whether the new task can run without the desktop
- whether the user is correcting the current task or replacing it
That sounds obvious in hindsight, but it changes the control flow a lot.
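To make the four facts above concrete, here is a minimal Python sketch of how they might be captured and turned into a routing decision. Every name in it (`Intent`, `RoutingContext`, `route`, and the action strings) is hypothetical and not taken from CliGate. Forwarding a correction to the current holder and queueing a replacement instead of cancelling for it are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Intent(Enum):
    STATUS_QUERY = auto()  # "how far did it get?"
    NEW_TASK = auto()      # unrelated work
    CORRECTION = auto()    # adjusts the task that is already running
    REPLACEMENT = auto()   # a different task instead of the current one
    STOP = auto()          # explicit request to cancel


@dataclass(frozen=True)
class RoutingContext:
    desktop_holder: Optional[str]  # run id holding the desktop, or None
    intent: Intent
    needs_desktop: bool


def route(ctx: RoutingContext) -> str:
    """Map the known facts to one action (hypothetical action names)."""
    if ctx.intent is Intent.STATUS_QUERY:
        return "answer_from_status"   # never touches running work
    if ctx.intent is Intent.STOP:
        return "cancel"               # the only path that cancels
    if ctx.intent is Intent.CORRECTION and ctx.desktop_holder:
        return "forward_to_holder"    # assumption: corrections go to the holder
    if not ctx.needs_desktop:
        return "run_parallel"
    if ctx.desktop_holder is None:
        return "acquire_desktop"
    return "queue"                    # a replacement also lands here: no implicit cancel
```

The point of the sketch is that cancellation appears on exactly one branch, and that branch is only reachable from an explicit stop.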
I ended up treating the desktop like a leased resource
In CliGate, desktop input is now handled more like a lease than a best-effort tool call.
A task that starts using the mouse and keyboard becomes the current desktop holder. Another task can still exist, but it does not get to click on top of the first one. It waits.
The mental model is closer to this:
new task arrives
-> does it need the desktop?
-> if no, run in parallel
-> if yes and desktop is free, acquire it
-> if yes and desktop is busy, queue it
-> only cancel when the user explicitly says stop
That was the missing boundary.
Before that, I had a lot of behavior that looked concurrent in logs but felt random to the user.
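The lease itself can be very small. The following is a sketch under stated assumptions, not CliGate's implementation: `DesktopLease`, `request`, and `release` are hypothetical names, runs are identified by plain strings, and waiting runs are served in first-in, first-out order.

```python
from collections import deque
from typing import Deque, Optional


class DesktopLease:
    """Single-holder lease with a FIFO wait queue (hypothetical names)."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None
        self.waiting: Deque[str] = deque()

    def request(self, run_id: str) -> str:
        """Return "acquired" if run_id now holds the desktop, else "queued"."""
        if self.holder is None:
            self.holder = run_id
        if self.holder == run_id:
            return "acquired"
        if run_id not in self.waiting:
            self.waiting.append(run_id)  # wait in line; never cancel the holder
        return "queued"

    def release(self, run_id: str) -> Optional[str]:
        """Release the lease and return the run that holds it next, if any."""
        if run_id != self.holder:
            raise ValueError(f"{run_id} does not hold the desktop")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


lease = DesktopLease()
lease.request("login-flow")    # "acquired"
lease.request("open-browser")  # "queued"
lease.release("login-flow")    # hands the desktop to "open-browser"
```

Tasks that do not need the desktop never call `request` at all, which is what keeps them parallel.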
The queue mattered more than another retry loop
One of the subtle failures in desktop automation is that retries can make things worse.
If a second task keeps trying to grab the mouse while the first task is still typing or waiting for a window, more retries do not increase reliability. They just increase interference.
So the better fix was not:
- retry the click harder
- guess a different coordinate
- keep asking the model what to do
The better fix was to make the assistant say the honest thing:
the desktop is busy, I am queued behind the current task, and I will start automatically when it is released
That turns a confusing failure into predictable behavior.
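One way to get "wait, do not retry" behavior is to let the second task suspend on a lock instead of polling for the mouse. The sketch below uses Python's `asyncio.Lock`, which wakes waiters in the order they started waiting. The task names, the log format, and the `asyncio.sleep` calls standing in for real desktop work are all assumptions made for illustration.

```python
import asyncio
from typing import List


async def run_desktop_task(
    name: str, desktop: asyncio.Lock, log: List[str], work_seconds: float
) -> None:
    if desktop.locked():
        # Say the honest thing instead of retrying the click.
        log.append(f"{name}: desktop is busy, queued behind the current task")
    async with desktop:  # suspends here; no polling, no repeated attempts
        log.append(f"{name}: started")
        await asyncio.sleep(work_seconds)  # stand-in for mouse/keyboard work
        log.append(f"{name}: released the desktop")


async def demo() -> List[str]:
    desktop = asyncio.Lock()
    log: List[str] = []
    first = asyncio.create_task(run_desktop_task("login", desktop, log, 0.05))
    await asyncio.sleep(0)  # let "login" take the desktop first
    second = asyncio.create_task(run_desktop_task("browser", desktop, log, 0.01))
    await asyncio.gather(first, second)
    return log
```

Running `asyncio.run(demo())` yields a log in which "browser" reports that it is queued, then starts only after "login" has released the desktop.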
I also had to block the dumbest cancellation path
There was another bug hiding inside the same area.
If the assistant is allowed to cancel any active run it sees, it needs a hard rule against cancelling the run it is currently inside.
So I treated that as an invariant rather than a suggestion:
- do not list the current run as a cancel target
- reject self-cancel if it is somehow attempted anyway
- check for cancellation continuously so cancelled runs actually stop cleanly
That part is not glamorous, but it is the difference between a trustworthy scheduler and an agent that panic-clicks its own off switch.
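The three rules above fit in a small registry. This is an illustrative sketch with hypothetical names (`RunRegistry`, `SelfCancelError`, `run_steps`), assuming each run has a string id and checks a cancellation flag between steps. It is not CliGate's actual scheduler.

```python
import threading
from typing import Dict, List


class SelfCancelError(Exception):
    """Raised when a run tries to cancel itself."""


class RunRegistry:
    def __init__(self) -> None:
        self._cancel_flags: Dict[str, threading.Event] = {}

    def start(self, run_id: str) -> None:
        self._cancel_flags[run_id] = threading.Event()

    def cancel_targets(self, current_run: str) -> List[str]:
        # Rule 1: the current run is never offered as a cancel target.
        return [r for r in self._cancel_flags if r != current_run]

    def cancel(self, target: str, requested_by: str) -> None:
        # Rule 2: reject self-cancel even if it is attempted anyway.
        if target == requested_by:
            raise SelfCancelError(f"{requested_by} cannot cancel itself")
        self._cancel_flags[target].set()

    def is_cancelled(self, run_id: str) -> bool:
        return self._cancel_flags[run_id].is_set()


def run_steps(registry: RunRegistry, run_id: str, steps: List[str]) -> List[str]:
    """Rule 3: check for cancellation before every step, then stop cleanly."""
    completed: List[str] = []
    for step in steps:
        if registry.is_cancelled(run_id):
            break
        completed.append(step)  # stand-in for doing the real work
    return completed
```

Filtering the target list and rejecting the call are deliberately redundant: the first keeps the option out of the model's view, the second holds even if the first is bypassed.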
The UX got simpler once the system became less "smart"
This is the pattern I keep running into with agent tooling.
A lot of bad behavior comes from trying to be clever too early.
What users actually need is often plainer than that:
- if a task does not conflict, let it run
- if it conflicts over one physical resource, queue it
- if the user asks for status, answer from status
- if the user says stop, stop
That is less magical than a giant orchestration layer that tries to infer everything from every message.
It is also much easier to trust.
That resource-aware queue ended up making parallel agent work feel more human, not less. The assistant stopped acting like every incoming message was a fight for control, and started acting more like an operator that understands the difference between a question, a correction, and a second pair of hands trying to grab the same mouse.
If you are building AI tooling that touches the desktop, this is the part I would not fake: parallel tasks are fine, but physical resources still need ownership.