I Built an Orchestration Layer to Manage Multiple Cursor Agents

Once I started using multiple coding agents in parallel, I ran into an unexpected problem:

The bottleneck was no longer code generation.

It was coordination.

Running one agent feels magical.

Running five or six agents at once feels like managing a messy engineering team with no org chart, no ownership boundaries, and no reliable way to validate what just happened.

That was the reason I built SAMAMS:

Sentinel Automated Multiple AI Management System

It is an orchestration layer for managing multiple Cursor agents like an actual engineering organization.


The problem

At first, adding more agents sounds like obvious leverage.

More agents should mean:

  • more parallel work
  • faster iteration
  • less time writing repetitive code

But in practice, I kept hitting the same problems:

  • multiple agents touched the same files
  • work was duplicated
  • outputs conflicted with each other
  • one agent could quietly drift off course and burn tokens
  • I still had to manually babysit the entire process

So instead of getting leverage, I got overhead.

The more agents I added, the more coordination cost I created for myself.

That was the real problem:

we have coding agents, but we do not really have agent management systems.


The idea

My basic thought was simple:

If humans scale through ownership, boundaries, task decomposition, and validation, then agents should probably be managed the same way.

So instead of treating agents like glorified autocomplete, I started treating them more like workers in a structured engineering system.

That became SAMAMS.

Its job is to:

  • break work into manageable layers
  • assign bounded responsibility
  • isolate agent execution
  • monitor progress
  • detect conflicts
  • recover when things go wrong

How it works

The planning model is divided into three levels:

Proposal

A large project or initiative.

Milestone

A meaningful feature-level chunk inside the proposal.

Task

An atomic execution unit that one agent can handle in a single working session.

This gives the system a hierarchy instead of just a pile of prompts.
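The three-level hierarchy maps naturally onto plain data structures. Here is a minimal Go sketch of that shape; the field names and statuses are my own illustration, not SAMAMS's actual schema:

```go
package main

import "fmt"

// Task is an atomic unit one agent can finish in a single session.
type Task struct {
	ID     string
	Title  string
	Status string // e.g. "pending", "running", "done", "failed"
}

// Milestone is a feature-level chunk inside a proposal.
type Milestone struct {
	ID    string
	Title string
	Tasks []Task
}

// Proposal is the top-level project or initiative.
type Proposal struct {
	ID         string
	Title      string
	Milestones []Milestone
}

// pendingTasks walks the hierarchy and returns tasks still waiting for an agent.
func pendingTasks(p Proposal) []Task {
	var out []Task
	for _, m := range p.Milestones {
		for _, t := range m.Tasks {
			if t.Status == "pending" {
				out = append(out, t)
			}
		}
	}
	return out
}

func main() {
	p := Proposal{
		Title: "Auth revamp",
		Milestones: []Milestone{{
			Title: "Login flow",
			Tasks: []Task{
				{ID: "t1", Status: "pending"},
				{ID: "t2", Status: "done"},
			},
		}},
	}
	fmt.Println(len(pendingTasks(p))) // prints 1
}
```

The point of the hierarchy is exactly this kind of traversal: the orchestrator can always answer "what is left to assign?" without re-reading a pile of prompts.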

Each task also gets a constrained instruction document, which I call a frontier.

It tells the agent what it should do, what it should avoid touching, and what boundary it should stay within.
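A frontier can be as small as a structured document rendered into the agent's prompt. This is a hypothetical sketch of that shape; the `Frontier` type, its fields, and the rendered format are my assumptions, not SAMAMS's real format:

```go
package main

import "fmt"

// Frontier is the per-task instruction document: what to do,
// what not to touch, and which boundary to stay within.
type Frontier struct {
	TaskID   string
	Goal     string   // what the agent should do
	Avoid    []string // files or areas it must not touch
	Boundary []string // path prefixes it may modify
}

// Render turns the frontier into the prompt text handed to the agent.
func (f Frontier) Render() string {
	return fmt.Sprintf(
		"Task %s\nGoal: %s\nDo not touch: %v\nStay within: %v\n",
		f.TaskID, f.Goal, f.Avoid, f.Boundary,
	)
}

func main() {
	f := Frontier{
		TaskID:   "t1",
		Goal:     "Add login endpoint",
		Avoid:    []string{"internal/billing/"},
		Boundary: []string{"internal/auth/"},
	}
	fmt.Print(f.Render())
}
```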

That part matters a lot.

Without boundaries, agents do not really parallelize well.

They collide.


Bounded ownership

One of the main design ideas came from Domain-Driven Design.

Each agent is assigned a bounded context:

  • a set of files
  • a module
  • a responsibility area
  • a clearly scoped task

The goal is to reduce chaos.

Instead of every agent working inside the same shared mess, each one gets a defined workspace and a defined responsibility.

This makes parallel execution much more practical.
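The boundary can also be enforced mechanically: give each agent a set of allowed path prefixes, then flag any changed file that falls outside them before merging. A sketch, assuming prefix-based scoping (the `BoundedContext` type and its methods are hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// BoundedContext is the responsibility area assigned to one agent:
// a set of path prefixes the agent may touch.
type BoundedContext struct {
	AgentID string
	Allowed []string // e.g. "internal/auth/"
}

// InBounds reports whether a changed file path falls inside the agent's context.
func (b BoundedContext) InBounds(path string) bool {
	for _, prefix := range b.Allowed {
		if strings.HasPrefix(path, prefix) {
			return true
		}
	}
	return false
}

// Violations filters a diff's file list down to out-of-bounds paths,
// which the orchestrator can flag before anything merges.
func (b BoundedContext) Violations(changed []string) []string {
	var out []string
	for _, p := range changed {
		if !b.InBounds(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	ctx := BoundedContext{AgentID: "agent-1", Allowed: []string{"internal/auth/"}}
	fmt.Println(ctx.Violations([]string{
		"internal/auth/login.go",
		"internal/billing/invoice.go",
	}))
	// prints [internal/billing/invoice.go]
}
```

A check like this is cheap to run on every diff, which is what turns "please stay in your lane" from a prompt instruction into something the system can verify.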


Git worktree isolation

To make that boundary real, each agent runs in its own git worktree.

That means agents do not just have conceptual separation.

They have physical workspace separation too.

This helps prevent:

  • accidental file overlap
  • branch confusion
  • chaotic local changes
  • merge disorder

On top of that, I added merge control logic so agent outputs do not all slam into the main branch at once.
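The isolation step itself is just `git worktree add` per agent. A minimal sketch that builds the invocation, assuming one worktree and one branch per agent-task pair (the directory layout and branch naming here are illustrative, not SAMAMS's actual convention; the real command would be executed via `os/exec`):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// worktreeArgs builds the `git worktree add` arguments that give one agent
// its own physical workspace on a dedicated branch.
// Run as: exec.Command("git", worktreeArgs(root, agentID, taskID)...)
func worktreeArgs(root, agentID, taskID string) []string {
	dir := filepath.Join(root, ".worktrees", agentID)
	branch := fmt.Sprintf("agent/%s/%s", agentID, taskID)
	return []string{"worktree", "add", "-b", branch, dir, "HEAD"}
}

func main() {
	// On Linux this yields:
	// [worktree add -b agent/agent-1/t42 /repo/.worktrees/agent-1 HEAD]
	fmt.Println(worktreeArgs("/repo", "agent-1", "t42"))
}
```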


Failure handling

This was one of the most important parts for me.

In most agent workflows, when something goes wrong, the human has to stop everything, inspect the state, figure out what happened, and manually decide what to do next.

That does not scale.

So I added a structured recovery flow.

If an agent keeps failing, or if conflicts are detected, the system can:

  • pause the task
  • collect current state
  • inspect diffs and logs
  • run a planning pass
  • decide whether to retry, resume, or cancel

I call this a strategy meeting.

It is basically a controlled intervention step for agent failure cases.
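The outcome of a strategy meeting boils down to a small decision function. A hedged sketch with illustrative inputs and thresholds (the real system inspects diffs and logs; this only shows the shape of the retry/resume/cancel decision):

```go
package main

import "fmt"

// Action is what the strategy meeting decides to do with a stuck task.
type Action int

const (
	Retry Action = iota
	Resume
	Cancel
)

// decide picks an action after the task is paused and its state collected.
// The threshold of 3 failures is an illustrative choice, not a real default.
func decide(failures int, hasConflict bool, partialProgress bool) Action {
	switch {
	case failures >= 3:
		return Cancel // repeated failure: stop burning tokens
	case hasConflict:
		return Retry // replan and retry inside a clean boundary
	case partialProgress:
		return Resume // state looks sane, keep going
	default:
		return Retry
	}
}

func main() {
	fmt.Println(decide(3, false, false) == Cancel) // prints true
}
```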


What I realized while building this

The deeper lesson was that the hard problem is not:

"How do I get an agent to write code?"

It is:

"How do I coordinate many agents safely and predictably?"

And that question leads directly to another one:

How do you validate outcomes deterministically?

Agents are not deterministic.

But validation must be.

That means the long-term answer is not just better orchestration.

It is orchestration plus strong validation:

  • build checks
  • test automation
  • type checks
  • conflict detection
  • duplication detection
  • structured success/failure metrics

I think this is the missing layer in a lot of current agent tooling.


Current stack

Right now the project uses:

  • Go for the backend
  • Go for the agent process proxy
  • React for the frontend

It currently runs locally and supports Cursor agents.

The architecture is meant to be extensible, though.

My goal is not "a Cursor-only hack," but a general orchestration layer for multi-agent development workflows.


Current status

This is still early.

It works, but it is definitely not polished production software yet.

There are still rough edges around:

  • cleanup flows
  • existing repo onboarding
  • error handling
  • broader runner support

So this is less of a "finished launch" and more of a serious prototype for a problem I think will become much more important.


Why I open sourced it

Because I do not think this is just my problem.

As more developers move from using one agent to using many in parallel, I think they will hit the same wall:

not generation,

but coordination.

That is the problem I am trying to solve with SAMAMS.

Repo:

https://github.com/teamswyg/samams

I would especially love feedback on:

  • multi-agent coordination
  • deterministic validation
  • failure recovery
  • support for more agent runners

Final thought

We already know agents can generate code.

The harder and more interesting problem is building the system around them:

the structure, the boundaries, the validation, and the recovery logic that make parallel agent work actually usable.

That is what I am exploring with SAMAMS.