I Built an Orchestration Layer to Manage Multiple Cursor Agents

Once I started using multiple coding agents in parallel, I ran into an unexpected problem:

The bottleneck was no longer code generation.

It was coordination.

Running one agent feels magical.

Running five or six agents at once feels like managing a messy engineering team with no org chart, no ownership boundaries, and no reliable way to validate what just happened.

That was the reason I built SAMAMS:

Sentinel Automated Multiple AI Management System

It is an orchestration layer for managing multiple Cursor agents like an actual engineering organization.


The problem

At first, adding more agents sounds like obvious leverage.

More agents should mean:

  • more parallel work
  • faster iteration
  • less time writing repetitive code

But in practice, I kept hitting the same problems:

  • multiple agents touched the same files
  • work was duplicated
  • outputs conflicted with each other
  • one agent could quietly drift off course and burn tokens
  • I still had to manually babysit the entire process

So instead of getting leverage, I got overhead.

The more agents I added, the more coordination cost I created for myself.

That was the real problem:

we have coding agents, but we do not really have agent management systems.


The idea

My basic thought was simple:

If humans scale through ownership, boundaries, task decomposition, and validation, then agents should probably be managed the same way.

So instead of treating agents like glorified autocomplete, I started treating them more like workers in a structured engineering system.

That became SAMAMS.

Its job is to:

  • break work into manageable layers
  • assign bounded responsibility
  • isolate agent execution
  • monitor progress
  • detect conflicts
  • recover when things go wrong

How it works

The planning model is divided into three levels:

Proposal

A large project or initiative.

Milestone

A meaningful feature-level chunk inside the proposal.

Task

An atomic execution unit that one agent can handle in a single working session.

This gives the system a hierarchy instead of just a pile of prompts.
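The three-level hierarchy maps naturally onto plain data structures. Here is a minimal Go sketch of that shape; the field names and statuses are my own illustration, not SAMAMS's actual schema:

```go
package main

import "fmt"

// Task is an atomic unit one agent can finish in a single session.
type Task struct {
	ID     string
	Title  string
	Status string // e.g. "pending", "running", "done", "failed"
}

// Milestone is a feature-level chunk inside a proposal.
type Milestone struct {
	ID    string
	Title string
	Tasks []Task
}

// Proposal is the top-level project or initiative.
type Proposal struct {
	ID         string
	Title      string
	Milestones []Milestone
}

// pendingTasks walks the hierarchy and returns tasks still waiting for an agent.
func pendingTasks(p Proposal) []Task {
	var out []Task
	for _, m := range p.Milestones {
		for _, t := range m.Tasks {
			if t.Status == "pending" {
				out = append(out, t)
			}
		}
	}
	return out
}

func main() {
	p := Proposal{
		Title: "Auth revamp",
		Milestones: []Milestone{{
			Title: "Login flow",
			Tasks: []Task{
				{ID: "t1", Status: "pending"},
				{ID: "t2", Status: "done"},
			},
		}},
	}
	fmt.Println(len(pendingTasks(p))) // prints 1
}
```

The point of the hierarchy is exactly this kind of traversal: the orchestrator can always answer "what is left to assign?" without re-reading a pile of prompts.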

Each task also gets a constrained instruction document, which I call a frontier.

It tells the agent what it should do, what it should avoid touching, and what boundary it should stay within.
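A frontier can be as small as a structured document rendered into the agent's prompt. This is a hypothetical sketch of that shape; the `Frontier` type, its fields, and the rendered format are my assumptions, not SAMAMS's real format:

```go
package main

import "fmt"

// Frontier is the per-task instruction document: what to do,
// what not to touch, and which boundary to stay within.
type Frontier struct {
	TaskID   string
	Goal     string   // what the agent should do
	Avoid    []string // files or areas it must not touch
	Boundary []string // path prefixes it may modify
}

// Render turns the frontier into the prompt text handed to the agent.
func (f Frontier) Render() string {
	return fmt.Sprintf(
		"Task %s\nGoal: %s\nDo not touch: %v\nStay within: %v\n",
		f.TaskID, f.Goal, f.Avoid, f.Boundary,
	)
}

func main() {
	f := Frontier{
		TaskID:   "t1",
		Goal:     "Add login endpoint",
		Avoid:    []string{"internal/billing/"},
		Boundary: []string{"internal/auth/"},
	}
	fmt.Print(f.Render())
}
```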

That part matters a lot.

Without boundaries, agents do not really parallelize well.

They collide.


Bounded ownership

One of the main design ideas came from Domain-Driven Design.

Each agent is assigned a bounded context:

  • a set of files
  • a module
  • a responsibility area
  • a clearly scoped task

The goal is to reduce chaos.

Instead of every agent working inside the same shared mess, each one gets a defined workspace and a defined responsibility.

This makes parallel execution much more practical.
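The boundary can also be enforced mechanically: give each agent a set of allowed path prefixes, then flag any changed file that falls outside them before merging. A sketch, assuming prefix-based scoping (the `BoundedContext` type and its methods are hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// BoundedContext is the responsibility area assigned to one agent:
// a set of path prefixes the agent may touch.
type BoundedContext struct {
	AgentID string
	Allowed []string // e.g. "internal/auth/"
}

// InBounds reports whether a changed file path falls inside the agent's context.
func (b BoundedContext) InBounds(path string) bool {
	for _, prefix := range b.Allowed {
		if strings.HasPrefix(path, prefix) {
			return true
		}
	}
	return false
}

// Violations filters a diff's file list down to out-of-bounds paths,
// which the orchestrator can flag before anything merges.
func (b BoundedContext) Violations(changed []string) []string {
	var out []string
	for _, p := range changed {
		if !b.InBounds(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	ctx := BoundedContext{AgentID: "agent-1", Allowed: []string{"internal/auth/"}}
	fmt.Println(ctx.Violations([]string{
		"internal/auth/login.go",
		"internal/billing/invoice.go",
	}))
	// prints [internal/billing/invoice.go]
}
```

A check like this is cheap to run on every diff, which is what turns "please stay in your lane" from a prompt instruction into something the system can verify.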


Git worktree isolation

To make that boundary real, each agent runs in its own git worktree.

That means agents do not just have conceptual separation.

They have physical workspace separation too.

This helps prevent:

  • accidental file overlap
  • branch confusion
  • chaotic local changes
  • merge disorder

On top of that, I added merge control logic so agent outputs do not all slam into the main branch at once.
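The isolation step itself is just `git worktree add` per agent. A minimal sketch that builds the invocation, assuming one worktree and one branch per agent-task pair (the directory layout and branch naming here are illustrative, not SAMAMS's actual convention; the real command would be executed via `os/exec`):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// worktreeArgs builds the `git worktree add` arguments that give one agent
// its own physical workspace on a dedicated branch.
// Run as: exec.Command("git", worktreeArgs(root, agentID, taskID)...)
func worktreeArgs(root, agentID, taskID string) []string {
	dir := filepath.Join(root, ".worktrees", agentID)
	branch := fmt.Sprintf("agent/%s/%s", agentID, taskID)
	return []string{"worktree", "add", "-b", branch, dir, "HEAD"}
}

func main() {
	// On Linux this yields:
	// [worktree add -b agent/agent-1/t42 /repo/.worktrees/agent-1 HEAD]
	fmt.Println(worktreeArgs("/repo", "agent-1", "t42"))
}
```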


Failure handling

This was one of the most important parts for me.

In most agent workflows, when something goes wrong, the human has to stop everything, inspect the state, figure out what happened, and manually decide what to do next.

That does not scale.

So I added a structured recovery flow.

If an agent keeps failing, or if conflicts are detected, the system can:

  • pause the task
  • collect current state
  • inspect diffs and logs
  • run a planning pass
  • decide whether to retry, resume, or cancel

I call this a strategy meeting.

It is basically a controlled intervention step for agent failure cases.
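The outcome of a strategy meeting boils down to a small decision function. A hedged sketch with illustrative inputs and thresholds (the real system inspects diffs and logs; this only shows the shape of the retry/resume/cancel decision):

```go
package main

import "fmt"

// Action is what the strategy meeting decides to do with a stuck task.
type Action int

const (
	Retry Action = iota
	Resume
	Cancel
)

// decide picks an action after the task is paused and its state collected.
// The threshold of 3 failures is an illustrative choice, not a real default.
func decide(failures int, hasConflict bool, partialProgress bool) Action {
	switch {
	case failures >= 3:
		return Cancel // repeated failure: stop burning tokens
	case hasConflict:
		return Retry // replan and retry inside a clean boundary
	case partialProgress:
		return Resume // state looks sane, keep going
	default:
		return Retry
	}
}

func main() {
	fmt.Println(decide(3, false, false) == Cancel) // prints true
}
```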


What I realized while building this

The deeper lesson was that the hard problem is not:

"How do I get an agent to write code?"

It is:

"How do I coordinate many agents safely and predictably?"

And that question leads directly to another one:

How do you validate outcomes deterministically?

Agents are not deterministic.

But validation must be.

That means the long-term answer is not just better orchestration.

It is orchestration plus strong validation:

  • build checks
  • test automation
  • type checks
  • conflict detection
  • duplication detection
  • structured success/failure metrics

I think this is the missing layer in a lot of current agent tooling.


Current stack

Right now the project uses:

  • Go for the backend
  • Go for the agent process proxy
  • React for the frontend

It currently runs locally and supports Cursor agents.

The architecture is meant to be extensible, though.

My goal is not "a Cursor-only hack," but a general orchestration layer for multi-agent development workflows.


Current status

This is still early.

It works, but it is definitely not polished production software yet.

There are still rough edges around:

  • cleanup flows
  • existing repo onboarding
  • error handling
  • broader runner support

So this is less of a "finished launch" and more of a serious prototype for a problem I think will become much more important.


Why I open sourced it

Because I do not think this is just my problem.

As more developers move from using one agent to using many in parallel, I think they will hit the same wall:

not generation,

but coordination.

That is the problem I am trying to solve with SAMAMS.

Repo:

https://github.com/teamswyg/samams

I would especially love feedback on:

  • multi-agent coordination
  • deterministic validation
  • failure recovery
  • support for more agent runners

Final thought

We already know agents can generate code.

The harder and more interesting problem is building the system around them:

the structure, the boundaries, the validation, and the recovery logic that make parallel agent work actually usable.

That is what I am exploring with SAMAMS.