DEV Community

rokoss21

Parallel Agents Are Easy. Shipping Without Chaos Isn’t.

Introducing Swarm-IOSM — a Parallel Subagent Orchestration Engine for Claude Code

Everyone is building multi-agent workflows now.

Swarm prompts. Agent teams. Tool calling. “Auto-developers”.

And yet… most of them collapse the moment you try to use them on real codebases.

Not because the models can’t code.

Because parallel development has two hard problems that prompt-chains don’t solve:

  1. Safe concurrency (two agents writing into the same file is not “parallelism”, it’s a race condition)
  2. Stop conditions (how do you know the result is shippable, not just “it ran”)

I built Swarm-IOSM to turn agent orchestration into an engineering discipline:
locks, dispatch scheduling, gates, and anti-chaos rules — executable, repeatable, and production-oriented.

GitHub: https://github.com/rokoss21/swarm-iosm


The Hidden Failure Mode of “Agent Swarms”

Here’s the truth nobody wants to say out loud:

Most “agent swarms” are just concurrency without a correctness model.

They don’t fail spectacularly. They fail quietly:

  • Agent A fixes a bug and touches auth.py
  • Agent B adds a feature and also touches auth.py
  • You merge both and discover behavior drift
  • The PR looks large, architecture degrades, confidence drops
  • Then the swarm spawns more tasks to “fix” the mess
  • Congratulations, you built a self-replicating backlog generator

The root cause is simple:

“Parallel agents” ≠ Parallel development

Parallel development requires conflict prevention, not conflict resolution.


Swarm-IOSM: IOSM Methodology + Execution Engine

IOSM is the methodology:

Improve → Optimize → Shrink → Modularize
A disciplined loop that forces engineering quality to remain measurable, not performative.

Swarm-IOSM is the execution engine:

  • PRD-driven decomposition
  • Continuous dispatch scheduling
  • File-conflict prevention via lock discipline
  • Auto-spawn protocol for discoveries
  • Quality gates as stop conditions

It’s not “a prompt”.

It’s a workflow runtime for parallel software development inside Claude Code.


The Architecture: An Orchestrator That Does Not Implement

Swarm-IOSM is intentionally designed around one rule:

The Orchestrator does NOT implement.

The main agent coordinates only.

All implementation work happens in subagents, each producing a report.

This is not a style preference — it’s a safety boundary.

When the orchestrator writes code, it stops being a scheduler and becomes “yet another contributor”, losing global coordination ability.

So Swarm-IOSM splits responsibilities cleanly:

  • Orchestrator = scheduling + gates + conflict check + state tracking
  • Subagents = execution + reports + spawn candidates


The Core Engine: Continuous Dispatch (No Wave Barriers)

Most orchestration frameworks work like this:

Prepare plan → run wave 1 → wait → run wave 2 → wait → merge

That’s not how software work actually flows.

Reality is continuous: tasks unblock tasks every minute.

Swarm-IOSM implements continuous dispatch scheduling:

  • tasks move through states: backlog → ready → running → done
  • as soon as dependencies are satisfied, tasks are eligible to run
  • you dispatch ready tasks immediately (no waiting for a “wave boundary”)

This is what makes it feel fast.

It maximizes parallelism without turning the repo into a battlefield.
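
A minimal sketch of that dispatch loop, assuming tasks carry a set of dependency IDs and a single state field. The names `Task`, `promote_ready`, `dispatch`, and `complete` are illustrative, not taken from the Swarm-IOSM codebase, and the `max_parallel` cap is an assumption added for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    deps: set[str] = field(default_factory=set)
    state: str = "backlog"  # backlog -> ready -> running -> done


def promote_ready(tasks: dict[str, Task]) -> list[str]:
    """Move backlog tasks whose dependencies are all done into 'ready'."""
    promoted = []
    for task in tasks.values():
        if task.state == "backlog" and all(
            tasks[dep].state == "done" for dep in task.deps
        ):
            task.state = "ready"
            promoted.append(task.task_id)
    return promoted


def dispatch(tasks: dict[str, Task], max_parallel: int) -> list[str]:
    """Start ready tasks immediately, up to a (hypothetical) parallelism cap."""
    running = sum(1 for t in tasks.values() if t.state == "running")
    started = []
    for task in tasks.values():
        if running >= max_parallel:
            break
        if task.state == "ready":
            task.state = "running"
            running += 1
            started.append(task.task_id)
    return started


def complete(tasks: dict[str, Task], task_id: str, max_parallel: int) -> list[str]:
    """Mark one task done, then re-evaluate and dispatch right away."""
    tasks[task_id].state = "done"
    promote_ready(tasks)
    return dispatch(tasks, max_parallel)


tasks = {
    "T01": Task("T01"),
    "T02": Task("T02", {"T01"}),
    "T03": Task("T03"),
    "T04": Task("T04", {"T02", "T03"}),
}
promote_ready(tasks)
first_batch = dispatch(tasks, max_parallel=3)
# T01 finishes while T03 is still running: T02 starts without waiting for T03.
after_t01 = complete(tasks, "T01", max_parallel=3)
```

The point of the sketch is the `complete` function: every finished task triggers a new scheduling pass, so there is no barrier between "waves".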


The Missing Primitive: “Touches” Lock Manager

This is the centerpiece.

Swarm-IOSM treats a codebase like a shared memory system.

If agents are threads, then files are memory regions.

So Swarm introduces a primitive that classic “agent swarms” ignore:

Touches = the set of files/folders a task may modify.

Each task declares:

  • Touches: auth.py, services/auth/
  • Concurrency class:

    • read-only (no locks, always safe)
    • write-local (locks only the declared touches)
    • write-shared (exclusive, sequential)

Then Swarm enforces locks:

  • folder lock blocks everything inside it
  • file lock blocks only that file
  • read-only tasks remain parallel always

Result:

✅ real parallelism
✅ predictable merges
✅ no random collisions “because agent decided to edit config too”
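
The locking rules above can be sketched roughly as follows. This is a simplified model, not the Swarm-IOSM implementation: `overlaps` and `LockManager` are hypothetical names, and I am assuming here that "write-shared" means the task runs alone among writers while read-only tasks are never blocked:

```python
from pathlib import PurePosixPath
from typing import Optional


def overlaps(a: str, b: str) -> bool:
    """Two touches conflict if they are the same path or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


class LockManager:
    def __init__(self) -> None:
        self.held: dict[str, list[str]] = {}   # task_id -> locked touches
        self.exclusive: Optional[str] = None   # task holding the write-shared slot

    def try_acquire(self, task_id: str, touches: list[str], concurrency: str) -> bool:
        if concurrency == "read-only":
            return True  # read-only tasks take no locks
        if self.exclusive is not None:
            return False  # a write-shared task is running; writers wait
        if concurrency == "write-shared":
            if self.held:
                return False  # must wait until all other writers finish
            self.exclusive = task_id
            self.held[task_id] = list(touches)
            return True
        # write-local: conflict only if a touch overlaps an already locked touch
        for locked in self.held.values():
            if any(overlaps(t, other) for t in touches for other in locked):
                return False
        self.held[task_id] = list(touches)
        return True

    def release(self, task_id: str) -> None:
        self.held.pop(task_id, None)
        if self.exclusive == task_id:
            self.exclusive = None


locks = LockManager()
got_folder = locks.try_acquire("T01", ["services/auth/"], "write-local")
# Blocked: the file lives inside the folder T01 already locked.
got_inner_file = locks.try_acquire("T02", ["services/auth/jwt.py"], "write-local")
# Allowed: unrelated file, so it runs in parallel with T01.
got_other_file = locks.try_acquire("T03", ["billing.py"], "write-local")
```

A task that fails `try_acquire` simply stays in the ready state until a running task releases the conflicting touch.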


Auto-Spawn… Without Infinite Task Proliferation

Auto-spawn sounds cool until you actually run it.

A naive swarm will spawn tasks forever.

Swarm-IOSM forces auto-spawn to be bounded and deduplicated:

  • total spawn budget
  • per-gate budgets
  • dedup key: <primary_touch>|<intent_category>
  • severity thresholds
  • anti-loop counters (max iterations without progress)

This is what transforms “agent creativity” into something you can safely run in an engineering process.
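
Here is a small sketch of how budgets, dedup keys, and severity thresholds might combine into one admission check. `SpawnController`, the severity names, and the intent categories are assumptions made for illustration; only the dedup key format `<primary_touch>|<intent_category>` comes from the list above. The anti-loop counter is left out to keep the example short:

```python
from dataclasses import dataclass, field

# Hypothetical severity scale; the real levels may differ.
SEVERITY = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class SpawnController:
    total_budget: int
    gate_budgets: dict[str, int]
    min_severity: str = "medium"
    seen: set[str] = field(default_factory=set)

    def request(
        self, primary_touch: str, intent_category: str, gate: str, severity: str
    ) -> tuple[bool, str]:
        """Accept or reject a spawn candidate; rejected candidates cost nothing."""
        key = f"{primary_touch}|{intent_category}"
        if key in self.seen:
            return False, "duplicate"
        if SEVERITY[severity] < SEVERITY[self.min_severity]:
            return False, "below severity threshold"
        if self.total_budget <= 0:
            return False, "total budget exhausted"
        if self.gate_budgets.get(gate, 0) <= 0:
            return False, "gate budget exhausted"
        self.seen.add(key)
        self.total_budget -= 1
        self.gate_budgets[gate] -= 1
        return True, "accepted"


spawns = SpawnController(total_budget=2, gate_budgets={"I": 1, "O": 1})
first = spawns.request("auth.py", "bugfix", gate="I", severity="high")
repeat = spawns.request("auth.py", "bugfix", gate="I", severity="high")
minor = spawns.request("db.py", "cleanup", gate="O", severity="low")
```

Because every accepted spawn decrements a finite budget, the number of spawned tasks is bounded no matter what the subagents report.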


IOSM Gates: Stop Conditions That Mean Something

Most systems “stop” when tasks finish.

Swarm-IOSM stops when quality is achieved.

It tracks four gate families:

Gate-I (Improve)

Clarity, invariants, low duplication.

Gate-O (Optimize)

Latency budget, error budget, chaos checks, no obvious inefficiencies.

Gate-S (Shrink)

Surface area reduction, dependency stability, onboarding time.

Gate-M (Modularize)

Contracts, coupling limits, no circular dependencies.

Swarm is not just “agents executing tasks”.

It’s agents executing tasks until the system crosses a production threshold.
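
To make "gates as stop conditions" concrete, here is a sketch that assumes each gate reduces to a normalized score compared against a threshold. That scoring model, along with the names `GateResult`, `should_stop`, and `failing_gates`, is my simplification for this example and not the actual gate format:

```python
from dataclasses import dataclass

REQUIRED_GATES = {"I", "O", "S", "M"}


@dataclass
class GateResult:
    gate: str         # "I", "O", "S", or "M"
    score: float      # hypothetical normalized score, 0.0 to 1.0
    threshold: float  # hypothetical pass mark for this gate

    @property
    def passed(self) -> bool:
        return self.score >= self.threshold


def failing_gates(results: list[GateResult]) -> list[str]:
    """Gates that still need work, in a stable order."""
    return sorted(r.gate for r in results if not r.passed)


def should_stop(results: list[GateResult], pending_tasks: int) -> bool:
    """Stop only when no work remains AND every required gate passes."""
    passed = {r.gate for r in results if r.passed}
    return pending_tasks == 0 and REQUIRED_GATES <= passed


results = [
    GateResult("I", 0.90, 0.80),
    GateResult("O", 0.70, 0.80),
    GateResult("S", 0.85, 0.80),
    GateResult("M", 0.80, 0.80),
]
# All tasks are finished, but Gate-O fails, so the track is not done yet.
done_too_early = should_stop(results, pending_tasks=0)
needs_work = failing_gates(results)
```

An empty task queue is necessary but not sufficient: a failing gate keeps the track open, and the failing gate names tell the orchestrator where follow-up work belongs.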


Quick Start (The Happy Path)

Swarm-IOSM lives here:

https://github.com/rokoss21/swarm-iosm

1) Install as a Claude Code skill

Project-level:

git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm

User-level:

git clone https://github.com/rokoss21/swarm-iosm.git ~/.claude/skills/swarm-iosm

2) Initialize project context

/swarm-iosm setup

3) Create a feature track

/swarm-iosm new-track "Add user authentication with JWT"

Swarm generates a PRD and a plan, then returns a track ID like:

2026-01-17-001

4) Validate & generate a continuous dispatch plan

python .claude/skills/swarm-iosm/scripts/orchestration_planner.py \
  swarm/tracks/<track-id>/plan.md --validate

python .claude/skills/swarm-iosm/scripts/orchestration_planner.py \
  swarm/tracks/<track-id>/plan.md --continuous

5) Execute

/swarm-iosm implement

6) Integrate

/swarm-iosm integrate <track-id>

This produces integration artifacts and quality gate reporting.


Why This Is Different From “Yet Another Agent Framework”

This part matters.

Swarm-IOSM doesn’t compete with “prompt frameworks” by being smarter.

It wins by being stricter.

Swarm-IOSM treats a repo as a concurrency system.

Locks are not optional.

Swarm-IOSM treats quality as a stop condition.

No gates = no ship.

Swarm-IOSM treats spawn as a budgeted resource.

Infinite loops are a design bug, not “agent autonomy”.

You can replace models, providers, or toolchains.

But you can’t replace engineering discipline with vibes.


Real-World Fit: Where Swarm-IOSM Shines

Use Swarm-IOSM when:

  • multi-file features require coordination
  • brownfield refactoring needs guardrails
  • parallel implementation streams are valuable
  • acceptance criteria must exist (not “it compiles”)

Avoid Swarm-IOSM when:

  • it’s a single-file change
  • you want quick fixes without planning
  • you’re doing purely exploratory research

A hammer is not a screwdriver.

A swarm is not a substitute for architecture.


The Meta-Point: This Is Part of a Bigger Stack

I’m building a full deterministic engineering ecosystem around AI systems:

  • IOSM = methodology layer
  • Swarm-IOSM = execution/orchestration layer
  • FACET = deterministic contract layer for AI behavior

If you’ve read my FACET articles, you already know the thesis:

We don’t need “more prompting”.
We need engineering primitives: contracts, determinism, orchestration rules, replayable artifacts.

Swarm-IOSM is exactly that philosophy applied to parallel agent development.


Closing Thoughts

Parallel agents are not the hard part.

The hard part is shipping without chaos:

  • no file conflicts
  • no accidental coupling
  • no architecture collapse
  • no infinite spawn loops
  • gates that enforce engineering quality

Swarm-IOSM is my answer to that.

If you’re using Claude Code and you’ve ever tried to scale beyond a single agent — try it:

https://github.com/rokoss21/swarm-iosm

And if you want the next deep dive, I can write a follow-up on:

  • the touches lock hierarchy rules
  • a demo track walkthrough
  • how IOSM gates can be automated for CI

Top comments (1)

rokoss21 • Edited

If you tried multi-agent coding: what broke first — file conflicts, endless follow-up tasks, or lack of “done” criteria?

I built Swarm-IOSM to solve those three failure modes.

Try it in Claude Code:
/swarm-iosm new-track "Add JWT auth"