DEV Community

rokoss21


Swarm-IOSM: Orchestrating Parallel AI Agents with Quality Gates

TL;DR: Swarm-IOSM is an orchestration engine for Claude Code that transforms complex development tasks into coordinated parallel work streams. It implements continuous dispatch scheduling (no wave barriers), hierarchical file lock management, and enforces IOSM quality gates before merge. Real-world speedup: commonly 3-8x faster than sequential execution.


The Parallel Agent Problem

You're working on a complex feature. It needs:

  • Codebase analysis to understand existing patterns
  • Architecture design for the new system
  • Implementation across 3 modules (independent)
  • Integration tests
  • Security audit

Traditional approach: One agent does everything sequentially. 15 hours of wall-clock time.

What if you could run analysis, design, and implementation in parallel? 4-6 hours.

But here's the catch: parallel AI agents need coordination. They can't all edit the same file. They need to share knowledge. And you need quality guarantees before merging their work.

That's what Swarm-IOSM solves.


What is Swarm-IOSM?

Swarm-IOSM is a Claude Code Skill that orchestrates parallel AI agent execution with built-in quality enforcement. It combines:

  1. Continuous Dispatch Loop — Tasks launch immediately when dependencies are met (no artificial wave barriers)
  2. File Lock Management — Hierarchical conflict detection prevents parallel write chaos
  3. PRD-Driven Planning — Structured requirements → decomposition → execution
  4. IOSM Quality Gates — Automated code quality, performance, and modularity checks
  5. Auto-Spawn Protocol — Agents discover new work during execution

Core Model

Touches → Locks → Gates → Done

A correctness model for parallel agent work:

  • Declare what files you touch
  • Acquire locks to prevent conflicts
  • Pass quality gates
  • Ship

Key Innovation: Continuous Dispatch

Traditional orchestration waits for entire "waves" to complete:

Wave 1: [T01, T02, T03] → Wait for ALL to finish
Wave 2: [T04, T05]      → Can't start until Wave 1 done

Swarm-IOSM uses continuous scheduling:

T01 done → T04 starts IMMEDIATELY (even if T02, T03 still running)

This eliminates idle time and maximizes parallelism. Here's the dispatch algorithm:

while True:
    # 1. Collect ready tasks (deps satisfied, no lock conflicts)
    ready = [t for t in backlog if deps_satisfied(t) and not conflicts(t)]

    # 2. Classify by mode (background vs foreground)
    bg = [t for t in ready if can_auto_background(t)]
    fg = [t for t in ready if needs_user_input(t)]

    # 3. Dispatch batch (up to 6 background + 2 foreground tasks per pass)
    batch_bg, batch_fg = bg[:6], fg[:2]
    launch_parallel(batch_bg, mode='background')
    launch_parallel(batch_fg, mode='foreground')
    # Remove dispatched tasks so the next pass does not relaunch them
    backlog = [t for t in backlog if t not in batch_bg + batch_fg]

    # 4. Monitor completed tasks and queue the follow-up work they report
    for report in collect_completed():
        spawn_candidates = parse_spawn_candidates(report)
        backlog.extend(deduplicate(spawn_candidates))

    # 5. Stop once every quality gate passes
    if all_gates_pass():
        break

Result: Tasks launch as soon as they're ready, not when an arbitrary wave completes.
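To make the difference concrete, here is a minimal sketch that compares the two scheduling styles on a toy task graph. The durations and the dependencies (T04 after T01, T05 after T03) are made-up illustration values, and the model assumes there is always a free agent for every ready task.

```python
from typing import Dict, List

def continuous_finish_times(durations: Dict[str, float],
                            deps: Dict[str, List[str]]) -> Dict[str, float]:
    """Continuous dispatch: a task starts the moment its last dependency finishes."""
    finish: Dict[str, float] = {}

    def finish_time(task: str) -> float:
        if task not in finish:
            start = max((finish_time(d) for d in deps.get(task, [])), default=0.0)
            finish[task] = start + durations[task]
        return finish[task]

    for task in durations:
        finish_time(task)
    return finish

def wave_makespan(durations: Dict[str, float], waves: List[List[str]]) -> float:
    """Wave barriers: each wave waits for the slowest task of the previous wave."""
    return sum(max(durations[t] for t in wave) for wave in waves)

# Hypothetical durations in hours; T02 is the slow task in wave 1
durations = {"T01": 1.0, "T02": 4.0, "T03": 2.0, "T04": 3.0, "T05": 1.0}
deps = {"T04": ["T01"], "T05": ["T03"]}
waves = [["T01", "T02", "T03"], ["T04", "T05"]]

wave_total = wave_makespan(durations, waves)
continuous_total = max(continuous_finish_times(durations, deps).values())
print(f"waves: {wave_total}h, continuous: {continuous_total}h")
```

With these numbers T04 no longer waits for the slow T02, so the whole graph finishes when T02 does instead of three hours later.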


Live Example: Adding Redis Caching

Let's walk through a real track from examples/demo-track/.

Problem

API endpoint /api/natal/chart has 450ms P95 latency. Database CPU at 75% during peak hours.

Goal

Add Redis caching to reduce latency to <200ms and achieve 80%+ cache hit rate.

Step 1: Create Track

/swarm-iosm new-track "Add Redis caching to API endpoints"

Claude generates:

  • PRD.md — 10 sections (Problem, Goals, Requirements, Risks, IOSM Targets)
  • spec.md — Technical design with acceptance tests
  • plan.md — Task breakdown with dependencies

Generated plan (7 tasks):

T01: Analyze current performance (Explorer, 1h, read-only)
T02: Design caching strategy (Architect, 2h, foreground)
T03: Implement cache service (Implementer-A, 3h, background)
T04: Add caching to /natal endpoint (Implementer-B, 2h, background, after T03)
T05: Add caching to /transits endpoint (Implementer-C, 2h, background, after T03)
T06: Integration testing (TestRunner, 2h, background, after T04+T05)
T07: Security audit + merge (Integrator, 1h, foreground, after T06)

Step 2: Execute Plan

/swarm-iosm implement

Orchestrator creates continuous_dispatch_plan.md:

## Initial Ready Set
- T01 (Explorer, background)

## Expected Timeline
Batch 1: T01 → completes in 1h
Batch 2: T02 → completes in 2h (total: 3h)
Batch 3: T03 → completes in 3h (total: 6h)
Batch 4: T04, T05 (PARALLEL) → completes in 2h (total: 8h)
Batch 5: T06 → completes in 2h (total: 10h)
Batch 6: T07 → completes in 1h (total: 11h)

Serial estimate: 13h
Parallel estimate: 11h
Speedup: ~1.2x

But wait — T01 discovers an N+1 query issue:

## SpawnCandidates (from T01 report)

| ID | Subtask | Touches | Effort | Severity |
|----|---------|---------|--------|----------|
| SC-01 | Optimize calculate_aspects N+1 query | `backend/core/astro/natal.py` | M | medium |

Orchestrator auto-spawns SC-01 and adjusts timeline.
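The serial and parallel estimates in the dispatch plan (before SC-01 is added) can be reproduced with a short critical-path calculation. This is a sketch, not the orchestrator's code: the durations come from the generated plan, while the T01 → T02 → T03 ordering is inferred from the batch timeline rather than stated in the task list.

```python
from functools import lru_cache

# task id -> (estimated hours, dependencies)
PLAN = {
    "T01": (1, []),
    "T02": (2, ["T01"]),          # ordering inferred from the timeline
    "T03": (3, ["T02"]),          # ordering inferred from the timeline
    "T04": (2, ["T03"]),
    "T05": (2, ["T03"]),
    "T06": (2, ["T04", "T05"]),
    "T07": (1, ["T06"]),
}

@lru_cache(maxsize=None)
def finish_hour(task: str) -> int:
    """Earliest finish time if the task starts as soon as its dependencies are done."""
    hours, deps = PLAN[task]
    return max((finish_hour(d) for d in deps), default=0) + hours

serial_hours = sum(hours for hours, _ in PLAN.values())
parallel_hours = max(finish_hour(task) for task in PLAN)
speedup = round(serial_hours / parallel_hours, 1)
print(serial_hours, parallel_hours, speedup)
```

Only T04 and T05 overlap in this plan, which is why the estimated speedup stays close to 1x: the critical path runs through almost every task.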

Step 3: Integration & Quality Gates

/swarm-iosm integrate demo-add-caching

Generated iosm_report.md:

## Gate Evaluation Summary

| Gate | Target | Final | Status |
|------|--------|-------|--------|
| Gate-I (Code Quality) | ≥0.75 | 0.89 | ✅ PASS |
| Gate-O (Performance) | Tests pass | All pass | ✅ PASS |
| Gate-M (Modularity) | No circular deps | Pass | ✅ PASS |
| Gate-S (Simplicity) | API stable | N/A | ⚪ SKIP |

IOSM-Index: 0.85 ✅ (threshold: 0.80)

**Result:** APPROVED FOR PRODUCTION MERGE

Results

  • P95 latency: 450ms → 180ms (60% improvement)
  • 🎯 Cache hit rate: 82%
  • All tests passing (24 unit + 6 integration)
  • 🔒 Zero production errors during rollout
  • ⏱️ Total time: 9.25h parallel vs 16h+ sequential (~1.7x faster)

Technical Deep Dive

1. File Lock Management

Challenge: How do you prevent two agents from editing the same file simultaneously?

Solution: Hierarchical lock manager with folder/file awareness.

Lock rules:

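import posixpath

def normalize(path: str) -> str:
    # Assumed helper (not shown in the original): collapse "./", "//" and
    # trailing slashes so "backend/api/" and "backend/api" compare equal
    return posixpath.normpath(path)

def conflicts(lock_a: str, lock_b: str) -> bool:
    a, b = normalize(lock_a), normalize(lock_b)
    # Exact match: both tasks touch the same file or folder
    if a == b:
        return True
    # One lock is a folder that contains the other path
    if a.startswith(b + '/') or b.startswith(a + '/'):
        return True
    return False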

Example:

## Lock Plan

Tasks with overlapping touches (sequential only):
- `backend/core/__init__.py`: T03, T04 → ❌ Cannot run parallel
- `backend/api/`: T05, T06 → ❌ Folder conflict

Safe parallel execution:
- `backend/auth.py` (T02) + `backend/payments.py` (T07) → ✅ No overlap

Read-only tasks: Always parallel (no locks needed).
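As a rough illustration of how the lock rules could drive batch selection, the sketch below greedily picks ready tasks whose touches do not collide with each other or with locks already held. `pick_parallel_batch`, `touches_overlap`, the `normalize` helper, and the file `backend/api/routes.py` are all assumptions made for this example, not part of the Swarm-IOSM codebase.

```python
import posixpath
from typing import Dict, Iterable, List

def normalize(path: str) -> str:
    # Assumed helper: drop trailing slashes and redundant separators
    return posixpath.normpath(path)

def conflicts(lock_a: str, lock_b: str) -> bool:
    a, b = normalize(lock_a), normalize(lock_b)
    return a == b or a.startswith(b + "/") or b.startswith(a + "/")

def touches_overlap(touches_a: Iterable[str], touches_b: Iterable[str]) -> bool:
    """True if any path in one set conflicts with any path in the other."""
    others = list(touches_b)
    return any(conflicts(a, b) for a in touches_a for b in others)

def pick_parallel_batch(ready: Dict[str, List[str]],
                        held: Iterable[str] = ()) -> List[str]:
    """Greedy pick: take each task whose touches are free, then lock them."""
    locked = list(held)
    batch = []
    for task_id, touches in ready.items():
        if not touches_overlap(touches, locked):
            batch.append(task_id)
            locked.extend(touches)
    return batch

ready = {
    "T01": [],                          # read-only: no touches, never blocked
    "T02": ["backend/auth.py"],
    "T05": ["backend/api/"],
    "T06": ["backend/api/routes.py"],   # hypothetical file inside backend/api/
    "T07": ["backend/payments.py"],
}
batch = pick_parallel_batch(ready)
print(batch)
```

T06 is held back because its file sits inside the folder T05 has locked; it becomes eligible again once T05 releases `backend/api/`.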


2. IOSM Quality Gates

Four gates enforce production-grade quality:

Gate-I: Improve (Code Quality)

semantic_coherence: ≥0.95  # Clear naming, no magic numbers
duplication_max: ≤0.05     # Max 5% duplicate code
invariants_documented: true # Pre/post-conditions
todos_tracked: true        # All TODOs in issue tracker

Measured by: AST analysis, clone detection, docstring coverage.

Gate-O: Optimize (Performance & Resilience)

latency_ms:
  p50: ≤100
  p95: ≤200
  p99: ≤500
error_budget_respected: true
chaos_tests_pass: true
no_obvious_inefficiencies: true  # N+1 queries, memory leaks

Measured by: Load testing (locust, k6), chaos engineering, profiling.
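A latency check of this kind could look like the sketch below, which compares percentiles from a list of response times against the Gate-O targets. It assumes the samples have already been exported from a load test as milliseconds; the function names are illustrative and the sample data is made up.

```python
from statistics import quantiles
from typing import Dict, List

# Targets copied from the Gate-O definition above
LATENCY_TARGETS_MS = {"p50": 100, "p95": 200, "p99": 500}

def latency_percentiles(samples_ms: List[float]) -> Dict[str, float]:
    """Estimate p50/p95/p99 from raw samples (needs at least two samples)."""
    cuts = quantiles(samples_ms, n=100)  # 99 cut points; cuts[k-1] is the k-th percentile
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

def gate_o_latency_ok(samples_ms: List[float]) -> bool:
    """True only if every observed percentile is within its target."""
    observed = latency_percentiles(samples_ms)
    return all(observed[name] <= limit for name, limit in LATENCY_TARGETS_MS.items())

# Made-up samples: mostly fast, with a small slow tail
healthy = [80.0] * 90 + [150.0] * 8 + [400.0] * 2
degraded = [80.0] * 90 + [300.0] * 10

print(gate_o_latency_ok(healthy), gate_o_latency_ok(degraded))
```

Percentile estimates from small samples are noisy, so a real gate would want enough requests per run for the p99 figure to mean something.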

Gate-M: Modularize (Clean Boundaries)

contracts_defined: 1.0       # 100% of modules
change_surface_max: 0.20     # ≤20% of codebase touched
no_circular_deps: true
coupling_acceptable: true

Measured by: Dependency graph analysis, interface stability.
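The `no_circular_deps` check boils down to cycle detection on the module import graph. Here is a minimal depth-first sketch; how the import graph is extracted from the codebase is left out, and the module names are hypothetical.

```python
from typing import Dict, List, Optional

def find_cycle(imports: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {module: WHITE for module in imports}
    path: List[str] = []

    def visit(module: str) -> Optional[List[str]]:
        color[module] = GREY
        path.append(module)
        for dep in imports.get(module, []):
            state = color.get(dep, WHITE)
            if state == GREY:             # dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[module] = BLACK
        return None

    for module in list(imports):
        if color[module] == WHITE:
            found = visit(module)
            if found:
                return found
    return None

# Hypothetical graphs: before and after breaking a circular import
before = {"payment": ["invoice"], "invoice": ["payment"], "auth": []}
after = {"payment": ["invoice"], "invoice": [], "auth": []}
print(find_cycle(before), find_cycle(after))
```

Returning the cycle itself, rather than a bare pass/fail, gives a follow-up task something specific to fix.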

Gate-S: Shrink (Minimal Complexity)

api_surface_reduction: ≥0.20  # Or justified growth
dependency_count_stable: true
onboarding_time_minutes: ≤15

Measured by: Public API count, requirements.txt diff, README clarity.

IOSM-Index Calculation:

IOSM-Index = (Gate-I + Gate-O + Gate-M + Gate-S) / 4
Production Threshold: ≥ 0.80

Auto-spawn rules:

  • Gate-I < 0.75 → Spawn clarity/duplication fixes
  • Gate-O fails → Spawn test/performance fixes
  • Gate-M fails → Spawn boundary clarification tasks
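The index calculation can be sketched in a few lines. One detail is an assumption on my part: the formula above averages all four gates, but the demo report skips Gate-S, so this sketch averages only the gates that were actually scored. The scores are illustrative values, not measurements.

```python
from typing import Dict, Optional

PRODUCTION_THRESHOLD = 0.80

def iosm_index(scores: Dict[str, Optional[float]]) -> float:
    """Mean of the evaluated gate scores; a gate scored None counts as skipped."""
    evaluated = [score for score in scores.values() if score is not None]
    if not evaluated:
        raise ValueError("no gate was evaluated")
    return sum(evaluated) / len(evaluated)

def approved_for_merge(scores: Dict[str, Optional[float]]) -> bool:
    return iosm_index(scores) >= PRODUCTION_THRESHOLD

# Illustrative scores only
all_gates = {"Gate-I": 0.90, "Gate-O": 0.80, "Gate-M": 0.85, "Gate-S": 0.85}
gate_s_skipped = {"Gate-I": 0.90, "Gate-O": 0.80, "Gate-M": 0.85, "Gate-S": None}

print(round(iosm_index(all_gates), 2), round(iosm_index(gate_s_skipped), 2))
```

Averaging means a strong gate can mask a weak one, which is presumably why the per-gate auto-spawn rules exist alongside the overall threshold.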

3. Auto-Spawn Protocol

Problem: Agents discover issues during execution (e.g., N+1 queries, missing tests).

Solution: Structured SpawnCandidates section in reports.

Format:

## SpawnCandidates

| ID | Subtask | Touches | Effort | User Input | Severity | Dedup Key | Accept Criteria |
|----|---------|---------|--------|------------|----------|-----------|-----------------|
| SC-01 | Fix missing type annotation | `backend/auth.py` | S | false | medium | auth.py\|type-annot | mypy passes |
| SC-02 | Clarify API contract | `docs/api_spec.yaml` | M | true | high | api_spec\|contract | Contract approved |

Orchestrator actions:

  1. Parse SpawnCandidates from completed task reports
  2. Deduplicate by dedup_key (prevents duplicate work)
  3. If needs_user_input=false and severity != critical → auto-spawn
  4. If needs_user_input=true → Add to blocked queue
  5. Run new tasks through planner and dispatch

Spawn protection: Budget limits (default: 20 auto-spawns per track) prevent infinite loops.
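The orchestrator steps above can be sketched as a small parser plus a triage function. This is an illustration under assumptions, not the project's implementation: the function names are invented, and sending critical or over-budget candidates to the blocked queue is my guess, since the article only says they are not auto-spawned.

```python
import re
from typing import Dict, List, Set, Tuple

Candidate = Dict[str, str]

def split_row(row: str) -> List[str]:
    """Split a table row on unescaped pipes, then unescape pipes inside cells."""
    inner = row.strip().strip("|")
    return [cell.strip().replace("\\|", "|") for cell in re.split(r"(?<!\\)\|", inner)]

def parse_spawn_candidates(report: str) -> List[Candidate]:
    rows = [line for line in report.splitlines() if line.strip().startswith("|")]
    if len(rows) < 3:                      # header + separator + one candidate
        return []
    header = split_row(rows[0])
    return [dict(zip(header, split_row(row))) for row in rows[2:]]

def triage(candidates: List[Candidate], seen_keys: Set[str],
           budget_left: int) -> Tuple[List[Candidate], List[Candidate]]:
    """Return (auto_spawn, blocked); duplicates by Dedup Key are dropped."""
    auto: List[Candidate] = []
    blocked: List[Candidate] = []
    for cand in candidates:
        key = cand["Dedup Key"]
        if key in seen_keys:
            continue
        seen_keys.add(key)
        needs_input = cand["User Input"].lower() == "true"
        critical = cand["Severity"].lower() == "critical"
        if needs_input or critical or len(auto) >= budget_left:
            blocked.append(cand)           # assumption: held for a human decision
        else:
            auto.append(cand)
    return auto, blocked

REPORT = r"""
## SpawnCandidates

| ID | Subtask | Touches | Effort | User Input | Severity | Dedup Key | Accept Criteria |
|----|---------|---------|--------|------------|----------|-----------|-----------------|
| SC-01 | Fix missing type annotation | `backend/auth.py` | S | false | medium | auth.py\|type-annot | mypy passes |
| SC-02 | Clarify API contract | `docs/api_spec.yaml` | M | true | high | api_spec\|contract | Contract approved |
"""

seen: Set[str] = set()
candidates = parse_spawn_candidates(REPORT)
auto, blocked = triage(candidates, seen, budget_left=20)
print([c["ID"] for c in auto], [c["ID"] for c in blocked])
```

Passing the same `seen` set across reports is what prevents two agents that spot the same issue from spawning duplicate work.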


4. Cost Tracking & Model Selection

Model selection rules:

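| Model | Use Case | Cost per 1M tokens (input / output) |
|-------|----------|-------------------------------------|
| Haiku | Read-only analysis | $0.25 / $1.25 |
| Sonnet | Standard implementation | $3.00 / $15.00 |
| Opus | Architecture, security | $15.00 / $75.00 |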

Budget controls:

  • Default limit: $10.00 per track
  • ⚠️ 80% usage → Warning
  • 🛑 100% usage → Pause execution
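A minimal sketch of how these thresholds might be applied, reading the two prices in the model table as input and output rates per million tokens (an assumption). The function names and the token counts in the example are made up.

```python
from typing import Dict, Tuple

# (input, output) USD per 1M tokens, taken from the model table
PRICES_PER_MTOK: Dict[str, Tuple[float, float]] = {
    "haiku": (0.25, 1.25),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call."""
    price_in, price_out = PRICES_PER_MTOK[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

def budget_status(spent: float, budget: float = 10.00) -> str:
    """Map spend to the three budget states: ok, warning (80%), pause (100%)."""
    used = spent / budget
    if used >= 1.0:
        return "pause"
    if used >= 0.8:
        return "warning"
    return "ok"

# Hypothetical usage: one large Sonnet task, then a status check
cost = call_cost("sonnet", input_tokens=1_000_000, output_tokens=200_000)
print(cost, budget_status(6.50), budget_status(8.00), budget_status(10.00))
```

At the default $10.00 limit, the $6.50 spend used as an example elsewhere in this article would still be below the warning threshold.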

Check current spend:

## Cost Tracking (from iosm_state.md)
- budget_total: $10.00
- spent_so_far: $6.50
- remaining: $3.50

Real-World Use Cases

1. Greenfield Feature (Email Notifications)

Task: Add complete email notification system to SaaS app

Plan:

  • T01: Design email templates (Architect, foreground)
  • T02: Implement SMTP service (Implementer-A, background)
  • T03: Add queue system (Implementer-B, background, parallel with T02)
  • T04: Write integration tests (TestRunner, background, after T02+T03)
  • T05: Add API endpoints (Implementer-C, background, after T02)

Results:

  • ~3x faster (4-6h parallel vs 12-15h sequential)
  • 100% test coverage (Gate-O enforcement)
  • 📉 Minimal technical debt (Gate-I: 0.92)

2. Brownfield Refactoring (Payment Module)

Task: Refactor legacy payment processing (5000+ LOC, 3 years old)

Workflow:

  1. Plan mode: Explorer analyzes codebase (read-only, safe)
  2. PRD with rollback strategy
  3. Comprehensive regression tests (before touching code)
  4. Parallel implementation (2 modules refactored simultaneously)
  5. Gate-M fails: Circular dependency detected
  6. Auto-spawn: "Break circular import between Payment and Invoice"
  7. Re-check Gate-M: Pass ✅

Results:

  • 🎯 Gate-driven quality — Forced resolution of hidden issues
  • 🔒 Safe refactor — All tests passing before merge
  • 📊 Measured improvement — 40% reduction in module coupling

3. Multi-Module Feature (Multi-Tenant Architecture)

Task: Add multi-tenancy (affects 8 modules)

Plan: 20+ tasks across 5 waves

  • Wave 1: T01 Design schema (Architect, critical path)
  • Wave 2: T02-T04 Database migrations (3 parallel implementers)
  • Wave 3: T05-T10 Update 6 modules (6 parallel implementers)
  • Wave 4: T11-T15 Tests (5 parallel test runners)
  • Wave 5: T16 Integration

Auto-spawn: 3 critical tasks discovered during execution

Results:

  • 📈 High parallelism — 6 modules updated simultaneously
  • 💰 Budget control — $6.50 spent (within $10 limit)
  • ⏱️ Time savings — ~18h parallel vs 60h+ sequential

Getting Started (5 Minutes)

Installation

# Clone into Claude Code skills directory
git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm

Verify: type /swarm-iosm in Claude Code.

Create Your First Track

/swarm-iosm new-track "Add user authentication with JWT"

Claude will:

  1. Ask questions (mode: greenfield/brownfield, priorities, constraints)
  2. Generate PRD (10 sections)
  3. Create plan.md with task breakdown
  4. Show orchestration plan

Execute

/swarm-iosm implement

Watch the magic:

  • Parallel agents launch automatically
  • Progress tracked in iosm_state.md
  • Reports appear in reports/ directory

Integrate

/swarm-iosm integrate <track-id>

Quality gates run automatically. You get iosm_report.md with pass/fail.


Commands Reference

| Command | Description |
|---------|-------------|
| /swarm-iosm setup | Initialize project context |
| /swarm-iosm new-track "<desc>" | Create feature track |
| /swarm-iosm implement | Execute plan (auto mode) |
| /swarm-iosm status | Check progress |
| /swarm-iosm watch | Live monitoring (v1.3) |
| /swarm-iosm simulate | Dry-run with timeline (v1.3) |
| /swarm-iosm resume | Resume after crash (v1.3) |
| /swarm-iosm retry <task-id> | Retry failed task (v1.2) |
| /swarm-iosm integrate <id> | Merge and run gates |

What Swarm-IOSM is NOT

To set clear expectations:

  • Not a general-purpose workflow engine — Designed specifically for Claude Code agent orchestration
  • Not a replacement for CI/CD — Complements your pipeline, doesn't replace it
  • Not a code generator "autopilot" — Requires human oversight and decision-making
  • Not safe to run unattended on production repos — Always review changes before merge

Architecture Overview

┌──────────────────────────────────────────────────────────────────────┐
│                    ORCHESTRATOR (Main Claude Agent)                  │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │              Continuous Dispatch Loop (v1.1+)                   │ │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐ │ │
│  │  │ Collect  │→ │ Classify │→ │ Conflict │→ │ Dispatch Batch   │ │ │
│  │  │  Ready   │  │  Modes   │  │  Check   │  │ (max 3-6 tasks)  │ │ │
│  │  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘ │ │
│  │       ↑                                           │             │ │
│  │       │        ┌──────────┐  ┌──────────┐         ↓             │ │
│  │       └────────│  IOSM    │←─│ Auto-    │←────────┘             │ │
│  │                │  Gates   │  │ Spawn    │                       │ │
│  │                └──────────┘  └──────────┘                       │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                                   │                                  │
│               ┌───────────────────┼───────────────────┐              │
│               ↓                   ↓                   ↓              │
│  ┌────────────────────┐ ┌────────────────────┐ ┌─────────────────┐   │
│  │   Subagent (BG)    │ │   Subagent (BG)    │ │  Subagent (FG)  │   │
│  │   Explorer         │ │   Implementer-A    │ │  Architect      │   │
│  │   read-only        │ │   write-local      │ │  needs_user     │   │
│  └────────────────────┘ └────────────────────┘ └─────────────────┘   │
│               │                   │                   │              │
│               ↓                   ↓                   ↓              │
│         reports/T01.md      reports/T02.md      reports/T03.md       │
│         + SpawnCandidates   + SpawnCandidates   + Escalations        │
└──────────────────────────────────────────────────────────────────────┘

IOSM Framework Integration

Swarm-IOSM implements the IOSM methodology (Improve → Optimize → Shrink → Modularize) as an executable system:

┌────────────────────────────────────────────────────────────────────────────┐
│                           IOSM FRAMEWORK                                   │
│                   https://github.com/rokoss21/IOSM                         │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────────┐    │
│    │ IMPROVE  │ →  │ OPTIMIZE │ →  │  SHRINK  │ →  │   MODULARIZE     │    │
│    │          │    │          │    │          │    │                  │    │
│    │ Clarity  │    │ Speed    │    │ Simplify │    │ Decompose        │    │
│    │ No dups  │    │ Resil.   │    │ Surface  │    │ Contracts        │    │
│    │ Invars   │    │ Chaos    │    │ Deps     │    │ Coupling         │    │
│    └────┬─────┘    └────┬─────┘    └────┬─────┘    └────────┬─────────┘    │
│         │               │               │                   │              │
│    ┌────▼─────┐    ┌────▼─────┐    ┌────▼─────┐    ┌────────▼─────────┐    │
│    │ Gate-I   │    │ Gate-O   │    │ Gate-S   │    │     Gate-M       │    │
│    │ ≥0.85    │    │ ≥0.75    │    │ ≥0.80    │    │     ≥0.80        │    │
│    └──────────┘    └──────────┘    └──────────┘    └──────────────────┘    │
│                                                                            │
│    IOSM-Index = (Gate-I + Gate-O + Gate-S + Gate-M) / 4                    │
│    Production threshold: ≥ 0.80                                            │
└────────────────────────────────────────────────────────────────────────────┘

Version History

v2.1 (2026-01-19) — Current

  • Automated State Management (iosm_state.md auto-generated)
  • Status Sync CLI (--update-task)
  • Improved Report Conflict Detection

v2.0 (2026-01-18)

  • Inter-Agent Communication (shared_context.md)
  • Task Dependency Visualization (--graph)
  • Anti-Pattern Detection
  • Template Customization

v1.3 (2026-01-17)

  • Simulation Mode (/swarm-iosm simulate) with ASCII Timeline
  • Live Monitoring (/swarm-iosm watch)
  • Checkpointing & Resume (/swarm-iosm resume)

v1.2 (2026-01-16)

  • Concurrency Limits (Resource Budgets)
  • Cost Tracking & Model Selection (Haiku/Sonnet/Opus)
  • Intelligent Error Diagnosis & Retry (/swarm-iosm retry)

v1.1 (2026-01-15)

  • Continuous Dispatch Loop (no wave barriers)
  • Gate-Driven Continuation
  • Auto-Spawn from SpawnCandidates
  • Touches Lock Manager

Contributing

We welcome contributions! Key areas:

  • Gate Automation Scripts — Measure IOSM criteria automatically
  • CI/CD Integration — GitHub Actions, GitLab CI examples
  • Language-Specific Checkers — Python, TypeScript, Rust evaluators
  • More Examples — Real-world track demonstrations
  • IDE Integration — VS Code extension

See CONTRIBUTING.md for guidelines.


Conclusion

Swarm-IOSM proves that AI agent orchestration can be both fast (3-8x speedup through parallelism) and safe (quality gates before merge).

The continuous dispatch model eliminates artificial wave barriers, file lock management prevents conflicts, and IOSM gates enforce production-grade standards.

Key takeaway: Don't choose between speed and quality. With proper orchestration, you get both.

Try it today:

git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm
/swarm-iosm new-track "Your next feature"

Links


Questions? Ideas? Issues?


Built with ⚡ by @rokoss21 | IOSM: Improve → Optimize → Shrink → Modularize
