How Swarm Orchestrator v8 Tries to Break Its Own AI Patches


Most AI coding tools commit when their own checks pass. Swarm Orchestrator v8 adds a second adversarial layer: independent falsifier adapters that try to break each patch before it merges. v8.0.1 is on main with that subsystem on by default.

This post walks through the v8 architecture, the four verification points, the producer/falsifier adapter split, and the limitations that haven't been solved in v8.0 yet.

What is Swarm Orchestrator? A contract-first AI coding swarm with hash-chained evidence and verifier-gated commits. It compiles a natural-language goal into a typed contract, dispatches it to a population of personas inside one cached Anthropic session, races candidate diffs per obligation, and commits only what passes verification. It wraps an LLM; it doesn't replace one.

The shape of a run

You hand it a goal in plain English. The contract compiler turns that into contract.jsonl plus a manifest.json carrying the goal, repo context, extractor provenance, and a SHA-256 of the canonical contract bytes. Identical inputs produce identical contract hashes.
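
To make the determinism claim concrete, here is a minimal sketch of hashing a contract. It assumes that "canonical" means recursively sorted object keys with one obligation per line; the post does not describe the compiler's real canonical form, and `canonicalize` and `contractHash` are hypothetical names, not the project's API.

```typescript
import { createHash } from "node:crypto";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumed canonical form: recursively sort object keys so that key order
// in the input never changes the serialized bytes.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const keys = Object.keys(value).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k] as Json)}`).join(",")}}`;
}

// SHA-256 over the canonical JSONL bytes (one obligation per line).
function contractHash(obligations: Json[]): string {
  const bytes = obligations.map(canonicalize).join("\n") + "\n";
  return createHash("sha256").update(bytes, "utf8").digest("hex");
}

// Same content, different key order: the hashes should match.
const hashA = contractHash([{ type: "property-must-hold", target: "health" }]);
const hashB = contractHash([{ target: "health", type: "property-must-hold" }]);
console.log(hashA === hashB); // true
```

Whatever the real canonicalization is, the property that matters is the same: the hash depends only on the content of the contract, not on incidental serialization details.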

goal (text)
   |
   v
contract compiler  ->  contract.jsonl + manifest.json
   |
   v
+-------------------------------------------------+
|        population manager (single session)      |
|                                                 |
|  ledger (jsonl, hash-chain) <- personas (8)     |
|       ^                          |              |
|       | tournament + verifier scoring           |
|       |                                         |
|  WASM deterministic floor (zero-LLM obligs)     |
+-------------------------------------------------+
   |                              |
   v                              v
streaming verifier      post-merge integration
   |                              |
   +--------------+---------------+
                  v
       falsifier adapters (Codex, Copilot)
                  |
                  v
            committed diffs

The population manager opens one cached Anthropic session and walks each obligation. It picks the persona whose trigger predicate matches the obligation type. In tournament mode, N candidates run in parallel; the verifier scores them, the top scorer is a commit candidate, and losers get logged but never merge.
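
As a rough illustration of that dispatch-then-race flow, the sketch below assumes a numeric verifier score where higher is better. The persona names, the `Candidate` shape, and the tie-breaking behavior are hypothetical; the post does not specify them.

```typescript
interface Obligation { id: string; type: string }
interface Persona { name: string; trigger: (o: Obligation) => boolean }
interface Candidate { persona: string; diff: string; score: number }

// Pick the first persona whose trigger predicate accepts the obligation.
function pickPersona(personas: Persona[], o: Obligation): Persona | undefined {
  return personas.find((p) => p.trigger(o));
}

// Tournament: the highest verifier score becomes the commit candidate;
// every other candidate is kept only for logging.
function runTournament(candidates: Candidate[]): { winner: Candidate | undefined; losers: Candidate[] } {
  const ranked = [...candidates].sort((x, y) => y.score - x.score);
  const [winner, ...losers] = ranked;
  return { winner, losers };
}

// Hypothetical personas, for illustration only.
const personas: Persona[] = [
  { name: "signature-fixer", trigger: (o) => o.type === "function-must-have-signature" },
  { name: "property-prover", trigger: (o) => o.type === "property-must-hold" },
];

const chosen = pickPersona(personas, { id: "ob-1", type: "property-must-hold" });
const result = runTournament([
  { persona: "property-prover", diff: "diff A", score: 0.4 },
  { persona: "property-prover", diff: "diff B", score: 0.9 },
  { persona: "property-prover", diff: "diff C", score: 0.7 },
]);
console.log(chosen?.name, result.winner?.diff, result.losers.length);
```

Here the second candidate wins and the other two are returned as losers, which matches the "logged but never merge" rule.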

Two adapter subsystems

The most common confusion in v6 was treating the coding CLIs and the falsifiers as one thing. v8 splits them cleanly.

Producer adapters (src/adapters/) wrap third-party coding CLIs as the worker in the v6 verified-branch pipeline. Backends: Copilot, Claude Code, Codex, Claude Code Teams. All four are opt-in via swarm run --v6.

Falsifier adapters (src/falsification/adapters/) take a patch the producer's verifier already accepted and try to falsify the obligation by surfacing a counter-example, regression fixture, or property-violation trace. A confirmed counter-example flips the obligation back to failed.

Falsifier             | Default                  | Obligation types
`CodexFalsifier`      | on                       | `property-must-hold`
`CopilotFalsifier`    | on                       | `import-graph-must-satisfy`, `function-must-have-signature`
`ClaudeCodeFalsifier` | off (per-adapter opt-in) | all three

The CLI surface is one flag: --falsifiers <on|off> (default on). Per-adapter selection happens at the API layer via defaultAdapterRegistry({ includeCopilot, includeClaudeCode }).
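
The routing implied by the falsifier table can be sketched as a lookup from obligation type to the falsifiers that would run. This is not the real `defaultAdapterRegistry`; `FALSIFIERS`, `falsifiersFor`, and the `optIn` parameter are hypothetical stand-ins used only to show the default-on versus opt-in behavior.

```typescript
type ObligationType =
  | "property-must-hold"
  | "import-graph-must-satisfy"
  | "function-must-have-signature";

interface FalsifierSpec {
  name: string;
  enabledByDefault: boolean;
  handles: ObligationType[];
}

// Mirrors the falsifier table in this post.
const FALSIFIERS: FalsifierSpec[] = [
  { name: "CodexFalsifier", enabledByDefault: true, handles: ["property-must-hold"] },
  {
    name: "CopilotFalsifier",
    enabledByDefault: true,
    handles: ["import-graph-must-satisfy", "function-must-have-signature"],
  },
  {
    name: "ClaudeCodeFalsifier",
    enabledByDefault: false,
    handles: ["property-must-hold", "import-graph-must-satisfy", "function-must-have-signature"],
  },
];

// Hypothetical helper: which falsifiers would run for an obligation type,
// given a list of explicitly opted-in adapter names.
function falsifiersFor(type: ObligationType, optIn: string[] = []): string[] {
  return FALSIFIERS
    .filter((f) => f.enabledByDefault || optIn.includes(f.name))
    .filter((f) => f.handles.includes(type))
    .map((f) => f.name);
}

console.log(falsifiersFor("property-must-hold"));
console.log(falsifiersFor("property-must-hold", ["ClaudeCodeFalsifier"]));
```

With the defaults, each obligation type maps to exactly one falsifier; opting in to `ClaudeCodeFalsifier` is what first creates overlap.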

Four verification points

A patch has to survive these before it merges:

1. Pre-generation memoization. Skip generation if the obligation result is already cached.
2. Mid-stream abort. During generation, the streaming verifier can abort the call. Works in --mode single only; tournament mode skips it.
3. Post-generation per-obligation verifier. Scores the candidate diff. In tournament mode, top scorer wins; in single mode it's pass/fail.
4. Post-merge integration check. After the diff lands, the integration check confirms the broader system still holds.

The architectural rule from the README: nothing commits without passing the obligation's verifier. Then the falsifiers get a shot.
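
Read as control flow, the gates look roughly like the sketch below. Every name here is hypothetical, the gates are reduced to synchronous booleans, and placing the falsifier step after the integration check follows the diagram earlier in this post rather than anything confirmed about the implementation.

```typescript
// Hypothetical gate interface; none of these names come from the project.
interface Gates {
  cachedResult: (obligationId: string) => boolean;        // 1. pre-generation memoization
  generate: (obligationId: string) => string | undefined; // 2. undefined = aborted mid-stream
  verify: (diff: string) => boolean;                      // 3. per-obligation verifier
  integrationHolds: (diff: string) => boolean;            // 4. post-merge integration check
  counterExampleFound: (diff: string) => boolean;         // falsifier adapters
}

type Outcome =
  | "cache-hit"
  | "aborted"
  | "rejected"
  | "integration-failed"
  | "falsified"
  | "committed";

// Walk one obligation through each gate in order; the first failure decides.
function runObligation(obligationId: string, gates: Gates): Outcome {
  if (gates.cachedResult(obligationId)) return "cache-hit";
  const diff = gates.generate(obligationId);
  if (diff === undefined) return "aborted";
  if (!gates.verify(diff)) return "rejected";
  if (!gates.integrationHolds(diff)) return "integration-failed";
  if (gates.counterExampleFound(diff)) return "falsified";
  return "committed";
}

// Stub gates where everything passes and no counter-example is found.
const passing: Gates = {
  cachedResult: () => false,
  generate: () => "diff --git a/health.ts b/health.ts",
  verify: () => true,
  integrationHolds: () => true,
  counterExampleFound: () => false,
};

console.log(runObligation("ob-1", passing));
console.log(runObligation("ob-1", { ...passing, counterExampleFound: () => true }));
```

The second call shows the point of the adversarial layer: a diff that clears every producer-side gate can still be flipped back to failed by a confirmed counter-example.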

The hash-chained ledger

Every action lands in .swarm/ledger/<run-id>.jsonl with the SHA of the prior entry. Tampering is detectable; runs resume from any prior state. If a process is killed mid-run, swarm v8 resume <run-id> walks the ledger and picks up where it left off.

The ledger format is shared with v6, but v8 writes more granular events (per-persona dispatch, per-candidate score, falsifier verdict) so a run can be replayed or audited end-to-end.
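
For readers who haven't worked with hash chains, here is a minimal sketch of why tampering is detectable. The post only says each entry carries "the SHA of the prior entry"; the use of SHA-256, the entry fields, and the function names below are assumptions made for illustration, not the actual ledger schema.

```typescript
import { createHash } from "node:crypto";

// Assumed entry shape; the real ledger schema is not shown in this post.
interface LedgerEntry { seq: number; event: string; prevHash: string; hash: string }

const GENESIS = "0".repeat(64);

function entryHash(seq: number, event: string, prevHash: string): string {
  return createHash("sha256").update(JSON.stringify([seq, event, prevHash])).digest("hex");
}

// Append-only: each new entry commits to the hash of the entry before it.
function append(ledger: LedgerEntry[], event: string): LedgerEntry[] {
  const last = ledger[ledger.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = ledger.length;
  return [...ledger, { seq, event, prevHash, hash: entryHash(seq, event, prevHash) }];
}

// Returns the index of the first entry that breaks the chain, or -1 if intact.
function firstBrokenIndex(ledger: LedgerEntry[]): number {
  let prev = GENESIS;
  return ledger.findIndex((e) => {
    const ok = e.prevHash === prev && e.hash === entryHash(e.seq, e.event, e.prevHash);
    prev = e.hash;
    return !ok;
  });
}

let ledger: LedgerEntry[] = [];
for (const event of ["persona-dispatch", "candidate-score", "falsifier-verdict"]) {
  ledger = append(ledger, event);
}

// Edit one entry's payload without recomputing its hash.
const tampered = ledger.map((e, i) => (i === 1 ? { ...e, event: "candidate-score-edited" } : e));
console.log(firstBrokenIndex(ledger), firstBrokenIndex(tampered)); // -1 1
```

The same walk that validates the chain also suggests how a resume could work: replay entries in order and continue from the last one that verifies.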

Quick start

git clone https://github.com/moonrunnerkc/swarm-orchestrator.git
cd swarm-orchestrator && npm install && npm run build && npm link

# Compile a goal, then run it
swarm v8 compile "add a /health endpoint that returns 200 OK" --yes
swarm v8 run .swarm/contracts/<contract-id>

# Or both in one step (defaults to v8)
swarm run --goal "add a /health endpoint that returns 200 OK"

# Resume a killed run
swarm v8 resume <run-id>

Requires Node >= 20, git >= 2.40, and ANTHROPIC_API_KEY. Pass --extractor stub --session stub to run offline.

There's also a GitHub Action:

- uses: moonrunnerkc/swarm-orchestrator@v8
  with:
    goal: "add a /health endpoint"
    contract-only: false
    cost-cap: "5.00"
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

What v8.0 doesn't do

Limitations worth reading before adopting:

Tournament mode doesn't stream. Running --mode tournament together with --forbid-import skips the streaming abort; streaming verification is available in --mode single only.
Post-merge failure doesn't auto-rollback. The run is marked failed; per-obligation worktree snapshots are planned for after v8.0.
--cost-cap is enforced post-obligation, not mid-call. Cumulative spend is checked at the end of each obligation against estimated Sonnet 4 pricing. The run exits with code 6 if the cap is exceeded.
Bandit dispatch is not built (Phase 5). Codex and Copilot have disjoint obligation types, so there's nothing to arbitrate between yet.
Cross-vendor producer race is deferred (Phase 6).

Full list with rationale lives in docs/v8-architecture-deviations.md.
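
The practical consequence of the post-obligation cost check is that a run can overshoot the cap by up to one obligation's worth of spend. A minimal sketch, assuming a strict "greater than" comparison and per-obligation cost estimates supplied up front; the function and field names are hypothetical, and only the exit code 6 comes from this post (0 for success is an assumption).

```typescript
interface ObligationCost { id: string; estimatedUsd: number }

const EXIT_OK = 0;                // assumed success code
const EXIT_COST_CAP_EXCEEDED = 6; // exit code stated in this post

function runWithCostCap(
  costs: ObligationCost[],
  capUsd: number,
): { completed: string[]; spentUsd: number; exitCode: number } {
  let spentUsd = 0;
  const completed: string[] = [];
  for (const c of costs) {
    // The obligation always runs to completion; the cap is not checked mid-call.
    spentUsd += c.estimatedUsd;
    completed.push(c.id);
    // Cumulative spend is only compared against the cap once the obligation ends.
    if (spentUsd > capUsd) return { completed, spentUsd, exitCode: EXIT_COST_CAP_EXCEEDED };
  }
  return { completed, spentUsd, exitCode: EXIT_OK };
}

const capped = runWithCostCap(
  [
    { id: "ob-1", estimatedUsd: 2 },
    { id: "ob-2", estimatedUsd: 2.5 },
    { id: "ob-3", estimatedUsd: 1.5 },
    { id: "ob-4", estimatedUsd: 1 },
  ],
  5,
);
console.log(capped);
```

In this example the third obligation finishes with 6.00 spent against a 5.00 cap, and the fourth never starts.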

Repo

moonrunnerkc / swarm-orchestrator

Contract-first AI coding swarm with hash-chained evidence. Compiles a goal into typed obligations, races persona candidates per obligation in a single cached inference session, verifies before commit, and logs every action in an append-only ledger.

swarm compiles a natural-language goal into a typed contract, dispatches it to a population of personas inside one cached Anthropic session, races candidate diffs per obligation, and commits only the diffs that pass verification. After the producer's verifier accepts a patch, registered falsifier adapters get a chance to break it before it merges. Every action lands in an append-only hash-chained ledger you can audit, resume, or replay.

It wraps an LLM; it does not replace one. The model writes the code, the orchestrator decides what reaches your repo.

Status

Version 8.0.1 on main. Node >= 20 (CI matrix: 20, 22). License ISC. The v8 architecture is the default for swarm run; the v6 verified-branch pipeline is preserved under swarm run --v6 and the swarm swarm / swarm execute commands. Falsifier subsystem: Codex on, Copilot on, ClaudeCode…
