Most AI coding tools commit when their own checks pass. Swarm Orchestrator v8 adds a second adversarial layer: independent falsifier adapters that try to break each patch before it merges. v8.0.1 is on main with that subsystem on by default.
This post walks through the v8 architecture, the four verification points, the producer/falsifier adapter split, and the limitations that haven't been solved in v8.0 yet.
The shape of a run
You hand it a goal in plain English. The contract compiler turns that into contract.jsonl plus a manifest.json carrying the goal, repo context, extractor provenance, and a SHA-256 of the canonical contract bytes. Identical inputs produce identical contract hashes.
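The "identical inputs, identical hash" property depends on hashing a canonical byte form rather than whatever order the serializer happens to emit. A minimal sketch follows. The `Obligation` shape, the sorted-key canonicalization, and the function names are assumptions made for illustration; the post does not document the compiler's actual canonical form.

```typescript
import { createHash } from "node:crypto";

// Hypothetical obligation shape; the real contract schema may differ.
type Obligation = { id: string; type: string; target: string };

// Serialize with sorted keys so property order never changes the bytes.
// (An array replacer makes JSON.stringify emit keys in the array's order.)
function canonicalLine(o: Obligation): string {
  return JSON.stringify(o, Object.keys(o).sort());
}

// Assumed canonical form: one JSON object per line, newline-terminated.
function canonicalContractBytes(obligations: Obligation[]): Buffer {
  return Buffer.from(obligations.map(canonicalLine).join("\n") + "\n", "utf8");
}

// SHA-256 over the canonical bytes, hex-encoded.
function contractHash(obligations: Obligation[]): string {
  return createHash("sha256")
    .update(canonicalContractBytes(obligations))
    .digest("hex");
}
```

Under these assumptions, two obligations that differ only in key order hash to the same value, while any change to a field value changes the hash.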
                   goal (text)
                        |
                        v
contract compiler -> contract.jsonl + manifest.json
                        |
                        v
+-------------------------------------------------+
| population manager (single session)             |
|                                                 |
|   ledger (jsonl, hash-chain) <- personas (8)    |
|     ^                               |           |
|     |         tournament + verifier scoring     |
|     |                                           |
|   WASM deterministic floor (zero-LLM obligs)    |
+-------------------------------------------------+
         |                              |
         v                              v
streaming verifier           post-merge integration
         |                              |
         +--------------+---------------+
                        v
       falsifier adapters (Codex, Copilot)
                        |
                        v
                 committed diffs
The population manager opens one cached Anthropic session and walks each obligation, picking the persona whose trigger predicate matches the obligation type. In tournament mode, N candidates run in parallel; the verifier scores them, the top scorer becomes the commit candidate, and the losers are logged but never merged.
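The dispatch and tournament steps can be sketched roughly as below. Every type and function name here (`Persona`, `pickPersona`, `rank`, `runTournament`) is hypothetical and chosen for illustration, and the sketch assumes the first matching persona wins; the orchestrator's real interfaces are not shown in this post.

```typescript
// Hypothetical shapes for illustration; not the orchestrator's real types.
type Obligation = { id: string; type: string };
type Persona = { name: string; trigger: (o: Obligation) => boolean };
type Candidate = { id: number; diff: string };
type Scored = Candidate & { score: number };

// Pick the first persona whose trigger predicate matches the obligation.
function pickPersona(personas: Persona[], o: Obligation): Persona | undefined {
  return personas.find((p) => p.trigger(o));
}

// Score every candidate, then split into the top scorer and the rest.
function rank(candidates: Candidate[], score: (c: Candidate) => number) {
  if (candidates.length === 0) throw new Error("no candidates to rank");
  const scored: Scored[] = candidates
    .map((c) => ({ ...c, score: score(c) }))
    .sort((a, b) => b.score - a.score);
  return { winner: scored[0], losers: scored.slice(1) };
}

// Tournament mode: generate N candidates concurrently, then rank them.
async function runTournament(
  n: number,
  generate: (i: number) => Promise<Candidate>,
  score: (c: Candidate) => number,
) {
  const candidates = await Promise.all(
    Array.from({ length: n }, (_, i) => generate(i)),
  );
  return rank(candidates, score);
}
```

In this sketch only `winner` would go on to the later gates; `losers` would be written to the ledger and dropped.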
Two adapter subsystems
The most common confusion in v6 was treating the coding CLIs and the falsifiers as one thing. v8 splits them cleanly.
Producer adapters (src/adapters/) wrap third-party coding CLIs as the worker in the v6 verified-branch pipeline. Backends: Copilot, Claude Code, Codex, Claude Code Teams. All four are opt-in via swarm run --v6.
Falsifier adapters (src/falsification/adapters/) take a patch the producer's verifier already accepted and try to falsify the obligation by surfacing a counter-example, regression fixture, or property-violation trace. A confirmed counter-example flips the obligation back to failed.
| Falsifier | Default | Obligation types |
|---|---|---|
| CodexFalsifier | on | property-must-hold |
| CopilotFalsifier | on | import-graph-must-satisfy, function-must-have-signature |
| ClaudeCodeFalsifier | off (per-adapter opt-in) | all three |
The CLI surface is one flag: --falsifiers <on|off> (default on). Per-adapter selection happens at the API layer via defaultAdapterRegistry({ includeCopilot, includeClaudeCode }).
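To make the defaults in the table concrete, here is a rough model of how a registry with those two options might behave. It is not the real `defaultAdapterRegistry`: `sketchRegistry`, `Falsifier`, `Verdict`, and `obligationStatus` are hypothetical names, the adapters are stubs that never find a counter-example, and treating Codex as always registered is an assumption drawn from the table.

```typescript
type ObligationType =
  | "property-must-hold"
  | "import-graph-must-satisfy"
  | "function-must-have-signature";

type Verdict = { falsified: boolean; counterExample?: string };

// Hypothetical adapter shape; the real interface is not shown in the post.
interface Falsifier {
  name: string;
  handles: ObligationType[];
  attempt(patch: string): Verdict;
}

const ALL_TYPES: ObligationType[] = [
  "property-must-hold",
  "import-graph-must-satisfy",
  "function-must-have-signature",
];

// Stub adapter: a real one would call out to its CLI and hunt for a counter-example.
function stub(name: string, handles: ObligationType[]): Falsifier {
  return { name, handles, attempt: () => ({ falsified: false }) };
}

// Assumed defaults, mirroring the table: Codex on, Copilot on, ClaudeCode off.
function sketchRegistry(
  opts: { includeCopilot?: boolean; includeClaudeCode?: boolean } = {},
): Falsifier[] {
  const { includeCopilot = true, includeClaudeCode = false } = opts;
  const registry = [stub("CodexFalsifier", ["property-must-hold"])];
  if (includeCopilot) {
    registry.push(
      stub("CopilotFalsifier", ["import-graph-must-satisfy", "function-must-have-signature"]),
    );
  }
  if (includeClaudeCode) registry.push(stub("ClaudeCodeFalsifier", ALL_TYPES));
  return registry;
}

// Any confirmed counter-example flips the obligation back to failed.
function obligationStatus(
  registry: Falsifier[],
  type: ObligationType,
  patch: string,
): "passed" | "failed" {
  const applicable = registry.filter((f) => f.handles.indexOf(type) !== -1);
  return applicable.some((f) => f.attempt(patch).falsified) ? "failed" : "passed";
}
```

The point of the sketch is the routing: each obligation type only reaches the falsifiers registered for it, and one confirmed counter-example is enough to fail the obligation.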
Four verification points
A patch has to survive these before it merges:
- Pre-generation memoization. Skip generation if the obligation result is already cached.
- Mid-stream abort. During generation, the streaming verifier can abort the call. Works in --mode single only; tournament mode skips it.
- Post-generation per-obligation verifier. Scores the candidate diff. In tournament mode, top scorer wins; in single mode it's pass/fail.
- Post-merge integration check. After the diff lands, the integration check confirms the broader system still holds.
The architectural rule from the README: nothing commits without passing the obligation's verifier. Then the falsifiers get a shot.
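Read as control flow, the four points form a sequence of gates. The sketch below models each check as a boolean hook; `Hooks`, `Outcome`, and `verifyObligation` are hypothetical names used only to show the ordering and the single-mode restriction on the streaming abort.

```typescript
type Mode = "single" | "tournament";
type Outcome = "cached" | "aborted" | "rejected" | "integration-failed" | "accepted";

// Each hook stands in for one verification point (hypothetical interface).
interface Hooks {
  cached(): boolean;        // 1. pre-generation memoization hit
  streamOk(): boolean;      // 2. streaming verifier saw no violation
  verifierPass(): boolean;  // 3. per-obligation verifier accepted the diff
  integrationOk(): boolean; // 4. post-merge integration check holds
}

function verifyObligation(mode: Mode, h: Hooks): Outcome {
  if (h.cached()) return "cached"; // skip generation entirely
  // The mid-stream abort only applies in single mode; tournament mode skips it.
  if (mode === "single" && !h.streamOk()) return "aborted";
  if (!h.verifierPass()) return "rejected";
  if (!h.integrationOk()) return "integration-failed";
  return "accepted"; // the falsifiers get their shot after this point
}
```

Note that a stream-level violation which would abort a single-mode run passes through gate 2 in tournament mode and has to be caught by the later gates.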
The hash-chained ledger
Every action lands in .swarm/ledger/<run-id>.jsonl with the SHA of the prior entry. Tampering is detectable; runs resume from any prior state. If a process is killed mid-run, swarm v8 resume <run-id> walks the ledger and picks up where it left off.
The ledger format is shared with v6, but v8 writes more granular events (per-persona dispatch, per-candidate score, falsifier verdict) so a run can be replayed or audited end-to-end.
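A hash chain of this kind can be sketched in a few lines. The entry layout (`prev`, `event`, `hash`), the all-zero genesis value, and the choice of SHA-256 are assumptions for illustration; the post only says each entry carries the SHA of the prior one.

```typescript
import { createHash } from "node:crypto";

// Assumed entry layout; the real ledger schema may differ.
type Entry = { prev: string; event: string; hash: string };

// Assumed sentinel for the first entry's "previous hash".
const GENESIS = "0".repeat(64);

function entryHash(prev: string, event: string): string {
  return createHash("sha256").update(prev).update("\n").update(event).digest("hex");
}

// Append returns a new ledger whose last entry links to the previous hash.
function append(ledger: Entry[], event: string): Entry[] {
  const prev = ledger.length > 0 ? ledger[ledger.length - 1].hash : GENESIS;
  return [...ledger, { prev, event, hash: entryHash(prev, event) }];
}

// Returns the index of the first entry that breaks the chain, or -1 if intact.
function firstBrokenIndex(ledger: Entry[]): number {
  let prev = GENESIS;
  for (let i = 0; i < ledger.length; i++) {
    const e = ledger[i];
    if (e.prev !== prev || e.hash !== entryHash(e.prev, e.event)) return i;
    prev = e.hash;
  }
  return -1;
}
```

Because each hash covers the previous hash, editing any earlier event invalidates that entry, which is what makes tampering detectable. A resume could, under the same assumptions, verify the chain and continue from the last intact entry.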
Quick start
git clone https://github.com/moonrunnerkc/swarm-orchestrator.git
cd swarm-orchestrator && npm install && npm run build && npm link
# Compile a goal, then run it
swarm v8 compile "add a /health endpoint that returns 200 OK" --yes
swarm v8 run .swarm/contracts/<contract-id>
# Or both in one step (defaults to v8)
swarm run --goal "add a /health endpoint that returns 200 OK"
# Resume a killed run
swarm v8 resume <run-id>
Requires Node >= 20, git >= 2.40, and ANTHROPIC_API_KEY. Pass --extractor stub --session stub to run offline.
There's also a GitHub Action:
- uses: moonrunnerkc/swarm-orchestrator@v8
  with:
    goal: "add a /health endpoint"
    contract-only: false
    cost-cap: "5.00"
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
What v8.0 doesn't do
Limitations worth reading before adopting:
- --mode tournament plus --forbid-import skips the streaming abort; streaming verification is --mode single only.
- --cost-cap is enforced post-obligation, not mid-call. Cumulative spend is checked at the end of each obligation against estimated Sonnet 4 pricing. Exit code 6 if exceeded.
Full list with rationale lives in docs/v8-architecture-deviations.md.
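The practical consequence of post-obligation enforcement is that a run can overshoot the cap by up to one obligation's cost. A small simulation makes that visible. The function name, the per-obligation cost list, the strict `>` comparison, and exit code 0 for success are assumptions; only exit code 6 comes from the limitation described above.

```typescript
const EXIT_OK = 0;       // assumed success code
const EXIT_COST_CAP = 6; // exit code the post gives for an exceeded cap

// Simulate a run where each obligation has a known (estimated) cost.
function runWithCostCap(obligationCosts: number[], cap: number) {
  let spent = 0;
  let completed = 0;
  for (const cost of obligationCosts) {
    spent += cost; // the call runs to completion; nothing checks mid-call
    completed += 1;
    // Cumulative spend is only compared to the cap after the obligation ends.
    if (spent > cap) return { completed, spent, exitCode: EXIT_COST_CAP };
  }
  return { completed, spent, exitCode: EXIT_OK };
}
```

With a cap of 5 and obligations costing 2 each, this model stops after the third obligation with 6 already spent, so the cap is better read as a stop condition than a hard ceiling.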
Repo
moonrunnerkc/swarm-orchestrator
Contract-first AI coding swarm with hash-chained evidence. Compiles a goal into typed obligations, races persona candidates per obligation in a single cached inference session, verifies before commit, and logs every action in an append-only ledger.
Swarm Orchestrator
Contract-first AI coding swarm with hash-chained evidence and verifier-gated commits.
swarm compiles a natural-language goal into a typed contract, dispatches it to a
population of personas inside one cached Anthropic session, races candidate diffs per
obligation, and commits only the diffs that pass verification. After the producer's
verifier accepts a patch, registered falsifier adapters get a chance to break it
before it merges. Every action lands in an append-only hash-chained ledger you can
audit, resume, or replay.
It wraps an LLM; it does not replace one. The model writes the code, the orchestrator decides what reaches your repo.
Status
Version 8.0.1 on main. Node >= 20 (CI matrix: 20, 22). License ISC. The v8
architecture is the default for swarm run; the v6 verified-branch pipeline is
preserved under swarm run --v6 and the swarm swarm / swarm execute commands
Falsifier subsystem: Codex on, Copilot on, ClaudeCode…