DEV Community

suraj kumar
suraj kumar

Posted on

swarm-test is now a GitHub Action — multi-agent reliability testing on every PR

swarm-test v0.3.0 turns multi-agent reliability testing into a CI/CD gate.

The Setup

Add this to .github/workflows/reliability.yml:

name: Agent Reliability
on: [pull_request]
jobs:
swarm-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: surajkumar811/swarm-test@v0.3.0
with:
script: my_crew.py
fail-on-severity: high

That's it. Every PR now gets tested for cascade failures, blast radius, context leakage, intent drift, collusion, timeout resilience, contract violations, and single points of failure.

What You See on the PR

Findings show up as inline annotations:

  • Critical findings → errors (block the merge)
  • High findings → warnings
  • Medium findings → notices

Plus a job summary with your Swarm Score and the top findings with remediation steps.

Why This Matters

Most teams test individual agents and call it done. But the failures that take down production live in the interactions between agents — and those only surface when you test the whole graph.

Running this manually means you test when you remember. Running it in CI means you test every single change, automatically, before it merges.

Works Across Frameworks

CrewAI, LangGraph, AutoGen — same action, same config. The graph topology is what gets tested, not the framework.

pip install swarm-test --upgrade
GitHub: github.com/surajkumar811/swarm-test

Top comments (0)