swarm-test v0.3.0 turns multi-agent reliability testing into a CI/CD gate.
The Setup
Add this to .github/workflows/reliability.yml:
name: Agent Reliability
on: [pull_request]
jobs:
swarm-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: surajkumar811/swarm-test@v0.3.0
with:
script: my_crew.py
fail-on-severity: high
That's it. Every PR now gets tested for cascade failures, blast radius, context leakage, intent drift, collusion, timeout resilience, contract violations, and single points of failure.
What You See on the PR
Findings show up as inline annotations:
- Critical findings → errors (block the merge)
- High findings → warnings
- Medium findings → notices
Plus a job summary with your Swarm Score and the top findings with remediation steps.
Why This Matters
Most teams test individual agents and call it done. But the failures that take down production live in the interactions between agents — and those only surface when you test the whole graph.
Running this manually means you test when you remember. Running it in CI means you test every single change, automatically, before it merges.
Works Across Frameworks
CrewAI, LangGraph, AutoGen — same action, same config. The graph topology is what gets tested, not the framework.
pip install swarm-test --upgrade
GitHub: github.com/surajkumar811/swarm-test


Top comments (0)