swarm-test v0.3.4 — Auto-Classifying Agent Roles (and Why Role Changes What a Failure Means)

#ai #python #testing #opensource

swarm-test v0.3.4 adds automatic agent role classification. The tool now figures out what each agent does in your graph — and uses that to interpret risk differently per role.

Here's the core insight: not all agent failures mean the same thing.

An orchestrator with a 90% blast radius is expected. That's its job: it routes work to everything, so of course its failure impacts everything. It needs a fallback, but the high blast radius itself isn't a design flaw.

A worker with a 90% blast radius is a design smell. A worker shouldn't have that much downstream impact. If it does, something is wired wrong.

Same metric. Opposite meaning. Until now, swarm-test flagged both identically. Now it knows the difference.
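
To make the metric concrete, here is a minimal sketch of one way blast radius could be computed. It is an illustration, not swarm-test's actual code. It assumes blast radius means "the fraction of the other agents reachable downstream of the failing agent", and the graph and the `blast_radius` helper are hypothetical.

```python
from collections import deque

# Hypothetical toy graph: each agent maps to the agents it sends work to.
GRAPH = {
    "orchestrator": ["fetcher", "resizer", "checker"],
    "fetcher": ["resizer"],
    "resizer": ["checker"],
    "checker": ["reporter"],
    "reporter": [],
}


def blast_radius(graph, agent):
    """Fraction of the *other* agents reachable downstream of `agent`.

    Assumed definition for this sketch; the real tool may weight edges
    or count impact differently.
    """
    seen = {agent}
    queue = deque([agent])
    while queue:  # breadth-first walk over downstream edges
        current = queue.popleft()
        for downstream in graph.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    others = len(graph) - 1
    return (len(seen) - 1) / others if others else 0.0


for name in GRAPH:
    print(f"{name}: {blast_radius(GRAPH, name):.0%}")
```

In this toy graph the orchestrator scores 100%, which is unsurprising. The fetcher scores 75% because everything else is chained behind it, and that is the kind of worker wiring worth a second look.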

How classification works:

swarm-test analyzes each agent's position in the graph — in-degree, out-degree, betweenness centrality, connection patterns — combined with name and role hints. It assigns one of:

ORCHESTRATOR — routes work, central, high blast radius by design
WORKER — does task work, should be replaceable
VALIDATOR — checks/approves outputs, security-sensitive
GATEWAY — entry/exit point, on the critical path
AGGREGATOR — collects from many agents (high in-degree)
MONITOR — observes the system, off the critical path
ROUTER — intermediate hop

Each classification comes with a confidence score and a risk profile.
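
A rough sketch of the idea, assuming a plain adjacency-list graph: name hints are checked first, then simple degree rules. The thresholds, the hint list, the toy graph, and the `classify` function are all made up for illustration and are not swarm-test's real rules. Betweenness centrality and confidence scoring are left out to keep it short, so this sketch cannot tell a ROUTER from a WORKER.

```python
from enum import Enum


class Role(Enum):
    ORCHESTRATOR = "orchestrator"
    WORKER = "worker"
    VALIDATOR = "validator"
    GATEWAY = "gateway"
    AGGREGATOR = "aggregator"
    MONITOR = "monitor"
    ROUTER = "router"  # not assigned below; needs more signal than degree


# Hypothetical name hints (substring -> role).
NAME_HINTS = {
    "orchestr": Role.ORCHESTRATOR,
    "valid": Role.VALIDATOR,
    "monitor": Role.MONITOR,
    "gateway": Role.GATEWAY,
}

# Hypothetical toy graph: each agent maps to the agents it sends work to.
GRAPH = {
    "intake": ["planner"],
    "planner": ["crop", "background", "policy_validator"],
    "crop": ["policy_validator"],
    "background": ["policy_validator"],
    "policy_validator": ["export"],
    "export": [],
    "health_monitor": [],
}


def degrees(graph):
    """Return {agent: (in_degree, out_degree)} for an adjacency-list graph."""
    in_deg = {agent: 0 for agent in graph}
    for targets in graph.values():
        for target in targets:
            in_deg[target] = in_deg.get(target, 0) + 1
    return {a: (in_deg[a], len(graph.get(a, []))) for a in in_deg}


def classify(graph):
    """Assign a Role per agent: name hints first, then degree heuristics."""
    roles = {}
    for agent, (in_deg, out_deg) in degrees(graph).items():
        hinted = next(
            (r for hint, r in NAME_HINTS.items() if hint in agent.lower()), None
        )
        if hinted is not None:
            roles[agent] = hinted
        elif out_deg >= 3 and out_deg > in_deg:
            roles[agent] = Role.ORCHESTRATOR  # fans work out to many agents
        elif in_deg >= 3 and in_deg > out_deg:
            roles[agent] = Role.AGGREGATOR  # collects from many agents
        elif in_deg == 0 or out_deg == 0:
            roles[agent] = Role.GATEWAY  # entry or exit point
        else:
            roles[agent] = Role.WORKER
    return roles


for agent, role in classify(GRAPH).items():
    print(f"{agent} -> {role.name}")
```

Note that `planner` has no name hint and is still picked up as an orchestrator purely from its fan-out, which is the point of combining structure with names.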

I ran it on my own 14-agent system (ARE, a passport-photo processing pipeline). It correctly identified:

ComplianceAgent → VALIDATOR (97% confidence, flagged security-sensitive)
HealthMonitorAgent → MONITOR (87%)
OrchestratorAgent → ORCHESTRATOR (flagged "expected high blast")
The processing agents → WORKER / GATEWAY

The role-adjusted severity is where it gets useful. A validator with context leakage gets its severity upgraded — a validator leaking data is a security problem, not just a reliability one. An orchestrator with high blast radius gets a note that it's expected-by-design, so you focus on adding a fallback rather than panicking about the number.
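
Here is a minimal sketch of what those two adjustments could look like. The `Finding` type, the severity ladder, and the finding kinds are hypothetical names chosen for this example, not swarm-test's API.

```python
from dataclasses import dataclass, replace

# Assumed severity ladder, lowest to highest.
SEVERITIES = ["low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Finding:
    agent: str
    role: str        # e.g. "VALIDATOR", "ORCHESTRATOR", "WORKER"
    kind: str        # e.g. "context_leakage", "high_blast_radius"
    severity: str    # one of SEVERITIES
    note: str = ""


def adjust_for_role(finding):
    """Return a copy of `finding` with role-aware severity or notes."""
    if finding.role == "VALIDATOR" and finding.kind == "context_leakage":
        # Bump one step up the ladder, capped at the top level.
        idx = min(SEVERITIES.index(finding.severity) + 1, len(SEVERITIES) - 1)
        return replace(
            finding,
            severity=SEVERITIES[idx],
            note="validator leaking data is a security problem",
        )
    if finding.role == "ORCHESTRATOR" and finding.kind == "high_blast_radius":
        # Severity unchanged; the note redirects attention to the fallback.
        return replace(finding, note="expected by design; add a fallback")
    return finding  # every other role/kind combination passes through as-is


leak = adjust_for_role(
    Finding("ComplianceAgent", "VALIDATOR", "context_leakage", "medium")
)
print(leak.severity, "-", leak.note)

blast = adjust_for_role(
    Finding("OrchestratorAgent", "ORCHESTRATOR", "high_blast_radius", "high")
)
print(blast.severity, "-", blast.note)
```

A worker with a high blast radius passes through unchanged here, so it keeps its original severity and still shows up as something to fix.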

This also sets up something bigger: once the tool understands roles, you can declare expected roles in config and catch when an agent drifts from its intended role over time. More on that soon.

swarm-test works across CrewAI, LangGraph, AutoGen, and custom orchestrators.

pip install swarm-test --upgrade
GitHub: github.com/surajkumar811/swarm-test