I ran the same 39 AI agent security samples through three scanners: AgentGuard, Semgrep, and CodeQL.
The Results
| Scanner | Detection Rate | False Positives |
|---|---|---|
| AgentGuard v0.6.4 | 100% (39/39) | 0 |
| Semgrep | 0% (0/39) | 0 |
| CodeQL | 0% (0/39) | 0 |
Zero. Semgrep and CodeQL detected nothing. They have zero rules for AI agent security.
AgentGuard has 17 detection rules covering all 10 OWASP ASI categories plus 4 novel attack vectors: Memory Poisoning, Tool Output Trust, Action Chain Amplification, and Multi-Agent Collusion.
Real World
AgentGuard found 332 critical vulnerabilities across Microsoft AutoGen and LlamaIndex. Issues reported directly: autogen#7917, autogen#7918, llama_index#22245.
Reproduce
git clone https://github.com/dockfixlabs/agentguard-benchmark
cd agentguard-benchmark
pip install dfx-agentguard
python benchmark.py
GitHub: https://github.com/dockfixlabs/agentguard
PyPI: pip install dfx-agentguard
Top comments (0)