AI agents are everywhere. CrewAI has 45K+ GitHub stars. AutoGPT has 182K+. LangChain sits at 100K+. But here is the question nobody seems to be asking: how secure are these frameworks?
OWASP released the Agentic AI Top 10 in 2025, identifying the most critical security risks in autonomous AI systems. We built a free scanner that checks agent code against all of them.
The results were not great.
## The Numbers
We scanned 27 of the most popular agent frameworks and SDKs:
- 9 FAIL (critical findings -- `exec()`, `os.system()`, no sandboxing)
- 9 WARN (high-severity issues -- supply chain risks, prompt injection vectors)
- 9 PASS (clean scan)
- 31 total OWASP violations across all frameworks
The full registry with every framework, verdict, risk score, and OWASP mapping is live at registry.agentsign.dev.
## What We Check
12 detection rules, each mapped to a specific OWASP Agentic AI risk:
| Rule | OWASP | Severity | What it catches |
|---|---|---|---|
| AS-001 | AA-03 | CRITICAL | Unsafe code execution (exec, eval, os.system) |
| AS-002 | AA-05 | HIGH | Hardcoded secrets and API keys |
| AS-003 | AA-04 | MEDIUM | Excessive permissions |
| AS-004 | AA-02 | HIGH | Prompt injection via file input |
| AS-005 | AA-02 | CRITICAL | Known injection patterns (SQL, XSS, command) |
| AS-006 | AA-09 | HIGH | Code execution without sandboxing |
| AS-007 | AA-06 | LOW | Supply chain without integrity checks |
| AS-008 | AA-01 | HIGH | Excessive agency / auto-approval |
| AS-009 | AA-07 | MEDIUM | Unsafe output handling (XSS via agent output) |
| AS-010 | AA-08 | MEDIUM | Insufficient logging/monitoring |
| AS-011 | AA-10 | HIGH | Data exfiltration patterns |
| AS-012 | MCP-07 | HIGH | MCP server without authentication |
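In spirit, each rule is a pattern match against known-dangerous constructs. Here is a minimal sketch of what such a detector could look like -- the regexes and data structures are illustrative only, not AgentSign's actual rules:

```python
import re

# Toy rules in the spirit of AS-001 (unsafe code execution) and
# AS-002 (hardcoded secrets). Real rules are more sophisticated.
RULES = [
    ("AS-001", "AA-03", "CRITICAL",
     re.compile(r"\b(exec|eval)\s*\(|os\.system\s*\(")),
    ("AS-002", "AA-05", "HIGH",
     re.compile(r"(api[_-]?key|secret)\s*=\s*['\"][A-Za-z0-9_\-]{16,}['\"]", re.I)),
]

def scan(code: str) -> list[dict]:
    """Return a finding for each rule whose pattern appears in `code`."""
    return [
        {"rule": rule_id, "owasp": owasp, "severity": severity}
        for rule_id, owasp, severity, pattern in RULES
        if pattern.search(code)
    ]

findings = scan('os.system(user_input)\nAPI_KEY = "sk_live_0123456789abcdef"')
print([f["rule"] for f in findings])  # both rules fire on this snippet
```

Static pattern matching like this produces false positives by design (for example, a code agent that legitimately shells out), which is why the scanner reports a graded risk score rather than a binary result.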
## Notable Findings
Some of the most-starred projects have the most critical findings:
**Open Interpreter** (57K stars): Risk score 80/100. `exec()`, `os.system()`, `child_process`, no sandbox, excessive agency. This is a code agent that runs commands on your machine by design, but the scan flags that there are no isolation mechanisms.

**AutoGPT** (182K stars): Risk score 65/100. `exec()`, `os.system()`, no sandbox. The most-starred AI agent framework fails on unsafe code execution.

**LangChain** (100K stars): WARN verdict. Supply chain risks and prompt injection vectors. Not critical, but worth monitoring.

**Anthropic SDK**, **Vercel AI SDK**, **Google ADK**: All PASS with clean scans. These frameworks were designed with security constraints from the start.
## How to Scan Your Own Agent
No signup. No API key. Three ways to use it:
### 1. GitHub Action (recommended)
Create `.github/workflows/agentsign.yml`:
```yaml
name: AgentSign Security Scan
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: razashariff/agentsign-action@v1
        with:
          path: '.'
          fail-on: 'FAIL'
```
Every push and PR gets scanned. FAIL blocks the merge. Outputs include verdict, risk score, and findings count.
### 2. cURL
```bash
curl -X POST https://registry.agentsign.dev/api/scan \
  -H "Content-Type: application/json" \
  -d '{"code": "exec(user_input)", "name": "my-agent"}'
```
Returns:
```json
{
  "verdict": "FAIL",
  "risk_score": 40,
  "findings": [
    {
      "rule": "AS-001",
      "owasp": "AA-03",
      "severity": "CRITICAL",
      "detail": "Dangerous code patterns: exec()"
    }
  ]
}
```
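The same call works from any HTTP client. Here is a small Python sketch using only the standard library -- the `build_payload` and `scan_code` helper names are my own for illustration, not part of an official SDK:

```python
import json
from urllib import request

API_URL = "https://registry.agentsign.dev/api/scan"

def build_payload(code: str, name: str) -> bytes:
    """Encode the JSON body the /api/scan endpoint expects."""
    return json.dumps({"code": code, "name": name}).encode("utf-8")

def scan_code(code: str, name: str) -> dict:
    """POST a code snippet to the scanner and return the parsed response."""
    req = request.Request(
        API_URL,
        data=build_payload(code, name),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Uncomment to run a live scan (subject to the 30 req/min rate limit):
# result = scan_code("exec(user_input)", "my-agent")
# print(result["verdict"], result["risk_score"])
```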
### 3. Shields.io Badge
Add a live security badge to your README:

PASS = green, WARN = yellow, FAIL = red. Badges are cached for 5 minutes.
## API Endpoints
All public, all free, rate-limited at 30 req/min:
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/scan` | Scan code against 12 OWASP rules (max 50KB) |
| GET | `/api/badge/:name` | Shields.io-compatible badge endpoint |
| GET | `/api/rules/version` | Current rules version and count |
| GET | `/api/registry` | Full registry as JSON |
## Why This Matters
The OWASP Agentic AI Top 10 exists because these are real attack vectors. Agents that call `exec()` without sandboxing can be hijacked through prompt injection. Agents with hardcoded secrets leak them. Agents without logging leave no audit trail.
As agents get more autonomous -- booking flights, writing code, managing infrastructure -- the blast radius of a compromised agent grows. Static analysis is not a silver bullet, but it is the minimum. If your agent framework fails basic pattern matching against known risks, that is worth knowing.
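To make the `exec()` risk concrete, here is an illustrative contrast between the pattern AS-001 flags and one common mitigation: dispatching model output through an allowlist of vetted tools. The tool names are hypothetical:

```python
# Unsafe: routing model output straight into exec() means a prompt
# injection becomes arbitrary code execution. AS-001 flags this pattern.
def run_tool_unsafe(model_output: str):
    exec(model_output)  # attacker-controlled code runs with full privileges

# Safer sketch: the model can only choose *which* vetted function runs,
# never supply arbitrary code.
ALLOWED_TOOLS = {
    "get_time": lambda: "12:00",
    "echo": lambda text="": text,
}

def run_tool_safe(tool_name: str, **kwargs) -> str:
    if tool_name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return ALLOWED_TOOLS[tool_name](**kwargs)

print(run_tool_safe("echo", text="hello"))  # prints "hello"
```

An allowlist alone does not solve excessive agency (AA-01) -- a vetted tool can still be misused -- but it removes the arbitrary-code path that the CRITICAL findings above hinge on.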
## Links
- Registry: registry.agentsign.dev
- GitHub Action: razashariff/agentsign-action@v1
- OWASP Agentic AI Top 10: genai.owasp.org
- AgentSign Platform: agentsign.dev
Feedback, issues, or want your framework rescanned? Open an issue or reach out at contact@agentsign.dev.