SpiderRating

Posted on • Originally published at spiderrating.com
State of MCP Security 2026: We Scanned 15,923 AI Tools. Here's What We Found.

We scanned every publicly available MCP server and OpenClaw skill — 15,923 in total. Here's the complete security landscape of the AI tool ecosystem.

TL;DR: 36% of MCP servers scored F (failing). 42 skills confirmed malicious (0.4%), with 552 initially flagged. Token leakage is the #1 vulnerability, found in 757 servers. Only 2% earned a B grade or higher.

The Dataset

SpiderRating analyzed 15,923 AI tools across two ecosystems:

  • 5,725 MCP servers (Model Context Protocol — the standard for connecting AI agents to external tools)
  • 10,198 OpenClaw/ClawHub skills (agent behavior definitions for Claude, Cursor, Windsurf)

Each tool was rated on three dimensions: Description Quality, Security, and Metadata — combined into a SpiderScore (0-10) and letter grade (A-F).
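Based on the weights given in the Methodology section below (Description 45%, Security 35%, Metadata 20%) and the grade bands in the table that follows, the combination can be sketched roughly like this — the function names and rounding are illustrative assumptions, not SpiderRating's actual code:

```python
def spider_score(description: float, security: float, metadata: float) -> float:
    """Combine three 0-10 dimension scores into a 0-10 SpiderScore.

    Weights are taken from the article's Methodology section; the
    function name and rounding are illustrative assumptions.
    """
    return round(0.45 * description + 0.35 * security + 0.20 * metadata, 2)


def letter_grade(score: float) -> str:
    """Map a SpiderScore to a letter grade using the article's bands."""
    if score >= 9.0:
        return "A"
    if score >= 7.0:
        return "B"
    if score >= 5.0:
        return "C"
    if score >= 3.0:
        return "D"
    return "F"
```

One consequence of these weights: a server with the ecosystem's average description score of 3.13 tops out at 6.91 even with perfect security and metadata, so it cannot reach a B at all.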

This is the largest independent security analysis of the MCP/AI tool ecosystem to date.

Key Findings

1. Most AI Tools Are Mediocre — Only 2% Score B or Higher

| Grade | MCP Servers | Skills | What It Means |
|-------|-------------|--------|---------------|
| A (9.0+) | 0 (0%) | 0 (0%) | No tool meets "exemplary" standards |
| B (7.0-8.9) | 116 (2%) | 95 (1%) | Production-ready with good practices |
| C (5.0-6.9) | 1,995 (35%) | 9,050 (89%) | Adequate but room for improvement |
| D (3.0-4.9) | 1,546 (27%) | 1,052 (10%) | Significant quality/security gaps |
| F (<3.0) | 2,068 (36%) | 1 (0%) | Failing — serious issues |

Zero tools scored A. MCP servers have a bimodal distribution: either decent (C) or terrible (F).

2. Token Leakage Is the #1 Vulnerability

We found 32,691 security findings across the ecosystem.

| Rank | Vulnerability | Servers Affected | Findings |
|------|---------------|------------------|----------|
| 1 | Token Leakage | 757 (13%) | 6,632 |
| 2 | Command Injection | 269 (5%) | 1,007 |
| 3 | SQL Injection | 105 (2%) | 787 |
| 4 | Path Traversal | 244 (4%) | 761 |
| 5 | Prototype Pollution | 145 (3%) | 489 |
| 6 | Hardcoded Credentials | 163 (3%) | 389 |
| 7 | Secret Leakage (metadata) | 114 (2%) | 376 |
| 8 | Command Injection (os) | 112 (2%) | 263 |

Token leakage alone accounts for 20% of all findings. API keys, auth tokens, and secrets are being exposed through MCP tool outputs.
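A cheap mitigation is to scrub anything credential-shaped from tool output before it reaches the model or the logs. A minimal sketch — the regexes and function name are illustrative assumptions, not spidershield's rule set:

```python
import re

# Illustrative patterns for common credential shapes; a real scanner
# would use a much larger, tested rule set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),            # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),            # GitHub personal access tokens
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"),  # Authorization headers
    re.compile(r"AKIA[0-9A-Z]{16}"),               # AWS access key IDs
]


def redact(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Running every tool result and every log line through a filter like this closes the most common leak path found in the scan.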

3. 36% of MCP Servers Score F

More than a third of MCP servers are fundamentally unsafe:

  • Average MCP score: 4.11/10
  • Average skill score: 5.91/10

Why do MCP servers score worse? A description quality crisis: the average description score is just 3.13/10, meaning most servers don't tell AI agents what their tools do.

4. 552 Skills Flagged, 42 Confirmed Malicious

We used a two-pass security analysis:

  1. Automated Threat Scanner — pattern matching for known malicious behaviors
  2. LLM Verification — Claude Haiku reviews each finding to distinguish "security tool describing attacks" from "malicious skill executing attacks"

Results:

  • 552 skills initially flagged with critical security issues
  • 42 confirmed malicious after LLM verification (0.4% of ecosystem)
  • 97% of automated findings were false positives — mostly legitimate security tools whose descriptions triggered keyword-based detection
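The first pass is exactly the kind of keyword matching that produces those false positives; the value is in the second pass. A toy sketch of pass one — the keyword list and function name are assumptions, not spidershield's implementation:

```python
# Deliberately over-broad attack vocabulary for the first pass.
MALICIOUS_KEYWORDS = [
    "exfiltrate", "keylogger", "reverse shell", "steal credentials",
]


def flag_skill(description: str) -> list[str]:
    """Pass 1: flag any skill whose description mentions attack terms.

    This intentionally over-matches; pass 2 (an LLM reviewer) decides
    whether the skill *performs* the attack or merely *describes* it.
    """
    text = description.lower()
    return [kw for kw in MALICIOUS_KEYWORDS if kw in text]
```

A legitimate pentest helper whose description says it "detects reverse shell payloads" gets flagged by pass 1 — exactly the false-positive case the LLM pass exists to filter out.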

5. The Description Quality Crisis

97% of tools lack a scenario trigger — they don't tell the AI when to use them.

| Signal | Coverage |
|--------|----------|
| Has action verb | ~60% |
| Has scenario trigger | ~3% |
| Has param documentation | ~45% |
| Has error guidance | ~8% |

AI agents frequently choose the wrong tool — not because AI is dumb, but because tool documentation is broken.
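The four signals above are approximable with cheap heuristics. A rough sketch — the verb list, trigger phrases, and regex are assumptions about what such a checker might look for, not SpiderRating's detector:

```python
import re

ACTION_VERBS = {"search", "fetch", "create", "delete", "list", "convert", "send"}
TRIGGER_PHRASES = ("use this when", "use when", "call this when")


def description_signals(desc: str) -> dict[str, bool]:
    """Check a tool description for the four quality signals."""
    text = desc.lower()
    return {
        # Substring match so "searches" still counts as "search".
        "action_verb": any(verb in text for verb in ACTION_VERBS),
        "scenario_trigger": any(phrase in text for phrase in TRIGGER_PHRASES),
        "param_docs": bool(re.search(r"\bparam|argument|accepts\b", text)),
        "error_guidance": "error" in text or "fails" in text,
    }
```

By these heuristics, a one-line description like "GitHub integration." scores zero on every signal — which, per the table above, is closer to the norm than the exception.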

What This Means for Developers

If you build MCP servers:

  1. Write scenario triggers — tell AI agents when to use each tool
  2. Don't log tokens — use structured error handling that strips secrets
  3. Use parameterized queries — SQL injection is #3
  4. Add a README and license — it's 20% of your score
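Point 1 in practice: MCP tool definitions carry a name, a description, and a JSON Schema `inputSchema`, and the description is the only place an agent learns when to call the tool. A sketch of one that includes a scenario trigger, parameter docs, and error guidance — the tool itself is invented for illustration:

```python
# Hypothetical MCP tool definition with all four description signals.
SEARCH_ISSUES_TOOL = {
    "name": "search_issues",
    "description": (
        "Search open GitHub issues by keyword. "
        "Use this when the user asks about bug reports, feature requests, "
        "or the status of a known problem. "
        "Returns at most `limit` results; returns an empty list (not an "
        "error) when nothing matches."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Keywords to search for"},
            "limit": {"type": "integer", "description": "Max results (default 10)"},
        },
        "required": ["query"],
    },
}
```

The "Use this when…" sentence is the scenario trigger that 97% of scanned tools lack.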

If you install AI tools:

  1. Check the SpiderScore before installing — anything rated below C (5.0) has known issues
  2. Be cautious with skills rated critical — 0.4% are confirmed malicious
  3. Prefer tools with B grade — they've demonstrated security best practices

Methodology

  • Scanner: spidershield (open source, MIT)
  • Data: 15,923 tools, 78,849 tool descriptions, 32,691 security findings
  • Precision: 93.6% calibrated accuracy
  • Scoring: Description (45%) + Security (35%) + Metadata (20%)

Data updated daily. Full methodology available at spiderrating.com.


What's the worst MCP security issue you've encountered? Share in the comments.
