<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: SpiderRating</title>
    <description>The latest articles on DEV Community by SpiderRating (@spiderrating).</description>
    <link>https://dev.to/spiderrating</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3839266%2F2503f847-aeb2-4bb8-bd60-09aff1a58382.jpeg</url>
      <title>DEV Community: SpiderRating</title>
      <link>https://dev.to/spiderrating</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/spiderrating"/>
    <language>en</language>
    <item>
      <title>98% of MCP Tools Don't Tell AI Agents When to Use Them</title>
      <dc:creator>SpiderRating</dc:creator>
      <pubDate>Mon, 23 Mar 2026 18:05:42 +0000</pubDate>
      <link>https://dev.to/spiderrating/98-of-mcp-tools-dont-tell-ai-agents-when-to-use-them-28a8</link>
      <guid>https://dev.to/spiderrating/98-of-mcp-tools-dont-tell-ai-agents-when-to-use-them-28a8</guid>
      <description>&lt;p&gt;We analyzed &lt;strong&gt;78,849 tool descriptions&lt;/strong&gt; across 15,923 MCP servers and AI skills. The results explain a lot about why AI agents feel "dumb."&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: Only 2% of tools tell the AI agent &lt;em&gt;when&lt;/em&gt; to use them. Only 3% document their parameters. This is why AI agents pick the wrong tool — and it's fixable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What AI Agents Need&lt;/th&gt;
&lt;th&gt;What They Get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;"What does this tool do?" (action verb)&lt;/td&gt;
&lt;td&gt;68% have one&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"When should I use this tool?" (scenario trigger)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2% have one&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"What format should parameters be?" (param docs)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3% have them&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"Can you show me an example?" (param examples)&lt;/td&gt;
&lt;td&gt;7% have them&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"What happens if it fails?" (error guidance)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2% have it&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;98% of tools don't tell the AI agent when to use them.&lt;/strong&gt; The agent has to &lt;em&gt;guess&lt;/em&gt; from the tool name and a vague description.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Security Problem
&lt;/h2&gt;

&lt;p&gt;As a Reddit user pointed out in response to our &lt;a href="https://spiderrating.com/blog/state-of-mcp-security-2026" rel="noopener noreferrer"&gt;State of MCP Security report&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The missing usage guidance number is the one that doesn't get enough attention. When a tool doesn't tell the agent when to use it, the agent has to infer from context. That inference step is exactly where a poisoned tool description or injected instruction can redirect behavior."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Missing scenario triggers aren't just a quality problem — they're an &lt;strong&gt;attack surface&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Description Score Gap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP servers&lt;/strong&gt;: average description score &lt;strong&gt;3.13/10&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills&lt;/strong&gt;: average description score &lt;strong&gt;5.67/10&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Skills score higher because the SKILL.md format encourages structured descriptions. MCP servers have no such convention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Better Descriptions = Better Scores
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Description Score&lt;/th&gt;
&lt;th&gt;Average Overall Score&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Low (0-3)&lt;/td&gt;
&lt;td&gt;4.55&lt;/td&gt;
&lt;td&gt;3,751&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mid (3-5)&lt;/td&gt;
&lt;td&gt;5.39&lt;/td&gt;
&lt;td&gt;2,665&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Good (5-7)&lt;/td&gt;
&lt;td&gt;5.32&lt;/td&gt;
&lt;td&gt;7,976&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Great (7-10)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.47&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,531&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Tools with great descriptions score &lt;strong&gt;42% higher overall&lt;/strong&gt; than tools with low-scoring descriptions (6.47 vs. 4.55).&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Good Tool Description Looks Like
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bad&lt;/strong&gt; (98% look like this):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search"&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;items"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Good&lt;/strong&gt; (what AI agents need):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_products"&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;product&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;catalog&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;keyword,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;category,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;or&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;range.&lt;/span&gt;
  &lt;span class="s"&gt;Use&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;this&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;when&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;asks&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;find,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;browse,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;or&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;look&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;up&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;products.&lt;/span&gt;
  &lt;span class="s"&gt;Returns&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;up&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;20&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sorted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;relevance.&lt;/span&gt;
  &lt;span class="s"&gt;Parameters:&lt;/span&gt;
    &lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(string,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;required):&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;keywords&lt;/span&gt;
    &lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(string,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;optional):&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Filter&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;
  &lt;span class="s"&gt;Errors:&lt;/span&gt;
    &lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Returns&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;empty&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;array&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;if&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;
    &lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Returns&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;429&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;if&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;limited&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;—&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;wait&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;60&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;seconds"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
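
&lt;p&gt;As a rough sketch, the same description can be attached to a tool definition in code. The snippet below builds a plain Python dictionary with the &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;inputSchema&lt;/code&gt; fields used in MCP tool listings. The &lt;code&gt;build_search_products_tool&lt;/code&gt; helper and the schema details are illustrative assumptions, not taken from any particular server.&lt;/p&gt;

```python
import json


def build_search_products_tool() -> dict:
    """Return a tool definition whose description covers what, when, and errors.

    Hypothetical helper for illustration only.
    """
    description = "\n".join([
        "Search the product catalog by keyword, category, or price range.",
        "Use this when the user asks to find, browse, or look up products.",
        "Returns up to 20 results sorted by relevance.",
        "Errors: returns an empty array if no matches; "
        "returns 429 if rate limited, in which case wait 60 seconds.",
    ])
    return {
        "name": "search_products",
        "description": description,
        # Parameter docs live next to the schema so the agent sees both.
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search keywords"},
                "category": {"type": "string", "description": "Filter by category name"},
            },
            "required": ["query"],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_products_tool(), indent=2))
```

&lt;p&gt;Keeping the scenario trigger and error guidance in the description string, and the per-parameter notes in the schema, is one possible split; other layouts may work equally well.&lt;/p&gt;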



&lt;h2&gt;
  
  
  The Paradigm Shift
&lt;/h2&gt;

&lt;p&gt;Most developers write tool descriptions for humans. But AI agents don't have common sense:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Human-to-Human: "Search for items" → human infers the rest
Human-to-Agent: "Search products by keyword. Use when user wants to find
                 or discover products. Not for order lookup — use
                 get_order instead."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're still learning how to write for non-human intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Can Do Today
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add scenario triggers&lt;/strong&gt; — "Use this when..."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document parameters&lt;/strong&gt; beyond the JSON schema&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add error guidance&lt;/strong&gt; — what should the agent do when things fail?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run &lt;code&gt;spidershield scan&lt;/code&gt;&lt;/strong&gt; on your server — it scores your descriptions&lt;/li&gt;
&lt;/ol&gt;
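
&lt;p&gt;For a quick self-check before running a full scan, the first three items can be approximated with a few regular expressions. This is a hypothetical heuristic with made-up patterns (&lt;code&gt;SIGNALS&lt;/code&gt;, &lt;code&gt;check_description&lt;/code&gt;); it makes no claim about how spidershield itself scores descriptions.&lt;/p&gt;

```python
import re

# Assumed patterns for illustration; a real scanner would likely use richer rules.
SIGNALS = {
    "scenario_trigger": re.compile(r"\buse (this )?(when|if|for)\b", re.IGNORECASE),
    "param_docs": re.compile(
        r"\bparameters?\b|\((string|integer|number|boolean)\b", re.IGNORECASE
    ),
    "error_guidance": re.compile(r"\berrors?\b|\bfails?\b|\brate limit", re.IGNORECASE),
}


def check_description(description: str) -> dict:
    """Report which of the three signals appear in a tool description."""
    return {name: bool(pattern.search(description)) for name, pattern in SIGNALS.items()}


if __name__ == "__main__":
    print(check_description("Search for items"))
    print(check_description(
        "Search the product catalog by keyword. "
        "Use this when the user asks to find products. "
        "Parameters: query (string, required). "
        "Errors: returns 429 if rate limited."
    ))
```

&lt;p&gt;The first call reports all three signals as missing; the second reports all three as present.&lt;/p&gt;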

&lt;p&gt;Scanner is open source (MIT): &lt;a href="https://github.com/teehooai/spidershield" rel="noopener noreferrer"&gt;github.com/teehooai/spidershield&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Full data: &lt;a href="https://spiderrating.com" rel="noopener noreferrer"&gt;spiderrating.com&lt;/a&gt; | Previous: &lt;a href="https://spiderrating.com/blog/state-of-mcp-security-2026" rel="noopener noreferrer"&gt;State of MCP Security 2026&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 of our MCP ecosystem research. Part 1: &lt;a href="https://spiderrating.com/blog/state-of-mcp-security-2026" rel="noopener noreferrer"&gt;State of MCP Security 2026&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>State of MCP Security 2026: We Scanned 15,923 AI Tools. Here's What We Found.</title>
      <dc:creator>SpiderRating</dc:creator>
      <pubDate>Mon, 23 Mar 2026 04:55:54 +0000</pubDate>
      <link>https://dev.to/spiderrating/state-of-mcp-security-2026-we-scanned-15923-ai-tools-heres-what-we-found-2k2g</link>
      <guid>https://dev.to/spiderrating/state-of-mcp-security-2026-we-scanned-15923-ai-tools-heres-what-we-found-2k2g</guid>
      <description>&lt;p&gt;We scanned every publicly available MCP server and OpenClaw skill — &lt;strong&gt;15,923 in total&lt;/strong&gt;. Here's the complete security landscape of the AI tool ecosystem.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: 36% of MCP servers scored F (failing). 42 skills confirmed malicious (0.4%), with 552 initially flagged. Token leakage is the #1 vulnerability, found in 757 servers. Only 2% earned a B grade or higher.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Dataset
&lt;/h2&gt;

&lt;p&gt;SpiderRating analyzed &lt;strong&gt;15,923 AI tools&lt;/strong&gt; across two ecosystems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;5,725 MCP servers&lt;/strong&gt; (Model Context Protocol — the standard for connecting AI agents to external tools)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10,198 OpenClaw/ClawHub skills&lt;/strong&gt; (agent behavior definitions for Claude, Cursor, Windsurf)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each tool was rated on three dimensions: Description Quality, Security, and Metadata — combined into a &lt;strong&gt;SpiderScore&lt;/strong&gt; (0-10) and letter grade (A-F).&lt;/p&gt;

&lt;p&gt;This is the largest independent security analysis of the MCP/AI tool ecosystem to date.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Findings
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Most AI Tools Are Mediocre — Only 2% Score B or Higher
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Grade&lt;/th&gt;
&lt;th&gt;MCP Servers&lt;/th&gt;
&lt;th&gt;Skills&lt;/th&gt;
&lt;th&gt;What It Means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;A&lt;/strong&gt; (9.0+)&lt;/td&gt;
&lt;td&gt;0 (0%)&lt;/td&gt;
&lt;td&gt;0 (0%)&lt;/td&gt;
&lt;td&gt;No tool meets "exemplary" standards&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;B&lt;/strong&gt; (7.0-8.9)&lt;/td&gt;
&lt;td&gt;116 (2%)&lt;/td&gt;
&lt;td&gt;95 (1%)&lt;/td&gt;
&lt;td&gt;Production-ready with good practices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;C&lt;/strong&gt; (5.0-6.9)&lt;/td&gt;
&lt;td&gt;1,995 (35%)&lt;/td&gt;
&lt;td&gt;9,050 (89%)&lt;/td&gt;
&lt;td&gt;Adequate but room for improvement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;D&lt;/strong&gt; (3.0-4.9)&lt;/td&gt;
&lt;td&gt;1,546 (27%)&lt;/td&gt;
&lt;td&gt;1,052 (10%)&lt;/td&gt;
&lt;td&gt;Significant quality/security gaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;F&lt;/strong&gt; (&amp;lt;3.0)&lt;/td&gt;
&lt;td&gt;2,068 (36%)&lt;/td&gt;
&lt;td&gt;1 (0%)&lt;/td&gt;
&lt;td&gt;Failing — serious issues&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Zero tools scored A.&lt;/strong&gt; MCP servers have a bimodal distribution: either decent (C) or terrible (F).&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Token Leakage Is the #1 Vulnerability
&lt;/h3&gt;

&lt;p&gt;We found &lt;strong&gt;32,691 security findings&lt;/strong&gt; across the ecosystem.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Vulnerability&lt;/th&gt;
&lt;th&gt;Servers Affected&lt;/th&gt;
&lt;th&gt;Findings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Token Leakage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;757 (13%)&lt;/td&gt;
&lt;td&gt;6,632&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Command Injection&lt;/td&gt;
&lt;td&gt;269 (5%)&lt;/td&gt;
&lt;td&gt;1,007&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;SQL Injection&lt;/td&gt;
&lt;td&gt;105 (2%)&lt;/td&gt;
&lt;td&gt;787&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Path Traversal&lt;/td&gt;
&lt;td&gt;244 (4%)&lt;/td&gt;
&lt;td&gt;761&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Prototype Pollution&lt;/td&gt;
&lt;td&gt;145 (3%)&lt;/td&gt;
&lt;td&gt;489&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Hardcoded Credentials&lt;/td&gt;
&lt;td&gt;163 (3%)&lt;/td&gt;
&lt;td&gt;389&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Secret Leakage (metadata)&lt;/td&gt;
&lt;td&gt;114 (2%)&lt;/td&gt;
&lt;td&gt;376&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Command Injection (os)&lt;/td&gt;
&lt;td&gt;112 (2%)&lt;/td&gt;
&lt;td&gt;263&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Token leakage alone accounts for 20% of all findings.&lt;/strong&gt; API keys, auth tokens, and secrets are being exposed through MCP tool outputs.&lt;/p&gt;
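
&lt;p&gt;One way to limit this kind of exposure is to pass tool output through a redaction step before returning it. The sketch below is a minimal example under stated assumptions: the two patterns and the &lt;code&gt;redact&lt;/code&gt; helper are hypothetical and will not catch every secret format.&lt;/p&gt;

```python
import re

# Illustrative patterns only (assumptions): bearer tokens and key=value style secrets.
BEARER = re.compile(r"\bbearer\s+[a-z0-9._\-]+", re.IGNORECASE)
KEY_VALUE = re.compile(
    r"\b(api[_-]?key|token|secret|password)\b(\s*[=:]\s*)\S+", re.IGNORECASE
)


def redact(text: str) -> str:
    """Mask anything that looks like a credential before it reaches tool output."""
    text = BEARER.sub("Bearer [REDACTED]", text)
    # Keep the key name and separator, replace only the value.
    text = KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("request failed: api_key=sk-12345 status=401"))
    print(redact("Authorization: Bearer abc.def-123"))
```

&lt;p&gt;Pattern-based redaction is a last line of defense; not putting secrets into error messages in the first place is the more reliable fix.&lt;/p&gt;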

&lt;h3&gt;
  
  
  3. 36% of MCP Servers Score F
&lt;/h3&gt;

&lt;p&gt;More than a third of MCP servers are fundamentally unsafe:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Average MCP score: 4.11/10&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Average skill score: 5.91/10&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MCP servers score worse because of a &lt;strong&gt;description quality crisis&lt;/strong&gt;: their descriptions average 3.13/10. Most servers don't tell AI agents what their tools do.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. 552 Skills Flagged, 42 Confirmed Malicious
&lt;/h3&gt;

&lt;p&gt;We used a two-pass security analysis:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Automated Threat Scanner&lt;/strong&gt; — pattern matching for known malicious behaviors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Verification&lt;/strong&gt; — Claude Haiku reviews each finding to distinguish "security tool describing attacks" from "malicious skill executing attacks"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;552 skills&lt;/strong&gt; initially flagged with critical security issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;42 confirmed malicious&lt;/strong&gt; after LLM verification (0.4% of skills)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;97% of automated findings were false positives&lt;/strong&gt; — mostly legitimate security tools whose descriptions triggered keyword-based detection&lt;/li&gt;
&lt;/ul&gt;
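
&lt;p&gt;The shape of such a two-pass pipeline can be sketched as follows. Everything here is a simplified assumption: the keyword list is invented, and the reviewer is a stub callable standing in for the LLM step, so this is not the scanner's actual implementation.&lt;/p&gt;

```python
from typing import Callable, Iterable

# Hypothetical keyword list for the first pass.
SUSPICIOUS_KEYWORDS = ("exfiltrate", "reverse shell", "steal credentials")


def first_pass(skills: dict[str, str]) -> list[str]:
    """Flag every skill whose text contains a suspicious keyword."""
    return [
        name for name, text in skills.items()
        if any(keyword in text.lower() for keyword in SUSPICIOUS_KEYWORDS)
    ]


def second_pass(flagged: Iterable[str], skills: dict[str, str],
                is_malicious: Callable[[str], bool]) -> list[str]:
    """Keep only the flagged skills that the reviewer judges to be malicious."""
    return [name for name in flagged if is_malicious(skills[name])]


def stub_reviewer(text: str) -> bool:
    # Stand-in for the LLM review; treats text that describes detection as benign.
    return "detects" not in text.lower()


if __name__ == "__main__":
    demo = {
        "sec-audit": "Detects reverse shell payloads in uploaded scripts.",
        "date-helper": "Formats dates.",
        "bad-skill": "Opens a reverse shell and will exfiltrate env vars.",
    }
    flagged = first_pass(demo)
    print(flagged)
    print(second_pass(flagged, demo, stub_reviewer))
```

&lt;p&gt;In this toy run the keyword pass flags both the security tool and the malicious skill, and the second pass keeps only the latter, which mirrors the false-positive pattern described above.&lt;/p&gt;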

&lt;h3&gt;
  
  
  5. The Description Quality Crisis
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;97% of tools lack a scenario trigger&lt;/strong&gt; — they don't tell the AI &lt;em&gt;when&lt;/em&gt; to use them.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Coverage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Has action verb&lt;/td&gt;
&lt;td&gt;~60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Has scenario trigger&lt;/td&gt;
&lt;td&gt;~3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Has param documentation&lt;/td&gt;
&lt;td&gt;~45%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Has error guidance&lt;/td&gt;
&lt;td&gt;~8%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;AI agents frequently choose the wrong tool — not because AI is dumb, but because tool documentation is broken.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Developers
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you build MCP servers:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write scenario triggers: tell AI agents &lt;em&gt;when&lt;/em&gt; to use each tool&lt;/li&gt;
&lt;li&gt;Don't log tokens: use structured error handling that strips secrets&lt;/li&gt;
&lt;li&gt;Use parameterized queries: SQL injection is the #3 vulnerability by finding count&lt;/li&gt;
&lt;li&gt;Add a README and license: metadata is 20% of your score&lt;/li&gt;
&lt;/ol&gt;
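
&lt;p&gt;For the parameterized-query item, here is a minimal sketch using Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;products&lt;/code&gt; table and the &lt;code&gt;find_products&lt;/code&gt; and &lt;code&gt;make_demo_db&lt;/code&gt; helpers are hypothetical; the point is only that user input is bound with &lt;code&gt;?&lt;/code&gt; rather than formatted into the SQL string.&lt;/p&gt;

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up products table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)", [("red mug",), ("blue mug",)]
    )
    return conn


def find_products(conn: sqlite3.Connection, keyword: str) -> list[tuple]:
    """Look up products by name using a bound parameter, not string formatting."""
    cursor = conn.execute(
        "SELECT id, name FROM products WHERE name LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    print(find_products(db, "mug"))
    # The injection attempt is treated as literal text, so it matches nothing.
    print(find_products(db, "x' OR '1'='1"))
```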

&lt;p&gt;&lt;strong&gt;If you install AI tools:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check the SpiderScore before installing: tools graded below C (under 5.0) have known issues&lt;/li&gt;
&lt;li&gt;Be cautious with skills flagged as critical: 0.4% of skills are confirmed malicious&lt;/li&gt;
&lt;li&gt;Prefer tools with a B grade or higher: they've demonstrated security best practices&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scanner&lt;/strong&gt;: &lt;a href="https://github.com/teehooai/spidershield" rel="noopener noreferrer"&gt;spidershield&lt;/a&gt; (open source, MIT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data&lt;/strong&gt;: 15,923 tools, 78,849 tool descriptions, 32,691 security findings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Precision&lt;/strong&gt;: 93.6% calibrated accuracy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring&lt;/strong&gt;: Description (45%) + Security (35%) + Metadata (20%)&lt;/li&gt;
&lt;/ul&gt;
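
&lt;p&gt;To make the weighting concrete, the sketch below combines three dimension scores with the weights listed above and maps the result to a letter using the cutoffs from the grade table. It assumes a plain weighted sum of 0-10 scores; the actual combination rule may differ, and the function names are illustrative.&lt;/p&gt;

```python
WEIGHTS = {"description": 0.45, "security": 0.35, "metadata": 0.20}


def spider_score(description: float, security: float, metadata: float) -> float:
    """Weighted sum of three 0-10 dimension scores (assumed combination rule)."""
    total = (WEIGHTS["description"] * description
             + WEIGHTS["security"] * security
             + WEIGHTS["metadata"] * metadata)
    return round(total, 2)


def letter_grade(score: float) -> str:
    """Map a 0-10 score to a letter using the cutoffs from the grade table."""
    for cutoff, grade in ((9.0, "A"), (7.0, "B"), (5.0, "C"), (3.0, "D")):
        if score >= cutoff:
            return grade
    return "F"


if __name__ == "__main__":
    score = spider_score(description=3.0, security=6.0, metadata=5.0)
    print(score, letter_grade(score))
```

&lt;p&gt;Because description quality carries the largest weight in this scheme, a weak description pulls the total down even when the security and metadata scores are reasonable.&lt;/p&gt;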

&lt;p&gt;Data updated daily. Full methodology available at &lt;a href="https://spiderrating.com" rel="noopener noreferrer"&gt;spiderrating.com&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's the worst MCP security issue you've encountered? Share in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>python</category>
    </item>
  </channel>
</rss>
