Hiroki Honda
We scanned 4,162 MCP servers. 73% are invisible to AI agents.

There are 4,162 MCP servers registered on Smithery right now. The Python and TypeScript SDKs see 97 million monthly downloads. Every major AI provider has adopted MCP.

But nobody had measured the quality of these tools.

We built ToolRank, an open-source scoring engine that analyzes MCP tool definitions across four dimensions. Then we pointed it at the entire Smithery registry.

The biggest finding wasn't about quality. It was about visibility.

73% of MCP servers are invisible

Out of 4,162 registered servers, only 1,122 expose tool definitions that agents can read.

The remaining 3,040 — 73% — have no tool definitions at all. They're registered, but when an AI agent searches for tools, these servers don't exist. They have no name, no description, no schema. They are invisible.

This is the equivalent of having a website with no indexable content. Google can't rank what it can't read. Agents can't select what they can't see.
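The visibility test can be sketched as a simple predicate. The record shape below is a simplification for illustration, not the actual Smithery API response format:

```python
# A minimal sketch of the visibility test, assuming a simplified
# server record shape (the real Smithery API fields differ).
def is_visible(server: dict) -> bool:
    """A server counts as visible only if it exposes at least one
    tool with a readable name and description."""
    tools = server.get("tools") or []
    return any(t.get("name") and t.get("description") for t in tools)

servers = [
    {"name": "weather-mcp",
     "tools": [{"name": "get_forecast", "description": "Fetch a forecast."}]},
    {"name": "ghost-mcp", "tools": []},  # registered, exposes nothing
    {"name": "bare-mcp"},                # no tools key at all
]
visible = [s["name"] for s in servers if is_visible(s)]
# visible == ["weather-mcp"]
```

By this definition, a server with an empty or missing tool list is indistinguishable from one that doesn't exist, which is exactly the problem for the 3,040 invisible servers.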

Among the visible: average score 84.7/100

For the 1,122 servers that do expose tool definitions, we scored each one across four dimensions:

| Dimension | Weight | What it measures |
| --- | --- | --- |
| Findability | 25% | Can agents discover you? Registry presence, naming |
| Clarity | 35% | Can agents understand you? Description quality, purpose, context |
| Precision | 25% | Is your schema precise? Types, enums, required fields |
| Efficiency | 15% | Are you token-efficient? Definition size, tool count |
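The weighting combines into a single score as a weighted sum. This is only a sketch of the arithmetic; the actual ToolRank engine (github.com/imhiroki/toolrank) implements its own checks per dimension:

```python
# Sketch of how the four dimension weights combine into one score.
WEIGHTS = {"findability": 0.25, "clarity": 0.35,
           "precision": 0.25, "efficiency": 0.15}

def toolrank_score(dims: dict) -> float:
    """Each dimension is scored 0-100; the total is the weighted sum."""
    return round(sum(dims[k] * w for k, w in WEIGHTS.items()), 1)

score = toolrank_score(
    {"findability": 90, "clarity": 80, "precision": 85, "efficiency": 75})
# 22.5 + 28.0 + 21.25 + 11.25 = 83.0
```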

Average ToolRank Score: 84.7/100

| Level | Score | Count | % |
| --- | --- | --- | --- |
| Dominant | 85-100 | 677 | 60% |
| Preferred | 70-84 | 406 | 36% |
| Selectable | 50-69 | 39 | 3.5% |
| Visible | 25-49 | 0 | 0% |
| Absent | 0-24 | 0 | 0% |
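The score bands above map to levels as a simple lookup (assuming band floors are inclusive, as the ranges suggest):

```python
# Level bands from the table above; floors assumed inclusive.
def level(score: float) -> str:
    for name, floor in [("Dominant", 85), ("Preferred", 70),
                        ("Selectable", 50), ("Visible", 25)]:
        if score >= floor:
            return name
    return "Absent"

level(84.7)  # "Preferred" -- the ecosystem average sits just below Dominant
```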

The average is higher than expected. But there's a survivorship bias: servers that bother to expose tool definitions tend to be better maintained overall. The real quality problem is the 73% that expose nothing.

Top scoring servers

| Rank | Server | Score |
| --- | --- | --- |
| 1 | microsoft/learn_mcp | 96.5 |
| 2 | docfork/docfork | 96.5 |
| 3 | brave (Brave Search) | 94.7 |
| 4 | LinkupPlatform/linkup-mcp-server | 93.5 |
| 5 | smithery-ai/national-weather-service | 93.3 |

What do top servers do differently? They start descriptions with clear action verbs. They include usage context ("Use this when..."). They define required fields, enums, and defaults. They keep tool count under 15.
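Put together, a definition following those practices looks roughly like this. The `get_forecast` tool is hypothetical, not taken from any server above:

```python
# Hypothetical tool definition illustrating the practices above:
# verb-first description, "Use this when..." context, described and
# typed parameters with enums/defaults, and explicit required fields.
good_tool = {
    "name": "get_forecast",
    "description": (
        "Fetch a weather forecast for a location and return daily "
        "highs, lows, and conditions. Use this when the user asks "
        "about upcoming weather."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "location": {"type": "string",
                         "description": "City name or 'lat,lon' pair."},
            "units": {"type": "string",
                      "enum": ["metric", "imperial"],
                      "default": "metric",
                      "description": "Measurement system for the forecast."},
            "days": {"type": "integer", "minimum": 1, "maximum": 14,
                     "default": 3,
                     "description": "How many days ahead to forecast."},
        },
        "required": ["location"],
    },
}
```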

The most common defects

Among the 1,122 scored servers, the most frequent issues:

  1. Missing usage context — Description says what the tool does, but not when to use it. Agents need "Use this when..." to decide between competing tools.

  2. No return value described — Agents can't predict what they'll get back. This leads to incorrect downstream handling.

  3. Missing parameter descriptions — Schema has types but no explanations. An agent sees "query": {"type": "string"} but doesn't know what format the query should be in.

  4. No required fields defined — Agents guess which parameters are mandatory, leading to failed executions.

  5. Description too short — Under 50 characters. Not enough information for reliable selection.
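The five defects above lend themselves to rule-based detection. These are illustrative heuristics, not the actual ToolRank Level A rules:

```python
# Simplified heuristics for the five defects above -- not the real
# ToolRank checks, just a sketch of rule-based detection.
def find_defects(tool: dict) -> list:
    desc = tool.get("description", "")
    schema = tool.get("inputSchema", {})
    props = schema.get("properties", {})
    defects = []
    if "use this when" not in desc.lower():
        defects.append("missing usage context")
    if "return" not in desc.lower():
        defects.append("no return value described")
    if any("description" not in p for p in props.values()):
        defects.append("missing parameter descriptions")
    if not schema.get("required"):
        defects.append("no required fields defined")
    if len(desc) < 50:
        defects.append("description too short")
    return defects

bare_tool = {"name": "search", "description": "Search.",
             "inputSchema": {"properties": {"query": {"type": "string"}}}}
# find_defects(bare_tool) flags all five issues
```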

What does this mean?

Academic research backs up why this matters. A study of 10,831 MCP servers (arXiv 2602.18914) found that tools with quality-compliant descriptions achieve 72% selection probability vs 20% for non-compliant ones. That's a 3.6x advantage.

The fixes are trivial. Adding "Use this when..." to a description. Defining required fields. Starting the description with a verb. These are 5-minute changes with measurable impact.
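As a hypothetical before/after, one sentence is often the difference between three defects and none:

```python
# Hypothetical before/after for one of the 5-minute fixes above.
before = "Search the web."
after = ("Search the web and return a ranked list of result titles, "
         "URLs, and snippets. Use this when the user needs current "
         "information beyond your training data.")
# `after` adds the return value, usage context, and enough length
# for reliable selection -- at the cost of one extra sentence.
```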

We're calling this ATO

ATO (Agent Tool Optimization) is the practice of optimizing tools so AI agents autonomously discover, select, and execute them.

SEO optimizes for search engines. LLMO optimizes for LLM citations. ATO optimizes for agent tool selection.

The key difference: SEO and LLMO result in mentions and links. ATO results in your API being called and transactions occurring. LLMO is Stage 1 of ATO — necessary but not sufficient.

Full ATO framework →

Try it yourself

Score your tools: toolrank.dev/score — paste your tool definition JSON. Includes "Try example" buttons so you can see how scoring works instantly.

Ecosystem ranking: toolrank.dev/ranking — live rankings updated weekly.

Scoring engine: github.com/imhiroki/toolrank — fully open source. The scoring logic is transparent.

Methodology

  • Data source: Smithery Registry API (registry.smithery.ai)
  • Scan date: March 28, 2026
  • Servers in registry: 4,162
  • Servers with tool definitions: 1,122 (27%)
  • Scoring level: Level A (rule-based, 14 checks, zero LLM cost)
  • Request interval: 2 seconds (respectful to Smithery infrastructure)
  • Full scan time: ~2 hours
  • Code: Open source

Data updates weekly via automated scanning. Daily differential scans catch new servers.
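The paced scan loop reduces to a few lines. The `fetch` callback is a stand-in for the real registry client in the open-source repo:

```python
import time

# Sketch of the paced scan loop: one request every `interval` seconds
# (the 2-second spacing noted above). `fetch` is caller-supplied.
def scan(server_ids, fetch, interval=2.0):
    results = {}
    for sid in server_ids:
        results[sid] = fetch(sid)
        time.sleep(interval)  # keep a fixed gap between requests
    return results
```

At 2 seconds per request, 4,162 servers take roughly 8,300 seconds, consistent with the ~2-hour full scan time above.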


Built by @imhiroki. Questions, feedback, or want to improve your score? Open an issue on GitHub.
