We Scanned 100 MCP Servers. Anthropic's Own Reference Implementations Scored F.

AgentsID — Sun, 29 Mar 2026 20:46:48 +0000

We scanned 100 MCP server packages — including the official reference implementations from Anthropic, Microsoft, and Notion — and published the results.

Every vendor-maintained server that exposed tools scored F.

The Numbers

100 MCP server packages scanned
41 exposed tool definitions (the other 59% were opaque to security review)
485 tools analyzed
893 total findings
71% scored F. Zero scored A.

The Gold Standard Failure

We didn't just scan random community packages. We targeted the servers that developers copy as templates:

Server	Maintainer	Tools	Grade
server-github	Anthropic	26	F
server-filesystem	Anthropic	14	F
@playwright/mcp	Microsoft	22	F
notion-mcp-server	Notion	22	F
server-puppeteer	Anthropic	7	F
server-memory	Anthropic	9	F
server-everything	Anthropic	13	F

Anthropic's GitHub MCP server exposes 26 tools — push_files, merge_pull_request, fork_repository — with zero input validation, zero per-tool permissions, and zero scope boundaries. An agent with a GitHub PAT can push to any repo, merge any PR, and fork any project the token can access. No guardrails.

These aren't theoretical risks. The related @modelcontextprotocol/server-git was hit with CVE-2025-68143 (path traversal) and CVE-2025-68144 (argument injection) in early 2026. Our scanner identifies exactly the structural preconditions — unbounded strings, no schema constraints — that made those CVEs inevitable.
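To make "unbounded strings, no schema constraints" concrete, here is a minimal sketch of a check that walks a tool's input schema and reports string fields with no length, pattern, enum, or format constraint. It is an illustration only, not the scanner's actual logic; the SchemaNode type, the findUnboundedStrings function, and the example schema are all hypothetical names introduced here.

```typescript
// Small subset of JSON Schema keywords, enough for this illustration.
interface SchemaNode {
  type?: string;
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  maxLength?: number;
  pattern?: string;
  enum?: unknown[];
  format?: string;
}

// Returns the paths of string fields that carry no bounding constraint.
function findUnboundedStrings(schema: SchemaNode, path = "$"): string[] {
  const findings: string[] = [];
  if (schema.type === "string") {
    const bounded =
      schema.maxLength !== undefined ||
      schema.pattern !== undefined ||
      schema.enum !== undefined ||
      schema.format !== undefined;
    if (!bounded) findings.push(path);
  }
  // Recurse into object properties and array items.
  for (const [key, child] of Object.entries(schema.properties ?? {})) {
    findings.push(...findUnboundedStrings(child, `${path}.${key}`));
  }
  if (schema.items) {
    findings.push(...findUnboundedStrings(schema.items, `${path}[]`));
  }
  return findings;
}

// Hypothetical input schema for a file-reading tool.
const exampleSchema: SchemaNode = {
  type: "object",
  properties: {
    path: { type: "string" }, // unbounded: flagged
    encoding: { type: "string", enum: ["utf8", "base64"] }, // bounded by enum
    tags: { type: "array", items: { type: "string" } }, // unbounded items: flagged
  },
};

const unbounded = findUnboundedStrings(exampleSchema);
```

In this example, unbounded holds the two flagged paths, "$.path" and "$.tags[]". A real check would need to cover more of JSON Schema (oneOf, $ref, and so on).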

Hallucination-Based Vulnerabilities: A New Vulnerability Class

We identified something no one else is scanning for: hallucination-based vulnerabilities (HBVs) — security weaknesses that exist in the semantic space between what a tool description says and what the LLM infers.

163 HBVs across 41 servers. Seven classes:

Vague descriptions — "manages user data" could mean read or delete. The LLM picks whichever fits the prompt.
Ambiguous tool names — manage_users gives the model no signal about whether it creates or destroys.
Missing scope boundaries — "access files" without specifying which files.
Short descriptions — a 17-character description forces the LLM to hallucinate capabilities.
No description — behavior is entirely inferred from the name.
Implicit authority escalation — dangerous tool described as a "helper utility."
Overlapping descriptions — two tools with 92% description overlap. The LLM picks one non-deterministically.

HBVs are invisible to traditional scanners (SAST, DAST). They can't be fixed by patching code — they require rewriting tool descriptions. And they work even with perfect authentication. OAuth doesn't help when the tool schema allows anything.
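Three of the classes above (no description, short descriptions, overlapping descriptions) lend themselves to simple mechanical checks. The sketch below is one possible way to flag them, assuming a length threshold of 20 characters and word-set Jaccard similarity as the overlap metric. Both choices, and every name in the block, are assumptions made for illustration; the source does not say how the scanner measures overlap.

```typescript
interface ToolDef {
  name: string;
  description?: string;
}

interface Finding {
  tool: string;
  kind: "no-description" | "short-description" | "overlapping-description";
  detail: string;
}

const MIN_DESCRIPTION_LENGTH = 20; // hypothetical threshold
const OVERLAP_THRESHOLD = 0.9; // hypothetical threshold

// Lowercased set of words in a description.
function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 0));
}

// Jaccard similarity: shared words divided by total distinct words.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared += 1;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

function lintDescriptions(tools: ToolDef[]): Finding[] {
  const findings: Finding[] = [];
  for (const tool of tools) {
    const desc = (tool.description ?? "").trim();
    if (desc.length === 0) {
      findings.push({ tool: tool.name, kind: "no-description", detail: "behavior must be inferred from the name" });
    } else if (desc.length < MIN_DESCRIPTION_LENGTH) {
      findings.push({ tool: tool.name, kind: "short-description", detail: `${desc.length} characters` });
    }
  }
  // Compare every pair of tools once.
  tools.forEach((a, i) => {
    tools.slice(i + 1).forEach((b) => {
      const score = jaccard(wordSet(a.description ?? ""), wordSet(b.description ?? ""));
      if (score >= OVERLAP_THRESHOLD) {
        findings.push({
          tool: a.name,
          kind: "overlapping-description",
          detail: `${Math.round(score * 100)}% word overlap with ${b.name}`,
        });
      }
    });
  });
  return findings;
}

// Hypothetical tool list.
const exampleFindings = lintDescriptions([
  { name: "manage_users", description: "Manages user data" },
  { name: "run_task" },
  { name: "get_file", description: "Read the contents of a file from the workspace directory" },
  { name: "read_file", description: "Read the contents of a file from the workspace directory" },
]);
```

The remaining classes (vague wording, missing scope, implicit authority escalation) are semantic judgments and would need more than string heuristics.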

The Thesis

The MCP specification is vulnerable by default. It allows — and through its reference implementations, actively encourages — empty schemas, unbounded inputs, and vague tool descriptions. Schema strictness and semantic validation must move from optional best practice to protocol-level mandatory.
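As a contrast to empty schemas and vague descriptions, here is a sketch of what a stricter tool definition could look like. The name, description, and inputSchema fields follow the shape of an MCP tool definition; the tool itself (read_project_file), its limits, and the hand-written isAcceptedInput check are hypothetical and stand in for a real JSON Schema validator.

```typescript
// Hypothetical tool with an explicit scope in its description and a bounded input.
const readProjectFileTool = {
  name: "read_project_file",
  description:
    "Read one UTF-8 text file inside the project directory. " +
    "Read-only: never writes, deletes, or follows paths outside the project.",
  inputSchema: {
    type: "object",
    properties: {
      path: {
        type: "string",
        maxLength: 256,
        // Relative path only: no leading slash, no ".." segments.
        pattern: "^(?!/)(?!.*\\.\\.)[A-Za-z0-9_./-]+$",
      },
    },
    required: ["path"],
    additionalProperties: false,
  },
} as const;

// Hand-rolled check for this one schema, for illustration only.
function isAcceptedInput(input: Record<string, unknown>): boolean {
  const spec = readProjectFileTool.inputSchema.properties.path;
  const keys = Object.keys(input);
  // additionalProperties: false, and "path" is required.
  if (keys.length !== 1 || keys[0] !== "path") return false;
  const value = input["path"];
  if (typeof value !== "string") return false;
  if (value.length > spec.maxLength) return false;
  return new RegExp(spec.pattern).test(value);
}
```

With these assumptions, isAcceptedInput({ path: "src/index.ts" }) is accepted, while "../etc/passwd", "/etc/passwd", and any input with extra keys are rejected before the tool body runs.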

Try It Yourself

The scanner is open source:

npx @agentsid/scanner -- npx @modelcontextprotocol/server-filesystem ./

Full paper, methodology, and all 100 scan reports:
https://github.com/stevenkozeniesky02/agentsid-scanner/blob/master/docs/state-of-agent-security-2026.md

Steven Kozeniesky — AgentsID Research (agentsid.dev)

Why 88% of MCP Servers Have No Real Authentication (And How to Fix It)

AgentsID — Fri, 27 Mar 2026 20:23:41 +0000

AI agents are accessing databases, sending emails, calling APIs, and making purchases. But there's no standard way to identify them, limit what they can do, or trace their actions back to a human.

I dug into the numbers:

88% of MCP servers need authentication
Only 8.5% use OAuth
53% rely on static API keys in environment variables
80% of organizations can't tell what their agents are doing in real-time

This is the wild west. So I built AgentsID to fix it.

The Problem

When you build an MCP server, every tool is wide open by default. Any agent with the API key can call any tool — search, delete, deploy, admin reset — with zero restrictions.

There's no way to:

Give Agent A access to search but block delete
Know which agent made which tool call
Trace an agent's actions back to the human who authorized it

The Fix: 3 Lines of Middleware

Install the SDK:

npm install @agentsid/sdk

Add the middleware:

import { createHttpMiddleware } from '@agentsid/sdk';

const guard = createHttpMiddleware({
  projectKey: process.env.AGENTSID_PROJECT_KEY,
});

Validate every tool call:

const auth = await guard.validate(token, toolName);
if (!auth.permission.allowed) {
  return { error: 'Blocked', reason: auth.permission.reason };
}

That's it. Every tool call is now validated.

What You Can Control

AgentsID uses a deny-first model. Everything is blocked unless you explicitly allow it. The permission engine supports 14 constraint types, grouped into five categories:

Access — Allow/deny by tool name with wildcards (search_* allowed, delete_* blocked)
Time & Rate — Restrict to business hours, limit calls per minute/hour
Behavioral — Require tools to run in sequence, detect anomalous behavior
Resource — Set budget caps, limit session duration
Governance — Require human approval for sensitive actions, limit delegation depth
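The Access category is the easiest to picture in code. The following is a minimal sketch of deny-first matching with wildcards, written from the description above rather than from the AgentsID source; the Policy shape and the function names are hypothetical, and the real engine may resolve rules differently.

```typescript
interface Policy {
  allow: string[];
  deny: string[];
}

// Convert a wildcard pattern such as "search_*" into an anchored RegExp.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*"); // "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

function matchesAny(patterns: string[], tool: string): boolean {
  return patterns.some((p) => globToRegExp(p).test(tool));
}

// Deny-first: an explicit deny always wins, and anything not allowed is blocked.
function isAllowed(policy: Policy, tool: string): boolean {
  if (matchesAny(policy.deny, tool)) return false;
  return matchesAny(policy.allow, tool);
}

// Hypothetical policy mirroring the example in the list above.
const examplePolicy: Policy = { allow: ["search_*"], deny: ["delete_*"] };
```

Under this policy, search_docs is allowed, delete_user is blocked by the deny rule, and deploy is blocked simply because nothing allows it.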

Delegation Chains

When Agent A spawns Agent B, permissions automatically narrow. Agent B can never have more access than Agent A. Revoke the parent and the entire chain downstream stops.
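One way to picture the narrowing and revocation rules is a registry where a child's permissions are the intersection of what it requests and what its parent holds, and where every check walks up the chain. This is a sketch under those assumptions, not the AgentsID implementation; DelegationRegistry and its methods are hypothetical names.

```typescript
interface AgentRecord {
  id: string;
  parentId: string | null;
  tools: Set<string>;
  revoked: boolean;
}

class DelegationRegistry {
  private agents = new Map<string, AgentRecord>();

  registerRoot(id: string, tools: string[]): void {
    this.agents.set(id, { id, parentId: null, tools: new Set(tools), revoked: false });
  }

  // The child only receives requested tools that the parent already holds.
  delegate(parentId: string, childId: string, requested: string[]): void {
    const parent = this.agents.get(parentId);
    if (!parent) throw new Error(`unknown parent agent: ${parentId}`);
    if (this.agents.has(childId)) throw new Error(`agent already exists: ${childId}`);
    const granted = requested.filter((t) => parent.tools.has(t));
    this.agents.set(childId, { id: childId, parentId, tools: new Set(granted), revoked: false });
  }

  revoke(id: string): void {
    const agent = this.agents.get(id);
    if (agent) agent.revoked = true;
  }

  // A call is permitted only if the agent holds the tool and no ancestor is revoked.
  canCall(id: string, tool: string): boolean {
    const agent = this.agents.get(id);
    if (!agent || !agent.tools.has(tool)) return false;
    let current: AgentRecord | undefined = agent;
    while (current) {
      if (current.revoked) return false;
      current = current.parentId === null ? undefined : this.agents.get(current.parentId);
    }
    return true;
  }
}

// Hypothetical chain: A delegates to B, B delegates to C.
const registry = new DelegationRegistry();
registry.registerRoot("agent-a", ["search", "delete"]);
registry.delegate("agent-a", "agent-b", ["search", "deploy"]); // "deploy" is dropped
registry.delegate("agent-b", "agent-c", ["search"]);
```

Here agent-b ends up with search only, and calling registry.revoke("agent-a") makes canCall return false for agent-b and agent-c as well.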

Audit Trail

Every tool call — allowed or denied — is logged. You get a full record of what each agent did, when, and why it was allowed or blocked. The dashboard shows it all in a live feed.
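As a rough picture of what such a record might contain, the sketch below logs one event per decision, allowed or denied, with the agent, the tool, the outcome, and the reason. The field names and the AuditLog class are hypothetical; the actual AgentsID event format is not described in this post.

```typescript
interface AuditEvent {
  readonly timestamp: string; // ISO 8601
  readonly agentId: string;
  readonly tool: string;
  readonly allowed: boolean;
  readonly reason: string;
}

class AuditLog {
  private events: AuditEvent[] = [];

  // Record every decision, including denials.
  record(agentId: string, tool: string, allowed: boolean, reason: string): AuditEvent {
    const event: AuditEvent = {
      timestamp: new Date().toISOString(),
      agentId,
      tool,
      allowed,
      reason,
    };
    this.events.push(event);
    return event;
  }

  forAgent(agentId: string): AuditEvent[] {
    return this.events.filter((e) => e.agentId === agentId);
  }

  denied(): AuditEvent[] {
    return this.events.filter((e) => !e.allowed);
  }
}

// Hypothetical usage.
const auditLog = new AuditLog();
auditLog.record("agent-a", "search_docs", true, "matched allow rule search_*");
auditLog.record("agent-a", "delete_user", false, "matched deny rule delete_*");
auditLog.record("agent-b", "deploy", false, "no allow rule matched");
```

Keeping denied calls in the same log as allowed ones is what lets you answer both "what did this agent do" and "what did it try to do".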

Getting Started

npm install @agentsid/sdk    # TypeScript
pip install agentsid          # Python
gem install agentsid          # Ruby

Free tier: 25 agents, 10,000 events/month. No credit card.

https://agentsid.dev/dashboard
https://agentsid.dev/docs
https://agentsid.dev/guides (Claude, Cursor, Codex)
https://github.com/stevenkozeniesky02/agentsid

DEV Community: AgentsID