How to vet an MCP server before your AI agent calls it (and auto-block the risky ones)

#mcp #security #ai #webdev

If you are wiring MCP servers into an agent, you are taking on a dependency with no SLA, no uptime history, and no failure record. It works in the demo. Then six weeks later it starts failing half its calls, or its latency triples, and nobody notices until a workflow breaks.

I wanted to know how bad this actually is, so I built a neutral index of the whole ecosystem. Here is what the data says, and a 30-second way to protect yourself.

The data
We deduplicated every MCP server we could find across the major registries. The count: 22,561 servers.

How many have any independent reliability data, meaning a third party has actually observed whether they work at runtime? About 0.5%.

That is not just hobby projects. Real companies ship MCP servers too (Databricks, Snowflake, PayPal, Netlify, and Appwrite all do), and the same gap applies across the board: independent runtime reliability data is the exception, not the rule. And here is the part that should bother you more than the coverage gap. Among the servers we can measure, most score in the low 40s out of 100. The ecosystem optimized for quantity of servers and skipped whether they work.

Composition, for the curious: ~30% is code and dev tooling (the biggest category by far); the rest is fragmented across search, data, AI, productivity, and a long tail.

Why GitHub stars do not help you
The instinct is to trust a server because the repo has stars or the company is well known. Stars measure popularity at a point in time. They tell you nothing about:

- whether the endpoint is up right now
- its success rate when called with real arguments
- latency, especially the p95 tail that wrecks agent loops
- whether the tool descriptions changed (a real prompt-injection vector: a server can swap its tool description after you trusted it)

A reputable company can ship an MCP server that is slow, flaky, or abandoned. Static signals will not catch it. You need runtime behavior.
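One way to catch a swapped tool description is to pin a fingerprint of the tool list when you first review a server and compare it on every later connection. The sketch below assumes you already have the `tools` array from a `tools/list` response (each entry carrying `name`, `description`, and `inputSchema`); the function names are hypothetical and not part of any SDK.

```python
import hashlib
import json


def fingerprint_tools(tools):
    """Return a stable SHA-256 hex digest over each tool's model-facing fields."""
    canonical = sorted(
        (
            {
                "name": tool.get("name"),
                "description": tool.get("description"),
                "inputSchema": tool.get("inputSchema"),
            }
            for tool in tools
        ),
        key=lambda tool: tool["name"] or "",  # sort so server-side ordering does not matter
    )
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tools_changed(pinned_fingerprint, tools):
    """True when the current tool list no longer matches the reviewed one."""
    return fingerprint_tools(tools) != pinned_fingerprint
```

Store the fingerprint next to the server URL at review time. If `tools_changed` returns true later, treat the server as unreviewed until a human has read the new descriptions.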

How to vet an MCP server (practical checklist)
1. Call it yourself before you trust it. Do a real initialize handshake and a representative tool call. Measure latency and whether it actually returns valid results.
2. Look at the tail, not the average. A 50ms median with a 6s p95 means roughly one in twenty agent steps stalls for six seconds or more.
3. Check recency. When was the repo last touched? An abandoned server is a latent outage.
4. Treat tool descriptions as untrusted input. They are model-facing instructions; a malicious or compromised server can poison them.
5. Get an independent signal. A marketplace cannot neutrally rate the servers it hosts and sells (conflict of interest), so look for a third party that measures runtime behavior.

That last point is the gap we are filling. You can look up any server's independent trust score here: dominionobservatory.com/atlas/score. Servers with no measured data show as "unrated" rather than a fake number, because pretending to know is worse than admitting you do not.
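The first two checklist items (call it yourself, look at the tail) can be sketched as a small probe. This is a minimal sketch, not a full MCP client: it assumes the server speaks JSON-RPC over plain HTTP POST, the `protocolVersion` string and client name are placeholders, and it only confirms a 2xx reply rather than validating the result body. `initialize_once`, `percentile`, and `probe` are hypothetical names.

```python
import json
import time
import urllib.request


def initialize_once(server_url, timeout=10.0):
    """Send one MCP initialize request over HTTP and return the elapsed seconds."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: use the revision your client speaks
            "capabilities": {},
            "clientInfo": {"name": "vet-probe", "version": "0.1"},  # placeholder client name
        },
    }).encode("utf-8")
    request = urllib.request.Request(
        server_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()  # urlopen raises HTTPError on 4xx/5xx responses
    return time.perf_counter() - start


def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = -(-pct * len(ordered) // 100)  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def probe(call, runs=20):
    """Run call() `runs` times; report success rate plus p50 and p95 latency."""
    latencies, failures = [], 0
    for _ in range(runs):
        try:
            latencies.append(call())
        except Exception:
            failures += 1  # any exception counts as a failed call
    report = {"success_rate": (runs - failures) / runs}
    if latencies:
        report["p50"] = percentile(latencies, 50)
        report["p95"] = percentile(latencies, 95)
    return report
```

A run might look like `probe(lambda: initialize_once("https://target-server.com/mcp"))`. Comparing `p50` with `p95` in the report is what exposes a slow tail that an average would hide.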

The 30-second version: route through a trust gateway
The easiest protection is to stop calling unknown servers directly and route your agent's tool calls through a trust gateway. You change one base URL. It checks the server's score, blocks anything below your threshold, forwards the call, and hands back an attestation receipt.

Instead of calling the server directly:

POST https://target-server.com/mcp

Route the same JSON-RPC body through the gateway:

POST https://dominionobservatory.com/atlas/gateway?target=https://target-server.com/mcp&min_score=50

A blocked call returns a 403 with the score and the reason. A passing call comes back with the server's normal response plus headers you can log for audit:

X-Dominion-Trust: pass:92
X-Dominion-Receipt: urn:dominion:gw:... (attestation receipt id)
X-Dominion-Attestation: link to the filable record

Gateway docs: dominionobservatory.com/atlas/gateway.
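If you build the gateway URL in code, it is safer to percent-encode the target so that any query string on the target server's own URL is not mistaken for a gateway parameter. The sketch below assumes the gateway decodes a percent-encoded `target` the way standard query parsers do, and that `X-Dominion-Trust` always has the `verdict:score` shape shown above; both helper names are hypothetical.

```python
from urllib.parse import urlencode

GATEWAY = "https://dominionobservatory.com/atlas/gateway"


def gateway_url(target, min_score=50):
    """Build the gateway URL with the target server percent-encoded."""
    return GATEWAY + "?" + urlencode({"target": target, "min_score": min_score})


def parse_trust_header(value):
    """Split a value like 'pass:92' into ('pass', 92); score is None if absent."""
    verdict, _, score = value.partition(":")
    score = score.strip()
    return (verdict.strip(), int(score) if score.isdigit() else None)
```

POST your usual JSON-RPC body to `gateway_url(...)`, treat a 403 as a blocked call, and log the parsed `X-Dominion-Trust` value together with the receipt header for audit.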

Prefer to check inline without proxying your traffic? Query the score yourself before each call:

import logging

import requests

log = logging.getLogger(__name__)


def trust_ok(server_url, min_score=70):
    """Return False only when the server has a measured score below min_score."""
    r = requests.get("https://dominionobservatory.com/atlas/server",
                     params={"url": server_url}, timeout=5)
    if r.status_code == 404:
        log.warning("MCP server not indexed, allowing: %s", server_url)
        return True  # not indexed yet, allow but log
    r.raise_for_status()  # surface other HTTP errors instead of parsing an error body
    d = r.json()
    score = d.get("trust_score")
    if score is None or not d.get("total_calls"):
        return True  # unrated: no independent data yet
    return score >= min_score


if not trust_ok("https://some-mcp-server.com/mcp"):
    raise RuntimeError("Blocked: MCP server below trust threshold")

Both patterns (gateway + inline), with JavaScript: dominionobservatory.com/atlas/gate.
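Checking the score before every call adds an HTTP round trip to each agent step. One option is to cache the verdict for a few minutes. The wrapper below is a sketch under that assumption: `CachedTrustCheck` is a hypothetical helper, the five-minute TTL is an arbitrary default, and it wraps any check function that takes a server URL and returns a boolean (such as `trust_ok` above).

```python
import time


class CachedTrustCheck:
    """Memoize a trust check per server URL for ttl_seconds."""

    def __init__(self, check, ttl_seconds=300.0, clock=time.monotonic):
        self._check = check
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # server_url -> (expires_at, verdict)

    def __call__(self, server_url):
        now = self._clock()
        hit = self._cache.get(server_url)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the network lookup
        verdict = self._check(server_url)
        self._cache[server_url] = (now + self._ttl, verdict)
        return verdict
```

With `cached_ok = CachedTrustCheck(trust_ok)`, the agent calls `cached_ok(url)` before each tool call and only hits the score endpoint once per server per TTL window. The trade-off is that a score drop can take up to one TTL to be noticed.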

Why this matters more every month
Regulation is catching up. Singapore's IMDA agentic-AI governance is in force, the EU AI Act's transparency obligations apply from August 2026, and MiCA record-keeping is live. If your agents act on third-party tools, you increasingly have to prove what they used and that it was verified. A firm's own internal logs are not a neutral record. That is a different post, but it is coming fast.

For now: stop trusting MCP servers because they are popular. Measure them, or use someone who does.

Full ecosystem data: dominionobservatory.com/atlas/report. If you build or run an MCP server, I would genuinely like your take: what would make a reliability score you would actually trust?