I monitored 11 public MCP servers. Latency ranged 215× (97ms to 21 seconds).

#mcp #ai #devtools #monitoring

TL;DR: I built a tiny tool that speaks MCP and ran it against 11 public Model Context Protocol servers. Handshake latency ranged from 97ms to nearly 21 seconds, a 215× spread, and the bigger problem isn't downtime at all: it's servers that quietly change their tool contracts while still returning 200 OK. Free live index + open-source CLI at the bottom.

The itch

The Model Context Protocol went from a proposal to 10,000+ public servers in about a year. Agents now lean on these servers the way web apps lean on APIs. But I kept hitting flaky failures while building on them and couldn't tell: was it my code, or the server?

There's no Pingdom for MCP. So I built one — and the first thing I did was point it at a set of well-known public servers.

How it works (no API, just the protocol)

The trick is that MCP is just a protocol: JSON-RPC 2.0 messages, which remote servers accept over HTTP. So instead of calling some third-party API, the tool pretends to be an agent. It runs the real handshake (initialize, then tools/list), exactly as Claude or any other MCP client would, and measures the round trip. Zero dependencies, ~150 lines.
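To make the handshake concrete, here is a minimal Python sketch of the same idea. It is not the tool's actual source: the function names (http_sender, timed_handshake), the client name, and the protocol version string are assumptions made for illustration. It also assumes the server replies with a plain JSON body; servers that answer with an event stream or require a session header would need extra handling.

```python
import json
import time
import urllib.request


def http_sender(url, timeout=30.0):
    """Return a send(message) function that POSTs one JSON-RPC message to url.

    Assumption: the server answers with a plain JSON body. Servers that reply
    with text/event-stream or require a session header are not handled here.
    """
    def send(message):
        request = urllib.request.Request(
            url,
            data=json.dumps(message).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "Accept": "application/json, text/event-stream",
            },
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
        # Notifications typically get an empty reply, so there is nothing to parse.
        return json.loads(body) if body.strip() else None
    return send


def timed_handshake(send):
    """Run initialize, then tools/list, and return (latency_ms, tool_names)."""
    start = time.perf_counter()
    send({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed version string
            "capabilities": {},
            # Hypothetical client identity, chosen for this sketch.
            "clientInfo": {"name": "handshake-probe", "version": "0.0.1"},
        },
    })
    # A notification carries no id; it tells the server the client is ready.
    send({"jsonrpc": "2.0", "method": "notifications/initialized"})
    reply = send({"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
    latency_ms = (time.perf_counter() - start) * 1000.0
    tools = (reply or {}).get("result", {}).get("tools", [])
    return latency_ms, [tool["name"] for tool in tools]
```

Usage would look like timed_handshake(http_sender("https://example.com/mcp")), where the URL is a placeholder. Passing the transport in as a function keeps the timing logic separate from the HTTP details, so the same probe can be exercised against a fake server in tests.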

Finding 1: a 215× latency spread

Server	Handshake latency	Tools
Hugging Face	97 ms	8
Context7	108 ms	2
Cloudflare Docs	148 ms	2
SpaceMolt	194 ms	183
Exa	239 ms	2
CrashStory	522 ms	18
Roundtable	551 ms	13
DeepWiki	605 ms	3
Chainflip Broker	634 ms	6
Microsoft Learn	664 ms	3
GitMCP	20,820 ms	5

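The median was 522ms, so the 20,820ms outlier is roughly 40× slower than a typical server. If your agent has to wait on a server that takes 21 seconds just to complete a handshake, that's 21 seconds your user stares at a spinner, or the request times out. Latency here isn't vanity; it's whether the agent works or hangs.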
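The spread and median can be checked directly from the table above. This short Python snippet only restates the published numbers; the variable names are mine.

```python
from statistics import median

# Handshake latencies in milliseconds, copied from the table above.
latencies_ms = {
    "Hugging Face": 97,
    "Context7": 108,
    "Cloudflare Docs": 148,
    "SpaceMolt": 194,
    "Exa": 239,
    "CrashStory": 522,
    "Roundtable": 551,
    "DeepWiki": 605,
    "Chainflip Broker": 634,
    "Microsoft Learn": 664,
    "GitMCP": 20820,
}

fastest = min(latencies_ms.values())
slowest = max(latencies_ms.values())
spread = slowest / fastest            # slowest handshake relative to fastest
median_ms = median(latencies_ms.values())

print(f"spread: {spread:.1f}x")       # prints "spread: 214.6x"
print(f"median: {median_ms} ms")      # prints "median: 522 ms"
```

The ratio comes out at 214.6, which rounds to the 215× in the title.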

Finding 2: the real blind spot — contract drift

Those 11 servers expose 245 tools between them. Every tool is a contract: a name and a set of required inputs that agents depend on.

Here's what nobody's watching. A normal uptime monitor sees 200 OK and says healthy. But MCP servers rarely fail by going down — they fail by quietly changing the contract: a tool gets renamed in a redeploy, an optional param becomes required, a tool disappears. The server still returns 200 OK, and every agent calling it silently breaks.

Uptime tells you the server answered. It can't tell you the server still does what your agent expects.

Catching that requires actually speaking the protocol and diffing the tool schemas over time. That's the whole point of the tool.
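As a rough illustration of what diffing the tool schemas can mean, here is a minimal Python sketch that compares two tools/list snapshots. It is an assumption about one reasonable approach, not the tool's real implementation: the function names and the sample tools (search, fetch_page, get_page) are hypothetical. It relies only on each tool having a name and an inputSchema with a required list.

```python
def required_inputs(tool):
    """Return the set of required input names declared in a tool's inputSchema."""
    return set(tool.get("inputSchema", {}).get("required", []))


def diff_tools(old_tools, new_tools):
    """Compare two tools/list snapshots and describe contract changes."""
    old = {tool["name"]: tool for tool in old_tools}
    new = {tool["name"]: tool for tool in new_tools}
    changes = []
    # A renamed tool shows up as one removal plus one addition.
    for name in sorted(old.keys() - new.keys()):
        changes.append(f"removed tool: {name}")
    for name in sorted(new.keys() - old.keys()):
        changes.append(f"added tool: {name}")
    # For tools present in both snapshots, flag inputs that became required.
    for name in sorted(old.keys() & new.keys()):
        newly_required = required_inputs(new[name]) - required_inputs(old[name])
        for param in sorted(newly_required):
            changes.append(f"newly required input: {name}.{param}")
    return changes


# Hypothetical snapshots taken before and after a redeploy.
yesterday = [
    {"name": "search", "inputSchema": {"type": "object", "required": ["query"]}},
    {"name": "fetch_page", "inputSchema": {"type": "object", "required": ["url"]}},
]
today = [
    {"name": "search", "inputSchema": {"type": "object", "required": ["query", "limit"]}},
    {"name": "get_page", "inputSchema": {"type": "object", "required": ["url"]}},
]

for change in diff_tools(yesterday, today):
    print(change)
```

Both snapshots would come back with 200 OK, yet the diff reports a removed tool, an added tool, and a newly required input. Those are exactly the changes an uptime check misses.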

Try it

Free live index (continuously updated): https://mcpwatch.app/reliability-index.html
Full report: https://mcpwatch.app/report.html
Open-source CLI (MIT): npx mcpwatch — https://github.com/ClaudefoustCEO/mcpwatch

If you run an MCP server, I'd genuinely love to add it to the index — drop the URL in the comments. And I'm curious what you'd want monitored that I'm not thinking about.

Top comments (1)

Alex Shev • Jun 15

The latency spread is the signal here. MCP servers are often discussed like they are just capability catalogs, but handshake and tool-call latency become part of the agent's reasoning budget.

I would love to see this kind of index include a "safe to put in an interactive loop" threshold, not only uptime. A 20-second server can still be useful, but probably not in the same workflow as a 100ms one.