The gap nobody talks about
There are 10,000+ Model Context Protocol servers now. Every major agent framework
(LangChain, AutoGen, CrewAI, plus every IDE from Cursor to Claude Code) can call
them. And yet, if you ask "how reliable is this specific MCP server today", the
answer everyone gives you is some flavor of:
- GitHub stars and last commit date (Glama, Smithery)
- Static metadata completeness score (MCP Scorecard, Nerq, Zarq)
- A security scan of the repo (BlueRock)
None of those look at what the server actually does when an agent calls it.
None of them can tell you that sg-regulatory-data-mcp returned a 500 to the
last 12 agents that called it while its README stayed pristine. The
static-scorer tier is five platforms deep and growing, and every single one
of them has the same blind spot: runtime.
What we built
Dominion Observatory is a cross-ecosystem MCP trust network that accepts
runtime behavioral reports from any agent, in any framework, anywhere. Five
fields per report — no PII, no query content, no tool outputs:
{
  "server_url": "...",
  "success": true,
  "latency_ms": 142,
  "tool_name": "...",
  "http_status": 200
}
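In Python, that payload is a single dict. A minimal sketch of assembling it; the make_report helper and the field-set guard are illustrative, not part of the SDK:

```python
from typing import Any

# The five fields named in the post; nothing else is ever sent.
ALLOWED_FIELDS = {"server_url", "success", "latency_ms", "tool_name", "http_status"}

def make_report(server_url: str, success: bool, latency_ms: int,
                tool_name: str, http_status: int) -> dict[str, Any]:
    """Build a report dict containing only the five allowed fields."""
    payload = {
        "server_url": server_url,
        "success": success,
        "latency_ms": latency_ms,
        "tool_name": tool_name,
        "http_status": http_status,
    }
    # Guard: no PII, no query content, no tool outputs can sneak in.
    assert set(payload) == ALLOWED_FIELDS
    return payload
```

The guard is the point: keeping the schema closed is what makes the "no PII" claim checkable in your own pipeline.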
We publish the aggregate trust scores back via a public REST endpoint. No
auth, free forever for reads. It is the only MCP scoring network in the
ecosystem that treats the agents themselves as data producers instead of
passive subjects.
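Reading a score back is one unauthenticated GET. A sketch using only the standard library; the /api/trust query path and response shape are assumptions modeled on the /api/stats endpoint mentioned later in the post, so check the docs for the real route:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://dominion-observatory.sgdata.workers.dev"

def fetch_trust(server_url: str) -> dict:
    """Fetch the aggregate score for one server. Hypothetical endpoint path."""
    with urlopen(f"{BASE}/api/trust?server={quote(server_url, safe='')}") as resp:
        return json.loads(resp.read())

def is_trusted(score: dict, threshold: int = 40) -> bool:
    """Apply the same cutoff the Python example in this post uses."""
    return score.get("trust_score", 0) >= threshold
```

A missing server defaults to untrusted here (trust_score absent means 0), which is the conservative choice for an unknown endpoint.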
Use it in 3 lines
The SDK just landed on PyPI. One install, two function calls:
pip install dominion-observatory-sdk

from dominion_observatory import report, check_trust

# before you call an unknown server
score = check_trust("https://somempcserver.example.com")
if score["trust_score"] < 40:
    print("risky, skipping")

# after you call it (wrap any client in instrument() instead if you prefer)
report(server_url="https://somempcserver.example.com",
       success=True, latency_ms=142, tool_name="list_items")
That's it. No SDK auth. No rate limits on writes (we will add one eventually,
but if you're here early you're in the honor-system tier). Every report makes
the next agent's trust score more accurate.
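The instrument() wrapper mentioned in the snippet above isn't shown in this post; here is one plausible shape for such a wrapper, sketched as a decorator. The report_fn parameter stands in for the SDK's report() so the sketch stays self-contained:

```python
import time
from functools import wraps

def instrument(server_url, report_fn):
    """Wrap a tool-calling function; report success and latency after each call."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(tool_name, *args, **kwargs):
            start = time.perf_counter()
            try:
                result = fn(tool_name, *args, **kwargs)
                ok = True
                return result
            except Exception:
                ok = False
                raise
            finally:
                # Runs on both success and failure, so errors get reported too.
                report_fn(server_url=server_url, success=ok,
                          latency_ms=int((time.perf_counter() - start) * 1000),
                          tool_name=tool_name)
        return wrapper
    return decorator
```

Because the report fires in a finally block, a timeout or 5xx that raises still produces a success=False report instead of silently vanishing.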
Why agent-reported beats scanner-based
Scanners run once a week and look at the outside of a server. Agents call the
server thousands of times a day and see every timeout, every 5xx, every
hallucinated tool name. The moment you start aggregating that, you have something
that cannot be backfilled later — an audit trail of the MCP ecosystem that is
temporally unique.
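To make "aggregating that" concrete: the Observatory's actual scoring formula isn't published in this post, but even a naive blend of success rate and latency turns raw reports into a 0-100 score (the weights and thresholds below are purely illustrative):

```python
def trust_score(reports: list[dict]) -> float:
    """Naive 0-100 score: 80% success rate, 20% latency penalty."""
    if not reports:
        return 0.0  # no evidence yet: untrusted by default
    success_rate = sum(r["success"] for r in reports) / len(reports)
    avg_latency = sum(r["latency_ms"] for r in reports) / len(reports)
    # Full marks under 200 ms, fading linearly to zero at 2000 ms.
    latency_factor = max(0.0, min(1.0, (2000 - avg_latency) / 1800))
    return round(100 * (0.8 * success_rate + 0.2 * latency_factor), 1)
```

Even this toy version shows why the data can't be backfilled: the score is a pure function of reports that only exist if an agent recorded them at the time.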
That matters even more if you're building for enterprise. The EU AI Act
(Article 12) comes into force on August 2 and requires event logging for
high-risk AI systems, including agents. The Singapore IMDA Model AI Governance
Framework (January 2026) does too. A trust score you can't show your work
for is not a compliance artifact. Agent-reported runtime data is.
What's live right now
- pip install dominion-observatory-sdk → PyPI 0.1.0
- TypeScript via CDN → import { report, checkTrust } from "https://sdk-cdn.sgdata.workers.dev/v1/observatory.mjs"
- Public stats → https://dominion-observatory.sgdata.workers.dev/api/stats (4,584 servers tracked, ~500+ interactions/24h as of today)
- MIT license, source on GitHub at vdineshk/daee-engine
What's NOT live yet (being honest)
- npm package — blocked on a 2FA token regeneration, should be live in 24h. Until then use the CDN URL above.
- Historical trust score backfill — we only started recording April 8. Baselines are still thin for most categories. The flywheel is spinning but it's early.
- Private server telemetry — if your MCP server is behind auth, we'll need a scoped token story. Open an issue, we'll design it with you.
Call to action
If you ship an MCP server and you've ever had a user complain about
flakiness: instrument it with report() and watch your own trust score
change over the next week. If you consume MCP servers in an agent pipeline:
wrap your calls with check_trust() first, then report() after — takes 3
extra lines.
Either way, every call you send makes the MCP ecosystem slightly more
observable than it was yesterday. That is the only way a runtime trust
network ever gets built.
— Dinesh, building DAEE from Singapore
(GitHub: vdineshk/daee-engine — Observatory source, SDK source, docs)
(Questions, objections, tell me we got the math wrong: reply here or file an
issue, I read everything.)
