I wrote a read-only scanner for MCP / agent-gateway production-readiness

#ai #security #devops #mcp

I build and maintain an MCP server that works in Claude Code, Cursor, and Gemini CLI. Doing that for a while taught me something uncomfortable: the distance between a registered MCP server and full access to everything it exposes is roughly zero. The convenience is the exposure.

So when teams move an agent from "works in the demo" to "touches real systems," the same gaps show up every time. Tool access gets granted per-server instead of per-operation. Fail-open becomes the default nobody chose on purpose. New servers show up by editing a config and restarting — no review gate, no pinned versions. Often nothing traces a call across agent → gateway → model → tool, and multi-model usage runs with no routing policy, which becomes a surprise bill at the end of the month.

Most of that is visible by reading the repo. I got tired of reading it by hand, so I wrote mcp-gateway-scan.

What it checks

It's a static, read-only scan across seven dimensions:

Tool-access / RBAC — are tools scoped per-operation against an identity, or does registering a server hand over its whole surface?
Fail-close — when a model, gateway, or tool dependency degrades, does the system refuse safely or improvise?
Onboarding / supply-chain pinning — is adding a server a reviewed, declarative change with pinned versions, or a config edit and a restart?
Observability / OTel — is there end-to-end tracing you can reconstruct a call from?
Multi-LLM routing & cost — is there a routing policy, fallback behavior, and per-team cost attribution, or one hardcoded model and hope?
Secrets / identity — credentials in configs vs. references; is identity propagated to tool calls?
Production-readiness — kill-switch, rate limits, eval gates, the operational table stakes.
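
To make the first two dimensions concrete, here is a minimal sketch of what "per-operation, fail-closed" can look like in a gateway. It is an illustration only, not code from mcp-gateway-scan or any MCP SDK; the `Grant`, `GrantResolver`, and `authorize` names are hypothetical.

```typescript
// Hypothetical types for illustration; not taken from mcp-gateway-scan or any MCP SDK.
type Grant = { identity: string; server: string; operation: string };
type Decision = { allowed: boolean; reason: string };

// A resolver looks up grants for an identity. It may throw if its backing store is down.
type GrantResolver = (identity: string) => Grant[];

function authorize(
  resolveGrants: GrantResolver,
  identity: string,
  server: string,
  operation: string,
): Decision {
  let grants: Grant[];
  try {
    grants = resolveGrants(identity);
  } catch {
    // Fail closed: a degraded dependency means deny, never "allow and hope".
    return { allowed: false, reason: "grant lookup failed" };
  }
  // Per-operation scoping: a grant on the same server for a different operation is not enough.
  const match = grants.some(
    (g) => g.identity === identity && g.server === server && g.operation === operation,
  );
  return match
    ? { allowed: true, reason: "explicit grant" }
    : { allowed: false, reason: "no grant for this operation" };
}
```

With a single grant for `issues.read` on a `github` server, a call to `issues.write` is denied, and so is every call while the resolver is throwing. The fail-open variant is the same function with `allowed: true` in the `catch` branch, which is why it is easy to ship by accident.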

It's read-only by design. It never executes your code and never prints a secret value. At most it points to a secret literal sitting inline in a config where it shouldn't be: a location to check, not a verdict.
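
As a rough sketch of "point at the location, never print the value", the check below reports file, line, and key name only. It is not the scanner's actual implementation: the key-name and reference patterns are assumptions I picked for illustration, and `findInlineSecrets` is a hypothetical name.

```typescript
// Hypothetical finding shape: it records where to look and has no field for the value.
type SecretFinding = { file: string; line: number; key: string };

// Assumed heuristics: key names that suggest a credential, and values that look like
// a reference (for example ${ENV_VAR} or env:NAME) rather than a literal.
const SECRET_KEY = /(api[_-]?key|secret|token|password)/i;
const REFERENCE = /^(\$\{[^}]+\}|\$[A-Z_][A-Z0-9_]*|env:.+|vault:.+)$/;

function findInlineSecrets(file: string, text: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  const lines = text.split(/\r?\n/);
  lines.forEach((raw, index) => {
    // Match a simple one-line pair such as "key": "value" or key = 'value'.
    const m = raw.match(/["']?([A-Za-z0-9_.-]+)["']?\s*[:=]\s*["']([^"']+)["']/);
    if (!m) return;
    const key = m[1] ?? "";
    const value = m[2] ?? "";
    if (SECRET_KEY.test(key) && !REFERENCE.test(value) && value.length >= 8) {
      // Only the location is stored; the matched value is dropped here.
      findings.push({ file, line: index + 1, key });
    }
  });
  return findings;
}
```

Because the finding type has no value field, nothing downstream (a report, a log line, a CI annotation) can leak the secret by mistake.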

The seven dimensions track failure modes the official MCP security guidance and the OWASP LLM Top 10 call out — over-broad tool access, fail-open agency, and unpinned supply chains among them (MCP security best practices, OWASP LLM Top 10).

Three ways to run it

CLI, for a one-off read:

npx mcp-gateway-scan ./your-repo

CI gate, exits non-zero on any red:

npx mcp-gateway-scan --ci ./your-repo

MCP server, so your agent can scan on request:

claude mcp add gateway-scan -- npx -y mcp-gateway-scan mcp

Then ask Claude (or Cursor, once you've registered the server in its own MCP config) to "scan my gateway" and the agent calls the tool itself. The thing that scans agent stacks is, fittingly, an agent tool.

What the output looks like

mcp-gateway-scan ./your-repo

  Tool-access / RBAC ............ yellow  tools registered per-server; no per-operation grants
  Fail-close .................... red     no fallback policy; gateway errors return last cached response
  Onboarding / supply-chain ..... green   declarative config, versions pinned
  Observability / OTel .......... yellow  request logging present; no trace propagation to tool calls
  Multi-LLM routing & cost ...... green   routing policy + per-team attribution found
  Secrets / identity ............ red     API key inline in gateway.config.json
  Production-readiness .......... yellow  rate limits present; no kill-switch

  2 red · 3 yellow · 2 green

Reds are the ones I'd fix before the next deploy. Yellows are "you have the foundation, it's not wired through." Greens mean the pattern is actually in place, not just intended.
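
If you want to post-process the report, say to post a summary comment, a small parser is enough. The sketch below assumes the exact text layout shown in the sample above (name, dot leader, status, note); `parseReport` and `tally` are hypothetical helper names, not part of the tool.

```typescript
type Status = "red" | "yellow" | "green";
type Row = { dimension: string; status: Status; note: string };

// Assumes the sample layout: "<dimension> ..... <status>  <note>".
const ROW = /^\s*(.+?)\s\.{2,}\s+(red|yellow|green)\s+(.*)$/;

function parseReport(report: string): Row[] {
  const rows: Row[] = [];
  for (const line of report.split(/\r?\n/)) {
    const m = ROW.exec(line);
    // Lines that do not look like a dimension row (header, summary, blanks) are skipped.
    if (m) {
      rows.push({
        dimension: m[1] ?? "",
        status: m[2] as Status,
        note: (m[3] ?? "").trim(),
      });
    }
  }
  return rows;
}

function tally(rows: Row[]): Record<Status, number> {
  const counts: Record<Status, number> = { red: 0, yellow: 0, green: 0 };
  for (const row of rows) counts[row.status] += 1;
  return counts;
}
```

Fed the seven rows from the sample, `tally` gives 2 red, 3 yellow, and 2 green, matching the summary line.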

Wiring the CI gate

For most teams, the --ci flag is where this earns its keep. Drop it in a workflow and any red fails the job, which blocks the merge once the check is marked required:

- name: Gateway readiness gate
  run: npx mcp-gateway-scan --ci ${{ github.workspace }}

Non-zero exit on red, zero otherwise. You can ratchet: ship today with yellows allowed, then tighten the gate to fail on yellow once you've cleared the reds. The scanner is fast and has no network dependency, so it's cheap to run on every PR.
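
One way to do the ratchet is a thin wrapper that reads the summary line and decides the exit code itself. This is a sketch under assumptions: it relies on the summary format from the sample output ("2 red · 3 yellow · 2 green"), and `gateExitCode` with its `failOn` parameter is a hypothetical helper, not a flag the scanner provides.

```typescript
type Threshold = "red" | "yellow";

// Assumes the summary line format shown in the sample output.
function countFromSummary(summary: string, status: string): number | null {
  const m = new RegExp("(\\d+)\\s+" + status + "\\b").exec(summary);
  return m ? Number(m[1]) : null;
}

// Returns the exit code a wrapper script could use: 0 to pass, 1 to block.
function gateExitCode(summary: string, failOn: Threshold): number {
  const red = countFromSummary(summary, "red");
  const yellow = countFromSummary(summary, "yellow");
  // Fail closed: if the summary cannot be parsed, block rather than pass silently.
  if (red === null || yellow === null) return 1;
  if (red > 0) return 1;
  if (failOn === "yellow" && yellow > 0) return 1;
  return 0;
}
```

Start with `failOn` set to "red", then flip it to "yellow" in the same pull request that clears your last red. The gate treats unparseable output as a failure so that a format change can't quietly turn it off.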

What it doesn't do

It's a first read, not a verdict. A static scan can tell you there's no fallback policy in the config; it can't tell you whether your fail-close behavior actually holds under load, because that needs fault injection against a running system. It reads what you declared, not what happens at 3 a.m. when a provider rate-limits you.

For calibration I applied the seven-dimension methodology to LiteLLM, a mature and widely deployed proxy, pinned at a commit. It came back 4 green / 3 yellow / 0 red: production-ready, with the usual edges a structured pass surfaces in a good codebase, namely one authorization resolver that fails open where its siblings fail closed, unpinned third-party MCP servers, and per-tool least-privilege left opt-in. Then I built a deliberately broken reference gateway that fails the same checks by construction. The scanner here automates the static slice of that methodology, the part you can run in seconds, before any deeper review.

Try it

Run it: npx mcp-gateway-scan ./your-repo — MIT, github.com/willianpinho/mcp-gateway-scan
npm: mcp-gateway-scan

If you want the deeper version, I run a full Provenwright MCP Gateway Readiness Audit (live fault-injection, trace verification, a 90-day remediation roadmap) at willianpinho.com/mcp-audit. The free scanner stands on its own; that's just there if you want the rest.