MCP servers are starting to look like normal infrastructure.
That means they need boring infrastructure checks.
The mistake I kept seeing is this:
"The server starts, and tools/list returns a clean schema. Therefore it works."
That is not enough.
An MCP server can pass initialize, advertise every expected tool, and still fail every real call because auth, scopes, tenant boundaries, environment variables, downstream permissions, or read-only roles are broken.
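To make that gap concrete, here is a minimal Python sketch. It assumes the standard MCP JSON-RPC shapes: a tools/list reply carrying a tools array, and a tools/call reply carrying either a JSON-RPC error or a result with an isError flag. The function name readiness and the sample payloads are hypothetical illustrations, not mcp-probe internals.

```python
def readiness(list_response: dict, call_responses: dict) -> str:
    """Classify a server from its tools/list reply and per-tool tools/call replies."""
    tools = list_response.get("result", {}).get("tools", [])
    advertised = {tool["name"] for tool in tools}
    if not advertised:
        return "fail: no tools advertised"
    for name in sorted(advertised):
        reply = call_responses.get(name)
        if reply is None:
            return f"warn: {name} advertised but never called"
        # A JSON-RPC error or an isError result both mean the real call failed.
        if "error" in reply or reply.get("result", {}).get("isError"):
            return f"fail: {name} advertised but call failed"
    return "pass"


# Hypothetical payloads: the schema looks clean, but the real call is rejected.
listing = {"jsonrpc": "2.0", "id": 1, "result": {"tools": [{"name": "logs_query"}]}}
calls = {
    "logs_query": {
        "jsonrpc": "2.0",
        "id": 2,
        "result": {
            "isError": True,
            "content": [{"type": "text", "text": "401 Unauthorized"}],
        },
    }
}
print(readiness(listing, calls))
```

A check that stops at the listing would report this server as healthy. Only the call result exposes the auth failure.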
So I pushed mcp-probe@1.8.0 further toward being a real CI readiness gate for MCP servers.
npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
What changed
1. Warnings can now fail CI
By default, warnings still exit 0. That keeps existing users from getting surprise CI failures.
But production gates often need stricter behavior:
mcp-probe --config mcp-probe.config.json --fail-on-warn
With --fail-on-warn, auth handoff issues, permission warnings, or incomplete readiness receipts can block the workflow.
That matters because many MCP failures are not hard crashes. They are degraded states:
- OAuth flow requires a browser redirect the agent cannot complete
- a server starts but every tool call returns 401
- a database tool works with admin credentials but fails with the intended read-only role
- the workflow mentions a probe but does not actually run the production boundary check
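The exit-code policy described above can be sketched roughly as follows. This is an illustration only: the status labels and the non-zero value 1 are my assumptions, since all that is stated here is that warnings exit 0 by default and that --fail-on-warn lets them block the workflow.

```python
def exit_code(statuses: list[str], fail_on_warn: bool = False) -> int:
    """Map check statuses ("pass", "warn", "fail") to a process exit code."""
    if "fail" in statuses:
        return 1  # hard failures always block (assumed non-zero value)
    if fail_on_warn and "warn" in statuses:
        return 1  # degraded states block only in strict mode
    return 0


# A degraded run: the server starts, but the auth handoff produced a warning.
degraded = ["pass", "warn", "pass"]
print(exit_code(degraded))
print(exit_code(degraded, fail_on_warn=True))
```

The default keeps existing pipelines green, while strict mode turns the same degraded run into a blocking result.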
2. Doctor now checks the actual workflow receipt
mcp-probe doctor already checked whether a GitHub Actions workflow existed.
But that is not enough either.
The new behavior is stricter: the required flags must appear on the same actual mcp-probe run step.
This should pass:
- run: npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
This should not count as a complete gate:
- run: npx @k08200/mcp-probe --config mcp-probe.config.json
- run: npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn
The flags are present somewhere in the workflow, but no single run step proves the intended config is actually being checked with CI summaries and strict warning handling.
That is the difference between "we have a gate" and "the gate is enforcing the thing we trust."
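A rough Python sketch of the same-step rule, assuming the required flags are the three used in the passing example (--config, --github-summary, --fail-on-warn). The helper names and the token matching are hypothetical simplifications, not how doctor is actually implemented. For instance, this sketch ignores --flag=value forms.

```python
import re

# Assumed from the passing example; the real list of required flags may differ.
REQUIRED_FLAGS = ("--config", "--github-summary", "--fail-on-warn")

# Matches "mcp-probe", "@k08200/mcp-probe", and versioned forms like "...@latest".
PROBE_TOKEN = re.compile(r"(@k08200/)?mcp-probe(@[\w.\-]+)?")


def has_complete_gate(run_steps: list[str]) -> bool:
    """Return True only if one single run step invokes the probe with every flag."""
    for command in run_steps:
        tokens = command.split()
        runs_probe = any(PROBE_TOKEN.fullmatch(token) for token in tokens)
        if runs_probe and all(flag in tokens for flag in REQUIRED_FLAGS):
            return True
    return False


complete = [
    "npx @k08200/mcp-probe@latest --config mcp-probe.config.json "
    "--github-summary --fail-on-warn",
]
split_across_steps = [
    "npx @k08200/mcp-probe --config mcp-probe.config.json",
    "npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn",
]
print(has_complete_gate(complete))
print(has_complete_gate(split_across_steps))
```

Checking each step independently is the point: a union of flags across steps would accept the second workflow even though no step enforces the full gate.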
3. Tool call coverage is now tied to expected tools
For config-based checks, you can declare the expected tool catalog:
{
  "servers": [
    {
      "name": "datadog",
      "target": "https://mcp.example.com/mcp",
      "transport": "http",
      "headers": {
        "Authorization": "Bearer ${DATADOG_MCP_TOKEN}"
      },
      "expectedTools": ["logs_query"],
      "forbiddenTools": ["delete_dashboard", "rotate_api_key"],
      "toolsFile": "./datadog.tools.json"
    }
  ]
}
If expectedTools and toolsFile are both set, every expected tool needs a sidecar sample input.
That means CI checks not just "is the tool advertised?" but "did we actually provide a meaningful dry-run sample for the tool an agent depends on?"
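As a sketch of that coverage rule, the check reduces to "every name in expectedTools has an input entry in the sidecar". The function below is a hypothetical illustration that takes the two files as already-parsed dictionaries. The metrics_query tool is invented for the example.

```python
def missing_samples(server: dict, sidecar: dict) -> list[str]:
    """List expected tools that have no sample input in the sidecar file."""
    expected = server.get("expectedTools", [])
    samples = sidecar.get("tools", {})
    return [name for name in expected if "input" not in samples.get(name, {})]


# Hypothetical config: two expected tools, but only one has a sidecar sample.
server = {
    "name": "datadog",
    "expectedTools": ["logs_query", "metrics_query"],
    "toolsFile": "./datadog.tools.json",
}
sidecar = {
    "tools": {
        "logs_query": {"input": {"query": "service:web status:error"}},
    }
}
print(missing_samples(server, sidecar))
```

Here metrics_query would be reported as uncovered: it is expected, but nothing exercises it with a meaningful input.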
4. Sidecar inputs are the real contract
Auto-generated inputs are useful for smoke tests, but they mostly hit schema validation.
Real readiness checks need meaningful inputs:
{
  "tools": {
    "logs_query": {
      "input": {
        "query": "service:web status:error",
        "timeframe": "1h"
      },
      "expect": {
        "status": "pass",
        "not_error_code": [401, 403],
        "requiredFields": ["source", "freshness"],
        "maxRows": 100
      }
    }
  }
}
For database-backed MCP servers, these assertions are the interesting part:
- does the read-only role work?
- are row limits enforced?
- are broad exports/admin actions absent or gated?
- are denied writes structured enough for agents to recover?
- do results include provenance fields like source and freshness?
- does the response avoid leaking secrets, stack traces, or raw internals?
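To show how assertions like these could be evaluated, here is a minimal Python sketch. The semantics are my assumptions, not documented mcp-probe behavior: not_error_code rejects listed error codes, requiredFields must be present in the result, and maxRows caps the returned row count. The response shape (error_code, data, rows) is hypothetical.

```python
def check_expectations(expect: dict, response: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    data = response.get("data", {})

    code = response.get("error_code")
    if code is not None and code in expect.get("not_error_code", []):
        problems.append(f"forbidden error code {code}")

    for field in expect.get("requiredFields", []):
        if field not in data:
            problems.append(f"missing field {field}")

    max_rows = expect.get("maxRows")
    rows = data.get("rows", [])
    if max_rows is not None and len(rows) > max_rows:
        problems.append(f"{len(rows)} rows exceeds maxRows {max_rows}")

    return problems


expect = {
    "not_error_code": [401, 403],
    "requiredFields": ["source", "freshness"],
    "maxRows": 100,
}
# Hypothetical responses: one healthy, one rejected by the read-only role.
healthy = {"data": {"source": "logs", "freshness": "30s", "rows": [{"msg": "timeout"}]}}
denied = {"error_code": 403, "data": {}}
print(check_expectations(expect, healthy))
print(check_expectations(expect, denied))
```

Returning a list of problems rather than a boolean keeps the output useful in a CI summary, where the reason for a failure matters as much as the failure itself.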
Install
npm install -D @k08200/mcp-probe
Or run directly:
npx @k08200/mcp-probe@latest doctor
npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
GitHub: https://github.com/k08200/mcp-probe
npm: https://www.npmjs.com/package/@k08200/mcp-probe
The goal is simple: CI for MCP should test the contract an agent will actually depend on, not just whether the process starts.