DEV Community

yongrean

mcp-probe v1.4.0: Contract assertions for production MCP servers

MCP servers are starting to look like infrastructure.

That means the old readiness question is no longer enough:

Does the process start?

Even this is not enough:

Does tools/list return a clean schema?

A server can pass both checks and still fail every real agent loop because auth handoff, scopes, downstream permissions, environment setup, or data boundaries are broken.

So I shipped mcp-probe v1.4.0 with contract assertions for production MCP servers.

GitHub: https://github.com/k08200/mcp-probe

npm: https://www.npmjs.com/package/@k08200/mcp-probe

The problem: discovery is not readiness

A typical MCP smoke test looks like this:

  1. Start the server
  2. Run initialize
  3. Run tools/list
  4. Check that schemas exist

That catches broken startup and malformed tools.

But it misses the failures that matter in production:

  • The tool advertises correctly, but every call returns 401
  • OAuth requires a browser redirect the agent cannot trigger
  • The DB role is not actually read-only
  • Write attempts leak raw SQL errors or stack traces
  • Results omit metadata agents need to reason safely
  • Tenant or project scope is not preserved
  • Broad exports or admin actions are reachable
  • Error codes are unstable, so agents cannot recover

In other words: the server starts, but the contract is broken.
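
To make the gap concrete, here is a minimal sketch of a server that passes a discovery-only check and still fails a real call. The payloads are hypothetical samples that follow the general MCP result shapes (`tools` with `inputSchema`, and a call result with `isError` and `content`). The helper name `discovery_problems` is an illustration, not part of mcp-probe.

```python
def discovery_problems(tools_list_result):
    """Return the problems a discovery-only smoke test would notice."""
    problems = []
    for tool in tools_list_result.get("tools", []):
        name = tool.get("name", "<unnamed>")
        schema = tool.get("inputSchema")
        if not isinstance(schema, dict) or schema.get("type") != "object":
            problems.append(f"{name}: missing or malformed inputSchema")
    return problems


# Hypothetical tools/list result: the advertised schema is clean.
tools_list_result = {
    "tools": [
        {
            "name": "execute_sql",
            "description": "Run a read-only SQL query",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ]
}

# Hypothetical tools/call result from the same server: auth is broken.
tools_call_result = {
    "isError": True,
    "content": [{"type": "text", "text": "401 Unauthorized"}],
}

discovery_ok = not discovery_problems(tools_list_result)
call_ok = not tools_call_result.get("isError", False)
print(f"discovery ok: {discovery_ok}, real call ok: {call_ok}")
```

Discovery reports nothing wrong, yet the only call an agent would make is rejected.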

v1.4.0: sidecar contract assertions

mcp-probe already supported sidecar inputs via .mcp-probe.json so teams could run real tools/call checks instead of relying on schema-minimum dummy inputs.

v1.4.0 extends that sidecar with assertions.

Example for a database-backed MCP server:

{
  "tools": {
    "execute_sql": {
      "input": {
        "project_id": "YOUR_PROJECT_ID",
        "query": "select 1 as health_check"
      },
      "expect": {
        "status": "pass",
        "requiredFields": ["rowCount", "limit", "source", "freshness"],
        "maxRows": 100
      }
    },
    "execute_sql_write_denied": {
      "input": {
        "project_id": "YOUR_PROJECT_ID",
        "query": "delete from users where id = 1"
      },
      "expect": {
        "status": "fail",
        "errorCode": "WRITE_NOT_ALLOWED",
        "notContains": ["DATABASE_URL", "password", "stack"]
      }
    }
  }
}

Now CI can validate the contract an agent actually depends on.

What assertions are supported?

expect.status

Declare whether a call should pass, fail, or warn.

This matters most for negative probes. A write attempt against a read-only DB role should fail, so when the call is rejected, the probe passes. In that case, failure is success.

{
  "expect": {
    "status": "fail"
  }
}
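
As a rough illustration of the idea, a status expectation can be modeled as a plain comparison between the declared and the observed outcome. This is a sketch only. The function name `check_status` and the record shape are assumptions, not mcp-probe internals.

```python
VALID_STATUSES = {"pass", "fail", "warn"}


def check_status(expected, actual):
    """Return an assertion record saying whether the observed status matched."""
    if expected not in VALID_STATUSES:
        raise ValueError(f"unknown expected status: {expected!r}")
    return {
        "assertion": "status",
        "ok": actual == expected,
        "detail": f"expected {expected}, got {actual}",
    }


# Negative probe: the write was rejected, which is what the contract demands.
denied_write = check_status(expected="fail", actual="fail")

# Regression: the write unexpectedly succeeded, so the assertion does not hold.
leaky_write = check_status(expected="fail", actual="pass")
```

The second case is the one worth catching in CI: the tool call "worked", but the read-only guarantee did not.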

expect.requiredFields

Validate that result metadata exists.

For database tools, an agent often needs more than rows. It needs context:

  • rowCount
  • limit
  • source
  • freshness

{
  "expect": {
    "requiredFields": ["rowCount", "limit", "source", "freshness"]
  }
}
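
A minimal sketch of what such a check amounts to, assuming the metadata sits at the top level of the tool's structured result. Where mcp-probe actually looks for these fields is not stated here, so treat both the lookup rule and the name `missing_required_fields` as hypothetical.

```python
def missing_required_fields(result, required_fields):
    """Return the required field names that are absent from a result object."""
    return [name for name in required_fields if name not in result]


REQUIRED = ["rowCount", "limit", "source", "freshness"]

# Hypothetical result that carries the context an agent needs.
complete_result = {
    "rows": [{"health_check": 1}],
    "rowCount": 1,
    "limit": 100,
    "source": "replica",
    "freshness": "2s",
}

# Hypothetical result that only returns rows.
bare_result = {"rows": [{"health_check": 1}]}

missing_in_complete = missing_required_fields(complete_result, REQUIRED)
missing_in_bare = missing_required_fields(bare_result, REQUIRED)
```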

expect.maxRows

Catch broad exports or missing limits.

{
  "expect": {
    "maxRows": 100
  }
}

To count rows, mcp-probe looks for common result shapes such as rowCount, rowsReturned, rows, data, items, and records.
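
One way to picture that lookup is sketched below. The key names come from the list above. The precedence (explicit counters first, then array lengths) and the function names are assumptions made for illustration, not a description of mcp-probe's code.

```python
COUNT_KEYS = ("rowCount", "rowsReturned")          # explicit counters
LIST_KEYS = ("rows", "data", "items", "records")   # arrays whose length counts


def infer_row_count(result):
    """Best-effort row count from a result object, or None if undetermined."""
    for key in COUNT_KEYS:
        value = result.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            return value
    for key in LIST_KEYS:
        value = result.get(key)
        if isinstance(value, list):
            return len(value)
    return None


def within_max_rows(result, max_rows):
    """True or False when a count is found, None when it cannot be inferred."""
    count = infer_row_count(result)
    if count is None:
        return None
    return count <= max_rows


small = within_max_rows({"rowCount": 1, "rows": [{"health_check": 1}]}, 100)
broad = within_max_rows({"records": [{"id": i} for i in range(250)]}, 100)
unknown = within_max_rows({"message": "ok"}, 100)
```

Returning None for an unrecognized shape keeps "no limit violated" separate from "could not tell", which is a design choice of this sketch.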

expect.errorCode

Require stable structured error codes.

{
  "expect": {
    "status": "fail",
    "errorCode": "WRITE_NOT_ALLOWED"
  }
}

This matters because agents can only recover if errors are predictable.
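
A small sketch of the comparison, assuming the server returns a structured error either as a nested `error.code` or as a top-level `errorCode` or `code` field. Those locations are guesses at common conventions, and `extract_error_code` is a hypothetical helper.

```python
def extract_error_code(payload):
    """Pull a structured error code out of a payload, or return None."""
    error = payload.get("error")
    if isinstance(error, dict) and "code" in error:
        return error["code"]
    for key in ("errorCode", "code"):
        if key in payload:
            return payload[key]
    return None


def error_code_matches(payload, expected_code):
    """True when the payload carries exactly the expected error code."""
    return extract_error_code(payload) == expected_code


# Hypothetical stable, structured error.
structured = {"error": {"code": "WRITE_NOT_ALLOWED", "message": "Role is read-only"}}

# Hypothetical unstructured error: nothing for an agent to branch on.
unstructured = {"error": "permission denied for table users"}

stable = error_code_matches(structured, "WRITE_NOT_ALLOWED")
unstable = error_code_matches(unstructured, "WRITE_NOT_ALLOWED")
```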

expect.contains and expect.notContains

Check for expected output and leaked internals.

{
  "expect": {
    "notContains": ["DATABASE_URL", "password", "stack"]
  }
}

This catches error responses that leak raw internals such as connection strings, credentials, or stack traces.
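
Conceptually this is a substring scan over the serialized output. The sketch below assumes case-sensitive matching against a JSON dump of the result. Both are assumptions, and `substring_failures` is an illustrative name.

```python
import json


def substring_failures(output, contains=(), not_contains=()):
    """Return labels of the contains/notContains checks that did not hold."""
    text = output if isinstance(output, str) else json.dumps(output)
    failures = []
    for needle in contains:
        if needle not in text:
            failures.append(f"contains.{needle}")
    for needle in not_contains:
        if needle in text:
            failures.append(f"notContains.{needle}")
    return failures


FORBIDDEN = ["DATABASE_URL", "password", "stack"]

# Hypothetical clean denial.
clean = {"error": {"code": "WRITE_NOT_ALLOWED", "message": "Role is read-only"}}

# Hypothetical leaky denial that exposes a stack trace.
leaky = {"error": "syntax error", "stack": "at Query.run (db.js:42)"}

clean_failures = substring_failures(clean, not_contains=FORBIDDEN)
leaky_failures = substring_failures(leaky, not_contains=FORBIDDEN)
```

Short needles such as "stack" can also match harmless text, so it is worth choosing strings that are unlikely to appear in legitimate output.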

expect.not_error_code

Treat known auth/permission status codes as warnings instead of hard failures.

{
  "expect": {
    "not_error_code": [401, 403]
  }
}

This keeps OAuth handoff failures visible without confusing them with transport or runtime crashes.
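
The triage this describes can be sketched as a three-way classification. The sketch assumes the probe knows whether the call succeeded and which status code came back. The function `classify_call` and its parameters are hypothetical.

```python
def classify_call(succeeded, status_code, not_error_code=()):
    """Classify a call as pass, warn, or fail.

    Listed auth/permission codes are downgraded to warnings.
    Any other failure stays a hard failure.
    """
    if succeeded:
        return "pass"
    if status_code in not_error_code:
        return "warn"
    return "fail"


AUTH_CODES = [401, 403]

oauth_handoff = classify_call(False, 401, AUTH_CODES)   # needs credentials
runtime_crash = classify_call(False, 500, AUTH_CODES)   # real breakage
healthy_call = classify_call(True, None, AUTH_CODES)
```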

Output example

When assertions pass, the output looks like this (here for a sidecar with db_query and db_write probes):

Tool Call Dry-run
  ✓ db_query [sidecar] 1ms
    ✓ status: Tool status matched expected pass
    ✓ requiredFields.rowCount: Found required field "rowCount"
    ✓ requiredFields.limit: Found required field "limit"
    ✓ requiredFields.source: Found required field "source"
    ✓ requiredFields.freshness: Found required field "freshness"
    ✓ maxRows: Row count 1 is within maxRows 100

  ✓ db_write [sidecar] 0ms
    ✓ status: Tool status matched expected fail
    ✓ errorCode: Found expected error code WRITE_NOT_ALLOWED
    ✓ notContains.DATABASE_URL: Output does not contain "DATABASE_URL"
    ✓ notContains.password: Output does not contain "password"
    ✓ notContains.stack: Output does not contain "stack"

If a contract assertion fails, mcp-probe reports:

CONTRACT_ASSERTION_FAILED

and includes per-assertion details in terminal output, JSON output, and GitHub Actions summaries.
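
To show how per-assertion details could roll up into a single failure code, here is a hedged sketch. Only the code CONTRACT_ASSERTION_FAILED comes from the tool. The report layout and the helper `build_report` are invented for illustration and do not reflect mcp-probe's actual JSON output.

```python
import json


def build_report(tool, assertions):
    """Roll per-assertion records up into one tool-level report."""
    failed = [a for a in assertions if not a["ok"]]
    report = {
        "tool": tool,
        "status": "fail" if failed else "pass",
        "assertions": assertions,
    }
    if failed:
        # The only identifier taken from mcp-probe itself.
        report["code"] = "CONTRACT_ASSERTION_FAILED"
    return report


# Hypothetical assertion records for a probe that returned too many rows.
records = [
    {"assertion": "status", "ok": True, "detail": "expected pass, got pass"},
    {"assertion": "maxRows", "ok": False, "detail": "row count 250 exceeds 100"},
]

report = build_report("execute_sql", records)
print(json.dumps(report, indent=2))
```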

Quick start

npx @k08200/mcp-probe@latest init \
  --target @your-org/your-mcp-server \
  --discover \
  --github-actions

Then edit .mcp-probe.json with real read-only probes and run:

npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary

Why this matters

MCP CI should test the contract an agent will actually depend on, not just whether the server process starts.

For database-backed MCP servers, that means validating things like:

  • read-only role behavior
  • denied writes
  • stable error codes
  • row limits
  • tenant or project scope
  • result metadata
  • no leaked internals

mcp-probe should not know every server's semantics. But it can give teams a small, declarative way to encode the production contract their agents rely on.

That is the goal of v1.4.0.

Release: https://github.com/k08200/mcp-probe/releases/tag/v1.4.0

npm: https://www.npmjs.com/package/@k08200/mcp-probe
