Martin Havel

Your MCP Server Passes Every Test — and Claude Still Rejects the Tool

We shipped what looked like a routine improvement to one of our MCP tools: a declared outputSchema, generated from our existing Zod types. Server-side smoke tests passed. The structured output validated cleanly against JSON Schema 2020-12 with an independent validator. We deployed.

Then Claude Desktop refused to call the tool at all.

This post is a write-up of that incident—what failed, why all our tests missed it, and what an actual end-to-end test for an MCP server needs to look like. One caveat up front: this is n=1, observed on our specific setup (@modelcontextprotocol/sdk ^1.10, zodToJsonSchema, Claude Desktop as the client, June 2026). Treat it as a mechanism worth knowing about, not a universal law.

"MCP tool not showing up": outputSchema rejected by the client ingest layer

That heading is the literal symptom, because it's what you'll be searching for at 11 p.m. The variants we tried ourselves: MCP tool not showing up, outputSchema rejected, MCP tool ingest failed, Claude Desktop tool error request_id.

Here's what we saw, concretely:

  • Calling the tool from Claude Desktop produced an error carrying a request_id—the request reached Anthropic's side and was rejected there.
  • Our server logs showed a new session line and then... nothing. No [tool] invocation line. The call never arrived at our handler.
  • A curl against our own server with the same payload? Worked perfectly. Valid response, valid structuredContent.

So the tool definition itself was being rejected somewhere between the client and our server—at what I'll call the tool-ingest layer: the validation Anthropic's infrastructure runs on tool definitions before the model is ever allowed to use them. Our server was never consulted.

The setup: adding outputSchema via zodToJsonSchema

The tool in question was watch_entity from our open-source Czech company due-diligence MCP server (cz-agents on GitHub). We already returned structuredContent and wanted to declare its shape, which the MCP spec supports via outputSchema:

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const watchEntityOutput = z.object({
  ico: z.string(),
  status: z.literal("watching"),
  expires_at: z.string().nullable(),  // <- this matters later
});

server.registerTool(
  "watch_entity",
  {
    description: "Watch a Czech company for changes",
    inputSchema: watchEntityInput,
    outputSchema: zodToJsonSchema(watchEntityOutput), // <- the change
  },
  handler
);

Looks innocent. But look at what zodToJsonSchema actually emits by default:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "ico": { "type": "string" },
    "status": { "const": "watching" },
    "expires_at": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ]
    }
  },
  "required": ["ico", "status", "expires_at"]
}

Three constructs in this schema are worth noticing:

  1. The $schema URI is draft-07, not JSON Schema 2020-12, which recent revisions of the MCP spec treat as the default dialect.
  2. z.null() / .nullable() produces "type": "null" branches.
  3. z.literal() produces const.

All three are completely standard JSON Schema. Every off-the-shelf validator we threw at this—including an independent jsonschema check of our structuredContent against the schema—passed without complaint.

But the ingest layer that vets tool definitions on Anthropic's side was, at the time we hit this, stricter than a generic validator. It didn't accept this combination, and the failure mode wasn't "schema ignored"—it was the entire tool being rejected.
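
To make the "inspect the generated schema" step concrete, here is a minimal sketch of a pre-deploy check that walks a schema and reports the three constructs listed above. The function name `findRiskyConstructs` and the path format are hypothetical, and the list of flagged constructs reflects only what we observed, not a documented rule set of any platform.

```typescript
// Minimal JSON type for walking a generated schema.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper: returns the paths of the three constructs named above.
// Limitation: a property that is literally named "const" or "$schema" would be
// flagged as well, so treat hits as prompts to look, not as verdicts.
function findRiskyConstructs(schema: Json, path: string = "$"): string[] {
  const hits: string[] = [];

  if (Array.isArray(schema)) {
    schema.forEach((item, i) => {
      hits.push(...findRiskyConstructs(item, `${path}[${i}]`));
    });
    return hits;
  }
  if (schema === null || typeof schema !== "object") return hits;

  const dialect = schema["$schema"];
  if (typeof dialect === "string" && dialect.includes("draft-07")) {
    hits.push(`${path}.$schema (draft-07 dialect)`);
  }

  const type = schema["type"];
  if (type === "null" || (Array.isArray(type) && type.includes("null"))) {
    hits.push(`${path}.type (null)`);
  }

  if ("const" in schema) hits.push(`${path}.const`);

  // Recurse into every nested value (properties, anyOf branches, items, ...).
  for (const [key, value] of Object.entries(schema)) {
    hits.push(...findRiskyConstructs(value, `${path}.${key}`));
  }
  return hits;
}
```

Run against the schema shown earlier, this reports the top-level $schema, the const under status, and the null branch under expires_at. An empty result does not prove a client will accept the schema; it only means none of these three constructs is present.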

I want to be careful with framing here: this isn't a story about anyone doing something wrong. The SDK emitted valid draft-07. Our validator correctly validated it. The ingest layer enforced a tighter profile than "any valid JSON Schema." Validation layers differ, and the only place all of them meet is a live round-trip. That lesson survives even if the ingest layer accepts these constructs tomorrow.

The diagnostic key: asymmetry

What saved us from days of wrong theories was one observation: only the tool with outputSchema failed. Our text-only tool on the same server, same session, same deploy—get_dd_report—kept working the whole time.

That asymmetry rules out almost everything else you'd suspect first:

  • A network or transport issue would hit both tools.
  • A server crash or bad deploy would hit both tools.
  • A client-side transient would not reproduce selectively, every time, on exactly one tool.

A generic outage doesn't aim. When one tool fails deterministically and its siblings don't, diff the tool definitions, not the infrastructure. In our case, the diff was one field: outputSchema.
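
As a rough illustration of "diff the tool definitions", the sketch below compares the top-level fields of a working and a failing definition. The helper name and the return shape are hypothetical, and the two example objects use placeholder values rather than our real definitions.

```typescript
// A tool definition as a plain object, e.g. one entry from a tools/list response.
type ToolDefinition = Record<string, unknown>;

// Hypothetical helper: lists the top-level fields present in only one definition.
function diffDefinitionFields(
  working: ToolDefinition,
  failing: ToolDefinition,
): { onlyInWorking: string[]; onlyInFailing: string[] } {
  return {
    onlyInWorking: Object.keys(working).filter((key) => !(key in failing)),
    onlyInFailing: Object.keys(failing).filter((key) => !(key in working)),
  };
}

// Placeholder definitions: field names follow the post, values are made up.
const workingTool: ToolDefinition = {
  name: "get_dd_report",
  description: "placeholder",
  inputSchema: { type: "object" },
};
const failingTool: ToolDefinition = {
  name: "watch_entity",
  description: "placeholder",
  inputSchema: { type: "object" },
  outputSchema: { type: "object" },
};

console.log(diffDefinitionFields(workingTool, failingTool));
```

This only compares which fields exist, which was enough in our case. If both tools declare the same fields, the next step is comparing the schemas themselves.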

The second diagnostic key was the log shape. A new session line with no [tool] line means the handshake happened but the tool call was never dispatched to us. Combined with an error that carries a request_id, that places the rejection firmly on the ingest side—not in our process, not in the user's network.
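
The same reasoning can be written down as a small decision function. This is a sketch under assumptions: it presumes your server logs contain the substring "new session" when a session starts and "[tool]" when a handler is invoked, which matches our logs but may not match yours, and the verdict names are made up for illustration.

```typescript
type Verdict = "dispatched" | "likely-upstream-rejection" | "inconclusive";

// Hypothetical classifier for a single failed tool call.
// serverLogLines: the server log lines captured around the failed call.
// clientErrorHasRequestId: whether the error shown by the client carried a request_id.
function classifyFailedCall(
  serverLogLines: string[],
  clientErrorHasRequestId: boolean,
): Verdict {
  const sawSession = serverLogLines.some((line) => line.includes("new session"));
  const sawToolCall = serverLogLines.some((line) => line.includes("[tool]"));

  // The handler ran, so the problem is in your own code or in the response.
  if (sawToolCall) return "dispatched";

  // Handshake happened, call never arrived, and the platform issued a request_id.
  if (sawSession && clientErrorHasRequestId) return "likely-upstream-rejection";

  // No session at all, or no request_id: check network and transport first.
  return "inconclusive";
}
```

The verdict is deliberately "likely": a session without a tool line can also mean the model simply chose not to call the tool, which is why the request_id on the client error matters.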

The fix: drop outputSchema, keep structuredContent

Here's the part that surprised me: structuredContent works fine without a declared outputSchema. The schema declaration is metadata; the structured payload travels regardless.

server.registerTool(
  "watch_entity",
  {
    description: "Watch a Czech company for changes",
    inputSchema: watchEntityInput,
    // outputSchema removed — structuredContent still flows
  },
  async (args) => {
    const result = await watchEntity(args);
    return {
      content: [{ type: "text", text: summarize(result) }],
      structuredContent: result, // <- still delivered to the client
    };
  }
);

We removed the outputSchema declaration, redeployed, and the end-to-end flow worked perfectly: Claude Desktop called the tool, the [tool] line showed up in docker logs, and structured content arrived at the client. You lose client-side schema validation of your output, which is a real (if modest) loss—but you keep the structured data, and you keep a working tool.

If you do want to keep outputSchema, the cautious path based on what we observed is to post-process the generated schema: strip the draft-07 $schema URI, replace "type": "null" branches with a non-null type and make the property optional (the handler then has to omit the field instead of returning null), and replace const with a single-value enum. Then test the result through a real client before trusting it. Which brings us to the actual point.
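
We did not ship this path ourselves, so the following is only a sketch of what such a post-processor could look like. The function names are hypothetical, it rewrites only properties and items (other composition keywords are copied unchanged), and whether the resulting schema passes a given client is exactly what the live test has to tell you.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

const isObject = (v: Json | undefined): v is JsonObject =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Removes { "type": "null" } branches from anyOf and reports whether any were found.
function stripNullBranch(prop: JsonObject): { schema: JsonObject; wasNullable: boolean } {
  const branches = prop["anyOf"];
  if (!Array.isArray(branches)) return { schema: prop, wasNullable: false };

  const kept = branches.filter((b) => !(isObject(b) && b["type"] === "null"));
  if (kept.length === 0 || kept.length === branches.length) {
    return { schema: prop, wasNullable: false };
  }

  const { anyOf: _dropped, ...rest } = prop;
  const only = kept[0];
  // A single remaining branch is inlined; several remain as a smaller anyOf.
  const schema: JsonObject =
    kept.length === 1 && isObject(only) ? { ...rest, ...only } : { ...rest, anyOf: kept };
  return { schema, wasNullable: true };
}

// Hypothetical post-processor for the three rewrites described above.
function sanitizeOutputSchema(schema: JsonObject): JsonObject {
  const out: JsonObject = {};
  const nowOptional: string[] = [];

  for (const [key, value] of Object.entries(schema)) {
    if (key === "$schema") continue; // drop the dialect URI
    if (key === "const") {
      out["enum"] = [value]; // const becomes a single-value enum
    } else if (key === "properties" && isObject(value)) {
      const props: JsonObject = {};
      for (const [name, prop] of Object.entries(value)) {
        if (!isObject(prop)) {
          props[name] = prop;
          continue;
        }
        const { schema: stripped, wasNullable } = stripNullBranch(prop);
        if (wasNullable) nowOptional.push(name);
        props[name] = sanitizeOutputSchema(stripped);
      }
      out["properties"] = props;
    } else if (key === "items" && isObject(value)) {
      out[key] = sanitizeOutputSchema(value);
    } else {
      out[key] = value;
    }
  }

  // A formerly nullable property is no longer required:
  // the handler must omit it instead of sending null.
  const required = out["required"];
  if (Array.isArray(required)) {
    out["required"] = required.filter(
      (name) => typeof name !== "string" || !nowOptional.includes(name),
    );
  }
  return out;
}
```

Note the contract change this implies: once expires_at is optional rather than nullable, a structuredContent payload with "expires_at": null no longer matches the declared schema, so the handler has to drop the key in that case.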

Why your curl smoke test will never catch this

Our smoke test did what most MCP smoke tests do: hit the server over HTTP, list tools, call each one, and validate the response. It's a fine test of our code. It exercises exactly zero of the validation that happens on the client/platform side.

The chain for a hosted MCP tool call looks roughly like this:

Claude (model) → Anthropic tool-ingest/validation → your MCP server → back

A curl test enters this chain at the third hop, your MCP server. Everything that can reject your tool in the first two hops (schema dialect restrictions, definition size limits, naming rules, whatever else the platform enforces) is invisible to it. Server-side green means "my half works," and nothing more.

So our deploy checklist gained one non-negotiable step. The real E2E test for an MCP server is a live client round-trip:

  1. Deploy to a staging endpoint.
  2. Connect a real Claude client (Claude Desktop, claude.ai, or Claude Code) to it.
  3. Ask it to invoke every tool—especially any tool whose definition changed, not just its handler.
  4. Watch your server logs for the invocation marker. For us: docker logs -f mcp-server | grep '\[tool\]'. A session line without a tool line indicates a rejection upstream of your server.
  5. Confirm the result rendered correctly in the client.

It's manual, it's slightly annoying, and it takes five minutes. It is also the only test in our suite that would have caught this—and the only one that exercises the same path your users do.
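
The round-trip itself stays manual, but the log check in step 4 can be partly scripted. The sketch below assumes each invocation is logged as "[tool] <name>", which is how our marker looks; the function names are hypothetical and the pattern needs adjusting to your own log format.

```typescript
// Collects the tool names that appear after an "[tool]" marker in the captured logs.
function invokedTools(logLines: string[]): Set<string> {
  const seen = new Set<string>();
  for (const line of logLines) {
    const name = /\[tool\]\s+(\S+)/.exec(line)?.[1];
    if (name) seen.add(name);
  }
  return seen;
}

// Hypothetical checklist helper: which expected tools never reached a handler?
function toolsNeverInvoked(expectedTools: string[], logLines: string[]): string[] {
  const seen = invokedTools(logLines);
  return expectedTools.filter((tool) => !seen.has(tool));
}
```

After asking the client to call every tool, feed the captured log lines and your tool list into toolsNeverInvoked. Any name it returns either was not attempted by the model or was rejected before reaching your server, and both cases deserve a closer look before you ship.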

Takeaways

  • A tool definition change is riskier than a handler change: it gets re-validated by every layer between you and the model, and the failure mode is the whole tool disappearing.
  • zodToJsonSchema defaults to draft-07 with type: "null" and const—valid JSON Schema that stricter ingest profiles may not accept. Inspect the generated schema; don't assume.
  • Selective failure is information. One tool down, siblings up → diff the definitions.
  • A new session log line without a [tool] log line localizes the rejection upstream of your server.
  • structuredContent does not require outputSchema. When in doubt, ship the data without the declaration.
  • Server-side smoke tests validate your half of the contract. Only a live Claude client round-trip validates the whole thing. Put one in your deploy checklist.

Observed June 2026 on @modelcontextprotocol/sdk ^1.10 with Claude Desktop. If the ingest behavior has changed since, the specific constructs may pass; the testing lesson stands either way. The server involved is open source: cz-agents MCP servers.
