We shipped what looked like a routine improvement to one of our MCP tools: a declared outputSchema, generated from our existing Zod types. Server-side smoke tests passed. The structured output validated cleanly against JSON Schema 2020-12 with an independent validator. We deployed.
Then Claude Desktop refused to call the tool at all.
This post is a write-up of that incident—what failed, why all our tests missed it, and what an actual end-to-end test for an MCP server needs to look like. One caveat up front: this is n=1, observed on our specific setup (@modelcontextprotocol/sdk ^1.10, zodToJsonSchema, Claude Desktop as the client, June 2026). Treat it as a mechanism worth knowing about, not a universal law.
"MCP tool not showing up": outputSchema rejected by the client ingest layer
That heading is the literal symptom, because it's what you'll be searching for at 11 p.m. The variants we tried ourselves: MCP tool not showing up, outputSchema rejected, MCP tool ingest failed, Claude Desktop tool error request_id.
Here's what we saw, concretely:
- Calling the tool from Claude Desktop produced an error carrying a request_id: the request reached Anthropic's side and was rejected there.
- Our server logs showed a new session line and then... nothing. No [tool] invocation line. The call never arrived at our handler.
- A curl against our own server with the same payload? Worked perfectly. Valid response, valid structuredContent.
So the tool definition itself was being rejected somewhere between the client and our server—at what I'll call the tool-ingest layer: the validation Anthropic's infrastructure runs on tool definitions before the model is ever allowed to use them. Our server was never consulted.
The setup: adding outputSchema via zodToJsonSchema
The tool in question was watch_entity from our open-source Czech company due-diligence MCP server (cz-agents on GitHub). We already returned structuredContent and wanted to declare its shape, which the MCP spec supports via outputSchema:
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// Shape of the structuredContent that watch_entity returns.
const watchEntityOutput = z.object({
  ico: z.string(),
  status: z.literal("watching"),
  expires_at: z.string().nullable(), // <- this matters later
});

// server, watchEntityInput, and handler are defined elsewhere in the project.
server.registerTool(
  "watch_entity",
  {
    description: "Watch a Czech company for changes",
    inputSchema: watchEntityInput,
    outputSchema: zodToJsonSchema(watchEntityOutput), // <- the change
  },
  handler
);
Looks innocent. But look at what zodToJsonSchema actually emits by default:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "ico": { "type": "string" },
    "status": { "const": "watching" },
    "expires_at": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ]
    }
  },
  "required": ["ico", "status", "expires_at"]
}
Three constructs in this schema are worth noticing:
- The $schema URI is draft-07, not the JSON Schema 2020-12 dialect the MCP spec gravitates toward.
- z.null()/.nullable() produces "type": "null" branches.
- z.literal() produces const.
All three are completely standard JSON Schema. Every off-the-shelf validator we threw at this—including an independent jsonschema check of our structuredContent against the schema—passed without complaint.
But the ingest layer that vets tool definitions on Anthropic's side was, at the time we hit this, stricter than a generic validator. It didn't accept this combination, and the failure mode wasn't "schema ignored"—it was the entire tool being rejected.
I want to be careful with framing here: this isn't a story about anyone doing something wrong. The SDK emitted valid draft-07. Our validator correctly validated it. The ingest layer enforced a tighter profile than "any valid JSON Schema." Validation layers differ, and the only place all of them meet is a live round-trip. That lesson survives even if the ingest layer accepts these constructs tomorrow.
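The three constructs listed above can be checked for mechanically before a deploy. The sketch below is a hypothetical helper of ours (findRiskyConstructs is not part of any SDK) that walks a generated schema and reports where they occur. It does not reproduce the ingest layer's actual rules, which we don't know; it only flags what we observed alongside the rejection.

```typescript
// Hypothetical lint helper: flags the three constructs discussed above.
// It is NOT the ingest layer's rule set, only a list of what we observed.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

interface Finding {
  path: string;
  issue: string;
}

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keywords whose values hold subschemas (a partial list, enough for a sketch).
const SCHEMA_MAPS = ["properties", "$defs", "definitions"];
const SCHEMA_LISTS = ["anyOf", "oneOf", "allOf"];
const SINGLE_SCHEMAS = ["items", "additionalProperties", "not"];

function findRiskyConstructs(schema: Json | undefined, path: string = "#"): Finding[] {
  if (!isObject(schema)) return [];
  const findings: Finding[] = [];

  const dialect = schema["$schema"];
  if (typeof dialect === "string" && dialect.indexOf("draft-07") !== -1) {
    findings.push({ path, issue: "draft-07 $schema" });
  }
  const type = schema["type"];
  if (type === "null" || (Array.isArray(type) && type.indexOf("null") !== -1)) {
    findings.push({ path, issue: "null type" });
  }
  if ("const" in schema) {
    findings.push({ path, issue: "const" });
  }

  // Recurse only through keywords that contain schemas, so a property that
  // happens to be named "const" or "type" is not misreported.
  for (const key of SCHEMA_MAPS) {
    const map = schema[key];
    if (isObject(map)) {
      for (const name of Object.keys(map)) {
        findings.push(...findRiskyConstructs(map[name], `${path}/${key}/${name}`));
      }
    }
  }
  for (const key of SCHEMA_LISTS) {
    const list = schema[key];
    if (Array.isArray(list)) {
      list.forEach((child, i) => {
        findings.push(...findRiskyConstructs(child, `${path}/${key}/${i}`));
      });
    }
  }
  for (const key of SINGLE_SCHEMAS) {
    findings.push(...findRiskyConstructs(schema[key], `${path}/${key}`));
  }
  return findings;
}

// The schema shown earlier in this post.
const generated: Json = {
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "object",
  properties: {
    ico: { type: "string" },
    status: { const: "watching" },
    expires_at: { anyOf: [{ type: "string" }, { type: "null" }] },
  },
  required: ["ico", "status", "expires_at"],
};

const findings = findRiskyConstructs(generated);
```

On the schema from this post it reports three findings: the dialect at the root, const under status, and the null branch under expires_at. An empty result only means none of these three constructs is present, not that the client will accept the tool.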
The diagnostic key: asymmetry
What saved us from days of wrong theories was one observation: only the tool with outputSchema failed. Our text-only tool on the same server, same session, same deploy—get_dd_report—kept working the whole time.
That asymmetry rules out almost everything else you'd suspect first:
- A network or transport issue would hit both tools.
- A server crash or bad deploy would hit both tools.
- A client-side transient would not reproduce selectively, every time, on exactly one tool.
A generic outage doesn't aim. When one tool fails deterministically and its siblings don't, diff the tool definitions, not the infrastructure. In our case, the diff was one field: outputSchema.
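As a rough illustration of "diff the definitions", here is a minimal sketch that compares which fields a working and a failing tool definition declare. The two definition objects are simplified stand-ins (the get_dd_report description and both input schemas are placeholders, not our real ones), and fieldsOnlyIn is a hypothetical helper.

```typescript
// Hypothetical shape: a tool definition treated as a bag of named fields.
type ToolDefinition = { [field: string]: unknown };

// Fields that `a` declares and `b` does not, sorted for stable output.
function fieldsOnlyIn(a: ToolDefinition, b: ToolDefinition): string[] {
  return Object.keys(a)
    .filter((key) => !(key in b))
    .sort();
}

// Simplified stand-ins for the two tools; schemas are placeholders.
const getDdReport: ToolDefinition = {
  name: "get_dd_report",
  description: "(placeholder description)",
  inputSchema: { type: "object" },
};

const watchEntity: ToolDefinition = {
  name: "watch_entity",
  description: "Watch a Czech company for changes",
  inputSchema: { type: "object" },
  outputSchema: { type: "object" },
};

const onlyInFailing = fieldsOnlyIn(watchEntity, getDdReport); // ["outputSchema"]
const onlyInWorking = fieldsOnlyIn(getDdReport, watchEntity); // []
```

A presence check like this is deliberately crude. It would not notice two tools that both declare a field with different contents, but it is enough to surface the kind of one-field difference we had.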
The second diagnostic key was the log shape. A new session line with no [tool] line means the handshake happened but the tool call was never dispatched to us. Combined with an error that carries a request_id, that places the rejection firmly on the ingest side—not in our process, not in the user's network.
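The log-shape check can be written down as a small classifier. This is a sketch under assumptions: it takes log lines as plain strings and looks for the substrings "new session" and "[tool]", which is how our server happens to log; the sample lines below are made up for illustration and your markers will differ.

```typescript
type LogVerdict = "no-session" | "session-without-tool-call" | "tool-dispatched";

// Classifies a slice of server log covering one attempted tool call.
// The marker substrings are assumptions based on our own log format.
function classifyLog(lines: string[]): LogVerdict {
  const sawSession = lines.some((line) => line.indexOf("new session") !== -1);
  const sawTool = lines.some((line) => line.indexOf("[tool]") !== -1);

  if (sawTool) return "tool-dispatched";
  return sawSession ? "session-without-tool-call" : "no-session";
}

// Made-up sample lines, shaped like the two cases described in this post.
const duringIncident = classifyLog(["new session id=abc"]);
const afterFix = classifyLog(["new session id=def", "[tool] watch_entity"]);
```

Here duringIncident is "session-without-tool-call" and afterFix is "tool-dispatched". The first verdict is only evidence of an upstream rejection when you know the client actually attempted a call in that window; a session where nobody invoked anything looks identical.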
The fix: drop outputSchema, keep structuredContent
Here's the part that surprised me: structuredContent works fine without a declared outputSchema. The schema declaration is metadata; the structured payload travels regardless.
server.registerTool(
  "watch_entity",
  {
    description: "Watch a Czech company for changes",
    inputSchema: watchEntityInput,
    // outputSchema removed — structuredContent still flows
  },
  async (args) => {
    const result = await watchEntity(args);
    return {
      content: [{ type: "text", text: summarize(result) }],
      structuredContent: result, // <- still delivered to the client
    };
  }
);
We removed the outputSchema declaration, redeployed, and the end-to-end flow worked perfectly: Claude Desktop called the tool, the [tool] line showed up in docker logs, and structured content arrived at the client. You lose client-side schema validation of your output, which is a real (if modest) loss—but you keep the structured data, and you keep a working tool.
If you do want to keep outputSchema, the cautious path based on what we observed is to post-process the generated schema—strip the draft-07 $schema URI, replace "type": "null" branches with a non-null type plus optionality, replace const with a single-value enum—and then test it through a real client before trusting it. Which brings us to the actual point.
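The post-processing described above might look roughly like the following. Treat it as a sketch, not something we have verified against the ingest layer: relaxSchema and stripNull are hypothetical helpers, they only descend through object properties (not items, $defs, or nested anyOf), and "optionality" is implemented by removing the property from required. That changes the contract: a handler returning this shape would have to omit expires_at instead of sending null.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isNullSchema(value: Json): boolean {
  return isObject(value) && value["type"] === "null";
}

// Removes {"type": "null"} branches from a property's anyOf. If exactly one
// branch is left, it is inlined in place of the anyOf.
function stripNull(prop: JsonObject): { schema: JsonObject; wasNullable: boolean } {
  const branches = prop["anyOf"];
  if (!Array.isArray(branches)) return { schema: prop, wasNullable: false };

  const kept = branches.filter((branch) => !isNullSchema(branch));
  // Nothing to strip, or stripping would leave an empty anyOf: leave as-is.
  if (kept.length === branches.length || kept.length === 0) {
    return { schema: prop, wasNullable: false };
  }

  const only = kept[0];
  if (kept.length === 1 && isObject(only)) {
    const rest: JsonObject = { ...prop };
    delete rest["anyOf"];
    return { schema: { ...rest, ...only }, wasNullable: true };
  }
  return { schema: { ...prop, anyOf: kept }, wasNullable: true };
}

// Returns a rewritten copy of the schema; the input is not mutated.
function relaxSchema(schema: JsonObject): JsonObject {
  const out: JsonObject = {};
  for (const key of Object.keys(schema)) {
    const value = schema[key];
    if (value === undefined || key === "$schema") continue; // drop dialect URI
    if (key === "const") out["enum"] = [value]; // const -> single-value enum
    else out[key] = value;
  }

  const props = out["properties"];
  if (!isObject(props)) return out;

  const newProps: JsonObject = {};
  const nowOptional: string[] = [];
  for (const name of Object.keys(props)) {
    const prop = props[name];
    if (!isObject(prop)) {
      if (prop !== undefined) newProps[name] = prop; // boolean schemas pass through
      continue;
    }
    const { schema: stripped, wasNullable } = stripNull(prop);
    if (wasNullable) nowOptional.push(name);
    newProps[name] = relaxSchema(stripped);
  }
  out["properties"] = newProps;

  // Formerly nullable properties become optional instead.
  const required = out["required"];
  if (Array.isArray(required)) {
    out["required"] = required.filter(
      (entry) => typeof entry !== "string" || nowOptional.indexOf(entry) === -1
    );
  }
  return out;
}

// The schema shown earlier in this post, run through the rewrite.
const relaxed = relaxSchema({
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "object",
  properties: {
    ico: { type: "string" },
    status: { const: "watching" },
    expires_at: { anyOf: [{ type: "string" }, { type: "null" }] },
  },
  required: ["ico", "status", "expires_at"],
});
```

For the schema in this post, relaxed has no $schema key, status becomes { "enum": ["watching"] }, expires_at becomes { "type": "string" }, and required shrinks to ["ico", "status"]. Whether that form passes is exactly the kind of thing only a live client can tell you.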
Why your curl smoke test will never catch this
Our smoke test did what most MCP smoke tests do: hit the server over HTTP, list tools, call each one, and validate the response. It's a fine test of our code. It exercises exactly zero of the validation that happens on the client/platform side.
The chain for a hosted MCP tool call looks roughly like this:
Claude (model) → Anthropic tool-ingest/validation → your MCP server → back
A curl test starts at step 3. Everything that can reject your tool in steps 1–2—schema dialect restrictions, definition size limits, naming rules, whatever else the platform enforces—is invisible to it. Server-side green means "my half works," and nothing more.
So our deploy checklist gained one non-negotiable step. The real E2E test for an MCP server is a live client round-trip:
- Deploy to a staging endpoint.
- Connect a real Claude client (Claude Desktop, claude.ai, or Claude Code) to it.
- Ask it to invoke every tool—especially any tool whose definition changed, not just its handler.
- Watch your server logs for the invocation marker. For us: docker logs -f mcp-server | grep '\[tool\]'. A session line without a tool line indicates a rejection upstream of your server.
- Confirm the result rendered correctly in the client.
It's manual, it's slightly annoying, and it takes five minutes. It is also the only test in our suite that would have caught this—and the only one that exercises the same path your users do.
Takeaways
- A tool definition change is riskier than a handler change: it gets re-validated by every layer between you and the model, and the failure mode is the whole tool disappearing.
- zodToJsonSchema defaults to draft-07 with type: "null" and const: valid JSON Schema that stricter ingest profiles may not accept. Inspect the generated schema; don't assume.
- Selective failure is information. One tool down, siblings up → diff the definitions.
- A new session log line without a [tool] log line localizes the rejection upstream of your server.
- structuredContent does not require outputSchema. When in doubt, ship the data without the declaration.
- Server-side smoke tests validate your half of the contract. Only a live Claude client round-trip validates the whole thing. Put one in your deploy checklist.
Observed June 2026 on @modelcontextprotocol/sdk ^1.10 with Claude Desktop. If the ingest behavior has changed since, the specific constructs may pass; the testing lesson stands either way. The server involved is open source: cz-agents MCP servers.