A tool call can be correct and still break the agent if it returns too much.
Search results, files, transcripts, logs, browser scrapes, and nested API responses need bounded output contracts so the model receives the smallest safe evidence, not a context flood.
Fast answer
- Tool output is part of the route budget. A verbose MCP result can burn more model context than the call that produced it, then make the next planning step slower, more expensive, and less recoverable.
- A production MCP tool needs an output contract before launch: maximum bytes, maximum records, schema shape, summary rule, artifact handoff, redaction policy, and the exact denial or truncation receipt when the response exceeds budget.
- The useful test is not whether the tool can return a large JSON blob. It is whether the same route can return the minimum safe result, point to a durable artifact when needed, and prove what was omitted.
- If the trace cannot explain how many bytes or tokens were returned, why the payload was shaped that way, what artifact holds the full result, and how the agent can request the next page safely, the route is not ready for unattended loops.
Production checklist
1. Per-route output ceiling
Set a maximum response size by route, not just by server.
A search result, file summary, database row read, transcript extract, and browser scrape should not share one generic payload limit.
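One way to make per-route ceilings concrete is a small lookup that fails closed. This is a minimal sketch: the route names, the `OutputLimit` type, and every number below are hypothetical placeholders, not values from any MCP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputLimit:
    max_bytes: int
    max_records: int


# Hypothetical route names and ceilings; real values should come from traces.
ROUTE_LIMITS = {
    "search.results": OutputLimit(max_bytes=8_000, max_records=10),
    "file.summary": OutputLimit(max_bytes=4_000, max_records=1),
    "db.row_read": OutputLimit(max_bytes=16_000, max_records=50),
}


def limit_for(route: str) -> OutputLimit:
    """Return the ceiling for a route; a route with no declared ceiling is refused."""
    try:
        return ROUTE_LIMITS[route]
    except KeyError:
        raise PermissionError(f"no output ceiling declared for route {route!r}")
```

Refusing undeclared routes means a new tool cannot ship with an implicit, server-wide default.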
2. Schema before prose
Return typed fields, stable ids, result counts, omitted-count metadata, and next-page cursors before free-form explanation.
Let the model reason over bounded structure instead of raw dumps.
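A bounded result might look like the sketch below, with counts and cursor ahead of any free-form note. The `BoundedResult` shape and the `shape` helper are illustrative assumptions, not a schema defined by MCP.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BoundedResult:
    schema_version: str
    items: list                  # records that carry stable ids
    total_count: int             # matches found before bounding
    returned_count: int
    omitted_count: int
    next_cursor: Optional[str]   # None when nothing is left to fetch
    note: str = ""               # free-form explanation comes last


def shape(records: list, max_records: int) -> BoundedResult:
    """Keep the first max_records records and report what was left out."""
    kept = records[:max_records]
    omitted = len(records) - len(kept)
    return BoundedResult(
        schema_version="1",
        items=kept,
        total_count=len(records),
        returned_count=len(kept),
        omitted_count=omitted,
        next_cursor=str(len(kept)) if omitted else None,
    )


result = shape([{"id": f"doc-{i}"} for i in range(25)], max_records=10)
serialized = json.dumps(asdict(result))
```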
3. Artifact handoff
When the full payload is too large, write it to a durable artifact or provider object and return:
- reference
- checksum
- expiration
- access rule
- safe follow-up route
Do that instead of flooding context.
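The handoff can be sketched as follows. Assumptions are loud here: a local temp directory stands in for a durable artifact store, the payload is UTF-8 text, and `access_rule` and `follow_up_route` are hypothetical labels rather than real policy or route names.

```python
import hashlib
import tempfile
import time
import uuid
from pathlib import Path

# Assumption: a local directory stands in for a durable artifact store.
ARTIFACT_DIR = Path(tempfile.gettempdir()) / "mcp-artifacts"


def hand_off(payload: bytes, max_inline_bytes: int, ttl_seconds: int = 3600) -> dict:
    """Return the payload inline if it fits, else a reference to a stored copy."""
    if len(payload) <= max_inline_bytes:
        # Assumes UTF-8 text; binary payloads would need a different inline form.
        return {"mode": "inline", "data": payload.decode("utf-8")}

    ARTIFACT_DIR.mkdir(parents=True, exist_ok=True)
    artifact_id = uuid.uuid4().hex
    (ARTIFACT_DIR / artifact_id).write_bytes(payload)
    return {
        "mode": "artifact",
        "artifact_id": artifact_id,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
        "expires_at": int(time.time()) + ttl_seconds,
        "access_rule": "same-tenant-read",         # hypothetical policy label
        "follow_up_route": "artifact.read_range",  # hypothetical route name
    }
```

The checksum lets a later narrow read confirm it is slicing the same bytes the agent was told about.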
4. Summarization boundary
Name whether the tool returned:
- raw data
- extracted fields
- a lossy summary
- a sampled preview
The receipt should make lossy compression visible before the agent treats it as ground truth.
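A small sketch of that labeling, assuming four mode names and treating summaries and previews as the lossy ones. Both the `ResultMode` names and the lossy set are illustrative choices.

```python
from enum import Enum


class ResultMode(str, Enum):
    RAW = "raw"
    EXTRACT = "extract"
    SUMMARY = "summary"
    PREVIEW = "preview"


# Assumption: summaries and sampled previews cannot be treated as ground truth.
LOSSY_MODES = {ResultMode.SUMMARY, ResultMode.PREVIEW}


def label(payload: dict, mode: ResultMode) -> dict:
    """Wrap a payload so the mode and its lossiness travel with the data."""
    return {"mode": mode.value, "lossy": mode in LOSSY_MODES, "payload": payload}
```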
5. Redaction and data-use policy
Apply redaction before payload shaping.
Record which secret, customer-data, credential, prompt, or topology class was removed. Truncation is not a security control.
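The ordering can be sketched like this. The two patterns are deliberately simplistic assumptions (one per protected class) and `redact_then_truncate` is a hypothetical helper; a real rule set would be larger and reviewed.

```python
import re

# Assumption: one illustrative pattern per protected class.
PATTERNS = {
    "credential": re.compile(r"sk_[A-Za-z0-9]{16,}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def redact_then_truncate(text: str, max_chars: int) -> dict:
    """Redact on the full text first, then clip, and record both steps."""
    removed = {}
    for protected_class, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"[REDACTED:{protected_class}]", text)
        if hits:
            removed[protected_class] = hits
    return {
        "text": text[:max_chars],
        "redacted_classes": removed,
        "truncated": len(text) > max_chars,
    }


raw = "key=sk_abcdefghijklmnop1234 contact ops@example.com " + "x" * 200
out = redact_then_truncate(raw, max_chars=40)
```

Clipping first could cut a secret in half, leaving a prefix that no longer matches the pattern but still reaches the model.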
6. Pagination and refill rule
Expose a cursor, range, query refinement, or approval step for more data.
Do not let the agent repeat the same oversized call hoping the next response is smaller.
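A cursor-based refill could look like the sketch below. The cursor encoding (base64 over a small JSON object) and the `read_page` helper are assumptions for illustration; any opaque, server-issued token would do.

```python
import base64
import json
from typing import Optional


def encode_cursor(offset: int) -> str:
    return base64.urlsafe_b64encode(json.dumps({"o": offset}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["o"]


def read_page(rows: list, max_records: int, cursor: Optional[str] = None) -> dict:
    """Return one bounded page plus the cursor for the next one, if any."""
    start = decode_cursor(cursor) if cursor else 0
    end = start + max_records
    page = rows[start:end]
    return {
        "items": page,
        "returned_count": len(page),
        "remaining_count": max(len(rows) - end, 0),
        "next_cursor": encode_cursor(end) if end < len(rows) else None,
    }


rows = list(range(25))
first = read_page(rows, max_records=10)
second = read_page(rows, max_records=10, cursor=first["next_cursor"])
last = read_page(rows, max_records=10, cursor=second["next_cursor"])
```

Each call moves forward by a bounded step, so asking again never returns the same oversized payload.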
Failure fixtures
Test the context-flood cases before the agent discovers them in production.
Oversized search result
Expected: return top bounded results, total count, omitted count, ranking criteria, and a cursor or refinement hint. Do not stream every match into context.
Large file or transcript
Expected: return section summaries plus artifact reference, byte range, checksum, and follow-up extraction route instead of a full dump.
Nested JSON response
Expected: flatten or select approved fields, include the schema version, and list the omitted nested objects in the receipt before the agent plans from partial data.
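A fixture for the nested case can be sketched as a field selector that reports what it dropped. `select_fields` and the sample document are hypothetical; this version keeps only approved scalar fields and names everything else.

```python
def select_fields(doc: dict, allowed: set, schema_version: str = "1") -> dict:
    """Keep approved scalar fields and name every key that was left out."""
    kept = {
        key: value
        for key, value in doc.items()
        if key in allowed and not isinstance(value, (dict, list))
    }
    omitted = sorted(key for key in doc if key not in kept)
    return {"schema_version": schema_version, "fields": kept, "omitted_keys": omitted}


doc = {"id": "u1", "name": "Ada", "profile": {"prefs": {"theme": "dark"}}, "token": "t"}
shaped = select_fields(doc, allowed={"id", "name", "profile"})
```

Here `profile` is approved but nested, so it shows up in `omitted_keys` instead of silently vanishing.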
Sensitive field in allowed result
Expected: redact before truncation and record the protected class.
A payload clipped after the secret is already returned fails the gate.
Agent asks for “everything” again
Expected: deny or require a narrower query after budget exhaustion.
The planner should not bypass the output budget by rephrasing the same broad request.
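The last fixture can be sketched as a budget check that never looks at the query text. `OutputBudget`, the denial code, and the next-action names are hypothetical.

```python
class OutputBudget:
    """Per-session byte budget for one route (hypothetical helper)."""

    def __init__(self, max_bytes: int):
        self.remaining = max_bytes

    def authorize(self, estimated_bytes: int) -> dict:
        # The decision depends only on size, never on how the request is
        # phrased, so rewording the same broad query cannot change the outcome.
        if estimated_bytes <= self.remaining:
            self.remaining -= estimated_bytes
            return {"decision": "allow", "remaining_bytes": self.remaining}
        return {
            "decision": "deny",
            "code": "OUTPUT_BUDGET_EXCEEDED",
            "remaining_bytes": self.remaining,
            "allowed_next": ["narrow_query", "page_with_cursor", "request_approval"],
        }


budget = OutputBudget(max_bytes=10_000)
first_call = budget.authorize(8_000)   # broad query, fits
retry = budget.authorize(8_000)        # same broad query, reworded
narrowed = budget.authorize(1_500)     # narrower query
```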
Trace fields
The output receipt should make omitted data auditable.
Once the agent moves on, operators need to know whether it acted on raw data, an extraction, a summary, or a clipped preview. The trace should keep returned payload size, omitted data, redaction, artifact references, and allowed next actions in one place.
Useful trace fields:
- route id and tool call id
- caller / tenant / workspace
- operation class and data class
- output ceiling in bytes / records / tokens
- actual bytes and estimated tokens returned
- raw count, returned count, and omitted count
- schema version and selected fields
- redaction rule and protected class
- summary / extract / raw-data mode
- artifact id, checksum, and expiration
- cursor, range, or refill route
- policy decision and denial / truncation code
- receipt id and allowed next action
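Those fields could be carried in one record and written as a single log line. The `OutputReceipt` type, its field names, and the sample values are assumptions for illustration; the token estimate is whatever the caller supplies, not a real tokenizer count.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Optional


@dataclass
class OutputReceipt:
    receipt_id: str
    route_id: str
    tool_call_id: str
    tenant: str
    data_class: str
    max_bytes: int
    actual_bytes: int
    estimated_tokens: int
    raw_count: int
    returned_count: int
    mode: str                      # "raw", "extract", "summary", or "preview"
    schema_version: str
    redacted_classes: list = field(default_factory=list)
    artifact_id: Optional[str] = None
    next_cursor: Optional[str] = None
    decision_code: str = "OK"
    allowed_next: list = field(default_factory=list)

    @property
    def omitted_count(self) -> int:
        return self.raw_count - self.returned_count

    def to_log_line(self) -> str:
        """Serialize every field, plus the derived omitted count, as one JSON line."""
        record = asdict(self)
        record["omitted_count"] = self.omitted_count
        return json.dumps(record, sort_keys=True)


receipt = OutputReceipt(
    receipt_id="r-1", route_id="search.results", tool_call_id="c-1",
    tenant="acme", data_class="internal", max_bytes=8_000, actual_bytes=6_200,
    estimated_tokens=1_550, raw_count=250, returned_count=10,
    mode="extract", schema_version="1", decision_code="TRUNCATED",
    allowed_next=["page_with_cursor"],
)
line = receipt.to_log_line()
```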
Copy-paste route card
MCP route:
Caller / tenant:
Data class:
Max bytes / records / tokens:
Allowed fields / schema:
Summary vs raw-data rule:
Artifact handoff rule:
Redaction rule:
Pagination / refill route:
Oversize denial or truncation code:
Receipt fields:
Common misreads
- Optimizing provider-call retries while ignoring that the returned payload is what actually explodes the model bill.
- Calling a tool read-only and therefore safe, even though it can leak private data or swamp context with unbounded output.
- Returning a natural-language summary without saying which fields were dropped, sampled, redacted, or inferred.
- Using truncation as a quiet success path. The agent must know the response is partial before it takes action.
- Storing a full artifact without a checksum, expiration, access rule, or route for retrieving a narrower slice later.
- Letting the agent retry the same broad query after an output-budget denial instead of requiring a smaller query or human approval.
The full checklist is on Rhumb: https://rhumb.dev/blog/mcp-tool-output-budget-checklist