Rhumb

Originally published at rhumb.dev

MCP Tool Output Budget Checklist

A tool call can be correct and still break the agent if it returns too much.

Search results, files, transcripts, logs, browser scrapes, and nested API responses need bounded output contracts so the model receives the smallest safe evidence, not a context flood.

Fast answer

  • Tool output is part of the route budget. A verbose MCP result can burn more model context than the call that produced it, then make the next planning step slower, more expensive, and less recoverable.
  • A production MCP tool needs an output contract before launch: maximum bytes, maximum records, schema shape, summary rule, artifact handoff, redaction policy, and the exact denial or truncation receipt when the response exceeds budget.
  • The useful test is not whether the tool can return a large JSON blob. It is whether the same route can return the minimum safe result, point to a durable artifact when needed, and prove what was omitted.
  • If the trace cannot explain how many bytes or tokens were returned, why the payload was shaped that way, what artifact holds the full result, and how the agent can request the next page safely, the route is not ready for unattended loops.

Production checklist

1. Per-route output ceiling

Set a maximum response size by route, not just by server.

A search result, file summary, database row read, transcript extract, and browser scrape should not share one generic payload limit.
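
A per-route ceiling can be sketched as a small lookup table that fails closed for routes with no declared limit. The route ids, limits, and function names below are hypothetical and chosen only for illustration; they are not part of MCP or any particular server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputCeiling:
    max_bytes: int
    max_records: int


# Hypothetical route ids and limits; tune these per route in a real deployment.
ROUTE_CEILINGS = {
    "search.query": OutputCeiling(max_bytes=8_000, max_records=10),
    "file.summary": OutputCeiling(max_bytes=4_000, max_records=1),
    "db.read_rows": OutputCeiling(max_bytes=16_000, max_records=50),
}


def ceiling_for(route_id: str) -> OutputCeiling:
    """Return the declared ceiling; routes without one are rejected."""
    try:
        return ROUTE_CEILINGS[route_id]
    except KeyError:
        raise LookupError(f"no output ceiling declared for route {route_id!r}")


def within_ceiling(route_id: str, payload: bytes, record_count: int) -> bool:
    """Check a shaped payload against its own route's limits."""
    ceiling = ceiling_for(route_id)
    return len(payload) <= ceiling.max_bytes and record_count <= ceiling.max_records
```

Raising on an unknown route is a deliberate choice in this sketch: a route that never declared a ceiling should not silently inherit a generic one.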

2. Schema before prose

Return typed fields, stable ids, result counts, omitted-count metadata, and next-page cursors before free-form explanation.

Let the model reason over bounded structure instead of raw dumps.
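
One way to picture a schema-first result is a typed envelope where counts and the cursor come before any prose. This is a minimal sketch: the field names, the `shape_search_result` helper, and the `"1"` schema version are assumptions for illustration, not a published format.

```python
from typing import List, Optional, TypedDict


class Hit(TypedDict):
    id: str
    title: str
    score: float


class SearchEnvelope(TypedDict):
    schema_version: str
    total_count: int
    returned_count: int
    omitted_count: int
    next_cursor: Optional[str]
    hits: List[Hit]
    note: str


def shape_search_result(
    matches: List[dict], limit: int, next_cursor: Optional[str] = None
) -> SearchEnvelope:
    """Keep only approved fields for the top `limit` matches by score."""
    ranked = sorted(matches, key=lambda m: m["score"], reverse=True)
    hits: List[Hit] = [
        {"id": m["id"], "title": m["title"], "score": m["score"]}
        for m in ranked[:limit]
    ]
    return {
        "schema_version": "1",  # hypothetical version label
        "total_count": len(matches),
        "returned_count": len(hits),
        "omitted_count": len(matches) - len(hits),
        "next_cursor": next_cursor,
        "hits": hits,
        # Free-form text comes last and only restates the structured fields.
        "note": f"Top {len(hits)} of {len(matches)} matches by score.",
    }
```

Any large field on the raw match (a document body, for example) never reaches the envelope because only `id`, `title`, and `score` are copied.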

3. Artifact handoff

When the full payload is too large, write it to a durable artifact or provider object and return:

  • reference
  • checksum
  • expiration
  • access rule
  • safe follow-up route

Do that instead of flooding context.
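
The handoff above can be sketched with the standard library alone. This example assumes a local temporary directory stands in for the durable store, and the `access_rule` and `follow_up_route` values are hypothetical placeholders; a real deployment would use its own object store and policy names.

```python
import hashlib
import tempfile
import time
from pathlib import Path
from typing import Optional


def hand_off_artifact(
    payload: bytes, ttl_seconds: int = 3600, now: Optional[float] = None
) -> dict:
    """Write the full payload to storage and return a small reference record."""
    checksum = hashlib.sha256(payload).hexdigest()
    # Assumption: a temp directory stands in for a durable artifact store.
    directory = Path(tempfile.mkdtemp(prefix="mcp-artifact-"))
    path = directory / f"{checksum[:16]}.bin"
    path.write_bytes(payload)
    issued_at = time.time() if now is None else now
    return {
        "reference": str(path),
        "checksum": f"sha256:{checksum}",
        "size_bytes": len(payload),
        "expires_at": issued_at + ttl_seconds,
        "access_rule": "caller-only",              # hypothetical policy label
        "follow_up_route": "artifact.read_range",  # hypothetical route id
    }
```

The agent receives a record of a few hundred bytes regardless of payload size, and the checksum lets a later read confirm it is looking at the same artifact.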

4. Summarization boundary

Name whether the tool returned:

  • raw data
  • extracted fields
  • a lossy summary
  • a sampled preview

The receipt should make lossy compression visible before the agent treats it as ground truth.

5. Redaction and data-use policy

Apply redaction before payload shaping.

Record which secret, customer-data, credential, prompt, or topology class was removed. Truncation is not a security control.
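
The ordering matters, so here is a minimal sketch of redact-then-truncate. The two regular expressions are deliberately simple assumptions and would miss many real secrets; the point is only that redaction runs on the full text and the removed classes are recorded even when the match sits past the cutoff.

```python
import re

# Hypothetical patterns, keyed by protected class. Real rules would be stricter.
PROTECTED_PATTERNS = {
    "credential": re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    "customer-data": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def redact(text: str):
    """Replace protected matches and report which classes were removed."""
    removed = []
    for protected_class, pattern in PROTECTED_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{protected_class}]", text)
        if count:
            removed.append(protected_class)
    return text, removed


def shape_payload(text: str, max_chars: int) -> dict:
    """Redact over the full text first, then clip to the budget."""
    redacted, classes = redact(text)
    return {
        "text": redacted[:max_chars],
        "redacted_classes": classes,
        "truncated": len(redacted) > max_chars,
    }
```

Reversing the two steps would clip first and redact second, which hides nothing if the secret falls inside the returned slice and records nothing if it falls outside.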

6. Pagination and refill rule

Expose a cursor, range, query refinement, or approval step for more data.

Do not let the agent repeat the same oversized call hoping the next response is smaller.
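
A cursor-based refill can be sketched as follows. The cursor here is just a base64-encoded offset, which is an assumption made for brevity; a production cursor would usually be opaque, signed, and tied to the original query.

```python
import base64
import json
from typing import List, Optional


def encode_cursor(offset: int) -> str:
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["offset"]


def read_page(records: List, page_size: int, cursor: Optional[str] = None) -> dict:
    """Return one bounded page plus an explicit way to ask for the next one."""
    offset = decode_cursor(cursor) if cursor else 0
    page = records[offset : offset + page_size]
    next_offset = offset + len(page)
    remaining = len(records) - next_offset
    return {
        "records": page,
        "returned_count": len(page),
        "omitted_count": remaining,
        "next_cursor": encode_cursor(next_offset) if remaining > 0 else None,
    }
```

Because each response names how much is left and how to get it, the agent has no reason to re-issue the original oversized call.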

Failure fixtures

Test the context-flood cases before the agent discovers them in production.

Oversized search result

Expected: return top bounded results, total count, omitted count, ranking criteria, and a cursor or refinement hint. Do not stream every match into context.

Large file or transcript

Expected: return section summaries plus artifact reference, byte range, checksum, and follow-up extraction route instead of a full dump.

Nested JSON response

Expected: flatten or select approved fields, include the schema version, and record the omitted nested objects in the receipt before the agent plans from partial data.

Sensitive field in allowed result

Expected: redact before truncation and record the protected class.

A payload clipped after the secret is already returned fails the gate.

Agent asks for “everything” again

Expected: deny or require a narrower query after budget exhaustion.

The planner should not bypass the output budget by rephrasing the same broad request.
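
One way to make this fixture pass is to key the denial on estimated output size rather than on query wording, so a rephrased request hits the same wall. The class name, denial code, and next-action label below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OutputBudget:
    """Tracks output bytes spent on one route within one agent session."""

    limit_bytes: int
    spent_bytes: int = 0

    def request(self, estimated_bytes: int) -> dict:
        remaining = self.limit_bytes - self.spent_bytes
        if estimated_bytes > remaining:
            # Denied calls spend nothing and tell the agent what it may do next.
            return {
                "decision": "deny",
                "code": "OUTPUT_BUDGET_EXHAUSTED",  # hypothetical denial code
                "remaining_bytes": remaining,
                "allowed_next_action": "narrow_query_or_request_approval",
            }
        self.spent_bytes += estimated_bytes
        return {
            "decision": "allow",
            "code": "OK",
            "remaining_bytes": self.limit_bytes - self.spent_bytes,
            "allowed_next_action": "continue",
        }
```

A narrower follow-up that fits the remaining budget is still allowed, which is the behavior the fixture asks for.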

Trace fields

The output receipt should make omitted data auditable.

Once the agent moves on, operators need to know whether it acted on raw data, an extraction, a summary, or a clipped preview. The trace should keep returned payload size, omitted data, redaction, artifact references, and allowed next actions in one place.

Useful trace fields:

  • route id and tool call id
  • caller / tenant / workspace
  • operation class and data class
  • output ceiling in bytes / records / tokens
  • actual bytes and estimated tokens returned
  • raw count, returned count, and omitted count
  • schema version and selected fields
  • redaction rule and protected class
  • summary / extract / raw-data mode
  • artifact id, checksum, and expiration
  • cursor, range, or refill route
  • policy decision and denial / truncation code
  • receipt id and allowed next action
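
A subset of these fields can be sketched as a single immutable record. The field names, the mode labels, the decision codes, and the four-bytes-per-token estimate are all assumptions for illustration; a real trace would use the tokenizer and codes of its own stack.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from typing import Optional, Tuple

# Mirrors the four return modes named in the summarization boundary.
MODES = {"raw", "extract", "summary", "preview"}


@dataclass(frozen=True)
class OutputReceipt:
    receipt_id: str
    route_id: str
    tool_call_id: str
    mode: str
    ceiling_bytes: int
    actual_bytes: int
    estimated_tokens: int
    raw_count: int
    returned_count: int
    omitted_count: int
    redacted_classes: Tuple[str, ...]
    artifact_id: Optional[str]
    next_cursor: Optional[str]
    decision_code: str


def build_receipt(route_id, tool_call_id, payload: bytes, mode, ceiling_bytes,
                  raw_count, returned_count, redacted_classes=(),
                  artifact_id=None, next_cursor=None) -> OutputReceipt:
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}")
    omitted = raw_count - returned_count
    return OutputReceipt(
        receipt_id=str(uuid.uuid4()),
        route_id=route_id,
        tool_call_id=tool_call_id,
        mode=mode,
        ceiling_bytes=ceiling_bytes,
        actual_bytes=len(payload),
        estimated_tokens=len(payload) // 4,  # crude assumption: ~4 bytes per token
        raw_count=raw_count,
        returned_count=returned_count,
        omitted_count=omitted,
        redacted_classes=tuple(redacted_classes),
        artifact_id=artifact_id,
        next_cursor=next_cursor,
        decision_code="TRUNCATED" if omitted > 0 else "OK",  # hypothetical codes
    )
```

Calling `json.dumps(asdict(receipt))` turns the record into one log line, which keeps size, omissions, redaction, and next actions in a single place for operators.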

Copy-paste route card

MCP route:
Caller / tenant:
Data class:
Max bytes / records / tokens:
Allowed fields / schema:
Summary vs raw-data rule:
Artifact handoff rule:
Redaction rule:
Pagination / refill route:
Oversize denial or truncation code:
Receipt fields:

Common misreads

  • Optimizing provider-call retries while ignoring that the returned payload is what actually explodes the model bill.
  • Calling a tool read-only and therefore safe, even though it can leak private data or swamp context with unbounded output.
  • Returning a natural-language summary without saying which fields were dropped, sampled, redacted, or inferred.
  • Using truncation as a quiet success path. The agent must know the response is partial before it takes action.
  • Storing a full artifact without a checksum, expiration, access rule, or route for retrieving a narrower slice later.
  • Letting the agent retry the same broad query after an output-budget denial instead of requiring a smaller query or human approval.

The full checklist is on Rhumb: https://rhumb.dev/blog/mcp-tool-output-budget-checklist
