I built an MCP server that publishes HTML files, and I hit a wall I haven't
seen documented anywhere: you can't pass a large file as an MCP tool
argument. Not "it's slow" or "it's awkward" — the model is physically
incapable of doing it.
Here's the failure, why it happens, and the one-line design change that fixes it.
The setup
My agents (Claude Code, mostly) generate a lot of interactive HTML — dashboards
with Chart.js, data-heavy reports, PRDs. I wanted them to publish those files to
the web with one tool call, so I did the obvious thing first:
server.tool(
"publish_html",
{ html: z.string(), title: z.string() },
async ({ html, title }) => {
const url = await upload(html, title);
return { content: [{ type: "text", text: url }] };
}
);
The agent generates the HTML, passes it as the html argument, the server
uploads it. It demoed beautifully on a 5 KB "hello world" page.
Then I tried it on a real artifact — a 1.4 MB dashboard with inlined data —
and it fell apart.
Why it can never work
When a model calls a tool, the arguments aren't a file handle or a pointer.
They are text the model emits, token by token, inside its response. A tool
call's arguments are part of the model's output, which means they're bounded by
the model's maximum output tokens.
Do the math: 1 MB of HTML is roughly 250k–350k tokens. Typical max output is
far below that. The model literally cannot finish "saying" the argument. In
practice you get one of:
- a truncated tool call that fails to parse,
- the model "helpfully" summarizing or abbreviating your HTML (corrupting it),
- or a refusal-ish stall where it tries to avoid emitting the blob at all.
And even when a file fits, you're paying output-token prices (the expensive
ones) to make the model retype a file that already exists on disk, byte for
byte, with a nonzero chance of it "fixing" something along the way.
This isn't an MCP bug. It's the nature of tool calling: arguments are model
output. Any MCP tool designed to receive bulk content as an argument has a
ceiling it will hit the first time someone uses it for real work.
The fix: pass a reference, not the bytes
The server runs locally over stdio. It has the same filesystem the agent is
working in. So the tool takes a path:
server.tool(
"publish_file",
{
path: z.string().describe("Absolute path to the HTML file to publish"),
title: z.string(),
},
async ({ path, title }) => {
const html = await fs.readFile(path, "utf8"); // server reads from disk
const url = await uploadMultipart(html, title); // server does the upload
return { content: [{ type: "text", text: url }] };
}
);
Now the model's tool call is ~50 tokens regardless of file size:
{ "path": "/Users/me/reports/q2-dashboard.html", "title": "Q2 Dashboard" }
The agent never carries the bytes. It writes the file with its normal file
tools (which stream and don't have this constraint the same way), then hands
the reference to the MCP server, which reads from disk and does a multipart
upload itself. 1.4 MB or 14 MB — the model's job is the same size.
This one design decision is the difference between a demo that works on toy
files and a tool that's useful on real artifacts.
The general rule
If you're building an MCP server, audit every tool argument and ask: could
this be big? If yes, take a reference instead:
| Instead of accepting… | Accept… |
|---|---|
| file contents | a file path |
| a big dataset | a path, URL, or query the server executes |
| an image/binary blob | a path or URL |
| "the whole document" to edit | a path + edit instructions |
The corollary for outputs is the same: a tool that returns a huge payload
fills the model's context window. Return a reference (a URL, a path, a
summary + handle) and let the model fetch slices if it needs them.
Local stdio servers are perfect for this pattern because they share a
filesystem with the agent. For remote MCP servers you'd reach for the same
idea with URLs or pre-signed uploads — anything but the bytes-in-arguments
trap.
Try it / source
The server is stelaspace-mcp on npm (MIT) — it publishes HTML files to
StelaSpace, which gives each artifact a permanent,
sandboxed, access-controlled link (I built it; free tier exists). One-line
setup with Claude Code:
claude mcp add stelaspace --scope user \
--env STELASPACE_API_KEY=ss_sk_... \
-- npx -y stelaspace-mcp
Here's a live dashboard published this way —
1 MB+ of interactive Chart.js HTML that no model could ever have passed as a
tool argument.
If you've hit other walls building MCP servers, I'd genuinely like to hear
them — I'm collecting these gotchas.
Top comments (0)