DEV Community

Benjamin Eckstein

Posted on • Originally published at codewithagents.de

The 22,000 Token Tax: Why I Killed My MCP Server

I was at a company workshop, arguing with beginners about token costs.

They wanted to save money. Reasonable instinct. They were spending maybe €25 a week on API calls and wanted to cut it to €20. I pushed back hard: "You're at the learning stage. Spend more, not less. Explore. Break things. Create costs. Because while you're saving €5, I'm spending €600 a week — and I'll gladly spend €20 more if it means finishing a ticket in one session instead of two."

Then I told them the one scenario where token consumption actually matters: when you need to prolong a session. Not to save money — to preserve context. Because when your session compacts or resets, you lose everything the model was holding in its head. And in the early days of Claude Code, there was no auto-compact. Your session just died with an error when you hit the limit. Auto-compact made this better, but you never know what survives the squeeze. Research confirms what I've felt in practice: context length alone hurts LLM performance, even when the relevant information is right there. The longer your context, the worse the output — a phenomenon sometimes called context rot. So every unnecessary token you load at startup is a tax on the quality of everything that follows.

I came home that evening and opened a new session. Ran /context. Stared at the breakdown.

22,000 tokens in MCP tools alone. Before I typed a single prompt.

The Receipt

I had three MCP servers running: mcp-atlassian for Jira and Confluence, chrome-devtools for browser automation, and context7 for documentation lookups. Together they cost 22K tokens. But the Atlassian server was the one I could kill — it was registering 33 tools for a service where I used six.

I'd gone through the settings and disabled as many as I could — but the server kept loading all of them. Confluence tools I never used. Batch operations. Sprint management. Worklog tracking. None of it mattered.

All 33 tools. About 10,000 tokens. Every single session.

I compared the numbers. One skill — 40 tokens of metadata. One MCP tool — 300 tokens of schema. The Atlassian MCP was loading tools I had explicitly told it not to load.

The Setting That Doesn't

Here's what disabledTools actually does in Claude Code: it prevents the AI from calling a tool. That's it.

It does not prevent the MCP server from starting. It does not prevent the server from registering its tools. It does not prevent those tool schemas from being injected into the context window. The Docker container still spins up. The tool definitions still flow in. The tokens still burn. disabledTools is a runtime filter, not a context optimization. I was disappointed — if the setting exists in the configuration, you'd expect the platform to be smart enough to not load what you've explicitly disabled. But that's not how it works.

The only way to actually save the tokens is to remove the MCP server entirely.

The Replacement: 7 Scripts

I looked at what I actually use. Six Jira operations. Zero Confluence operations. Out of 33 registered tools, I needed six.

So I wrote shell scripts. The same pattern I already use for Jenkins and Slack — credentials in a JSON file under ~/.config/, curl calls with Bearer token auth, jq for parsing responses.

The first script took five minutes. Authentication worked on the first try — just Authorization: Bearer <token> with the same personal access token the MCP had been using. No Docker container. No protocol negotiation. No tool registration. Just curl.

  TOKEN=$(jq -r '.personal_token' ~/.config/jira/credentials.json)
  BASE_URL=$(jq -r '.base_url' ~/.config/jira/credentials.json)

  curl -s -k -H "Authorization: Bearer $TOKEN" \
    "$BASE_URL/rest/api/2/issue/PROJ-123"

The credentials file should be chmod 600 (owner-only read/write).

The -k flag skips SSL certificate verification because our internal Jira uses a self-signed cert — don't copy that for public endpoints. And yes, the token ends up in the process list briefly via shell variable expansion. For a local developer workstation running personal scripts, that's an acceptable trade-off. For a shared server or CI pipeline, you'd want to pipe credentials through stdin instead.
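For the shared-server case, one way to keep the token out of argv (a sketch, not one of the original scripts, assuming the same `~/.config/jira/credentials.json` layout) is to hand curl its `Authorization` header as a config file on stdin via `-K -`:

```shell
# Sketch: build a curl config line from the credentials file and pipe it
# to curl on stdin, so the token never appears in the process list.
CREDS="$HOME/.config/jira/credentials.json"
jq -r '"header = \"Authorization: Bearer \(.personal_token)\""' "$CREDS" \
  | curl -s -k -K - "$(jq -r '.base_url' "$CREDS")/rest/api/2/myself"
```

The base URL still shows up in argv, but it isn't a secret; only the bearer token needed hiding.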

Cairn built all six scripts in under an hour. I fed the Jira REST API documentation into the session for context, described the pattern I wanted, and Cairn wrote the scripts, tested them against our live Jira, and verified each one worked. I gave it a real ticket number to go wild on — fetch, update, transition, comment, the full lifecycle. Then we fine-tuned the scripts to bake in our project defaults: the right component, the right team label, the custom fields our board requires.

The six operations: get issue, search with JQL, update fields, add comment, get transitions, transition status. Each script reads credentials, makes a curl call, formats the output. No abstraction layer. No protocol. No 300-token tool schema.

Then I added a seventh: create issue.

The Thing MCP Could Never Do

Creating Jira tickets through MCP never worked reliably. I'd hit the MCP permission wall before — specialized agents couldn't even access the tools. But even when access worked, the actual creation flow — with custom fields, project-specific components, team assignments — always hit edge cases that the MCP abstraction couldn't handle cleanly.

The curl script created a ticket on the first try.

  curl -s -k -X POST \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"fields": {
      "project": {"key": "PROJ"},
      "issuetype": {"name": "Task"},
      "summary": "Test ticket",
      "components": [{"name": "Frontend"}],
      "customfield_12345": [{"value": "Team-A"}]
    }}' \
    "$BASE_URL/rest/api/2/issue"

HTTP 201. The ticket existed. With the right component, the right team, the right assignee. First try.

The MCP had been sitting between me and a REST API that was perfectly willing to cooperate. It was abstracting away complexity that didn't exist.

The Abstraction Tax

MCP is a good idea for getting started. You install a server, you get tools, you're productive in minutes. For someone spending €25 a week who's still learning, that's the right trade-off. The setup cost is zero and the token cost doesn't matter because you're not pushing session limits.

When you're 5,428 prompts deep into a persistent agent system, running multi-agent workflows that eat 100K+ tokens per ticket, every unnecessary token at startup compresses the useful work you can do before quality starts degrading. I've learned this lesson before — 23K tokens burned loading a bloated memory file. Now it was 10K tokens burned loading Jira tools I'd explicitly disabled. Same tax, different landlord.

And here's the part that bothered me most: I couldn't partially load the MCP server. It's all or nothing. Want 6 tools? You get 33. Want to disable the other 27? You can — but you still pay for all 33 in your context. The protocol has no mechanism for selective tool registration based on client preferences.

So I replaced it:

  1. 33 MCP tools → 7 shell scripts
  2. ~10,000 tokens per session → 0 tokens at startup
  3. A Docker container on every launch → no container
  4. Issue creation broken → issue creation works
  5. Tool schemas you can't customize → scripts you own completely

The seven scripts total about 700 lines of bash. They live in my skill directory, version-controlled, testable. I can read them. I can debug them. I can add project-specific defaults — like auto-applying the default component and team for every ticket in our project. Try doing that in an MCP tool schema.

And I know exactly what they do. That MCP server was a Docker image pulled from a third-party registry, running with my Jira credentials baked into environment variables. I never audited that image. I never read its source. Every docker pull could have shipped a different binary. When your integration is 700 lines of bash that you wrote and can read end to end, supply chain risk isn't a concern — it's just curl.

When to Graduate

MCP stops making sense the moment you're paying for tools you don't use and can't shed. When you need 6 tools but get 33. When 10K tokens burn before your first prompt. When you need capabilities the server doesn't expose. When you need project-specific behavior that the protocol can't express. That's when you graduate.

The graduation path is simple: credentials file, curl, jq. The same tools that powered the internet before every API got wrapped in an abstraction layer. They still work. They're still faster. And you own them completely.

They don't cost you a single token to say hello.

What I Actually Learned

This isn't new. It's what every software engineer has done since the beginning: make it work first, then optimize. The MCP got me running. It was the right choice when I was figuring out how to wire an AI agent to Jira at all. But once it worked, the job was to look at the bill and cut the waste. That's not AI-specific wisdom — that's just engineering.

Integrations have carrying costs. An MCP server isn't free just because it's open-source. A tool registry isn't free just because the tools are disabled. Every abstraction layer between your code and the API it talks to has a price — in tokens, in debuggability, in flexibility, in the things you can't do because the abstraction didn't anticipate your use case.

Sometimes the best integration is the one with no integration layer at all.

