The Setup
I've been building AI agents that use tools — reading files, running commands, calling APIs. There are two main ways to give agents these tools:
- MCP (Model Context Protocol) — the new standard everyone's adopting
- Direct CLI calls — good old command-line execution
Everyone says MCP is the future. But nobody talks about the token cost. So I measured it.
The Test
I built a simple file-reading tool and measured the exact token consumption for each approach:
| Method | Tokens per Call | Latency (avg) |
|---|---|---|
| MCP (structured) | ~3,400 tokens | 280ms |
| CLI + raw output | ~200 tokens | 45ms |
| Ratio | 17x | 6x |
Why MCP Uses So Many Tokens
The overhead comes from three places:
1. Tool Schema in Every Request
MCP sends the full JSON Schema of every available tool with each request to the LLM. My simple file-reader schema alone is ~800 tokens. With 10+ tools, that's 8,000+ tokens of schema on every single call.
{
"name": "read_file",
"description": "Read contents of a file at given path",
"parameters": {
"type": "object",
"properties": {
"path": { "type": "string", "description": "File path to read" }
},
"required": ["path"]
}
}
2. Structured Response Wrapping
MCP wraps every response in a structured envelope with metadata, status codes, and typed content blocks. A simple "file not found" error becomes a 200-token JSON object.
3. Round-Trip Protocol Overhead
Each MCP call involves: request → server parse → execute → format response → return → client parse → extract. Each step adds tokens for protocol framing.
The CLI Alternative
With direct CLI execution:
$ cat /path/to/file.txt
[raw file content]
That's it. Raw input, raw output. No schemas, no envelopes, no metadata.
When MCP Is Worth It
Despite the token cost, MCP shines when:
- You need standardized discovery — agents dynamically finding available tools
- You're building reusable tool servers — one MCP server serves many agents
- Security sandboxing matters — MCP's permission model is more granular
- Team collaboration — shared tool definitions across projects
The Hybrid Approach (What I Use Now)
Here's my practical setup:
- Simple, frequent operations → CLI (file reads, basic shell commands)
- Complex, structured operations → MCP (database queries, API calls with schemas)
- Cache aggressively — regardless of method, never call twice when once suffices
This hybrid cut my token usage by 60% while keeping MCP's benefits where they matter.
The Numbers Over a Day of Agent Work
| Metric | MCP-only | Hybrid |
|---|---|---|
| Total tool calls | 847 | 847 |
| Token cost (tools) | 2.88M | 1.15M |
| Cost (@ $3/1M tokens) | $8.64 | $3.45 |
| Savings | — | $5.19/day (60%) |
Takeaways
- Measure your own costs — token usage varies wildly by tool complexity
- Not all tools need MCP — simple operations are cheaper as direct calls
- Schema size matters — minimize your MCP tool parameter definitions
- Hybrid is pragmatic — use MCP where it adds value, CLI where it doesn't
- The 17x ratio isn't fixed — simpler tools = smaller gap, complex tools = larger gap
Have you measured your agent's token efficiency? What did you find? Let me know in the comments.
Top comments (0)