tim zhang

Posted on Jun 3

I Measured MCP vs CLI for Agent Tool Use — MCP Used 17x More Tokens Per Call

#ai #mcp #llm

The Setup

I've been building AI agents that use tools — reading files, running commands, calling APIs. There are two main ways to give agents these tools:

MCP (Model Context Protocol) — the new standard everyone's adopting
Direct CLI calls — good old command-line execution

Everyone says MCP is the future. But nobody talks about the token cost. So I measured it.

The Test

I built a simple file-reading tool and measured the token consumption and average latency of each approach:

Method	Tokens per Call	Latency (avg)
MCP (structured)	~3,400	280 ms
CLI + raw output	~200	45 ms
Ratio (MCP / CLI)	17x	~6x
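One way to collect numbers like these is sketched below. It is not the harness behind the table: `measure_call` and `estimate_tokens` are hypothetical names, and the four-characters-per-token rule is only a rough assumption. For real figures, swap in the tokenizer of the model you actually use.

```python
import time
from typing import Callable, Tuple


def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4) if text else 0


def measure_call(tool: Callable[[], str], prompt_overhead: str = "") -> Tuple[int, float]:
    """Run one tool call and return (estimated tokens, latency in ms).

    prompt_overhead is any extra text (schemas, envelopes) that reaches
    the model together with the tool output.
    """
    start = time.perf_counter()
    output = tool()
    latency_ms = (time.perf_counter() - start) * 1000
    return estimate_tokens(prompt_overhead + output), latency_ms
```

Calling `measure_call` once per approach with the same input, and averaging over many runs, gives per-call token and latency figures that can be compared directly.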

Why MCP Uses So Many Tokens

The overhead comes from three places:

1. Tool Schema in Every Request

In a typical MCP setup, the client includes the full JSON Schema of every available tool in each request to the LLM. My simple file-reader schema alone is ~800 tokens. With 10+ tools of similar size, that's 8,000+ tokens of schema on every single call.

{
  "name": "read_file",
  "description": "Read contents of a file at given path",
  "inputSchema": {
    "type": "object",
    "properties": {
      "path": { "type": "string", "description": "File path to read" }
    },
    "required": ["path"]
  }
}
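The scaling is plain multiplication. Here is a small sketch, assuming every tool definition is about the same size as the ~800-token one above and that the client resends all of them with each request. The function name `schema_overhead` is made up for illustration.

```python
def schema_overhead(tokens_per_tool: int, n_tools: int, n_requests: int = 1) -> int:
    """Tokens spent on tool schemas when every request carries every schema."""
    return tokens_per_tool * n_tools * n_requests


# Figure from the post: ~800 tokens per tool definition.
print(schema_overhead(800, 1))        # 800: one tool, one request
print(schema_overhead(800, 10))       # 8000: ten similar tools, one request
print(schema_overhead(800, 10, 100))  # 800000: ten tools across 100 requests
```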

2. Structured Response Wrapping

MCP wraps every response in a JSON-RPC envelope with a request id, typed content blocks, and an error flag. A simple "file not found" error becomes a 200-token JSON object.
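For a sense of the wrapping, here is a minimal sketch of a tool error as a JSON-RPC 2.0 response carrying an MCP-style result (`content` blocks plus `isError`). The `id` value and the error text are illustrative, real servers may add more fields, and how much of the envelope the model actually sees depends on the client. The comparison is in characters, not tokens.

```python
import json

raw_error = "cat: /path/to/file.txt: No such file or directory"

# Shape of a JSON-RPC 2.0 response carrying an MCP tool result.
# The id value and the error text are illustrative.
mcp_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": raw_error}],
        "isError": True,
    },
}

wrapped = json.dumps(mcp_response)
print(len(raw_error), "characters raw")
print(len(wrapped), "characters wrapped")
```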

3. Round-Trip Protocol Overhead

Each MCP call involves: request → server parse → execute → format response → return → client parse → extract. Every step adds latency, and any protocol framing that ends up in the model's context adds tokens.

The CLI Alternative

With direct CLI execution:

$ cat /path/to/file.txt
[raw file content]

That's it. Raw input, raw output. No schemas, no envelopes, no metadata.
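From agent code, the same call can be made with Python's standard `subprocess` module. This is a minimal sketch assuming a POSIX system where `cat` is available. `read_file_cli` is a hypothetical helper name.

```python
import subprocess


def read_file_cli(path: str) -> str:
    """Read a file by running `cat` and return its raw stdout."""
    result = subprocess.run(
        ["cat", path],        # argument list, so no shell quoting issues
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Only `result.stdout` goes back to the model, so the token cost is the file content itself plus whatever short instruction tells the agent how to call the command.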

When MCP Is Worth It

Despite the token cost, MCP shines when:

You need standardized discovery — agents dynamically finding available tools
You're building reusable tool servers — one MCP server serves many agents
Security sandboxing matters — MCP's permission model is more granular
Team collaboration — shared tool definitions across projects

The Hybrid Approach (What I Use Now)

Here's my practical setup:

Simple, frequent operations → CLI (file reads, basic shell commands)
Complex, structured operations → MCP (database queries, API calls with schemas)
Cache aggressively — regardless of method, never call twice when once suffices

This hybrid cut my token usage by 60% while keeping MCP's benefits where they matter.
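The routing and caching rules can be sketched as below. Everything here is hypothetical: the `ROUTES` table, the tool names, and `fake_backend`, which stands in for real CLI or MCP execution so the example runs on its own.

```python
from functools import lru_cache
from typing import Callable, Dict, List, Tuple

# Hypothetical routing table: which transport each tool uses.
ROUTES: Dict[str, str] = {
    "read_file": "cli",
    "run_shell": "cli",
    "query_db": "mcp",
    "call_api": "mcp",
}


def route(tool_name: str) -> str:
    """Return 'cli' or 'mcp'; unknown tools default to MCP."""
    return ROUTES.get(tool_name, "mcp")


def make_cached(backend: Callable[[str, str], str]) -> Callable[[str, str], str]:
    """Wrap a backend so identical (tool, argument) calls run only once."""
    @lru_cache(maxsize=256)
    def cached(tool_name: str, arg: str) -> str:
        return backend(tool_name, arg)
    return cached


calls: List[Tuple[str, str]] = []


def fake_backend(tool_name: str, arg: str) -> str:
    # Stand-in for real CLI/MCP execution; records each real invocation.
    calls.append((tool_name, arg))
    return f"{route(tool_name)}:{tool_name}({arg})"


dispatch = make_cached(fake_backend)
dispatch("read_file", "notes.txt")
dispatch("read_file", "notes.txt")  # served from cache, backend not called again
print(len(calls))  # 1
```

A cache like this returns stale results if the file or data changes between calls, so it suits read-mostly operations or needs an explicit invalidation step (`dispatch.cache_clear()` resets it).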

The Numbers Over a Day of Agent Work

Metric	MCP-only	Hybrid
Total tool calls	847	847
Token cost (tools)	2.88M	1.15M
Cost (@ $3/1M tokens)	$8.64	$3.45
Savings	—	$5.19/day (60%)
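The table can be checked with a few lines of arithmetic. The last step is an inference, not a number from the post. If the per-call averages from the first table hold all day, the hybrid total implies how many of the 847 calls went through CLI.

```python
CALLS = 847
MCP_TOKENS = 3_400      # per call, from the first table
CLI_TOKENS = 200        # per call, from the first table
PRICE_PER_M = 3.00      # dollars per 1M tokens

mcp_only = CALLS * MCP_TOKENS
print(mcp_only)                                   # 2879800, i.e. ~2.88M
print(round(mcp_only / 1e6 * PRICE_PER_M, 2))     # 8.64

hybrid = 1_150_000                                # reported hybrid total
print(round(hybrid / 1e6 * PRICE_PER_M, 2))       # 3.45
print(round(1 - hybrid / mcp_only, 3))            # 0.601

# Implied number of calls moved to CLI, if per-call costs stay constant.
cli_calls = (mcp_only - hybrid) / (MCP_TOKENS - CLI_TOKENS)
print(round(cli_calls))                           # 541
```

Under that assumption, roughly 541 of the 847 calls (about 64%) would have been routed through CLI.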

Takeaways

Measure your own costs — token usage varies wildly by tool complexity
Not all tools need MCP — simple operations are cheaper as direct calls
Schema size matters — minimize your MCP tool parameter definitions
Hybrid is pragmatic — use MCP where it adds value, CLI where it doesn't
The 17x ratio isn't fixed — simpler tools = smaller gap, complex tools = larger gap

Have you measured your agent's token efficiency? What did you find? Let me know in the comments.

#ai #llm #agents #mcp #productivity

Top comments (1)

Andrii Krugliak • Jun 3

The 17x tracks with what I saw. MCP's structured envelope is great for discovery and brutal on hot-path calls. I ended up sending the high-frequency tools through raw CLI and keeping MCP for the ones where the schema actually earns its token cost.