DEV Community

tim zhang
tim zhang

Posted on

I Measured MCP vs CLI for Agent Tool Use — MCP Used 17x More Tokens Per Call

The Setup

I've been building AI agents that use tools — reading files, running commands, calling APIs. There are two main ways to give agents these tools:

  1. MCP (Model Context Protocol) — the new standard everyone's adopting
  2. Direct CLI calls — good old command-line execution

Everyone says MCP is the future. But nobody talks about the token cost. So I measured it.

The Test

I built a simple file-reading tool and measured the exact token consumption for each approach:

Method Tokens per Call Latency (avg)
MCP (structured) ~3,400 tokens 280ms
CLI + raw output ~200 tokens 45ms
Ratio 17x 6x

Why MCP Uses So Many Tokens

The overhead comes from three places:

1. Tool Schema in Every Request

MCP sends the full JSON Schema of every available tool with each request to the LLM. My simple file-reader schema alone is ~800 tokens. With 10+ tools, that's 8,000+ tokens of schema on every single call.

{
  "name": "read_file",
  "description": "Read contents of a file at given path",
  "parameters": {
    "type": "object",
    "properties": {
      "path": { "type": "string", "description": "File path to read" }
    },
    "required": ["path"]
  }
}
Enter fullscreen mode Exit fullscreen mode

2. Structured Response Wrapping

MCP wraps every response in a structured envelope with metadata, status codes, and typed content blocks. A simple "file not found" error becomes a 200-token JSON object.

3. Round-Trip Protocol Overhead

Each MCP call involves: request → server parse → execute → format response → return → client parse → extract. Each step adds tokens for protocol framing.

The CLI Alternative

With direct CLI execution:

$ cat /path/to/file.txt
[raw file content]
Enter fullscreen mode Exit fullscreen mode

That's it. Raw input, raw output. No schemas, no envelopes, no metadata.

When MCP Is Worth It

Despite the token cost, MCP shines when:

  • You need standardized discovery — agents dynamically finding available tools
  • You're building reusable tool servers — one MCP server serves many agents
  • Security sandboxing matters — MCP's permission model is more granular
  • Team collaboration — shared tool definitions across projects

The Hybrid Approach (What I Use Now)

Here's my practical setup:

  1. Simple, frequent operations → CLI (file reads, basic shell commands)
  2. Complex, structured operations → MCP (database queries, API calls with schemas)
  3. Cache aggressively — regardless of method, never call twice when once suffices

This hybrid cut my token usage by 60% while keeping MCP's benefits where they matter.

The Numbers Over a Day of Agent Work

Metric MCP-only Hybrid
Total tool calls 847 847
Token cost (tools) 2.88M 1.15M
Cost (@ $3/1M tokens) $8.64 $3.45
Savings $5.19/day (60%)

Takeaways

  1. Measure your own costs — token usage varies wildly by tool complexity
  2. Not all tools need MCP — simple operations are cheaper as direct calls
  3. Schema size matters — minimize your MCP tool parameter definitions
  4. Hybrid is pragmatic — use MCP where it adds value, CLI where it doesn't
  5. The 17x ratio isn't fixed — simpler tools = smaller gap, complex tools = larger gap

Have you measured your agent's token efficiency? What did you find? Let me know in the comments.

ai #llm #agents #mcp #productivity

Top comments (0)