When building AI-assisted development workflows, the documentation explains what each approach does—but not the real cost implications or when to use which.
I instrumented network traffic and ran controlled experiments across five approaches using identical tasks: same 500-row dataset, same analysis requirements, same model (Claude Sonnet). The results revealed that architecture matters more than protocol choice.
MCP Optimized consumed 60,420 tokens. MCP Vanilla consumed 309,053 tokens. Same protocol. Same task. 5x difference—driven entirely by one decision: file-path references vs. data-array parameters.
This article provides a decision framework based on measured data, not marketing claims.
## The Decision Framework
Before diving into the data, here's the framework I developed from these experiments:
### Quick Decision Guide
| If your situation is... | Use this approach |
|---|---|
| Repeating task (>20 executions), large datasets, need predictable costs | MCP Optimized |
| One-off exploration, evolving requirements, prototyping | Code-Driven (Skills) |
| User must control when it runs, deterministic behavior needed | Slash Commands |
| Production system with security requirements | MCP Optimized (never Skills) |
### Decision Flowchart
```text
Q1: One-off task (< 5 executions)?
    YES → Code-Driven or direct prompting
    NO  → Continue

Q2: Dataset > 100 rows AND need < 5% cost variance?
    YES → MCP Optimized
    NO  → Continue

Q3: User needs explicit control over invocation?
    YES → Slash Commands
    NO  → Continue

Q4: Execution count > 20 AND requirements stable?
    YES → MCP Optimized
    NO  → Code-Driven (prototype, then migrate)

NEVER:
- MCP Vanilla for production (always suboptimal)
- Skills for multi-user or sensitive systems
```
## The Three Approaches Explained
### MCP (Model Context Protocol)
A structured protocol for AI-tool communication. The model calls tools with JSON parameters, the server executes and returns structured results.
```javascript
// MCP tool call - structured, typed, validated
await call_tool('analyze_csv_file', {
  file_path: '/data/employees.csv',
  analysis_type: 'salary_by_department'
});
```
Characteristics: Structured I/O, access-controlled, model-decided invocation, reusable across applications.
Critical distinction: There's a 5x token difference between vanilla MCP (passing data directly) and optimized MCP (passing file references). Same protocol, vastly different economics.
### Code-Driven (Skills & Code Generation)
The model writes and executes code to accomplish tasks. Claude Code's "skills" feature lets the model invoke capabilities based on semantic matching.
```python
# Claude writes this, executes it, iterates
import pandas as pd

df = pd.read_csv('/data/employees.csv')
result = df.groupby('department')['salary'].mean()
print(result)
```
Characteristics: Maximum flexibility, unstructured I/O, higher variance between runs, requires sandboxing.
### Slash Commands
Pure string substitution. You type `/review @file.js`, the command template expands, and the result is injected into your message.
```markdown
<!-- .claude/commands/review.md -->
Review the following file for security vulnerabilities,
performance issues, and code quality:

{file_content}

Focus on: authentication, input validation, error handling.
```
Characteristics: User-explicit, deterministic, single-turn, zero tool-call overhead.
## Measured Data: What the Numbers Show
### Methodology
- Same workload: load 500-row CSV, perform grouping, summary stats, two plots
- Same model: Claude Sonnet, default settings
- 3-4 runs per approach with logged request/response payloads
- Costs calculated at Claude Sonnet pricing as of the time of writing
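For transparency, here's a rough sketch of the cost model behind the per-execution figures below. The $3/$15 per-million-token Sonnet rates and the input-heavy token split are assumptions for illustration, not measured values; check current pricing.

```typescript
// Rough cost model. Pricing and the input/output split are assumptions
// for illustration, not measured values; check current Anthropic rates.
const INPUT_USD_PER_MTOK = 3.0;
const OUTPUT_USD_PER_MTOK = 15.0;

function estimateRunCost(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MTOK +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MTOK
  );
}

// A 60,420-token MCP Optimized run, assuming ~95% of tokens are input:
console.log(estimateRunCost(57_400, 3_020).toFixed(2)); // ≈ 0.22
```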
### Token Consumption
*Figure: token consumption per API request. MCP Optimized achieves consistently low usage through its file-path architecture.*
| Approach | Avg tokens/run | vs Baseline | Why |
|---|---|---|---|
| MCP Optimized | 60,420 | -55% | File-path parameters; zero data duplication |
| MCP Proxy (warm) | 81,415 | -39% | Shared context + warm cache |
| Code-Skill (baseline) | 133,006 | — | Model-written Python; nothing cached |
| UTCP Code-Mode | 204,011 | +53% | Extra prompt framing |
| MCP Vanilla | 309,053 | +133% | JSON-serialized data in every call |
### Cost at Scale
At 1,000 monthly executions:
| Approach | Per Execution | Monthly | Annual |
|---|---|---|---|
| MCP Optimized | $0.21 | $210 | $2,520 |
| Code-Skill | $0.44 | $440 | $5,280 |
| MCP Vanilla | $0.99 | $990 | $11,880 |
That's a $9,360 annual difference between optimized and vanilla MCP for a single workflow.
### Scalability
*Figure: cumulative token consumption. MCP Optimized maintains low growth; vanilla approaches accumulate steeply.*
| Approach | Scaling factor | Projected tokens at 10K rows |
|---|---|---|
| MCP Optimized | 1.5x | ~65K tokens |
| Code-Skill | 1.1-1.6x | ~150-220K tokens |
| MCP Vanilla | 2.0-2.9x | ~500-800K tokens |
MCP Optimized exhibits sub-linear scaling because file paths cost the same tokens regardless of file size. MCP Vanilla exhibits super-linear scaling because larger datasets require proportionally more tokens for JSON serialization.
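A toy model makes the divergence concrete. The per-row serialization cost below is an assumed constant, not a measured one, but the shape of the curves follows directly:

```typescript
// Toy model of per-call token cost. The constants are illustrative
// assumptions; only the growth shape matters.
const TOKENS_PER_SERIALIZED_ROW = 60; // JSON overhead per row (assumed)
const TOKENS_PER_PATH_CALL = 50;      // a file path costs the same at any size

const vanillaCallTokens = (rows: number) => rows * TOKENS_PER_SERIALIZED_ROW;
const optimizedCallTokens = (_rows: number) => TOKENS_PER_PATH_CALL;

console.log(vanillaCallTokens(500));      // 30000  - grows with the data
console.log(vanillaCallTokens(10_000));   // 600000 - 20x data, 20x tokens
console.log(optimizedCallTokens(10_000)); // 50     - flat regardless of size
```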
### Variance
| Approach | Coefficient of Variation | Consistency |
|---|---|---|
| MCP Optimized | 0.6% | Excellent |
| MCP Proxy (warm) | 0.5% | Excellent |
| Code-Skill | 18.7% | Poor |
| MCP Vanilla | 21.2% | Poor |
MCP Optimized hit 60,307, 60,144, and 60,808 tokens across three runs. Code-Skill ranged from 108K to 158K. High variance breaks capacity planning and makes cost prediction unreliable.
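Coefficient of variation is just the sample standard deviation divided by the mean. A quick check against the three MCP Optimized runs reproduces the 0.6% figure:

```typescript
// CV = sample standard deviation / mean.
function coefficientOfVariation(samples: number[]): number {
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  const variance =
    samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (samples.length - 1);
  return Math.sqrt(variance) / mean;
}

const mcpOptimizedRuns = [60_307, 60_144, 60_808];
console.log(`${(coefficientOfVariation(mcpOptimizedRuns) * 100).toFixed(1)}%`); // 0.6%
```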
### Latency
Skills and sub-agents use tool-calling, which means two LLM invocations instead of one:

```text
User message → Model decides → Tool call → Tool result → Final response
```

Slash commands avoid this: they're plain prompt substitution with a direct, single-turn response.
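To make the two round trips visible, here's a minimal sketch of the tool-calling loop with the Anthropic TypeScript SDK. The model id is illustrative, tool definitions are elided, and `runTool` is a hypothetical executor, not part of the SDK:

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Hypothetical executor for the tool call (not part of the SDK).
declare function runTool(block: Anthropic.Messages.ToolUseBlock): Promise<string>;

const client = new Anthropic();
const base = {
  model: "claude-3-5-sonnet-latest", // illustrative model id
  max_tokens: 1024,
  tools: [], // tool definitions elided
};
const userMsg = { role: "user" as const, content: "Analyze /data/employees.csv" };

// Invocation 1: the model reads the request and decides to call a tool.
const first = await client.messages.create({ ...base, messages: [userMsg] });

if (first.stop_reason === "tool_use") {
  const toolUse = first.content.find(
    (b): b is Anthropic.Messages.ToolUseBlock => b.type === "tool_use",
  )!;

  // Invocation 2: the tool result goes back so the model can answer.
  await client.messages.create({
    ...base,
    messages: [
      userMsg,
      { role: "assistant", content: first.content },
      {
        role: "user",
        content: [
          { type: "tool_result", tool_use_id: toolUse.id, content: await runTool(toolUse) },
        ],
      },
    ],
  });
}
```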
## Key Lessons
### 1. Architecture Trumps Protocol
MCP Optimized and MCP Vanilla use the same protocol, yet differ by 5x in token consumption. The difference is entirely architectural: file paths versus data arrays. Focus on data-flow design, not protocol debates.
### 2. The File-Path Pattern
The single biggest efficiency gain: eliminate data duplication.
```javascript
// Anti-pattern: 10,000 tokens just for data
await call_tool('analyze_data', {
  data: [/* 500 rows serialized */]
});

// Pattern: 50 tokens for the same operation
await call_tool('analyze_csv_file', {
  file_path: '/data/employees.csv'
});
```
The MCP server handles file I/O internally. Data never enters the context window.
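For illustration, here's a minimal sketch of the server side of this pattern using the TypeScript MCP SDK. The tool name mirrors the example above; the analysis logic is deliberately simplified:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { readFileSync } from "node:fs";

const server = new McpServer({ name: "csv-analyzer", version: "1.0.0" });

server.tool(
  "analyze_csv_file",
  { file_path: z.string(), analysis_type: z.string() },
  async ({ file_path }) => {
    // The file is read here, inside the server process. Its contents
    // never enter the model's context window; only the summary does.
    const rows = readFileSync(file_path, "utf8").trim().split("\n").slice(1);
    const summary = `Analyzed ${rows.length} rows from ${file_path}`; // simplified
    return { content: [{ type: "text" as const, text: summary }] };
  },
);

await server.connect(new StdioServerTransport());
```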
### 3. Prototype with Skills, Ship with MCP
Skills execute arbitrary code—bash commands, file system access, network calls. They're excellent for figuring out what tools you need. They're inappropriate for production systems where security matters.
### 4. Slash Commands Are Underrated
When you need deterministic, user-controlled workflows, slash commands win. No tool-call overhead, no model surprises, no latency penalty. Use them for repeatable tasks like code review checklists or deployment procedures.
### 5. Sub-Agent Context Isolation
Sub-agents can't see your main conversation history. If they need context, you must pass it explicitly in the delegation prompt. This is by design, enabling clean delegation, but it puts the burden of information passing on you.
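A hypothetical delegation prompt shows what "explicit" means in practice; the sub-agent starts from a blank context, so every fact it needs is restated:

```typescript
// Hypothetical delegation prompt. Nothing from the main conversation
// carries over, so all required context is restated inline.
const delegationPrompt = `
You are summarizing results from a completed salary analysis.

Context (you cannot see the main conversation):
- Dataset: /data/employees.csv, 500 rows
- Analysis performed: mean salary grouped by department
- Audience: engineering leadership

Task: write a three-bullet executive summary of the findings.
`;
```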
### 6. CLAUDE.md Costs Compound
CLAUDE.md content is injected into every message, including sub-agent conversations. Keep it concise, and use file references to pull in additional docs only when needed:
```markdown
<!-- CLAUDE.md -->
# Project Standards
See @docs/CODING_STANDARDS.md for detailed guidelines.

Key rules:
- Use TypeScript strict mode
- No `any` types
```
### 7. Measure Before Optimizing
Instrument your network traffic. The Anthropic API returns token usage in every response, so log it. You might be surprised where tokens are actually going.
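A minimal logging sketch with the Anthropic TypeScript SDK; the `usage` field is part of every Messages API response (the model id is illustrative):

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-3-5-sonnet-latest", // illustrative model id
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize /data/employees.csv" }],
});

// Every response reports its own token usage - log it per workflow step.
console.log({
  input_tokens: response.usage.input_tokens,
  output_tokens: response.usage.output_tokens,
});
```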
## Implementation Patterns
### Parallel Tool Execution
File-path architecture enables parallel calls:
```javascript
// Four visualizations, one API call, ~400 tokens total
await Promise.all([
  call_tool('create_viz', { file: '/data/emp.csv', type: 'bar', x: 'dept', y: 'salary' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'scatter', x: 'exp', y: 'salary' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'pie', col: 'department' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'bar', x: 'location', y: 'salary' }),
]);
```
### Progressive Tool Discovery
For large tool catalogs (20+ tools), use meta-tools for on-demand discovery instead of loading all tools upfront:
```javascript
// Initial context: 2 tools, ~400 tokens
const meta_tools = [
  { name: 'describe_tools', description: 'Discover available tools' },
  { name: 'use_tool', description: 'Execute a specific tool' }
];
// Instead of: 50 tools, ~50,000 tokens upfront
```
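A sketch of how those two meta-tools might dispatch on the server side; the catalog, matching logic, and helper names are illustrative assumptions:

```typescript
// Hypothetical dispatch for the two meta-tools above. The full catalog
// stays on the server; descriptions enter the context only on demand.
type CatalogTool = {
  name: string;
  description: string;
  run: (args: unknown) => Promise<string>;
};

const catalog = new Map<string, CatalogTool>(/* ~50 tools registered here */);

// describe_tools: return only the descriptions matching the query.
async function describeTools(query: string): Promise<string> {
  return [...catalog.values()]
    .filter((t) => t.description.toLowerCase().includes(query.toLowerCase()))
    .map((t) => `${t.name}: ${t.description}`)
    .join("\n");
}

// use_tool: look up and execute a single tool by name.
async function useTool(name: string, args: unknown): Promise<string> {
  const tool = catalog.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.run(args);
}
```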
### Phased Migration Strategy
For uncertain repeatability:
- Phase 1: Use code-driven to validate the task. Accept higher per-execution cost for flexibility.
- Phase 2: If the task stabilizes and will repeat, invest in MCP Optimized.
- Phase 3: Track actual execution count and token consumption. Migrate when patterns are clear.
## Summary
| Approach | Best For | Avoid When |
|---|---|---|
| MCP Optimized | Production workloads, large datasets, predictable costs, security requirements | One-off tasks, evolving requirements |
| Code-Driven | Prototyping, novel requirements, maximum flexibility | Production systems, multi-user environments |
| Slash Commands | User-controlled workflows, deterministic behavior, zero overhead | Automation, context-dependent decisions |
The core insight: how you architect data flow matters more than which protocol you choose. The 5x token difference between optimized and vanilla MCP—for the same task—demonstrates this clearly.
Match the tool to your constraints. Measure the results.
## References
- Token Efficiency in AI-Assisted Development - Full analysis of token consumption across approaches
- Claude Code Internals: Reverse Engineering Prompt Augmentation - Deep dive into how Claude Code's prompt mechanisms work
- MCP Specification
- AICode Toolkit (GitHub) - MCP servers and tools for AI-assisted development
- Token efficiency experiments (GitHub)
- Prompt augmentation analysis (GitHub)
All claims are reproducible using the open-source data and tooling in the referenced repositories.

