Vuong Ngo
MCP, Code, or Commands? A Decision Framework for AI Tool Integration

When building AI-assisted development workflows, the documentation explains what each approach does—but not the real cost implications or when to use which.

I instrumented network traffic and ran controlled experiments across five approaches using identical tasks: same 500-row dataset, same analysis requirements, same model (Claude Sonnet). The results revealed that architecture matters more than protocol choice.

MCP Optimized consumed 60,420 tokens. MCP Vanilla consumed 309,053 tokens. Same protocol. Same task. 5x difference—driven entirely by one decision: file-path references vs. data-array parameters.

This article provides a decision framework based on measured data, not marketing claims.


The Decision Framework

Before diving into data, here's the framework I developed from these experiments:

Quick Decision Guide

| If your situation is… | Use this approach |
| --- | --- |
| Repeating task (>20 executions), large datasets, need predictable costs | MCP Optimized |
| One-off exploration, evolving requirements, prototyping | Code-Driven (Skills) |
| User must control when it runs, deterministic behavior needed | Slash Commands |
| Production system with security requirements | MCP Optimized (never Skills) |

Decision Flowchart

```text
Q1: One-off task (< 5 executions)?
    YES → Code-Driven or direct prompting
    NO  → Continue

Q2: Dataset > 100 rows AND need < 5% cost variance?
    YES → MCP Optimized
    NO  → Continue

Q3: User needs explicit control over invocation?
    YES → Slash Commands
    NO  → Continue

Q4: Execution count > 20 AND requirements stable?
    YES → MCP Optimized
    NO  → Code-Driven (prototype, then migrate)

NEVER:
  - MCP Vanilla for production (always suboptimal)
  - Skills for multi-user or sensitive systems
```
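The flowchart reduces to a plain function. A minimal sketch (the function name, signature, and return labels are illustrative, not part of any library):

```python
def choose_approach(executions: int, rows: int, need_low_variance: bool,
                    user_controls_invocation: bool, requirements_stable: bool) -> str:
    """Encode the four-question decision flowchart."""
    if executions < 5:
        return "Code-Driven"          # Q1: one-off task
    if rows > 100 and need_low_variance:
        return "MCP Optimized"        # Q2: large data, predictable cost
    if user_controls_invocation:
        return "Slash Commands"       # Q3: explicit invocation
    if executions > 20 and requirements_stable:
        return "MCP Optimized"        # Q4: stable, repeating task
    return "Code-Driven"              # prototype, then migrate
```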

The Three Approaches Explained

MCP (Model Context Protocol)

A structured protocol for AI-tool communication. The model calls tools with JSON parameters, the server executes and returns structured results.

```javascript
// MCP tool call - structured, typed, validated
await call_tool('analyze_csv_file', {
  file_path: '/data/employees.csv',
  analysis_type: 'salary_by_department'
});
```

Characteristics: Structured I/O, access-controlled, model-decided invocation, reusable across applications.

Critical distinction: There's a 5x token difference between vanilla MCP (passing data directly) and optimized MCP (passing file references). Same protocol, vastly different economics.

Code-Driven (Skills & Code Generation)

The model writes and executes code to accomplish tasks. Claude Code's "skills" feature lets the model invoke capabilities based on semantic matching.

```python
# Claude writes this, executes it, iterates
import pandas as pd
df = pd.read_csv('/data/employees.csv')
result = df.groupby('department')['salary'].mean()
print(result)
```

Characteristics: Maximum flexibility, unstructured I/O, higher variance between runs, requires sandboxing.

Slash Commands

Pure string substitution. You type `/review @file.js`, the command template expands, and the expanded text is injected into your message.

```markdown
<!-- .claude/commands/review.md -->
Review the following file for security vulnerabilities,
performance issues, and code quality:

{file_content}

Focus on: authentication, input validation, error handling.
```

Characteristics: User-explicit, deterministic, single-turn, zero tool-call overhead.
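Since this is pure substitution, the mechanism fits in a few lines. A sketch of the expansion step (the helper name is hypothetical; only the `{file_content}` placeholder comes from the template above):

```python
from pathlib import Path

def expand_command(template: str, file_path: str) -> str:
    """Expand a slash-command template by splicing file contents
    into the {file_content} placeholder. No model call involved."""
    return template.replace("{file_content}", Path(file_path).read_text())
```

Because nothing here depends on the model, the result is the same on every run.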


Measured Data: What the Numbers Show

Methodology

  • Same workload: load 500-row CSV, perform grouping, summary stats, two plots
  • Same model: Claude Sonnet, default settings
  • 3-4 runs per approach with logged request/response payloads
  • Costs calculated at current Claude Sonnet pricing

Token Consumption

Token consumption per API request. MCP Optimized achieves consistently low usage through file-path architecture.

| Approach | Avg tokens/run | vs baseline | Why |
| --- | --- | --- | --- |
| MCP Optimized | 60,420 | −55% | File-path parameters; zero data duplication |
| MCP Proxy (warm) | 81,415 | −39% | Shared context + warm cache |
| Code-Skill (baseline) | 133,006 | — | Model-written Python; nothing cached |
| UTCP Code-Mode | 204,011 | +53% | Extra prompt framing |
| MCP Vanilla | 309,053 | +133% | JSON-serialized data in every call |

Cost at Scale

At 1,000 monthly executions:

| Approach | Per execution | Monthly | Annual |
| --- | --- | --- | --- |
| MCP Optimized | $0.21 | $210 | $2,520 |
| Code-Skill | $0.44 | $440 | $5,280 |
| MCP Vanilla | $0.99 | $990 | $11,880 |

$9,360 annual difference between optimized and vanilla MCP for a single workflow.
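The arithmetic behind the table, as a quick sanity check:

```python
# Per-execution costs from the table above; 1,000 executions per month.
costs = {"MCP Optimized": 0.21, "Code-Skill": 0.44, "MCP Vanilla": 0.99}
monthly = {k: v * 1000 for k, v in costs.items()}
annual = {k: v * 12 for k, v in monthly.items()}

gap = annual["MCP Vanilla"] - annual["MCP Optimized"]  # 11,880 - 2,520
print(f"${gap:,.0f}")  # → $9,360
```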

Scalability

Cumulative token consumption. MCP Optimized maintains low growth; vanilla approaches accumulate steeply.

| Approach | Scaling factor | 10K-row projection |
| --- | --- | --- |
| MCP Optimized | 1.5x | ~65K tokens |
| Code-Skill | 1.1–1.6x | ~150–220K tokens |
| MCP Vanilla | 2.0–2.9x | ~500–800K tokens |

MCP Optimized exhibits sub-linear scaling because file paths cost the same tokens regardless of file size. MCP Vanilla exhibits super-linear scaling because larger datasets require proportionally more tokens for JSON serialization.
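A back-of-envelope model of the parameter cost makes the difference concrete. Everything here is a rough assumption (~4 characters per token, ~80 bytes per serialized row, a fixed schema overhead), not measured data:

```python
def request_tokens(rows: int, use_file_path: bool,
                   chars_per_token: int = 4, bytes_per_row: int = 80) -> int:
    """Rough token cost of one tool call's *parameters*."""
    overhead = 30  # tool name + schema framing, roughly constant
    if use_file_path:
        return len("/data/employees.csv") // chars_per_token + overhead
    return rows * bytes_per_row // chars_per_token + overhead  # JSON grows with rows
```

Parameter cost is flat for file paths and linear in rows for serialized data; the residual growth in MCP Optimized comes from responses, not requests.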

Variance

| Approach | Coefficient of variation | Consistency |
| --- | --- | --- |
| MCP Optimized | 0.6% | Excellent |
| MCP Proxy (warm) | 0.5% | Excellent |
| Code-Skill | 18.7% | Poor |
| MCP Vanilla | 21.2% | Poor |

MCP Optimized hit 60,307, 60,144, and 60,808 tokens across three runs. Code-Skill ranged from 108K to 158K. High variance breaks capacity planning and makes cost prediction unreliable.
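Coefficient of variation here is the sample standard deviation over the mean; from the three MCP Optimized runs quoted above:

```python
import statistics

runs = [60_307, 60_144, 60_808]  # MCP Optimized, tokens per run
cv = statistics.stdev(runs) / statistics.mean(runs)
print(f"{cv:.1%}")  # → 0.6%
```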

Latency

Skills and sub-agents use tool-calling, which means two LLM invocations instead of one:

```text
User message → Model decides → Tool call → Tool result → Final response
```

Slash commands avoid this—they're just prompt injection with direct response.


Key Lessons

1. Architecture Trumps Protocol

MCP Optimized and MCP Vanilla use the same protocol, yet differ by 5x in token consumption. The difference is entirely architectural: file paths vs. data arrays. Focus on data-flow design, not protocol debates.

2. The File-Path Pattern

The single biggest efficiency gain: eliminate data duplication.

```javascript
// Anti-pattern: 10,000 tokens just for data
await call_tool('analyze_data', {
  data: [/* 500 rows serialized */]
});

// Pattern: 50 tokens for the same operation
await call_tool('analyze_csv_file', {
  file_path: '/data/employees.csv'
});
```

The MCP server handles file I/O internally. Data never enters the context window.
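Server-side, the tool handler does the heavy lifting. A minimal sketch of what such a handler might look like (handler, analysis, and column names are illustrative, not the article's actual server):

```python
import pandas as pd

def analyze_csv_file(file_path: str, analysis_type: str) -> dict:
    """Read the file inside the server process; only this small
    summary dict ever reaches the model's context window."""
    df = pd.read_csv(file_path)
    if analysis_type == "salary_by_department":
        return df.groupby("department")["salary"].mean().round(2).to_dict()
    raise ValueError(f"unknown analysis_type: {analysis_type}")
```

The 500 rows stay on disk; the model pays tokens only for the path going in and the aggregate coming back.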

3. Prototype with Skills, Ship with MCP

Skills execute arbitrary code—bash commands, file system access, network calls. They're excellent for figuring out what tools you need. They're inappropriate for production systems where security matters.

4. Slash Commands Are Underrated

When you need deterministic, user-controlled workflows, slash commands win. No tool-call overhead, no model surprises, no latency penalty. Use them for repeatable tasks like code review checklists or deployment procedures.

5. Sub-Agent Context Isolation

Sub-agents can't see your main conversation history. If they need context, you must explicitly pass it in the delegation prompt. This is by design—clean delegation—but requires explicit information passing.

6. CLAUDE.md Costs Compound

CLAUDE.md content injects into every message, including sub-agent conversations. Keep it concise. Use file references to pull in additional docs only when needed:

```markdown
<!-- CLAUDE.md -->
# Project Standards
See @docs/CODING_STANDARDS.md for detailed guidelines.

Key rules:
- Use TypeScript strict mode
- No any types
```

7. Measure Before Optimizing

Instrument your network traffic. The Anthropic API returns token usage in every response—log it. You might be surprised where tokens are actually going.
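The Anthropic Messages API reports token counts in the `usage` field of every response. A minimal logging sketch (the helper name and logfile path are illustrative; `usage` mirrors that field's `input_tokens`/`output_tokens` shape):

```python
import json
import time

def log_usage(usage: dict, logfile: str = "token_usage.jsonl") -> dict:
    """Append one response's token counts to a JSONL log."""
    record = {
        "ts": time.time(),
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
        "total": usage["input_tokens"] + usage["output_tokens"],
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

A few days of this log tells you which workflows dominate spend before you optimize anything.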


Implementation Patterns

Parallel Tool Execution

File-path architecture enables parallel calls:

```javascript
// Four visualizations, one API call, ~400 tokens total
await Promise.all([
  call_tool('create_viz', { file: '/data/emp.csv', type: 'bar', x: 'dept', y: 'salary' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'scatter', x: 'exp', y: 'salary' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'pie', col: 'department' }),
  call_tool('create_viz', { file: '/data/emp.csv', type: 'bar', x: 'location', y: 'salary' }),
]);
```

Progressive Tool Discovery

For large tool catalogs (20+ tools), use meta-tools for on-demand discovery instead of loading all tools upfront:

```javascript
// Initial context: 2 tools, ~400 tokens
const meta_tools = [
  { name: 'describe_tools', description: 'Discover available tools' },
  { name: 'use_tool', description: 'Execute a specific tool' }
];

// Instead of: 50 tools, ~50,000 tokens upfront
```
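On the server side, the two meta-tools reduce to a lookup over a registry. A sketch under assumed names (the registry contents and dispatch are hypothetical):

```python
# Hypothetical registry; real tool schemas would live here.
REGISTRY = {
    "analyze_csv_file": {"description": "Summarize a CSV by column",
                         "params": ["file_path", "analysis_type"]},
    "create_viz": {"description": "Render a chart from a CSV",
                   "params": ["file", "type"]},
}

def describe_tools(query: str = "") -> dict:
    """Return only tools matching the query, keeping context small."""
    return {name: spec for name, spec in REGISTRY.items()
            if query.lower() in spec["description"].lower()}

def use_tool(name: str, args: dict):
    """Dispatch to a tool the model discovered via describe_tools."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    ...  # dispatch to the real implementation
```

The model pays for tool schemas only when it asks for them, not on every request.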

Phased Migration Strategy

For uncertain repeatability:

  1. Phase 1: Use code-driven to validate the task. Accept higher per-execution cost for flexibility.
  2. Phase 2: If the task stabilizes and will repeat, invest in MCP Optimized.
  3. Phase 3: Track actual execution count and token consumption. Migrate when patterns are clear.

Summary

| Approach | Best for | Avoid when |
| --- | --- | --- |
| MCP Optimized | Production workloads, large datasets, predictable costs, security requirements | One-off tasks, evolving requirements |
| Code-Driven | Prototyping, novel requirements, maximum flexibility | Production systems, multi-user environments |
| Slash Commands | User-controlled workflows, deterministic behavior, zero overhead | Automation, context-dependent decisions |

The core insight: how you architect data flow matters more than which protocol you choose. The 5x token difference between optimized and vanilla MCP—for the same task—demonstrates this clearly.

Match the tool to your constraints. Measure the results.


References

All claims are reproducible using the open-source data and tooling in the referenced repositories.
