jimquote

Posted on Dec 16, 2025

Claude Skills vs MCP: Complete Guide to Token-Efficient AI Agent Architecture

#ai #agents #performance #architecture

Learn when to use Claude Skills, MCP (Model Context Protocol), or both — with real-world examples and token consumption analysis.

Introduction

As AI agents become more capable, developers face a critical architectural decision: how do you extend Claude's capabilities efficiently without burning through your context window?

Anthropic offers two powerful extension mechanisms:

Agent Skills: Modular instruction packages that Claude loads on-demand
MCP (Model Context Protocol): Standardized tool interfaces for external integrations

Choosing the right approach — or combining them effectively — can dramatically impact both token consumption and reliability. This guide breaks down everything you need to know.

What Are Claude Skills?

Skills are self-contained folders that package expertise into discoverable capabilities. Each Skill contains:

A SKILL.md file with YAML frontmatter (metadata) and Markdown instructions
Optional scripts, templates, and reference files
Executable code that Claude can run

How Skills Work

User Request → Claude scans skill metadata → 
Matches relevant skill → Loads SKILL.md → 
Follows instructions → Executes task

Skills operate through progressive disclosure — Claude only loads what it needs, when it needs it.

Example Skill Structure

pdf-processor/
├── SKILL.md           # Core instructions (~2-5k tokens)
├── REFERENCE.md       # Additional docs (loaded if needed)
├── extract_text.py    # Executable script
└── templates/
    └── output.json    # Output template

What Is MCP (Model Context Protocol)?

MCP is a standardized protocol that allows Claude to interact with external services through well-defined tool interfaces. Think of it as a universal adapter for APIs.

How MCP Works

User Request → Claude identifies need for external data →
Calls MCP tool with parameters → MCP server executes →
Returns structured response → Claude continues

MCP servers expose functions that Claude can call directly, similar to function calling in traditional APIs.

Example MCP Integration

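// MCP server exposes tools like this (inputs are described with JSON Schema):
{
  "name": "create_task",
  "description": "Create a new task in project management system",
  "inputSchema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "due_date": { "type": "string" },
      "assignee": { "type": "string" }
    }
  }
}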
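
On the wire, MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The following is a minimal sketch of the request a client might send for the `create_task` tool above. The `build_tool_call` helper and the argument values are illustrative assumptions, not part of any SDK.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument values for the create_task tool shown above.
request = build_tool_call(
    1,
    "create_task",
    {"title": "Review Q3 report", "due_date": "2025-01-15", "assignee": "sam"},
)
print(request)
```

In practice an MCP client library builds this message for you; the sketch only shows what the model's tool call turns into.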

Skills vs MCP: Key Differences

Aspect	Skills	MCP
Nature	Instruction documents + scripts	Standardized tool protocol
Invocation	Model-invoked (Claude decides)	Model-invoked via tool calls
Environment	Requires code execution sandbox	Works without code execution
Token Pattern	Progressive loading	Fixed per tool definition
Flexibility	High (natural language instructions)	Structured (defined schemas)
Reliability	Depends on Claude's interpretation	Schema-validated calls; server code runs deterministically
Portability	Plain Markdown and scripts; needs a skill-aware runtime	Open protocol; not limited to Anthropic clients
Development	Low barrier (Markdown + scripts)	Requires server implementation

Token Consumption: A Deep Dive

Understanding token economics is crucial for building efficient agents. Here's how each approach consumes tokens:

Skills: Progressive Disclosure Architecture

Skills load in four tiers, which minimizes token waste:

Tier 1: Metadata Scanning (~100 tokens per skill)

At session start, Claude only sees the skill's name and description from the YAML frontmatter:

---
name: pdf-processor
description: Extract text and tables from PDF files, fill forms, merge documents
---

Cost: ~100 tokens × number of skills

If you have 10 skills installed, that's only ~1,000 tokens baseline — regardless of how large each skill's full content is.
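
A minimal sketch of what the metadata scan amounts to: read only the frontmatter of each SKILL.md and ignore the body. The `read_frontmatter` helper and its simple `key: value` parsing are illustrative assumptions; a real loader would use a proper YAML parser, and the 100-token figure is the rough estimate used in this article, not a measurement.

```python
def read_frontmatter(skill_md_text: str) -> dict:
    """Return the key/value pairs between the leading '---' markers."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter; the body is never read
            break
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


SKILL_MD = """---
name: pdf-processor
description: Extract text and tables from PDF files, fill forms, merge documents
---

# PDF Processor
(long instructions that are not loaded during the scan)
"""

meta = read_frontmatter(SKILL_MD)
TOKENS_PER_SKILL = 100  # rough per-skill estimate from the article
baseline = TOKENS_PER_SKILL * 10  # ten installed skills
print(meta["name"], baseline)  # pdf-processor 1000
```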

Tier 2: Full Instructions (<5k tokens)

Only when Claude determines a skill is relevant does it load the complete SKILL.md:

# PDF Processor

## Instructions
1. Use PyMuPDF for text extraction
2. Use pdfplumber for tables
3. Always validate output format...

## Examples
[Detailed examples here]

Cost: Typically 2,000-5,000 tokens, but only for activated skills

Tier 3: Reference Files (Variable)

Additional files are loaded only when specifically needed:

# In SKILL.md:
For complex forms, refer to FORMS_REFERENCE.md

Cost: Only if Claude reads the file

Tier 4: Script Execution (Output Only)

When Claude runs a script, the script code never enters the context window, provided Claude executes the file rather than reading it. Only the output does:

# This script code = 0 tokens in context
def validate_pdf(path):
    # 50 lines of Python...
    return "Validation passed: 3 pages, 2 tables detected"

# Only this output consumes tokens: ~15 tokens

This is extremely efficient for complex operations.
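
The same idea can be sketched from the host side: run the script as a subprocess and keep only what it prints. The inline `SCRIPT` string is a stand-in for a longer file on disk (an assumption made so the example runs on its own).

```python
import subprocess
import sys

# Stand-in for a long script on disk; in a real skill this would be a file
# such as validate_pdf.py that is executed but never read into context.
SCRIPT = 'print("Validation passed: 3 pages, 2 tables detected")'

result = subprocess.run(
    [sys.executable, "-c", SCRIPT],
    capture_output=True,
    text=True,
    check=True,  # raise if the script exits with a non-zero status
)
output = result.stdout.strip()  # only this string would enter the context
print(output)
```

However long the script is, the cost to the conversation is the length of `output`, which is why scripts suit verbose or repetitive logic.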

MCP: Fixed Tool Definitions

MCP tools have a different token pattern:

Tool Definitions (Fixed Cost)

Each MCP tool definition consumes tokens in the system prompt:

{
  "name": "asana_create_task",
  "description": "Create a task in Asana",
  "inputSchema": {
    "type": "object",
    "properties": {
      "workspace_id": { "type": "string" },
      "project_id": { "type": "string" },
      "title": { "type": "string" },
      "notes": { "type": "string" },
      "due_date": { "type": "string" },
      "assignee": { "type": "string" }
    }
  }
}

Cost: ~100-300 tokens per tool, loaded on every request
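
To get a feel for how definitions add up, here is a rough estimator. It assumes about four characters per token, which is a common rule of thumb and not a real tokenizer, so treat the numbers as relative rather than exact. The function names are illustrative.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)


def definitions_cost(tools: list[dict]) -> int:
    """Estimated tokens added to every request by a list of tool definitions."""
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)


tool = {
    "name": "asana_create_task",
    "description": "Create a task in Asana",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due_date": {"type": "string"},
        },
    },
}

one = definitions_cost([tool])
ten = definitions_cost([tool] * 10)
print(one, ten)  # the fixed cost grows linearly with the number of tools
```

For real measurements, use the token counts reported by your API responses instead of a heuristic.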

Tool Calls and Responses

// Tool call: ~50 tokens
{
  "tool": "asana_create_task",
  "input": {
    "title": "Review Q3 report",
    "due_date": "2025-01-15"
  }
}

// Response: Variable based on data returned
{
  "task_id": "12345",
  "status": "created",
  "url": "https://app.asana.com/..."
}

Token Comparison Table

Scenario	Skills	MCP
10 capabilities installed, none used	~1,000 tokens	~2,000 tokens
1 capability activated	+2,000-5,000 tokens	~200 tokens
Complex operation with scripts	Output only (~100 tokens)	N/A
Simple API call	~3,000 tokens total	~400 tokens
Heavy reference docs needed	+5,000-20,000 tokens	N/A

Key Insight

Skills are more efficient when capabilities are installed but rarely used
MCP is more efficient for frequent, simple API calls
Skills excel when operations need complex scripts (code never enters context)
MCP excels for structured, predictable tool interactions
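
The trade-off can be sketched as a small cost model using the rough figures from the table above. The constants are this article's estimates, and the 3,000-token skill body is an assumed midpoint of the 2,000-5,000 range.

```python
# Rough per-unit estimates taken from the comparison table above.
SKILL_METADATA = 100  # per installed skill, always loaded
SKILL_BODY = 3_000    # per activated skill (assumed midpoint of 2,000-5,000)
MCP_DEFINITION = 200  # per installed tool, always loaded
MCP_CALL = 200        # per tool call


def skills_cost(installed: int, activated: int) -> int:
    """Estimated tokens for skills: metadata for all, bodies for activated ones."""
    return installed * SKILL_METADATA + activated * SKILL_BODY


def mcp_cost(installed: int, calls: int) -> int:
    """Estimated tokens for MCP: definitions for all tools plus each call."""
    return installed * MCP_DEFINITION + calls * MCP_CALL


print(skills_cost(10, 0), mcp_cost(10, 0))  # 1000 2000
print(skills_cost(10, 1), mcp_cost(10, 1))  # 4000 2200
```

With nothing in use, skills are cheaper; once a single capability is exercised, the one-off cost of loading a skill body dominates.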

When to Use Skills

Choose Skills when you need:

1. Complex Business Logic

Skills shine when Claude needs to understand context, not just execute commands:

# Customer Support Skill

## Escalation Rules
- VIP customers (tier: platinum) → Always escalate billing issues
- Response SLA: 4 hours for critical, 24 hours for normal
- Never offer refunds > $500 without manager approval

## Tone Guidelines
- Match customer's formality level
- Acknowledge frustration before problem-solving

2. Multi-Step Workflows

When a task involves multiple tools and decision points:

# Data Pipeline Skill

## Workflow
1. Validate incoming CSV format
2. If validation fails → Run repair_csv.py
3. Transform using transform.py
4. Load to database using load.py
5. Generate summary report

3. Heavy Computation via Scripts

Offload processing to scripts — only results enter context:

# Image Processing Skill

For batch image optimization, run:
$ python optimize_images.py --input ./uploads --quality 85

Script output will contain file paths and compression ratios.

4. Rapid Prototyping

No server to maintain — just edit a Markdown file:

# Quick API Skill

## API Details
- Endpoint: https://api.example.com/v1
- Auth: Bearer token from $API_KEY
- Rate limit: 100 requests/minute

## Usage
Use curl or Python requests to call the API.

When to Use MCP

Choose MCP when you need:

1. Reliable, Structured Operations

For CRUD operations where consistency matters:

// Illustrative client call: the tool's input schema constrains the call to this structure
await mcp.call("database_query", {
  table: "users",
  filter: { status: "active" },
  limit: 100
});

2. Real-Time External Data

When you need current information from external services:

// Slack MCP
await mcp.call("slack_search_messages", {
  query: "project update",
  channel: "#engineering"
});

3. Cross-Platform Compatibility

MCP works in Claude.ai, API, and Claude Code — even without code execution:

// Messages API MCP connector config; other Claude surfaces configure servers in their own settings
mcp_servers: [
  { type: "url", url: "https://mcp.notion.com/sse", name: "notion" }
]

4. Existing MCP Ecosystem

Leverage community-built servers:

Slack, GitHub, Notion, Asana
Google Drive, Gmail, Calendar
Databases (PostgreSQL, MongoDB)
And many more...

The Hybrid Approach: Skills + MCP Together

The most powerful pattern combines both: Skills provide the "how" and "when," while MCP provides the "execute."

Architecture

my-crm-integration/
├── SKILL.md              # Business rules + decision logic
└── (references MCP)      # Actual API calls via MCP

Example: Sales Pipeline Skill with MCP

---
name: sales-pipeline-manager
description: Manage sales opportunities with CRM integration
---

# Sales Pipeline Manager

## When to Use This Skill
- User asks about deals, opportunities, or pipeline
- User wants to update deal stages
- User needs sales forecasting

## Business Rules

### Stage Transitions
- "Qualified" → "Proposal" requires: Budget confirmed + Decision maker identified
- "Proposal" → "Negotiation" requires: Proposal sent + Follow-up scheduled
- "Negotiation" → "Closed Won" requires: Contract signed

### Data Enrichment
Before creating a new contact, always:
1. Search existing contacts via MCP: `salesforce_search`
2. If duplicate found, update instead of create
3. Log activity via MCP: `salesforce_log_activity`

## MCP Tools Available
Use these Salesforce MCP tools:
- `salesforce_search`: Find records
- `salesforce_create`: Create new records
- `salesforce_update`: Update existing records
- `salesforce_log_activity`: Log calls/emails

## Example Workflow

User: "Move the Acme deal to negotiation"

1. Search for deal: `salesforce_search({ object: "Opportunity", name: "Acme" })`
2. Verify stage requirements (see Business Rules above)
3. If requirements met: `salesforce_update({ id: "...", stage: "Negotiation" })`
4. Log activity: `salesforce_log_activity({ type: "Stage Change", notes: "..." })`
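
In the skill above, Claude interprets the stage rules from prose. If a team wanted those rules enforced deterministically, for example in a script bundled with the skill, they might look like the following sketch. The requirement keys and the `missing_requirements` helper are hypothetical names introduced for illustration.

```python
# Requirements copied from the Stage Transitions rules in the skill above.
REQUIREMENTS = {
    ("Qualified", "Proposal"): {"budget_confirmed", "decision_maker_identified"},
    ("Proposal", "Negotiation"): {"proposal_sent", "follow_up_scheduled"},
    ("Negotiation", "Closed Won"): {"contract_signed"},
}


def missing_requirements(current: str, target: str, facts: set[str]) -> set[str]:
    """Return unmet requirements; raise if the transition has no rule."""
    try:
        required = REQUIREMENTS[(current, target)]
    except KeyError:
        raise ValueError(f"No rule for {current!r} -> {target!r}") from None
    return required - facts


# Hypothetical facts gathered from a salesforce_search result.
acme_facts = {"proposal_sent"}
print(missing_requirements("Proposal", "Negotiation", acme_facts))
# {'follow_up_scheduled'}
```

An empty result means the update call can proceed; anything else is what Claude should report back to the user instead of changing the stage.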

Token Flow in Hybrid Approach

Session Start:
├── Skill metadata loaded: ~100 tokens
└── MCP tool definitions: ~400 tokens
    Total baseline: ~500 tokens

User asks about Acme deal:
├── Full SKILL.md loaded: ~3,000 tokens
├── MCP search call: ~200 tokens
├── MCP update call: ~200 tokens  
└── MCP log call: ~150 tokens
    Total for task: ~3,550 tokens

Next simple query (skill already loaded):
└── MCP call only: ~200 tokens
    Incremental cost: ~200 tokens

Real-World Examples

Example 1: Document Processing Pipeline

Approach: Skill with embedded scripts

---
name: contract-analyzer
description: Extract and analyze key terms from legal contracts
---

# Contract Analyzer

## Capabilities
- Extract party names, dates, terms
- Identify risk clauses
- Generate summary reports

## Workflow

1. **Extract Text**
   Run: `python extract_pdf.py {input_file}`
   Output: Raw text + structure metadata

2. **Analyze Terms**
   Run: `python analyze_contract.py {extracted_text}`
   Output: JSON with key terms

3. **Risk Assessment**
   Apply these rules:
   - Unlimited liability → HIGH RISK
   - No termination clause → MEDIUM RISK
   - Auto-renewal > 1 year → MEDIUM RISK

Token Efficiency:

Script execution: 0 tokens (code never loaded)
Only outputs enter context: ~500 tokens per document
Full analysis: ~4,000 tokens total
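
The risk rules in the workflow above are simple enough to express as code. This is a sketch only: the field names in `sample_terms` are assumptions about what `analyze_contract.py` might emit, since the article does not show its output.

```python
def assess_risk(terms: dict) -> list[tuple[str, str]]:
    """Apply the three risk rules from the skill to extracted contract terms."""
    findings = []
    if terms.get("unlimited_liability"):
        findings.append(("HIGH", "Unlimited liability"))
    if not terms.get("has_termination_clause", False):
        findings.append(("MEDIUM", "No termination clause"))
    if terms.get("auto_renewal_years", 0) > 1:
        findings.append(("MEDIUM", "Auto-renewal > 1 year"))
    return findings


# Hypothetical output shape for analyze_contract.py; field names are assumptions.
sample_terms = {
    "unlimited_liability": False,
    "has_termination_clause": False,
    "auto_renewal_years": 2,
}
print(assess_risk(sample_terms))
# [('MEDIUM', 'No termination clause'), ('MEDIUM', 'Auto-renewal > 1 year')]
```

Whether rules like these live in SKILL.md prose or in a script is a judgment call: prose is easier to edit, while a script applies them the same way every time.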

Example 2: Multi-Service Integration

Approach: Skill + Multiple MCP Servers

---
name: standup-report-generator
description: Generate daily standup reports from multiple sources
---

# Standup Report Generator

## Data Sources (via MCP)
- GitHub: Pull requests, commits
- Jira: Ticket updates, status changes
- Slack: Team channel highlights

## Report Format
Generate markdown with:
1. Yesterday's completed items
2. Today's planned work
3. Blockers

## MCP Queries

### GitHub
`github_list_prs({ author: "@me", since: "yesterday" })`
`github_list_commits({ author: "@me", since: "yesterday" })`

### Jira
`jira_search({ assignee: "currentUser", updated: ">-1d" })`

### Slack
`slack_search({ query: "from:@me", after: "yesterday" })`

Token Efficiency:

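Skill instructions: ~2,000 tokens (loaded once per session)
4 MCP calls (two GitHub, one Jira, one Slack): ~800 tokens at ~200 tokens each, plus response data
Each subsequent report in the same session: Only MCP calls (~800 tokens)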
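
The report format from the skill can also be produced by a small helper, which keeps the layout consistent from day to day. A minimal sketch, where the `render_standup` function and the sample items are illustrative stand-ins for data returned by the MCP queries:

```python
def render_standup(completed: list[str], planned: list[str], blockers: list[str]) -> str:
    """Render the three report sections as markdown."""
    sections = [
        ("Yesterday's completed items", completed),
        ("Today's planned work", planned),
        ("Blockers", blockers),
    ]
    lines = []
    for title, items in sections:
        lines.append(f"## {title}")
        # Show an explicit placeholder when a section has no items.
        lines.extend(f"- {item}" for item in (items or ["None"]))
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical items standing in for GitHub, Jira, and Slack results.
report = render_standup(
    completed=["Merged PR: fix login redirect"],
    planned=["Review open Jira tickets"],
    blockers=[],
)
print(report)
```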

Example 3: API Integration Without MCP

Approach: Pure Skill with curl/Python

---
name: weather-integration
description: Fetch weather data for travel planning
---

# Weather Integration

## API Details
- Provider: OpenWeatherMap
- Key: Use $WEATHER_API_KEY environment variable
- Base URL: https://api.openweathermap.org/data/2.5

## Usage

For current weather:

curl "https://api.openweathermap.org/data/2.5/weather?q={city}&appid=$WEATHER_API_KEY"

For 5-day forecast:

curl "https://api.openweathermap.org/data/2.5/forecast?q={city}&appid=$WEATHER_API_KEY"

## Response Handling
Parse JSON response and extract:
- main.temp: Kelvin by default; convert to Celsius
- weather[0].description: Human-readable condition
- wind.speed: In meters/second

When to Use This Pattern:

No existing MCP server for the API
Simple REST calls
Quick integration without server maintenance
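
The response handling described in the skill above could be bundled as a small script. In this sketch the `SAMPLE` payload is a hypothetical, trimmed response body (a real one has many more fields), and `summarize_weather` is an illustrative name.

```python
import json


def summarize_weather(payload: str) -> dict:
    """Extract the fields listed under Response Handling from a JSON response."""
    data = json.loads(payload)
    return {
        "temp_c": round(data["main"]["temp"] - 273.15, 1),  # Kelvin -> Celsius
        "condition": data["weather"][0]["description"],
        "wind_mps": data["wind"]["speed"],
    }


# Hypothetical, trimmed response body used only for illustration.
SAMPLE = (
    '{"main": {"temp": 293.15}, '
    '"weather": [{"description": "light rain"}], '
    '"wind": {"speed": 4.1}}'
)
print(summarize_weather(SAMPLE))
# {'temp_c': 20.0, 'condition': 'light rain', 'wind_mps': 4.1}
```

Returning three fields instead of the full response also keeps the amount of text entering the context small.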

Best Practices for Token Optimization

1. Structure Skills for Progressive Loading

# Main SKILL.md (~2k tokens)
Core instructions that apply to most requests.

For advanced features, see ADVANCED.md
For API reference, see API_REFERENCE.md

Separate rarely-used content into reference files.

2. Use Scripts for Heavy Lifting

Don't: Put 100 lines of data transformation logic in SKILL.md

Do: Create a script and reference it:

For data transformation, run:
$ python transform.py --input data.json

The script handles all edge cases and validation.

3. Batch MCP Calls When Possible

Inefficient:

await mcp.call("get_user", { id: 1 });
await mcp.call("get_user", { id: 2 });
await mcp.call("get_user", { id: 3 });
// 3 calls × ~200 tokens = 600 tokens

Efficient:

await mcp.call("get_users", { ids: [1, 2, 3] });
// 1 call × ~250 tokens = 250 tokens
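
The arithmetic above can be turned into a small helper for deciding on a batch size. The figures are this article's estimates; the 25 tokens per additional ID is an assumption chosen so that a three-ID batch comes to ~250 tokens, as in the example.

```python
def chunked(ids: list[int], size: int) -> list[list[int]]:
    """Split ids into batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def estimated_call_tokens(
    batches: list[list[int]], base: int = 200, per_extra_id: int = 25
) -> int:
    """Estimate tokens: a fixed cost per call plus a small cost per extra ID."""
    return sum(base + per_extra_id * (len(batch) - 1) for batch in batches)


ids = [1, 2, 3]
print(estimated_call_tokens(chunked(ids, 1)))  # 600 (three separate calls)
print(estimated_call_tokens(chunked(ids, 3)))  # 250 (one batched call)
```

Batching only helps if the server exposes a batch tool such as the `get_users` example, and response size still scales with the number of records returned.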

4. Keep MCP Tool Descriptions Concise

Every token in a tool definition is loaded on every request, and descriptions are usually the largest part:

Verbose (~300 tokens for a full definition written this way):

{
  "description": "This tool allows you to create a new task in the project management system. You can specify the task title, description, due date, priority level, assignee, project, tags, and custom fields. The task will be created and a confirmation with the task ID will be returned."
}

Concise (~80 tokens for the full definition):

{
  "description": "Create a task. Returns task ID on success."
}

5. Choose the Right Tool for the Job

Use Case	Recommendation	Why
Simple API call	MCP	Lower overhead, deterministic
Complex workflow	Skill	Natural language instructions
Data processing	Skill + Script	Code never enters context
Real-time data	MCP	Direct external access
Business rules	Skill	Easy to read/update
Frequent operations	MCP	Consistent execution

Conclusion

Skills and MCP are complementary technologies, not competitors:

Skills provide flexible, context-rich instructions with excellent token efficiency for complex workflows
MCP provides reliable, structured tool access for external integrations
Combined, they create powerful AI agents that are both intelligent and efficient

The key to token optimization is understanding the loading patterns:

Skills use progressive disclosure — only pay for what you use
MCP definitions are always loaded — keep them concise
Scripts are the ultimate token saver — code never enters context
The hybrid approach often provides the best of both worlds

Start simple, measure your token usage, and evolve your architecture based on real-world patterns.