DEV Community

jimquote


Claude Skills vs MCP: Complete Guide to Token-Efficient AI Agent Architecture

Learn when to use Claude Skills, MCP (Model Context Protocol), or both — with real-world examples and token consumption analysis.


Introduction

As AI agents become more capable, developers face a critical architectural decision: how do you extend Claude's capabilities efficiently without burning through your context window?

Anthropic offers two powerful extension mechanisms:

  • Agent Skills: Modular instruction packages that Claude loads on demand
  • MCP (Model Context Protocol): Standardized tool interfaces for external integrations

Choosing the right approach — or combining them effectively — can dramatically impact both token consumption and reliability. This guide breaks down everything you need to know.


What Are Claude Skills?

Skills are self-contained folders that package expertise into discoverable capabilities. Each Skill contains:

  • A SKILL.md file with YAML frontmatter (metadata) and Markdown instructions
  • Optional templates and reference files
  • Optional executable scripts that Claude can run

How Skills Work

User Request → Claude scans skill metadata → 
Matches relevant skill → Loads SKILL.md → 
Follows instructions → Executes task

Skills operate through progressive disclosure — Claude only loads what it needs, when it needs it.
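The metadata-scanning step can be sketched in a few lines of Python. This is only an illustration of the idea, not how Claude implements it: `read_frontmatter` and `scan_skills` are hypothetical helpers, and the parser assumes flat `key: value` frontmatter rather than full YAML.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` markers."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter, stop before the body
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def scan_skills(root: Path) -> list[dict[str, str]]:
    """Collect only name and description for each skill folder under root."""
    index = []
    for skill_md in sorted(root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        index.append({
            "folder": skill_md.parent.name,
            "name": meta.get("name", ""),
            "description": meta.get("description", ""),
        })
    return index
```

Only the small index returned by `scan_skills` would need to sit in context up front; the body of each SKILL.md is read later, and only for the skill that matches.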

Example Skill Structure

pdf-processor/
├── SKILL.md           # Core instructions (~2-4k tokens)
├── REFERENCE.md       # Additional docs (loaded if needed)
├── extract_text.py    # Executable script
└── templates/
    └── output.json    # Output template

What Is MCP (Model Context Protocol)?

MCP is a standardized protocol that allows Claude to interact with external services through well-defined tool interfaces. Think of it as a universal adapter for APIs.

How MCP Works

User Request → Claude identifies need for external data →
Calls MCP tool with parameters → MCP server executes →
Returns structured response → Claude continues

MCP servers expose functions that Claude can call directly, similar to function calling in traditional APIs.
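To make the call-and-respond flow concrete, here is a minimal sketch of the server side of such a dispatch. It deliberately avoids any real MCP SDK: the `tool` decorator, the `TOOLS` registry, and `call_tool` are hypothetical names, and treating `title` as the only required argument is an assumption.

```python
from typing import Any, Callable, Optional

# Hypothetical in-memory registry: tool name -> handler and required arguments.
TOOLS: dict[str, dict[str, Any]] = {}


def tool(name: str, required: list[str]) -> Callable:
    """Register a handler under a tool name."""
    def register(fn: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = {"handler": fn, "required": required}
        return fn
    return register


@tool("create_task", required=["title"])
def create_task(title: str, due_date: Optional[str] = None,
                assignee: Optional[str] = None) -> dict:
    # A real server would call the project management API here.
    return {"status": "created", "title": title,
            "due_date": due_date, "assignee": assignee}


def call_tool(name: str, arguments: dict[str, Any]) -> dict:
    """Dispatch a structured call; reject unknown tools and missing arguments."""
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    missing = [key for key in TOOLS[name]["required"] if key not in arguments]
    if missing:
        return {"error": f"missing arguments: {missing}"}
    return TOOLS[name]["handler"](**arguments)
```

The model only ever produces the tool name and an arguments object; everything inside the handler runs outside the context window.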

Example MCP Integration

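// MCP server exposes tools like this (arguments are described as JSON Schema under "inputSchema"):
{
  "name": "create_task",
  "description": "Create a new task in project management system",
  "inputSchema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "due_date": { "type": "string" },
      "assignee": { "type": "string" }
    }
  }
}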

Skills vs MCP: Key Differences

| Aspect | Skills | MCP |
|---|---|---|
| Nature | Instruction documents + scripts | Standardized tool protocol |
| Invocation | Model-invoked (Claude decides) | Model-invoked via tool calls |
| Environment | Requires code execution sandbox | Works without code execution |
| Token Pattern | Progressive loading | Fixed per tool definition |
| Flexibility | High (natural language instructions) | Structured (defined schemas) |
| Reliability | Depends on Claude's interpretation | Deterministic execution once a tool is called |
| Portability | Plain Markdown + scripts; needs a host that supports Skills | Open protocol, supported by clients beyond Anthropic |
| Development | Low barrier (Markdown + scripts) | Requires server implementation |

Token Consumption: A Deep Dive

Understanding token economics is crucial for building efficient agents. Here's how each approach consumes tokens:

Skills: Progressive Disclosure Architecture

Skills load in tiers that minimize token waste: three tiers of on-demand content, plus script execution, where only the output reaches the context:

Tier 1: Metadata Scanning (~100 tokens per skill)

At session start, Claude only sees the skill's name and description from the YAML frontmatter:

---
name: pdf-processor
description: Extract text and tables from PDF files, fill forms, merge documents
---

Cost: ~100 tokens × number of skills

If you have 10 skills installed, that's only ~1,000 tokens baseline — regardless of how large each skill's full content is.

Tier 2: Full Instructions (<5k tokens)

Only when Claude determines a skill is relevant does it load the complete SKILL.md:

# PDF Processor

## Instructions
1. Use PyMuPDF for text extraction
2. Use pdfplumber for tables
3. Always validate output format...

## Examples
[Detailed examples here]

Cost: Typically 2,000-5,000 tokens, but only for activated skills

Tier 3: Reference Files (Variable)

Additional files are loaded only when specifically needed:

# In SKILL.md:
For complex forms, refer to FORMS_REFERENCE.md

Cost: Only if Claude reads the file

Tier 4: Script Execution (Output Only)

When Claude runs a script, the script code never enters the context window. Only the output does:

# This script code = 0 tokens in context
def validate_pdf(path):
    # 50 lines of Python...
    return "Validation passed: 3 pages, 2 tables detected"

# Only this output consumes tokens: ~15 tokens

This is extremely efficient for complex operations.
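A small runnable sketch shows why this holds: the caller starts the script as a child process and receives only its standard output. The inline `SCRIPT` string stands in for a bundled file such as extract_text.py, and `run_script` is a hypothetical helper.

```python
import subprocess
import sys

# Stand-in for a bundled script. Its source is passed to the interpreter
# and is never part of what the caller gets back.
SCRIPT = r'''
import sys
path = sys.argv[1]
# ... imagine 50 lines of validation logic here ...
print(f"Validation passed: {path}")
'''


def run_script(source: str, *args: str) -> str:
    """Run the script in a child process and return only its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source, *args],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return result.stdout.strip()


output = run_script(SCRIPT, "report.pdf")
print(output)  # only this short string would enter the model's context
```

However long the script grows, the context cost tracks the length of `output`, which is why terse, well-structured script output matters.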

MCP: Fixed Tool Definitions

MCP tools have a different token pattern:

Tool Definitions (Fixed Cost)

Each MCP tool definition consumes tokens in the system prompt:

{
  "name": "asana_create_task",
  "description": "Create a task in Asana",
  "inputSchema": {
    "type": "object",
    "properties": {
      "workspace_id": { "type": "string" },
      "project_id": { "type": "string" },
      "title": { "type": "string" },
      "notes": { "type": "string" },
      "due_date": { "type": "string" },
      "assignee": { "type": "string" }
    }
  }
}

Cost: ~100-300 tokens per tool, loaded every request
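To get a feel for this fixed overhead, you can estimate the size of your own tool definitions. The sketch below assumes roughly 4 characters per token, which is a crude rule of thumb and not a real tokenizer; `estimate_tokens` and `per_request_overhead` are hypothetical helpers, and the sample definition is illustrative.

```python
import json


def estimate_tokens(definition: dict) -> int:
    """Rough size estimate, assuming ~4 characters per token (not a tokenizer)."""
    compact = json.dumps(definition, separators=(",", ":"))
    return max(1, len(compact) // 4)


def per_request_overhead(definitions: list[dict]) -> int:
    """Tool definitions are sent with every request, so their sizes add up."""
    return sum(estimate_tokens(d) for d in definitions)


create_task = {
    "name": "asana_create_task",
    "description": "Create a task in Asana",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due_date": {"type": "string"},
        },
    },
}

# Ten similarly sized tools cost ten times one definition on every request.
print(per_request_overhead([create_task] * 10))
```

For real numbers, measure with the token counts your API responses report rather than this heuristic.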

Tool Calls and Responses

// Tool call: ~50 tokens
{
  "tool": "asana_create_task",
  "input": {
    "title": "Review Q3 report",
    "due_date": "2025-01-15"
  }
}

// Response: Variable based on data returned
{
  "task_id": "12345",
  "status": "created",
  "url": "https://app.asana.com/..."
}

Token Comparison Table

| Scenario | Skills | MCP |
|---|---|---|
| 10 capabilities installed, none used | ~1,000 tokens | ~2,000 tokens |
| 1 capability activated | +2,000-5,000 tokens | ~200 tokens |
| Complex operation with scripts | Output only (~100 tokens) | N/A |
| Simple API call | ~3,000 tokens total | ~400 tokens |
| Heavy reference docs needed | +5,000-20,000 tokens | N/A |

Key Insight

  • Skills are more efficient when capabilities are installed but rarely used
  • MCP is more efficient for frequent, simple API calls
  • Skills excel when operations need complex scripts (code never enters context)
  • MCP excels for structured, predictable tool interactions
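The trade-off can be expressed as a toy cost model using this article's rough figures. The constants are midpoints of the ranges quoted above, not measurements, and both function names are hypothetical.

```python
SKILL_METADATA = 100   # per installed skill, always present
SKILL_BODY = 3_000     # per activated SKILL.md (quoted range: 2,000-5,000)
MCP_DEFINITION = 200   # per installed tool, always present (quoted range: 100-300)
MCP_CALL = 200         # per tool call plus a small response


def skills_tokens(installed: int, activated: int) -> int:
    """Baseline metadata plus the full instructions of each activated skill."""
    return installed * SKILL_METADATA + activated * SKILL_BODY


def mcp_tokens(installed: int, calls: int) -> int:
    """Baseline tool definitions plus each call and its response."""
    return installed * MCP_DEFINITION + calls * MCP_CALL


for used in (0, 1, 5):
    print(used, skills_tokens(10, min(used, 1)), mcp_tokens(10, used))
```

With nothing used, Skills come out ahead (1,000 vs 2,000 tokens); once a capability is exercised, a single skill activation costs more than several simple MCP calls. Replace the constants with your own measurements before drawing conclusions.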

When to Use Skills

Choose Skills when you need:

1. Complex Business Logic

Skills shine when Claude needs to understand context, not just execute commands:

# Customer Support Skill

## Escalation Rules
- VIP customers (tier: platinum) → Always escalate billing issues
- Response SLA: 4 hours for critical, 24 hours for normal
- Never offer refunds > $500 without manager approval

## Tone Guidelines
- Match customer's formality level
- Acknowledge frustration before problem-solving

2. Multi-Step Workflows

When a task involves multiple tools and decision points:

# Data Pipeline Skill

## Workflow
1. Validate incoming CSV format
2. If validation fails → Run repair_csv.py
3. Transform using transform.py
4. Load to database using load.py
5. Generate summary report

3. Heavy Computation via Scripts

Offload processing to scripts — only results enter context:

# Image Processing Skill

For batch image optimization, run:
$ python optimize_images.py --input ./uploads --quality 85

Script output will contain file paths and compression ratios.

4. Rapid Prototyping

No server to maintain — just edit a Markdown file:

# Quick API Skill

## API Details
- Endpoint: https://api.example.com/v1
- Auth: Bearer token from $API_KEY
- Rate limit: 100 requests/minute

## Usage
Use curl or Python requests to call the API.

When to Use MCP

Choose MCP when you need:

1. Reliable, Structured Operations

For CRUD operations where consistency matters:

// Illustrative pseudocode: `mcp.call` stands in for your MCP client; the tool's schema fixes the call structure
await mcp.call("database_query", {
  table: "users",
  filter: { status: "active" },
  limit: 100
});

2. Real-Time External Data

When you need current information from external services:

// Slack MCP
await mcp.call("slack_search_messages", {
  query: "project update",
  channel: "#engineering"
});

3. Cross-Platform Compatibility

MCP works in Claude.ai, API, and Claude Code — even without code execution:

// Works everywhere Claude runs (API form shown; each server entry also takes a `name` label)
mcp_servers: [
  { type: "url", url: "https://mcp.notion.com/sse", name: "notion" }
]

4. Existing MCP Ecosystem

Leverage community-built servers:

  • Slack, GitHub, Notion, Asana
  • Google Drive, Gmail, Calendar
  • Databases (PostgreSQL, MongoDB)
  • And many more...

The Hybrid Approach: Skills + MCP Together

The most powerful pattern combines both: Skills provide the "how" and "when," while MCP provides the "execute."

Architecture

my-crm-integration/
├── SKILL.md              # Business rules + decision logic
└── (references MCP)      # Actual API calls via MCP

Example: Sales Pipeline Skill with MCP

---
name: sales-pipeline-manager
description: Manage sales opportunities with CRM integration
---

# Sales Pipeline Manager

## When to Use This Skill
- User asks about deals, opportunities, or pipeline
- User wants to update deal stages
- User needs sales forecasting

## Business Rules

### Stage Transitions
- "Qualified" → "Proposal" requires: Budget confirmed + Decision maker identified
- "Proposal" → "Negotiation" requires: Proposal sent + Follow-up scheduled
- "Negotiation" → "Closed Won" requires: Contract signed

### Data Enrichment
Before creating a new contact, always:
1. Search existing contacts via MCP: `salesforce_search`
2. If duplicate found, update instead of create
3. Log activity via MCP: `salesforce_log_activity`

## MCP Tools Available
Use these Salesforce MCP tools:
- `salesforce_search`: Find records
- `salesforce_create`: Create new records
- `salesforce_update`: Update existing records
- `salesforce_log_activity`: Log calls/emails

## Example Workflow

User: "Move the Acme deal to negotiation"

1. Search for deal: `salesforce_search({ object: "Opportunity", name: "Acme" })`
2. Verify stage requirements (see Business Rules above)
3. If requirements met: `salesforce_update({ id: "...", stage: "Negotiation" })`
4. Log activity: `salesforce_log_activity({ type: "Stage Change", notes: "..." })`
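If you want the stage-transition rules checked deterministically rather than left to interpretation, they could also live in a small bundled script. The sketch below is one way to encode them; the requirement flag names (such as `budget_confirmed`) and the function name are hypothetical, not Salesforce fields.

```python
# Stage transition -> requirements, mirroring the Business Rules in the skill.
REQUIREMENTS: dict[tuple[str, str], set[str]] = {
    ("Qualified", "Proposal"): {"budget_confirmed", "decision_maker_identified"},
    ("Proposal", "Negotiation"): {"proposal_sent", "follow_up_scheduled"},
    ("Negotiation", "Closed Won"): {"contract_signed"},
}


def missing_requirements(current: str, target: str, completed: set[str]) -> set[str]:
    """Return the requirements still unmet for moving current -> target."""
    key = (current, target)
    if key not in REQUIREMENTS:
        raise ValueError(f"transition not allowed: {current} -> {target}")
    return REQUIREMENTS[key] - completed


# Example: the Acme deal has a proposal out but no follow-up scheduled yet.
print(missing_requirements("Proposal", "Negotiation", {"proposal_sent"}))
```

An empty result means the update call can proceed; anything else is what Claude should report back to the user instead of changing the stage.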

Token Flow in Hybrid Approach

Session Start:
├── Skill metadata loaded: ~100 tokens
└── MCP tool definitions: ~400 tokens
    Total baseline: ~500 tokens

User asks about Acme deal:
├── Full SKILL.md loaded: ~3,000 tokens
├── MCP search call: ~200 tokens
├── MCP update call: ~200 tokens  
└── MCP log call: ~150 tokens
    Total for task: ~3,550 tokens

Next simple query (skill already loaded):
└── MCP call only: ~200 tokens
    Incremental cost: ~200 tokens

Real-World Examples

Example 1: Document Processing Pipeline

Approach: Skill with embedded scripts

---
name: contract-analyzer
description: Extract and analyze key terms from legal contracts
---

# Contract Analyzer

## Capabilities
- Extract party names, dates, terms
- Identify risk clauses
- Generate summary reports

## Workflow

1. **Extract Text**
   Run: `python extract_pdf.py {input_file}`
   Output: Raw text + structure metadata

2. **Analyze Terms**
   Run: `python analyze_contract.py {extracted_text}`
   Output: JSON with key terms

3. **Risk Assessment**
   Apply these rules:
   - Unlimited liability → HIGH RISK
   - No termination clause → MEDIUM RISK
   - Auto-renewal > 1 year → MEDIUM RISK
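The three risk rules are simple enough to express as code, for example inside analyze_contract.py. This is a sketch under assumptions: the `ContractTerms` fields are hypothetical names for whatever the extraction step produces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContractTerms:
    unlimited_liability: bool
    has_termination_clause: bool
    auto_renewal_years: Optional[float]  # None when there is no auto-renewal


def assess_risk(terms: ContractTerms) -> list[tuple[str, str]]:
    """Apply the skill's risk rules and return (level, reason) findings."""
    findings: list[tuple[str, str]] = []
    if terms.unlimited_liability:
        findings.append(("HIGH", "Unlimited liability"))
    if not terms.has_termination_clause:
        findings.append(("MEDIUM", "No termination clause"))
    if terms.auto_renewal_years is not None and terms.auto_renewal_years > 1:
        findings.append(("MEDIUM", "Auto-renewal > 1 year"))
    return findings


print(assess_risk(ContractTerms(True, True, 2)))
```

Keeping the rules in the Markdown is easier to edit; moving them into a script trades that for repeatable results and a shorter SKILL.md.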

Token Efficiency:

  • Script execution: 0 tokens (code never loaded)
  • Only outputs enter context: ~500 tokens per document
  • Full analysis: ~4,000 tokens total

Example 2: Multi-Service Integration

Approach: Skill + Multiple MCP Servers

---
name: standup-report-generator
description: Generate daily standup reports from multiple sources
---

# Standup Report Generator

## Data Sources (via MCP)
- GitHub: Pull requests, commits
- Jira: Ticket updates, status changes
- Slack: Team channel highlights

## Report Format
Generate markdown with:
1. Yesterday's completed items
2. Today's planned work
3. Blockers

## MCP Queries

### GitHub
`github_list_prs({ author: "@me", since: "yesterday" })`
`github_list_commits({ author: "@me", since: "yesterday" })`

### Jira
`jira_search({ assignee: "currentUser", updated: ">-1d" })`

### Slack
`slack_search({ query: "from:@me", after: "yesterday" })`

Token Efficiency:

  • Skill instructions: ~2,000 tokens (loaded once per session)
  • 4 MCP calls (the queries listed above, ~200 tokens each): ~800 tokens
  • Each later report in the same session: only MCP calls (~800 tokens)

Example 3: API Integration Without MCP

Approach: Pure Skill with curl/Python

---
name: weather-integration
description: Fetch weather data for travel planning
---

# Weather Integration

## API Details
- Provider: OpenWeatherMap
- Key: Use $WEATHER_API_KEY environment variable
- Base URL: https://api.openweathermap.org/data/2.5

## Usage

For current weather:

$ curl "https://api.openweathermap.org/data/2.5/weather?q={city}&appid=$WEATHER_API_KEY"

For 5-day forecast:

$ curl "https://api.openweathermap.org/data/2.5/forecast?q={city}&appid=$WEATHER_API_KEY"

## Response Handling
Parse JSON response and extract:
- main.temp: Convert from Kelvin to Celsius
- weather[0].description: Human-readable condition
- wind.speed: In meters/second

When to Use This Pattern:

  • No existing MCP server for the API
  • Simple REST calls
  • Quick integration without server maintenance
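The response handling described in the Weather Integration skill could be bundled as a small helper so Claude does not have to redo the unit conversion each time. A minimal sketch, assuming the current-weather response shape (temperature under `main.temp`, in Kelvin by default); `summarize_weather` and the sample payload are illustrative.

```python
def summarize_weather(payload: dict) -> dict:
    """Extract the three fields the skill asks for from a current-weather response."""
    kelvin = payload["main"]["temp"]
    return {
        "temp_c": round(kelvin - 273.15, 1),  # Kelvin -> Celsius
        "condition": payload["weather"][0]["description"],
        "wind_m_s": payload["wind"]["speed"],
    }


# Hand-written sample, trimmed to the fields used above.
sample = {
    "main": {"temp": 293.15},
    "weather": [{"description": "clear sky"}],
    "wind": {"speed": 3.6},
}
print(summarize_weather(sample))
```

Returning just these three values also keeps the rest of the JSON response out of the context window.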

Best Practices for Token Optimization

1. Structure Skills for Progressive Loading

# Main SKILL.md (~2k tokens)
Core instructions that apply to most requests.

For advanced features, see ADVANCED.md
For API reference, see API_REFERENCE.md

Separate rarely-used content into reference files.

2. Use Scripts for Heavy Lifting

Don't: Put 100 lines of data transformation logic in SKILL.md

Do: Create a script and reference it:

For data transformation, run:
$ python transform.py --input data.json

The script handles all edge cases and validation.

3. Batch MCP Calls When Possible

Inefficient:

await mcp.call("get_user", { id: 1 });
await mcp.call("get_user", { id: 2 });
await mcp.call("get_user", { id: 3 });
// 3 calls × ~200 tokens = 600 tokens

Efficient:

await mcp.call("get_users", { ids: [1, 2, 3] });
// 1 call × ~250 tokens = 250 tokens
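The arithmetic above generalizes: each call carries a fixed overhead, and each additional id in a batch adds only a little. The sketch below derives ~25 tokens per extra id from the 200 vs 250 figures in this section; that number, and both helper names, are assumptions for illustration.

```python
from typing import Iterable, Iterator


def chunked(ids: Iterable[int], size: int) -> Iterator[list[int]]:
    """Yield consecutive batches of at most `size` ids."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[int] = []
    for item in ids:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final partial batch
        yield batch


def estimated_tokens(n_ids: int, size: int,
                     per_call: int = 200, per_extra_id: int = 25) -> int:
    """Fixed charge per call plus a small charge for each additional id."""
    calls = -(-n_ids // size)  # ceiling division
    return calls * per_call + (n_ids - calls) * per_extra_id


print(list(chunked([1, 2, 3, 4, 5], 2)))
print(estimated_tokens(3, 1), estimated_tokens(3, 3))
```

Batching only helps if the server actually exposes a batch tool such as `get_users`, and if the combined response stays reasonably small.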

4. Keep MCP Tool Descriptions Concise

Every token in tool definitions is loaded every request:

Verbose (~300 tokens):

{
  "description": "This tool allows you to create a new task in the project management system. You can specify the task title, description, due date, priority level, assignee, project, tags, and custom fields. The task will be created and a confirmation with the task ID will be returned."
}

Concise (~80 tokens):

{
  "description": "Create a task. Returns task ID on success."
}

5. Choose the Right Tool for the Job

| Use Case | Recommendation | Why |
|---|---|---|
| Simple API call | MCP | Lower overhead, deterministic |
| Complex workflow | Skill | Natural language instructions |
| Data processing | Skill + Script | Code never enters context |
| Real-time data | MCP | Direct external access |
| Business rules | Skill | Easy to read/update |
| Frequent operations | MCP | Consistent execution |

Conclusion

Skills and MCP are complementary technologies, not competitors:

  • Skills provide flexible, context-rich instructions with excellent token efficiency for complex workflows
  • MCP provides reliable, structured tool access for external integrations
  • Combined, they create powerful AI agents that are both intelligent and efficient

The key to token optimization is understanding the loading patterns:

  1. Skills use progressive disclosure — only pay for what you use
  2. MCP definitions are always loaded — keep them concise
  3. Scripts are the ultimate token saver — code never enters context
  4. The hybrid approach often provides the best of both worlds

Start simple, measure your token usage, and evolve your architecture based on real-world patterns.



Last updated: December 2025
