You are staring at a Claude Code session, and you need it to do something it does not do out of the box. Maybe enforce your team's SQL conventions. Maybe query a production database. Maybe run code reviews on Haiku instead of burning Opus tokens on boilerplate checks.
Three options. Skills. Subagents. MCP servers. You pick one. You build it. A week later you realise you picked wrong.
I did this three times before the pattern clicked.
## The Expensive Lesson
First mistake: I built a full MCP server in Rust to serve code review prompts. Forty hours of work. The server accepted tool calls, formatted review checklists, returned structured responses. Proper error handling, logging, the lot. I even wrote integration tests.
Then someone on the team dropped a markdown file in `.claude/skills/review/SKILL.md` with the same checklist. Ten minutes. Same result. I had confused "instructions Claude should follow" with "external systems Claude should access."
The MCP server was technically impressive and completely unnecessary. It ran as a separate process, maintained a connection, and handled JSON-RPC messages, all to serve what was essentially static text. A markdown file does that without any of the ceremony.
Second mistake: I created an elaborate skill for database analysis. The skill instructed Claude to "check the production metrics table and identify anomalies." Claude tried. It hallucinated table schemas. It could not actually query anything through a markdown prompt. No amount of clever prompt engineering changes the fact that Claude cannot reach through a skill to touch an external system. That required an MCP server with a real database connection.
Third mistake: I used the main Claude session for everything (code review, debugging, deployment checks) all in one context window at Opus pricing. A code review that Haiku could handle in three seconds was running on Opus with accumulated context from an hour of debugging. The cost was absurd, and the reviews were actually worse because of the polluted context. What I actually needed was subagents: a review subagent running on Haiku with read-only access, a debug subagent on Opus with full tool access. Scoped work at scoped cost.
Three mechanisms. Three different problems. Zero overlap once you see it.
## The Decision Tree
Here is the framework I use now. It has not failed me once across 8 production plugins:
Is the capability about connecting to something external? → It depends. If the external system has a CLI and you only need local, single-agent access, the Bash tool is simpler. MCP servers are necessary when you need remote execution, permission scoping, persistent connections, or stateful operations that a one-shot CLI call cannot handle. Databases, long-lived API sessions, deployment pipelines with authentication: those need MCP servers. A quick curl or psql command does not.
Is the capability about changing how Claude behaves for a specific task? → Subagent. Different model, restricted tools, isolated context, specialised system prompt. If you want Claude to work differently for certain jobs, that is a subagent.
Is the capability about reusable knowledge, conventions, or workflows? → Skill. Team standards, review checklists, deployment procedures, coding conventions. If Claude just needs to follow instructions, that is a skill.
The question that trips people up: "I want Claude to review code against our standards AND check the CI pipeline." That is composition. A subagent running on Haiku with a review skill loaded and access to a CI MCP server. All three mechanisms, working together. More on that later.
## Skills in Practice
Skills live as markdown files in `.claude/skills/`. No code. No compilation. No protocol knowledge. You write what you want Claude to do, and it does it.
```
.claude/skills/
  review/SKILL.md        → /review
  migration/SKILL.md     → /migration
  deploy-check/SKILL.md  → /deploy-check
  api-design/SKILL.md    → /api-design
```
Here is a complete production skill from our codebase. This is the entire file, not a snippet:
```markdown
# Code Review Standards

Review the code changes in the specified files against the following criteria.

## Security

- Check for SQL injection vulnerabilities in any database queries
- Verify that user input is validated before processing
- Ensure authentication checks exist on protected routes
- Flag any hardcoded secrets or credentials

## Error Handling

- Every external API call must have error handling
- Database operations must handle connection failures
- File operations must handle missing files gracefully
- Never swallow errors silently — log or propagate

## Style

- British English in all user-facing strings
- Consistent use of snake_case in Rust code
- No TODO comments without a linked issue number
- Functions over 40 lines should be flagged for potential extraction

## Output Format

Provide fixes as code blocks, not just descriptions. Group findings by severity:

1. **Blocking** — must fix before merge
2. **Warning** — should fix, not urgent
3. **Suggestion** — optional improvement

Focus on: $ARGUMENTS
```
Type `/review src/handlers/auth.rs` and Claude applies your team's exact standards. The `$ARGUMENTS` variable captures everything after the slash command, so you can direct the review to specific files or concerns. Every developer, every time, same checklist.
Skills also support dynamic context injection. You can embed shell commands that run when the skill is invoked:
```markdown
# Migration Review

Current database schema:

$(cat db/schema.sql)

Review the proposed migration for:

- Backwards compatibility with the current schema above
- Index coverage for new columns
- Data type consistency with existing tables

Migration to review: $ARGUMENTS
```
The `$(cat db/schema.sql)` executes at invocation time, injecting your actual current schema into the prompt. Claude reviews the migration against real schema state, not hallucinated table structures. This is the critical difference from my second mistake: a skill can inject local file content, but it cannot connect to external services.
The overhead is near zero. Creating a skill takes minutes. Modifying it takes seconds. They are version controlled with your project, so when conventions change, you update one file and everyone gets the update on their next pull.
For teams with non-developers, skills are particularly powerful. Anyone can write markdown. A QA engineer, a product manager, a technical writer can all create skills that encode their expertise. We covered this in depth in our guide on skills for non-technical teams.
## Subagents in Practice
Subagents are the most debated of the three mechanisms. They are specialised AI agents that run inside Claude Code with their own system prompt, tool restrictions, model selection, and context window.
They live as markdown files in `.claude/agents/`. Here is a full subagent configuration for a code reviewer:
```markdown
---
name: code-reviewer
description: "Reviews code for quality, security, and style"
model: haiku
tools: Read, Grep, Glob
disallowedTools: Write, Edit
mcpServers:
  - github
---

You are a code review specialist. You have read-only access to the codebase
and the GitHub API via MCP.

When asked to review code:

1. Read all changed files using the Read tool
2. Check each file against the review criteria below
3. Use Grep to check for known anti-patterns across the codebase
4. Report findings grouped by severity

Review criteria:

- No unwrap() calls in production code paths
- All public functions have documentation comments
- Error types implement std::fmt::Display
- No println! in library code (use tracing instead)
- Integration tests exist for new API endpoints

Format your review as a markdown checklist with pass/fail for each criterion.
```
That `model: haiku` line is doing serious work. A code review subagent running on Haiku costs a fraction of what a full Opus session costs. It is fast, focused, and follows an exact checklist defined in the system prompt.
The `disallowedTools` field enforces read-only access. The subagent literally cannot modify files. It reads, analyses, and reports. This is a safety boundary, not a suggestion. Claude will not attempt to use disallowed tools even if the system prompt contradicts the restriction.
And the `mcpServers` field scopes which external tools the subagent can access. The code reviewer above can access GitHub (to read PR diffs, comments, CI status) but nothing else. No database access. No deployment pipeline access.
Here is a more complex subagent for database work:
```markdown
---
name: database-analyst
description: Analyses database performance and schema issues
model: sonnet
tools: Read, Grep, Bash
disallowedTools: Write, Edit
mcpServers:
  - postgresql
skills:
  - sql-standards
maxTurns: 25
---

You are a database performance specialist with read-only access to a
PostgreSQL database via MCP.

Your workflow:

1. Examine the schema using the postgresql MCP server
2. Identify missing indexes by analysing query patterns in the codebase
3. Check for N+1 query patterns in ORM usage
4. Review migration files for backwards compatibility
5. Suggest EXPLAIN ANALYZE commands for suspicious queries

Follow the sql-standards skill for naming conventions and query patterns.
Never suggest DROP operations. Always provide migration-safe alternatives
(CREATE INDEX CONCURRENTLY, ALTER TABLE with defaults, etc).
```
Notice the `skills` field. This subagent loads the `sql-standards` skill automatically, so every analysis follows your team's naming conventions and query patterns. The `maxTurns: 25` cap prevents the subagent from running indefinitely on complex analyses; it forces concise reporting.
The `model: sonnet` puts this subagent on the middle tier. It needs more reasoning than Haiku provides (query plan analysis is genuinely complex) but doesn't need Opus-level capability. That per-subagent model selection is the core cost lever.
## MCP Servers in Practice
MCP servers are the bridge between Claude and external systems. If the capability already exists as an API or CLI tool, an MCP server wraps it for Claude's use.
The configuration lives in your project's `.claude/settings.json` or `.mcp.json`:
```json
{
  "mcpServers": {
    "analytics-db": {
      "command": "node",
      "args": ["./mcp-servers/analytics.js"],
      "env": {
        "DATABASE_URL": "postgresql://localhost:5432/analytics",
        "MAX_ROWS": "1000",
        "TIMEOUT_MS": "5000"
      }
    },
    "deploy-pipeline": {
      "command": "./mcp-servers/deploy",
      "args": ["--read-only"],
      "env": {
        "DEPLOY_API_KEY": "${DEPLOY_API_KEY}",
        "ENVIRONMENT": "staging"
      }
    }
  }
}
```
Each MCP server runs as its own process. The `analytics-db` server above starts a Node.js process that connects to PostgreSQL and exposes tools like `query_metrics`, `list_tables`, and `explain_query`. The `deploy-pipeline` server is a compiled binary that wraps our deployment API.
Note the environment variables. `DATABASE_URL` is hardcoded for local development, but `DEPLOY_API_KEY` uses `${DEPLOY_API_KEY}` to pull from your shell environment. This keeps secrets out of version control while allowing the server configuration to be shared.
The `MAX_ROWS` and `TIMEOUT_MS` on the analytics server are custom environment variables that the server reads to prevent runaway queries. This is important: an MCP server should have its own safety boundaries. Claude might ask for "all rows in the events table", and your server should say no.
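Inside the server, that boundary is a hard cap applied regardless of what the tool call asks for. A minimal sketch of the idea (function names are illustrative, not taken from our actual analytics server):

```javascript
// Enforce the MAX_ROWS ceiling from the server's env block, no matter
// how many rows the tool call requests. Names here are illustrative.
const MAX_ROWS = Number.parseInt(process.env.MAX_ROWS ?? "1000", 10);

function clampLimit(requested) {
  // Missing, zero, or negative limits mean "everything": cap those too.
  const n = Number.isInteger(requested) && requested > 0 ? requested : Infinity;
  return Math.min(n, MAX_ROWS);
}

function buildQuery(table, requestedLimit) {
  // Always emit an explicit LIMIT so no query can escape the cap.
  // (A real server would also validate `table` against an allow-list
  // rather than interpolating it directly.)
  return `SELECT * FROM ${table} LIMIT ${clampLimit(requestedLimit)}`;
}
```

With the default ceiling of 1000, `buildQuery("events", 1_000_000_000)` comes back capped: the model can ask for everything, but the server never hands it over.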
We built MCP servers for our internal systems: the analytics database, the deployment pipeline, the content management system. Each one written in the language that made sense for its domain. The full process of building an MCP server in Rust is worth the investment for performance-critical integrations. For simpler use cases, a 50-line Node.js server works fine.
MCP servers can run locally over stdio or remotely over HTTP with SSE. The key insight is that you need them only when Claude has to reach outside its session. If Claude does not need external data or side effects, you do not need an MCP server. I have to remind myself of this regularly. The temptation to build an MCP server for everything is real.
One distinction worth calling out: not every external integration needs an MCP server. CLIs invoked through the Bash tool are perfectly good for local, single-agent workflows. If you can get what you need from `psql`, `curl`, `aws`, or `gh` in a single command, a Bash call is simpler and has zero setup overhead. The dividing line is the one from the decision tree: a database connection pool that enforces row limits and query timeouts is an MCP server; running `gh pr list` is a Bash call.
For securing MCP servers in production, especially when they handle database connections or API keys, our guide on MCP server authentication and security covers the essentials.
## Composing All Three
The real power is composition, and this is where understanding all three mechanisms together becomes essential.
In our production setup across 8 marketplace plugins:
**Skills** encode team conventions. `/review` for code review standards. `/migration` for database migration patterns. `/deploy-check` for pre-deployment verification. `/api-design` for REST endpoint conventions. Anyone can create or modify these, since they are just markdown files.
**Subagents** handle specialised tasks. A review subagent on Haiku for fast, cheap code checks. A debug subagent on Opus for deep analysis. A documentation subagent with read-only access. Each runs in its own context at the right price point.
**MCP servers** connect to external systems. The analytics database for querying metrics. The deployment pipeline for shipping code. GitHub for PR management. The content API for publishing.
And they nest. Here is our deployment-checker subagent, which uses all three layers:
```markdown
---
name: deploy-checker
description: Pre-deployment verification across all systems
model: sonnet
tools: Read, Grep, Glob, Bash
disallowedTools: Write, Edit
mcpServers:
  - deploy-pipeline
  - analytics-db
skills:
  - deploy-check
  - sql-standards
maxTurns: 30
---

You are a deployment readiness checker. Before any deployment, verify:

1. Run the /deploy-check skill criteria against the current branch
2. Query the analytics-db for error rate trends over the past 24 hours
3. Check the deploy-pipeline for any pending or failed deployments
4. Review recent migration files against sql-standards
5. Verify all tests pass (use Bash to run the test suite)

Report a GO/NO-GO decision with supporting evidence for each check.
If any check fails, explain specifically what needs to be resolved.
```
That single subagent composes two MCP servers (for real data from the deployment pipeline and analytics database), two skills (for team conventions around deployment and SQL), tool restrictions (no write or edit access), and model routing (Sonnet for the balance of capability and cost). One invocation, one clean context, multiple data sources, governed by team conventions.
The layered architecture looks like this:
```
Layer 3: Skills (Team Knowledge)
  /review, /migration, /deploy-check, /sql-standards, /api-design

Layer 2: Subagents (Specialised Behaviour)
  code-reviewer     (Haiku,  read-only, GitHub MCP)
  debugger          (Opus,   full access, all MCP servers)
  database-analyst  (Sonnet, PostgreSQL MCP, sql-standards skill)
  deploy-checker    (Sonnet, deploy + analytics MCP, deploy-check + sql skills)
  doc-writer        (Haiku,  read-only, no MCP)

Layer 1: MCP Servers (External Connections)
  analytics-db, deploy-pipeline, github, content-api, postgresql
```
For the full picture of how plugins, MCP servers, and skills work together as a layered architecture, and for guidance on building custom Claude agents, we have deeper guides that walk through the complete setup.
## The Mistake Detector
Before you build anything, run through this:
"I want Claude to follow our team's coding standards" → Skill. Write a markdown file with the standards. Done.
"I want Claude to query our production database" → MCP server. You need code that makes a real database connection.
"I want cheap, fast code reviews that cannot modify files" → Subagent. Set model to Haiku, disallow Write and Edit tools.
"I want Claude to access GitHub and follow our PR template" → Subagent + MCP server + skill. The GitHub MCP server for access, a PR-template skill for conventions, and a subagent to scope the work.
"I want Claude to enforce our SQL naming conventions when reviewing migrations" → Skill. The conventions are just text. Claude does not need external access to check that a column is named in snake_case.
"I want Claude to check if a migration will lock a table in production" → MCP server. Claude needs to actually run EXPLAIN against the real database to determine lock behaviour. A skill cannot do that.
"I want Claude to list open pull requests" → CLI via Bash. Running `gh pr list` is a single command with no need for persistent connections or permission scoping. An MCP server would be overengineering.
"I want junior developers to get the same quality of code review as senior developers" → Skill + subagent. The skill encodes senior-level review criteria. The subagent ensures it runs on every review with consistent model and tool access, regardless of who invokes it.
If you are reaching for an MCP server to serve static prompts, stop. That is a skill.
If you are writing a skill that says "query the database," stop. That is an MCP server.
If you are running expensive Opus sessions for routine checks, stop. That is a subagent on Haiku.
And if you're doing all three simultaneously, you've probably been where I was six months ago. The good news: once you see the triangle, you never confuse the three again.
## Further Reading
- Building MCP Servers in Rust, the full process from protocol to production
- Claude Code Hooks and Workflows, automate around skills and agents with hooks
- Publishing Plugins to the Marketplace, package your skills, agents, and MCP servers for distribution
Originally published on systemprompt.io.