DEV Community

Cover image for Skill, MCP, Plugin, or just a CLI: how I pick a Claude Code extension, lightest first
Rapls
Rapls

Posted on • Originally published at zenn.dev

Skill, MCP, Plugin, or just a CLI: how I pick a Claude Code extension, lightest first

I was building a plugin release with Claude Code, and the changelog draft came together nicely. Pull git log from the last tag to now, drop it under == Changelog ==. That's a procedure, so it just worked.

The next step is where I tripped. I wanted to add the current WordPress.org active install count to the release post, so I added a line to the same procedure file: "fetch the stats and write them in." It didn't work. Of course it didn't. A file that holds a procedure teaches Claude the steps, but it has no legs to go out and fetch today's numbers from a website. To go get them, you need a mouth that talks to the outside. That was a different tool's job.

Claude Code has several ways to add capability: Skill, MCP, Plugin. The names sound alike and the explanations blur together, and a CLI is in the mix too. I build WordPress plugins and use coding agents most days, and I still hesitated every time about which one to reach for. This post is how I draw the line, settled into one rule: reach for the lightest thing first.

A quick note on setup

These four move fast. Treat the list below as true when I checked it, and verify on your own machine with /context, /mcp, and /plugins.

  • Claude Code: the value from claude --version (swap in your real number)
  • Slash commands are now folded into Skills. If you read this expecting "commands" to be a separate thing, it won't line up.
  • My targets are self-built WordPress plugins and themes.

First: it's nested, not a flat choice of three

Laid out in a comparison table, these three confuse people, because you end up comparing things with different jobs on the same axis. They're actually nested.

A Plugin is a distribution container. Inside it go Skills, MCP configs, slash commands, subagents, and hooks. A Skill is a bundle of procedure or knowledge, and from inside it you can call an MCP tool or a CLI. Slash commands now live inside Skills. Think recipe card (Skill), the plumbing that connects your kitchen to the outside market (MCP), and the whole stocked kitchen that packages it all (Plugin). Nobody argues about whether a recipe card is better than plumbing. They do different jobs. Same here: you don't compare, you ask which layer your problem lives in.

Hold that nesting in your head and my opening mistake explains itself in one line: I tried to do an MCP's job (fetch outside numbers) with a Skill (teach a procedure). I was confusing layers.

The rule: reach for the lightest first

Here's the whole decision. Apply it top to bottom.

  1. Just teaching a procedure or knowledge? Skill.
  2. Need live outside data? Is there a CLI for it? If yes and you use it rarely, call the CLI from a Skill.
  3. No CLI, or you use it deeply every day? MCP.
  4. Want to bundle and reuse or share all of the above? Plugin.

I say "lightest first" because the cost in context and effort grows as you go down. Most of what you want is item 1. But the shiny new tool pulls your eye, and you start thinking from MCP or Plugin. That was me. Ask whether a Skill is enough, then whether a CLI reaches it, and only then reach for the heavy tools.

A few recent calls, run through this. Standardize commit message style: a procedure, so Skill. Peek at the staging post count: outside, but wp-cli handles it and I use it rarely, so call the CLI from a Skill. Operate the production dashboard deeply, every day: a CLI isn't enough and the frequency is high, so MCP. Ship a shared lint config and release steps across plugins: distribution, so Plugin.

The rest of this is each layer in that order.

Skill: procedure and knowledge

A folder with a SKILL.md: YAML frontmatter on top, Markdown body below. The description sits ready, lightly, and the body opens only when a related task comes up.

---
name: release-build
description: "Release prep for a plugin. Version bumps, changelog entry, build the distribution zip."
disable-model-invocation: true
allowed-tools:
  - Read
  - Edit
  - Bash(git log *)
  - Bash(unzip -l *)
---
Enter fullscreen mode Exit fullscreen mode

One thing people misread is allowed-tools. It does not restrict which tools can run; it pre-approves the listed ones to run without a confirmation prompt. Tools not on the list can still be called, subject to your normal permission settings. So something you truly want blocked isn't stopped by leaving it off this list. The other is disable-model-invocation: true, which I use for side-effecting work like a release: I'd rather invoke /release-build myself than have Claude helpfully start it.

The key point: a Skill reaches local files and commands Claude Code can run. Fetching a website's current numbers, anything live, won't happen no matter how you word the steps. That's where I tripped. A Skill memorized the steps; it has no legs to go outside.

MCP: live data and real systems

MCP (Model Context Protocol) connects Claude to outside tools and data. You declare servers in config, for example .mcp.json:

{
  "mcpServers": {
    "github": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"] }
  }
}
Enter fullscreen mode Exit fullscreen mode

WordPress.org stats, a staging database, GitHub issues. When you want the value of the moment from outside, that's MCP. Active install counts change daily, so yesterday's number baked into a Skill is wrong tomorrow. Fixed things you can write down go in a Skill; things that change each time need MCP. /mcp lists connected servers and lets you disconnect them.

But MCP is heavy, and not by a little. I'll get to the numbers below, but it's enough weight that "connect everything that looks useful" is a bad default.

Call a CLI from a Skill: think before reaching for MCP

When I want outside data, I don't jump straight to MCP. First I check whether a CLI already does it. GitHub has gh. WordPress has wp-cli. If the CLI exists, calling it from a Skill's steps via Bash is usually lighter.

wp post list --post_type=post --post_status=publish --format=count
wp option get tmfs_settings --format=json
Enter fullscreen mode Exit fullscreen mode

The reason is how context loads. An MCP server loads its full tool schema the moment you connect, and carries it every turn. A CLI called through Bash loads nothing at startup; you pay only when the command runs. For something you touch occasionally, that difference is large. One reported benchmark put the GitHub CLI at 4 to 32 times cheaper per operation than the GitHub MCP. A daily, deep integration may justify MCP's structured results, but holding a full schema in context for a tool you touch twice a month is a bad trade.

Plugin: bundle, reuse, share

A Plugin packages Skills, MCP config, commands, subagents, and hooks into one distributable unit with a plugin.json.

my-plugin/
├── .claude-plugin/plugin.json
├── skills/
└── .mcp.json
Enter fullscreen mode Exit fullscreen mode

Install with /install from a marketplace, manage with /plugins. If a release procedure is common across several plugins, bundle the release-build Skill and its build script into a Plugin and pull it into a new repo in one shot, instead of copy-pasting files around. A Plugin doesn't solve the content; the content is Skills and MCP. It solves distribution only. Decide the content first, wrap it in a Plugin when you want to ship it. Starting from "let's make a Plugin" usually stalls.

One thing I watch when installing someone else's Plugin: a Plugin can carry MCP configs and hooks, and those run code on my machine. The official marketplace is curated; community ones vary. I check what MCP and hooks are inside before installing.

The context cost, with real numbers

"Lightest first" is grounded in a real weight difference. Numbers vary by version and setup, so measure your own, but the reported figures are stark.

A session starts heavy before you type anything: system prompt, CLAUDE.md, memory, the tool schemas of connected MCP servers, and the names and descriptions of your Skills, all loaded at startup. Reported sessions begin around 20k to 30k tokens with nothing typed.

MCP adds the most. A connected server loads its whole schema regardless of use, and it rides every turn, even in a session that never touches it. A publicly shared /context breakdown looked like this on a 200k window:

System prompt    3.2k   (1.6%)
System tools    16.1k   (8.0%)
MCP tools       98.7k   (49.3%)   <- tool definitions of connected servers
Memory files     3.x k
Enter fullscreen mode Exit fullscreen mode

Nearly half the window on MCP tool definitions. Another report had 7 servers eating 67,300 tokens (about 34%) before any conversation. Per server: the official GitHub MCP runs about 18k for its 27 tools even when you never touch GitHub, and a fuller build hits roughly 55k across 93 tools. In one /doctor dump, a 20-tool server cost about 14.1k, Playwright (21 tools) about 13.6k, a SQLite server (19 tools) about 13.4k, roughly 700 tokens per tool. And it reloads every turn: at 15k of overhead over a 20-turn session, that's 300k tokens spent on tool definitions alone, whether or not you used them.

Skills are the opposite. At startup only the names and descriptions load, tens of tokens each, with the body opening on demand. You can keep many Skills and barely move the startup cost. Same "add capability," completely different weight. That's why lightest first holds up.

To measure and cut: run /context and read the MCP Tools section, then /mcp to disconnect what you're not using. Claude Code's tool search defers schema loading until a tool is needed; one measurement dropped main-thread usage from 51k to 8.5k, about a 47% cut. Allow-listing only the tools you use can cut a 50-tool server to 5 for roughly a tenth of the cost.

One caution: /context historically overstated MCP usage by counting shared overhead per tool, showing nearly 3x the real figure, which was corrected around late January 2026. Discount old numbers and old posts, and measure on your own version.

Common mix-ups

  • Making a fixed procedure or template into an MCP server. That's a Skill. You carry the weight and the capability is no better, and you burn startup tokens for nothing.
  • Standing up an MCP for something gh or wp-cli already do. The startup load grows and the capability doesn't.
  • Cramming long task procedures into CLAUDE.md and bloating it. CLAUDE.md is always loaded; per-task steps belong in a Skill that opens on demand.
  • Treating slash commands as separate. They're folded into Skills now.
  • Standing up a Plugin or MCP for a one-off. A prompt in the moment is enough. Build machinery only for what you know you'll repeat.

Skill lives in the repo, MCP lives on the machine

Sharing differs too, and that affects the choice. A Skill in .claude/skills/ is a file in git: committed, reviewed, traveling with the code. MCP connection config tends to hold secrets like API keys, so it doesn't go in git and stays tied to a machine or environment, different per person. Keep keys in environment variables rather than inline in .mcp.json. A Plugin is the explicit way to package and ship even the machine-bound parts. What you want to share, and with whom, is another axis for choosing.

A note to my next self

When in doubt: procedure, data, or distribution. Procedure is a Skill; outside data is a CLI first, then MCP; bundling to share is a Plugin. Apply lightest first, and stop at Skill if Skill is enough. Don't start from the heavy tool. That alone prevented most of my mix-ups, including trying to fetch live numbers with a procedure file.

How much execution you let an agent do, and how you stop the irreversible commands, is a separate question I wrote up elsewhere. Bundling my release setup into a Plugin is still on my list. When I do it, I'll write that up too. Leaving this map here so future-me doesn't freeze in front of the three again.

References

Token figures above are public measurements from other setups. Measure your own with /context.


Originally written in Japanese on Zenn. I build WordPress plugins.

Top comments (1)

Collapse
 
xulingfeng profile image
xulingfeng

The "lightest first" rule hit home. We've been running into the exact same trap — reaching for MCP every time something needs to reach outside, when a CLI call from a Skill would do the job at a fraction of the context cost. That 49% MCP tool definitions stat is brutal but real.

The nesting model (Skill → CLI → MCP → Plugin) is the clearest explanation I've seen. The way you framed it — "recipe card vs plumbing vs stocked kitchen" — makes it stick. How do you handle the case where a CLI exists but its output needs structured parsing before it's useful? Do you pipe it through a small script in the Skill, or does that push it into MCP territory?