What are skills:
Skills are folders of instructions and scripts that Claude loads to do specialized tasks. The point of skills is to make Claude follow guidelines and be more deterministic.
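Concretely, a skill is usually a folder with a SKILL.md file: YAML frontmatter at the top (a name and description), detailed instructions below, plus optional scripts alongside. A minimal hypothetical layout (the body text here is illustrative):

```markdown
---
name: example-skill
description: Demonstrates the basic structure of a Claude Skill. Use this as a template when creating new skills.
---

# Example Skill

Step-by-step instructions Claude follows once this skill is loaded,
plus references to any helper scripts shipped in the same folder.
```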
Benefits.
- More deterministic
- Easy to edit, share, and set up.
- Efficient context use.*
Make AI more deterministic.
This is one of the biggest issues with LLMs right now: they produce a lot of code, and they can and will come up with two different approaches and answers to the same prompt. With skills, we add guardrails, instructions, and scripts that tell the LLM exactly what to do. For example, the playwright-skill I downloaded has a helpers.js file with helper functions like launchBrowser and safeClick, so Claude can call those functions and the result will always be the same.
* This was also a benefit that MCPs had.
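The value of helpers like these is that they pin down one canonical way to perform an operation. A safeClick-style wrapper might look like the following (a hypothetical sketch in Python with a stub page object, not the actual helpers.js code):

```python
import time

def safe_click(page, selector, retries=3, delay=0.5):
    """Hypothetical safeClick-style helper: wait for the element and
    retry on transient failures, so every run behaves the same way."""
    for attempt in range(retries):
        try:
            page.wait_for_selector(selector)
            page.click(selector)
            return True
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# Stub page object so the sketch runs without a real browser.
class StubPage:
    def __init__(self):
        self.clicked = []
    def wait_for_selector(self, selector):
        pass  # a real implementation would block until the element appears
    def click(self, selector):
        self.clicked.append(selector)

page = StubPage()
safe_click(page, "#submit")
print(page.clicked)  # ['#submit']
```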
Easy to edit.
Since Skills are just .md files and scripts, they are really easy to edit and share if necessary. The downside is that the files can be hard to maintain if they don't have a single source of truth.
Context
Understanding Claude context
Running /context in Claude Code (and in most other CLIs) gives you a breakdown of how context is used:
The context is divided into:
- System prompt: The instructions set that defines Claude's identity, behavior, and core operational guidelines
  - Skills are loaded here.
- System tools: Built-in capabilities available in every conversation (file operations, execution, web access, agents).
- MCP tools: MCP tools and descriptions of each tool.
- Memory files: Project-specific context files (like CLAUDE.md) that provide persistent knowledge about your projects and preferences.
- Messages: The actual conversation history between you and Claude in the current session.
- Free space: Available context window capacity for additional content.
- Autocompact buffer: Reserved space that triggers automatic context compression when filled to prevent overflow.
Skills context
Claude's path when using skills works like this:
At the start of a session, Claude scans all available skill files and reads a short explanation of each one from the YAML frontmatter of its Markdown file. This is very token efficient: each skill only takes up a few dozen extra tokens, and the full details are loaded only when the user requests a task the skill can help with.
For example, at the very beginning of each session, Claude generates a system context in which skills are listed in a section like this:
<available_skills>
<skill>
<name>analyzing-financial-statements</name>
<description>This skill calculates key financial ratios and metrics from financial statement data for investment analysis
(project)</description>
<location>managed</location>
</skill>
<skill>
<name>example-skill</name>
<description>Demonstrates the basic structure of a Claude Skill. Use this as a template when creating new skills. (project)</description>
<location>managed</location>
</skill>
<skill>
<name>playwright-skill</name>
<description>Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms,
take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test
websites, automate browser interactions, validate web functionality, or perform any browser-based testing. (project)</description>
<location>managed</location>
</skill>
</available_skills>
You can see an example of the full system prompt in the leaked prompts at asgeirtj's repo.
All my skills together take approximately 200-250 tokens (including XML tags, descriptions, and metadata). The descriptions alone (just the text content) would be 35-40 tokens total.
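A rough way to sanity-check these numbers (a sketch using the common ~4 characters-per-token heuristic, not a real tokenizer):

```python
def estimate_tokens(text):
    # Very rough heuristic: English text averages ~4 characters per token.
    return max(1, len(text) // 4)

# Description text taken from the example-skill entry above.
description = (
    "Demonstrates the basic structure of a Claude Skill. "
    "Use this as a template when creating new skills. (project)"
)
# The same description wrapped in its XML envelope, as it appears in context.
entry = (
    "<skill>\n<name>example-skill</name>\n"
    f"<description>{description}</description>\n"
    "<location>managed</location>\n</skill>"
)
print(estimate_tokens(description), estimate_tokens(entry))
```

The XML wrapper roughly doubles the cost of each entry, which is why the full list lands around 200-250 tokens while the bare descriptions are only 35-40.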
Claude only loads the YAML-like description and metadata at the top of the document, and from there it decides whether it's worth reading more (Progressive Disclosure).
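That frontmatter-only scan can be sketched in a few lines (a toy sketch: it assumes flat `key: value` frontmatter rather than using a real YAML parser):

```python
import io

def read_frontmatter(text):
    """Read only the YAML frontmatter at the top of a SKILL.md,
    without touching the (potentially large) body below it."""
    lines = io.StringIO(text)
    if lines.readline().strip() != "---":
        return {}
    meta = {}
    for line in lines:
        if line.strip() == "---":
            break  # stop at the closing fence; never read the body
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

skill_md = """---
name: example-skill
description: Demonstrates the basic structure of a Claude Skill.
---

# Example Skill
(thousands of tokens of detailed instructions live down here)
"""

meta = read_frontmatter(skill_md)
print(meta["name"])  # example-skill
```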
NOTE: Skills depend on a coding environment
One thing to note is that the skills mechanism is entirely dependent on the model having access to a filesystem, tools to navigate it, and the ability to execute commands in that environment. The filesystem is your computer, and it can change things in YOUR computer. So you should always verify whether your functions or commands have side effects that could affect your system.
This can also be beneficial because you are not dealing with external dependencies, but it brings back the "it works on my machine" problem.
Skills vs MCP
I mentioned that skills are not FULLY loaded upfront; they load the basic description, and then the agent decides whether it's worth reading more. MCPs used to be designed (and some still are) in a way that all of their tools were loaded up front, consuming a lot of context. But newer MCPs are improving their architecture, adopting the same patterns as Skills with separate code execution, which brings their context size down to similar levels as Skills.
Note: Technically, you can use scripts in skills and avoid MCPs. (And vice-versa)
You can use both together: MCP connections give Claude access to tools, while Skills teach Claude how to use those tools effectively.
MCP connects Claude to external services and data sources. Skills provide instructions for how to complete specific workflows.
Skills PROS:
- Skills are simple: The core simplicity of the skills design is why I'm so excited about it. MCP is a whole protocol specification, covering hosts, clients, servers, resources, prompts, tools, sampling, roots, elicitation, and three different transports (stdio, streamable HTTP, and originally SSE). Skills are Markdown with a tiny bit of YAML metadata and some optional scripts in whatever you can make executable in the environment.
MCP PROS:
- Can connect to external applications: This is the biggest pro in my opinion; If you want an API-like interface that can connect to other applications, this is the way to go, especially since there are many officially supported MCPs from big organizations like Linear/Slack/Stripe. You can also benefit from MCP servers that run the code in the owner's cloud instead of locally.
Things to consider that I didn't delve into:
- Reliability, security/isolation, cross-device availability, latency, required runtime permissions, collaboration features, etc...
Playwright MCP vs Skill example.
DISCLAIMER: They have different maintainers, so some features in the MCP are not in the Skill, and vice versa.
The Playwright skill uses around 100 tokens in context, while the MCP uses ONE HUNDRED times more (10,000 tokens). The main difference is that the Skill has much of the functionality the MCP has, but it gets loaded ONLY when necessary, while the MCP is always in memory.
Here is a breakdown of the approximate token use of the Playwright MCP:
Based on the JSON schema definitions, here's a rough estimate:
Simple tools (minimal parameters):
- browser_close, browser_navigate_back, browser_install, browser_console_messages
- ~150-250 tokens each
Medium tools (moderate parameters):
- browser_resize, browser_navigate, browser_press_key, browser_take_screenshot, browser_wait_for, browser_tabs
- ~300-500 tokens each
Complex tools (extensive parameters with nested schemas):
- browser_evaluate, browser_fill_form, browser_click, browser_type, browser_drag, browser_hover, browser_select_option, browser_run_code
- ~500-800 tokens each
Rough total estimate: 8,000-10,000 tokens for all 22 Playwright MCP tools, including their full JSON schema definitions, descriptions, parameter specifications, and metadata.
I will say this again: 10,000 tokens in context for something we might not use is not worth it. I have personally disabled the Playwright MCP, but I had to change the Skill a bit to fit my work better (which is way easier to do with Skills, since I have the files right there in my codebase).
Semi-outdated MCP criticisms (in my opinion):
I saw this quote by Simon Willison, but I think the main point about spending tokens can be mitigated by good MCP design:
* "My own interest in MCPs has waned ever since I started taking coding agents seriously. Almost everything I might achieve with an MCP can be handled by a CLI tool instead. LLMs know how to call cli-tool --help, which means you don't have to spend many tokens describing how to use them—the model can figure it out later when it needs to."
* My comment about this is that there is a new trend of MCPs that do the same as Skills: they display a short description and then point to relevant files and scripts to be used. But note: for most tasks, it's probably still better to use Skills.
- https://www.anthropic.com/engineering/code-execution-with-mcp
What about RAG?
RAG is a concept that was very hyped before, and it's still relevant. RAG is a middle step where the agent searches and retrieves external data with whatever tools are available to it - grep, vector database, or any relevancy algorithm.
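The retrieval step can be sketched as scoring a small document store against the query (a toy word-overlap sketch; real systems use embeddings, BM25, or grep over a codebase):

```python
def retrieve(query, documents, k=2):
    """Score documents by word overlap with the query, return the top k."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Skills are folders of instructions and scripts.",
    "MCP connects Claude to external services.",
    "RAG retrieves external data before answering.",
]
# The retrieved snippet would be stuffed into the prompt before answering.
top = retrieve("how does claude connect to external services", docs, k=1)
print(top[0])
```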
It can be similar to Skills or MCPs in the way that it's a tool that an LLM can use to improve its answer.
What about Claude commands?
Custom slash commands allow you to define frequently-used prompts as Markdown files that Claude Code can execute. There are plenty of pre-loaded commands in Claude Code like /context, /mcp, /github, etc.
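For example, a custom command is just a Markdown file in your project's .claude/commands folder; a hypothetical .claude/commands/review.md (the filename and content are illustrative) would be invoked as /review:

```markdown
Review the staged changes for bugs, missing error handling,
and deviations from the project's style guide. Summarize your
findings as a bullet list ordered by severity.
```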
They behave almost the same as skills in that they are a script or prompt that the LLM runs when you tell it to.
It seems that commands haven't gained a lot of traction, and I think that's because they tend to be focused on small changes. I think of Skills as an orchestrator of commands: if a command is a single function or bash script, a skill is a group of functions or commands.
Understanding that LLMs are stateless
We pass the whole conversation every time
One thing that really helped me understand why keeping context clean is important is that LLMs are stateless: with every new message, we are just passing the whole conversation back to the model (though some providers optimize this with prompt caching). So every request ends up being your whole conversation with the latest message attached to it.
Sending a single-line message at the end of a 400-line discussion with a lot of back-and-forth is a lot different than sending one at the end of a 100-line structured discussion. MCPs and Skills can help us keep this context as small as possible and even save their thoughts in concise documents that we can reference in a new chat with an LLM.
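The statelessness above can be sketched in a few lines: every call re-sends the full message list, so the payload grows with each turn (a sketch of the common chat-API shape, not any specific provider's SDK):

```python
history = []

def send(user_message):
    """Each request carries the ENTIRE history plus the new message."""
    history.append({"role": "user", "content": user_message})
    payload = {"messages": list(history)}  # the whole conversation, every time
    # ... a real client would call the model with `payload`; we stub the reply:
    reply = {"role": "assistant", "content": f"(reply to: {user_message})"}
    history.append(reply)
    return len(payload["messages"])

print(send("first message"))   # payload holds 1 message
print(send("second message"))  # payload holds 3 messages (whole history + new one)
```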
Conclusion
Claude is becoming a general agent, an orchestrator. Skills, Commands, MCPs, and Agents make this explicit. And we as developers will have to focus a lot more on managing context and providing guardrails, scripts, and instructions to restrict the LLM when necessary and let it focus on what it does best.
Even as models push context windows of 1 million tokens, they still struggle to keep track of things we have done in long conversations, so keeping things understandable and documented is still critical.
Feel free to send me sources and comments about this.
Sources:
https://support.claude.com/en/articles/12512176-what-are-skills
https://simonwillison.net/2025/Oct/16/claude-skills/
https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
https://github.com/lackeyjb/playwright-skill?tab=readme-ov-file
https://news.ycombinator.com/item?id=45619537
https://github.com/asgeirtj/system_prompts_leaks/blob/main/Anthropic/claude-4.5-sonnet.md
https://www.anthropic.com/engineering/code-execution-with-mcp
https://aws.amazon.com/what-is/retrieval-augmented-generation/
https://code.claude.com/docs/en/slash-commands
https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents




