MCP is everywhere right now. If you are building any kind of developer tool that talks to AI agents, the default assumption is that you should build an MCP server. Every major agent platform already supports it, including Cursor, Claude Desktop, and VS Code, and the ecosystem is growing fast, with the developer experience improving every month.
So when I was building docmancer, a tool that gives AI coding agents access to up-to-date documentation, MCP was the obvious integration path. I prototyped it, tested it across a few agents, and then deliberately decided to go in a completely different direction.
I shipped plain CLI commands and markdown skill files instead.
To be clear, MCP itself is not the problem: it is a well-designed protocol with real strengths. But for what I was building, MCP introduced problems that a simpler architecture did not, and the benefits it offered were ones I could get without it. I'll walk through the reasoning, because it applies to a broader class of developer tools than just mine.
What the tool actually needs to do
Docmancer does one thing: it lets AI coding agents look up documentation instead of guessing. You ingest docs from a URL or local files; they are chunked and embedded locally, and then your agent can query the local index to retrieve the specific sections relevant to what it is working on. A few hundred tokens of real documentation instead of tens of thousands of tokens of an entire doc site pasted into context.
The interaction model is simple. The agent needs to run a command. It gets back text. That is the entire integration surface. There is no bidirectional communication, no streaming, no state that persists between calls, and no need for the agent to subscribe to updates. It is a request-response pattern that occurs through the terminal that every AI coding agent already has access to.
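To make the retrieval idea concrete, here is a minimal sketch of chunk-and-retrieve in Python. It swaps real embeddings for a crude word-overlap score so the example stays self-contained; docmancer's actual internals (chunk sizes, embedding model, ranking) will differ.

```python
# Sketch of the chunk-and-retrieve pattern behind a docs lookup tool.
# Word overlap stands in for embedding similarity; illustrative only.

def chunk(text, size=40):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def _tokens(text):
    """Lowercased words with trailing punctuation stripped."""
    return {w.strip(".,").lower() for w in text.split()}

def score(query, chunk_text):
    """Crude relevance score: number of words shared with the query."""
    return len(_tokens(query) & _tokens(chunk_text))

def query_index(query, chunks, top_k=1):
    """Return the top_k most relevant chunks for the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:top_k]

doc = (
    "To verify a webhook signature, read the Stripe-Signature header. "
    "Compute an HMAC over the payload with your endpoint secret. "
    "Compare it to the signature in constant time. "
    "Unrelated section about billing portals and invoices and receipts."
)
chunks = chunk(doc, size=12)
print(query_index("webhook signature verification", chunks)[0])
```

The point of the real tool is the same as this toy: return the few relevant sentences, not the whole document.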
The context window tax
Every MCP server you add to an AI coding agent costs context window tokens before a single line of code gets written. The agent needs to know that the server exists, what tools it exposes, what parameters those tools accept, and how to format requests. That metadata lives in the context window alongside your code, your conversation history, and everything else the agent is trying to keep track of.
For some MCP servers, the tool definitions alone consume over 16% of the available context window. That is not the tool responses or the conversation about them. That is just the schema definitions sitting there, taking up space, in every single session, whether you use them or not.
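For concreteness, a single MCP tool definition looks roughly like this. The shape (name, description, JSON Schema input) comes from the protocol; the specific tool shown is an invented example, not any real server's definition. A server ships one of these per tool, and all of them land in context:

```json
{
  "name": "query_docs",
  "description": "Search the local documentation index and return relevant chunks.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Natural-language search query" },
      "top_k": { "type": "integer", "description": "Number of chunks to return" }
    },
    "required": ["query"]
  }
}
```

Multiply that by a dozen tools across a few servers and the tax becomes visible.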
For a tool like docmancer, where the whole point is to reduce the number of documentation tokens landing in context, adding infrastructure that inflates the context window felt like solving a problem and making it worse at once.
A skill file, by contrast, is a markdown document that gets read once. It contains instructions for the agent on when to call the CLI and which commands are available. The agent reads it, understands the workflow, and then executes commands through its existing terminal.
The infrastructure that comes with MCP
Running an MCP server means running a server. That sounds obvious, but the implications add up in practice. When I tried it locally with Claude Code and Codex, the MCP server proved flaky, failing for a different reason in seemingly every session.
You need a process running on the developer's machine (or cloud) that starts before or alongside the agent session, binds to a port or at least a transport mechanism, and stays alive for the entire duration. If it crashes, the agent loses access to the tool until someone restarts it. When the developer runs multiple agents in parallel, each needs its own connection to the server, which means you need to consider concurrency, connection pooling, and process lifecycle management.
The configuration is not trivial either. You are editing JSON files in agent-specific config directories to specify server paths, transport types, and environment variables. When the MCP server updates, you might need to update the config. When the agent updates, the config format might change. There is a coordination cost between the server, the agent, and the developer's environment that accumulates over time.
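That configuration typically looks something like the following. The mcpServers shape is the convention used by Claude Desktop and similar clients; the docmancer entry itself is hypothetical, since I never shipped the server:

```json
{
  "mcpServers": {
    "docmancer": {
      "command": "docmancer-mcp",
      "args": ["--stdio"],
      "env": {
        "DOCMANCER_INDEX_DIR": "/home/user/.docmancer"
      }
    }
  }
}
```

Every field in that file is something that can drift out of sync with the server binary, the agent version, or the developer's environment.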
With a CLI tool, the developer installs it once with npx or pipx, and there is no process to manage, no port to bind, and no config file in the agent's directory beyond a markdown skill file that the install command writes automatically. When the CLI updates, the developer runs pipx upgrade docmancer, and everything continues working. The agent calls the CLI via the terminal, gets a response, and moves on.
I wanted docmancer to be the kind of tool you install and forget about, one that just works every time you open a new agent session. MCP made that harder, not easier.
What MCP gives you that a CLI does not
I want to be honest about what you lose when you skip MCP, because the trade-offs are real.
MCP gives you structured tool definitions with typed parameters and return values. The agent knows exactly what the tool accepts and returns, reducing the risk of malformed calls. In a CLI, the agent parses text output, and while modern agents are very good at this, it is less reliable than a structured response schema.
MCP also gives you discoverability. An agent connected to an MCP server can enumerate available tools at runtime, which means it can learn what is available without reading documentation first. With a CLI and a skill file, the agent depends on the instructions in the markdown being accurate and complete. If the CLI adds a new command and the skill file is not updated, the agent does not know about it.
These are legitimate advantages. For tools with complex interaction patterns, bidirectional communication needs, or deep integration requirements, MCP is probably the right choice. A tool that needs to push notifications to the agent, maintain session state, or orchestrate multi-step workflows across different systems benefits significantly from MCP's architecture.
docmancer does not need any of that. It needs to answer a question and return text. A CLI does that perfectly well, and everything MCP adds on top is overhead for this particular use case.
The skill file approach
Here is what the actual integration looks like. When you run docmancer install claude-code, the CLI writes a markdown file to ~/.claude/skills/docmancer/SKILL.md. That file contains:
- A description of when the agent should use docmancer (documentation questions, API lookups, version-specific behavior)
- The available CLI commands with examples
- A workflow that tells the agent to list available sources first, then query, and optionally ingest new docs if needed
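A sketch of what that file might look like (illustrative only: the frontmatter fields follow the common skill-file convention, and the exact command names here are simplified, not necessarily the shipped ones):

```markdown
---
name: docmancer
description: Look up indexed documentation instead of guessing at APIs.
---

Use docmancer whenever the user asks about library APIs, configuration,
or version-specific behavior.

1. Run `docmancer list` to see which documentation sources are indexed.
2. Run `docmancer query "<question>"` to retrieve the relevant chunks.
3. If no indexed source covers the topic, run `docmancer ingest <url>` first.
```

That is the whole integration: a few hundred words of instructions, read once per session.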
The agent reads this file and, from that point on, knows how to use docmancer. When a user asks about webhook signature verification in Stripe, for example, the agent runs docmancer query "webhook signature verification", gets back the relevant chunks, and uses them in its response.
No server started, no port opened, no connection negotiated. The agent simply uses its terminal, which it has access to regardless.
The same pattern works across agents. docmancer install cursor writes a skill file to Cursor's directory. docmancer install codex writes one to Codex's directory. Every agent gets the same instructions, calls the same CLI, and queries the same local index. One ingest step covers all of them.
When this approach breaks down
This approach has a ceiling, and I think it is important to name it explicitly.
If docmancer ever needed real-time communication with the agent, where the tool pushes information without being asked, a CLI would not be sufficient. If it needed to maintain conversational state across multiple interactions within a single turn, the stateless nature of CLI calls would become a limitation. If it needed to integrate deeply with the agent's internal planning or tool-use pipeline beyond "run a command, read the output," MCP would be necessary.
For now, none of those apply. The tool answers documentation questions. The agent asks, the CLI responds, the answer lands in context. If the requirements change in a way that makes MCP genuinely necessary, I will add it. But I am not going to add architectural complexity in anticipation of requirements that do not exist yet.
The broader principle
I think many developer tools are adopting MCP because it is the new standard, not because their interaction model requires it. If your tool's entire integration surface is "the agent runs a command and reads the output," you might not need a server at all.
The best integration is the one the developer forgets exists. They install the tool; it shows up in their agent's capabilities, and it works without configuration, without a running process, and without consuming resources when it is not in use. For a meaningful category of developer tools, a CLI and a well-written skill file achieve that better than an MCP server does.
MCP will continue to grow, and for the right tools, it is the right choice. But "right tool for the right job" applies to integration patterns the same way it applies to everything else in software, and sometimes the right integration is just a terminal command.
docmancer is a free, open-source (MIT License) CLI that indexes documentation locally and lets AI coding agents query it directly, with no server, API key, or rate limits. Install it with pipx install docmancer --python python3.13.
