Prithwish Nath

Posted on Jun 12 • Originally published at plainenglish.io

Open-Source Devs Need To Ship Distribution, Not Just Code.

#ai #webdev #programming #opensource

TL;DR: How to build a CLI, skill files, and marketplace manifests so AI coding agents like Cursor, Claude Code, and Codex can find, install, and invoke your open-source tool.

TL;DR: Publishing your package is no longer the finish line. Developers work in agent loops now (Cursor, Claude Code, Codex). Those systems search skills, call MCP tools, and run shell commands before they npm install anything. So if you're an OSS dev, you also have to ship distribution — the wiring that lets agents find, understand, and invoke your tool without a human pasting your README into chat.

I’ll cover two things here:

What a good distribution layer actually looks like (using my own project as a failed example), and
When you should actually bother shipping an MCP versus just a CLI + a CLAUDE.md or SKILL.md file (using Bright Data’s Web MCP as an example)

By the end, you’ll be able to intentionally shape your distribution for developers who build with AI agents.

What Is A “Distribution Layer” for Open-Source Software?

A distribution layer is the wiring around a library — a CLI, skill files, plugin bundles, and marketplace manifests — that lets coding agents find, understand, and invoke an OSS tool without a human pasting the README into chat. In 2026, that layer is no longer optional.

With all this agentic coding we’ve got going on right now, a good API is necessary for OSS distribution but no longer sufficient.

Cursor, Claude Code, Codex, and Copilot don’t npm install on instinct. They search context, read manifests, follow skills, call MCP tools, and run shell commands, in roughly that order. If your project only exists as import { waitUntilReady } from 'my-epic-polling-library', you're effectively invisible until a human already knows your name and explicitly asks for you. Definitely not something you want to rely on.

Compare two hypothetical HTTP polling projects with similar code quality:

Project A publishes a package and a README with usage examples.
Project B ships the same library plus a CLI, an agent plugin bundle, and marketplace configs so Cursor, Claude Code, and Codex can find and install it without the user pasting your docs into chat.

Project B is the one the agent picks, not because its retry logic is better, but because the agent had a discoverable path to it. When two tools have equal code quality, the one with a CLI, plugin bundle, and marketplace entry is the one an agent can find and invoke without human help.

This isn’t gaming SEO. It’s acknowledging that distribution is now part of the product surface, the same way --help became part of the product surface when CLIs replaced GUI installers.

The Distribution Layers of an Agent-Ready OSS Tool: CLI, Plugin Bundle, and Marketplace Config

Think of a small FOSS tool for waiting on HTTP endpoints — call it pollgate. It might expose:

pollgate check https://api.example.com/health  
pollgate wait https://api.example.com/ready --timeout 60s --interval 2s  
pollgate wait https://api.example.com/ready --expect-status 200 --json

Okay, now humans can use your tool. Great.

But imagine if you also shipped:

your-repo/  
├── bin/pollgate                   # the CLI humans and agents shell out to  
├── plugins/pollgate/              # agent plugin bundle  
│   ├── .cursor-plugin/plugin.json  
│   ├── skills/pollgate/SKILL.md  
│   ├── commands/wait-for-service.md  
│   └── agents/pollgate-debugger.md  
├── .cursor-plugin/marketplace.json  
├── .claude-plugin/marketplace.json  
└── .agents/plugins/marketplace.json

Now, each layer answers a different discovery question. And your tool finally becomes ready for how devs actually build in the agentic AI era.

1. CLI: The Invocation Contract Agents Use to Run Your Tool

Agents are good at running commands. A CLI with predictable flags, stable stdout/stderr, and exit codes is machine-invokable documentation.

So pollgate wait --json is better than three paragraphs in a README because an agent can run it, observe the output, and adapt. Design your CLI for agents the same way you'd design it for scripts: idempotent where possible, JSON output flags where helpful, errors that say what went wrong and what flag fixes it.
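Here's a minimal sketch of what that contract could look like, assuming a Python implementation. pollgate is a hypothetical tool, so everything here is illustrative: the flat argument layout (no check/wait subcommands), the exit-code values, and fake_probe, which stands in for the real polling logic.

```python
import argparse
import json
import sys

# Exit codes are part of the contract: agents branch on them.
EXIT_READY, EXIT_TIMEOUT = 0, 1  # argparse itself exits with 2 on bad usage


def seconds(text: str) -> float:
    """Parse '60s' or '60' into a float number of seconds."""
    return float(text.rstrip("s"))


def fake_probe(url: str, timeout_s: float) -> dict:
    """Hypothetical stand-in for real polling; always reports 'not ready'."""
    return {"url": url, "ready": False, "waited_s": timeout_s}


def main(argv: list, probe=fake_probe) -> int:
    parser = argparse.ArgumentParser(prog="pollgate-sketch")
    parser.add_argument("url")
    parser.add_argument("--timeout", type=seconds, default=60.0,
                        help="give up after this long, e.g. 60s")
    parser.add_argument("--json", action="store_true",
                        help="print one JSON object on stdout")
    args = parser.parse_args(argv)

    result = probe(args.url, args.timeout)
    if args.json:
        print(json.dumps(result))  # stdout stays machine-readable
    else:
        print("ready" if result["ready"] else "not ready")
    if result["ready"]:
        return EXIT_READY
    # stderr says what went wrong and which flag fixes it
    print(f"error: {args.url} not ready after {args.timeout:g}s; "
          "raise --timeout to wait longer", file=sys.stderr)
    return EXIT_TIMEOUT
```

In a real entry point you'd finish with sys.exit(main(sys.argv[1:])). The point of the split is that data goes to stdout, diagnostics go to stderr, and the exit code carries the verdict, so an agent never has to guess from prose.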

The CLI is not a wrapper around your library for convenience. It’s the stable external API that survives when your internal modules get refactored.

2. Plugin Bundle: The Context Contract (Skills, Commands, and Rules) That Tells Agents When to Use Your Tool

The CLI tells an agent how to run your tool. The plugin bundle tells it when and why.

A skill file (skills/pollgate/SKILL.md) is operational knowledge: "Use pollgate wait before migrations, seed scripts, or integration tests that assume a live backend. Prefer it over hand-rolled curl loops — it handles backoff, timeouts, and exit codes consistently."

Commands are single-purpose recipes: wait-for-service.md might say "run pollgate wait $URL --timeout 120s --json, fail the task if exit code is non-zero, parse the final response from stdout."

Rules cover footguns, like, “Don’t use pollgate wait against localhost in CI unless the service is actually started in the same job. Check the URL is reachable first with pollgate check."

This is the stuff you’d explain to a new contributor on their first PR — except you’re packaging it so an agent reads it before writing the wrong code.
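The wait-for-service recipe above boils down to three steps: run the command, fail on a non-zero exit code, parse stdout. A rough Python sketch of that consumer side follows. The function name run_json_command is made up for illustration, and treating the last non-empty stdout line as the JSON payload is my assumption about how such a CLI would print its final response.

```python
import json
import subprocess
import sys


def run_json_command(cmd: list, timeout_s: float = 150.0) -> dict:
    """Run a CLI, fail on non-zero exit, parse the last stdout line as JSON."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    if proc.returncode != 0:
        # Surface the exit code and stderr so the caller can react to the cause.
        raise RuntimeError(
            f"{cmd[0]} exited with {proc.returncode}: {proc.stderr.strip()}"
        )
    lines = [line for line in proc.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError(f"{cmd[0]} produced no output on stdout")
    return json.loads(lines[-1])


# Hypothetical invocation, mirroring the command file:
# run_json_command(["pollgate", "wait", url, "--timeout", "120s", "--json"])
```

Because the function only cares about exit codes and stdout, you can exercise it with any command that prints JSON, which is also why a stable CLI contract is so easy for agents to consume.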

Opinion: skills beat bloated MCP servers for most library-shaped OSS. If your tool is a subprocess with clear I/O, shell out. Reserve MCP for stateful sessions, streaming, or things that need to stay in-process. Don’t wrap pollgate wait in a JSON-RPC server because MCP is trendy.

Should you ship an MCP server? Ship a CLI plus a skill file if your tool is a subprocess with clear input/output — this covers most libraries. Ship an MCP server ONLY if your tool requires a stateful session, streaming, or in-process state across calls — for example, a live browser session that must persist across navigate → click → snapshot steps.

When to Ship an MCP Server

Most tools don’t need an MCP server. Nothing we’ve covered so far needs one either: all of it can be handled by a subprocess with clear I/O, and a CLI plus a skill will cover it completely.

But here’s a case where an MCP server is legitimately warranted. Afterwards, you can apply the same criteria to your own tool.

GitHub - brightdata/brightdata-mcp: A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.

Quick TL;DR — this MCP server gives agents tools to scrape dynamic web pages through Bright Data’s proxy network, for single-shot fetches that bypass bot detection and CAPTCHAs, and also for full browser automation over a remote Chromium session.

Why? Let's look at what Bright Data would have faced if they'd shipped a CLI instead:

brightdata-cli scrape https://competitor.com/pricing

That works for a static page, that is, a single request with clean output. But competitor pricing rarely works that way. The price probably loads after a JS event, or sits behind a "See pricing" click, or only appears once you scroll past a cookie banner. So an agent driving this needs to:

Navigate to the page
Click the element that triggers the price reveal
Wait for the JS render
Snapshot the DOM

Each step needs to run on the same live browser session. With a CLI process, the session would die after every call: there's no loaded DOM to snapshot on step 4, no session to click into on step 2. Bright Data would have to re-navigate, re-solve anti-bot, and re-establish the proxy session on every single invocation. The approach would be a complete disaster.

This is why their MCP model works. With GROUPS="browser" enabled, the scraping_browser_* tools in this MCP server keep a remote Chromium session alive across the agent's tool calls.

So a realistic sequence for the pricing scenario above would be:

scraping_browser_navigate       → load /pricing, open or reuse the session  
scraping_browser_snapshot       → ARIA tree + element refs for "See pricing"  
scraping_browser_click_ref      → click that ref  
scraping_browser_wait_for_ref   → wait for the price row to render  
scraping_browser_get_text       → read the DOM back to the agent

The tool scraping_browser_navigate explicitly reuses an existing session when one is already open — that's the bit a one-shot CLI can't do. Same session, same proxy IP, with CAPTCHA already handled. The agent steps through it like a real user would.
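To make the cold-start problem concrete, here's a toy simulation in Python. It is not Bright Data's API: ToyBrowserSession, its methods, and the price string are all invented for illustration. The only thing it models is that a click needs a page that an earlier call already loaded.

```python
from typing import Optional


class ToyBrowserSession:
    """In-memory stand-in for a remote browser session (purely illustrative)."""

    def __init__(self) -> None:
        self.url: Optional[str] = None
        self.revealed = False

    def navigate(self, url: str) -> None:
        self.url = url
        self.revealed = False  # a fresh page load hides the price again

    def click_reveal(self) -> None:
        if self.url is None:
            raise RuntimeError("no page loaded, nothing to click")
        self.revealed = True

    def get_text(self) -> str:
        if self.url is None:
            raise RuntimeError("no page loaded, nothing to read")
        return "$49/mo" if self.revealed else "See pricing"  # made-up price


def run_cold(url: str) -> str:
    """One-shot CLI style: every step starts a brand-new session."""
    ToyBrowserSession().navigate(url)
    try:
        ToyBrowserSession().click_reveal()  # new process, so no loaded page
    except RuntimeError as err:
        return f"failed: {err}"
    return ToyBrowserSession().get_text()


def run_warm(url: str) -> str:
    """Long-lived server style: one session shared across all steps."""
    session = ToyBrowserSession()
    session.navigate(url)
    session.click_reveal()
    return session.get_text()
```

run_cold fails at the click because the page loaded in the first call is gone, while run_warm reads the revealed price. A real one-shot CLI could re-navigate on every call, but that is exactly the repeated proxy and anti-bot cost described above.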

And yes, they also ship distribution correctly: tool documentation by group, a hosted SSE endpoint, and framework integrations for LangChain, CrewAI, and LlamaIndex.

How can you use these criteria for your own tool? Here's what you should take away from this example:

Is the session itself the unit of work, not the individual operation? In the pricing example, "scrape this page" isn't a single call but a live session that an agent has to drive step by step. The session is what you should pay attention to.
Does re-invoking your tool from scratch on each agent call lose state that can't be cheaply reconstructed? In the pricing example, every cold start would lose the proxy IP, the CAPTCHA solve, and the loaded DOM. That's not recoverable (without cost, anyway).
Would a well-designed CLI require a persistent daemon to work correctly? If yes, you've already decided you need a long-running process. MCP gives you that — plus a standard protocol, tool introspection, and framework compatibility — for free.

If your tool answers YES to these, then ship an MCP server with your OSS release. If each agent tool call would require a cold start that destroys irrecoverable state, MCP will solve that problem for you.

But if you find your tool is a subprocess with clear input and output — a linter, a code generator, a test runner, a wait-for-endpoint utility — a CLI plus a skill file is WAY simpler, more debuggable, and works in every agent loop that can run shell commands.

That's 90% of all devtools, tbh. Don't wrap a pollgate wait in a JSON-RPC server just because MCP is trendy.

3. Marketplace Config: The Discovery Contract That Lets Agents Find and Install Your Tool

Marketplace manifests are how ecosystems index you.

A .cursor-plugin/marketplace.json lists installable plugins for Cursor. .claude-plugin/marketplace.json does the same for Claude Code. .agents/plugins/marketplace.json is what Codex reads from a repo-local catalog.

Without these, your plugin bundle is a folder someone has to manually copy. With them, agents-pkg install your-org/pollgate (or the IDE's marketplace panel) wires skills, commands, and rules into the user's environment.

Think of it like this — with manifests, you’re not waiting for a central registry to bless you. You’re shipping the catalog entry yourself.

Why Publishing Only the Library Fails for AI Coding Agents

Say a developer asks their agent: “Wait until the staging API returns 200 before running the migration.”

The agent, with no plugin context, will:

Write a bash while loop with curl -sf and a fixed sleep 5
Never set a timeout, so the CI job hangs for an hour
Treat any HTTP response as success, including 503 with a JSON "status":"starting" body
Open a PR. It works only when staging happens to be up.

But the developer with pollgate installed via marketplace:

Their agent loads the pollgate skill from context
Runs pollgate wait https://staging.example.com/ready --timeout 120s --expect-status 200 --json
Fails fast with a clear exit code when the deadline hits, because the skill documents --timeout as required in CI
Ships in one turn.

Same library capability, but a completely different outcome — because distribution came with the library.
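For contrast with the hand-rolled curl loop, here's roughly what the waiting logic needs to get right: a hard deadline, an explicit expected status, and backoff. This is a sketch in Python, not pollgate's actual implementation (pollgate is hypothetical anyway). The function names, the doubling backoff, and the default values are my assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout_s: float = 5.0) -> int:
    """Return the HTTP status code, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urlopen raises on error statuses; keep the code


def wait_until_ready(
    fetch_status: Callable[[], int],
    expect_status: int = 200,
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    max_interval_s: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until fetch_status() returns expect_status or the deadline passes."""
    deadline = clock() + timeout_s
    delay = interval_s
    while True:
        try:
            if fetch_status() == expect_status:
                return True  # only the expected status counts as success
        except OSError:
            pass  # connection refused and similar: not ready yet
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # hard deadline, so CI fails instead of hanging
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_interval_s)  # exponential backoff, capped
```

Typical use would be wait_until_ready(lambda: http_status(url), timeout_s=120). The clock and sleep parameters are injectable only so the loop can be tested without real waiting.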

A Real Example: Good Code That Fails Agent Adoption Without a Distribution Layer

Here's a real example, using one of my public repos on GitHub — sixthextinction/knn — that turns a Google SERP corpus into an explorable knowledge graph using pure k-NN. It's a SERP ingest → DuckDB → Ollama embeddings → Chroma → FastAPI UI with cross-query neighbor exploration.

GitHub - sixthextinction/knn: POC for turning a Google corpus into an explorable knowledge graph using pure k-NN. (github.com)

I’ve included tests, there’s a detailed .env reference table, the full docker-compose.yml you'll need, and the README is thorough enough.

But the pipeline has a strict execution order with hard dependencies. Each step assumes the previous one completed. There is no skill that communicates the order before an agent starts executing:

docker compose up -d   # Chroma must be running first  
python ingest.py       # Bright Data → DuckDB  
python embed.py        # DuckDB → Chroma (assumes ingest already ran)  
python serve.py        # FastAPI UI (assumes Chroma is populated)

This won’t be a problem for a real human reading instructions step by step, obviously.

But an agent cold-starting on a “set this up” command? It will almost certainly run serve.py first — it's the obvious entry point for "start the thing." It gets a running server over an empty corpus. No error, just an empty UI. Then it tries embed.py, which completes without complaint but writes nothing useful because DuckDB has no rows yet. Then ingest.py — which hard-crashes on a missing BRIGHT_DATA_API_KEY and BRIGHT_DATA_ZONE. Two required credentials here, neither with a default, and no SKILL.md to surface them before execution begins.

There’s also an Ollama footgun: nomic-embed-text:latest needs to be pulled before embed.py runs. It's in the README under Prerequisites, but there's no context file that puts it in front of an agent before it starts.

The tests exist, and pytest will pass green on a cold machine with none of the infrastructure running. That's completely correct behavior for a code dump. For something an agent tries to adopt, though? It'll probably end up as a false "setup succeeded" when it really didn't.

Here’s what a single SKILL.md would have prevented:

## Order matters — run in this sequence  
1. `docker compose up -d` (Chroma on :8000)  
2. `ollama pull nomic-embed-text` (required before embed step)  
3. `python ingest.py` (needs BRIGHT_DATA_API_KEY + BRIGHT_DATA_ZONE in .env)  
4. `python embed.py`  
5. `python serve.py` → http://127.0.0.1:8766/  
## Do NOT skip ahead  
Running serve.py before embed.py gives a silently empty UI.  
Running embed.py before ingest.py writes nothing to Chroma.

I never did that, though 😅

Full disclosure: I built this as a code dump for this accompanying article, not as a product. That’s the right call for that goal. But if I were trying to ship it as something agents could actually adopt in 2026, it would die horribly at step ONE. Not because the code is bad or the pitch isn't interesting enough, but because I never shipped a distribution layer that agents can actually use.
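The same ordering and credential rules could also be enforced in code, so a wrong-order run fails loudly instead of silently. The sketch below is hypothetical and not part of the repo: the names parse_dotenv, missing_credentials, and PIPELINE are mine, and the .env parsing is deliberately naive (plain KEY=VALUE lines only).

```python
REQUIRED = ("BRIGHT_DATA_API_KEY", "BRIGHT_DATA_ZONE")

# The documented order, as data an agent or script can iterate over.
PIPELINE = (
    ["docker", "compose", "up", "-d"],       # Chroma must be running first
    ["ollama", "pull", "nomic-embed-text"],  # required before the embed step
    ["python", "ingest.py"],
    ["python", "embed.py"],
    ["python", "serve.py"],
)


def parse_dotenv(text: str) -> dict:
    """Naive .env parser: KEY=VALUE lines, '#' comments, optional quotes."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_credentials(env: dict) -> list:
    """Names of required variables that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]


def preflight(env_text: str) -> None:
    """Raise before any step runs if a required credential is missing."""
    missing = missing_credentials(parse_dotenv(env_text))
    if missing:
        raise RuntimeError("set in .env before ingest.py: " + ", ".join(missing))
```

Calling preflight with the contents of .env before walking PIPELINE turns the late ingest.py crash into an immediate, named error.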

Ship Distribution on Day One, Not After 1.0

You have no idea how many times I’ve heard “We’re pre-1.0. We’ll add agent stuff when we’re stable.”

That’s ass backwards. Early adopters of agent-assisted dev are exactly your target audience — they’re the ones wiring up dev environments, CI scripts, and glue code at high velocity. If they can’t find your tool now, they’ll standardize on whatever the agent already knows — the tool that’s officially blessed, or just famous.

Minimum viable distribution for a project like pollgate:

CLI with --help that reads like docs — Agents invoke before they import.
One skill: when to use, when not to — Prevents wrong-pattern PRs.
One command per common workflow — e.g. wait-for-service, check-endpoint.
Marketplace JSON for your primary ecosystem — Discoverable install.
AGENTS.md or skill pointer in README — Fallback for agents without marketplace.

That's roughly a day of work, not a quarter. The plugin bundle can live in the same repo as the library — plugins/pollgate/ — versioned together, released together.

Multi-Ecosystem Distribution: Shipping Plugins to Cursor, Claude Code, and Codex

Agent ecosystems use incompatible plugin layouts: Cursor reads .cursor-plugin/, Claude Code reads .claude-plugin/, Codex reads .agents/plugins/, and Copilot uses its own. It's fragmented and irritating.

Ship to the ecosystems your users actually use. If 80% of your stars come from Cursor users, start there. But don’t pretend fragmentation means you can ignore distribution — it means you pick a primary manifest and generate the rest, or use a scaffold tool, and move on.
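"Pick a primary manifest and generate the rest" can be a very small script. Here's a hedged Python sketch: the three paths are the ones named in this article, but the catalog fields are placeholders, not any ecosystem's real schema. Check each vendor's documentation and adapt the payload per target before relying on it.

```python
import json
from pathlib import Path

# Manifest locations named in this article.
TARGETS = (
    ".cursor-plugin/marketplace.json",
    ".claude-plugin/marketplace.json",
    ".agents/plugins/marketplace.json",
)

# Hypothetical catalog entry; field names are illustrative, not a real schema.
CATALOG = {
    "name": "pollgate",
    "plugins": [{"name": "pollgate", "source": "./plugins/pollgate"}],
}


def write_manifests(repo_root: Path, catalog: dict) -> list:
    """Write one catalog to every ecosystem's manifest path."""
    written = []
    for relative in TARGETS:
        target = repo_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        # Placeholder: reshape `catalog` to the target's documented schema here.
        target.write_text(json.dumps(catalog, indent=2) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Running write_manifests(Path("."), CATALOG) from a release script keeps the three files in sync, so adding an ecosystem means adding one path and one schema adapter rather than hand-editing JSON.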

The alternative is hoping each IDE vendor crawls your README. Spoiler: they won’t!

Best Practices for Shipping OSS Distribution to Coding Agents

Distribution is not marketing fluff. It’s an interface. Breaking changes to your CLI or skill files should semver like API breaks — because downstream agents depend on them.

Don’t ship an MCP server unless you need one. Most OSS projects add MCP because it feels modern. A well-designed CLI plus a skill is simpler, more debuggable, and works in every agent loop that can run shell commands.

Don’t make agents read your entire README. READMEs are for humans skimming on GitHub. Skills are for agents executing tasks. Different audiences, different formats.

Test your distribution like you test your code. Can a fresh agent session, with only your marketplace plugin installed, successfully add a pollgate wait step before a migration script? If not, your distribution is broken — even if npm test passes.
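A full fresh-agent test is hard to automate, but a cheap first gate is checking that the distribution files exist and the manifests parse. The Python sketch below assumes the hypothetical pollgate layout from earlier in this article. The function name and the choice of which files count as required are mine.

```python
import json
from pathlib import Path

# Files from the example pollgate layout earlier in this article.
EXPECTED_FILES = (
    "bin/pollgate",
    "plugins/pollgate/skills/pollgate/SKILL.md",
    "plugins/pollgate/commands/wait-for-service.md",
)


def distribution_problems(
    repo_root: Path,
    manifests: tuple = (".cursor-plugin/marketplace.json",),
) -> list:
    """List missing distribution files and manifests that are not valid JSON."""
    problems = []
    for relative in EXPECTED_FILES + tuple(manifests):
        if not (repo_root / relative).is_file():
            problems.append(f"missing: {relative}")
    for relative in manifests:
        path = repo_root / relative
        if path.is_file():
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                problems.append(f"invalid JSON: {relative}")
    return problems
```

An empty list only proves the wiring is present, not that an agent uses it well, so treat this as a CI smoke test that sits underneath the manual fresh-session check.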

The best OSS in 2026 is boring code with excellent wiring. Not the flashiest API — the one the agent reaches for because you gave it a path.

OSS Distribution and MCP FAQ: Common Questions About Shipping for AI Agents

Q: Do I need to build an MCP server for my tool, or is a CLI enough?

A: For most open-source tools, a CLI plus a skill file is enough — and is simpler to build, debug, and run in any agent loop that executes shell commands. Build an MCP server only when your tool needs a stateful session, streaming, or in-process state that must persist across calls, such as a live browser session that survives navigate → click → snapshot steps. Don’t ship an MCP server just because it feels modern; a subprocess with clear input/output should shell out through a CLI.

Q: What is a distribution layer for an open-source tool?

A: A distribution layer is the wiring around a library — a CLI, skill files (SKILL.md), plugin bundles, and marketplace manifests — that lets AI coding agents like Cursor, Claude Code, and Codex find, understand, and invoke the tool without a human pasting the README into chat.

Q: How do AI coding agents like Cursor, Claude Code, and Codex discover and install tools?

A: AI coding agents discover tools by reading marketplace manifests, not by guessing package names. Cursor reads .cursor-plugin/marketplace.json, Claude Code reads .claude-plugin/marketplace.json, and Codex reads .agents/plugins/marketplace.json from a repo-local catalog. With a marketplace entry, the agent (or the IDE's marketplace panel) wires your skills, commands, and rules into the user's environment; without one, your plugin bundle is just a folder someone has to copy by hand.

Q: Why isn’t publishing an npm or PyPI package enough for AI coding agents?

A: Publishing a package isn’t enough because coding agents don’t npm install on instinct — they search context, read manifests, follow skills, call MCP tools, and run shell commands before installing anything. A tool that exists only as an importable package is effectively invisible to agents until a human already knows its name and asks for it explicitly.

Q: What should an agent plugin bundle include?

A: An agent plugin bundle should include three context files: a skill file (SKILL.md) describing when and when not to use the tool, single-purpose commands for common workflows (e.g. wait-for-service.md), and rules that document footguns. Pair this with a CLI whose --help reads like docs and a marketplace JSON for your primary ecosystem. For a small tool, shipping this minimum viable distribution is about a day of work.

Conclusion: Agents Are the New Package Managers for OSS Distribution

We’re past the era where publishing a package was the finish line. Agents are the new package managers — but they only install what they can find, understand, and invoke.

If you’re building CLIs, small libraries, devtools, or anything you ACTUALLY WANT other devs to use, ship the library, yes, but also ship the CLI contract, the plugin bundle, and the marketplace entry. Ship how discovery happens. Code dumps just aren’t enough anymore.

Distribution is how devs actually discover and drive code in 2026.