DEV Community

Everyday Dev

Integrate Any LLMs.txt into Your MCP (with Stripe as the Example)

If you’re building or using AI agents, you want answers that are current, source‑backed, and token‑efficient. That’s exactly what llms.txt plus the Model Context Protocol (MCP) delivers: a way to discover the right docs, fetch only what you need, and wire it straight into your IDE or agent host.

In this guide, you’ll learn:

  • What llms.txt is and why it matters
  • How to consume any llms.txt (we’ll use Stripe’s as an example)
  • How to integrate docs via an MCP server into Cursor, Windsurf, or Claude Desktop
  • Best practices for security, performance, and governance
  • Troubleshooting tips and advanced patterns

We’ll use Stripe’s public llms.txt at: https://docs.stripe.com/llms.txt

What is llms.txt?

llms.txt is a human‑ and LLM‑readable Markdown file at the root of a docs site (for example, /llms.txt). It’s like a compact, curated “sitemap for AI” that lists the most important, LLM‑friendly pages—often with .md mirrors for clean parsing and minimal tokens.

Why it’s useful:

  • Curated: Points your agent at the best sources, not every page
  • Efficient: Markdown mirrors parse cleanly and compress well
  • Reliable: Reduces hallucinations by nudging agents to fetch real docs

Core structure (typical):

  • H1 title
  • Optional summary in a blockquote
  • H2 sections with bullet‑listed links: [Title](URL) — optional note
  • Optional section (commonly titled “Optional” or “Additional”) for nice‑to‑have pages
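
For illustration, links following this convention can be extracted with a few lines of Python (a minimal sketch; real llms.txt files vary in layout, and the sample text here is hypothetical):

```python
import re

# Matches markdown links like: [Title](https://example.com/page.md)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

def parse_llms_txt(text: str) -> list[tuple[str, str]]:
    """Extract (title, url) pairs from the bullet-listed links of an llms.txt file."""
    return LINK_RE.findall(text)

sample = """# Acme Docs
> Curated docs for LLMs.

## API
- [Versioning](https://docs.example.com/api/versioning.md) — how versions work
- [Webhooks](https://docs.example.com/webhooks.md)
"""
```

Calling parse_llms_txt(sample) returns the two (title, url) pairs, ready to hand to a fetch tool.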

Try opening Stripe’s: https://docs.stripe.com/llms.txt


How llms.txt and MCP Work Together

  • llms.txt: Discovery layer. It tells agents where the “good stuff” lives.
  • MCP (Model Context Protocol): Execution layer. It standardizes how an agent host (like an IDE) talks to tools/servers that can read resources (docs), call tools (APIs), and apply prompts.

Put simply: Use an MCP “docs server” that knows how to read llms.txt and fetch the linked pages on demand. Your IDE/agent then calls MCP tools to grab only the pages it needs to answer a question.

We’ll use an off‑the‑shelf server (mcpdoc) to make this easy.


Quick Start: Consume Stripe’s llms.txt with an MCP Server

We’ll run a local MCP server that:

  • Registers a “Stripe Docs” source via its llms.txt
  • Exposes tools to list sources and fetch docs
  • Speaks stdio (for IDEs) or SSE (for browser tooling/inspection)

Prerequisites

Install uv (a fast Python package manager; its uvx command runs tools without installing them permanently):

curl -LsSf https://astral.sh/uv/install.sh | sh

Verify:

uvx --version

Option A: Run Over SSE (Great for Inspection)

uvx --from mcpdoc mcpdoc \
  --urls "Stripe:https://docs.stripe.com/llms.txt" \
  --transport sse \
  --port 8082 \
  --host localhost
  • The server will fetch and index Stripe’s llms.txt.
  • You can point an MCP inspector at it to explore tools/resources.

Optional: Open the MCP Inspector in another terminal:

npx @modelcontextprotocol/inspector

Then connect to:

  • URL: http://localhost:8082

Option B: Run Over stdio (For IDE Integration)

Many IDEs/hosts expect stdio transport. Just switch the flag:

uvx --from mcpdoc mcpdoc \
  --urls "Stripe:https://docs.stripe.com/llms.txt" \
  --transport stdio

We’ll wire this into popular IDEs next.


Integrate with Your IDE (Cursor, Windsurf, Claude Desktop)

Below are minimal configurations that register the server and nudge the agent to fetch docs before answering.

Cursor

Edit or create ~/.cursor/mcp.json:

{
  "mcpServers": {
    "stripe-docs-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "mcpdoc",
        "mcpdoc",
        "--urls",
        "Stripe:https://docs.stripe.com/llms.txt",
        "--transport",
        "stdio"
      ]
    }
  }
}

Optional: Add a simple “use‑the‑docs‑first” rule in Cursor’s User Rules:

For any questions about Stripe, use the MCP server "stripe-docs-mcp":
1) call list_doc_sources
2) call fetch_docs for Stripe's llms.txt to see curated pages
3) select and fetch the most relevant .md pages
4) answer citing the fetched URLs

Windsurf (Codeium)

Edit ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "stripe-docs-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "mcpdoc",
        "mcpdoc",
        "--urls",
        "Stripe:https://docs.stripe.com/llms.txt",
        "--transport",
        "stdio"
      ]
    }
  }
}

Add a corresponding “fetch before answer” instruction in Windsurf’s settings.

Claude Desktop

On macOS, edit:
~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "stripe-docs-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "mcpdoc",
        "mcpdoc",
        "--urls",
        "Stripe:https://docs.stripe.com/llms.txt",
        "--transport",
        "stdio"
      ]
    }
  }
}

Tip: If Python path issues arise, add an explicit interpreter:

"args": [
  "--python",
  "/usr/bin/python3",
  "--from",
  "mcpdoc",
  "mcpdoc",
  "--urls",
  "Stripe:https://docs.stripe.com/llms.txt",
  "--transport",
  "stdio"
]

How to Use It in Practice (Agent Flow)

Ask your IDE agent something like:

  • “How does Stripe recommend handling API version upgrades?”
  • “Show me how to verify webhook signatures with Stripe.”

The ideal flow:
1) list_doc_sources → confirms Stripe source is registered
2) fetch_docs on https://docs.stripe.com/llms.txt → returns curated links
3) fetch_docs on the most relevant .md pages (for example, api/versioning.md, upgrades.md, webhooks.md, webhooks/signature.md)
4) Agent answers and cites the exact URLs it fetched

This approach yields:

  • Current answers directly grounded in Stripe docs
  • Minimal token waste
  • Clear traceability via MCP tool logs
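
Step 3 of the flow — choosing which curated pages to fetch — can be approximated with simple keyword scoring. This helper is hypothetical (not part of mcpdoc; in practice the LLM itself does the selection):

```python
def select_relevant(links, query, top_n=2):
    """Rank (title, url) pairs by how many query keywords appear in them."""
    words = {w.lower().strip("?.,") for w in query.split()}
    scored = []
    for title, url in links:
        haystack = (title + " " + url).lower()
        # Count how many query words occur in the title or URL.
        scored.append((sum(w in haystack for w in words), title, url))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(t, u) for score, t, u in scored[:top_n] if score > 0]
```

Given the curated links from llms.txt and the question “verify webhook signatures,” this would surface the webhooks pages first.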

Useful Stripe Pages Typically Found via llms.txt

  • API overview: https://docs.stripe.com/api.md
  • Versioning: https://docs.stripe.com/api/versioning.md
  • Upgrades: https://docs.stripe.com/upgrades.md
  • Testing: https://docs.stripe.com/testing.md
  • Webhooks: https://docs.stripe.com/webhooks.md
  • Webhook signatures: https://docs.stripe.com/webhooks/signature.md
  • Connect: https://docs.stripe.com/connect.md

Note: Always rely on the live llms.txt to see the authoritative, current set.


Advanced: Combine Multiple llms.txt Sources

Your work might span multiple ecosystems (e.g., Stripe + LangChain + your internal docs). Point the server at several llms.txt files:

uvx --from mcpdoc mcpdoc \
  --urls \
  "Stripe:https://docs.stripe.com/llms.txt" \
  "LangChain:https://python.langchain.com/llms.txt" \
  "LangGraph:https://langchain-ai.github.io/langgraph/llms.txt" \
  --transport sse \
  --port 8082 \
  --host localhost

Your agent can now fetch across these sources, still grounded in curated pages.


Building Your Own Docs MCP Server (Optional)

If you prefer custom logic or internal/private docs:

  • Expose resources:
    • One resource for the llms.txt itself
    • One resource per linked .md page (on demand)
  • Expose tools:
    • list_doc_sources to enumerate registered llms.txt endpoints
    • fetch_docs(urls: string[]) to retrieve pages as needed
  • Add authentication for private sources (API keys, OAuth) and enforce domain allowlists.

Skeleton (Python) with a pseudo‑MCP server outline:

from typing import List
from urllib.parse import urlparse

import httpx

ALLOWED_DOMAINS = {"docs.stripe.com"}

async def fetch_text(url: str) -> str:
    # Enforce the allowlist before making any network call.
    host = urlparse(url).hostname
    if host not in ALLOWED_DOMAINS:
        raise ValueError(f"Domain not allowed: {host}")
    async with httpx.AsyncClient(timeout=30, follow_redirects=True) as client:
        r = await client.get(url, headers={"Accept": "text/markdown,text/plain,*/*"})
        r.raise_for_status()
        return r.text

# Pseudo-MCP handlers:
async def list_doc_sources():
    return [{"label": "Stripe", "url": "https://docs.stripe.com/llms.txt"}]

async def fetch_docs(urls: List[str]):
    results = []
    for url in urls:
        try:
            text = await fetch_text(url)
            results.append({"url": url, "ok": True, "content": text})
        except Exception as e:
            results.append({"url": url, "ok": False, "error": str(e)})
    return results

# Wire these into your MCP SDK of choice (FastMCP, custom, etc.)

Use an MCP SDK (for example, FastMCP) to register these as tools/resources with stdio/SSE transports.


Security, Performance, and Governance

  • Trust but verify: llms.txt is a discovery mechanism, not a trust boundary. Enforce domain allowlists in your server and host.
  • Prefer .md mirrors: Faster, cleaner parsing. Many sites expose .md or content‑negotiated Markdown.
  • Rate limits & caching:
    • Cache fetched pages by URL and ETag/Last‑Modified.
    • Backoff on 429s; respect robots and publisher guidance if applicable.
  • Permissions:
    • For private docs, authenticate and log all access.
    • Keep IDE/host tool traces enabled for auditing.
  • Governance:
    • If you publish docs, add /llms.txt and consider an “Optional” section for secondary content.
    • Review and prune your llms.txt to keep it tight and useful.
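
The caching bullets can be sketched as a tiny ETag-revalidating cache. This is illustrative only: the fetcher callable and its (status, etag, body) return shape are assumptions, and a real server would also honor Last-Modified and Cache-Control:

```python
class ETagCache:
    """Cache pages by URL and revalidate with the stored ETag on each get."""

    def __init__(self, fetcher):
        self.fetcher = fetcher   # callable(url, etag) -> (status, etag, body)
        self.store = {}          # url -> (etag, body)

    def get(self, url):
        etag, body = self.store.get(url, (None, None))
        status, new_etag, new_body = self.fetcher(url, etag)
        if status == 304:        # not modified: serve the cached copy
            return body
        self.store[url] = (new_etag, new_body)   # fresh 200: update the cache
        return new_body
```

On the second fetch of the same URL, the server sees the conditional request and can answer 304, so the page body is never re-downloaded.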

Troubleshooting

  • Domain blocked errors:

    • Add --allowed-domains for off‑domain links, e.g.:
    uvx --from mcpdoc mcpdoc \
      --urls "Stripe:https://docs.stripe.com/llms.txt" \
      --allowed-domains docs.stripe.com,anotherdomain.com \
      --transport stdio
    
  • Tools not visible in IDE:

    • Ensure the process is running and using --transport stdio.
    • Validate your JSON config paths and syntax.
  • Python/uv path issues:

    • Add --python /path/to/python to uvx args (Claude Desktop often needs this).
  • Agent ignores the server:

    • Add a short rule reminding it to use list_doc_sources and fetch_docs before answering Stripe questions.
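
For the JSON-config bullet above, a quick stdlib sanity check catches syntax errors before the IDE fails silently (the path you pass is illustrative):

```python
import json
import pathlib

def check_mcp_config(path):
    """Raise if the MCP config is malformed; return the registered server names."""
    cfg = json.loads(pathlib.Path(path).read_text())
    if "mcpServers" not in cfg:
        raise ValueError("missing top-level 'mcpServers' key")
    return sorted(cfg["mcpServers"])
```

For example, check_mcp_config(pathlib.Path.home() / ".cursor/mcp.json") should list "stripe-docs-mcp" if the Cursor config above was applied correctly.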

Example: A Complete Developer Flow

1) Start server (stdio for IDE use):

uvx --from mcpdoc mcpdoc \
  --urls "Stripe:https://docs.stripe.com/llms.txt" \
  --transport stdio

2) Configure your IDE (Cursor/Windsurf/Claude Desktop) as shown above.

3) Ask: “What’s the recommended approach to handle Stripe webhook signature verification?”

  • Agent calls list_doc_sources
  • Agent fetches https://docs.stripe.com/llms.txt
  • Agent fetches https://docs.stripe.com/webhooks.md and https://docs.stripe.com/webhooks/signature.md
  • Agent answers with steps and includes links for verification

Result: An answer grounded in live Stripe docs, fetched just‑in‑time.


Wrap‑Up

llms.txt gives your agents a curated map; MCP turns that map into action. By plugging Stripe’s llms.txt (or any other) into an MCP docs server, you get:

  • Source‑backed answers
  • Token efficiency and faster runs
  • Auditable, composable workflows across multiple documentation sets

From here, you could:

  • Generate a ready‑to‑paste IDE config for your team
  • Add rules that enforce “fetch before answer” for your key vendors
  • Scaffold a custom MCP server that merges Stripe docs with your internal guides behind auth

Happy building—and happy fetching.
