DEV Community

Sheikh Ahnaf Hasan
Sheikh Ahnaf Hasan

Posted on

I Built an MCP Server to Search Documentation from Claude (So You Don't Have to Web Search)

Ever been coding with Claude and needed to check the docs for a library? You either:

  1. Open a browser, search, scan through pages
  2. Ask Claude, who might hallucinate outdated info
  3. Copy-paste docs into the chat (token explosion)

I got tired of this loop. So I built DocMCP – an MCP server that crawls documentation sites, indexes them locally, and lets Claude search them directly.

What is MCP?

Model Context Protocol (MCP) is Anthropic's open standard for connecting AI assistants to external tools and data. Think of it as "plugins for Claude" – but standardized and secure.

The Problem

Claude's knowledge has a cutoff date. When you're working with:

  • A new library version
  • Recently updated API docs
  • Framework-specific patterns

...Claude might not have the latest info. Web search helps, but it's slow and breaks your flow.

The Solution

DocMCP lets you:

# One-time setup
npx @pieeee/docmcp init

# Index any docs site
npx @pieeee/docmcp add https://tailwindcss.com/docs
npx @pieeee/docmcp add https://react.dev
npx @pieeee/docmcp add https://docs.astro.build
Enter fullscreen mode Exit fullscreen mode

Then in Claude Code, Cursor, or Claude Desktop:

"How do I center a div in Tailwind?"

Claude searches your local indexed docs and gives you accurate, up-to-date answers.

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Crawl Docs │ ──► │  Chunk +    │ ──► │   SQLite    │
│  (sitemap)  │     │  Embed      │     │  FTS5 + Vec │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                               ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Claude    │ ◄── │  MCP Server │ ◄── │   Hybrid    │
│   Answer    │     │  (stdio)    │     │   Search    │
└─────────────┘     └─────────────┘     └─────────────┘
Enter fullscreen mode Exit fullscreen mode
  1. Crawl: Follow sitemap or recursive links
  2. Parse: Clean HTML → Markdown, preserve code blocks
  3. Chunk: Split at headings (~512 tokens each)
  4. Index: BM25 (keyword) + vector embeddings (semantic)
  5. Search: Hybrid ranking with Reciprocal Rank Fusion

Why Hybrid Search?

Pure keyword search fails on:

"how to make elements wrap to next line"

(Answer: flex-wrap – but "wrap" ≠ "flex-wrap")

Pure vector search fails on:

"useState hook"

(Exact API names need exact matches)

DocMCP combines both. BM25 handles exact terms, vectors handle semantics. Results are merged using RRF (Reciprocal Rank Fusion).

Setup in 2 Minutes

1. Install

npm install -g @pieeee/docmcp
Enter fullscreen mode Exit fullscreen mode

2. Initialize

docmcp init
Enter fullscreen mode Exit fullscreen mode

Choose your embedding provider:

  • Anthropic (Voyage) – recommended for Claude users
  • OpenAI – if you have an OpenAI key
  • BM25 only – no API key needed, keyword search only

3. Index Some Docs

docmcp add https://tailwindcss.com/docs
docmcp add https://nextjs.org/docs
Enter fullscreen mode Exit fullscreen mode

4. Connect to Claude

Claude Code:

claude mcp add docmcp -- docmcp serve
Enter fullscreen mode Exit fullscreen mode

Cursor (~/.cursor/mcp.json):

{
  "mcpServers": {
    "docmcp": {
      "command": "docmcp",
      "args": ["serve"]
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "docmcp": {
      "command": "docmcp",
      "args": ["serve"]
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Real Example

After indexing Tailwind docs:

Me: "How do I add a gradient background in Tailwind?"

Claude (using search_docs):

Based on the Tailwind documentation, you can use the bg-gradient-to-{direction} utilities:

<div class="bg-gradient-to-r from-cyan-500 to-blue-500">
  <!-- Content -->
</div>

Directions: t (top), r (right), b (bottom), l (left), tr, br, etc.

Accurate. Up-to-date. No hallucination.

What's Under the Hood

  • SQLite with FTS5 for BM25 full-text search
  • sqlite-vec for vector similarity search
  • Crawlee for respectful, sitemap-aware crawling
  • Turndown for HTML → Markdown conversion
  • Written in TypeScript, runs on Node.js 20+

All data stays local in ~/.docmcp/. No cloud dependencies (except embedding API calls if you use one).

Embedding Provider Comparison

Provider API Key Best For
Anthropic (Voyage) ANTHROPIC_API_KEY Claude users, high quality
OpenAI OPENAI_API_KEY Already have OpenAI key
BM25 only None Zero setup, privacy-first

BM25-only mode is surprisingly good for technical docs where you're often searching exact function names.

Roadmap

  • [ ] Local embeddings (ONNX, no API needed)
  • [ ] add_docs MCP tool (index from Claude directly)
  • [ ] Scheduled re-crawls for freshness
  • [ ] VS Code extension

Try It Out

npm install -g @pieeee/docmcp
docmcp init
docmcp add https://docs.your-favorite-library.com
Enter fullscreen mode Exit fullscreen mode

GitHub: github.com/pieeee/docmcp

Top comments (0)