Kailash Sankar

PocketMCP: Running a Local-First MCP Server

I wanted a simple way to make my personal documents searchable and accessible directly from VS Code and Cursor. The goal: run everything fully local on my mini server (Intel N100, 16GB RAM) and keep it lightweight.

This project—PocketMCP—does exactly that. The entire codebase was generated with Cursor, and I’m sharing the repo and prompts here.


What It Does

  • Watches a folder (kb/) and automatically ingests documents.
  • Supports Markdown, plain text, PDF, and DOCX files.
  • Splits them into semantic chunks and embeds each chunk with Transformers.js (MiniLM model).
  • Stores vectors in SQLite + sqlite-vec for efficient semantic search.
  • Exposes the data through:
    • MCP server (via stdio and HTTP) → usable by Cursor & VS Code.
    • Web UI & API server → for manual testing, debugging, and verification.

All of this runs offline after the initial model download. Perfect for a homelab setup where privacy and simplicity matter.
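
As a rough illustration of the first two bullets, here is a minimal sketch of the kind of filter a watcher could apply before ingesting a file. The helper name `isSupportedDocument` and the case-insensitive matching are assumptions for illustration, not code taken from the repo; only the four extensions come from the list above.

```typescript
// Hypothetical helper: decide whether a file dropped into kb/ should be ingested.
// The extension list mirrors the formats named in this post.
const SUPPORTED_EXTENSIONS: readonly string[] = [".md", ".txt", ".pdf", ".docx"];

function isSupportedDocument(filePath: string): boolean {
  // Take the last path segment, accepting both "/" and "\" separators.
  const name = filePath.split(/[\\/]/).pop() ?? "";
  const dot = name.lastIndexOf(".");
  // No extension at all, or a dotfile such as ".gitignore".
  if (dot <= 0) return false;
  // Assumption: extensions are matched case-insensitively.
  const ext = name.slice(dot).toLowerCase();
  return SUPPORTED_EXTENSIONS.includes(ext);
}
```

A watcher callback would call this on each changed path and skip anything that returns `false`, such as a legacy `.doc` file.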


How It Works

  1. File Watcher
    Monitors the documents folder for changes and triggers ingestion.

  2. Chunking & Embedding
    Breaks documents into ~1000-character segments with overlap. Each segment is embedded using the MiniLM model.

    • PDFs: processed page by page (text-based only, no OCR).
    • DOCX: section-aware, with optional split-on-headings.

  3. Storage
    Embeddings are stored in SQLite using the sqlite-vec extension for fast similarity search.

  4. Serving
    • Stdio transport: standard MCP protocol for dev tool integration.
    • HTTP transport: http://localhost:8001/mcp for web or LAN access.
    • Web + API server: React UI (:5173) and Express API (:5174) to run searches, inspect the DB, and verify ingestion.
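
The chunking step can be sketched as a sliding window over the text. This is a simplified, fixed-size version under stated assumptions: the function name `chunkText` and the 200-character default overlap are made up for illustration, and the real implementation may also respect page, section, or heading boundaries as described above.

```typescript
// Hypothetical sketch: fixed-size character chunking with overlap.
interface Chunk {
  index: number; // position of the chunk in the document
  start: number; // character offset where the chunk begins
  text: string;
}

function chunkText(text: string, chunkSize = 1000, overlap = 200): Chunk[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in [0, chunkSize)");
  }
  const chunks: Chunk[] = [];
  // Each new chunk starts `step` characters after the previous one,
  // so consecutive chunks share `overlap` characters.
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      index: chunks.length,
      start,
      text: text.slice(start, start + chunkSize),
    });
    // Stop once a chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.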

Figure: PocketMCP architecture diagram.
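
To make the storage and search step concrete, here is a sketch of what a similarity query computes. It is a brute-force, in-memory stand-in and not the sqlite-vec API; the names `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical, and the distance metric actually used in the repo is not stated here.

```typescript
// Hypothetical in-memory stand-in for a vector table.
interface StoredChunk {
  id: number;
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal ones.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query embedding and keep the best k.
function topK(
  query: number[],
  chunks: StoredChunk[],
  k: number
): { chunk: StoredChunk; score: number }[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A vector extension does the same ranking inside the database, so the application does not have to load every embedding into memory for each query.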


Why These Choices

  • Local-first: No external services are needed after the initial model download.
  • Multi-format support: Works with .md, .txt, .pdf, and .docx.
  • SQLite: Minimal setup, no external DB needed.
  • Transformers.js: Runs embeddings in pure JS, avoiding Python stack overhead.
  • MCP protocol: Direct integration with Cursor and VS Code.

It is designed for small hardware and a low-friction setup, but it handles tens of thousands of chunks comfortably.


Running & Verifying

  • Run directly with pnpm dev:mcp or pull the Docker image.
  • Use the web server interface to confirm ingestion and search.
  • Repo includes sample files to test quickly.

Screenshots: web app homepage; web app search interface.

Then integrate with VS Code, via stdio or HTTP:

Screenshots: VS Code stdio approach; VS Code HTTP approach.
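
As a sketch of what the two registrations might look like, the snippet below builds both variants as JSON, assuming VS Code's `.vscode/mcp.json` layout with a top-level `servers` key. The server name `pocketmcp` and the stdio command are placeholders to adjust for your checkout; only the HTTP URL is taken from the Serving step above.

```typescript
// Shapes assumed from VS Code's .vscode/mcp.json layout; check them against
// the VS Code docs for your version.
type StdioServer = { type: "stdio"; command: string; args: string[] };
type HttpServer = { type: "http"; url: string };
type McpConfig = { servers: Record<string, StdioServer | HttpServer> };

// Option 1: VS Code launches the server itself and talks to it over stdin/stdout.
// The command and args are placeholders for whatever starts PocketMCP in stdio mode.
function stdioConfig(command: string, args: string[]): McpConfig {
  return { servers: { pocketmcp: { type: "stdio", command, args } } };
}

// Option 2: VS Code connects to an already running server over HTTP.
function httpConfig(url: string): McpConfig {
  return { servers: { pocketmcp: { type: "http", url } } };
}

// Assumption: `pnpm dev:mcp` is used here only because it is the run command
// mentioned in this post; it may not be the right stdio entry point.
const viaStdio = JSON.stringify(stdioConfig("pnpm", ["dev:mcp"]), null, 2);
const viaHttp = JSON.stringify(httpConfig("http://localhost:8001/mcp"), null, 2);
```

Stdio keeps everything on one machine with no open port, while HTTP lets an editor on another machine reach the server over the LAN.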


Notes & Limits

  • PDFs: text-based only; encrypted or oversized files are skipped.
  • DOCX: .docx only (not legacy .doc). Large files may be truncated.
  • Chunking: adjustable via env vars (CHUNK_SIZE, CHUNK_OVERLAP).
  • Performance: sub-100ms query latency for typical datasets on N100-class hardware.
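
The chunking knobs could be read and validated along these lines. This is a sketch only: `readChunkingConfig` is a hypothetical helper, the 1000-character default follows the segment size mentioned earlier, and the 200-character overlap default is an assumption rather than the repo's actual value.

```typescript
interface ChunkingConfig {
  chunkSize: number;
  chunkOverlap: number;
}

// Parse an optional integer setting, falling back to a default when unset.
function parseIntOr(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value)) throw new Error(`${name} must be an integer, got "${raw}"`);
  return value;
}

// Hypothetical helper: takes an env-like object so it is easy to test.
// Defaults are assumptions (1000 from the post's "~1000-character" figure, 200 invented).
function readChunkingConfig(env: Record<string, string | undefined>): ChunkingConfig {
  const chunkSize = parseIntOr(env["CHUNK_SIZE"], 1000, "CHUNK_SIZE");
  const chunkOverlap = parseIntOr(env["CHUNK_OVERLAP"], 200, "CHUNK_OVERLAP");
  if (chunkSize <= 0) throw new Error("CHUNK_SIZE must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("CHUNK_OVERLAP must be at least 0 and smaller than CHUNK_SIZE");
  }
  return { chunkSize, chunkOverlap };
}
```

In a Node process this would be called as `readChunkingConfig(process.env)`. Rejecting an overlap that is not smaller than the chunk size matters because such a window would never advance through the document.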

Repo + prompts: PocketMCP
