Mark Jack
mjm.local.docs: Open Source Local Knowledge Base with MCP

The Problem

You are mid-session with Claude Code or another AI coding assistant. You ask:

"How does authentication work in our system?"
"What was the decision behind using event sourcing in the orders module?"

The AI does its best, guessing from the code it can see. But the real answer is buried in a Word document, a PDF architecture diagram, or a Markdown ADR (Architecture Decision Record) that lives somewhere on your disk.

mjm.local.docs solves this.

It is an open-source, locally-deployed knowledge base server that exposes your documents through both:

  • A Blazor Web UI
  • A full Model Context Protocol (MCP) server

This allows your AI assistant to search, read, and even update your documentation directly from chat.

Built on .NET 10, it runs entirely on your machine:

  • No mandatory cloud dependency
  • No data leaving your environment
  • Full support for pluggable embedding models
  • Pluggable vector storage backends

GitHub:
https://github.com/markjackmilian/mjm.local.docs


What Is mjm.local.docs?

mjm.local.docs is a self-hosted semantic document search server.

At its core, it:

  • Ingests documents — PDF, Word (.docx), Markdown, plain text, and more
  • Chunks and embeds them using a configurable embedding provider (local or cloud-based)
  • Stores embeddings in a configurable vector store (SQLite, HNSW index, SQL Server, or in-memory)
  • Exposes search and management through:

    • A Blazor web interface
    • An MCP HTTP endpoint

Clean Architecture

```
Mjm.LocalDocs.Core           ← Domain models, interfaces (zero external dependencies)
Mjm.LocalDocs.Infrastructure ← Implementations: embeddings, readers, vector stores
Mjm.LocalDocs.Server         ← ASP.NET Core host, Blazor UI, MCP tools
```

Everything is wired together via standard .NET dependency injection.

Every major component is swappable via configuration:

  • Embedding provider
  • Vector store
  • File storage

Deploy Locally in Minutes

Clone and run:

```bash
git clone https://github.com/markjackmilian/mjm.local.docs.git
cd mjm.local.docs/mjm.local.docs
dotnet run --project src/Mjm.LocalDocs.Server/Mjm.LocalDocs.Server.csproj
```

Web UI: `http://localhost:5024`

Default credentials: `admin / admin` (change them in `appsettings.json`).

MCP endpoint: `http://localhost:5024/mcp`

No Docker required.
No cloud account needed.

Out of the box:

  • In-memory vector store
  • Deterministic fake embedding generator
  • No API key required

Perfect for local development.
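To make "deterministic fake embeddings" concrete, here is a Python sketch of the word-hash idea. The real provider lives in the C# Infrastructure project and emits 1536-dimensional vectors; this toy version only shows why the approach is useful for tests — the same text always yields the same vector:

```python
import hashlib

def fake_embedding(text: str, dimension: int = 8) -> list[float]:
    """Deterministic pseudo-embedding: hash each word into a bucket."""
    vec = [0.0] * dimension
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = digest[0] % dimension  # stable bucket per word
        vec[bucket] += 1.0
    # L2-normalize so cosine similarity behaves sensibly
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec
```

No model download, no API key — and identical inputs always produce identical vectors, so search results in tests never flap.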


Pluggable Embedding Providers

Embedding generation is fully pluggable via the `IEmbeddingService` interface and configured in `appsettings.json`.

| Provider | `Provider` Value | Dimension | Notes |
|---|---|---|---|
| Fake | `Fake` | 1536 | Deterministic word-hash vectors. Dev/test only. |
| OpenAI | `OpenAI` | 1536 | Uses `text-embedding-3-small`. API key required. |
| Azure OpenAI | `AzureOpenAI` | 1536 | Azure-hosted OpenAI. |
| Ollama | `Ollama` | 768 | Fully local embeddings. |

All providers implement:

`IEmbeddingGenerator<string, Embedding<float>>`

You can bring your own implementation easily.


Running 100% Locally with Ollama

Example configuration:

```json
{
  "LocalDocs": {
    "Embeddings": {
      "Provider": "Ollama",
      "Dimension": 768,
      "Ollama": {
        "Endpoint": "http://localhost:11434",
        "Model": "nomic-embed-text"
      }
    }
  }
}
```

Popular models:

| Model | Dimension | Trade-off |
|---|---|---|
| `nomic-embed-text` | 768 | Balanced quality/speed |
| `mxbai-embed-large` | 1024 | Higher quality, slower |
| `all-minilm` | 384 | Fastest, lower quality |

Storage Options: From SQLite to SQL Server

Configured via:

`LocalDocs:Storage:Provider`
| Provider | Description | Vector Search | Best For |
|---|---|---|---|
| `InMemory` | RAM only | O(n) brute-force | Dev/testing |
| `Sqlite` | EF Core + BLOB embeddings | O(n) cosine | Small/medium KB |
| `SqliteHnsw` | SQLite + HNSW index file | O(log n) approx. | Larger KB |
| `SqlServer` | SQL Server 2025+ `VECTOR` type | DiskANN | Enterprise/Azure |

Connection string examples:

```jsonc
// SQLite
"ConnectionStrings": { "LocalDocs": "Data Source=localdocs.db" }

// SQL Server
"ConnectionStrings": { "LocalDocs": "Server=myserver.database.windows.net;Database=localdocs;..." }
```

Vector Search Under the Hood

Brute-Force (SQLite)

  • Embeddings stored as BLOBs
  • Cosine similarity against every chunk
  • O(n)
  • Reliable for thousands of documents
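The brute-force path is easy to picture: score every stored chunk against the query and keep the best matches. A Python sketch of the idea (the actual store is C# reading BLOBs via EF Core; the chunk IDs and two-dimensional vectors below are made up for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def brute_force_search(query, chunks, top_k=3):
    """Score every stored chunk against the query: O(n) in chunk count."""
    scored = [(cosine(query, emb), chunk_id) for chunk_id, emb in chunks]
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical chunk store
chunks = [
    ("doc1_chunk_0", [1.0, 0.0]),
    ("doc1_chunk_1", [0.0, 1.0]),
    ("doc2_chunk_0", [0.7, 0.7]),
]
results = brute_force_search([1.0, 0.1], chunks, top_k=2)
```

Linear scans like this stay perfectly accurate — every chunk is scored — which is why they remain the sensible default until the corpus grows large.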

HNSW (SqliteHnsw)

Adds a persisted HNSW graph:

```json
{
  "LocalDocs": {
    "Storage": {
      "Provider": "SqliteHnsw",
      "Hnsw": {
        "MaxConnections": 16,
        "EfConstruction": 200,
        "EfSearch": 50,
        "AutoSaveDelayMs": 5000
      }
    }
  }
}
```

Approximate O(log n).
Ideal for tens of thousands of chunks.

SQL Server DiskANN

```sql
CREATE TABLE [dbo].[chunk_embeddings] (
    chunk_id NVARCHAR(255) PRIMARY KEY,
    embedding VECTOR(1536) NOT NULL
);

CREATE VECTOR INDEX vec_idx_chunk_embeddings
ON [dbo].[chunk_embeddings](embedding)
WITH (metric = 'cosine');
```

Supported metrics:

  • cosine
  • euclidean
  • dotproduct
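Which metric should you pick? For unit-length embeddings (many embedding models normalize their output) the three metrics rank neighbors identically, since squared Euclidean distance reduces to `2 - 2*(a.b)` and cosine distance is `1 - a.b`. A quick Python check of that identity on unit vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Two unit-length vectors
a = [0.6, 0.8]
b = [0.8, 0.6]

# For unit vectors: squared Euclidean distance = 2 - 2*dot = 2 * cosine distance
assert abs(euclidean(a, b) ** 2 - (2 - 2 * dot(a, b))) < 1e-9
assert abs(euclidean(a, b) ** 2 - 2 * cosine_distance(a, b)) < 1e-9
```

The choice only changes rankings for unnormalized vectors, where dot product also rewards magnitude.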

Document Processing

Supported Formats

| Format | Reader | Notes |
|---|---|---|
| `.pdf` | PdfPig | Native text only (no OCR) |
| `.docx` | NPOI | Modern `.docx` only |
| `.md` | Markdown reader | Preserves syntax |
| `.txt` | Plain text | UTF-8 |
| `.html`, `.json`, `.xml`, `.csv` | Fallback UTF-8 | Raw extraction |

Chunking

```json
{
  "LocalDocs": {
    "Chunking": {
      "MaxChunkSize": 3000,
      "OverlapSize": 300
    }
  }
}
```

Chunk IDs follow the format:

`{DocumentId}_chunk_{index}`
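A sliding-window sketch in Python of what these two settings control. This is not the project's actual splitter (that is C#, and its boundary handling may differ) — it only shows the window/overlap mechanics and the ID scheme:

```python
def chunk_text(document_id: str, text: str,
               max_chunk_size: int = 3000, overlap_size: int = 300):
    """Split text into fixed-size windows; consecutive chunks share
    overlap_size characters so content cut at a boundary survives."""
    chunks = []
    step = max_chunk_size - overlap_size
    index = 0
    for start in range(0, len(text), step):
        piece = text[start:start + max_chunk_size]
        chunks.append((f"{document_id}_chunk_{index}", piece))
        index += 1
        if start + max_chunk_size >= len(text):
            break
    return chunks

# A 7000-character document with the default settings yields 3 chunks
chunks = chunk_text("doc42", "a" * 7000)
```

The overlap matters for retrieval quality: a sentence that straddles a chunk boundary still appears intact in one of the two neighboring chunks.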

Document Versioning

  • Old version marked IsSuperseded = true
  • Old chunks removed from search
  • Full history preserved
  • Version chain visible in UI

No silent history loss.


The Blazor Web UI

Built with:

  • Blazor Server
  • MudBlazor

Features

Dashboard

  • Total projects
  • Total documents
  • Storage usage

Project Management

  • Drag-and-drop multi-file upload
  • Inline Markdown editor ("Add Know How")
  • Version history navigation
  • Edit, delete, download

MCP Config Page

  • Generates ready-to-use MCP config snippet

API Token Management

  • Named Bearer tokens
  • Optional expiry
  • Tokens shown once
  • Revocable anytime

MCP: Let Your AI Navigate Your Knowledge Base

The server exposes 11 MCP tools:

| Tool | Description |
|---|---|
| `search_docs` | Semantic search |
| `add_document` | Add a new document |
| `update_document` | Create a new version |
| `get_document_content` | Full extracted text |
| `list_projects` | List projects |
| `create_project` | Create a project |
| `get_project` | Project details |
| `delete_project` | Delete a project |
| `list_documents` | List documents |
| `get_document` | Metadata + preview |
| `delete_document` | Delete a document |

Connecting Claude Code / OpenCode

`.claude/mcp.json`:

```json
{
  "mcpServers": {
    "local-docs": {
      "type": "http",
      "url": "http://localhost:5024/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN"
      }
    }
  }
}
```
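Once connected, the assistant invokes tools over MCP's JSON-RPC transport. To see what travels over the wire, here is a Python sketch building the envelope of a `tools/call` request for `search_docs`. The argument name `query` is illustrative — the authoritative schema is whatever the server reports in its `tools/list` response:

```python
import json

def search_docs_request(query: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_docs",
            # Argument names here are illustrative; the actual schema
            # is advertised by the server's tools/list response.
            "arguments": {"query": query},
        },
    }
    return json.dumps(payload)

body = search_docs_request("How does authentication work in our system?")
```

Your MCP client handles all of this for you; the sketch is only to demystify what "the AI calls a tool" means at the protocol level.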

UI vs MCP

| Task | Best Via |
|---|---|
| Initial setup | Web UI |
| Bulk upload | Web UI |
| Version review | Web UI |
| Token management | Web UI |
| Inline authoring | Web UI |
| Semantic search | MCP |
| AI-driven updates | MCP |
| Programmatic ingestion | MCP |

Configuration Reference

Production with Ollama + HNSW

```json
{
  "ConnectionStrings": {
    "LocalDocs": "Data Source=localdocs.db"
  },
  "LocalDocs": {
    "Authentication": {
      "Username": "admin",
      "Password": "your-secure-password"
    },
    "Mcp": {
      "RequireAuthentication": true
    },
    "Embeddings": {
      "Provider": "Ollama",
      "Dimension": 768,
      "Ollama": {
        "Endpoint": "http://localhost:11434",
        "Model": "nomic-embed-text"
      }
    },
    "Storage": {
      "Provider": "SqliteHnsw"
    }
  }
}
```

Environment variables supported:

```
OPENAI_API_KEY
AZURE_OPENAI_ENDPOINT
AZURE_OPENAI_API_KEY
AZURE_STORAGE_CONNECTION_STRING
```

Key Packages

| Package | Purpose |
|---|---|
| Microsoft.SemanticKernel | AI orchestration |
| Microsoft.Extensions.AI | Embedding abstraction |
| ModelContextProtocol.AspNetCore | MCP server |
| EF Core | Persistence |
| MudBlazor | UI |
| PdfPig | PDF extraction |
| NPOI | Word extraction |
| Azure.Storage.Blobs | Blob storage |
| Serilog | Logging |
| xunit + NSubstitute | Testing |

Contributing

Open source and actively developed.

Contributions welcome:

  • New embedding providers
  • Storage backends
  • Document readers
  • MCP tools

Repository:

https://github.com/markjackmilian/mjm.local.docs


Conclusion

mjm.local.docs fills a growing gap in AI-assisted development:

Your AI assistant needs access to your knowledge, not just your code.

By combining:

  • A Blazor UI for humans
  • A full MCP server for AI

it bridges your team's accumulated knowledge with your AI tools, fully local if you choose.

Give it a try.
Star the repo.
Open a PR.


If you found this article helpful, follow me on GitHub, Twitter, and Bluesky.

Thanks for reading!
