The Problem
You are mid-session with Claude Code or another AI coding assistant. You ask:
"How does authentication work in our system?"
"What was the decision behind using event sourcing in the orders module?"
The AI does its best, guessing from the code it can see. But the real answer is buried in a Word document, a PDF architecture diagram, or a Markdown ADR (Architecture Decision Record) that lives somewhere on your disk.
mjm.local.docs solves this.
It is an open-source, locally-deployed knowledge base server that exposes your documents through both:
- A Blazor Web UI
- A full Model Context Protocol (MCP) server
This allows your AI assistant to search, read, and even update your documentation directly from chat.
Built on .NET 10, it runs entirely on your machine:
- No mandatory cloud dependency
- No data leaving your environment
- Full support for pluggable embedding models
- Pluggable vector storage backends
GitHub:
https://github.com/markjackmilian/mjm.local.docs
What Is mjm.local.docs?
mjm.local.docs is a self-hosted semantic document search server.
At its core, it:
- Ingests documents — PDF, Word (.docx), Markdown, plain text, and more
- Chunks and embeds them using a configurable embedding provider (local or cloud-based)
- Stores embeddings in a configurable vector store (SQLite, HNSW index, SQL Server, or in-memory)
- Exposes search and management through:
  - A Blazor web interface
  - An MCP HTTP endpoint
Clean Architecture
Mjm.LocalDocs.Core ← Domain models, interfaces (zero external dependencies)
Mjm.LocalDocs.Infrastructure ← Implementations: embeddings, readers, vector stores
Mjm.LocalDocs.Server ← ASP.NET Core host, Blazor UI, MCP tools
Everything is wired together via standard .NET dependency injection.
Every major component is swappable via configuration:
- Embedding provider
- Vector store
- File storage
Deploy Locally in Minutes
Clone and run:
git clone https://github.com/markjackmilian/mjm.local.docs.git
cd mjm.local.docs/mjm.local.docs
dotnet run --project src/Mjm.LocalDocs.Server/Mjm.LocalDocs.Server.csproj
Web UI:
http://localhost:5024
Default credentials:
admin / admin
(Change them in appsettings.json.)
MCP endpoint:
http://localhost:5024/mcp
No Docker required.
No cloud account needed.
Out of the box:
- In-memory vector store
- Deterministic fake embedding generator
- No API key required
Perfect for local development.
Pluggable Embedding Providers
Embedding generation is fully pluggable via the IEmbeddingService interface.
Configured in appsettings.json.
| Provider | Provider Value | Dimension | Notes |
|---|---|---|---|
| Fake | Fake | 1536 | Deterministic word-hash vectors. Dev/test only. |
| OpenAI | OpenAI | 1536 | Uses text-embedding-3-small. API key required. |
| Azure OpenAI | AzureOpenAI | 1536 | Azure-hosted OpenAI. |
| Ollama | Ollama | 768 | Fully local embeddings. |
All providers implement:
IEmbeddingGenerator<string, Embedding<float>>
You can bring your own implementation easily.
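To illustrate the idea behind the Fake provider, here is a standalone sketch of a deterministic word-hash embedder. This is a simplified illustration, not the project's actual implementation, and it uses a tiny dimension for readability:

```csharp
using System;
using System.Linq;

// Deterministic word-hash embedding: the same text always maps to the
// same vector, which is all a dev/test provider needs.
static float[] FakeEmbed(string text, int dimension = 8)
{
    var vector = new float[dimension];
    foreach (var word in text.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        // Simple polynomial rolling hash of the word.
        int hash = 17;
        foreach (var c in word) hash = unchecked(hash * 31 + c);
        vector[(hash & int.MaxValue) % dimension] += 1f;
    }
    // Unit-normalize so cosine similarity behaves sensibly.
    var norm = MathF.Sqrt(vector.Sum(v => v * v));
    return norm == 0 ? vector : vector.Select(v => v / norm).ToArray();
}

Console.WriteLine(FakeEmbed("hello world").SequenceEqual(FakeEmbed("hello world"))); // True
```

Because the hash is deterministic, re-ingesting the same text always produces identical vectors, which makes tests reproducible without any API key.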
Running 100% Locally with Ollama
Example configuration:
{
"LocalDocs": {
"Embeddings": {
"Provider": "Ollama",
"Dimension": 768,
"Ollama": {
"Endpoint": "http://localhost:11434",
"Model": "nomic-embed-text"
}
}
}
}
Popular models:
| Model | Dimension | Trade-off |
|---|---|---|
| nomic-embed-text | 768 | Balanced quality/speed |
| mxbai-embed-large | 1024 | Higher quality, slower |
| all-minilm | 384 | Fastest, lower quality |
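Assuming Ollama is already installed and running, a model from the table needs to be pulled once before the first embedding request:

```shell
ollama pull nomic-embed-text
```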
Storage Options: From SQLite to SQL Server
Configured via:
LocalDocs:Storage:Provider
| Provider | Description | Vector Search | Best For |
|---|---|---|---|
| InMemory | RAM only | O(n) brute-force | Dev/testing |
| Sqlite | EF Core + BLOB embeddings | O(n) cosine | Small/medium KB |
| SqliteHnsw | SQLite + HNSW index file | O(log n) approx | Larger KB |
| SqlServer | SQL Server 2025+ VECTOR type | DiskANN | Enterprise/Azure |
Connection string examples:
// SQLite
"ConnectionStrings": { "LocalDocs": "Data Source=localdocs.db" }
// SQL Server
"ConnectionStrings": { "LocalDocs": "Server=myserver.database.windows.net;Database=localdocs;..." }
Vector Search Under the Hood
Brute-Force (SQLite)
- Embeddings stored as BLOBs
- Cosine similarity against every chunk
- O(n)
- Reliable for thousands of documents
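The brute-force strategy is simple enough to sketch in a few lines. This illustrates the approach (cosine similarity against every chunk), not the project's actual code:

```csharp
using System;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Brute-force top-k: score the query against every stored chunk (O(n)).
static int[] TopK(float[] query, float[][] chunks, int k) =>
    chunks.Select((c, i) => (Score: Cosine(query, c), Index: i))
          .OrderByDescending(x => x.Score)
          .Take(k)
          .Select(x => x.Index)
          .ToArray();
```

Every query touches every chunk, which is why this stays fast only up to thousands of documents; beyond that, an approximate index like HNSW pays off.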
HNSW (SqliteHnsw)
Adds a persisted HNSW graph:
{
"LocalDocs": {
"Storage": {
"Provider": "SqliteHnsw",
"Hnsw": {
"MaxConnections": 16,
"EfConstruction": 200,
"EfSearch": 50,
"AutoSaveDelayMs": 5000
}
}
}
}
Approximate O(log n).
Ideal for tens of thousands of chunks.
SQL Server DiskANN
CREATE TABLE [dbo].[chunk_embeddings] (
chunk_id NVARCHAR(255) PRIMARY KEY,
embedding VECTOR(1536) NOT NULL
);
CREATE VECTOR INDEX vec_idx_chunk_embeddings
ON [dbo].[chunk_embeddings](embedding)
WITH (metric = 'cosine');
Supported metrics:
- cosine
- euclidean
- dotproduct
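Given the table and index above, a nearest-neighbour query can be sketched with SQL Server's VECTOR_DISTANCE function. Exact syntax may vary across SQL Server 2025 builds, and the abbreviated literal stands in for a full 1536-float query embedding:

```sql
-- @query would normally arrive as a parameter; shown here as a cast
-- from a JSON array string of 1536 floats (abbreviated).
DECLARE @query VECTOR(1536) = CAST('[0.12, -0.03, ...]' AS VECTOR(1536));

SELECT TOP (5)
       chunk_id,
       VECTOR_DISTANCE('cosine', embedding, @query) AS distance
FROM [dbo].[chunk_embeddings]
ORDER BY distance;  -- smaller cosine distance = more similar
```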
Document Processing
Supported Formats
| Format | Reader | Notes |
|---|---|---|
| .pdf | PdfPig | Native text only (no OCR) |
| .docx | NPOI | Modern .docx only |
| .md | Markdown reader | Preserves syntax |
| .txt | Plain text | UTF-8 |
| .html, .json, .xml, .csv | Fallback UTF-8 | Raw extraction |
Chunking
{
"LocalDocs": {
"Chunking": {
"MaxChunkSize": 3000,
"OverlapSize": 300
}
}
}
Chunk IDs follow the format:
{DocumentId}_chunk_{index}
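The effect of the two chunking settings can be pictured with a simple sliding-window chunker. This illustrates overlapping chunking in general, not the project's exact algorithm, and uses tiny sizes for readability:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sliding-window chunking: each chunk holds at most maxSize characters
// and repeats the last `overlap` characters of the previous chunk.
static List<string> Chunk(string text, int maxSize, int overlap)
{
    if (overlap >= maxSize) throw new ArgumentException("overlap must be smaller than maxSize");
    var chunks = new List<string>();
    int step = maxSize - overlap;
    for (int start = 0; start < text.Length; start += step)
    {
        chunks.Add(text.Substring(start, Math.Min(maxSize, text.Length - start)));
        if (start + maxSize >= text.Length) break;
    }
    return chunks;
}

// Chunk IDs then follow the {DocumentId}_chunk_{index} convention:
var ids = Chunk("abcdefghij", 4, 2)
    .Select((c, i) => $"doc1_chunk_{i}")
    .ToList();
Console.WriteLine(string.Join(", ", ids)); // doc1_chunk_0, doc1_chunk_1, doc1_chunk_2, doc1_chunk_3
```

The overlap means a sentence split at a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated embedding work.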
Document Versioning
- Old version marked IsSuperseded = true
- Old chunks removed from search
- Full history preserved
- Version chain visible in UI
No silent history loss.
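The bookkeeping can be sketched minimally; tuples stand in for the real document-version model, which this sketch does not reproduce:

```csharp
using System;
using System.Collections.Generic;

// Supersede-on-update: mark every existing version IsSuperseded = true,
// then append the new current version.
static (int Version, bool IsSuperseded) AddVersion(List<(int Version, bool IsSuperseded)> history)
{
    for (int i = 0; i < history.Count; i++)
        history[i] = (history[i].Version, true);

    var next = (Version: history.Count + 1, IsSuperseded: false);
    history.Add(next);
    return next;
}

var history = new List<(int, bool)>();
AddVersion(history);
AddVersion(history);
// Old version kept but superseded; new version is current.
Console.WriteLine(string.Join(" | ", history)); // (1, True) | (2, False)
```

Superseded versions stay in the list, so nothing is lost; they are simply excluded from search.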
The Blazor Web UI
Built with:
- Blazor Server
- MudBlazor
Features
Dashboard
- Total projects
- Total documents
- Storage usage
Project Management
- Drag-and-drop multi-file upload
- Inline Markdown editor ("Add Know How")
- Version history navigation
- Edit, delete, download
MCP Config Page
- Generates ready-to-use MCP config snippet
API Token Management
- Named Bearer tokens
- Optional expiry
- Tokens shown once
- Revocable anytime
MCP: Let Your AI Navigate Your Knowledge Base
Exposes 11 MCP tools.
| Tool | Description |
|---|---|
| search_docs | Semantic search |
| add_document | Add new document |
| update_document | Create new version |
| get_document_content | Full extracted text |
| list_projects | List projects |
| create_project | Create project |
| get_project | Project details |
| delete_project | Delete project |
| list_documents | List documents |
| get_document | Metadata + preview |
| delete_document | Delete document |
Connecting Claude Code / OpenCode
.claude/mcp.json:
{
"mcpServers": {
"local-docs": {
"type": "http",
"url": "http://localhost:5024/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_TOKEN"
}
}
}
}
UI vs MCP
| Task | Best Via |
|---|---|
| Initial setup | Web UI |
| Bulk upload | Web UI |
| Version review | Web UI |
| Token management | Web UI |
| Inline authoring | Web UI |
| Semantic search | MCP |
| AI-driven updates | MCP |
| Programmatic ingestion | MCP |
Configuration Reference
Production with Ollama + HNSW
{
"ConnectionStrings": {
"LocalDocs": "Data Source=localdocs.db"
},
"LocalDocs": {
"Authentication": {
"Username": "admin",
"Password": "your-secure-password"
},
"Mcp": {
"RequireAuthentication": true
},
"Embeddings": {
"Provider": "Ollama",
"Dimension": 768,
"Ollama": {
"Endpoint": "http://localhost:11434",
"Model": "nomic-embed-text"
}
},
"Storage": {
"Provider": "SqliteHnsw"
}
}
}
Environment variables supported:
OPENAI_API_KEY
AZURE_OPENAI_ENDPOINT
AZURE_OPENAI_API_KEY
AZURE_STORAGE_CONNECTION_STRING
Key Packages
| Package | Purpose |
|---|---|
| Microsoft.SemanticKernel | AI orchestration |
| Microsoft.Extensions.AI | Embedding abstraction |
| ModelContextProtocol.AspNetCore | MCP server |
| EF Core | Persistence |
| MudBlazor | UI |
| PdfPig | PDF extraction |
| NPOI | Word extraction |
| Azure.Storage.Blobs | Blob storage |
| Serilog | Logging |
| xunit + NSubstitute | Testing |
Contributing
Open source and actively developed.
Contributions welcome:
- New embedding providers
- Storage backends
- Document readers
- MCP tools
Repository:
https://github.com/markjackmilian/mjm.local.docs
Conclusion
mjm.local.docs fills a growing gap in AI-assisted development:
Your AI assistant needs access to your knowledge, not just your code.
By combining a Blazor UI for humans with a full MCP server for AI, it bridges your team's accumulated knowledge with your AI tools, fully local if you choose.
Give it a try.
Star the repo.
Open a PR.
If you found this article helpful, follow me on GitHub, Twitter, and Bluesky.
Thanks for reading!

