Khoj Review 2026: Your Self-Hosted AI Second Brain

This article was originally published on aifoss.dev

TL;DR: Khoj is the only serious open-source AI assistant that indexes your personal files — Obsidian vaults, org-mode, PDFs — and runs on your own hardware. Setup takes about 30 minutes via Docker Compose. Answer quality depends on which LLM you configure, but even a mid-range local model beats having your notes on someone else's server.

	Khoj (self-hosted)	Notion AI	Mem.ai
Best for	Obsidian/org-mode power users	Teams already in Notion	Solo creators wanting auto-organization
Privacy	Full — data stays on your machine	Notion's servers	Mem.ai's servers
Personal knowledge integration	Obsidian, org-mode, PDF, GitHub, Notion API	Notion pages only	Mem.ai notes only
Cost/month	~$0–$5 (electricity only)	$10+/user	$14.99/user
The catch	Docker setup required, AGPL-3.0 license	Can't index local files	No self-host option

Honest take: If you have an Obsidian vault with hundreds of notes and want to query it with AI that stays on your machine, Khoj is the only FOSS option worth running. Notion AI and Mem.ai send your knowledge base to their clouds and don't touch local filesystems.

What Khoj actually is

Khoj is an open-source personal AI assistant that does two things most alternatives skip: it indexes your own files, and it lets you bring your own LLM. You run it as a server — on a local machine, a home lab box, or a cheap VPS — point it at your documents, configure an LLM backend (local or cloud), and get a chat interface that answers from your actual knowledge base.

The GitHub repository (khoj-ai/khoj) has been actively maintained since 2022. The stable release on PyPI is v1.42.10, published July 2025. A v2.0 series has been in beta since early 2025 — beta.28 was pushed in late May 2026 — but the stable PyPI channel stays on the 1.42.x series for now. If you want the latest features, pull the beta image; if you want something predictable, stick with the stable pip package.

License is AGPL-3.0. For personal or internal-team use, this is a non-issue. If you're thinking about building a customer-facing product on top of Khoj, AGPL requires you to make the source code of your modified version available to the users who interact with it over a network. Talk to your legal team before going that route.

What it indexes

Khoj stores vector embeddings in PostgreSQL with the pgvector extension. The supported source types as of v1.42.10:

Markdown files — your entire Obsidian vault, any folder of .md files
Org-mode files — one of the few tools with native org support, a real differentiator for Emacs users
PDFs — text extraction via poppler
Word documents — .docx parsing
Plain text — .txt files
Notion pages — via Notion API integration (needs an integration token configured in your workspace)
GitHub repositories — can index Markdown and code files from a repo
Web pages — add URLs manually, or let the SearxNG-backed web search pull them at query time

Indexing is incremental. The first sync embeds everything; subsequent syncs only re-embed files that changed. For a 500-note Obsidian vault on a modern machine, the initial index runs in 3–8 minutes with a cloud embedding model.
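The incremental behavior can be pictured as content-hash change detection. The sketch below is not Khoj's implementation; it is a minimal illustration, assuming GNU coreutils (sha256sum, sort -z) and a hypothetical manifest file, of how a sync step could decide which Markdown files need re-embedding.

```shell
# Hypothetical helper, not part of Khoj: print the Markdown files under DIR
# whose content hash is new or different since the last run, then update
# the manifest of known hashes.
# Usage: changed_notes DIR MANIFEST
changed_notes() {
  dir=$1
  manifest=$2
  current=$(mktemp)

  # One "hash  path" line per file, in a stable order.
  find "$dir" -type f -name '*.md' -print0 | sort -z | xargs -0 -r sha256sum > "$current"

  if [ -f "$manifest" ]; then
    # Keep lines absent from the old manifest (new or modified files),
    # then strip the hash prefix to leave only the path.
    grep -Fxv -f "$manifest" "$current" | sed 's/^[0-9a-f]*  //' || true
  else
    # First run: every file needs embedding.
    sed 's/^[0-9a-f]*  //' "$current"
  fi

  mv "$current" "$manifest"
}
```

The first call lists every note; later calls list only notes whose contents changed. Deleted files are not handled here, which a real indexer would also need to track.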

Setup: Docker Compose in about 30 minutes

Self-hosting Khoj means running up to five containers: the main server, a PostgreSQL + pgvector database, a Terrarium Python sandbox (for code execution), a SearxNG instance (for web search), and an optional computer-control service. The docker-compose.yml in the repo orchestrates all of it.

# Download the compose file
curl -o docker-compose.yml \
  https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml

# Set required secrets (replace "changeme" with a strong password)
export KHOJ_ADMIN_PASSWORD=changeme
export KHOJ_DJANGO_SECRET_KEY=$(openssl rand -hex 32)

# Optional: add a cloud LLM API key if you don't want to run local models
export OPENAI_API_KEY=sk-...
# or
export ANTHROPIC_API_KEY=sk-ant-...

# Start everything
docker compose up -d

The Khoj server listens on port 42110. Open http://localhost:42110 and log in with the admin credentials you set. The PostgreSQL service uses the pgvector/pgvector:pg15 image with a health check so Khoj won't start until the database is fully ready — this prevents the connection errors that tripped up earlier versions.
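Because the stack has several containers, the server may take a little while to answer after docker compose up -d. A small retry helper can wait for it. The function name wait_until and the choice of probing the root URL with curl are illustrative assumptions, not something the Khoj docs prescribe.

```shell
# Hypothetical helper: run a probe command until it succeeds or the
# attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumes curl is installed and the default port is unchanged):
# wait_until 30 2 curl -fsS http://localhost:42110/ && echo "Khoj is up"
```

Passing the probe as arguments keeps the helper generic, so the same loop can wait on the database or on Ollama with a different command.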

Hardware requirements, per project documentation: minimum 4 GB RAM if you're using cloud LLM APIs. For local model inference via Ollama, plan for 8–16 GB RAM plus a GPU. For comfortable local inference with 30B-class models, 16 GB GPU VRAM is the recommended floor. An RTX 3090 or RTX 4090 (24 GB of VRAM each) covers the 30B range well at 4-bit quantization. A 72B model such as Qwen2.5-72B needs roughly 36 GB for its 4-bit weights alone, so it will not fit on a single 24 GB card without offloading layers to system RAM. If you'd rather not buy GPU hardware, RunPod lets you rent a GPU instance to run your Khoj server for a fraction of the purchase cost.
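A rough way to sanity-check GPU sizing is to estimate the memory taken by the model weights alone: parameter count times bits per weight, divided by eight. This back-of-the-envelope sketch ignores the KV cache and runtime overhead, so real usage is higher. The function name is illustrative.

```shell
# Rough estimate of weight memory in GB (1 GB = 10^9 bytes).
# Usage: weights_gb PARAMS_IN_BILLIONS BITS_PER_WEIGHT
weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 8 4    # prints 4.0
weights_gb 32 4   # prints 16.0
weights_gb 72 4   # prints 36.0
```

By this estimate, a 4-bit 72B model needs about 36 GB for weights, which is more than the 24 GB available on a single RTX 3090 or RTX 4090.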

Connecting a local LLM via Ollama

If you'd rather keep your notes entirely off cloud APIs, Khoj integrates with Ollama cleanly. Once Ollama is running on the same host (or reachable over the network), configure it in the Khoj admin panel at http://localhost:42110/server/admin/:

Go to AI Models → LLM Model Config
Set the API base to your Ollama instance: http://host.docker.internal:11434
Set the model name to match what you've pulled in Ollama — e.g. qwen2.5:32b, llama3.1:8b, mistral:7b
Save and set it as the default chat model
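Before saving the config, it can help to confirm that the model name matches exactly what Ollama has pulled. A small sketch, assuming the ollama CLI is installed on the host and that ollama list prints a header row followed by one model per line with the name in the first column. has_model is a hypothetical helper.

```shell
# Hypothetical helper: succeed if MODEL appears in the first column of
# `ollama list`-style output read from stdin (the header row is skipped).
# Usage: ollama list | has_model MODEL
has_model() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Example (requires a working Ollama install):
# ollama list | has_model qwen2.5:32b || ollama pull qwen2.5:32b
```

If the name checks out but Khoj still cannot connect, the hostname is worth a look: on Linux, host.docker.internal resolves inside a container only when the compose file maps it, for example with an extra_hosts entry pointing at host-gateway.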

The embedding model is a separate setting. Khoj defaults to text-embedding-3-small from OpenAI. To go fully local, swap it for nomic-embed-text via Ollama: configure it the same way in AI Models → Embedding Model Config, then trigger a full reindex. The reindex is a one-time step after switching, needed because vectors produced by different embedding models are not comparable with each other.

Practical note: with an 8B local model, Khoj answers simple note retrieval questions well. Multi-hop reasoning ("what's the connection between the ideas in my Kubernetes notes and my database optimization notes?") is where smaller models start to struggle. A 32B-class model is where you stop noticing the gap compared to GPT-4o for most knowledge work.

The Obsidian workflow

The Khoj Obsidian plugin is in the community plugins marketplace. Install it, enable it, and open its settings to point it at http://localhost:42110. The plugin syncs your vault automatically on a periodic schedule; hit Force Sync in plugin settings to kick off an immediate reindex.

After sync, you get a chat panel inside Obsidian. Ask natural language questions — "What did I write about TCP congestion control?", "Summarize my notes on the Q4 project" — and Khoj retrieves the semantically relevant notes, passes them as context to your LLM, and returns an answer with citations linking back to the source notes.
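The retrieval step can be illustrated with a toy ranking by cosine similarity, a common measure for comparing embeddings. This is a sketch of the general technique, not Khoj's code: in Khoj the vectors live in PostgreSQL with pgvector, whereas here they are hand-written numbers in a tab-separated stream, and rank_notes is a hypothetical name.

```shell
# Hypothetical helper: rank notes by cosine similarity to a query vector.
# stdin: one note per line as "name<TAB>space-separated vector".
# Usage: rank_notes "QUERY VECTOR" < notes.tsv
rank_notes() {
  awk -F'\t' -v q="$1" '
    BEGIN {
      n = split(q, qv, " ")
      for (i = 1; i <= n; i++) qnorm += qv[i] * qv[i]
    }
    {
      split($2, v, " ")
      dot = 0; vnorm = 0
      for (i = 1; i <= n; i++) {
        dot += qv[i] * v[i]
        vnorm += v[i] * v[i]
      }
      # Guard against zero-length vectors to avoid dividing by zero.
      score = (qnorm > 0 && vnorm > 0) ? dot / sqrt(qnorm * vnorm) : 0
      printf "%.3f\t%s\n", score, $1
    }
  ' | sort -rn
}

# Toy data: three notes with made-up 2-dimensional "embeddings".
printf 'tcp-notes\t1 0\nrecipes\t0 1\nnetworking\t1 1\n' | rank_notes "1 0"
```

With this toy input the ranking is tcp-notes (1.000), then networking (0.707), then recipes (0.000). In a real pipeline the top few results would be passed to the LLM as context along with the question.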

The beta.25 release added progress tracking during batch sync, which matters once your vault grows past a few hundred notes. Large vaults (1,000+ notes) take 5–15 minutes on first index, shorter on subsequent syncs.

One thing that surprises people: the chat context persists through a session. You can ask a follow-up question and Khoj maintains the thread. This is handled by the server, not the plugin, so it works the same way in the browser interface.

Emacs, mobile, and other clients

For org-mode users, Khoj ships an Emacs package. Authenticate it to your self-hosted server, and you can call M-x khoj-chat or M-x khoj-search without leaving your editor. For anyone who lives in org-roam or uses org files as their primary knowledge store, this is the direct path — no browser required.

Other clients that work with your self-hosted Khoj:

Browser — the web interface at localhost:42110 has the full feature set: chat, search, agents, automations panel, and research mode
Desktop app — Electron wrapper available for macOS, Windows, and Linux; connects to a