A private RAG (retrieval-augmented generation) system: drop in PDFs, Word docs, and code files, then ask questions about them. Everything runs on your own machine, with no cloud dependency.
What You Need
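- Any computer (GPU optional; CPU-only works, but a 14B model will answer slowly, so a smaller variant such as qwen3:4b may be a better fit there)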
- Ollama installed
- About 10 minutes
Architecture
| Component | Role |
|---|---|
| AnythingLLM | Desktop/server app with RAG, agents, built-in vector DB |
| Ollama | Serves local LLM for chat + embeddings |
| Qwen3 14B | Default model for answering questions |
Setup
1. Install Ollama
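# Install from ollama.com, or run with Docker
# (drop "--gpus all" on CPU-only machines; it needs the NVIDIA Container Toolkit):
docker run -d --gpus all -p 11434:11434 --name ollama \
  -v ollama:/root/.ollama ollama/ollama
# Pull a chat model (with the Docker install, prefix with "docker exec ollama"):
ollama pull qwen3:14b
# Pull an embedding model:
ollama pull nomic-embed-text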
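Before moving on, it can help to confirm that Ollama is serving and that both models finished downloading. The sketch below is one way to do that against Ollama's `/api/tags` endpoint, which lists the locally available models. `OLLAMA_URL`, `has_model`, and `check_models` are illustrative names, not part of Ollama, and the `grep` match is a rough check rather than real JSON parsing.

```shell
#!/usr/bin/env bash
# Assumption: Ollama listens on its default port on this machine.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Reads the /api/tags JSON on stdin; succeeds if a model name starts with $1.
has_model() {
  grep -q "\"name\":[[:space:]]*\"$1"
}

# Fetches the model list once and reports on each model this guide pulls.
check_models() {
  local tags model
  if ! tags="$(curl -fsS "$OLLAMA_URL/api/tags")"; then
    echo "Ollama not reachable at $OLLAMA_URL" >&2
    return 1
  fi
  for model in qwen3:14b nomic-embed-text; do
    if printf '%s' "$tags" | has_model "$model"; then
      echo "ok: $model"
    else
      echo "missing: $model (run: ollama pull $model)"
    fi
  done
}
```

Call `check_models` once the pulls finish; a `missing:` line means that pull did not complete.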
2. Install AnythingLLM
Desktop app (easiest): Download from anythingllm.com
Docker:
docker run -d -p 3001:3001 --name anythingllm \
--add-host host.docker.internal:host-gateway \
-v anythingllm:/app/server/storage \
mintplexlabs/anythingllm
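The `--add-host host.docker.internal:host-gateway` flag matters for the next step: inside the container, `localhost` is the container itself, so AnythingLLM needs another name to reach Ollama on the host. A small sketch of which base URL fits each install mode follows; `ollama_url_for` is a hypothetical helper for illustration, and port 11434 assumes the Ollama port mapping used earlier.

```shell
# Prints the Ollama base URL to enter in AnythingLLM for a given install mode.
ollama_url_for() {
  case "$1" in
    docker)  echo "http://host.docker.internal:11434" ;;  # container -> host
    desktop) echo "http://127.0.0.1:11434" ;;             # same machine
    *)       echo "usage: ollama_url_for docker|desktop" >&2; return 2 ;;
  esac
}

ollama_url_for docker
ollama_url_for desktop
```

If AnythingLLM reports that it cannot reach Ollama, a mismatch between install mode and base URL is a likely first thing to check.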
3. Connect & Use
- Open AnythingLLM (http://localhost:3001 or desktop app)
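- Settings > LLM Provider > Select Ollama, model qwen3:14b (base URL: http://host.docker.internal:11434 if AnythingLLM runs in Docker, http://127.0.0.1:11434 for the desktop app)
- Settings > Embedder > Select Ollama, model nomic-embed-text (same base URL)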
- Create a workspace, drop in documents, start asking questions
What You Can Do
- Chat with PDFs, Word docs, code files, web pages
- Create isolated workspaces per project
- Use built-in agent skills (web search, summarization)
- Works on CPU-only machines like a mini PC
Cost vs Cloud
| | Local | ChatGPT + GPTs |
|---|---|---|
| Monthly | $0 | $20-200 |
| Hardware | $0-300 | $0 |
| Privacy | Stays on your machine | Sent to cloud |
| Documents | Limited only by disk space | Subject to upload and token limits |
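To make the table concrete, the break-even point is the one-time hardware cost divided by the monthly subscription it replaces, rounded up to whole months. The sketch below uses only the figures from the table; `break_even_months` is an illustrative helper, and it ignores electricity and any difference in answer quality.

```shell
# break_even_months HARDWARE_USD MONTHLY_USD
# Prints how many whole months of subscription fees equal the hardware cost.
break_even_months() {
  local hardware="$1" monthly="$2"
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (hardware + monthly - 1) / monthly ))
}

break_even_months 300 20    # $300 of hardware vs. the $20/month tier
break_even_months 300 200   # same hardware vs. the $200/month tier
```

At the table's upper hardware figure, the $20 tier is matched after 15 months and the $200 tier after 2.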
Full guide with troubleshooting: https://everylocalai.com/stack/anythingllm-ollama-rag