If you've been watching the local-AI space lately, you've probably seen OpenClaw land 100k GitHub stars on the back of autonomous agents that build their own tools, their own social networks, and — if you're not careful — their own threat models.
AutoBot takes a different approach: you stay in control. Your data never leaves your machine. Your AI runs on your hardware. And the knowledge base — the thing that makes your local AI actually useful — is something you can read, extend, and contribute to.
This post is for Python developers who want to understand exactly how that knowledge base works, how to feed it your own codebase, and where to plug in if you want to help build it.
## The Stack at a Glance
AutoBot's RAG pipeline is built on three components:
| Layer | Technology | Role |
|---|---|---|
| Embedding model | Ollama (configurable) | Text → vectors |
| Vector store | ChromaDB | Similarity search |
| Retrieval + generation | LlamaIndex | Query → answer |
All of it runs locally. No API calls. No data leaving your machine.
The main module lives at `autobot-backend/knowledge/`. The legacy `knowledge_base.py` at the backend root is a thin re-export shim — all real logic is in the `knowledge/` package.
## End-to-End Pipeline Walk-Through

### 1. Document Ingestion

Entry point: `knowledge/documents.py` — `DocumentsMixin.add_document()`
```python
# knowledge/documents.py
async def add_document(
    self,
    content: str,
    metadata: Optional[Dict[str, Any]] = None,
    doc_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Add a document to the knowledge base with async processing."""
```
"""Add a document to the knowledge base with async processing."""
When you drop a file into AutoBot, this is what happens:
- Content arrives — plain text, Markdown, or PDF.
- Chunking — the document is split into overlapping chunks so context is preserved at retrieval time (see the sketch after this list).
- Embedding — each chunk is converted to a 768-dimensional float vector by the configured Ollama model.
- Storage — vectors + original text land in ChromaDB, keyed by a stable document ID.
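To make the chunking step concrete, here is a minimal sketch of the overlapping-window idea. The function name and sizes are illustrative, not AutoBot's actual implementation:

```python
# Illustrative only: AutoBot's real chunker lives in the knowledge/
# package, and its chunk size and overlap are configurable.
def chunk_text(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    step = size - overlap  # each chunk re-covers the last `overlap` chars of the previous one
    return [text[start:start + size] for start in range(0, len(text), step)]
```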
The embedding call goes through `knowledge/embedding_cache.py` (`EmbeddingCache`), which deduplicates repeated content and tracks usage via `api/analytics_embedding_patterns.py` (Issue #285). Cache hits skip the Ollama round-trip entirely — useful when you re-index after editing a doc.
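The caching idea reduces to keying on a hash of the chunk text and only calling the model on a miss. A rough sketch, not the actual `EmbeddingCache` API:

```python
import hashlib

class NaiveEmbeddingCache:
    """Illustrative stand-in for knowledge/embedding_cache.py."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn                  # e.g. a call into Ollama
        self._store: dict[str, list[float]] = {}

    def get_embedding(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:                 # miss: one Ollama round-trip
            self._store[key] = self._embed_fn(text)
        return self._store[key]                    # hit: no model call
```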
### 2. Index Configuration

Entry point: `knowledge/index.py` — `IndexMixin`
ChromaDB uses HNSW (Hierarchical Navigable Small World) for approximate nearest-neighbour search. AutoBot exposes the tuning parameters directly:
```python
# knowledge/index.py
def _get_hnsw_metadata(self) -> Dict[str, Any]:
    return {
        "hnsw:space": self.hnsw_space,  # distance metric (cosine by default)
        "hnsw:construction_ef": self.hnsw_construction_ef,
        "hnsw:search_ef": self.hnsw_search_ef,
        "hnsw:M": self.hnsw_m,
    }
```
The current defaults are tuned for collections with 545k+ vectors (Issue #72). If you're running on modest hardware with a small KB, you can lower `hnsw:M` to reduce memory pressure.
All ChromaDB calls are wrapped with `asyncio.to_thread()` (Issue #369) to keep the FastAPI event loop unblocked — something to be aware of if you're adding new index operations.
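If you do add one, the pattern looks roughly like this (the collection object and query vector are placeholders):

```python
import asyncio

async def query_collection(collection, query_vector: list[float], top_k: int = 5):
    # ChromaDB's client is synchronous; running it in a worker thread
    # keeps the FastAPI event loop free to handle other requests.
    return await asyncio.to_thread(
        collection.query,
        query_embeddings=[query_vector],
        n_results=top_k,
    )
```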
### 3. Query → Answer

Entry point: `knowledge/base.py` — `KnowledgeBaseCore`
On query:
- The question is embedded with the same Ollama model used at ingestion (same vector space = valid similarity).
- HNSW search finds the top-k most similar chunks.
- The chunks are passed to LlamaIndex as context alongside the query.
- LlamaIndex sends the augmented prompt to the local Ollama LLM.
- The answer references your documents, not generic training data.
```python
# knowledge/base.py — core wiring
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore
```
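Continuing from those imports, the wiring looks roughly like this. The model names, collection name, and path are assumptions for illustration, not AutoBot's exact configuration:

```python
import chromadb

# Local models via Ollama (both model names are illustrative).
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
Settings.llm = Ollama(model="llama3", request_timeout=120.0)

# Point LlamaIndex at the existing ChromaDB collection.
client = chromadb.PersistentClient(path="./data/chromadb")
collection = client.get_or_create_collection("autobot_knowledge")
vector_store = ChromaVectorStore(chroma_collection=collection)

index = VectorStoreIndex.from_vector_store(vector_store)
answer = index.as_query_engine(similarity_top_k=5).query(
    "How does document ingestion work?"
)
```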
### 4. Advanced Retrieval

Entry point: `autobot-backend/advanced_rag_optimizer.py`
For complex queries, AutoBot can upgrade from plain vector search to a hybrid pipeline:
- Hybrid scoring — blends semantic similarity (HNSW cosine) with BM25 keyword score via `knowledge/search_components/reranking.py`.
- Query expansion — reformulates the question to improve recall on technical vocabulary mismatches.
- MAP-Elites diversification — ensures results span multiple knowledge categories rather than returning near-duplicate chunks.
- GPU acceleration — `utils/semantic_chunker_gpu.py` uses RTX 4070 / OpenVINO where available.
The `SearchResult` dataclass in `advanced_rag_optimizer.py` carries both the raw content and all four score dimensions (`semantic_score`, `keyword_score`, `hybrid_score`, `rerank_score`) — useful if you want to instrument retrieval quality.
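Conceptually, the hybrid blend is a weighted sum over those dimensions. A hedged sketch; the weight and the normalization assumption are illustrative, not the actual `compute_blended_score()` logic:

```python
def blend_scores(semantic: float, keyword: float, alpha: float = 0.7) -> float:
    # Assumes both inputs are already normalized to [0, 1];
    # raw BM25 scores would need min-max scaling first.
    return alpha * semantic + (1 - alpha) * keyword
```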
### 5. Background Vectorization

Entry point: `autobot-backend/background_vectorization.py` — `BackgroundVectorizer`
When you add new facts or documents while AutoBot is running, BackgroundVectorizer picks them up asynchronously via FastAPI background tasks. You don't have to trigger a full re-index — the KB stays live.
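On the FastAPI side, that pattern is small. A sketch in which the endpoint path and the two helpers are hypothetical:

```python
from fastapi import APIRouter, BackgroundTasks

router = APIRouter()

@router.post("/knowledge/documents")
async def add_document_endpoint(payload: dict, background_tasks: BackgroundTasks):
    doc_id = store_raw_document(payload)                   # hypothetical helper
    # Runs after the response is sent, so ingestion never blocks the request.
    background_tasks.add_task(vectorize_document, doc_id)  # hypothetical helper
    return {"doc_id": doc_id, "status": "queued"}
```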
## Feeding Your Codebase to the Knowledge Base
AutoBot has a dedicated `CodeEmbeddingGenerator` (`autobot-backend/code_embedding_generator.py`) that uses CodeBERT instead of a generic text embedding model. Code has different semantics than prose — function names, types, and structure matter — and CodeBERT is trained on code.
```python
# code_embedding_generator.py
@dataclass
class CodeEmbeddingResult:
    embedding: np.ndarray       # 768-dim CodeBERT vector
    device_used: str            # 'npu', 'cuda', or 'cpu'
    processing_time_ms: float
    model_name: str
    cache_hit: bool
```
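To see what producing one of those vectors looks like outside AutoBot, here is a bare-bones version using Hugging Face transformers. Mean pooling is one common choice; AutoBot's generator additionally handles device selection and caching:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

def embed_code(snippet: str) -> torch.Tensor:
    inputs = tokenizer(snippet, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # mean-pool to a 768-dim vector

vec = embed_code("def add(a, b):\n    return a + b")
```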
To index your codebase:
### Option 1 — Via the chat UI
```
You: Index the ./src directory into the knowledge base
AutoBot: ✓ Scanning ./src...
         Indexed 847 functions across 63 files
         Embedding device: NPU (OpenVINO)
         Ready for semantic code search
```
### Option 2 — Via the connector system

The `knowledge/connectors/` directory has a registry (`registry.py`) and a scheduler (`scheduler.py`). You can register a file-server connector pointing at your repo root and let AutoBot watch for changes:
```python
# knowledge/connectors/file_server.py
# Register your source directory as a watched connector
connector = FileServerConnector(
    root_path="/path/to/your/repo",
    watch=True,
    file_extensions=[".py", ".md", ".yaml"],
)
```
### Option 3 — Notion, web, database

Connectors also exist for Notion (`notion.py`), web crawl (`web_crawler.py`), audio (`audio_connector.py`), and database (`database.py`). The base class is `knowledge/connectors/base.py` — implement `fetch()` and register via `registry.py`.
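A new connector is mostly a `fetch()` implementation. A hypothetical skeleton; check `base.py` for the actual `BaseConnector` signature before building on this:

```python
from knowledge.connectors.base import BaseConnector  # actual signature may differ

class GitHubIssuesConnector(BaseConnector):
    """Hypothetical example: pull a repo's issues into the KB."""

    def __init__(self, repo: str, token: str):
        self.repo = repo
        self.token = token

    async def fetch(self) -> list[dict]:
        # Call the GitHub API, then return documents shaped for ingestion,
        # e.g. {"content": issue_body, "metadata": {"source": self.repo}}.
        ...
```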
## Where to Plug In: Contributing to the KB Engine
Here are the cleanest entry points for first contributions:
### `knowledge/documents.py` — `DocumentsMixin`

**Good for:** adding new file format support (EPUB, HTML, DOCX), improving chunking strategy.

The `add_document()` and related methods are well-isolated. A chunking improvement here applies to every ingestion path.
### `knowledge/connectors/` — Connector Registry

**Good for:** adding new data sources (GitHub issues, Jira, Slack export).

Implement the `BaseConnector` interface and register in `registry.py`. Look at `notion.py` for a reference implementation with authentication handling.
### `advanced_rag_optimizer.py` — Hybrid Search

**Good for:** retrieval quality improvements, new reranking strategies, better query expansion.

The `SearchResult` + `QueryContext` dataclasses are clean — adding a new scoring dimension means extending the dataclass and wiring it into `compute_blended_score()` in `knowledge/search_components/reranking.py`.
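For instance, a recency dimension would look something like this (the field name and default are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class SearchResult:
    content: str
    semantic_score: float
    keyword_score: float
    hybrid_score: float
    rerank_score: float
    recency_score: float = 0.0  # hypothetical new dimension; fold it into
                                # compute_blended_score() to take effect
```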
### `knowledge/index.py` — HNSW Tuning

**Good for:** performance work on large vector collections, memory footprint reduction.
The HNSW parameter exposure is deliberately simple. There's room for adaptive tuning based on collection size and hardware profile.
### `background_vectorization.py` — `BackgroundVectorizer`

**Good for:** incremental sync improvements, smarter deduplication, conflict resolution when a connector and a manual upload touch the same document.
## Running the KB Locally
```bash
# Clone and start the full stack
git clone https://github.com/mrveiss/AutoBot-AI
cd AutoBot-AI
docker compose up -d

# Or use the installer script
curl -fsSL https://raw.githubusercontent.com/mrveiss/AutoBot-AI/Dev_new_gui/install.sh | bash
```
The knowledge base stores vectors in `./data/chromadb/` by default. It persists across container restarts.
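You can sanity-check the store from a Python shell (assuming the default path; the collection names are whatever AutoBot created):

```python
import chromadb

client = chromadb.PersistentClient(path="./data/chromadb")
# Depending on your chromadb version, this returns Collection objects
# or plain collection names.
print(client.list_collections())
```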
To run just the backend in dev mode:
```bash
cd autobot-backend
pip install -r requirements.txt
uvicorn app_factory:create_app --factory --reload --port 8000
```
## Where to Go Next
If you want to contribute to the Python side:
- Good first issues (Python label): github.com/mrveiss/AutoBot-AI/labels/python
- All good first issues: github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue
- Contributing guide: CONTRIBUTING.md
- GitHub Discussions: github.com/mrveiss/AutoBot-AI/discussions
If this article saved you an hour of reading source code, you can buy me a coffee on Ko-fi — it goes directly toward hardware time for the project.
AutoBot is free, open source, and runs entirely on your hardware. The RAG pipeline is the core of what makes a local AI assistant actually useful — and it's a great place to dig in.
Your data. Your AI.