<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mārtiņš Veiss</title>
    <description>The latest articles on DEV Community by Mārtiņš Veiss (@mrveiss).</description>
    <link>https://dev.to/mrveiss</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3867783%2Fb7397987-7baa-4c63-b090-414707e4daa6.jpg</url>
      <title>DEV Community: Mārtiņš Veiss</title>
      <link>https://dev.to/mrveiss</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mrveiss"/>
    <language>en</language>
    <item>
      <title>Weekly Update: ✨ feat(backend): lightweight inference mode — bypass RAG/mem</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 01 Jun 2026 05:00:12 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-featbackend-lightweight-inference-mode-bypass-ragmem-g4g</link>
      <guid>https://dev.to/mrveiss/weekly-update-featbackend-lightweight-inference-mode-bypass-ragmem-g4g</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ feat(backend): lightweight inference mode — bypass RAG/memory for trivial tier (MVA-1992) (#9160)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 feat(backend): lightweight inference mode — bypass RAG/memor&lt;/li&gt;
&lt;li&gt;🔧 feat(plugins): add capability approval dialog and audit log &lt;/li&gt;
&lt;li&gt;🔧 fix(ci): sync Ansible slm_agent role + fix chromadb hardened&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-25T05:00:04.234278Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-25T05:00:04.234278Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Running AI in regulated environments: how AutoBot keeps your documents on-premise</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Thu, 28 May 2026 06:51:48 +0000</pubDate>
      <link>https://dev.to/mrveiss/running-ai-in-regulated-environments-how-autobot-keeps-your-documents-on-premise-pb6</link>
      <guid>https://dev.to/mrveiss/running-ai-in-regulated-environments-how-autobot-keeps-your-documents-on-premise-pb6</guid>
      <description>&lt;p&gt;Most AI productivity tools ask you to trust a third party with your data. For a solo dev building side projects, that trade-off is fine. For a law firm, a hospital system, or a fintech company, it isn't.&lt;/p&gt;

&lt;p&gt;This post is for the second group. I want to walk through exactly how AutoBot handles data in a regulated environment, where the risks actually sit, and what you need to configure to deploy it safely.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with cloud AI in regulated industries
&lt;/h2&gt;

&lt;p&gt;When you send a prompt to GPT-4, Claude, or Gemini, your text crosses the network. That's obvious. What's less obvious is what else goes with it when you're using most AI platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The documents you've uploaded to provide context&lt;/li&gt;
&lt;li&gt;The "system prompt" that describes your business or patient workflows&lt;/li&gt;
&lt;li&gt;Metadata about what you're working on and when&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For HIPAA-covered entities, PHI (Protected Health Information) cannot be processed by a business associate without a signed BAA (Business Associate Agreement). Most consumer AI products don't offer BAAs. The ones that do charge enterprise pricing and require legal review cycles.&lt;/p&gt;

&lt;p&gt;Article 28 of the GDPR sets similar requirements for data processors. SOC 2 Type II audits will ask where your data goes. ISO 27001 requires you to document and control those data flows.&lt;/p&gt;

&lt;p&gt;None of this means you can't use AI. It means you need to choose carefully where your data goes.&lt;/p&gt;




&lt;h2&gt;
  
  
  AutoBot's data model
&lt;/h2&gt;

&lt;p&gt;AutoBot separates two things that most platforms conflate: the &lt;strong&gt;knowledge base&lt;/strong&gt; and the &lt;strong&gt;brain&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The knowledge base&lt;/strong&gt; is the documents you upload — your patient intake forms, your case files, your customer contracts, your internal codebase. In AutoBot, this data never leaves your machine. The RAG engine (Retrieval-Augmented Generation) indexes your documents locally into ChromaDB, a vector database running on your own hardware. When you ask a question, the relevant chunks are retrieved locally — no external API call has happened yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The brain&lt;/strong&gt; is the LLM that synthesizes the retrieved context into an answer. This is where you have a choice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run it locally via Ollama or any OpenAI-compatible server — prompts never leave your network&lt;/li&gt;
&lt;li&gt;Route to a cloud model (GPT-4, Claude): only the assembled prompt (your question plus the retrieved excerpts) goes out, &lt;strong&gt;not your full documents&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The dividing line: the brain may phone home. Your document store never does.&lt;/p&gt;

&lt;p&gt;This is meaningfully different from uploading files to a cloud AI assistant. In AutoBot, a search for "what does our standard NDA say about IP assignment" retrieves the relevant clause locally and sends only the question + retrieved text to the LLM. Your full NDA document never leaves your hardware.&lt;/p&gt;




&lt;h2&gt;
  
  
  Network isolation in practice
&lt;/h2&gt;

&lt;p&gt;The data model is one layer. Network configuration is another.&lt;/p&gt;

&lt;p&gt;AutoBot runs on Docker Compose. The default configuration is suited to a development environment: it publishes ports on &lt;code&gt;0.0.0.0&lt;/code&gt;, which means any machine that can reach the host can reach AutoBot. For production in a regulated environment, you need to tighten this.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Bind to localhost, not all interfaces
&lt;/h3&gt;

&lt;p&gt;In your &lt;code&gt;docker-compose.override.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:3000:3000"&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:8000:8000"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means AutoBot is only reachable from the host machine itself. Put a reverse proxy (nginx, Caddy) in front of it that handles TLS and access control.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Internal service network
&lt;/h3&gt;

&lt;p&gt;Keep internal services on an internal-only Docker network that has no route to the outside:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autobot-internal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;internal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;chromadb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;autobot-internal&lt;/span&gt;
  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;autobot-internal&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;autobot-internal&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;  &lt;span class="c1"&gt;# only backend needs external access for LLM calls&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ChromaDB and Redis should never be reachable outside the Docker network.&lt;/p&gt;
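
&lt;p&gt;For reference, steps 1 and 2 can live in the same override file. The following is a sketch that simply combines the two fragments above, assuming the service names used in this post; it is not the file the project ships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# docker-compose.override.yml: steps 1 and 2 combined (illustrative sketch)
networks:
  autobot-internal:
    internal: true          # no route to or from the outside

services:
  frontend:
    ports:
      - "127.0.0.1:3000:3000"
    # no "networks" key: stays on the default network only,
    # so it can reach the backend but not ChromaDB or Redis
  backend:
    ports:
      - "127.0.0.1:8000:8000"
    networks:
      - autobot-internal    # talks to ChromaDB and Redis
      - default             # outbound LLM calls, traffic from the frontend
  chromadb:
    networks:
      - autobot-internal
  redis:
    networks:
      - autobot-internal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Two Compose behaviours are worth checking here. A service that declares its own &lt;code&gt;networks&lt;/code&gt; list is attached only to the networks in that list, which is why the backend names &lt;code&gt;default&lt;/code&gt; explicitly. And Compose concatenates &lt;code&gt;ports&lt;/code&gt; entries across files instead of replacing them, so if the base file already publishes a port on all interfaces, the override alone may not remove that binding. Running &lt;code&gt;docker compose config&lt;/code&gt; prints the merged result so you can confirm what will actually be published.&lt;/p&gt;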

&lt;h3&gt;
  
  
  3. Resource limits
&lt;/h3&gt;

&lt;p&gt;One question I keep seeing (shoutout to @clawnewsai.bsky.social for surfacing this): does AutoBot enforce resource limits by default?&lt;/p&gt;

&lt;p&gt;Currently no. The &lt;code&gt;deploy.resources&lt;/code&gt; stanza in &lt;code&gt;docker-compose.yml&lt;/code&gt; is left to the operator. For production, add explicit limits to prevent one runaway process from starving the host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;cpus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0"&lt;/span&gt;
          &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;4G&lt;/span&gt;
        &lt;span class="na"&gt;reservations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;cpus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.5"&lt;/span&gt;
          &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1G&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Real-world numbers vary with load. A good starting point for a team of 5-10 users: 4 vCPUs and 8GB RAM for the full stack on a dedicated host.&lt;/p&gt;
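
&lt;p&gt;As a rough illustration of how a 4 vCPU / 8GB budget could be divided, the sketch below applies the same &lt;code&gt;limits&lt;/code&gt; stanza to the four services named in this post. The split is an assumption made for illustration, not a measured recommendation, and it assumes Ollama runs outside Compose with its own headroom:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# docker-compose.override.yml (illustrative split of a 4 vCPU / 8GB host)
services:
  backend:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4G
  chromadb:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 2G
  redis:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 1G
  frontend:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 1G
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;These limits add up to 4.0 CPUs and 8G, which leaves nothing for the host OS, so in practice you would trim them or size the host slightly larger. Watching &lt;code&gt;docker stats&lt;/code&gt; under real load is a reasonable way to adjust the numbers.&lt;/p&gt;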




&lt;h2&gt;
  
  
  What goes where: the data flow audit
&lt;/h2&gt;

&lt;p&gt;For compliance documentation, here's exactly what leaves your network in each configuration:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Prompts&lt;/th&gt;
&lt;th&gt;Answers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Full local (Ollama)&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid (OpenAI/Claude for LLM)&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;Yes, including retrieved excerpts, to the LLM provider&lt;/td&gt;
&lt;td&gt;Returned from LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud-only (no local Ollama)&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;Yes, including retrieved excerpts, to the LLM provider&lt;/td&gt;
&lt;td&gt;Returned from LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In all cases, your full documents stay on your hardware: the ChromaDB vector store is local, and the retrieval step is local. In the two cloud modes, the excerpts retrieved for a question travel inside the prompt.&lt;/p&gt;

&lt;p&gt;If you're running Ollama on the same host, nothing crosses the network boundary at all. This is the right configuration for HIPAA-covered environments until you have a BAA with your LLM provider.&lt;/p&gt;
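
&lt;p&gt;When Ollama is installed directly on the host and not as a Compose service, the backend container still needs a route to it. A minimal sketch, assuming the backend reads the &lt;code&gt;OLLAMA_HOST&lt;/code&gt; variable (as in the compose file shown in the DevOps guide) and a Docker Engine recent enough to support &lt;code&gt;host-gateway&lt;/code&gt; (20.10 or later):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# docker-compose.override.yml (sketch: Ollama installed on the host, not in Compose)
services:
  backend:
    environment:
      - OLLAMA_HOST=http://host.docker.internal:11434   # 11434 is Ollama's default port
    extra_hosts:
      - "host.docker.internal:host-gateway"             # maps the name to the host on Linux
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Ollama listens on &lt;code&gt;127.0.0.1&lt;/code&gt; by default, so on Linux it typically has to be configured to listen on an address the container can reach, such as the Docker bridge address. If you do that, keep port 11434 closed to everything else with the host firewall.&lt;/p&gt;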




&lt;h2&gt;
  
  
  Getting started in a restricted environment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI

&lt;span class="c"&gt;# Copy and edit the environment file&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env

&lt;span class="c"&gt;# For full local operation: install Ollama first&lt;/span&gt;
&lt;span class="c"&gt;# https://ollama.com/download&lt;/span&gt;
ollama pull llama3.2

&lt;span class="c"&gt;# Start AutoBot&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Connect Ollama as your LLM provider. Upload your first document.&lt;/p&gt;

&lt;p&gt;From that point on, nothing leaves your machine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Compliance checklist before production
&lt;/h2&gt;

&lt;p&gt;This is not legal advice. Get a qualified compliance officer to sign off before processing regulated data. But here's the technical baseline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Reverse proxy with TLS in front of AutoBot&lt;/li&gt;
&lt;li&gt;[ ] Ports bound to &lt;code&gt;127.0.0.1&lt;/code&gt;, not &lt;code&gt;0.0.0.0&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] ChromaDB and Redis on internal Docker network only&lt;/li&gt;
&lt;li&gt;[ ] Resource limits set per service&lt;/li&gt;
&lt;li&gt;[ ] Ollama running locally if zero data egress is required&lt;/li&gt;
&lt;li&gt;[ ] Audit logging enabled on the reverse proxy layer&lt;/li&gt;
&lt;li&gt;[ ] Host firewall rules (&lt;code&gt;ufw&lt;/code&gt; / &lt;code&gt;firewalld&lt;/code&gt;) blocking unexpected inbound&lt;/li&gt;
&lt;li&gt;[ ] Regular backups of ChromaDB volume (your knowledge base)&lt;/li&gt;
&lt;li&gt;[ ] BAA executed with any cloud LLM provider you route to&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The DevOps guide at &lt;a href="https://dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive-into-docker-compose-model-sizing-and-production-ops-2d56"&gt;dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive&lt;/a&gt; covers the reverse proxy and firewall setup in detail.&lt;/p&gt;




&lt;h2&gt;
  
  
  The overhead question
&lt;/h2&gt;

&lt;p&gt;I keep getting asked: what's the overhead for a small team?&lt;/p&gt;

&lt;p&gt;For a team of 5-10: a single machine with 16GB RAM and a modern CPU runs the full stack comfortably in CPU-only mode. GPU is optional — it dramatically accelerates local inference but isn't required. The DevOps guide has sizing tables for different workload profiles.&lt;/p&gt;

&lt;p&gt;The operational overhead is similar to running any self-hosted application: you own the uptime, you manage the updates, you back up the data. For regulated industries, that overhead is already priced in. You're not adding a new burden; you're moving existing compliance obligations to infrastructure you control.&lt;/p&gt;




&lt;h2&gt;
  
  
  The bottom line
&lt;/h2&gt;

&lt;p&gt;AutoBot doesn't replace your compliance program. But with a local model, it removes the data-egress problem from the AI layer entirely.&lt;/p&gt;

&lt;p&gt;If you're in a regulated industry and you've been holding off on AI because the data residency question is unresolved, the answer is to not send the data in the first place.&lt;/p&gt;

&lt;p&gt;Your knowledge base stays on your machine. You pick the brain. If you pick a local brain, nothing leaves your network.&lt;/p&gt;

&lt;p&gt;That's the architecture. The compliance framework on top is yours to build. But the foundation is solid.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AutoBot is open source at &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI&lt;/a&gt;. Full documentation, Docker deployment guides, and community discussions are there. If you're working through a regulated-environment deployment and hit a configuration question, open a discussion — the community has seen most of the edge cases.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>selfhosted</category>
      <category>security</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Weekly Update: ✨ fix(llc): address LLC API/model gaps (#8479 #8478 #8476 #8</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 25 May 2026 05:00:10 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-fixllc-address-llc-apimodel-gaps-8479-8478-8476-8-57f2</link>
      <guid>https://dev.to/mrveiss/weekly-update-fixllc-address-llc-apimodel-gaps-8479-8478-8476-8-57f2</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ fix(llc): address LLC API/model gaps (#8479 #8478 #8476 #8474 #8462 #8461 #8493)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 fix(llc): address LLC API/model gaps (#8479 #8478 #8476 #847&lt;/li&gt;
&lt;li&gt;🔧 fix(llc): fix 18 LLC functional bugs from sprint discovery (&lt;/li&gt;
&lt;li&gt;🔧 fix(llc): restore missing enums, auth, migrations, SAST, CI &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-18T05:00:06.602657Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-18T05:00:06.602657Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Inside AutoBot's Frontend: A Developer Walkthrough</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:32:51 +0000</pubDate>
      <link>https://dev.to/mrveiss/inside-autobots-frontend-a-developer-walkthrough-2k3j</link>
      <guid>https://dev.to/mrveiss/inside-autobots-frontend-a-developer-walkthrough-2k3j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AutoBot&lt;/strong&gt; is the open-source, self-hosted AI automation platform where your data never leaves your server.&lt;br&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What you see when you open AutoBot
&lt;/h2&gt;

&lt;p&gt;AutoBot's chat interface greets you with a familiar two-pane layout: a conversation sidebar on the left and an active chat panel on the right.  Behind that simplicity lives a rich UI built from about 40 focused Vue single-file components.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chat UI
&lt;/h3&gt;

&lt;p&gt;The core chat flow is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ChatView.vue
  └── ChatInterface.vue
        ├── ChatSidebar.vue        ← conversation list + search
        ├── ChatHeader.vue         ← model selector, settings toggle
        ├── ChatMessages.vue       ← scrolling message feed
        │     └── MessageItem.vue  ← per-message bubble + citations
        ├── ChatInput.vue          ← textarea, attachments, send button
        ├── ChatTabs.vue           ← switch between Chat / Browser / Docs
        └── CitationsDisplay.vue   ← inline source links from RAG
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ChatTabs.vue&lt;/code&gt; component is the pivot point: it lets you jump between a raw conversation, an embedded browser session (for web research), and a documentation search sidebar — all within the same view.&lt;/p&gt;

&lt;h3&gt;
  
  
  Knowledge Base UI
&lt;/h3&gt;

&lt;p&gt;AutoBot's Knowledge Base is where the "your data" part of &lt;em&gt;Your data. Your AI.&lt;/em&gt; lives. &lt;code&gt;KnowledgeView.vue&lt;/code&gt; brings together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeBrowser&lt;/strong&gt; — file-tree style explorer of all ingested documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeSearch&lt;/strong&gt; — full-text + vector search with &lt;code&gt;KBSearchResultPanel&lt;/code&gt; rendering scored results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeGraph / KnowledgeGraph3D&lt;/strong&gt; — D3-powered entity graph so you can &lt;em&gt;see&lt;/em&gt; how concepts connect&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeUpload&lt;/strong&gt; — drag-and-drop ingestion with real-time vectorization progress (&lt;code&gt;VectorizationProgressModal&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeMaintenance&lt;/strong&gt; — deduplication, cleanup stats, and orphan management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pipeline inside &lt;code&gt;KnowledgeView&lt;/code&gt; fans out to more than 30 sub-components, but each one has a narrow responsibility.  If you add a new panel, you're usually only touching one file.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component architecture in &lt;code&gt;autobot-frontend/&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;autobot-frontend/
├── src/
│   ├── components/       # feature-scoped component trees
│   │   ├── chat/
│   │   ├── knowledge/
│   │   ├── agents/
│   │   ├── browser/
│   │   ├── charts/
│   │   └── base/         # shared primitives (buttons, modals, …)
│   ├── views/            # route-level pages (one per route)
│   ├── stores/           # Pinia stores (useChatStore, useKnowledgeStore, …)
│   ├── composables/      # shared reactive logic
│   ├── design-system/    # tokens.ts — canonical token catalog
│   ├── router/           # Vue Router config
│   └── styles/           # global CSS + Tailwind @theme block
├── cypress/              # end-to-end tests
└── package.json          # Vue 3 + Vite + Tailwind CSS 4 + TypeScript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tech stack at a glance&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Framework&lt;/td&gt;
&lt;td&gt;Vue 3 (Composition API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build&lt;/td&gt;
&lt;td&gt;Vite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;Pinia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Styling&lt;/td&gt;
&lt;td&gt;Tailwind CSS 4 (&lt;code&gt;@theme&lt;/code&gt; tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing&lt;/td&gt;
&lt;td&gt;Vitest (unit) + Cypress/Playwright (e2e)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storybook&lt;/td&gt;
&lt;td&gt;Component stories live in &lt;code&gt;src/stories/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Design tokens
&lt;/h3&gt;

&lt;p&gt;All colors, spacings, and radii flow from &lt;code&gt;src/design-system/tokens.ts&lt;/code&gt;.  This file is the single source of truth for token &lt;em&gt;names&lt;/em&gt;; actual values live in &lt;code&gt;src/assets/tailwind.css&lt;/code&gt; under the &lt;code&gt;@theme&lt;/code&gt; block.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tokens.ts (abridged)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SEMANTIC_COLORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-primary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-primary text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-secondary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-secondary text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-success text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// …&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding a new brand color means two edits: &lt;code&gt;tailwind.css&lt;/code&gt; for the value, &lt;code&gt;tokens.ts&lt;/code&gt; to register the name.  That's it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to contribute to the UI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Get the repo running
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI/autobot-frontend
npm &lt;span class="nb"&gt;install
&lt;/span&gt;npm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dev server starts at &lt;code&gt;http://localhost:5173&lt;/code&gt;.  You don't need a running backend to work on visual components — the Storybook stories in &lt;code&gt;src/stories/&lt;/code&gt; cover most UI primitives in isolation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Explore Storybook
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run storybook
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DesignTokens.stories.ts&lt;/code&gt; gives you the full token palette in one page.  If you want to see a component in isolation before wiring it up to real data, stories are the right place to start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Find good first issues
&lt;/h3&gt;

&lt;p&gt;The fastest path to a first contribution is the &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;&lt;code&gt;good first issue&lt;/code&gt; + &lt;code&gt;area: frontend&lt;/code&gt;&lt;/a&gt; label combination.  These are scoped to single components or small style fixes — no need to understand the full stack before opening a PR.&lt;/p&gt;

&lt;p&gt;Common entry points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility&lt;/strong&gt; — the &lt;code&gt;ACCESSIBILITY_IMPROVEMENTS.md&lt;/code&gt; doc tracks open a11y work across the chat and KB UIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design token gaps&lt;/strong&gt; — new palette entries or missing dark-mode mappings in &lt;code&gt;tailwind.css&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storybook coverage&lt;/strong&gt; — components in &lt;code&gt;src/components/base/&lt;/code&gt; that don't have a story yet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; — &lt;code&gt;src/components/**/__tests__/&lt;/code&gt; has gaps; Vitest tests are welcome.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Testing approach
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; (&lt;code&gt;npm run test:unit&lt;/code&gt;) — use Vitest + Vue Test Utils.  Keep tests in &lt;code&gt;__tests__/&lt;/code&gt; next to the component.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;E2E&lt;/strong&gt; (&lt;code&gt;npm run test:e2e:dev&lt;/code&gt;) — Cypress tests live in &lt;code&gt;cypress/&lt;/code&gt;.  Run against the Vite dev server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type-check&lt;/strong&gt; — &lt;code&gt;npx vue-tsc --noEmit -p tsconfig.app.json&lt;/code&gt;.  The repo has a tracked baseline of ~248 type errors (legacy debt); PRs should not &lt;em&gt;add&lt;/em&gt; errors — see the CI check in &lt;code&gt;.github/workflows/frontend-typecheck-regression.yml&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  PR checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;npm run lint&lt;/code&gt; passes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npm run test:unit&lt;/code&gt; passes (or new tests added for the changed component)&lt;/li&gt;
&lt;li&gt;No new type errors vs. the baseline&lt;/li&gt;
&lt;li&gt;Storybook story updated/added if you touched a &lt;code&gt;base/&lt;/code&gt; component&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Your data. Your AI.
&lt;/h2&gt;

&lt;p&gt;AutoBot's frontend reflects the same philosophy as the project: everything runs locally, nothing is sent to a third party, and every part of the stack is open for you to inspect, extend, or replace.&lt;/p&gt;

&lt;p&gt;If this post helped you find your way around the codebase, the best next step is to open an issue or pick one that's already waiting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;Good first issues — frontend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;GitHub Sponsors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/blob/main/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING.md&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>vue</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>Self-Hosting AutoBot: A DevOps Deep Dive into Docker Compose, Model Sizing, and Production Ops</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:31:40 +0000</pubDate>
      <link>https://dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive-into-docker-compose-model-sizing-and-production-ops-2d56</link>
      <guid>https://dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive-into-docker-compose-model-sizing-and-production-ops-2d56</guid>
      <description>&lt;p&gt;You've seen the demos. You want to run AutoBot on your own hardware, your own data, under your own control. Good instinct. Here's the full operational picture — Docker Compose internals, how to match LLM models to your GPU or CPU, and the production habits that keep things stable long-term.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Self-Host?
&lt;/h2&gt;

&lt;p&gt;AutoBot's tagline is &lt;strong&gt;"Your data. Your AI."&lt;/strong&gt; That's not marketing copy — it's an architectural choice. When you self-host:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversations never leave your network&lt;/li&gt;
&lt;li&gt;You choose which models run (open-weight, cloud API, or a mix)&lt;/li&gt;
&lt;li&gt;Upgrade timing is yours to control&lt;/li&gt;
&lt;li&gt;No per-seat pricing surprises&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trade-off is operational responsibility. This post is about making that trade-off comfortable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Docker Compose Deep Dive
&lt;/h2&gt;

&lt;p&gt;AutoBot ships with a &lt;code&gt;docker-compose.yml&lt;/code&gt; that wires together several services. Let's walk through each layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Services Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./backend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8000:8000"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;chromadb&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OLLAMA_HOST=http://ollama:11434&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CHROMA_HOST=chromadb&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REDIS_URL=redis://redis:6379&lt;/span&gt;

  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./frontend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;chromadb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;chromadb/chroma:latest&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;chroma_data:/chroma/chroma&lt;/span&gt;

  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis_data:/data&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-server --appendonly yes&lt;/span&gt;

  &lt;span class="na"&gt;ollama&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama/ollama:latest&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ollama_models:/root/.ollama&lt;/span&gt;
    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;reservations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
              &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;all&lt;/span&gt;
              &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;gpu&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;chroma_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;redis_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ollama_models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Each Service Does
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;backend&lt;/strong&gt; — FastAPI application. Handles chat sessions, RAG retrieval, fleet management. The &lt;code&gt;OLLAMA_HOST&lt;/code&gt; env var points it at your local model server; swap this for an OpenAI-compatible URL to use a cloud LLM instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;frontend&lt;/strong&gt; — Vue UI. Talks only to the backend on port 8000. Stateless — you can restart it without losing anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;chromadb&lt;/strong&gt; — Vector database for knowledge bases. Your embedded documents live here. The &lt;code&gt;chroma_data&lt;/code&gt; volume is critical — back it up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;redis&lt;/strong&gt; — Session state and task queues. With &lt;code&gt;--appendonly yes&lt;/code&gt;, Redis persists to disk. Losing this volume means losing active session context (but not your knowledge bases).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ollama&lt;/strong&gt; — Local LLM inference server. Holds downloaded model weights in &lt;code&gt;ollama_models&lt;/code&gt;. Models are large (4–70 GB each); this volume is expensive to rebuild.&lt;/p&gt;

&lt;h3&gt;
  
  
  Networking
&lt;/h3&gt;

&lt;p&gt;All services communicate on the default network that Docker Compose creates for the project (a user-defined bridge). On that network the service names (&lt;code&gt;chromadb&lt;/code&gt;, &lt;code&gt;redis&lt;/code&gt;, &lt;code&gt;ollama&lt;/code&gt;) resolve as hostnames, which is why the backend config uses &lt;code&gt;http://ollama:11434&lt;/code&gt; rather than &lt;code&gt;localhost&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For a production deployment, consider an explicit network definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autobot_net&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;autobot_net&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="c1"&gt;# ... same for all services&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



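&lt;p&gt;This lets you attach a reverse proxy such as Nginx or Traefik to the same network and reach services by name, so you can drop the &lt;code&gt;ports:&lt;/code&gt; mappings for anything that should only be reachable through the proxy.&lt;/p&gt;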




&lt;h2&gt;
  
  
  Model Sizing to Hardware
&lt;/h2&gt;

&lt;p&gt;This is where most self-hosting guides go wrong — they talk about VPS pricing instead of the actual constraint: &lt;strong&gt;inference throughput vs. memory bandwidth&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Rule of Thumb
&lt;/h3&gt;

&lt;p&gt;A model running entirely in VRAM is fast. A model that spills to system RAM (or worse, disk) is slow. Plan your setup so your primary model fits in VRAM with headroom for the context (KV cache) and anything else using the GPU.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hardware&lt;/th&gt;
&lt;th&gt;VRAM&lt;/th&gt;
&lt;th&gt;Practical Model Ceiling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3060&lt;/td&gt;
&lt;td&gt;12 GB&lt;/td&gt;
&lt;td&gt;Llama 3 8B (Q4), Mistral 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3090 / 4090&lt;/td&gt;
&lt;td&gt;24 GB&lt;/td&gt;
&lt;td&gt;Llama 3 8B (FP16); Llama 3 70B (Q4) only with part of the model offloaded to system RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2× A100 80 GB&lt;/td&gt;
&lt;td&gt;160 GB&lt;/td&gt;
&lt;td&gt;Llama 3 70B (full), most open-weight frontier models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU only (32 GB RAM)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Llama 3 8B (Q4, slow) — workable for low-traffic RAG&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
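
&lt;p&gt;To sanity-check a row in this table for your own hardware, a rough back-of-the-envelope estimate helps. The sketch below is illustrative only: &lt;code&gt;weights_gb&lt;/code&gt; and &lt;code&gt;fits_in_vram&lt;/code&gt; are hypothetical helpers, the 20% headroom is an assumed allowance for the KV cache and buffers, and roughly 4.5 bits per weight is an assumed average for a 4-bit quantization.&lt;/p&gt;

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, headroom: float = 0.2) -> bool:
    """True if the weights plus a fractional headroom fit in the given VRAM."""
    needed_gb = weights_gb(params_billion, bits_per_weight) * (1 + headroom)
    return vram_gb >= needed_gb


if __name__ == "__main__":
    # bits_per_weight: 16 for FP16, about 4.5 (assumed) for 4-bit quantization
    for label, params, bits, vram in [
        ("8B Q4 on 12 GB", 8, 4.5, 12),
        ("8B FP16 on 24 GB", 8, 16, 24),
        ("70B Q4 on 24 GB", 70, 4.5, 24),
    ]:
        size = round(weights_gb(params, bits), 1)
        print(label, size, "GB of weights, fits:", fits_in_vram(params, bits, vram))
```

&lt;p&gt;Treat the result as a first filter, not a guarantee: long contexts and concurrent requests need more headroom than a fixed percentage suggests.&lt;/p&gt;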

&lt;h3&gt;
  
  
  Local Ollama vs. Cloud LLM Trade-offs
&lt;/h3&gt;

&lt;p&gt;AutoBot supports both. Here's how to think about the choice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local Ollama (default)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero per-token cost&lt;/li&gt;
&lt;li&gt;Private by definition&lt;/li&gt;
&lt;li&gt;Latency depends on your hardware&lt;/li&gt;
&lt;li&gt;Best for: high-volume internal tools, sensitive data, experimentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cloud LLM (OpenAI, Anthropic, etc.)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pay per token&lt;/li&gt;
&lt;li&gt;Faster for large models you can't run locally&lt;/li&gt;
&lt;li&gt;Data leaves your network (check your provider's retention policy)&lt;/li&gt;
&lt;li&gt;Best for: production apps that need frontier model quality without buying GPUs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;OLLAMA_HOST&lt;/code&gt; env var makes switching simple. Point it at &lt;code&gt;https://api.openai.com/v1&lt;/code&gt; (with an OpenAI-compatible wrapper) to route through a cloud provider without touching application code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Practical Model Recommendations
&lt;/h3&gt;

&lt;p&gt;For a &lt;strong&gt;RAG-heavy knowledge base&lt;/strong&gt; workload (most AutoBot deployments): a quantized 8B model (Llama 3.1 8B Q4_K_M) hits the sweet spot — fast enough for real-time chat, accurate enough for document retrieval, fits comfortably on a single consumer GPU.&lt;/p&gt;

&lt;p&gt;For a &lt;strong&gt;multi-agent fleet&lt;/strong&gt; workload: consider running a smaller model (3B–7B) per agent node and reserving a larger model for orchestration decisions. AutoBot's fleet manager is built to handle per-agent model config.&lt;/p&gt;




&lt;h2&gt;
  
  
  Production Tips
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backups
&lt;/h3&gt;

&lt;p&gt;The three volumes that matter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# ChromaDB — your knowledge bases&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; autobot_chroma_data:/source &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /backup:/backup &lt;span class="se"&gt;\&lt;/span&gt;
  alpine &lt;span class="nb"&gt;tar &lt;/span&gt;czf /backup/chroma-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /source &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Redis — session state&lt;/span&gt;
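&lt;span class="c"&gt;# SAVE blocks until dump.rdb is fully written; BGSAVE would return before the snapshot finishes&lt;/span&gt;
docker &lt;span class="nb"&gt;exec &lt;/span&gt;autobot-redis-1 redis-cli SAVE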
docker &lt;span class="nb"&gt;cp &lt;/span&gt;autobot-redis-1:/data/dump.rdb /backup/redis-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.rdb

&lt;span class="c"&gt;# Ollama models — large, but painful to re-download&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; autobot_ollama_models:/source &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /backup:/backup &lt;span class="se"&gt;\&lt;/span&gt;
  alpine &lt;span class="nb"&gt;tar &lt;/span&gt;czf /backup/ollama-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /source &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run chroma and redis backups daily. Ollama models only change when you pull new ones — back up on change, not on schedule.&lt;/p&gt;

&lt;h3&gt;
  
  
  Upgrades
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull latest images for chromadb, redis, and ollama (backend and frontend are built locally)&lt;/span&gt;
docker compose pull

&lt;span class="c"&gt;# Recreate containers (zero-downtime if you add a load balancer)&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--no-deps&lt;/span&gt; &lt;span class="nt"&gt;--build&lt;/span&gt; backend frontend

&lt;span class="c"&gt;# Full restart (brief downtime)&lt;/span&gt;
docker compose down &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pin image tags in production (&lt;code&gt;chromadb/chroma:0.5.3&lt;/code&gt; not &lt;code&gt;latest&lt;/code&gt;) so upgrades are deliberate, not automatic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;

&lt;p&gt;AutoBot's backend exposes a &lt;code&gt;/health&lt;/code&gt; endpoint. Wire it into your monitoring stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Simple cron healthcheck&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;/5 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; curl &lt;span class="nt"&gt;-sf&lt;/span&gt; http://localhost:8000/health &lt;span class="o"&gt;||&lt;/span&gt; notify-oncall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
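
&lt;p&gt;If you would rather keep the check in Python than in a cron one-liner, something like the following could work. It is a minimal sketch using only the standard library; &lt;code&gt;is_healthy&lt;/code&gt; is a hypothetical helper, and it assumes the endpoint answers with HTTP 200 when the backend is up.&lt;/p&gt;

```python
import urllib.error
import urllib.request

# Ignore proxy environment variables: the health endpoint is local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200 in time."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

&lt;p&gt;Call it with &lt;code&gt;is_healthy("http://localhost:8000/health")&lt;/code&gt; from whatever scheduler you use and alert when it returns &lt;code&gt;False&lt;/code&gt; several times in a row, so one slow response does not page anyone.&lt;/p&gt;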



&lt;p&gt;For metrics, the backend emits structured logs to stdout. Cap local log growth with the &lt;code&gt;json-file&lt;/code&gt; driver's rotation options, then ship the files to Loki, Datadog, or whatever you already use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json-file"&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;max-size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;50m"&lt;/span&gt;
        &lt;span class="na"&gt;max-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch for these signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChromaDB query latency&lt;/strong&gt; &amp;gt; 2s — index fragmentation or under-resourced container&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis memory&lt;/strong&gt; approaching limit — set &lt;code&gt;maxmemory&lt;/code&gt; and a sensible eviction policy (&lt;code&gt;allkeys-lru&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama inference time&lt;/strong&gt; spiking — model being swapped to RAM; consider reducing context length or switching to a smaller quantization&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Self-hosting is the start, not the finish. Once you're running in production, the interesting work is building knowledge bases, connecting data sources, and wiring up agents for your specific workflows.&lt;/p&gt;

&lt;p&gt;If you want to help make AutoBot better at the infrastructure layer, there are open issues tagged for DevOps contributors:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/mrveiss/AutoBot-AI/issues?q=is%3Aopen+label%3A%22good+first+issue%22+label%3ADevOps" rel="noopener noreferrer"&gt;Good first issues — DevOps label on AutoBot-AI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If AutoBot is saving you money or time on your infra, consider supporting development:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://ko-fi.com/mrveiss" rel="noopener noreferrer"&gt;Ko-fi: ko-fi.com/mrveiss&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Questions, corrections, or war stories from your own deployment — drop them in the comments.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>docker</category>
      <category>selfhosted</category>
      <category>ai</category>
    </item>
    <item>
      <title>AutoBot's RAG Pipeline Internals — A Python Developer's Guide</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:30:31 +0000</pubDate>
      <link>https://dev.to/mrveiss/autobots-rag-pipeline-internals-a-python-developers-guide-40f9</link>
      <guid>https://dev.to/mrveiss/autobots-rag-pipeline-internals-a-python-developers-guide-40f9</guid>
      <description>&lt;p&gt;If you've been watching the local-AI space lately, you've probably seen OpenClaw land 100k GitHub stars on the back of autonomous agents that build their own tools, their own social networks, and — if you're not careful — their own threat models.&lt;/p&gt;

&lt;p&gt;AutoBot takes a different approach: &lt;strong&gt;you stay in control&lt;/strong&gt;. Your data never leaves your machine. Your AI runs on your hardware. And the knowledge base — the thing that makes your local AI actually &lt;em&gt;useful&lt;/em&gt; — is something you can read, extend, and contribute to.&lt;/p&gt;

&lt;p&gt;This post is for Python developers who want to understand exactly how that knowledge base works, how to feed it your own codebase, and where to plug in if you want to help build it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stack at a Glance
&lt;/h2&gt;

&lt;p&gt;AutoBot's RAG pipeline is built on three components:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Embedding model&lt;/td&gt;
&lt;td&gt;Ollama (configurable)&lt;/td&gt;
&lt;td&gt;Text → vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector store&lt;/td&gt;
&lt;td&gt;ChromaDB&lt;/td&gt;
&lt;td&gt;Similarity search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrieval + generation&lt;/td&gt;
&lt;td&gt;LlamaIndex&lt;/td&gt;
&lt;td&gt;Query → answer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All of it runs locally. No calls to external APIs. No data leaving your machine.&lt;/p&gt;

&lt;p&gt;The main module lives at &lt;code&gt;autobot-backend/knowledge/&lt;/code&gt;. The legacy &lt;code&gt;knowledge_base.py&lt;/code&gt; at the backend root is a thin re-export shim — all real logic is in the &lt;code&gt;knowledge/&lt;/code&gt; package.&lt;/p&gt;




&lt;h2&gt;
  
  
  End-to-End Pipeline Walk-Through
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Document Ingestion
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/documents.py&lt;/code&gt; — &lt;code&gt;DocumentsMixin.add_document()&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/documents.py
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Add a document to the knowledge base with async processing.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you drop a file into AutoBot, this is what happens:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Content arrives&lt;/strong&gt; — plain text, Markdown, or PDF.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chunking&lt;/strong&gt; — the document is split into overlapping chunks so context is preserved at retrieval time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding&lt;/strong&gt; — each chunk is converted to a float vector by the configured Ollama model (768 dimensions in this setup; the size depends on the model).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt; — vectors + original text land in ChromaDB, keyed by a stable document ID.&lt;/li&gt;
&lt;/ol&gt;
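
&lt;p&gt;The chunking step is easier to reason about with a concrete example. The sketch below splits by character count with a fixed overlap; &lt;code&gt;chunk_text&lt;/code&gt; and its parameters are hypothetical, and AutoBot's own chunker may well split on sentences or semantic boundaries instead.&lt;/p&gt;

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size chunks where neighbours share `overlap` characters."""
    if 0 >= chunk_size or 0 > overlap or overlap >= chunk_size:
        raise ValueError("need chunk_size > overlap >= 0")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

&lt;p&gt;The overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk, at the price of storing some text twice.&lt;/p&gt;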

&lt;p&gt;The embedding call goes through &lt;code&gt;knowledge/embedding_cache.py&lt;/code&gt; (&lt;code&gt;EmbeddingCache&lt;/code&gt;), which deduplicates repeated content and tracks usage via &lt;code&gt;api/analytics_embedding_patterns.py&lt;/code&gt; (Issue #285). Cache hits skip the Ollama round-trip entirely — useful when you re-index after editing a doc.&lt;/p&gt;
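
&lt;p&gt;The idea behind a deduplicating cache can be sketched in a few lines. This is not the real &lt;code&gt;EmbeddingCache&lt;/code&gt;: &lt;code&gt;ContentHashCache&lt;/code&gt; is a hypothetical stand-in that keys vectors by a SHA-256 hash of the text and takes the embedding function as a parameter.&lt;/p&gt;

```python
import hashlib
from typing import Callable, Dict, List


class ContentHashCache:
    """Cache embeddings by content hash so identical text is embedded once."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1  # no call to the embedding model needed
            return self._store[key]
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector
```

&lt;p&gt;Re-indexing an edited document then only pays for the chunks whose text actually changed; unchanged chunks hash to the same key and come straight from the cache.&lt;/p&gt;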

&lt;h3&gt;
  
  
  2. Index Configuration
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/index.py&lt;/code&gt; — &lt;code&gt;IndexMixin&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;ChromaDB uses HNSW (Hierarchical Navigable Small World) for approximate nearest-neighbour search. AutoBot exposes the tuning parameters directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/index.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_hnsw_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:space&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_space&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;# distance metric (cosine by default)
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:construction_ef&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_construction_ef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:search_ef&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_search_ef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:M&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The current defaults are tuned for collections with 545k+ vectors (Issue #72). If you're running on modest hardware with a small KB, you can lower &lt;code&gt;hnsw:M&lt;/code&gt; (fewer graph links per vector) to reduce memory use, at some cost in recall.&lt;/p&gt;
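
&lt;p&gt;To get a feel for how much memory the index needs, here is a rough estimate based on the rule of thumb from the hnswlib documentation (float32 vector data plus about 2 × M four-byte links per element). &lt;code&gt;hnsw_bytes&lt;/code&gt; is a hypothetical helper, and treating ChromaDB's index like plain hnswlib is an assumption, so read the numbers as an order of magnitude.&lt;/p&gt;

```python
def hnsw_bytes(num_vectors: int, dim: int, m: int) -> int:
    """Rough index size: float32 vector data plus 2*M four-byte links per element."""
    return num_vectors * (dim * 4 + m * 2 * 4)


if __name__ == "__main__":
    for m in (16, 32, 64):
        size_gb = hnsw_bytes(545_000, 768, m) / 1e9
        print("M =", m, "->", round(size_gb, 2), "GB")
```

&lt;p&gt;Under these assumptions the vectors themselves dominate at 768 dimensions, so measure before and after changing &lt;code&gt;hnsw:M&lt;/code&gt; rather than relying on the estimate alone.&lt;/p&gt;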

&lt;p&gt;All ChromaDB calls are wrapped with &lt;code&gt;asyncio.to_thread()&lt;/code&gt; (Issue #369) to keep the FastAPI event loop unblocked — something to be aware of if you're adding new index operations.&lt;/p&gt;
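
&lt;p&gt;If you are adding a new index operation, the pattern looks roughly like this. It is a minimal sketch: &lt;code&gt;blocking_search&lt;/code&gt; is a hypothetical stand-in for a synchronous vector-store call, not an AutoBot or ChromaDB function.&lt;/p&gt;

```python
import asyncio
import time
from typing import List


def blocking_search(query: str) -> List[str]:
    """Stand-in for a synchronous vector-store call (hypothetical)."""
    time.sleep(0.05)  # simulates I/O that would otherwise stall the event loop
    return [f"match for {query}"]


async def search(query: str) -> List[str]:
    # Run the blocking call in a worker thread; the event loop stays free.
    return await asyncio.to_thread(blocking_search, query)


async def main() -> List[List[str]]:
    # Both searches overlap instead of running back to back.
    return await asyncio.gather(search("docker"), search("redis"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

&lt;p&gt;Calling &lt;code&gt;blocking_search&lt;/code&gt; directly inside an &lt;code&gt;async def&lt;/code&gt; handler would freeze every other request for the duration of the call, which is the problem the wrapper avoids.&lt;/p&gt;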

&lt;h3&gt;
  
  
  3. Query → Answer
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/base.py&lt;/code&gt; — &lt;code&gt;KnowledgeBaseCore&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;On query:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The question is embedded with the same Ollama model used at ingestion (same vector space = valid similarity).&lt;/li&gt;
&lt;li&gt;HNSW search finds the top-k most similar chunks.&lt;/li&gt;
&lt;li&gt;The chunks are passed to LlamaIndex as context alongside the query.&lt;/li&gt;
&lt;li&gt;LlamaIndex sends the augmented prompt to the local Ollama LLM.&lt;/li&gt;
&lt;li&gt;The answer references &lt;em&gt;your&lt;/em&gt; documents, not generic training data.
&lt;/li&gt;
&lt;/ol&gt;
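
&lt;p&gt;Step 2 is the heart of retrieval, so here is what "most similar" means in code. The sketch uses an exact brute-force scan with cosine similarity rather than HNSW, which returns approximately the same ranking much faster on large collections; &lt;code&gt;cosine_similarity&lt;/code&gt; and &lt;code&gt;top_k&lt;/code&gt; are hypothetical helpers.&lt;/p&gt;

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], chunk_vecs: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Score every chunk against the query and keep the k best."""
    scored = [(chunk_id, cosine_similarity(query_vec, vec))
              for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], chunks, k=2))
```

&lt;p&gt;This also shows why step 1 insists on the same embedding model: scores between vectors from two different models are not comparable.&lt;/p&gt;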

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/base.py — core wiring
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Settings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.embeddings.ollama&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbedding&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.llms.ollama&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Ollama&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.vector_stores.chroma&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChromaVectorStore&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Advanced Retrieval
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;autobot-backend/advanced_rag_optimizer.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;For complex queries, AutoBot can upgrade from plain vector search to a hybrid pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid scoring&lt;/strong&gt; — blends semantic similarity (HNSW cosine) with BM25 keyword score via &lt;code&gt;knowledge/search_components/reranking.py&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query expansion&lt;/strong&gt; — reformulates the question to improve recall on technical vocabulary mismatches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAP-Elites diversification&lt;/strong&gt; — ensures results span multiple knowledge categories rather than returning near-duplicate chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU acceleration&lt;/strong&gt; — &lt;code&gt;utils/semantic_chunker_gpu.py&lt;/code&gt; uses RTX 4070 / OpenVINO where available.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;SearchResult&lt;/code&gt; dataclass in &lt;code&gt;advanced_rag_optimizer.py&lt;/code&gt; carries both the raw content and all four score dimensions (&lt;code&gt;semantic_score&lt;/code&gt;, &lt;code&gt;keyword_score&lt;/code&gt;, &lt;code&gt;hybrid_score&lt;/code&gt;, &lt;code&gt;rerank_score&lt;/code&gt;) — useful if you want to instrument retrieval quality.&lt;/p&gt;
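
&lt;p&gt;One common way to combine a semantic score with a keyword score is a weighted sum. The sketch below assumes that approach; &lt;code&gt;ScoredChunk&lt;/code&gt;, &lt;code&gt;blend&lt;/code&gt;, and the &lt;code&gt;alpha&lt;/code&gt; weight are hypothetical and are not taken from &lt;code&gt;advanced_rag_optimizer.py&lt;/code&gt;, which may weight or normalize scores differently.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredChunk:
    """Hypothetical result record; both input scores are assumed to lie in [0, 1]."""
    content: str
    semantic_score: float
    keyword_score: float
    hybrid_score: float = 0.0


def blend(chunks: List[ScoredChunk], alpha: float = 0.7) -> List[ScoredChunk]:
    """Set hybrid_score = alpha * semantic + (1 - alpha) * keyword, best first."""
    if alpha > 1.0 or 0.0 > alpha:
        raise ValueError("alpha must be between 0 and 1")
    for chunk in chunks:
        chunk.hybrid_score = (alpha * chunk.semantic_score
                              + (1 - alpha) * chunk.keyword_score)
    return sorted(chunks, key=lambda c: c.hybrid_score, reverse=True)
```

&lt;p&gt;With &lt;code&gt;alpha&lt;/code&gt; near 1 the ranking follows vector similarity; lowering it lets exact keyword matches, such as a function name, outrank chunks that are only loosely related in meaning.&lt;/p&gt;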

&lt;h3&gt;
  
  
  5. Background Vectorization
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;autobot-backend/background_vectorization.py&lt;/code&gt; — &lt;code&gt;BackgroundVectorizer&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When you add new facts or documents while AutoBot is running, &lt;code&gt;BackgroundVectorizer&lt;/code&gt; picks them up asynchronously via FastAPI background tasks. You don't have to trigger a full re-index — the KB stays live.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feeding Your Codebase to the Knowledge Base
&lt;/h2&gt;

&lt;p&gt;AutoBot has a dedicated &lt;code&gt;CodeEmbeddingGenerator&lt;/code&gt; (&lt;code&gt;autobot-backend/code_embedding_generator.py&lt;/code&gt;) that uses &lt;strong&gt;CodeBERT&lt;/strong&gt; instead of a generic text embedding model. Code has different semantics than prose — function names, types, and structure matter — and CodeBERT is trained on code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# code_embedding_generator.py
&lt;/span&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CodeEmbeddingResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;       &lt;span class="c1"&gt;# 768-dim CodeBERT vector
&lt;/span&gt;    &lt;span class="n"&gt;device_used&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;            &lt;span class="c1"&gt;# 'npu', 'cuda', or 'cpu'
&lt;/span&gt;    &lt;span class="n"&gt;processing_time_ms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;cache_hit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To index your codebase:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1 — Via the chat UI&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: Index the ./src directory into the knowledge base
AutoBot: ✓ Scanning ./src...
         Indexed 847 functions across 63 files
         Embedding device: NPU (OpenVINO)
         Ready for semantic code search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 2 — Via the connector system&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;knowledge/connectors/&lt;/code&gt; directory has a registry (&lt;code&gt;registry.py&lt;/code&gt;) and a scheduler (&lt;code&gt;scheduler.py&lt;/code&gt;). You can register a file-server connector pointing at your repo root and let AutoBot watch for changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/connectors/file_server.py
# Register your source directory as a watched connector
&lt;/span&gt;&lt;span class="n"&gt;connector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FileServerConnector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;root_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/your/repo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;file_extensions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 3 — Notion, web, database&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Connectors also exist for Notion (&lt;code&gt;notion.py&lt;/code&gt;), web crawling (&lt;code&gt;web_crawler.py&lt;/code&gt;), audio (&lt;code&gt;audio_connector.py&lt;/code&gt;), and databases (&lt;code&gt;database.py&lt;/code&gt;). The base class lives in &lt;code&gt;knowledge/connectors/base.py&lt;/code&gt;: implement &lt;code&gt;fetch()&lt;/code&gt; and register the connector via &lt;code&gt;registry.py&lt;/code&gt;.&lt;/p&gt;
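
&lt;p&gt;As a rough illustration of that implement-and-register pattern, here is a minimal, self-contained sketch. It does not import AutoBot's real classes: &lt;code&gt;BaseConnector&lt;/code&gt;, &lt;code&gt;ConnectorRegistry&lt;/code&gt;, &lt;code&gt;Document&lt;/code&gt;, and &lt;code&gt;MarkdownFolderConnector&lt;/code&gt; below are hypothetical stand-ins, and the actual signatures in &lt;code&gt;base.py&lt;/code&gt; and &lt;code&gt;registry.py&lt;/code&gt; may well differ.&lt;/p&gt;

```python
# Hypothetical stand-ins: AutoBot's real base class and registry may differ.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterator, Type


@dataclass
class Document:
    """One fetched item: where it came from and its raw text."""
    source: str
    text: str


class BaseConnector(ABC):
    """Assumed contract: a connector only has to implement fetch()."""

    @abstractmethod
    def fetch(self) -> Iterator[Document]:
        """Yield documents from the underlying data source."""


class ConnectorRegistry:
    """Maps a short name to a connector class."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Type[BaseConnector]] = {}

    def register(self, name: str, cls: Type[BaseConnector]) -> None:
        if name in self._connectors:
            raise ValueError(f"connector {name!r} is already registered")
        self._connectors[name] = cls

    def create(self, name: str, **kwargs) -> BaseConnector:
        return self._connectors[name](**kwargs)


class MarkdownFolderConnector(BaseConnector):
    """Example source: every .md file under a root directory."""

    def __init__(self, root_path: str) -> None:
        self.root = Path(root_path)

    def fetch(self) -> Iterator[Document]:
        for path in sorted(self.root.rglob("*.md")):
            yield Document(source=str(path), text=path.read_text(encoding="utf-8"))


registry = ConnectorRegistry()
registry.register("markdown_folder", MarkdownFolderConnector)
```

&lt;p&gt;Calling &lt;code&gt;registry.create("markdown_folder", root_path="docs")&lt;/code&gt; would then hand back a connector whose &lt;code&gt;fetch()&lt;/code&gt; can be iterated by whatever ingestion step consumes it.&lt;/p&gt;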




&lt;h2&gt;
  
  
  Where to Plug In: Contributing to the KB Engine
&lt;/h2&gt;

&lt;p&gt;Here are the cleanest entry points for first contributions:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/documents.py&lt;/code&gt; — DocumentsMixin
&lt;/h3&gt;

&lt;p&gt;Good for: adding new file format support (EPUB, HTML, DOCX), improving chunking strategy.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;add_document()&lt;/code&gt; and related methods are well-isolated. A chunking improvement here applies to every ingestion path.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/connectors/&lt;/code&gt; — Connector Registry
&lt;/h3&gt;

&lt;p&gt;Good for: adding new data sources (GitHub issues, Jira, Slack export).&lt;/p&gt;

&lt;p&gt;Implement the &lt;code&gt;BaseConnector&lt;/code&gt; interface and register in &lt;code&gt;registry.py&lt;/code&gt;. Look at &lt;code&gt;notion.py&lt;/code&gt; for a reference implementation with authentication handling.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;advanced_rag_optimizer.py&lt;/code&gt; — Hybrid Search
&lt;/h3&gt;

&lt;p&gt;Good for: retrieval quality improvements, new reranking strategies, better query expansion.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;SearchResult&lt;/code&gt; and &lt;code&gt;QueryContext&lt;/code&gt; dataclasses are clean: adding a new scoring dimension means extending the dataclass and wiring the new field into &lt;code&gt;compute_blended_score()&lt;/code&gt; in &lt;code&gt;knowledge/search_components/reranking.py&lt;/code&gt;.&lt;/p&gt;
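
&lt;p&gt;To make that concrete, the sketch below shows one way a new dimension could be blended in. Everything in it is an assumption for illustration: the field names, the weights, the hypothetical &lt;code&gt;recency_score&lt;/code&gt; dimension, and the weighted-average formula are not taken from AutoBot's &lt;code&gt;reranking.py&lt;/code&gt;, whose real &lt;code&gt;compute_blended_score()&lt;/code&gt; signature may look different.&lt;/p&gt;

```python
# Hypothetical sketch: field names and weights are illustrative, not AutoBot's.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SearchResult:
    doc_id: str
    semantic_score: float       # vector similarity, assumed to be in [0, 1]
    keyword_score: float        # keyword match score, assumed to be in [0, 1]
    recency_score: float = 0.0  # the new dimension added for this example


DEFAULT_WEIGHTS: Dict[str, float] = {
    "semantic_score": 0.6,
    "keyword_score": 0.3,
    "recency_score": 0.1,
}


def compute_blended_score(
    result: SearchResult, weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted average of the score fields named in the weights mapping."""
    active = DEFAULT_WEIGHTS if weights is None else weights
    total_weight = sum(active.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(getattr(result, name) * w for name, w in active.items())
    return weighted / total_weight


def rerank(results: List[SearchResult]) -> List[SearchResult]:
    """Return results ordered from highest to lowest blended score."""
    return sorted(results, key=compute_blended_score, reverse=True)
```

&lt;p&gt;Keeping the weights in a mapping keyed by field name means a further dimension only needs a new dataclass field and one new weight entry.&lt;/p&gt;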

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/index.py&lt;/code&gt; — HNSW Tuning
&lt;/h3&gt;

&lt;p&gt;Good for: performance work on large vector collections, memory footprint reduction.&lt;/p&gt;

&lt;p&gt;The HNSW parameters are exposed in a deliberately simple way. There's room for adaptive tuning based on collection size and hardware profile.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;background_vectorization.py&lt;/code&gt; — BackgroundVectorizer
&lt;/h3&gt;

&lt;p&gt;Good for: incremental sync improvements, smarter deduplication, conflict resolution when a connector and a manual upload touch the same document.&lt;/p&gt;
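
&lt;p&gt;One simple starting point for the deduplication idea is content hashing. The sketch below is only an assumption about how it could be approached, not a description of what &lt;code&gt;BackgroundVectorizer&lt;/code&gt; currently does; &lt;code&gt;content_fingerprint()&lt;/code&gt; and &lt;code&gt;deduplicate()&lt;/code&gt; are hypothetical helper names, and "first source wins" is just one possible conflict rule.&lt;/p&gt;

```python
# Hypothetical sketch of content-hash deduplication; not AutoBot's actual code.
import hashlib
from typing import Dict, Iterable, List, Tuple


def content_fingerprint(text: str) -> str:
    """SHA-256 of whitespace-normalized text, so trivial reformatting still matches."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(docs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first (source, text) pair seen for each distinct fingerprint."""
    seen: Dict[str, str] = {}  # fingerprint mapped to the source that supplied it
    unique: List[Tuple[str, str]] = []
    for source, text in docs:
        fingerprint = content_fingerprint(text)
        if fingerprint in seen:
            continue  # identical content was already queued by seen[fingerprint]
        seen[fingerprint] = source
        unique.append((source, text))
    return unique
```

&lt;p&gt;Under this rule, a file pulled in by a connector and later uploaded by hand would be embedded once, with the connector's copy kept because it arrived first.&lt;/p&gt;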




&lt;h2&gt;
  
  
  Running the KB Locally
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and start the full stack&lt;/span&gt;
git clone https://github.com/mrveiss/AutoBot-AI
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# Or use the installer script&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/mrveiss/AutoBot-AI/Dev_new_gui/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The knowledge base stores vectors in &lt;code&gt;./data/chromadb/&lt;/code&gt; by default. It persists across container restarts.&lt;/p&gt;

&lt;p&gt;To run just the backend in dev mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;autobot-backend
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
uvicorn app_factory:create_app &lt;span class="nt"&gt;--factory&lt;/span&gt; &lt;span class="nt"&gt;--reload&lt;/span&gt; &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Where to Go Next
&lt;/h2&gt;

&lt;p&gt;If you want to contribute to the Python side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Good first issues (Python label):&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/python" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/labels/python&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All good first issues:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contributing guide:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/blob/Dev_new_gui/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Discussions:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this article saved you an hour of reading source code, you can &lt;a href="https://ko-fi.com/mrveiss" rel="noopener noreferrer"&gt;buy me a coffee on Ko-fi&lt;/a&gt; — it goes directly toward hardware time for the project.&lt;/p&gt;




&lt;p&gt;AutoBot is free, open source, and runs entirely on your hardware. The RAG pipeline is the core of what makes a local AI assistant actually useful — and it's a great place to dig in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your data. Your AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>rag</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>OpenClaw and AutoBot: two different visions for local AI</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:21:25 +0000</pubDate>
      <link>https://dev.to/mrveiss/openclaw-and-autobot-two-different-visions-for-local-ai-648</link>
      <guid>https://dev.to/mrveiss/openclaw-and-autobot-two-different-visions-for-local-ai-648</guid>
      <description>&lt;p&gt;OpenClaw hit 100,000 GitHub stars in two months. Its agents built their own social network. PCWorld and TechCrunch ran pieces on the risks. If you've been anywhere near AI Twitter this week, you've seen the wave.&lt;/p&gt;

&lt;p&gt;I've been building AutoBot for three years. People keep asking me the same question: &lt;em&gt;is this your competitor?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It's not. We're solving different problems.&lt;/p&gt;

&lt;p&gt;This piece is for the developers I keep meeting who are excited by OpenClaw and unsettled by it at the same time. There's a real reason for that feeling — and there's room in the local-AI world for both projects to exist.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two philosophies, one ecosystem
&lt;/h2&gt;

&lt;p&gt;OpenClaw is about &lt;strong&gt;agent autonomy&lt;/strong&gt;. You give the agent goals, system access, and time. It figures out the rest. The whole point is that you're not in the loop for every step.&lt;/p&gt;

&lt;p&gt;AutoBot is about &lt;strong&gt;data sovereignty&lt;/strong&gt;. You feed it your docs, your codebase, your business knowledge. It answers questions, drafts copy, helps you code — but it does what you ask, when you ask, on your machine.&lt;/p&gt;

&lt;p&gt;Different problems. Different trade-offs. Both legitimate.&lt;/p&gt;

&lt;p&gt;The PCWorld and TechCrunch coverage didn't say OpenClaw was bad. It said &lt;em&gt;autonomous agents with system-level permissions are a category of risk we don't have great answers for yet&lt;/em&gt;. That's true. It's also the price of admission for what OpenClaw is trying to do, and a lot of people will pay it gladly.&lt;/p&gt;

&lt;p&gt;Some won't. Those are the people I want to talk to.&lt;/p&gt;




&lt;h2&gt;
  
  
  What "Your data. Your AI." actually means
&lt;/h2&gt;

&lt;p&gt;The line we built AutoBot around is &lt;em&gt;Your data. Your AI.&lt;/em&gt; Here's what that resolves to in code:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your data stays on your machine.&lt;/strong&gt; The knowledge base — the documents you upload, the codebase you index, the business processes you paste in — never leaves your hardware. There is no cloud component. There is no telemetry pipe. If your machine is offline, AutoBot is offline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You pick the brain.&lt;/strong&gt; Want to run fully local? Plug in Ollama, LM Studio, llama.cpp — anything with an OpenAI-compatible endpoint. Want GPT-4 or Claude for the heavy lifting? Connect your API key. Your prompts go to that model, but your knowledge base documents don't.&lt;/p&gt;

&lt;p&gt;The brain phones home. Your documents don't. That's the line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You decide what it does.&lt;/strong&gt; AutoBot doesn't run on a schedule. It doesn't take actions while you sleep. It doesn't have system access beyond what its container can see. The trade-off: you have to ask. The benefit: nothing happens that you didn't ask for.&lt;/p&gt;




&lt;h2&gt;
  
  
  When you'd pick which
&lt;/h2&gt;

&lt;p&gt;I'm not going to pretend AutoBot is the answer for everything. Picking the right tool matters more than picking a side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw fits when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want long-running, multi-step automation&lt;/li&gt;
&lt;li&gt;You're comfortable scoping permissions and accepting agent risk&lt;/li&gt;
&lt;li&gt;The win is the agent doing things without you in the loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AutoBot fits when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your data can't leave your network (regulated industries, proprietary code, client work)&lt;/li&gt;
&lt;li&gt;You want an AI that knows &lt;em&gt;your&lt;/em&gt; domain, not a generic model&lt;/li&gt;
&lt;li&gt;You want to keep the human in the loop — the AI is a tool, not a coworker&lt;/li&gt;
&lt;li&gt;You need something you can deploy once and run forever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are people who will run both. AutoBot for the knowledge base and chat layer over their own data. OpenClaw for autonomous tasks where they've scoped the risk. That's a legitimate stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  What we actually built
&lt;/h2&gt;

&lt;p&gt;Because I keep getting asked: AutoBot is a self-hosted AI platform. The chat interface gets the attention, but the knowledge base is the product.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RAG engine&lt;/strong&gt; that turns your raw files into a searchable AI layer that knows your domain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable LLM&lt;/strong&gt; — local via Ollama, or any OpenAI-compatible endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fleet management&lt;/strong&gt; for running AutoBot across multiple machines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Compose deploy&lt;/strong&gt; — one command, full stack&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's open source. It's actively developed. The roadmap is public. Community PRs are welcome and tagged with skill-based good-first-issue labels for Python, frontend, and DevOps contributors.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it in five minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Connect your LLM. Feed it your first document.&lt;/p&gt;

&lt;p&gt;That's it. You're running your own AI.&lt;/p&gt;




&lt;p&gt;If the OpenClaw moment got you thinking harder about where your data lives and who controls your AI — even if you stay on OpenClaw — that's a good thing for the ecosystem. We need more people asking those questions.&lt;/p&gt;

&lt;p&gt;If the answer you land on is &lt;em&gt;I want the AI but I want to stay in control&lt;/em&gt;, AutoBot is here for that.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;Star us on GitHub&lt;/a&gt; · &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;Join the discussions&lt;/a&gt; · &lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;Sponsor the project&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>selfhosted</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Weekly Update: ✨ docs: refresh stale status/changelog and canonical TaskSta</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 05:00:06 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-docs-refresh-stale-statuschangelog-and-canonical-tasksta-1pj3</link>
      <guid>https://dev.to/mrveiss/weekly-update-docs-refresh-stale-statuschangelog-and-canonical-tasksta-1pj3</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ docs: refresh stale status/changelog and canonical TaskStatus examples (#7498)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 docs: refresh stale status/changelog and canonical TaskStatu&lt;/li&gt;
&lt;li&gt;🔧 feat(web_fetch): add WebFetcher.fetch_raw_html public API (c&lt;/li&gt;
&lt;li&gt;🔧 fix(multimodal): LSP exception contract — VisionProcessor + &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +1&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-04T05:00:03.384055Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-04T05:00:03.384055Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Weekly Update: ✨ fix(backend): replace bare singleton aliases with get_*()</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 27 Apr 2026 05:00:10 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-fixbackend-replace-bare-singleton-aliases-with-get-3e0e</link>
      <guid>https://dev.to/mrveiss/weekly-update-fixbackend-replace-bare-singleton-aliases-with-get-3e0e</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ fix(backend): replace bare singleton aliases with get_*() pattern at all call sites (#6196) (#6217)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 fix(backend): replace bare singleton aliases with get_*() pa&lt;/li&gt;
&lt;li&gt;🔧 fix(api): move @with_error_handling below @router.* in workf&lt;/li&gt;
&lt;li&gt;🔧 fix(deploy): add Ansible task to clear stale .pyc files afte&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-20T05:00:04.313917Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-20T05:00:04.313917Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Why We Built AutoBot: The WordPress of AI</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Wed, 15 Apr 2026 19:39:01 +0000</pubDate>
      <link>https://dev.to/mrveiss/why-we-built-autobot-the-wordpress-of-ai-4k7b</link>
      <guid>https://dev.to/mrveiss/why-we-built-autobot-the-wordpress-of-ai-4k7b</guid>
      <description>&lt;p&gt;Three years ago I started building AutoBot because I was tired of renting my own intelligence.&lt;/p&gt;

&lt;p&gt;Every AI tool I used followed the same playbook: send your data to our servers, pay monthly, accept our terms, trust us not to read your prompts. The model was the product. You were the user. Your data was the inventory.&lt;/p&gt;

&lt;p&gt;I wanted something different. I wanted an AI that felt like mine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your data. Your AI.
&lt;/h2&gt;

&lt;p&gt;That's the line we built everything around.&lt;/p&gt;

&lt;p&gt;WordPress gave everyone a website. Before WordPress, having a web presence meant renting space on someone else's platform, playing by their rules, losing your content if they shut down. WordPress flipped that. You install it, you own it, you extend it, you run it forever.&lt;/p&gt;

&lt;p&gt;AutoBot is that for AI.&lt;/p&gt;

&lt;p&gt;Self-hosted. Open source. Yours to extend. The AI platform that belongs to you — not to us.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feed it your world
&lt;/h2&gt;

&lt;p&gt;The core feature isn't the chat interface. It's the knowledge base.&lt;/p&gt;

&lt;p&gt;Drop in your docs. Upload your codebase. Paste your business processes. AutoBot's RAG engine turns your raw files into a searchable, queryable AI layer that actually knows your domain — not just what some model was trained on two years ago.&lt;/p&gt;

&lt;p&gt;Feed it your docs. Your codebase. Your business.&lt;br&gt;
It learns what you know. It stays where you are.&lt;/p&gt;

&lt;p&gt;This is the part that changes things. An AI that knows &lt;em&gt;your&lt;/em&gt; codebase gives better answers than a generic model. An AI that's read &lt;em&gt;your&lt;/em&gt; legal documents is more useful than one guessing at your jurisdiction. An AI grounded in &lt;em&gt;your&lt;/em&gt; patient intake forms is more reliable than one pattern-matching across the internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An AI that's actually about you.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Pick your brain
&lt;/h2&gt;

&lt;p&gt;AutoBot is not opinionated about which LLM powers it. That's your call.&lt;/p&gt;

&lt;p&gt;Want to run fully local? Plug in Ollama. LM Studio. llama.cpp. Anything with an OpenAI-compatible endpoint.&lt;/p&gt;

&lt;p&gt;Prefer GPT-4 or Claude for the heavy lifting? Connect your API key. Your data stays on your machine — your prompts go to the model, but your knowledge base documents don't.&lt;/p&gt;

&lt;p&gt;That distinction matters. The brain phones home. Your documents don't.&lt;/p&gt;




&lt;h2&gt;
  
  
  You shouldn't have to ask permission
&lt;/h2&gt;

&lt;p&gt;The thing that finally broke me on cloud AI wasn't the pricing. It was the 2 AM email.&lt;/p&gt;

&lt;p&gt;"We're updating our terms of service effective next month. By continuing to use the service you agree to..."&lt;/p&gt;

&lt;p&gt;Your data never leaves your machine — whatever brain powers it.&lt;br&gt;
No rate limits on your knowledge. No vendor changing the rules on you overnight.&lt;/p&gt;

&lt;p&gt;Deploy once. Run it your way. Forever.&lt;/p&gt;




&lt;h2&gt;
  
  
  Most AI is rented. AutoBot is yours.
&lt;/h2&gt;

&lt;p&gt;You decide what it does. You decide where it runs. You decide who sees it.&lt;/p&gt;

&lt;p&gt;No subscription. No surveillance. No one reading your prompts.&lt;br&gt;
Install it. Own it. Run it forever.&lt;/p&gt;

&lt;p&gt;For developers who've been burned by API deprecations, pricing pivots, and terms changes — this is for you.&lt;/p&gt;

&lt;p&gt;For law firms, medical startups, and anyone in a regulated industry — your data never touches our servers. No cloud vendor to breach. No third party holding your keys. What stays on your machine stays yours — full stop. You control the perimeter. We just give you the tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get started in 5 minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Connect your LLM. Feed it your first document.&lt;/p&gt;

&lt;p&gt;That's it. You're running your own AI.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AutoBot is open source. If this resonates, &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;star us on GitHub&lt;/a&gt;, join the &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;community discussions&lt;/a&gt;, or &lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;sponsor the project&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>selfhosted</category>
      <category>opensource</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Weekly Update: ✨ docs(claude.md): add codebase-as-source-of-truth rule</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 13 Apr 2026 05:00:06 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-docsclaudemd-add-codebase-as-source-of-truth-rule-2ol3</link>
      <guid>https://dev.to/mrveiss/weekly-update-docsclaudemd-add-codebase-as-source-of-truth-rule-2ol3</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ docs(claude.md): add codebase-as-source-of-truth rule
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 docs(claude.md): add codebase-as-source-of-truth rule&lt;/li&gt;
&lt;li&gt;🔧 Merge branch 'main' into Dev_new_gui&lt;/li&gt;
&lt;li&gt;🔧 fix(devops): ensure log directory/files have correct ownersh&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-06T05:00:03.370813Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-06T05:00:03.370813Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Weekly Update: ✨ WIP: preserve work from issue-3291 (#4241)</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Sun, 12 Apr 2026 19:33:43 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-wip-preserve-work-from-issue-3291-4241-4lk7</link>
      <guid>https://dev.to/mrveiss/weekly-update-wip-preserve-work-from-issue-3291-4241-4lk7</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ WIP: preserve work from issue-3291 (#4241)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3291 (#4241)&lt;/li&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3290 (#4240)&lt;/li&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3281 (#4239)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-05T16:46:53.357011Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-05T16:46:53.357011Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
  </channel>
</rss>
