<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mārtiņš Veiss</title>
    <description>The latest articles on DEV Community by Mārtiņš Veiss (@mrveiss).</description>
    <link>https://dev.to/mrveiss</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3867783%2Fb7397987-7baa-4c63-b090-414707e4daa6.jpg</url>
      <title>DEV Community: Mārtiņš Veiss</title>
      <link>https://dev.to/mrveiss</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mrveiss"/>
    <language>en</language>
    <item>
      <title>Inside AutoBot's Frontend: A Developer Walkthrough</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:32:51 +0000</pubDate>
      <link>https://dev.to/mrveiss/inside-autobots-frontend-a-developer-walkthrough-2k3j</link>
      <guid>https://dev.to/mrveiss/inside-autobots-frontend-a-developer-walkthrough-2k3j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AutoBot&lt;/strong&gt; is the open-source, self-hosted AI automation platform where your data never leaves your server.&lt;br&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What you see when you open AutoBot
&lt;/h2&gt;

&lt;p&gt;AutoBot's chat interface greets you with a familiar two-pane layout: a conversation sidebar on the left and an active chat panel on the right.  Behind that simplicity lives a rich UI built from about 40 focused Vue single-file components.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chat UI
&lt;/h3&gt;

&lt;p&gt;The core chat flow is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ChatView.vue
  └── ChatInterface.vue
        ├── ChatSidebar.vue        ← conversation list + search
        ├── ChatHeader.vue         ← model selector, settings toggle
        ├── ChatMessages.vue       ← scrolling message feed
        │     └── MessageItem.vue  ← per-message bubble + citations
        ├── ChatInput.vue          ← textarea, attachments, send button
        ├── ChatTabs.vue           ← switch between Chat / Browser / Docs
        └── CitationsDisplay.vue   ← inline source links from RAG
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ChatTabs.vue&lt;/code&gt; component is the pivot point: it lets you jump between a raw conversation, an embedded browser session (for web research), and a documentation search sidebar — all within the same view.&lt;/p&gt;

&lt;h3&gt;
  
  
  Knowledge Base UI
&lt;/h3&gt;

&lt;p&gt;AutoBot's Knowledge Base is where the "your data" part of &lt;em&gt;Your data. Your AI.&lt;/em&gt; lives.  The &lt;code&gt;KnowledgeView.vue&lt;/code&gt; brings together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeBrowser&lt;/strong&gt; — file-tree style explorer of all ingested documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeSearch&lt;/strong&gt; — full-text + vector search with &lt;code&gt;KBSearchResultPanel&lt;/code&gt; rendering scored results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeGraph / KnowledgeGraph3D&lt;/strong&gt; — D3-powered entity graph so you can &lt;em&gt;see&lt;/em&gt; how concepts connect&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeUpload&lt;/strong&gt; — drag-and-drop ingestion with real-time vectorization progress (&lt;code&gt;VectorizationProgressModal&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KnowledgeMaintenance&lt;/strong&gt; — deduplication, cleanup stats, and orphan management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;KnowledgeView&lt;/code&gt; fans out to more than 30 sub-components, but each one has a narrow responsibility. If you add a new panel, you usually touch only one file.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component architecture in &lt;code&gt;autobot-frontend/&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;autobot-frontend/
├── src/
│   ├── components/       # feature-scoped component trees
│   │   ├── chat/
│   │   ├── knowledge/
│   │   ├── agents/
│   │   ├── browser/
│   │   ├── charts/
│   │   └── base/         # shared primitives (buttons, modals, …)
│   ├── views/            # route-level pages (one per route)
│   ├── stores/           # Pinia stores (useChatStore, useKnowledgeStore, …)
│   ├── composables/      # shared reactive logic
│   ├── design-system/    # tokens.ts — canonical token catalog
│   ├── router/           # Vue Router config
│   └── styles/           # global CSS + Tailwind @theme block
├── cypress/              # end-to-end tests
└── package.json          # Vue 3 + Vite + Tailwind CSS 4 + TypeScript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tech stack at a glance&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Framework&lt;/td&gt;
&lt;td&gt;Vue 3 (Composition API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build&lt;/td&gt;
&lt;td&gt;Vite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;Pinia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Styling&lt;/td&gt;
&lt;td&gt;Tailwind CSS 4 (&lt;code&gt;@theme&lt;/code&gt; tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing&lt;/td&gt;
&lt;td&gt;Vitest (unit) + Cypress/Playwright (e2e)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Component docs&lt;/td&gt;
&lt;td&gt;Storybook (stories live in &lt;code&gt;src/stories/&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Design tokens
&lt;/h3&gt;

&lt;p&gt;All colors, spacings, and radii flow from &lt;code&gt;src/design-system/tokens.ts&lt;/code&gt;.  This file is the single source of truth for token &lt;em&gt;names&lt;/em&gt;; actual values live in &lt;code&gt;src/assets/tailwind.css&lt;/code&gt; under the &lt;code&gt;@theme&lt;/code&gt; block.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tokens.ts (abridged)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SEMANTIC_COLORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-primary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-primary text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-secondary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-secondary text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;autobot-success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bg-autobot-success text-white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// …&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding a new brand color means two edits: &lt;code&gt;tailwind.css&lt;/code&gt; for the value, &lt;code&gt;tokens.ts&lt;/code&gt; to register the name.  That's it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to contribute to the UI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Get the repo running
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI/autobot-frontend
npm &lt;span class="nb"&gt;install
&lt;/span&gt;npm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dev server starts at &lt;code&gt;http://localhost:5173&lt;/code&gt;.  You don't need a running backend to work on visual components — the Storybook stories in &lt;code&gt;src/stories/&lt;/code&gt; cover most UI primitives in isolation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Explore Storybook
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run storybook
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DesignTokens.stories.ts&lt;/code&gt; gives you the full token palette in one page.  If you want to see a component in isolation before wiring it up to real data, stories are the right place to start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Find good first issues
&lt;/h3&gt;

&lt;p&gt;The fastest path to a first contribution is the &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;&lt;code&gt;good first issue&lt;/code&gt; + &lt;code&gt;area: frontend&lt;/code&gt;&lt;/a&gt; label combination.  These are scoped to single components or small style fixes — no need to understand the full stack before opening a PR.&lt;/p&gt;

&lt;p&gt;Common entry points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility&lt;/strong&gt; — the &lt;code&gt;ACCESSIBILITY_IMPROVEMENTS.md&lt;/code&gt; doc tracks open a11y work across the chat and KB UIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design token gaps&lt;/strong&gt; — new palette entries or missing dark-mode mappings in &lt;code&gt;tailwind.css&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storybook coverage&lt;/strong&gt; — components in &lt;code&gt;src/components/base/&lt;/code&gt; that don't have a story yet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; — &lt;code&gt;src/components/**/__tests__/&lt;/code&gt; has gaps; Vitest tests are welcome.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Testing approach
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; (&lt;code&gt;npm run test:unit&lt;/code&gt;) — use Vitest + Vue Test Utils.  Keep tests in &lt;code&gt;__tests__/&lt;/code&gt; next to the component.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;E2E&lt;/strong&gt; (&lt;code&gt;npm run test:e2e:dev&lt;/code&gt;) — Cypress tests live in &lt;code&gt;cypress/&lt;/code&gt;.  Run against the Vite dev server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type-check&lt;/strong&gt; — &lt;code&gt;npx vue-tsc --noEmit -p tsconfig.app.json&lt;/code&gt;.  The repo has a tracked baseline of ~248 type errors (legacy debt); PRs should not &lt;em&gt;add&lt;/em&gt; errors — see the CI check in &lt;code&gt;.github/workflows/frontend-typecheck-regression.yml&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
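&lt;p&gt;A minimal sketch of the "no new errors" rule, assuming the type checker prints one &lt;code&gt;error TS…&lt;/code&gt; line per diagnostic, as &lt;code&gt;tsc&lt;/code&gt; does. The function names and the hard-coded baseline are illustrative and are not taken from the repository's CI workflow.&lt;/p&gt;

```python
import re

# Assumption: the tracked baseline quoted above (approximate, legacy debt).
BASELINE = 248

# Assumption: every diagnostic line contains "error TS" followed by a numeric code.
ERROR_LINE = re.compile(r"error TS\d+")


def count_type_errors(tsc_output: str) -> int:
    """Count diagnostics in captured type-checker output."""
    return sum(1 for line in tsc_output.splitlines() if ERROR_LINE.search(line))


def regression(tsc_output: str, baseline: int = BASELINE) -> int:
    """Return how many errors exceed the baseline (0 when there are none)."""
    return max(0, count_type_errors(tsc_output) - baseline)
```

&lt;p&gt;Feed it the captured output of the type-check command; a non-zero result means the change added errors.&lt;/p&gt;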

&lt;h3&gt;
  
  
  PR checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;npm run lint&lt;/code&gt; passes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npm run test:unit&lt;/code&gt; passes, including new tests for any component you changed&lt;/li&gt;
&lt;li&gt;No new type errors vs. the baseline&lt;/li&gt;
&lt;li&gt;Storybook story updated/added if you touched a &lt;code&gt;base/&lt;/code&gt; component&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Your data. Your AI.
&lt;/h2&gt;

&lt;p&gt;AutoBot's frontend reflects the same philosophy as the project: everything runs locally, nothing is sent to a third party, and every part of the stack is open for you to inspect, extend, or replace.&lt;/p&gt;

&lt;p&gt;If this post helped you find your way around the codebase, the best next step is to open an issue or pick one that's already waiting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;Good first issues — frontend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;GitHub Sponsors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/blob/main/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING.md&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>vue</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>Self-Hosting AutoBot: A DevOps Deep Dive into Docker Compose, Model Sizing, and Production Ops</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:31:40 +0000</pubDate>
      <link>https://dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive-into-docker-compose-model-sizing-and-production-ops-2d56</link>
      <guid>https://dev.to/mrveiss/self-hosting-autobot-a-devops-deep-dive-into-docker-compose-model-sizing-and-production-ops-2d56</guid>
      <description>&lt;p&gt;You've seen the demos. You want to run AutoBot on your own hardware, your own data, under your own control. Good instinct. Here's the full operational picture — Docker Compose internals, how to match LLM models to your GPU or CPU, and the production habits that keep things stable long-term.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Self-Host?
&lt;/h2&gt;

&lt;p&gt;AutoBot's tagline is &lt;strong&gt;"Your data. Your AI."&lt;/strong&gt; That's not marketing copy — it's an architectural choice. When you self-host:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversations never leave your network&lt;/li&gt;
&lt;li&gt;You choose which models run (open-weight, cloud API, or a mix)&lt;/li&gt;
&lt;li&gt;Upgrade timing is yours to control&lt;/li&gt;
&lt;li&gt;No per-seat pricing surprises&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trade-off is operational responsibility. This post is about making that trade-off comfortable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Docker Compose Deep Dive
&lt;/h2&gt;

&lt;p&gt;AutoBot ships with a &lt;code&gt;docker-compose.yml&lt;/code&gt; that wires together several services. Let's walk through each layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Services Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./backend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8000:8000"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;chromadb&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OLLAMA_HOST=http://ollama:11434&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CHROMA_HOST=chromadb&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REDIS_URL=redis://redis:6379&lt;/span&gt;

  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./frontend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;chromadb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;chromadb/chroma:latest&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;chroma_data:/chroma/chroma&lt;/span&gt;

  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis_data:/data&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-server --appendonly yes&lt;/span&gt;

  &lt;span class="na"&gt;ollama&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama/ollama:latest&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ollama_models:/root/.ollama&lt;/span&gt;
    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;reservations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
              &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;all&lt;/span&gt;
              &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;gpu&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;chroma_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;redis_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ollama_models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Each Service Does
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;backend&lt;/strong&gt; — FastAPI application. Handles chat sessions, RAG retrieval, fleet management. The &lt;code&gt;OLLAMA_HOST&lt;/code&gt; env var points it at your local model server; swap this for an OpenAI-compatible URL to use a cloud LLM instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;frontend&lt;/strong&gt; — Vue 3 UI (built with Vite). Talks only to the backend on port 8000. Stateless, so you can restart it without losing anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;chromadb&lt;/strong&gt; — Vector database for knowledge bases. Your embedded documents live here. The &lt;code&gt;chroma_data&lt;/code&gt; volume is critical — back it up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;redis&lt;/strong&gt; — Session state and task queues. With &lt;code&gt;--appendonly yes&lt;/code&gt;, Redis persists to disk. Losing this volume means losing active session context (but not your knowledge bases).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ollama&lt;/strong&gt; — Local LLM inference server. Holds downloaded model weights in &lt;code&gt;ollama_models&lt;/code&gt;. Models are large (4–70 GB each); this volume is expensive to rebuild.&lt;/p&gt;

&lt;h3&gt;
  
  
  Networking
&lt;/h3&gt;

&lt;p&gt;All services communicate on the default network that Compose creates for the project (a user-defined bridge). The service names (&lt;code&gt;chromadb&lt;/code&gt;, &lt;code&gt;redis&lt;/code&gt;, &lt;code&gt;ollama&lt;/code&gt;) resolve as hostnames inside that network, which is why the backend config uses &lt;code&gt;http://ollama:11434&lt;/code&gt; rather than &lt;code&gt;localhost&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For a production deployment, consider an explicit network definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autobot_net&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;autobot_net&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="c1"&gt;# ... same for all services&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets you add an Nginx reverse proxy or Traefik on the same network without exposing internal ports.&lt;/p&gt;




&lt;h2&gt;
  
  
  Model Sizing to Hardware
&lt;/h2&gt;

&lt;p&gt;This is where most self-hosting guides go wrong: they talk about VPS pricing instead of the actual constraint, &lt;strong&gt;whether the model fits in fast memory and how much bandwidth that memory has&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Rule of Thumb
&lt;/h3&gt;

&lt;p&gt;A model running entirely in VRAM is fast. A model that spills to RAM (or worse, disk) is slow. Plan your setup so your primary model fits in VRAM with headroom for the context (KV cache) and anything else using the GPU.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hardware&lt;/th&gt;
&lt;th&gt;VRAM&lt;/th&gt;
&lt;th&gt;Practical Model Ceiling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3060&lt;/td&gt;
&lt;td&gt;12 GB&lt;/td&gt;
&lt;td&gt;Llama 3 8B (Q4), Mistral 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3090 / 4090&lt;/td&gt;
&lt;td&gt;24 GB&lt;/td&gt;
&lt;td&gt;Llama 3 8B (FP16, about 16 GB); Llama 3 70B at Q4 (about 40 GB) only with partial CPU offload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2× A100 80 GB&lt;/td&gt;
&lt;td&gt;160 GB&lt;/td&gt;
&lt;td&gt;Llama 3 70B (FP16, about 140 GB); larger open-weight models when quantized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU only (32 GB RAM)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Llama 3 8B (Q4, slow) — workable for low-traffic RAG&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
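&lt;p&gt;The arithmetic behind a table like this can be sketched in a few lines. Weight size is roughly parameter count times bits per weight. The 4.5 bits per weight used below for a Q4-style quantization and the 2 GB headroom are assumed ballpark values, not measurements, and the KV cache grows with context length on top of them.&lt;/p&gt;

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if the weights plus an assumed headroom fit in the given VRAM."""
    return vram_gb >= weights_gb(params_billion, bits_per_weight) + headroom_gb


if __name__ == "__main__":
    # Hypothetical inputs: (label, billions of parameters, assumed bits per weight, VRAM in GB)
    for label, params, bits, vram in [
        ("8B Q4 on 12 GB", 8, 4.5, 12),
        ("70B Q4 on 24 GB", 70, 4.5, 24),
        ("70B FP16 on 160 GB", 70, 16, 160),
    ]:
        print(label, round(weights_gb(params, bits), 1), fits_in_vram(params, bits, vram))
```

&lt;p&gt;With these assumptions, an 8B model at 4.5 bits per weight needs about 4.5 GB for weights, and a 70B model about 39 GB.&lt;/p&gt;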

&lt;h3&gt;
  
  
  Local Ollama vs. Cloud LLM Trade-offs
&lt;/h3&gt;

&lt;p&gt;AutoBot supports both. Here's how to think about the choice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local Ollama (default)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero per-token cost&lt;/li&gt;
&lt;li&gt;Private by definition&lt;/li&gt;
&lt;li&gt;Latency depends on your hardware&lt;/li&gt;
&lt;li&gt;Best for: high-volume internal tools, sensitive data, experimentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cloud LLM (OpenAI, Anthropic, etc.)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pay per token&lt;/li&gt;
&lt;li&gt;Faster for large models you can't run locally&lt;/li&gt;
&lt;li&gt;Data leaves your network (check your provider's retention policy)&lt;/li&gt;
&lt;li&gt;Best for: production apps that need frontier model quality without buying GPUs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;OLLAMA_HOST&lt;/code&gt; env var makes switching simple. Point it at &lt;code&gt;https://api.openai.com/v1&lt;/code&gt; (with an OpenAI-compatible wrapper) to route through a cloud provider without touching application code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Practical Model Recommendations
&lt;/h3&gt;

&lt;p&gt;For a &lt;strong&gt;RAG-heavy knowledge base&lt;/strong&gt; workload (most AutoBot deployments): a quantized 8B model (Llama 3.1 8B Q4_K_M) hits the sweet spot — fast enough for real-time chat, accurate enough for document retrieval, fits comfortably on a single consumer GPU.&lt;/p&gt;

&lt;p&gt;For a &lt;strong&gt;multi-agent fleet&lt;/strong&gt; workload: consider running a smaller model (3B–7B) per agent node and reserving a larger model for orchestration decisions. AutoBot's fleet manager is built to handle per-agent model config.&lt;/p&gt;




&lt;h2&gt;
  
  
  Production Tips
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backups
&lt;/h3&gt;

&lt;p&gt;The three volumes that matter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# ChromaDB — your knowledge bases&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; autobot_chroma_data:/source &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /backup:/backup &lt;span class="se"&gt;\&lt;/span&gt;
  alpine &lt;span class="nb"&gt;tar &lt;/span&gt;czf /backup/chroma-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /source &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Redis: session state. SAVE blocks until dump.rdb is fully written; BGSAVE returns before the snapshot finishes.&lt;/span&gt;
docker &lt;span class="nb"&gt;exec &lt;/span&gt;autobot-redis-1 redis-cli SAVE
docker &lt;span class="nb"&gt;cp &lt;/span&gt;autobot-redis-1:/data/dump.rdb /backup/redis-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.rdb

&lt;span class="c"&gt;# Ollama models — large, but painful to re-download&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; autobot_ollama_models:/source &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /backup:/backup &lt;span class="se"&gt;\&lt;/span&gt;
  alpine &lt;span class="nb"&gt;tar &lt;/span&gt;czf /backup/ollama-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /source &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



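&lt;p&gt;Run the ChromaDB and Redis backups daily. Ollama models only change when you pull new ones, so back them up on change, not on a schedule. The volume and container names above assume a Compose project named &lt;code&gt;autobot&lt;/code&gt;; confirm yours with &lt;code&gt;docker volume ls&lt;/code&gt; and &lt;code&gt;docker compose ps&lt;/code&gt;.&lt;/p&gt;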
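&lt;p&gt;If you script these backups, it can help to build the command in one place. The sketch below only assembles the same &lt;code&gt;docker run&lt;/code&gt; invocation for a named volume. The function name and the read-only &lt;code&gt;:ro&lt;/code&gt; mount on the source are additions for illustration, not part of AutoBot.&lt;/p&gt;

```python
from datetime import date
from typing import List, Optional


def volume_backup_cmd(volume: str, label: str, backup_dir: str = "/backup",
                      day: Optional[date] = None) -> List[str]:
    """Build the `docker run` argv that tars a named volume into backup_dir."""
    stamp = (day or date.today()).strftime("%Y%m%d")
    return [
        "docker", "run", "--rm",
        # Source volume mounted read-only so the backup cannot modify it.
        "-v", f"{volume}:/source:ro",
        "-v", f"{backup_dir}:/backup",
        "alpine", "tar", "czf", f"/backup/{label}-{stamp}.tar.gz",
        "-C", "/source", ".",
    ]
```

&lt;p&gt;Pass the returned list to &lt;code&gt;subprocess.run(cmd, check=True)&lt;/code&gt; so a failed backup raises instead of passing silently.&lt;/p&gt;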

&lt;h3&gt;
  
  
  Upgrades
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull latest images&lt;/span&gt;
docker compose pull

&lt;span class="c"&gt;# Rebuild and recreate only backend and frontend (expect a brief interruption unless a load balancer fronts more than one replica)&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--no-deps&lt;/span&gt; &lt;span class="nt"&gt;--build&lt;/span&gt; backend frontend

&lt;span class="c"&gt;# Full restart (brief downtime)&lt;/span&gt;
docker compose down &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pin image tags in production (&lt;code&gt;chromadb/chroma:0.5.3&lt;/code&gt; not &lt;code&gt;latest&lt;/code&gt;) so upgrades are deliberate, not automatic.&lt;/p&gt;
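&lt;p&gt;One way to catch unpinned tags before they reach production is a small check over the Compose file. This is a line-based sketch that assumes simple &lt;code&gt;image: name:tag&lt;/code&gt; entries like the ones shown earlier. It does not parse YAML, and the function name is hypothetical.&lt;/p&gt;

```python
import re
from typing import List

# Matches lines such as "    image: chromadb/chroma:latest" (optionally quoted).
IMAGE_LINE = re.compile(r"^\s*image:\s*['\"]?([^\s'\"#]+)")


def unpinned_images(compose_text: str) -> List[str]:
    """Return image references that have no tag or use the `latest` tag."""
    found = []
    for line in compose_text.splitlines():
        match = IMAGE_LINE.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Only look after the last '/', so a registry port is not mistaken for a tag.
        name = ref.rsplit("/", 1)[-1]
        tag = name.split(":", 1)[1] if ":" in name else ""
        # References pinned by digest (name@sha256:...) count as pinned.
        if "@" not in ref and tag in ("", "latest"):
            found.append(ref)
    return found
```

&lt;p&gt;Run against the service definitions shown earlier, it would report the ChromaDB and Ollama images and accept &lt;code&gt;redis:7-alpine&lt;/code&gt;.&lt;/p&gt;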

&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;

&lt;p&gt;AutoBot's backend exposes a &lt;code&gt;/health&lt;/code&gt; endpoint. Wire it into your monitoring stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Simple cron healthcheck&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;/5 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; curl &lt;span class="nt"&gt;-sf&lt;/span&gt; http://localhost:8000/health &lt;span class="o"&gt;||&lt;/span&gt; notify-oncall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
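&lt;p&gt;If cron plus &lt;code&gt;curl&lt;/code&gt; is too bare, the same check is easy to express in Python with only the standard library. The sketch mirrors &lt;code&gt;curl -sf&lt;/code&gt;: any 2xx answer counts as healthy, and anything else (including timeouts and refused connections) does not. The function name and the injectable &lt;code&gt;opener&lt;/code&gt; parameter are illustrative.&lt;/p&gt;

```python
from urllib.request import urlopen


def is_healthy(url: str, timeout: float = 5.0, opener=urlopen) -> bool:
    """Return True if the endpoint answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status // 100 == 2
    except OSError:
        # URLError, HTTPError (raised for 4xx/5xx) and socket timeouts
        # are all OSError subclasses, so most network failures land here.
        return False
```

&lt;p&gt;Call &lt;code&gt;is_healthy("http://localhost:8000/health")&lt;/code&gt; from whatever scheduler you already run and alert on a &lt;code&gt;False&lt;/code&gt; result.&lt;/p&gt;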



&lt;p&gt;The backend emits structured logs to stdout. Cap their size on disk with Docker's &lt;code&gt;json-file&lt;/code&gt; driver options, then forward them to Loki, Datadog, or whatever you already use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json-file"&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;max-size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;50m"&lt;/span&gt;
        &lt;span class="na"&gt;max-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch for these signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChromaDB query latency&lt;/strong&gt; &amp;gt; 2s — index fragmentation or under-resourced container&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis memory&lt;/strong&gt; approaching limit — set &lt;code&gt;maxmemory&lt;/code&gt; and a sensible eviction policy (&lt;code&gt;allkeys-lru&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama inference time&lt;/strong&gt; spiking — model being swapped to RAM; consider reducing context length or switching to a smaller quantization&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Self-hosting is the start, not the finish. Once you're running in production, the interesting work is building knowledge bases, connecting data sources, and wiring up agents for your specific workflows.&lt;/p&gt;

&lt;p&gt;If you want to help make AutoBot better at the infrastructure layer, there are open issues tagged for DevOps contributors:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/mrveiss/AutoBot-AI/issues?q=is%3Aopen+label%3A%22good+first+issue%22+label%3ADevOps" rel="noopener noreferrer"&gt;Good first issues — DevOps label on AutoBot-AI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If AutoBot is saving you money or time on your infra, consider supporting development:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://ko-fi.com/mrveiss" rel="noopener noreferrer"&gt;Ko-fi: ko-fi.com/mrveiss&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Questions, corrections, or war stories from your own deployment — drop them in the comments.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>docker</category>
      <category>selfhosted</category>
      <category>ai</category>
    </item>
    <item>
      <title>AutoBot's RAG Pipeline Internals — A Python Developer's Guide</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:30:31 +0000</pubDate>
      <link>https://dev.to/mrveiss/autobots-rag-pipeline-internals-a-python-developers-guide-40f9</link>
      <guid>https://dev.to/mrveiss/autobots-rag-pipeline-internals-a-python-developers-guide-40f9</guid>
      <description>&lt;p&gt;If you've been watching the local-AI space lately, you've probably seen OpenClaw land 100k GitHub stars on the back of autonomous agents that build their own tools, their own social networks, and — if you're not careful — their own threat models.&lt;/p&gt;

&lt;p&gt;AutoBot takes a different approach: &lt;strong&gt;you stay in control&lt;/strong&gt;. Your data never leaves your machine. Your AI runs on your hardware. And the knowledge base — the thing that makes your local AI actually &lt;em&gt;useful&lt;/em&gt; — is something you can read, extend, and contribute to.&lt;/p&gt;

&lt;p&gt;This post is for Python developers who want to understand exactly how that knowledge base works, how to feed it your own codebase, and where to plug in if you want to help build it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stack at a Glance
&lt;/h2&gt;

&lt;p&gt;AutoBot's RAG pipeline is built on three components:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Embedding model&lt;/td&gt;
&lt;td&gt;Ollama (configurable)&lt;/td&gt;
&lt;td&gt;Text → vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector store&lt;/td&gt;
&lt;td&gt;ChromaDB&lt;/td&gt;
&lt;td&gt;Similarity search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrieval + generation&lt;/td&gt;
&lt;td&gt;LlamaIndex&lt;/td&gt;
&lt;td&gt;Query → answer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All of it runs locally. No external API calls. No data leaving your machine.&lt;/p&gt;

&lt;p&gt;The main module lives at &lt;code&gt;autobot-backend/knowledge/&lt;/code&gt;. The legacy &lt;code&gt;knowledge_base.py&lt;/code&gt; at the backend root is a thin re-export shim — all real logic is in the &lt;code&gt;knowledge/&lt;/code&gt; package.&lt;/p&gt;




&lt;h2&gt;
  
  
  End-to-End Pipeline Walk-Through
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Document Ingestion
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/documents.py&lt;/code&gt; — &lt;code&gt;DocumentsMixin.add_document()&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/documents.py
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Add a document to the knowledge base with async processing.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you drop a file into AutoBot, this is what happens:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Content arrives&lt;/strong&gt; — plain text, Markdown, or PDF.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chunking&lt;/strong&gt; — the document is split into overlapping chunks so context is preserved at retrieval time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding&lt;/strong&gt; — each chunk is converted to a 768-dimensional float vector by the configured Ollama model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt; — vectors + original text land in ChromaDB, keyed by a stable document ID.&lt;/li&gt;
&lt;/ol&gt;
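&lt;p&gt;To make the chunking step concrete, here is a minimal sketch of overlapping character windows. It is not AutoBot's chunker: &lt;code&gt;chunk_text&lt;/code&gt;, &lt;code&gt;chunk_size&lt;/code&gt;, and &lt;code&gt;overlap&lt;/code&gt; are names invented for illustration, and a real chunker would likely split on sentence or token boundaries instead.&lt;/p&gt;

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    # Hypothetical helper for illustration only, not part of AutoBot's API.
    if 0 >= chunk_size or 0 > overlap or overlap >= chunk_size:
        raise ValueError("need chunk_size > overlap >= 0")
    step = chunk_size - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

&lt;p&gt;Each window repeats the tail of the previous one, so a sentence that straddles a boundary still shows up whole in at least one chunk.&lt;/p&gt;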

&lt;p&gt;The embedding call goes through &lt;code&gt;knowledge/embedding_cache.py&lt;/code&gt; (&lt;code&gt;EmbeddingCache&lt;/code&gt;), which deduplicates repeated content and tracks usage via &lt;code&gt;api/analytics_embedding_patterns.py&lt;/code&gt; (Issue #285). Cache hits skip the Ollama round-trip entirely — useful when you re-index after editing a doc.&lt;/p&gt;
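&lt;p&gt;The deduplication idea can be sketched with a content hash as the cache key. This is an assumption about the general technique, not a copy of &lt;code&gt;EmbeddingCache&lt;/code&gt;: the class name &lt;code&gt;HashedEmbeddingCache&lt;/code&gt; and the &lt;code&gt;fake_embed&lt;/code&gt; stand-in are hypothetical, and the real cache may key or evict differently.&lt;/p&gt;

```python
import hashlib
from typing import Callable, Dict, List


class HashedEmbeddingCache:
    """Hypothetical cache: maps the SHA-256 of a chunk's text to its vector."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1  # identical content: skip the embedding call
            return self._store[key]
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector


def fake_embed(text: str) -> List[float]:
    # Stand-in for the Ollama round-trip so the sketch runs offline.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


if __name__ == "__main__":
    cache = HashedEmbeddingCache(fake_embed)
    cache.get("def add(a, b): return a + b")
    cache.get("def add(a, b): return a + b")
    print(cache.hits, cache.misses)
```

&lt;p&gt;Because the key depends only on the text, unchanged chunks in a re-indexed document resolve from the cache and only edited chunks pay for a new embedding.&lt;/p&gt;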

&lt;h3&gt;
  
  
  2. Index Configuration
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/index.py&lt;/code&gt; — &lt;code&gt;IndexMixin&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;ChromaDB uses HNSW (Hierarchical Navigable Small World) for approximate nearest-neighbour search. AutoBot exposes the tuning parameters directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/index.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_hnsw_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:space&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_space&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;# distance metric (cosine by default)
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:construction_ef&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_construction_ef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:search_ef&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_search_ef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw:M&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hnsw_m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The current defaults are tuned for collections with 545k+ vectors (Issue #72). If you're running on modest hardware with a small KB, you can lower &lt;code&gt;hnsw:M&lt;/code&gt; to reduce memory pressure, at some cost to recall.&lt;/p&gt;

&lt;p&gt;All ChromaDB calls are wrapped with &lt;code&gt;asyncio.to_thread()&lt;/code&gt; (Issue #369) to keep the FastAPI event loop unblocked — something to be aware of if you're adding new index operations.&lt;/p&gt;
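&lt;p&gt;If you haven't used &lt;code&gt;asyncio.to_thread()&lt;/code&gt; before (Python 3.9+), the pattern looks roughly like this. The &lt;code&gt;blocking_query&lt;/code&gt; function is a hypothetical stand-in for a synchronous vector-store call; nothing here is taken from AutoBot's code.&lt;/p&gt;

```python
import asyncio
import time
from typing import List


def blocking_query(query: str, top_k: int) -> List[str]:
    # Hypothetical stand-in for a synchronous vector-store call.
    time.sleep(0.05)
    return [f"{query}-hit-{i}" for i in range(top_k)]


async def query_without_blocking(query: str, top_k: int = 3) -> List[str]:
    # to_thread runs the sync function in a worker thread, so the
    # event loop stays free to serve other coroutines meanwhile.
    return await asyncio.to_thread(blocking_query, query, top_k)


async def main() -> List[str]:
    # The sleep stands in for another request being handled concurrently.
    results, _ = await asyncio.gather(
        query_without_blocking("hnsw"),
        asyncio.sleep(0.01),
    )
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

&lt;p&gt;Calling the blocking function directly inside an &lt;code&gt;async def&lt;/code&gt; handler would stall every other request for the duration of the call, which is why a new index operation should follow the same wrapping.&lt;/p&gt;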

&lt;h3&gt;
  
  
  3. Query → Answer
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;knowledge/base.py&lt;/code&gt; — &lt;code&gt;KnowledgeBaseCore&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;On query:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The question is embedded with the same Ollama model used at ingestion (same vector space = valid similarity).&lt;/li&gt;
&lt;li&gt;HNSW search finds the top-k most similar chunks.&lt;/li&gt;
&lt;li&gt;The chunks are passed to LlamaIndex as context alongside the query.&lt;/li&gt;
&lt;li&gt;LlamaIndex sends the augmented prompt to the local Ollama LLM.&lt;/li&gt;
&lt;li&gt;The answer references &lt;em&gt;your&lt;/em&gt; documents, not generic training data.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/base.py — core wiring
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Settings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.embeddings.ollama&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbedding&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.llms.ollama&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Ollama&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.vector_stores.chroma&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChromaVectorStore&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
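&lt;p&gt;Stripped of the libraries, the retrieve-then-augment steps reduce to something like the sketch below. It uses exact brute-force cosine ranking where ChromaDB uses HNSW to approximate it, and the function names and prompt wording are assumptions made for illustration, not LlamaIndex's actual template.&lt;/p&gt;

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Exact nearest neighbours; HNSW approximates this ranking at scale."""
    ranked = sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    # Hypothetical prompt layout: retrieved chunks first, then the question.
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = [("alpha", [1.0, 0.0]), ("beta", [0.0, 1.0]), ("gamma", [0.9, 0.1])]
    print(build_prompt("What is alpha?", top_k([1.0, 0.0], store)))
```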



&lt;h3&gt;
  
  
  4. Advanced Retrieval
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;autobot-backend/advanced_rag_optimizer.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;For complex queries, AutoBot can upgrade from plain vector search to a hybrid pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid scoring&lt;/strong&gt; — blends semantic similarity (HNSW cosine) with BM25 keyword score via &lt;code&gt;knowledge/search_components/reranking.py&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query expansion&lt;/strong&gt; — reformulates the question to improve recall on technical vocabulary mismatches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAP-Elites diversification&lt;/strong&gt; — ensures results span multiple knowledge categories rather than returning near-duplicate chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU acceleration&lt;/strong&gt; — &lt;code&gt;utils/semantic_chunker_gpu.py&lt;/code&gt; uses RTX 4070 / OpenVINO where available.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;SearchResult&lt;/code&gt; dataclass in &lt;code&gt;advanced_rag_optimizer.py&lt;/code&gt; carries both the raw content and all four score dimensions (&lt;code&gt;semantic_score&lt;/code&gt;, &lt;code&gt;keyword_score&lt;/code&gt;, &lt;code&gt;hybrid_score&lt;/code&gt;, &lt;code&gt;rerank_score&lt;/code&gt;) — useful if you want to instrument retrieval quality.&lt;/p&gt;
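&lt;p&gt;A common way to blend the two signals is a weighted sum, sketched below. Treat it as an assumption about the general approach: &lt;code&gt;ScoredChunk&lt;/code&gt;, &lt;code&gt;blend_scores&lt;/code&gt;, and the 0.7 weight are hypothetical, both inputs are assumed to be normalised to the 0..1 range, and the real &lt;code&gt;SearchResult&lt;/code&gt; and reranking code may work differently.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredChunk:
    # Hypothetical stand-in; field names echo the score dimensions named above.
    content: str
    semantic_score: float  # assumed normalised to 0..1
    keyword_score: float   # assumed normalised to 0..1
    hybrid_score: float = 0.0


def blend_scores(chunks: List[ScoredChunk], semantic_weight: float = 0.7) -> List[ScoredChunk]:
    """Set hybrid_score to a weighted sum and return chunks best-first."""
    for chunk in chunks:
        chunk.hybrid_score = (
            semantic_weight * chunk.semantic_score
            + (1.0 - semantic_weight) * chunk.keyword_score
        )
    return sorted(chunks, key=lambda c: c.hybrid_score, reverse=True)


if __name__ == "__main__":
    candidates = [
        ScoredChunk("close in meaning, no shared terms", 0.9, 0.0),
        ScoredChunk("exact identifier match", 0.6, 1.0),
    ]
    for chunk in blend_scores(candidates):
        print(round(chunk.hybrid_score, 2), chunk.content)
```

&lt;p&gt;Logging the individual scores next to the blended one makes it easy to see whether a result ranked highly because of meaning, exact terms, or both.&lt;/p&gt;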

&lt;h3&gt;
  
  
  5. Background Vectorization
&lt;/h3&gt;

&lt;p&gt;Entry point: &lt;code&gt;autobot-backend/background_vectorization.py&lt;/code&gt; — &lt;code&gt;BackgroundVectorizer&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When you add new facts or documents while AutoBot is running, &lt;code&gt;BackgroundVectorizer&lt;/code&gt; picks them up asynchronously via FastAPI background tasks. You don't have to trigger a full re-index — the KB stays live.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feeding Your Codebase to the Knowledge Base
&lt;/h2&gt;

&lt;p&gt;AutoBot has a dedicated &lt;code&gt;CodeEmbeddingGenerator&lt;/code&gt; (&lt;code&gt;autobot-backend/code_embedding_generator.py&lt;/code&gt;) that uses &lt;strong&gt;CodeBERT&lt;/strong&gt; instead of a generic text embedding model. Code has different semantics than prose — function names, types, and structure matter — and CodeBERT is trained on code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# code_embedding_generator.py
&lt;/span&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CodeEmbeddingResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;       &lt;span class="c1"&gt;# 768-dim CodeBERT vector
&lt;/span&gt;    &lt;span class="n"&gt;device_used&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;            &lt;span class="c1"&gt;# 'npu', 'cuda', or 'cpu'
&lt;/span&gt;    &lt;span class="n"&gt;processing_time_ms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;cache_hit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To index your codebase:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1 — Via the chat UI&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: Index the ./src directory into the knowledge base
AutoBot: ✓ Scanning ./src...
         Indexed 847 functions across 63 files
         Embedding device: NPU (OpenVINO)
         Ready for semantic code search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 2 — Via the connector system&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;knowledge/connectors/&lt;/code&gt; directory has a registry (&lt;code&gt;registry.py&lt;/code&gt;) and a scheduler (&lt;code&gt;scheduler.py&lt;/code&gt;). You can register a file-server connector pointing at your repo root and let AutoBot watch for changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# knowledge/connectors/file_server.py
# Register your source directory as a watched connector
&lt;/span&gt;&lt;span class="n"&gt;connector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FileServerConnector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;root_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/your/repo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;file_extensions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 3 — Notion, web, database&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Connectors also exist for Notion (&lt;code&gt;notion.py&lt;/code&gt;), web crawl (&lt;code&gt;web_crawler.py&lt;/code&gt;), audio (&lt;code&gt;audio_connector.py&lt;/code&gt;), and database (&lt;code&gt;database.py&lt;/code&gt;). The base class is &lt;code&gt;knowledge/connectors/base.py&lt;/code&gt; — implement &lt;code&gt;fetch()&lt;/code&gt; and register via &lt;code&gt;registry.py&lt;/code&gt;.&lt;/p&gt;
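&lt;p&gt;The shape of a custom connector might look like the sketch below. Everything in it is hypothetical: &lt;code&gt;SketchConnector&lt;/code&gt;, the &lt;code&gt;register&lt;/code&gt; decorator, and the &lt;code&gt;(doc_id, content)&lt;/code&gt; return type are illustrative guesses, so check &lt;code&gt;knowledge/connectors/base.py&lt;/code&gt; and &lt;code&gt;registry.py&lt;/code&gt; for the real interface before writing one.&lt;/p&gt;

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterator, Tuple, Type


class SketchConnector(ABC):
    """Hypothetical base class; the real one lives in knowledge/connectors/base.py."""

    @abstractmethod
    def fetch(self) -> Iterator[Tuple[str, str]]:
        """Yield (doc_id, content) pairs ready for ingestion."""


# Hypothetical registry: connector name mapped to its class.
CONNECTORS: Dict[str, Type[SketchConnector]] = {}


def register(name: str) -> Callable[[Type[SketchConnector]], Type[SketchConnector]]:
    def decorator(cls: Type[SketchConnector]) -> Type[SketchConnector]:
        CONNECTORS[name] = cls
        return cls
    return decorator


@register("in_memory")
class InMemoryConnector(SketchConnector):
    """Toy data source that serves documents from a dict."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self._docs = docs

    def fetch(self) -> Iterator[Tuple[str, str]]:
        yield from self._docs.items()


if __name__ == "__main__":
    connector = CONNECTORS["in_memory"]({"readme": "AutoBot runs locally."})
    for doc_id, content in connector.fetch():
        print(doc_id, content)
```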




&lt;h2&gt;
  
  
  Where to Plug In: Contributing to the KB Engine
&lt;/h2&gt;

&lt;p&gt;Here are the cleanest entry points for first contributions:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/documents.py&lt;/code&gt; — DocumentsMixin
&lt;/h3&gt;

&lt;p&gt;Good for: adding new file format support (EPUB, HTML, DOCX), improving chunking strategy.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;add_document()&lt;/code&gt; and related methods are well-isolated. A chunking improvement here applies to every ingestion path.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/connectors/&lt;/code&gt; — Connector Registry
&lt;/h3&gt;

&lt;p&gt;Good for: adding new data sources (GitHub issues, Jira, Slack export).&lt;/p&gt;

&lt;p&gt;Implement the &lt;code&gt;BaseConnector&lt;/code&gt; interface and register in &lt;code&gt;registry.py&lt;/code&gt;. Look at &lt;code&gt;notion.py&lt;/code&gt; for a reference implementation with authentication handling.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;advanced_rag_optimizer.py&lt;/code&gt; — Hybrid Search
&lt;/h3&gt;

&lt;p&gt;Good for: retrieval quality improvements, new reranking strategies, better query expansion.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;SearchResult&lt;/code&gt; + &lt;code&gt;QueryContext&lt;/code&gt; dataclasses are clean — adding a new scoring dimension means extending the dataclass and wiring it into &lt;code&gt;compute_blended_score()&lt;/code&gt; in &lt;code&gt;knowledge/search_components/reranking.py&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;knowledge/index.py&lt;/code&gt; — HNSW Tuning
&lt;/h3&gt;

&lt;p&gt;Good for: performance work on large vector collections, memory footprint reduction.&lt;/p&gt;

&lt;p&gt;The HNSW parameter exposure is deliberately simple. There's room for adaptive tuning based on collection size and hardware profile.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;background_vectorization.py&lt;/code&gt; — BackgroundVectorizer
&lt;/h3&gt;

&lt;p&gt;Good for: incremental sync improvements, smarter deduplication, conflict resolution when a connector and a manual upload touch the same document.&lt;/p&gt;




&lt;h2&gt;
  
  
  Running the KB Locally
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and start the full stack&lt;/span&gt;
git clone https://github.com/mrveiss/AutoBot-AI
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# Or use the installer script&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/mrveiss/AutoBot-AI/Dev_new_gui/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The knowledge base stores vectors in &lt;code&gt;./data/chromadb/&lt;/code&gt; by default. It persists across container restarts.&lt;/p&gt;

&lt;p&gt;To run just the backend in dev mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;autobot-backend
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
uvicorn app_factory:create_app &lt;span class="nt"&gt;--factory&lt;/span&gt; &lt;span class="nt"&gt;--reload&lt;/span&gt; &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Where to Go Next
&lt;/h2&gt;

&lt;p&gt;If you want to contribute to the Python side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Good first issues (Python label):&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/python" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/labels/python&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All good first issues:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/labels/good%20first%20issue&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contributing guide:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/blob/Dev_new_gui/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Discussions:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this article saved you an hour of reading source code, you can &lt;a href="https://ko-fi.com/mrveiss" rel="noopener noreferrer"&gt;buy me a coffee on Ko-fi&lt;/a&gt; — it goes directly toward hardware time for the project.&lt;/p&gt;




&lt;p&gt;AutoBot is free, open source, and runs entirely on your hardware. The RAG pipeline is the core of what makes a local AI assistant actually useful — and it's a great place to dig in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your data. Your AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;github.com/mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>rag</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>OpenClaw and AutoBot: two different visions for local AI</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 11:21:25 +0000</pubDate>
      <link>https://dev.to/mrveiss/openclaw-and-autobot-two-different-visions-for-local-ai-648</link>
      <guid>https://dev.to/mrveiss/openclaw-and-autobot-two-different-visions-for-local-ai-648</guid>
      <description>&lt;p&gt;OpenClaw hit 100,000 GitHub stars in two months. Its agents built their own social network. PCWorld and TechCrunch ran pieces on the risks. If you've been anywhere near AI Twitter this week, you've seen the wave.&lt;/p&gt;

&lt;p&gt;I've been building AutoBot for three years. People keep asking me the same question: &lt;em&gt;is this your competitor?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It's not. We're solving different problems.&lt;/p&gt;

&lt;p&gt;This piece is for the developers I keep meeting who are excited by OpenClaw and unsettled by it at the same time. There's a real reason for that feeling — and there's room in the local-AI world for both projects to exist.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two philosophies, one ecosystem
&lt;/h2&gt;

&lt;p&gt;OpenClaw is about &lt;strong&gt;agent autonomy&lt;/strong&gt;. You give the agent goals, system access, and time. It figures out the rest. The whole point is that you're not in the loop for every step.&lt;/p&gt;

&lt;p&gt;AutoBot is about &lt;strong&gt;data sovereignty&lt;/strong&gt;. You feed it your docs, your codebase, your business knowledge. It answers questions, drafts copy, helps you code — but it does what you ask, when you ask, on your machine.&lt;/p&gt;

&lt;p&gt;Different problems. Different trade-offs. Both legitimate.&lt;/p&gt;

&lt;p&gt;The PCWorld and TechCrunch coverage didn't say OpenClaw was bad. It said &lt;em&gt;autonomous agents with system-level permissions are a category of risk we don't have great answers for yet&lt;/em&gt;. That's true. It's also the price of admission for what OpenClaw is trying to do, and a lot of people will pay it gladly.&lt;/p&gt;

&lt;p&gt;Some won't. Those are the people I want to talk to.&lt;/p&gt;




&lt;h2&gt;
  
  
  What "Your data. Your AI." actually means
&lt;/h2&gt;

&lt;p&gt;The line we built AutoBot around is &lt;em&gt;Your data. Your AI.&lt;/em&gt; Here's what that resolves to in code:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your data stays on your machine.&lt;/strong&gt; The knowledge base — the documents you upload, the codebase you index, the business processes you paste in — never leaves your hardware. There is no cloud component. There is no telemetry pipe. If your machine is offline, AutoBot is offline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You pick the brain.&lt;/strong&gt; Want to run fully local? Plug in Ollama, LM Studio, or llama.cpp; anything with an OpenAI-compatible endpoint works. Want GPT-4 or Claude for the heavy lifting? Connect your API key. Your prompts, including any retrieved excerpts attached as context, go to that model; the knowledge base itself stays on your machine.&lt;/p&gt;

&lt;p&gt;The brain phones home. Your documents don't. That's the line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You decide what it does.&lt;/strong&gt; AutoBot doesn't run on a schedule. It doesn't take actions while you sleep. It doesn't have system access beyond what its container can see. The trade-off: you have to ask. The benefit: nothing happens that you didn't ask for.&lt;/p&gt;




&lt;h2&gt;
  
  
  When you'd pick which
&lt;/h2&gt;

&lt;p&gt;I'm not going to pretend AutoBot is the answer for everything. Picking the right tool matters more than picking a side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw fits when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want long-running, multi-step automation&lt;/li&gt;
&lt;li&gt;You're comfortable scoping permissions and accepting agent risk&lt;/li&gt;
&lt;li&gt;The win is the agent doing things without you in the loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AutoBot fits when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your data can't leave your network (regulated industries, proprietary code, client work)&lt;/li&gt;
&lt;li&gt;You want an AI that knows &lt;em&gt;your&lt;/em&gt; domain, not a generic model&lt;/li&gt;
&lt;li&gt;You want to keep the human in the loop — the AI is a tool, not a coworker&lt;/li&gt;
&lt;li&gt;You need something you can deploy once and run forever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are people who will run both. AutoBot for the knowledge base and chat layer over their own data. OpenClaw for autonomous tasks where they've scoped the risk. That's a legitimate stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  What we actually built
&lt;/h2&gt;

&lt;p&gt;Because I keep getting asked: AutoBot is a self-hosted AI platform. The chat interface gets the attention, but the knowledge base is the product.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RAG engine&lt;/strong&gt; that turns your raw files into a searchable AI layer that knows your domain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable LLM&lt;/strong&gt; — local via Ollama, or any OpenAI-compatible endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fleet management&lt;/strong&gt; for running AutoBot across multiple machines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Compose deploy&lt;/strong&gt; — one command, full stack&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's open source. It's actively developed. The roadmap is public. Community PRs are welcome, and issues are tagged with skill-based good-first-issue labels for Python, frontend, and DevOps contributors.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it in five minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Connect your LLM. Feed it your first document.&lt;/p&gt;

&lt;p&gt;That's it. You're running your own AI.&lt;/p&gt;




&lt;p&gt;If the OpenClaw moment got you thinking harder about where your data lives and who controls your AI — even if you stay on OpenClaw — that's a good thing for the ecosystem. We need more people asking those questions.&lt;/p&gt;

&lt;p&gt;If the answer you land on is &lt;em&gt;I want the AI but I want to stay in control&lt;/em&gt;, AutoBot is here for that.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;Star us on GitHub&lt;/a&gt; · &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;Join the discussions&lt;/a&gt; · &lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;Sponsor the project&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>selfhosted</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Weekly Update: ✨ docs: refresh stale status/changelog and canonical TaskSta</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 11 May 2026 05:00:06 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-docs-refresh-stale-statuschangelog-and-canonical-tasksta-1pj3</link>
      <guid>https://dev.to/mrveiss/weekly-update-docs-refresh-stale-statuschangelog-and-canonical-tasksta-1pj3</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ docs: refresh stale status/changelog and canonical TaskStatus examples (#7498)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 docs: refresh stale status/changelog and canonical TaskStatu&lt;/li&gt;
&lt;li&gt;🔧 feat(web_fetch): add WebFetcher.fetch_raw_html public API (c&lt;/li&gt;
&lt;li&gt;🔧 fix(multimodal): LSP exception contract — VisionProcessor + &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +1&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-04T05:00:03.384055Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-05-04T05:00:03.384055Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Weekly Update: ✨ fix(backend): replace bare singleton aliases with get_*()</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 27 Apr 2026 05:00:10 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-fixbackend-replace-bare-singleton-aliases-with-get-3e0e</link>
      <guid>https://dev.to/mrveiss/weekly-update-fixbackend-replace-bare-singleton-aliases-with-get-3e0e</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ fix(backend): replace bare singleton aliases with get_*() pattern at all call sites (#6196) (#6217)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 fix(backend): replace bare singleton aliases with get_*() pa&lt;/li&gt;
&lt;li&gt;🔧 fix(api): move @with_error_handling below @router.* in workf&lt;/li&gt;
&lt;li&gt;🔧 fix(deploy): add Ansible task to clear stale .pyc files afte&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-20T05:00:04.313917Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-20T05:00:04.313917Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Why We Built AutoBot: The WordPress of AI</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Wed, 15 Apr 2026 19:39:01 +0000</pubDate>
      <link>https://dev.to/mrveiss/why-we-built-autobot-the-wordpress-of-ai-4k7b</link>
      <guid>https://dev.to/mrveiss/why-we-built-autobot-the-wordpress-of-ai-4k7b</guid>
      <description>&lt;p&gt;Three years ago I started building AutoBot because I was tired of renting my own intelligence.&lt;/p&gt;

&lt;p&gt;Every AI tool I used followed the same playbook: send your data to our servers, pay monthly, accept our terms, trust us not to read your prompts. The model was the product. You were the user. Your data was the inventory.&lt;/p&gt;

&lt;p&gt;I wanted something different. I wanted an AI that felt like mine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your data. Your AI.
&lt;/h2&gt;

&lt;p&gt;That's the line we built everything around.&lt;/p&gt;

&lt;p&gt;WordPress gave everyone a website. Before WordPress, having a web presence meant renting space on someone else's platform, playing by their rules, losing your content if they shut down. WordPress flipped that. You install it, you own it, you extend it, you run it forever.&lt;/p&gt;

&lt;p&gt;AutoBot is that for AI.&lt;/p&gt;

&lt;p&gt;Self-hosted. Open source. Yours to extend. The AI platform that belongs to you — not to us.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feed it your world
&lt;/h2&gt;

&lt;p&gt;The core feature isn't the chat interface. It's the knowledge base.&lt;/p&gt;

&lt;p&gt;Drop in your docs. Upload your codebase. Paste your business processes. AutoBot's RAG engine turns your raw files into a searchable, queryable AI layer that actually knows your domain — not just what some model was trained on two years ago.&lt;/p&gt;

&lt;p&gt;Feed it your docs. Your codebase. Your business.&lt;br&gt;
It learns what you know. It stays where you are.&lt;/p&gt;

&lt;p&gt;This is the part that changes things. An AI that knows &lt;em&gt;your&lt;/em&gt; codebase gives better answers than a generic model. An AI that's read &lt;em&gt;your&lt;/em&gt; legal documents is more useful than one guessing at your jurisdiction. An AI grounded in &lt;em&gt;your&lt;/em&gt; patient intake forms is more reliable than one pattern-matching across the internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An AI that's actually about you.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Pick your brain
&lt;/h2&gt;

&lt;p&gt;AutoBot is not opinionated about which LLM powers it. That's your call.&lt;/p&gt;

&lt;p&gt;Want to run fully local? Plug in Ollama. LM Studio. llama.cpp. Anything with an OpenAI-compatible endpoint.&lt;/p&gt;

&lt;p&gt;Prefer GPT-4 or Claude for the heavy lifting? Connect your API key. Your data stays on your machine — your prompts go to the model, but your knowledge base documents don't.&lt;/p&gt;

&lt;p&gt;That distinction matters. The brain phones home. Your documents don't.&lt;/p&gt;




&lt;h2&gt;
  
  
  You shouldn't have to ask permission
&lt;/h2&gt;

&lt;p&gt;The thing that finally broke me on cloud AI wasn't the pricing. It was the 2 AM email.&lt;/p&gt;

&lt;p&gt;"We're updating our terms of service effective next month. By continuing to use the service you agree to..."&lt;/p&gt;

&lt;p&gt;With AutoBot, your knowledge base never leaves your machine, whatever brain powers it.&lt;br&gt;
No rate limits on your knowledge. No vendor changing the rules on you overnight.&lt;/p&gt;

&lt;p&gt;Deploy once. Run it your way. Forever.&lt;/p&gt;




&lt;h2&gt;
  
  
  Most AI is rented. AutoBot is yours.
&lt;/h2&gt;

&lt;p&gt;You decide what it does. You decide where it runs. You decide who sees it.&lt;/p&gt;

&lt;p&gt;No subscription. No surveillance. No one reading your prompts.&lt;br&gt;
Install it. Own it. Run it forever.&lt;/p&gt;

&lt;p&gt;For developers who've been burned by API deprecations, pricing pivots, and terms changes — this is for you.&lt;/p&gt;

&lt;p&gt;For law firms, medical startups, and anyone in a regulated industry — your data never touches our servers. No cloud vendor to breach. No third party holding your keys. What stays on your machine stays yours — full stop. You control the perimeter. We just give you the tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get started in 5 minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/mrveiss/AutoBot-AI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AutoBot-AI
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Connect your LLM. Feed your first document.&lt;/p&gt;

&lt;p&gt;That's it. You're running your own AI.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AutoBot is open source. If this resonates, &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;star us on GitHub&lt;/a&gt;, join the &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;community discussions&lt;/a&gt;, or &lt;a href="https://github.com/sponsors/mrveiss" rel="noopener noreferrer"&gt;sponsor the project&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>selfhosted</category>
      <category>opensource</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Weekly Update: ✨ docs(claude.md): add codebase-as-source-of-truth rule</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Mon, 13 Apr 2026 05:00:06 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-docsclaudemd-add-codebase-as-source-of-truth-rule-2ol3</link>
      <guid>https://dev.to/mrveiss/weekly-update-docsclaudemd-add-codebase-as-source-of-truth-rule-2ol3</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ docs(claude.md): add codebase-as-source-of-truth rule
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 docs(claude.md): add codebase-as-source-of-truth rule&lt;/li&gt;
&lt;li&gt;🔧 Merge branch 'main' into Dev_new_gui&lt;/li&gt;
&lt;li&gt;🔧 fix(devops): ensure log directory/files have correct ownersh&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-06T05:00:03.370813Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-06T05:00:03.370813Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Weekly Update: ✨ WIP: preserve work from issue-3291 (#4241)</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Sun, 12 Apr 2026 19:33:43 +0000</pubDate>
      <link>https://dev.to/mrveiss/weekly-update-wip-preserve-work-from-issue-3291-4241-4lk7</link>
      <guid>https://dev.to/mrveiss/weekly-update-wip-preserve-work-from-issue-3291-4241-4lk7</guid>
      <description>&lt;h2&gt;
  
  
  Weekly Update: ✨ WIP: preserve work from issue-3291 (#4241)
&lt;/h2&gt;

&lt;p&gt;This week we shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3291 (#4241)&lt;/li&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3290 (#4240)&lt;/li&gt;
&lt;li&gt;🔧 WIP: preserve work from issue-3281 (#4239)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Contributors: +2&lt;/p&gt;

&lt;p&gt;→ Full changelog: &lt;a href="https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-05T16:46:53.357011Z" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/commits/Dev_new_gui?since=2026-04-05T16:46:53.357011Z&lt;/a&gt;&lt;br&gt;
→ Discuss on GitHub: &lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;https://github.com/mrveiss/AutoBot-AI/discussions&lt;/a&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ai</category>
      <category>automation</category>
      <category>weeklyupdate</category>
    </item>
    <item>
      <title>Fleet Management with Ansible — The AutoBot Approach</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Wed, 08 Apr 2026 19:12:50 +0000</pubDate>
      <link>https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach-3kh5</link>
      <guid>https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach-3kh5</guid>
      <description>&lt;h1&gt;
  
  
  Fleet Management with Ansible — The AutoBot Approach
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Part 3: Scaling to Enterprise Infrastructure
&lt;/h2&gt;

&lt;p&gt;You've completed Parts 1 and 2. You're running AutoBot, your knowledge base is populated, and you're comfortable with the basics. Now comes the hard part: &lt;strong&gt;scaling your infrastructure to dozens of servers across multiple data centers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Running 10 servers with SSH and scripts is manageable. Managing 50 servers? That's painful. Managing 100+? That's unworkable without orchestration.&lt;/p&gt;

&lt;p&gt;The problems multiply: manual deployment coordination across regions, unpredictable rollback times, team members overwriting each other's changes, onboarding new engineers who don't know your procedures, configuration drift creeping in over weeks. You need something that treats your entire fleet as a cohesive unit—something that can deploy a change, verify health across all servers, and roll back if anything fails.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;AutoBot + Ansible&lt;/strong&gt;. Together, they solve the orchestration challenge. Ansible has the power. AutoBot adds intelligence, discoverability, and real-time coordination. This post shows you the complete enterprise approach.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ansible Basics: Quick Recap
&lt;/h2&gt;

&lt;p&gt;If you've followed Part 1, you know Ansible is an agentless configuration management tool. You define infrastructure state in &lt;strong&gt;playbooks&lt;/strong&gt; (YAML files describing tasks), organize them into &lt;strong&gt;roles&lt;/strong&gt; (reusable logic), and target servers with &lt;strong&gt;inventories&lt;/strong&gt; (server lists grouped by function).&lt;/p&gt;

&lt;p&gt;A simple playbook looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;webservers&lt;/span&gt;
  &lt;span class="na"&gt;tasks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy app&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/opt/deploy/restart-app.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Traditional Ansible is powerful but has friction: you SSH into a bastion host, run playbook commands, monitor output, troubleshoot manually. At scale, this becomes a bottleneck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoBot extends Ansible&lt;/strong&gt; by making playbooks discoverable through natural language, orchestrating complex multi-step workflows automatically, adding pre-deployment health checks, providing real-time status updates, and enabling intelligent rollback decisions based on actual health metrics—not just task completion.&lt;/p&gt;




&lt;h2&gt;
  
  
  AutoBot + Ansible Architecture
&lt;/h2&gt;

&lt;p&gt;Here's how AutoBot elevates Ansible to enterprise scale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│ Chat Command: "Deploy v2.5 to production"               │
└─────────────┬───────────────────────────────────────────┘
              ↓
    ┌─────────────────────┐
    │ Parse &amp;amp; Intent      │
    │ Determine target    │
    │ Validate access     │
    └────────┬────────────┘
             ↓
  ┌──────────────────────────────────────┐
  │ AutoBot Fleet Orchestrator           │
  │ - Selects matching playbooks         │
  │ - Orders execution by dependency     │
  │ - Determines parallel vs serial      │
  └──────────┬───────────────────────────┘
             ↓
  ┌──────────────────────────────────────────────────┐
  │ Ansible Inventory &amp;amp; Playbooks                    │
  │ (50+ production servers across 5 data centers)   │
  └──────────┬───────────────────────────────────────┘
             ↓
  ┌────────────────────────────────────────────────────┐
  │ Parallel Execution Layer                           │
  │ - Pre-deployment checks (disk, service health)    │
  │ - Rolling deployment (batches)                    │
  │ - Health verification after each batch            │
  │ - Automatic rollback on failure                   │
  └────────────┬─────────────────────────────────────┘
               ↓
  ┌─────────────────────────────────────────────────┐
  │ Real-time Monitoring &amp;amp; Reporting                │
  │ ✓ 50/50 servers deployed successfully           │
  │ ✓ Health checks: All green                       │
  │ ✓ Deployment complete: 12 minutes                │
  └─────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The flow:&lt;/strong&gt; Chat command → intent parsing → playbook selection → dependency orchestration → parallel execution with rolling strategy → health checks at each stage → real-time status updates → completion report.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deep Example: Zero-Downtime Production Deployment
&lt;/h2&gt;

&lt;p&gt;Scenario: Deploy a critical service update (v2.5) to 50+ production servers across 5 data centers. Traditional approach: 2-3 hours of manual work, SSH sessions to each region, testing at each step, risk of human error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With AutoBot + Ansible: 15 minutes, completely orchestrated.&lt;/strong&gt; The underlying Ansible run for one region, here the web servers in us-east, looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ansible-playbook deploy-v2.5.yml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--inventory&lt;/span&gt; production-inventory.ini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--limit&lt;/span&gt; &lt;span class="s2"&gt;"webservers:&amp;amp;us-east"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--extra-vars&lt;/span&gt; &lt;span class="s2"&gt;"batch_size=10 health_check=true rollback_on_failure=true"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="s2"&gt;"pre-check,deploy,validate"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
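
&lt;p&gt;One detail worth knowing about the command above: Ansible treats values passed to &lt;code&gt;--extra-vars&lt;/code&gt; in &lt;code&gt;key=value&lt;/code&gt; form as strings, so &lt;code&gt;batch_size=10&lt;/code&gt; arrives as &lt;code&gt;"10"&lt;/code&gt; and &lt;code&gt;health_check=true&lt;/code&gt; as &lt;code&gt;"true"&lt;/code&gt;. Passing JSON keeps numbers and booleans typed. The Python sketch below builds that form of the command; the helper name &lt;code&gt;build_deploy_command&lt;/code&gt; is hypothetical and not part of AutoBot or Ansible, and the &lt;code&gt;--limit&lt;/code&gt; flag is left out for brevity.&lt;/p&gt;

```python
import json
import shlex


def build_deploy_command(playbook: str, inventory: str,
                         extra_vars: dict, tags: list[str]) -> list[str]:
    """Return an ansible-playbook argv list with extra vars encoded as JSON.

    build_deploy_command is a hypothetical helper for illustration only.
    """
    return [
        "ansible-playbook", playbook,
        "--inventory", inventory,
        # JSON keeps 10 an integer and True a boolean on the Ansible side.
        "--extra-vars", json.dumps(extra_vars),
        "--tags", ",".join(tags),
    ]


command = build_deploy_command(
    playbook="deploy-v2.5.yml",
    inventory="production-inventory.ini",
    extra_vars={"batch_size": 10, "health_check": True,
                "rollback_on_failure": True},
    tags=["pre-check", "deploy", "validate"],
)
# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(command))
```

&lt;p&gt;If you keep the &lt;code&gt;key=value&lt;/code&gt; form instead, the playbook has to convert the values itself, for example with the &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;bool&lt;/code&gt; filters.&lt;/p&gt;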



&lt;p&gt;&lt;strong&gt;Step 1: Pre-deployment Checks&lt;/strong&gt; (2 minutes)&lt;br&gt;
AutoBot runs checks across all 50 servers in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify at least 20% free disk space on &lt;code&gt;/opt/app&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Confirm core services are healthy&lt;/li&gt;
&lt;li&gt;Validate database connectivity from each app server&lt;/li&gt;
&lt;li&gt;Check load balancer is accessible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any server fails, deployment stops and reports the issue before touching production.&lt;/p&gt;
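
&lt;p&gt;The stop-on-any-failure rule can be sketched as a small gate over per-server results. This is a minimal Python illustration, not AutoBot's implementation: the &lt;code&gt;PreCheckResult&lt;/code&gt; type, its field names, and the host names are assumptions made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass

MIN_FREE_DISK_FRACTION = 0.20  # the 20% free-space threshold from the checklist


@dataclass(frozen=True)
class PreCheckResult:
    """Hypothetical per-server result; field names are assumptions."""
    host: str
    free_disk_fraction: float  # free share of /opt/app, from 0.0 to 1.0
    services_healthy: bool
    database_reachable: bool
    load_balancer_reachable: bool


def failed_checks(result: PreCheckResult) -> list[str]:
    """List the names of the checks this server did not pass."""
    checks = {
        "disk": result.free_disk_fraction >= MIN_FREE_DISK_FRACTION,
        "services": result.services_healthy,
        "database": result.database_reachable,
        "load_balancer": result.load_balancer_reachable,
    }
    return [name for name, passed in checks.items() if not passed]


def blocking_failures(results: list[PreCheckResult]) -> dict[str, list[str]]:
    """Map each failing host to its failed checks; an empty dict means go."""
    report = {}
    for result in results:
        failed = failed_checks(result)
        if failed:
            report[result.host] = failed
    return report


results = [
    PreCheckResult("web-01", 0.35, True, True, True),
    PreCheckResult("web-02", 0.12, True, False, True),
]
report = blocking_failures(results)
if report:
    print("Deployment blocked:", report)
```

&lt;p&gt;Returning the full report, rather than stopping at the first failure, lets the operator see every problem host in one pass.&lt;/p&gt;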

&lt;p&gt;&lt;strong&gt;Step 2: Rolling Deployment&lt;/strong&gt; (10 minutes)&lt;br&gt;
Deploy in batches of 10 servers, removing each batch from the load balancer before deploying to it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Remove 10 servers from the load balancer&lt;/li&gt;
&lt;li&gt;Deploy the v2.5 binary (~1 minute per batch, parallelized)&lt;/li&gt;
&lt;li&gt;Run a post-deploy smoke test (curl endpoints, verify response codes)&lt;/li&gt;
&lt;li&gt;Return the batch to the load balancer&lt;/li&gt;
&lt;li&gt;Wait 30 seconds for traffic to normalize&lt;/li&gt;
&lt;li&gt;Repeat for the next batch&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While a batch is out of rotation, the other 40 servers keep serving traffic, and the load balancer spreads requests across that remaining capacity. User impact: zero.&lt;/p&gt;
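
&lt;p&gt;The batch loop above can be sketched in Python as follows. This is an illustration of the control flow only, under the assumption that draining, deploying, testing, and restoring are supplied as callables; none of these function names come from AutoBot or Ansible.&lt;/p&gt;

```python
import time
from typing import Callable, Iterator, Sequence


def batches(hosts: Sequence[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size hosts."""
    if not batch_size >= 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(hosts), batch_size):
        yield list(hosts[start:start + batch_size])


def rolling_deploy(
    hosts: Sequence[str],
    batch_size: int,
    drain: Callable[[list[str]], None],
    deploy: Callable[[list[str]], None],
    smoke_test: Callable[[list[str]], bool],
    restore: Callable[[list[str]], None],
    settle_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[list[str], list[str]]:
    """Run the drain/deploy/test/restore loop one batch at a time.

    Returns (hosts deployed and back in rotation, batch that failed).
    The second list is empty when every batch passed its smoke test.
    """
    completed: list[str] = []
    for batch in batches(hosts, batch_size):
        drain(batch)                 # 1. remove from the load balancer
        deploy(batch)                # 2. push the new version
        if not smoke_test(batch):    # 3. verify before taking traffic
            return completed, batch  # stop; the failed batch stays drained
        restore(batch)               # 4. return to the load balancer
        completed.extend(batch)
        sleep(settle_seconds)        # 5. let traffic normalize
    return completed, []


# Dry run with stand-in callables; real ones would call your load
# balancer and deployment tooling.
fleet = [f"web-{n:02d}" for n in range(1, 51)]
log: list[str] = []
done, failed = rolling_deploy(
    fleet,
    batch_size=10,
    drain=lambda b: log.append(f"drain {b[0]}..{b[-1]}"),
    deploy=lambda b: log.append(f"deploy {b[0]}..{b[-1]}"),
    smoke_test=lambda b: True,
    restore=lambda b: log.append(f"restore {b[0]}..{b[-1]}"),
    sleep=lambda seconds: None,  # skip the real 30 s wait in a dry run
)
print(len(done), "hosts deployed,", len(failed), "hosts in a failed batch")
```

&lt;p&gt;A failed smoke test leaves the batch out of rotation on purpose: traffic never reaches servers that have not passed verification, and the caller can decide whether to roll back.&lt;/p&gt;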

&lt;p&gt;&lt;strong&gt;Step 3: Canary Validation&lt;/strong&gt; (1 minute)&lt;br&gt;
Before declaring success, AutoBot validates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error rate on newly deployed servers is at or below baseline&lt;/li&gt;
&lt;li&gt;Response latency within acceptable bounds&lt;/li&gt;
&lt;li&gt;No spike in database queries per server&lt;/li&gt;
&lt;li&gt;Health check endpoints return 200&lt;/li&gt;
&lt;/ul&gt;
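
&lt;p&gt;As a rough Python sketch, these gates amount to comparing a post-deploy metric snapshot against a baseline. The &lt;code&gt;Metrics&lt;/code&gt; type, the latency budget, and the 1.5x query-spike factor are assumptions chosen for illustration; the post does not specify AutoBot's actual thresholds. An error rate at or below baseline counts as passing here.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Hypothetical metric snapshot; names and units are assumptions."""
    error_rate: float          # fraction of failed requests, e.g. 0.001
    p95_latency_ms: float
    db_queries_per_sec: float
    health_status: int         # HTTP status of the health endpoint


def violations(new: Metrics, baseline: Metrics,
               latency_budget_ms: float,
               query_spike_factor: float = 1.5) -> list[str]:
    """Return the names of the validation gates the new servers fail."""
    failed = []
    if new.error_rate > baseline.error_rate:
        failed.append("error_rate")
    if new.p95_latency_ms > latency_budget_ms:
        failed.append("latency")
    if new.db_queries_per_sec > baseline.db_queries_per_sec * query_spike_factor:
        failed.append("db_queries")
    if new.health_status != 200:
        failed.append("health")
    return failed


baseline = Metrics(error_rate=0.001, p95_latency_ms=180.0,
                   db_queries_per_sec=40.0, health_status=200)
after_deploy = Metrics(error_rate=0.025, p95_latency_ms=175.0,
                       db_queries_per_sec=42.0, health_status=200)
print(violations(after_deploy, baseline, latency_budget_ms=250.0))
```

&lt;p&gt;An empty list means every gate passed; any entry is a reason to stop and consider rollback.&lt;/p&gt;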

&lt;p&gt;&lt;strong&gt;Step 4: Rollback Capability&lt;/strong&gt; (available immediately)&lt;br&gt;
If any metric fails validation, AutoBot automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stops further deployments&lt;/li&gt;
&lt;li&gt;Rolls back deployed servers to the previous version&lt;/li&gt;
&lt;li&gt;Restores the original traffic distribution&lt;/li&gt;
&lt;li&gt;Alerts the on-call team with detailed logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real performance:&lt;/strong&gt; For 50 servers and a 100 MB binary, network transfer takes about 1 minute (bandwidth-limited), and each batch takes 2-3 minutes at current scale.&lt;/p&gt;


&lt;h2&gt;
  
  
  Advanced Features
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Health Checks &amp;amp; Intelligent Pausing
&lt;/h3&gt;

&lt;p&gt;AutoBot monitors health during deployment, running a check like this after each batch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Post-deploy health check&lt;/span&gt;
  &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:8080/health&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GET&lt;/span&gt;
  &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;health&lt;/span&gt;
  &lt;span class="na"&gt;failed_when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;health.status != &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the check fails on any batch, deployment pauses and AutoBot provides context: "Batch 3 (us-west-2) failed health checks. Error rate spiked from 0.1% to 2.5%. Rollback batch 3? [Y/n]" You investigate, fix the issue, and resume without redeploying unaffected servers.&lt;/p&gt;
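
&lt;p&gt;A single failed probe right after a restart can be transient, so one reasonable way to decide when to pause is to retry the health check a few times first. The Python sketch below shows that idea; it is an assumption about how such a gate could work, not a description of AutoBot's behavior, and &lt;code&gt;wait_until_healthy&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], int],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe until it returns HTTP 200 or attempts run out.

    probe is any zero-argument callable returning a status code, for
    example a small wrapper around an HTTP GET to the health endpoint.
    """
    for attempt in range(1, attempts + 1):
        if probe() == 200:
            return True
        if attempt != attempts:
            sleep(delay_seconds)  # no wait after the final attempt
    return False


# Simulated service that needs two probes to warm up.
responses = iter([503, 503, 200])
print(wait_until_healthy(lambda: next(responses), sleep=lambda s: None))
```

&lt;p&gt;Only when this returns &lt;code&gt;False&lt;/code&gt; would the deployment pause and ask the operator what to do.&lt;/p&gt;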

&lt;h3&gt;
  
  
  Conditional Deployments
&lt;/h3&gt;

&lt;p&gt;Some services have dependencies. Deploy the cache service first, then the application layer, then the API gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy cache tier&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cache_servers&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy app tier&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app_servers&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy API gateway&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api_gateway&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;gateway&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



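&lt;p&gt;AutoBot respects dependency order and parallelizes independent paths. The example above is a linear chain: the app tier waits for the cache tier, and the gateway waits for the app tier. Tiers that do not depend on each other, such as a cache upgrade and a database upgrade, can run in parallel, with the application waiting for both. Note that &lt;code&gt;dependencies&lt;/code&gt; is not a play keyword in stock Ansible, which rejects unknown play attributes and runs plays top to bottom, so read the snippet as an illustration of ordering rather than a file to pass to &lt;code&gt;ansible-playbook&lt;/code&gt; directly.&lt;/p&gt;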
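
&lt;p&gt;Dependency ordering with parallel paths is a topological sort. As a minimal sketch, Python's standard &lt;code&gt;graphlib&lt;/code&gt; module can group tiers into waves, where everything in one wave can run in parallel. The tier names mirror the example, the &lt;code&gt;database&lt;/code&gt; tier is added to show a parallel path, and &lt;code&gt;execution_waves&lt;/code&gt; is a hypothetical helper, not AutoBot's scheduler.&lt;/p&gt;

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group tiers into waves; each wave only needs earlier waves to finish.

    depends_on maps a tier to the tiers that must be deployed before it.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


tiers = {
    "cache": set(),
    "database": set(),
    "app": {"cache", "database"},
    "gateway": {"app"},
}
for number, wave in enumerate(execution_waves(tiers), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

&lt;p&gt;With these inputs, cache and database land in the first wave, the app tier in the second, and the gateway in the third.&lt;/p&gt;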

&lt;h3&gt;
  
  
  Real-time Status in Chat
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: Deploy cache-v3 to production
AutoBot: Starting deployment to 15 cache servers...
  ✓ Pre-checks passed
  • Batch 1: Deploying (3/5 servers done)
  • Batch 2: Queued
  ✓ Health: All green
  ETA: 6 minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No SSH. No log tailing. Just clear, real-time progress in your chat interface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance &amp;amp; Scale
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fleet size:&lt;/strong&gt; Tested with fleets of 500+ servers. Orchestration starts in under 30 seconds, and status queries return in under a second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment speed:&lt;/strong&gt; Network bandwidth is the limiting factor. A 100 MB binary across 50 servers takes about 1 minute (assuming a 10 Gbps cluster network). Configuration changes without a binary transfer take about 20 seconds.&lt;/p&gt;
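
&lt;p&gt;To sanity-check transfer estimates for your own fleet, raw wire time can be computed from binary size, server count, and link speed. The Python sketch below assumes every copy passes through one shared link and ignores protocol overhead, so it gives a lower bound. By this estimate the raw transfer is only part of the quoted minute; the remainder would come from per-host work such as connection setup, checksums, and unpacking, which is an assumption rather than a measurement from this post.&lt;/p&gt;

```python
def wire_time_seconds(binary_mb: float, servers: int, link_gbps: float) -> float:
    """Seconds to push one copy per server through a shared link.

    Uses decimal units (1 MB = 10**6 bytes, 1 Gbps = 10**9 bits/s) and
    ignores protocol overhead, so the result is a lower bound.
    """
    total_bits = binary_mb * 10**6 * 8 * servers
    return total_bits / (link_gbps * 10**9)


for gbps in (10, 1):
    seconds = wire_time_seconds(binary_mb=100, servers=50, link_gbps=gbps)
    print(f"{gbps} Gbps shared link: {seconds:.0f} s")
```

&lt;p&gt;For 50 copies of a 100 MB binary this gives 4 seconds at 10 Gbps and 40 seconds at 1 Gbps.&lt;/p&gt;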

&lt;p&gt;&lt;strong&gt;Failure handling:&lt;/strong&gt; Detect failure on one server, pause orchestration, investigate, resume remaining batches without redeploying successful servers. Zero re-work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Optimization:&lt;/strong&gt; Choose rolling deployments for critical services (maintain capacity), canary for lower-risk changes (faster feedback), or blue-green for instant rollback on database schema changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;You've now completed the full AutoBot trilogy:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/building-a-self-hosted-ai-platform-with-autobot"&gt;Part 1: Building a Self-Hosted AI Platform&lt;/a&gt;&lt;/strong&gt; — Get AutoBot running, understand the chat interface, manage your first fleet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/how-we-use-rag-for-knowledge-base-search-in-autobot"&gt;Part 2: How We Use RAG for Knowledge Base Search&lt;/a&gt;&lt;/strong&gt; — Turn your scattered runbooks into instant, intelligent answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach"&gt;Part 3: Fleet Management with Ansible&lt;/a&gt;&lt;/strong&gt; — Orchestrate enterprise infrastructure with zero-downtime deployments and intelligent health management.&lt;/p&gt;

&lt;p&gt;Deploy your first fleet. Join the community. Infrastructure automation is no longer a luxury—it's essential for scale.&lt;/p&gt;

&lt;p&gt;What's your biggest orchestration challenge? Let me know in the comments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started with AutoBot
&lt;/h2&gt;

&lt;p&gt;AutoBot is free, open source, and ready to run on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📦 GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;Source Code &amp;amp; Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI#readme" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/issues" rel="noopener noreferrer"&gt;Issues &amp;amp; Feature Requests&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;Discussions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deploy it today with: &lt;code&gt;docker compose up -d&lt;/code&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ansible</category>
      <category>fleetmanagement</category>
      <category>devops</category>
    </item>
    <item>
      <title>Fleet Management with Ansible — The AutoBot Approach</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Wed, 08 Apr 2026 14:19:48 +0000</pubDate>
      <link>https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach-3mnm</link>
      <guid>https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach-3mnm</guid>
      <description>&lt;h1&gt;
  
  
  Fleet Management with Ansible — The AutoBot Approach
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Part 3: Scaling to Enterprise Infrastructure
&lt;/h2&gt;

&lt;p&gt;You've completed Parts 1 and 2. You're running AutoBot, your knowledge base is populated, and you're comfortable with the basics. Now comes the hard part: &lt;strong&gt;scaling your infrastructure to dozens of servers across multiple data centers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Running 10 servers with SSH and scripts is manageable. Managing 50 servers? That's painful. Managing 100+? That's unworkable without orchestration.&lt;/p&gt;

&lt;p&gt;The problems multiply: manual deployment coordination across regions, unpredictable rollback times, team members overwriting each other's changes, onboarding new engineers who don't know your procedures, configuration drift creeping in over weeks. You need something that treats your entire fleet as a cohesive unit—something that can deploy a change, verify health across all servers, and roll back if anything fails.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;AutoBot + Ansible&lt;/strong&gt;. Together, they solve the orchestration challenge. Ansible has the power. AutoBot adds intelligence, discoverability, and real-time coordination. This post shows you the complete enterprise approach.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ansible Basics: Quick Recap
&lt;/h2&gt;

&lt;p&gt;If you've followed Part 1, you know Ansible is an agentless configuration management tool. You define infrastructure state in &lt;strong&gt;playbooks&lt;/strong&gt; (YAML files describing tasks), organize them into &lt;strong&gt;roles&lt;/strong&gt; (reusable logic), and target servers with &lt;strong&gt;inventories&lt;/strong&gt; (server lists grouped by function).&lt;/p&gt;

&lt;p&gt;A simple playbook looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;webservers&lt;/span&gt;
  &lt;span class="na"&gt;tasks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy app&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/opt/deploy/restart-app.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Traditional Ansible is powerful but has friction: you SSH into a bastion host, run playbook commands, monitor output, troubleshoot manually. At scale, this becomes a bottleneck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoBot extends Ansible&lt;/strong&gt; by making playbooks discoverable through natural language, orchestrating complex multi-step workflows automatically, adding pre-deployment health checks, providing real-time status updates, and enabling intelligent rollback decisions based on actual health metrics—not just task completion.&lt;/p&gt;




&lt;h2&gt;
  
  
  AutoBot + Ansible Architecture
&lt;/h2&gt;

&lt;p&gt;Here's how AutoBot elevates Ansible to enterprise scale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│ Chat Command: "Deploy v2.5 to production"               │
└─────────────┬───────────────────────────────────────────┘
              ↓
    ┌─────────────────────┐
    │ Parse &amp;amp; Intent      │
    │ Determine target    │
    │ Validate access     │
    └────────┬────────────┘
             ↓
  ┌──────────────────────────────────────┐
  │ AutoBot Fleet Orchestrator           │
  │ - Selects matching playbooks         │
  │ - Orders execution by dependency     │
  │ - Determines parallel vs serial      │
  └──────────┬───────────────────────────┘
             ↓
  ┌──────────────────────────────────────────────────┐
  │ Ansible Inventory &amp;amp; Playbooks                    │
  │ (50+ production servers across 5 data centers)   │
  └──────────┬───────────────────────────────────────┘
             ↓
  ┌────────────────────────────────────────────────────┐
  │ Parallel Execution Layer                           │
  │ - Pre-deployment checks (disk, service health)    │
  │ - Rolling deployment (batches)                    │
  │ - Health verification after each batch            │
  │ - Automatic rollback on failure                   │
  └────────────┬─────────────────────────────────────┘
               ↓
  ┌─────────────────────────────────────────────────┐
  │ Real-time Monitoring &amp;amp; Reporting                │
  │ ✓ 50/50 servers deployed successfully           │
  │ ✓ Health checks: All green                       │
  │ ✓ Deployment complete: 12 minutes                │
  └─────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The flow:&lt;/strong&gt; Chat command → intent parsing → playbook selection → dependency orchestration → parallel execution with rolling strategy → health checks at each stage → real-time status updates → completion report.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deep Example: Zero-Downtime Production Deployment
&lt;/h2&gt;

&lt;p&gt;Scenario: Deploy a critical service update (v2.5) to 50+ production servers across 5 data centers. Traditional approach: 2-3 hours of manual work, SSH sessions to each region, testing at each step, risk of human error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With AutoBot + Ansible: 15 minutes, completely orchestrated.&lt;/strong&gt; The underlying Ansible run for one region, here the web servers in us-east, looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ansible-playbook deploy-v2.5.yml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--inventory&lt;/span&gt; production-inventory.ini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--limit&lt;/span&gt; &lt;span class="s2"&gt;"webservers:&amp;amp;us-east"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--extra-vars&lt;/span&gt; &lt;span class="s2"&gt;"batch_size=10 health_check=true rollback_on_failure=true"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="s2"&gt;"pre-check,deploy,validate"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 1: Pre-deployment Checks&lt;/strong&gt; (2 minutes)&lt;br&gt;
AutoBot runs checks across all 50 servers in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify at least 20% free disk space on &lt;code&gt;/opt/app&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Confirm core services are healthy&lt;/li&gt;
&lt;li&gt;Validate database connectivity from each app server&lt;/li&gt;
&lt;li&gt;Check load balancer is accessible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any server fails, deployment stops and reports the issue before touching production.&lt;/p&gt;
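&lt;p&gt;To make the first check concrete, here is a minimal Python sketch of a free-disk-space test. It is an illustration under assumptions, not AutoBot's actual code: &lt;code&gt;free_fraction&lt;/code&gt; and &lt;code&gt;has_enough_free_space&lt;/code&gt; are hypothetical names, the 20% figure is read as a minimum, and a real pre-check would run on every target host rather than on the local machine.&lt;/p&gt;

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the free share of a filesystem as a value between 0 and 1."""
    if total_bytes == 0:
        return 0.0
    return free_bytes / total_bytes


def has_enough_free_space(path: str, min_free: float = 0.20) -> bool:
    """Hypothetical pre-check: True when at least min_free of the disk is free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return free_fraction(usage.total, usage.free) >= min_free


if __name__ == "__main__":
    # "/" stands in for /opt/app so the sketch runs on any machine.
    print(has_enough_free_space("/"))
```

&lt;p&gt;Keeping the ratio calculation separate from the filesystem call makes the threshold logic easy to test without touching a real disk.&lt;/p&gt;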

&lt;p&gt;&lt;strong&gt;Step 2: Rolling Deployment&lt;/strong&gt; (10 minutes)&lt;br&gt;
Deploy in batches of 10 servers, removing each batch from the load balancer before deploying to it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Remove 10 servers from load balancer&lt;/li&gt;
&lt;li&gt;Deploy v2.5 binary (~1 minute per batch, parallelized)&lt;/li&gt;
&lt;li&gt;Run post-deploy smoke test (curl endpoints, verify response codes)&lt;/li&gt;
&lt;li&gt;Restore to load balancer&lt;/li&gt;
&lt;li&gt;Wait 30 seconds for traffic to normalize&lt;/li&gt;
&lt;li&gt;Repeat for next batch&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;During this process, 40 servers continue serving traffic. User impact: zero. The load balancer handles traffic gracefully across remaining capacity.&lt;/p&gt;
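&lt;p&gt;The batching arithmetic behind this step can be sketched as follows. The helper names and the &lt;code&gt;web-NN&lt;/code&gt; host names are hypothetical, chosen only for illustration; the sketch covers how the fleet is split and how much capacity stays in rotation, not the deployment itself.&lt;/p&gt;

```python
def make_batches(servers: list[str], batch_size: int) -> list[list[str]]:
    """Split the server list into consecutive batches of at most batch_size."""
    if 1 > batch_size:
        raise ValueError("batch_size must be at least 1")
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


def serving_during_batch(total: int, batch_size: int) -> int:
    """Servers still behind the load balancer while one full batch is out."""
    return total - min(batch_size, total)


if __name__ == "__main__":
    fleet = [f"web-{n:02d}" for n in range(1, 51)]  # hypothetical host names
    batches = make_batches(fleet, 10)
    print(len(batches), "batches;", serving_during_batch(len(fleet), 10), "servers keep serving")
```

&lt;p&gt;For 50 servers and a batch size of 10 this prints 5 batches with 40 servers still serving, which matches the numbers in the walkthrough.&lt;/p&gt;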

&lt;p&gt;&lt;strong&gt;Step 3: Canary Validation&lt;/strong&gt; (1 minute)&lt;br&gt;
Before declaring success, AutoBot validates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error rate on newly deployed servers is no higher than baseline&lt;/li&gt;
&lt;li&gt;Response latency within acceptable bounds&lt;/li&gt;
&lt;li&gt;No spike in database queries per server&lt;/li&gt;
&lt;li&gt;Health check endpoints return 200&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Rollback Capability&lt;/strong&gt; (available immediately)&lt;br&gt;
If any metric fails validation, AutoBot automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stops further deployments&lt;/li&gt;
&lt;li&gt;Rolls back deployed servers to previous version&lt;/li&gt;
&lt;li&gt;Restores original traffic distribution&lt;/li&gt;
&lt;li&gt;Alerts on-call team with detailed logs&lt;/li&gt;
&lt;/ul&gt;
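&lt;p&gt;Steps 3 and 4 boil down to comparing post-deploy metrics against a baseline and rolling back when any comparison fails. The sketch below shows that decision in Python. Everything in it is an assumption made for illustration: the &lt;code&gt;Metrics&lt;/code&gt; fields, the 250 ms latency limit, and the rule that an error rate above baseline counts as a failure. The database query check is left out to keep it short.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float      # fraction of failed requests, e.g. 0.001 for 0.1%
    p95_latency_ms: float  # 95th percentile response latency
    health_status: int     # HTTP status returned by the health endpoint


def failed_checks(current: Metrics, baseline: Metrics, max_latency_ms: float) -> list[str]:
    """Return the names of the validation checks that failed (empty means pass)."""
    failures = []
    if current.error_rate > baseline.error_rate:
        failures.append("error_rate")
    if current.p95_latency_ms > max_latency_ms:
        failures.append("latency")
    if current.health_status != 200:
        failures.append("health")
    return failures


def decide(current: Metrics, baseline: Metrics, max_latency_ms: float = 250.0) -> str:
    """Hypothetical policy: any failed check triggers a rollback."""
    return "rollback" if failed_checks(current, baseline, max_latency_ms) else "continue"


if __name__ == "__main__":
    baseline = Metrics(error_rate=0.001, p95_latency_ms=120.0, health_status=200)
    after_deploy = Metrics(error_rate=0.025, p95_latency_ms=130.0, health_status=200)
    print(decide(after_deploy, baseline), failed_checks(after_deploy, baseline, 250.0))
```

&lt;p&gt;Returning the list of failed checks, and not just a boolean, gives the on-call alert something specific to report.&lt;/p&gt;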

&lt;p&gt;&lt;strong&gt;Real performance:&lt;/strong&gt; For 50 servers and a 100 MB binary, network transfer takes about 1 minute (bandwidth-limited), and each batch takes 2-3 minutes at current scale.&lt;/p&gt;


&lt;h2&gt;
  
  
  Advanced Features
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Health Checks &amp;amp; Intelligent Pausing
&lt;/h3&gt;

&lt;p&gt;AutoBot monitors health during deployment. If a health check fails on any batch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Post-deploy health check&lt;/span&gt;
  &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:8080/health&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GET&lt;/span&gt;
  &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;health&lt;/span&gt;
  &lt;span class="na"&gt;failed_when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;health.status != &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When that happens, deployment pauses and AutoBot provides context: "Batch 3 (us-west-2) failed health checks. Error rate spiked from 0.1% to 2.5%. Rollback batch 3? [Y/n]" You investigate, fix the issue, and resume without redeploying unaffected servers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conditional Deployments
&lt;/h3&gt;

&lt;p&gt;Some services have dependencies. Deploy the cache service first, then the application layer, then the API gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy cache tier&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cache_servers&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy app tier&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app_servers&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy API gateway&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api_gateway&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;gateway&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



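&lt;p&gt;AutoBot respects dependency order and parallelizes independent paths. In the example above the chain is linear: the app tier waits for the cache tier, and the gateway waits for the app tier. With a database tier added as a second dependency of the app tier, cache and database upgrades would run in parallel and the application would wait for both.&lt;/p&gt;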
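&lt;p&gt;Dependency-ordered deployment is essentially a topological sort. Here is a small sketch using Python's standard &lt;code&gt;graphlib&lt;/code&gt; module (Python 3.9+). It is not AutoBot's scheduler: &lt;code&gt;deployment_waves&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;database&lt;/code&gt; tier is added only to show two independent tiers landing in the same wave.&lt;/p&gt;

```python
from graphlib import TopologicalSorter


def deployment_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tiers into waves; tiers in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything deployable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependents unlock
    return waves


if __name__ == "__main__":
    # Each tier maps to the tiers it depends on; "database" is a hypothetical extra tier.
    tiers = {
        "cache": set(),
        "database": set(),
        "app": {"cache", "database"},
        "gateway": {"app"},
    }
    print(deployment_waves(tiers))
```

&lt;p&gt;The output is &lt;code&gt;[['cache', 'database'], ['app'], ['gateway']]&lt;/code&gt;: the first wave can run in parallel, and each later wave starts only after the previous one is done.&lt;/p&gt;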

&lt;h3&gt;
  
  
  Real-time Status in Chat
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;You: Deploy cache-v3 to production
AutoBot: Starting deployment to 15 cache servers...
  ✓ Pre-checks passed
  • Batch 1: Deploying (3/5 servers done)
  • Batch 2: Queued
  ✓ Health: All green
  ETA: 6 minutes
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No SSH. No log tailing. Just clear, real-time progress in your chat interface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance &amp;amp; Scale
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fleet size:&lt;/strong&gt; Tested to 500+ servers. Response time under 30 seconds to start orchestration, sub-second status queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment speed:&lt;/strong&gt; Network bandwidth is the limiting factor. A 100MB binary across 50 servers ≈ 1 minute (assuming 10 Gbps cluster network). Configuration changes without binary transfer ≈ 20 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure handling:&lt;/strong&gt; Detect failure on one server, pause orchestration, investigate, resume remaining batches without redeploying successful servers. Zero re-work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Optimization:&lt;/strong&gt; Choose rolling deployments for critical services (maintain capacity), canary for lower-risk changes (faster feedback), or blue-green for instant rollback on database schema changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;You've now completed the full AutoBot trilogy:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/building-a-self-hosted-ai-platform-with-autobot"&gt;Part 1: Building a Self-Hosted AI Platform&lt;/a&gt;&lt;/strong&gt; — Get AutoBot running, understand the chat interface, manage your first fleet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/how-we-use-rag-for-knowledge-base-search-in-autobot"&gt;Part 2: How We Use RAG for Knowledge Base Search&lt;/a&gt;&lt;/strong&gt; — Turn your scattered runbooks into instant, intelligent answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/mrveiss/fleet-management-with-ansible-the-autobot-approach"&gt;Part 3: Fleet Management with Ansible&lt;/a&gt;&lt;/strong&gt; — Orchestrate enterprise infrastructure with zero-downtime deployments and intelligent health management.&lt;/p&gt;

&lt;p&gt;Deploy your first fleet. Join the community. Infrastructure automation is no longer a luxury—it's essential for scale.&lt;/p&gt;

&lt;p&gt;What's your biggest orchestration challenge? Let me know in the comments.&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>ansible</category>
      <category>fleetmanagement</category>
      <category>devops</category>
    </item>
    <item>
      <title>How We Use RAG for Knowledge Base Search in AutoBot</title>
      <dc:creator>Mārtiņš Veiss</dc:creator>
      <pubDate>Wed, 08 Apr 2026 14:14:38 +0000</pubDate>
      <link>https://dev.to/mrveiss/how-we-use-rag-for-knowledge-base-search-in-autobot-52ce</link>
      <guid>https://dev.to/mrveiss/how-we-use-rag-for-knowledge-base-search-in-autobot-52ce</guid>
      <description>&lt;h1&gt;
  
  
  How We Use RAG for Knowledge Base Search in AutoBot
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Part 2: Unlocking Your Team's Collective Intelligence
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://dev.to/mrveiss/post-1-getting-started"&gt;Part 1&lt;/a&gt;, you set up AutoBot and experienced how it can execute basic infrastructure tasks. Now let's unlock its real power: &lt;strong&gt;turning your scattered knowledge into instant, intelligent answers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Where does your team's critical knowledge live? Deployment runbooks in Google Drive. Database failover procedures in forgotten Confluence docs. Incident post-mortems buried in Slack. At 3 AM during an outage, finding that knowledge is nearly impossible.&lt;/p&gt;

&lt;p&gt;AutoBot solves this with &lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt;—a technique that lets AutoBot search your actual documentation and generate answers based on your procedures, not generic training data. We'll explore how RAG works, build a practical knowledge base, and show you why this beats traditional keyword search.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is RAG? (Plain English)
&lt;/h2&gt;

&lt;p&gt;RAG stands for &lt;strong&gt;Retrieval-Augmented Generation&lt;/strong&gt;—three operations in one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt;: Find relevant documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Augmented&lt;/strong&gt;: Enhance the AI's answer with those documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt;: LLM writes the final answer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RAG answers questions using &lt;em&gt;your&lt;/em&gt; knowledge, not the LLM's training data.&lt;/p&gt;

&lt;p&gt;Example: You ask AutoBot: &lt;strong&gt;"How do we handle database replication lag?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Without RAG, the LLM guesses with generic textbook advice. With RAG:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AutoBot searches your knowledge base (runbooks, procedures, incidents)&lt;/li&gt;
&lt;li&gt;Finds documents about your team's replication remediation steps&lt;/li&gt;
&lt;li&gt;Generates an answer grounded in &lt;em&gt;your procedures&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;You get: "Based on your runbook, first check replication status with &lt;code&gt;SHOW REPLICA STATUS&lt;/code&gt;, then..."&lt;/li&gt;
&lt;/ol&gt;
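&lt;p&gt;The "augmented" part is the least obvious of the three, so here is a minimal sketch of it: retrieved chunks are placed in the prompt ahead of the question. The function name &lt;code&gt;build_prompt&lt;/code&gt; and the prompt wording are assumptions for illustration, not the prompt AutoBot actually sends.&lt;/p&gt;

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved documentation with the question into one LLM prompt."""
    context = "\n\n".join(
        f"[Source {number}]\n{chunk}"
        for number, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer using only the documentation below. "
        "If it does not cover the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    chunks = [
        "Check replication lag: SHOW REPLICA STATUS",
        "If lag > 10 seconds, investigate primary",
    ]
    print(build_prompt("How do we handle database replication lag?", chunks))
```

&lt;p&gt;Numbering the sources lets the model cite which chunk an instruction came from, which makes answers easier to verify against the runbook.&lt;/p&gt;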

&lt;p&gt;Generic advice versus actionable, organization-specific answers. That's why RAG is a game-changer for infrastructure knowledge management.&lt;/p&gt;




&lt;h2&gt;
  
  
  How AutoBot + RAG Works: The Technical Flow
&lt;/h2&gt;

&lt;p&gt;Let's walk through how AutoBot transforms your documents into searchable intelligence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────┐
│           AutoBot RAG Pipeline                      │
├────────────────────────────────────────────────────┤
│                                                    │
│  1. DOCUMENTS                                      │
│     (Runbooks, Procedures, Incidents)             │
│              ↓                                      │
│  2. VECTORIZATION                                  │
│     Convert text → mathematical vectors           │
│     (Embeddings capture meaning)                  │
│              ↓                                      │
│  3. STORAGE                                        │
│     Save vectors in database (ChromaDB)           │
│     With original text for reference              │
│              ↓                                      │
│  ════════════════════════════════════════          │
│              (Knowledge Base Ready)                │
│  ════════════════════════════════════════          │
│              ↓                                      │
│  4. USER QUERY                                     │
│     "How do we handle X?"                         │
│              ↓                                      │
│  5. QUERY VECTORIZATION                            │
│     Convert question → vector                     │
│              ↓                                      │
│  6. SIMILARITY SEARCH                              │
│     Find most similar document vectors            │
│              ↓                                      │
│  7. RETRIEVAL                                      │
│     Extract relevant document chunks              │
│              ↓                                      │
│  8. GENERATION                                     │
│     LLM reads docs + generates answer             │
│              ↓                                      │
│  ANSWER (grounded in YOUR knowledge)              │
│                                                    │
└────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why embeddings beat keyword search&lt;/strong&gt;: Keyword search looks for exact word matches and fails when terminology differs. Embeddings capture &lt;em&gt;meaning&lt;/em&gt;—they understand "lag," "slowness," and "delays" are related. They find the right document even with different wording.&lt;/p&gt;

&lt;p&gt;Vector databases store embeddings efficiently for sub-second retrieval even at massive scale. When your question arrives, AutoBot converts it to the same vector space and finds the closest neighbors—your most relevant documents.&lt;/p&gt;
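&lt;p&gt;Finding the "closest neighbors" usually means ranking documents by cosine similarity to the query vector. The sketch below does this with a plain linear scan over toy 3-dimensional vectors. The vectors and document names are made up for illustration, and a vector database would typically replace the scan with an index, but the ranking idea is the same.&lt;/p&gt;

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], documents: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy vectors; real embeddings have hundreds of dimensions.
    docs = {
        "replication-lag-runbook": [0.9, 0.1, 0.0],
        "deployment-checklist": [0.1, 0.9, 0.1],
        "oncall-escalation": [0.0, 0.2, 0.9],
    }
    query = [0.8, 0.2, 0.1]  # stands in for an embedded question about replica delays
    print(top_k(query, docs, k=1))
```

&lt;p&gt;With these toy values the query lands closest to &lt;code&gt;replication-lag-runbook&lt;/code&gt;, even though no keywords are compared at any point.&lt;/p&gt;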




&lt;h2&gt;
  
  
  Building Your First Knowledge Base: A Practical Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's get hands-on. Here's how you build a RAG-powered knowledge base in AutoBot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Prepare Your Documents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Gather your source material. For our example, let's use a database failover runbook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Database Failover Runbook&lt;/span&gt;

&lt;span class="gu"&gt;## Quick Reference&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; RTO: 5 minutes
&lt;span class="p"&gt;-&lt;/span&gt; RPO: 0 (synchronous replication)

&lt;span class="gu"&gt;## Detection&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Check replication lag: SHOW REPLICA STATUS
&lt;span class="p"&gt;2.&lt;/span&gt; If lag &amp;gt; 10 seconds, investigate primary
&lt;span class="p"&gt;3.&lt;/span&gt; Monitor replica_lag_ms metric in Prometheus

&lt;span class="gu"&gt;## Failover Process&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Verify replica is caught up (Seconds_Behind_Source = 0)
&lt;span class="p"&gt;2.&lt;/span&gt; Run: STOP REPLICA; CHANGE REPLICATION SOURCE TO SOURCE_HOST='new_primary'
&lt;span class="p"&gt;3.&lt;/span&gt; Validate data integrity with pt-table-checksum
&lt;span class="p"&gt;4.&lt;/span&gt; Update connection strings in config management
&lt;span class="p"&gt;5.&lt;/span&gt; Run deployment hook to restart services
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Upload to AutoBot&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open AutoBot's chat interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: Upload database-failover-runbook.md to my knowledge base
AutoBot: ✓ Indexed 1,847 tokens from database-failover-runbook.md
         Document ID: kb_database_failover_001
         Ready for queries
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Test with a Query&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: Our database replica is running 30 seconds behind. What should we do?
AutoBot: Based on your Database Failover Runbook, your target lag is 
         &amp;lt; 10 seconds. Current lag of 30s indicates a problem. 

         Immediate steps:
         1. Check if replica query is slow: SHOW PROCESSLIST
         2. Look for long-running queries blocking replication
         3. Monitor replica_lag_ms in Prometheus for trends

         If lag doesn't improve in 5 minutes, escalate to consider failover
         per your documented RTO of 5 minutes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4: Build Your Library&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Repeat for each major area:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deployment procedures&lt;/li&gt;
&lt;li&gt;Incident response playbooks&lt;/li&gt;
&lt;li&gt;Network troubleshooting guides&lt;/li&gt;
&lt;li&gt;Capacity planning thresholds&lt;/li&gt;
&lt;li&gt;On-call escalation procedures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pro Tips for Best Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One topic per document&lt;/strong&gt;: Keep deployment, scaling, and incident response in separate documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use clear headers&lt;/strong&gt;: AutoBot chunks by sections—descriptive headers improve retrieval&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Include context&lt;/strong&gt;: Add scope like "This applies to production MySQL 5.7+"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update regularly&lt;/strong&gt;: AutoBot re-indexes when you update documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add decision logic&lt;/strong&gt;: For troubleshooting, explicit decision trees help RAG pick the right path&lt;/li&gt;
&lt;/ul&gt;
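&lt;p&gt;To see why headers matter, here is a rough sketch of section-based chunking in Python. It is an assumption about how such a chunker could work, not AutoBot's actual implementation: &lt;code&gt;chunk_by_headers&lt;/code&gt; is a hypothetical name, and the sketch treats every line starting with &lt;code&gt;#&lt;/code&gt; as a header, so it would mis-split Markdown that contains code blocks with comment lines.&lt;/p&gt;

```python
def chunk_by_headers(markdown: str) -> list[dict[str, str]]:
    """Split a Markdown document into one chunk per header section."""
    chunks = []
    title, lines = "(preamble)", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            # A new header closes the previous section, if it had any text.
            if any(text.strip() for text in lines):
                chunks.append({"title": title, "text": "\n".join(lines).strip()})
            title, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    if any(text.strip() for text in lines):
        chunks.append({"title": title, "text": "\n".join(lines).strip()})
    return chunks


if __name__ == "__main__":
    runbook = "\n".join([
        "# Database Failover Runbook",
        "",
        "## Quick Reference",
        "- RTO: 5 minutes",
        "",
        "## Detection",
        "1. Check replication lag: SHOW REPLICA STATUS",
    ])
    for chunk in chunk_by_headers(runbook):
        print(chunk["title"], "->", chunk["text"])
```

&lt;p&gt;Each chunk carries its header as a title, so a descriptive header such as "Detection" travels with the text it describes when the chunk is embedded and retrieved.&lt;/p&gt;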




&lt;h2&gt;
  
  
  Real Scenario: 3 AM Production Incident
&lt;/h2&gt;

&lt;p&gt;This happened to us last month. &lt;strong&gt;2:47 AM&lt;/strong&gt;: Database replication lag alert fires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without RAG:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dig through Google Drive for database runbook (3 minutes)&lt;/li&gt;
&lt;li&gt;Find conflicting procedures in Confluence (2 minutes, confused)&lt;/li&gt;
&lt;li&gt;Call groggy database lead (5 minutes)&lt;/li&gt;
&lt;li&gt;Execute with little confidence: 15 minutes elapsed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;With AutoBot RAG:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On-call: "AutoBot, show me our database failover procedure"&lt;/li&gt;
&lt;li&gt;AutoBot returns exact current runbook instantly&lt;/li&gt;
&lt;li&gt;Execute with confidence: 5 minutes total&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 10-minute difference can be the gap between a contained incident and spreading data corruption. That is what RAG delivers: when you're stressed and the clock is ticking, your team's collective wisdom is one question away.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance &amp;amp; Best Practices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Common Questions We Hear:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How many documents can AutoBot handle?&lt;/em&gt;&lt;br&gt;
Thousands. We've tested with 10,000+ documents. Response time stays under 5 seconds even at scale.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What about response latency?&lt;/em&gt;&lt;br&gt;
Query vectorization + retrieval + generation = &amp;lt; 5 seconds typically. Most of that is LLM generation time, not RAG overhead.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How do I keep knowledge accurate?&lt;/em&gt;&lt;br&gt;
Update your source documents—AutoBot automatically re-indexes when you upload new versions. Treat your knowledge base like code: versioned, reviewed, maintained.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What formats are supported?&lt;/em&gt;&lt;br&gt;
Markdown, plain text, and PDF. We recommend Markdown for best semantic chunking.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;One more pro tip:&lt;/em&gt;&lt;br&gt;
Organize by functional area. Don't dump everything into one mega-document. Keep "Deployment," "Scaling," and "Incident Response" in separate documents. Better documents = better retrieval = better answers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;You've now seen how AutoBot turns your scattered knowledge into instant, intelligent answers. But infrastructure management is more than just knowledge—it's about &lt;em&gt;orchestration at scale&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/mrveiss/post-3-fleet-management"&gt;Part 3: Fleet Management with Ansible&lt;/a&gt;, we'll show you how AutoBot coordinates across your entire infrastructure—deploying to thousands of servers, managing configuration drift, and orchestrating complex multi-step deployments.&lt;/p&gt;

&lt;p&gt;Ready to scale? Let's go.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started with AutoBot
&lt;/h2&gt;

&lt;p&gt;AutoBot is free, open source, and ready to run on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📦 GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;mrveiss/AutoBot-AI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI" rel="noopener noreferrer"&gt;Source Code &amp;amp; Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI#readme" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/issues" rel="noopener noreferrer"&gt;Issues &amp;amp; Feature Requests&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mrveiss/AutoBot-AI/discussions" rel="noopener noreferrer"&gt;Discussions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deploy it today with: &lt;code&gt;docker compose up -d&lt;/code&gt;&lt;/p&gt;

</description>
      <category>autobot</category>
      <category>rag</category>
      <category>ai</category>
      <category>knowledgebase</category>
    </item>
  </channel>
</rss>
