shailendra khade

Building a Containerized GenAI Chatbot with Docker, Ollama, FastAPI & ChromaDB

Introduction

Modern AI systems are not just Python scripts — they are distributed systems involving:

  • LLM engines
  • APIs
  • UI applications
  • Vector databases
  • Container orchestration

In this post, I share how I built a GenAI chatbot from Docker-based microservices, structured the way real-world AI platforms are, along with the real DevOps issues I faced while building it.


System Architecture (AI Architect View)

User (Browser)
      |
      v
[ Streamlit UI ]
      |
      v
[ FastAPI Backend ]
      |
      +----> [ Ollama LLM Engine ]
      |
      +----> [ ChromaDB Vector Database ]


Why Microservices?

Separating AI components into services improves scalability, maintainability, and fault isolation: if the LLM engine crashes or needs more memory, it can be restarted or scaled on its own without taking down the UI or the API.


Project Structure
genai-docker-project/
├── backend/             # FastAPI + AI logic
├── ui/                  # Streamlit UI
├── docker-compose.yml
└── README.md
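
Compose (below) builds backend/ and ui/ from Dockerfiles that are not shown in this post. As a reference point, a minimal backend/Dockerfile could look like this sketch, which assumes the FastAPI app lives in main.py and that requirements.txt lists fastapi, uvicorn, and requests:

# Illustrative backend image, not the exact one used in the project
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Assumes the FastAPI app object is named `app` in main.py
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]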


Docker Compose (Core of the System)

version: "3.8"

services:

  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"

  backend:
    build: ./backend
    ports:
      - "8000:8000"
    depends_on:
      - ollama
      - chroma

  chroma:
    image: chromadb/chroma
    ports:
      - "8001:8000"

  ui:
    build: ./ui
    ports:
      - "8501:8501"
    depends_on:
      - backend


Backend (FastAPI + Ollama Integration)

from fastapi import FastAPI
import requests

app = FastAPI()

OLLAMA_URL = "http://ollama:11434/api/generate"

@app.post("/ask")
def ask_ai(question: str):
    # stream=False makes Ollama return a single JSON object instead of
    # newline-delimited streaming chunks, which response.json() cannot parse
    payload = {"model": "mistral", "prompt": question, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()

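Two details worth noting: Ollama's /api/generate endpoint streams by default, so the payload sets "stream": False to get one JSON object back, and because question is a plain str parameter, FastAPI reads it from the query string, which is why the UI below sends it with params= rather than json=.

The architecture diagram includes ChromaDB, but the minimal backend above never touches it. Here is a sketch of how retrieval could be wired in, assuming the chromadb client library is installed in the backend image and a collection named "docs" (illustrative name) has already been populated with document chunks:

import chromadb

# Connect to the chroma service over the Compose network
# (port 8000 is the container-internal port, not the 8001 host mapping)
chroma_client = chromadb.HttpClient(host="chroma", port=8000)
collection = chroma_client.get_or_create_collection("docs")

def build_prompt(question: str) -> str:
    # Fetch the 3 stored chunks most similar to the question
    results = collection.query(query_texts=[question], n_results=3)
    docs = results["documents"][0] if results["documents"] else []
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {question}"

The /ask handler would then pass build_prompt(question) to Ollama as the prompt instead of the raw question.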

UI (Streamlit)

import streamlit as st
import requests

st.title("GenAI Chatbot")

question = st.text_input("Ask a question:")

if st.button("Ask AI"):
    res = requests.post(
        "http://backend:8000/ask",
        params={"question": question},
        timeout=120,
    )
    data = res.json()
    # Ollama puts the generated text under the "response" key
    st.write(data.get("response", data))

Real Errors & DevOps Solutions

Error 1: Docker Permission Denied

permission denied while trying to connect to the Docker daemon socket

Root Cause
The user was not part of the docker group.

Solution
sudo usermod -aG docker $USER
newgrp docker
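
After logging out and back in (or running newgrp docker as above), access can be verified without sudo:

docker ps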

Error 2: Port Already in Use (11434)

failed to bind host port 0.0.0.0:11434: address already in use

Root Cause
Ollama was already running on the host via Snap:

ps -ef | grep ollama

Output:
/snap/ollama/.../ollama serve

Solution
sudo snap stop ollama
sudo snap disable ollama

(or change the host port in docker-compose.yml)

ports:
  - "21434:11434"

Error 3: Model Not Found in Ollama

"model 'mistral' not found"

Root Cause
The Ollama runtime was running, but the model had not been downloaded inside the container.

Solution
docker exec -it genai-docker-project-ollama-1 bash
ollama pull mistral
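
Models pulled this way live inside the container's filesystem and vanish when the container is recreated. A named volume in docker-compose.yml keeps them across restarts; the official image stores models under /root/.ollama:

  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_models:/root/.ollama

volumes:
  ollama_models: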

Error 4: Container Networking Issues

Problem
Backend could not connect to Ollama.

Root Cause
The backend used localhost instead of the container's DNS name. Inside a Compose network, localhost refers to the container itself; other services are reachable by their service name.

Fix
OLLAMA_URL = "http://ollama:11434/api/generate"
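
To keep the same code runnable both inside and outside Docker, the URL can be read from the environment instead of being hard-coded (a small refinement; the OLLAMA_URL variable would then be set on the backend service in docker-compose.yml):

import os

# Falls back to the Compose service name when the variable is unset
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://ollama:11434/api/generate")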


Key Learnings (AI + DevOps)

  • AI systems are distributed systems.
  • Docker is essential for reproducible ML environments.
  • LLM platforms require careful networking and resource management.
  • MLOps is the bridge between DevOps and AI.
