Target Keyword: "docker llm application deployment"
Tags: docker,devops,ai,programming,developer
Type: Tutorial
Content
Docker for AI Development: Containerizing LLM Applications
Docker simplifies AI application deployment by providing consistent environments from development to production. Here's how to containerize your AI applications powered by Claude and ofox.ai.
Why Docker for AI Apps?
- Reproducible environments — Same behavior locally and in production
- Dependency isolation — Python packages, system libraries, CUDA versions
- Easy deployment — Ship to any cloud with Docker
- Resource control — Limit CPU/memory per container (see the example after this list)
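For example, resource limits are just flags on docker run (the image name and values below are illustrative):
# Cap a container at 2 CPUs and 4 GB of RAM (illustrative values)
docker run -d --cpus="2.0" --memory="4g" my-ai-app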
Basic Dockerfile for AI App
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements first (for caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Set environment variables
ENV PYTHONUNBUFFERED=1
# Expose the port and start the API server
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
# requirements.txt
fastapi==0.109.0
uvicorn==0.27.0
httpx==0.26.0
python-dotenv==1.0.0
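Because requirements.txt is copied before the rest of the source, the pip install layer stays cached until your dependencies change. A .dockerignore file (a suggested addition; adjust to your repo) keeps COPY . . from dragging caches and secrets into the image:
# .dockerignore (suggested contents)
__pycache__/
*.pyc
.env
.git/
venv/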
Docker Compose for AI Services
# docker-compose.yml
version: '3.8'
services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OFOX_API_KEY=${OFOX_API_KEY}
      - MODEL=claude-3-5-sonnet-20241022
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Optional: Add a Redis cache
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data

volumes:
  redis-data:
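Compose resolves ${OFOX_API_KEY} from your shell environment or from a .env file next to docker-compose.yml. A minimal sketch with a placeholder value; keep this file out of version control:
# .env (placeholder value, do not commit)
OFOX_API_KEY=sk-your-key-here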
Production-Ready FastAPI + ofox.ai
# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import httpx
import os
app = FastAPI(title="Claude API Service", version="1.0.0")
class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: Optional[int] = 1024
    temperature: Optional[float] = 0.7

@app.get("/health")
async def health():
    return {"status": "healthy"}

@app.post("/chat")
async def chat(request: ChatRequest):
    async with httpx.AsyncClient(timeout=120.0) as client:
        try:
            response = await client.post(
                "https://api.ofox.ai/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {os.environ['OFOX_API_KEY']}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": request.model,
                    "messages": [m.model_dump() for m in request.messages],
                    "max_tokens": request.max_tokens,
                    "temperature": request.temperature
                }
            )
            response.raise_for_status()
            data = response.json()
            return {
                "content": data["choices"][0]["message"]["content"],
                "model": data["model"],
                "tokens": data["usage"]["total_tokens"]
            }
        except httpx.HTTPStatusError as e:
            raise HTTPException(status_code=e.response.status_code, detail=str(e))
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
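With the service running, you can exercise the endpoint with curl (the payload below is illustrative):
# Example request to the /chat endpoint
curl -s http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'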
GPU Support for Local Models
# Dockerfile with GPU support
FROM nvidia/cuda:12.1.0-base-ubuntu22.04
RUN apt-get update && apt-get install -y \
    python3 python3-pip curl \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
# For running local models like Ollama (the install script needs curl, installed above)
RUN curl -fsSL https://ollama.ai/install.sh | sh
COPY . .
CMD ["python3", "main.py"]
# docker-compose.yml with GPU
services:
  api:
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
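GPU passthrough assumes the NVIDIA Container Toolkit is installed on the host. A quick sanity check:
# Verify that containers can see the GPU
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi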
Multi-Stage Build (Smaller Images)
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt
# Production stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
Environment-Based Configuration
# config.py
import os
from dataclasses import dataclass
@dataclass
class Config:
    api_key: str
    model: str
    max_tokens: int
    temperature: float

def get_config() -> Config:
    return Config(
        api_key=os.environ["OFOX_API_KEY"],
        model=os.environ.get("MODEL", "claude-3-5-sonnet-20241022"),
        max_tokens=int(os.environ.get("MAX_TOKENS", "1024")),
        temperature=float(os.environ.get("TEMPERATURE", "0.7"))
    )
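In main.py you would then build the config once at startup rather than reading os.environ throughout; a minimal sketch assuming the config.py above:
# main.py (sketch)
from config import get_config

config = get_config()  # fails fast at startup if OFOX_API_KEY is missing
print(f"Using model: {config.model}")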
Building and Running
# Build
docker build -t claude-api-service .
# Run
docker run -d -p 8000:8000 \
-e OFOX_API_KEY=your-key-here \
--name claude-api \
claude-api-service
# With Docker Compose
docker-compose up -d
# View logs
docker logs -f claude-api
# Shell into container
docker exec -it claude-api /bin/bash
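After the container starts, a quick smoke test against the health endpoint:
# Should print {"status":"healthy"}
curl http://localhost:8000/health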
CI/CD with GitHub Actions
name: Build and Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t claude-api:${{ github.sha }} .
      - name: Run tests
        # Assumes pytest is installed in the image (add it to requirements.txt or a dev requirements file)
        run: |
          docker run claude-api:${{ github.sha }} pytest
      - name: Push to registry
        run: |
          docker tag claude-api:${{ github.sha }} registry/app/claude-api:latest
          docker push registry/app/claude-api:latest
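The push step assumes you are already authenticated to your registry. A sketch of a login step using repository secrets (the secret names and registry host are placeholders):
- name: Log in to registry
  run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login registry -u "${{ secrets.REGISTRY_USER }}" --password-stdin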
Deploy Anywhere
With Docker, your AI application deploys to:
- AWS ECS — Managed container service
- Google Cloud Run — Serverless containers (see the sketch after this list)
- Azure Container Instances — Simple deployment
- DigitalOcean App Platform — Simple PaaS
- Your own server — With docker-compose
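As one example, once the image is pushed to a registry, deploying to Cloud Run is a single command (service name, image path, and region are placeholders):
# Deploy to Cloud Run (placeholder names)
gcloud run deploy claude-api \
  --image gcr.io/your-project/claude-api:latest \
  --port 8000 \
  --region us-central1 \
  --set-env-vars OFOX_API_KEY=your-key-here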
Getting Started
Containerize your AI applications and deploy with confidence. Power them with ofox.ai — reliable Claude API with competitive pricing and 99.9% uptime.
This article contains affiliate links.