DEV Community

ZNY

DEV.TO ARTICLE 40: Docker for AI Development: Containerizing LLM Applications

Target Keyword: "docker llm application deployment"
Tags: docker,devops,ai,programming,developer
Type: Tutorial


Docker for AI Development: Containerizing LLM Applications

Docker simplifies AI application deployment by providing consistent environments from development to production. Here's how to containerize your AI applications powered by Claude and ofox.ai.

Why Docker for AI Apps?

  1. Reproducible environments — Same behavior locally and in production
  2. Dependency isolation — Python packages, system libraries, CUDA versions
  3. Easy deployment — Ship to any cloud with Docker
  4. Resource control — Limit CPU/memory per container
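
Point 4 in particular maps directly onto Compose's `deploy.resources` section; a minimal sketch (the `api` service name and the limit values are illustrative):

```yaml
services:
  api:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1g
```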

Basic Dockerfile for AI App

# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (for caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Set environment variables
ENV PYTHONUNBUFFERED=1

# Run the application
CMD ["python", "main.py"]
# requirements.txt
fastapi==0.109.0
uvicorn==0.27.0
httpx==0.26.0
python-dotenv==1.0.0

Docker Compose for AI Services

# docker-compose.yml

services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OFOX_API_KEY=${OFOX_API_KEY}
      - MODEL=claude-3-5-sonnet-20241022
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Optional: Add a Redis cache
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data

volumes:
  redis-data:
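The Redis service above is typically used to cache responses to identical chat requests. A small sketch of a deterministic cache-key helper (the function name is illustrative, not part of the stack above):

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    # Serialize deterministically (sorted keys) so identical requests
    # always hash to the same Redis key
    blob = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "chat:" + hashlib.sha256(blob.encode("utf-8")).hexdigest()

key = cache_key("claude-3-5-sonnet-20241022",
                [{"role": "user", "content": "Hello"}])
```

The key can then be used with `SET`/`GET` against the Redis container, ideally with a TTL so cached completions don't go stale.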

Production-Ready FastAPI + ofox.ai

# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import httpx
import os

app = FastAPI(title="Claude API Service", version="1.0.0")

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: Optional[int] = 1024
    temperature: Optional[float] = 0.7

@app.get("/health")
async def health():
    return {"status": "healthy"}

@app.post("/chat")
async def chat(request: ChatRequest):
    async with httpx.AsyncClient(timeout=120.0) as client:
        try:
            response = await client.post(
                "https://api.ofox.ai/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {os.environ['OFOX_API_KEY']}",
                    "Content-Type": "application/json"
                },
                json={
                    "model": request.model,
                    "messages": [m.model_dump() for m in request.messages],
                    "max_tokens": request.max_tokens,
                    "temperature": request.temperature
                }
            )
            response.raise_for_status()
            data = response.json()
            return {
                "content": data["choices"][0]["message"]["content"],
                "model": data["model"],
                "tokens": data["usage"]["total_tokens"]
            }
        except httpx.HTTPStatusError as e:
            raise HTTPException(status_code=e.response.status_code, detail=str(e))
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
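To call the service from another process, the request body just needs to match the `ChatRequest` schema. A small stdlib-only sketch (the actual POST, shown commented out, would use `httpx` as in `main.py`; the helper name is illustrative):

```python
import json

def build_chat_payload(user_text: str,
                       model: str = "claude-3-5-sonnet-20241022",
                       max_tokens: int = 256) -> dict:
    """Build a request body matching the ChatRequest schema above."""
    return {
        "messages": [{"role": "user", "content": user_text}],
        "model": model,
        "max_tokens": max_tokens,
    }

payload = build_chat_payload("Summarize Docker in one sentence.")
body = json.dumps(payload)  # what the client would send as the JSON body

# e.g. httpx.post("http://localhost:8000/chat", json=payload)
```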

GPU Support for Local Models

# Dockerfile with GPU support
FROM nvidia/cuda:12.1.0-base-ubuntu22.04

# (python3 is 3.10 on Ubuntu 22.04; curl is needed for the Ollama installer below)
RUN apt-get update && apt-get install -y \
    python3 python3-pip curl \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# For running local models like Ollama
RUN curl -fsSL https://ollama.ai/install.sh | sh

COPY . .
CMD ["python3", "main.py"]
# docker-compose.yml with GPU (requires the NVIDIA Container Toolkit on the host)
services:
  api:
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Multi-Stage Build (Smaller Images)

# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# Production stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]

Environment-Based Configuration

# config.py
import os
from dataclasses import dataclass

@dataclass
class Config:
    api_key: str
    model: str
    max_tokens: int
    temperature: float

def get_config() -> Config:
    return Config(
        api_key=os.environ["OFOX_API_KEY"],
        model=os.environ.get("MODEL", "claude-3-5-sonnet-20241022"),
        max_tokens=int(os.environ.get("MAX_TOKENS", "1024")),
        temperature=float(os.environ.get("TEMPERATURE", "0.7"))
    )
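A quick standalone check of how that parsing behaves with Docker-injected variables (the dataclass is repeated here so the snippet runs on its own; the values are illustrative):

```python
import os
from dataclasses import dataclass

@dataclass
class Config:
    api_key: str
    model: str
    max_tokens: int
    temperature: float

def get_config() -> Config:
    return Config(
        api_key=os.environ["OFOX_API_KEY"],
        model=os.environ.get("MODEL", "claude-3-5-sonnet-20241022"),
        max_tokens=int(os.environ.get("MAX_TOKENS", "1024")),
        temperature=float(os.environ.get("TEMPERATURE", "0.7")),
    )

# In a container these would come from `-e` flags or Compose's `environment:` block
os.environ["OFOX_API_KEY"] = "test-key"
os.environ["MAX_TOKENS"] = "2048"
os.environ.pop("MODEL", None)        # ensure the defaults are exercised
os.environ.pop("TEMPERATURE", None)

cfg = get_config()
```

Note that everything in the environment is a string, so numeric settings are coerced with `int()`/`float()` before use.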

Building and Running

# Build
docker build -t claude-api-service .

# Run
docker run -d -p 8000:8000 \
  -e OFOX_API_KEY=your-key-here \
  --name claude-api \
  claude-api-service

# With Docker Compose
docker-compose up -d

# View logs
docker logs -f claude-api

# Shell into container
docker exec -it claude-api /bin/bash

CI/CD with GitHub Actions

name: Build and Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t claude-api:${{ github.sha }} .

      - name: Run tests
        run: |
          # assumes pytest is available in the image (add it to requirements.txt or a dev stage)
          docker run claude-api:${{ github.sha }} pytest

      - name: Push to registry
        run: |
          docker tag claude-api:${{ github.sha }} registry/app/claude-api:latest
          docker push registry/app/claude-api:latest

Deploy Anywhere

With Docker, your AI application deploys to:

  • AWS ECS — Managed container service
  • Google Cloud Run — Serverless containers
  • Azure Container Instances — Simple deployment
  • DigitalOcean App Platform — Simple PaaS
  • Your own server — With docker-compose

Getting Started

Containerize your AI applications and deploy with confidence. Power them with ofox.ai — reliable Claude API with competitive pricing and 99.9% uptime.

👉 Get started with ofox.ai


This article contains affiliate links.


Canonical URL: https://dev.to/zny10289
