Docker simplifies AI application deployment by providing consistent environments from development to production. Here's how to containerize your AI applications powered by Claude and ofox.ai.
## Why Docker for AI Apps?
- Reproducible environments — Same behavior locally and in production
- Dependency isolation — Python packages, system libraries, CUDA versions
- Easy deployment — Ship to any cloud with Docker
- Resource control — Limit CPU/memory per container (see the example after this list)
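
Resource limits, for instance, can be set directly on `docker run`. A minimal sketch — the flag values are illustrative, and `claude-api-service` is the image built later in this post:

```bash
# Cap the container at 2 CPUs and 1 GiB of RAM (illustrative values)
docker run -d --cpus="2.0" --memory="1g" \
  --name claude-api claude-api-service
```
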
## Basic Dockerfile for an AI App
```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (for layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Set environment variables
ENV PYTHONUNBUFFERED=1

# Run the application
CMD ["python", "main.py"]
```
```text
# requirements.txt
fastapi==0.109.0
uvicorn==0.27.0
httpx==0.26.0
python-dotenv==1.0.0
```
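
Because the Dockerfile ends with `COPY . .`, it's worth adding a `.dockerignore` so local artifacts and secrets stay out of the image. A minimal example (the entries are typical suggestions, not requirements):

```text
# .dockerignore
.env
.git
__pycache__/
*.pyc
venv/
```
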
## Docker Compose for AI Services
```yaml
# docker-compose.yml
version: '3.8'

services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OFOX_API_KEY=${OFOX_API_KEY}
      - MODEL=claude-3-5-sonnet-20241022
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Optional: add a Redis cache
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data

volumes:
  redis-data:
```
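
Docker Compose automatically reads a `.env` file in the project directory to resolve `${OFOX_API_KEY}`. A minimal sketch (the key value is a placeholder):

```bash
# .env (keep out of version control)
OFOX_API_KEY=your-key-here
```
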
## Production-Ready FastAPI + ofox.ai
```python
# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import httpx
import os

app = FastAPI(title="Claude API Service", version="1.0.0")


class Message(BaseModel):
    role: str
    content: str


class ChatRequest(BaseModel):
    messages: List[Message]
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: Optional[int] = 1024
    temperature: Optional[float] = 0.7


@app.get("/health")
async def health():
    return {"status": "healthy"}


@app.post("/chat")
async def chat(request: ChatRequest):
    async with httpx.AsyncClient(timeout=120.0) as client:
        try:
            response = await client.post(
                "https://api.ofox.ai/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {os.environ['OFOX_API_KEY']}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": request.model,
                    "messages": [m.model_dump() for m in request.messages],
                    "max_tokens": request.max_tokens,
                    "temperature": request.temperature,
                },
            )
            response.raise_for_status()
            data = response.json()
            return {
                "content": data["choices"][0]["message"]["content"],
                "model": data["model"],
                "tokens": data["usage"]["total_tokens"],
            }
        except httpx.HTTPStatusError as e:
            raise HTTPException(status_code=e.response.status_code, detail=str(e))
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    # Started by the Dockerfile's CMD ["python", "main.py"]
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
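
Once the container is running, the endpoint can be exercised with a quick request; `max_tokens` and `temperature` fall back to the defaults defined in `ChatRequest`:

```bash
# Send a test request to the containerized service
curl -s http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello, Claude!"}]}'
```
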
## GPU Support for Local Models
```dockerfile
# Dockerfile with GPU support
FROM nvidia/cuda:12.1.0-base-ubuntu22.04

# curl is needed below for the Ollama install script
RUN apt-get update && apt-get install -y \
    curl python3.11 python3.11-venv python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# For running local models with Ollama
RUN curl -fsSL https://ollama.ai/install.sh | sh

COPY . .
CMD ["python3", "main.py"]
```
```yaml
# docker-compose.yml with GPU
services:
  api:
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
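
This requires the NVIDIA Container Toolkit on the host. A quick way to confirm the GPU is actually visible inside the container:

```bash
# Should list the host GPU from inside the api container
docker-compose exec api nvidia-smi
```
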
## Multi-Stage Build (Smaller Images)
```dockerfile
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# Production stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
```
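
To see the effect, build both variants and compare the reported sizes. A sketch, assuming the multi-stage version is saved as `Dockerfile.multistage` (a hypothetical filename) and the tag names are arbitrary:

```bash
# Compare the single-stage and multi-stage image sizes
docker build -f Dockerfile -t claude-api:single .
docker build -f Dockerfile.multistage -t claude-api:multi .
docker images claude-api
```
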
## Environment-Based Configuration
```python
# config.py
import os
from dataclasses import dataclass


@dataclass
class Config:
    api_key: str
    model: str
    max_tokens: int
    temperature: float


def get_config() -> Config:
    return Config(
        api_key=os.environ["OFOX_API_KEY"],
        model=os.environ.get("MODEL", "claude-3-5-sonnet-20241022"),
        max_tokens=int(os.environ.get("MAX_TOKENS", "1024")),
        temperature=float(os.environ.get("TEMPERATURE", "0.7")),
    )
```
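
A sketch of how `main.py` might consume this instead of reading `os.environ` inline (hypothetical wiring, not from the service code above):

```python
# Example: load settings once at startup
from config import get_config

config = get_config()

# config.api_key, config.model, config.max_tokens, and config.temperature
# would then be passed to the ofox.ai request in place of raw env lookups
print(f"Using {config.model} with max_tokens={config.max_tokens}")
```
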
## Building and Running
```bash
# Build
docker build -t claude-api-service .

# Run
docker run -d -p 8000:8000 \
  -e OFOX_API_KEY=your-key-here \
  --name claude-api \
  claude-api-service

# With Docker Compose
docker-compose up -d

# View logs
docker logs -f claude-api

# Shell into the container
docker exec -it claude-api /bin/bash
```
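
A quick smoke test once the container is up:

```bash
# Hit the health endpoint defined in main.py
curl http://localhost:8000/health
# Expected: {"status":"healthy"}
```
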
## CI/CD with GitHub Actions
```yaml
name: Build and Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t claude-api:${{ github.sha }} .

      - name: Run tests
        run: |
          docker run claude-api:${{ github.sha }} pytest

      - name: Push to registry
        run: |
          docker tag claude-api:${{ github.sha }} registry/app/claude-api:latest
          docker push registry/app/claude-api:latest
```
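
Note that the test step assumes `pytest` is installed in the image, and `registry/app` is a placeholder. In practice you'd authenticate before pushing, for example with `docker/login-action` (the secret names here are illustrative):

```yaml
      - name: Log in to registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.REGISTRY_USERNAME }}
          password: ${{ secrets.REGISTRY_PASSWORD }}
```
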
## Deploy Anywhere
With Docker, your AI application deploys to:
- AWS ECS — Managed container service
- Google Cloud Run — Serverless containers
- Azure Container Instances — Simple single-container deployment
- DigitalOcean App Platform — Simple PaaS
- Your own server — with docker-compose
## Getting Started
Containerize your AI applications and deploy with confidence. Power them with ofox.ai — a reliable Claude API with competitive pricing and 99.9% uptime.
👉 Get started with ofox.ai
This article contains affiliate links.
Tags: docker, devops, ai, programming, developer
Canonical URL: https://dev.to/zny10289