FastAPI is one of the most popular Python frameworks for building AI-powered APIs. Combined with Claude via ofox.ai, you can create production-ready AI endpoints in minutes. Here's the complete guide.
## Why FastAPI for AI APIs?
- **Async native** — Handle concurrent AI requests efficiently
- **Automatic validation** — Pydantic models validate inputs/outputs
- **OpenAPI docs** — Built-in interactive API documentation
- **Type hints** — Full IDE support and error checking
- **Production-ready** — Used by major companies worldwide
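The validation point is worth seeing concretely: the same Pydantic models used throughout this guide reject malformed input before your endpoint code ever runs. A minimal, standalone sketch:

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]
    max_tokens: int = 1024

# Well-formed input parses into typed objects, defaults filled in
req = ChatRequest(messages=[{"role": "user", "content": "Hi"}])

# Malformed input raises ValidationError -- FastAPI turns this into a 422
try:
    ChatRequest(messages=[{"role": "user"}])  # "content" is missing
    rejected = False
except ValidationError:
    rejected = True
```

When these models annotate a FastAPI route, this check happens automatically and the client gets a structured 422 response.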
## Project Setup
```bash
mkdir claude-api-service
cd claude-api-service
python3 -m venv venv
source venv/bin/activate
pip install fastapi uvicorn httpx pydantic
```
## Basic FastAPI + Claude Setup
```python
# main.py
import os
from typing import List, Optional

import httpx
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Claude API Service")

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    model: str = "claude-3-5-sonnet-20241022"
    messages: List[Message]
    max_tokens: Optional[int] = 1024
    temperature: Optional[float] = 0.7

class ChatResponse(BaseModel):
    content: str
    model: str
    tokens_used: int

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.ofox.ai/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {os.environ['OFOX_API_KEY']}",
                "Content-Type": "application/json",
            },
            json={
                "model": request.model,
                "messages": [m.model_dump() for m in request.messages],
                "max_tokens": request.max_tokens,
                "temperature": request.temperature,
            },
            timeout=60.0,
        )
    if response.status_code != 200:
        raise HTTPException(status_code=response.status_code, detail=response.text)
    data = response.json()
    return ChatResponse(
        content=data["choices"][0]["message"]["content"],
        model=data["model"],
        tokens_used=data["usage"]["total_tokens"],
    )
```
## Running the Server
```bash
export OFOX_API_KEY="your-key-here"
uvicorn main:app --reload --port 8000
```
Interactive API docs are available at `http://localhost:8000/docs`.
## Adding Streaming Responses
```python
from fastapi.responses import StreamingResponse

@app.post("/chat/stream")
async def chat_stream(request: ChatRequest):
    async def generate():
        async with httpx.AsyncClient() as client:
            async with client.stream(
                "POST",
                "https://api.ofox.ai/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {os.environ['OFOX_API_KEY']}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": request.model,
                    "messages": [m.model_dump() for m in request.messages],
                    "max_tokens": request.max_tokens,
                    "stream": True,
                },
                timeout=60.0,
            ) as response:
                async for chunk in response.aiter_lines():
                    if chunk.startswith("data: "):
                        data = chunk[6:]
                        if data == "[DONE]":
                            break
                        # Re-emit each chunk as a server-sent event
                        yield f"data: {data}\n\n"
    return StreamingResponse(generate(), media_type="text/event-stream")
```
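On the client side, each event arrives as a `data: {...}` line. A standalone sketch of the parsing logic, assuming OpenAI-compatible delta chunks (the exact payload shape depends on the upstream API):

```python
import json

def parse_sse_events(lines):
    """Extract the JSON payload from each 'data: ...' SSE line, stopping at [DONE]."""
    events = []
    for line in lines:
        if line.startswith("data: "):
            payload = line[len("data: "):]
            if payload == "[DONE]":
                break
            events.append(json.loads(payload))
    return events

# Example chunks in the shape the /chat/stream endpoint would forward
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
text = "".join(e["choices"][0]["delta"]["content"] for e in parse_sse_events(raw))
```

In a real client you would feed `parse_sse_events` the lines from `httpx`'s `aiter_lines()` instead of a hard-coded list.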
## Adding Authentication
```python
from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

security = HTTPBearer()

@app.post("/chat", response_model=ChatResponse)
async def chat(
    request: ChatRequest,
    credentials: HTTPAuthorizationCredentials = Depends(security),
):
    # verify_api_key is your own key check (database lookup, cache, etc.)
    if not verify_api_key(credentials.credentials):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid API key",
        )
    # ... rest of chat logic
```
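The article leaves `verify_api_key` to you. One minimal sketch, assuming a small in-memory store of hashed keys (a real service would back this with a database or Redis):

```python
import hashlib

# Hypothetical key store: keys are kept hashed so a leaked
# table doesn't leak usable credentials.
VALID_KEY_HASHES = {
    hashlib.sha256(b"demo-key-123").hexdigest(),
}

def verify_api_key(key: str) -> bool:
    """Return True if the presented key hashes to a known value."""
    return hashlib.sha256(key.encode()).hexdigest() in VALID_KEY_HASHES
```

The key name `demo-key-123` is only for illustration; issue and store your own keys however fits your deployment.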
## Production Deployment
For production, use:

- **Gunicorn + Uvicorn workers** for concurrency
- **Redis** for request caching
- **Rate limiting** with middleware
- **Docker** for containerization
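For rate limiting, the core idea can be sketched as a fixed-window counter keyed by client. This in-memory version is only a sketch (it resets on restart and doesn't share state across workers; production setups typically keep the counters in Redis):

```python
import time

class RateLimiter:
    """Naive fixed-window limiter: at most `limit` requests per `window` seconds per key."""

    def __init__(self, limit: int = 10, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.hits = {}  # key -> (window_start, count)

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        start, count = self.hits.get(key, (now, 0))
        if now - start >= self.window:
            # Window expired: start a fresh one
            start, count = now, 0
        if count >= self.limit:
            return False
        self.hits[key] = (start, count + 1)
        return True

limiter = RateLimiter(limit=3, window=60.0)
results = [limiter.allow("client-1") for _ in range(5)]  # 4th and 5th calls are refused
```

Wired into FastAPI, `allow()` would be called from a middleware or dependency with the caller's API key, returning a 429 when it refuses.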
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
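The Dockerfile expects a `requirements.txt`. Matching the packages installed in the setup step, it could be as simple as (pin versions as appropriate for your project):

```text
fastapi
uvicorn
httpx
pydantic
```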
## Complete Example: Code Explanation API
````python
class ExplainRequest(BaseModel):
    code: str
    language: str = "python"

@app.post("/explain", response_model=ChatResponse)
async def explain_code(request: ExplainRequest):
    prompt = f"""Explain this {request.language} code in simple terms:

```{request.language}
{request.code}
```

Provide a clear, concise explanation."""
    chat_request = ChatRequest(
        messages=[Message(role="user", content=prompt)]
    )
    return await chat(chat_request)
````
## Getting Started
Build your AI API with FastAPI and Claude. ofox.ai provides the OpenAI-compatible Claude endpoint — sign up, get your API key, and deploy your first AI API in under 10 minutes.
👉 Get started with ofox.ai
This article contains affiliate links.
Tags: python,fastapi,claude-api,api,programming,developer
Canonical URL: https://dev.to/zny10289