Originally published at claudeguide.io/claude-agent-production-deploy
Deploying Claude Agents to Production: Fly.io, Vercel, and Lambda
Claude agent deployments fail for three common reasons: timeouts (a single LLM call can take 5-30 seconds and an agent usually chains several, while many serverless platforms cut requests off at around 30 seconds by default), cold starts (agents with heavy initialization start too slowly for serverless), and missing environment variables in production. Matching the deployment target to your agent type avoids the first two, and configuring secrets on that target avoids the third. This guide covers the three main deployment patterns with complete configuration.
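The third failure mode, missing environment variables, is cheap to catch at process start instead of on the first request. A minimal sketch, assuming the agent needs `ANTHROPIC_API_KEY` (the variable used later in this guide); the helper name `require_env` is hypothetical and not part of any SDK:

```python
import os


def require_env(names, environ=None):
    """Return the requested variables, or raise one error naming every missing one.

    `require_env` is a hypothetical helper for illustration. `environ` defaults
    to os.environ and can be replaced with a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    # Treat empty strings as missing: an empty API key is as useless as no key.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Calling `require_env(["ANTHROPIC_API_KEY"])` once before the server starts turns a confusing runtime failure into a clear error at deploy time.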
Deployment Target Decision Matrix
| Agent type | Recommended platform | Why |
|---|---|---|
| Long-running (5+ minutes) | Fly.io | No timeout limits, persistent processes |
| API endpoint (< 30s response) | Vercel | Zero-config, automatic scaling |
| Event-driven (webhooks, queues) | AWS Lambda | Pay-per-invocation, natural event model |
| Streaming responses | Vercel Edge | Low latency, streaming SSE support |
| High-volume, cost-sensitive | Fly.io + Redis queue | Full control, no per-invocation billing |
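The matrix can also be read as a lookup, checked from the most specific row to the least. A rough sketch that encodes the rows above; the function name `pick_platform`, its parameters, and the order in which rows are checked are all assumptions made for illustration, and the 30-second and 5-minute thresholds come from the table rather than from any platform's documentation:

```python
def pick_platform(expected_seconds, event_driven=False, streaming=False,
                  high_volume=False):
    """Map an agent's traits to a row of the decision matrix (hypothetical helper)."""
    if high_volume:
        return "Fly.io + Redis queue"
    if expected_seconds >= 300:  # "Long-running (5+ minutes)"
        return "Fly.io"
    if event_driven:
        return "AWS Lambda"
    if streaming:
        return "Vercel Edge"
    if expected_seconds < 30:  # "API endpoint (< 30s response)"
        return "Vercel"
    # The table has no row for 30 s to 5 min. Falling back to Fly.io is an
    # assumption, based on the table listing it as having no timeout limits.
    return "Fly.io"
```

For example, `pick_platform(10)` returns `"Vercel"` and `pick_platform(600)` returns `"Fly.io"`.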
Fly.io: Long-Running Agents
Best for: agents that run for minutes, background processing, agents that need to hold state in memory.
Project structure
my-agent/
├── Dockerfile
├── fly.toml
├── requirements.txt
└── agent/
    ├── __init__.py
    ├── main.py
    └── tools.py
Dockerfile
FROM python:3.12-slim
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY agent/ ./agent/
# Port uvicorn listens on (the /health endpoint is served here)
EXPOSE 8080
CMD ["python", "-m", "uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8080"]
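Once the container is up, the `/health` route on port 8080 can be probed before real traffic is sent. A small sketch using only the standard library; `wait_for_health` and its parameters are hypothetical names, and the expected response body assumes the `{"status": "ok"}` payload returned by the server in this guide:

```python
import json
import time
import urllib.request


def wait_for_health(url, attempts=10, delay=1.0, opener=urllib.request.urlopen):
    """Poll `url` until it returns {"status": "ok"}.

    Returns True on success and False once `attempts` are exhausted.
    `opener` is injectable so the loop can be tested without a live server.
    """
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                body = json.loads(response.read().decode("utf-8"))
            if isinstance(body, dict) and body.get("status") == "ok":
                return True
        except (OSError, ValueError):
            # OSError covers connection failures (urllib's URLError subclasses it);
            # ValueError covers a response body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A local check might look like `wait_for_health("http://localhost:8080/health")`, assuming the container's port 8080 is published on localhost.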
FastAPI agent server
# agent/main.py
import os
import asyncio
import uuid

from fastapi import FastAPI, HTTPException, BackgroundTasks
from pydantic import BaseModel
import anthropic

app = FastAPI()
# Raises KeyError at import time if ANTHROPIC_API_KEY is not set
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# In-memory job tracker (use Redis in production for multi-instance)
jobs = {}


class AgentRequest(BaseModel):
    goal: str
    webhook_url: str | None = None


class JobStatus(BaseModel):
    job_id: str
    status: str  # "running" | "done" | "failed"
    result: str | None = None
    error: str | None = None


@app.get("/health")
async def health():
    return {"status": "ok"}


@app.post("/run")
async def run_agent(request: AgentRequest, background_tasks: BackgroundTasks):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "running", "result": None, "error": None}
    # The job runs after the response is sent, so the caller gets a job_id immediately
    background_tasks.add_task(execute_agent_job, job_id, request.goal, request.webhook_url)
    return {"job_id": job_id}


@app.get("/status/{job_id}")
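async def get_status(job_id: str) -> JobStatus:
    # The source listing is cut off at this signature; the return type and
    # body below are an assumed completion based on the JobStatus model.
    if job_id not in jobs:
        raise HTTPException(status_code=404, detail="Job not found")
    return JobStatus(job_id=job_id, **jobs[job_id])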
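The `/run` handler schedules `execute_agent_job`, which the listing does not show. One way it could look, as a sketch under stated assumptions: the model call is passed in as a plain callable (`call_model`, a hypothetical parameter) so the job bookkeeping can be exercised without network access, and the webhook sender (`notify`, also hypothetical) is injected the same way. The original implementation may differ.

```python
jobs = {}  # same shape as the in-memory tracker in agent/main.py


def execute_agent_job(job_id, goal, webhook_url, call_model, notify=None):
    """Run one agent job and record its outcome in jobs[job_id].

    call_model: hypothetical callable taking the goal and returning result text.
    notify: hypothetical callable taking (url, payload) that posts to the webhook.
    Written as a plain (non-async) function; Starlette's BackgroundTasks runs
    such functions in a thread pool, so a blocking model call does not stall
    the event loop.
    """
    try:
        result = call_model(goal)
        jobs[job_id] = {"status": "done", "result": result, "error": None}
    except Exception as exc:
        # Record the failure so /status can report it instead of staying "running".
        jobs[job_id] = {"status": "failed", "result": None, "error": str(exc)}

    if webhook_url and notify is not None:
        try:
            notify(webhook_url, {"job_id": job_id, **jobs[job_id]})
        except Exception:
            # A failed webhook delivery should not change the recorded job status.
            pass
```

In the server, `call_model` would wrap a request made through the `anthropic` client created in `agent/main.py`. Catching every exception inside the job matters because an error raised in a background task never reaches the caller, who has already received the `job_id`.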