How to Deploy an AI Agent to Production: VPS, Docker & Serverless (2026)
Your agent works on your laptop. Great. Now how do you make it run 24/7 without you babysitting it? Deployment is where most AI agent projects die — not because the agent doesn't work, but because nobody figured out how to keep it running reliably.
This guide covers three deployment approaches (VPS, Docker, serverless), with real configs, cost breakdowns, and the monitoring you need to sleep at night while your agent works.
## Choosing Your Deployment Model
| Approach | Best For | Monthly Cost | Complexity | Always-On? |
|---|---|---|---|---|
| **VPS (bare metal)** | 24/7 autonomous agents | $5-20 | Medium | Yes |
| **Docker + VPS** | Reproducible, multi-agent | $10-30 | Medium-High | Yes |
| **Serverless (Lambda/Cloud Run)** | Event-triggered agents | $1-50 (pay-per-use) | Low-Medium | No (triggered) |
| **Managed platforms** | No-ops teams | $20-200 | Low | Varies |
## Option 1: VPS Deployment (What We Use)
The simplest path to a 24/7 agent. Rent a virtual server, install your agent, set up a process manager, and let it run.
### Step 1: Choose a VPS Provider
| Provider | Cheapest Plan | Specs | Best For |
|---|---|---|---|
| **Hetzner** | $4.50/mo | 2 vCPU, 4GB RAM, 40GB SSD | Best value in EU |
| **DigitalOcean** | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Simple UI, good docs |
| **Vultr** | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Global locations |
| **Contabo** | $6.50/mo | 4 vCPU, 8GB RAM, 50GB SSD | Most specs per dollar |
**What Paxrel uses:** A Hetzner CX22 ($5.50/mo) with 2 vCPU, 4GB RAM. Runs our full agent stack: newsletter pipeline, social media automation, web scraping, and Reddit karma builder — all on one server.
### Step 2: Initial Server Setup
```bash
# SSH into your new server
ssh root@your-server-ip

# Create a non-root user
adduser agent
usermod -aG sudo agent

# Install essentials
apt update && apt install -y python3 python3-pip python3-venv git curl

# Switch to agent user
su - agent

# Clone your agent code
git clone https://github.com/your-org/your-agent.git
cd your-agent

# Set up Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create the log directory that the scheduled jobs append to
mkdir -p logs

# Create environment file for credentials
# (type or paste KEY=value lines, then press Ctrl-D to save)
cat > .env
chmod 600 .env
```

Scheduled jobs go in the agent user's crontab (`crontab -e`):

```text
# [...] >> logs/pipeline.log 2>&1

# Social media posting: Every 6 hours
0 */6 * * * cd /home/agent/your-agent && .venv/bin/python3 post_tweet.py >> logs/twitter.log 2>&1

# Daily monitoring report
30 9 * * * cd /home/agent/your-agent && .venv/bin/python3 monitoring.py >> logs/monitoring.log 2>&1
```
## Option 2: Docker Deployment
Docker adds reproducibility and isolation. Especially useful when running multiple agents or when your agent has complex dependencies.
```dockerfile
# Dockerfile
FROM python:3.12-slim
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl git && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY . .

# Non-root user for security
RUN useradd -m agent
USER agent

CMD ["python3", "agent.py"]
```
```yaml
# docker-compose.yml
version: '3.8'
services:
  agent:
    build: .
    restart: always
    env_file: .env
    volumes:
      - ./data:/app/data   # Persist agent memory/state
      - ./logs:/app/logs   # Persist logs
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
    healthcheck:
      # raise_for_status() makes non-2xx responses count as failures
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8080/health', timeout=5).raise_for_status()"]
      interval: 60s
      timeout: 10s
      retries: 3

  # Optional: vector database for RAG
  chromadb:
    image: chromadb/chroma:latest
    restart: always
    volumes:
      - chroma_data:/chroma/chroma
    ports:
      - "8000:8000"

volumes:
  chroma_data:
```
```bash
# Deploy
docker compose up -d

# View logs
docker compose logs -f agent

# Update agent
git pull && docker compose build && docker compose up -d
```
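Recreating a container stops the old one: Docker sends the main process SIGTERM and, after a grace period, SIGKILL. The following is a minimal sketch of an agent loop that finishes its current task before exiting. `StopFlag`, `run_loop`, `install_handlers`, `run_once` and `max_iterations` are hypothetical names chosen for illustration, not part of the setup above.

```python
import signal


class StopFlag:
    """Records that a shutdown signal arrived; the loop reads it between tasks."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Signal handlers should do as little as possible: just set the flag.
        self.requested = True


def run_loop(run_once, stop, max_iterations=None):
    """Call run_once() repeatedly until a stop is requested.

    run_once is a hypothetical callable that performs one unit of agent work.
    max_iterations is only a safety cap for tests and demos.
    Returns the number of completed tasks.
    """
    done = 0
    while not stop.requested:
        if max_iterations is not None and done >= max_iterations:
            break
        run_once()  # never cut short: the flag is only checked between tasks
        done += 1
    return done


def install_handlers(stop):
    """Route SIGTERM (docker stop, systemd) and SIGINT (Ctrl-C) to the flag."""
    signal.signal(signal.SIGTERM, stop.handle)
    signal.signal(signal.SIGINT, stop.handle)
```

In `agent.py` this would be wired up roughly as `stop = StopFlag(); install_handlers(stop); run_loop(do_work, stop)`, where `do_work` stands in for your own task function. Each task should be shorter than the stop grace period, otherwise the process is killed mid-task anyway.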
## Option 3: Serverless Deployment
For agents triggered by events (webhook, email, schedule) rather than running continuously. Pay only when the agent runs.
### AWS Lambda + EventBridge
```python
# handler.py
import json
import boto3

def lambda_handler(event, context):
    """Triggered by EventBridge cron or API Gateway webhook"""
    # Your agent logic here
    from agent import run_agent
    result = run_agent(event)

    return {
        'statusCode': 200,
        'body': json.dumps(result)
    }
```
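The handler's docstring names two triggers, and they deliver differently shaped events. The sketch below shows one way to normalize them before calling the agent. It assumes that EventBridge scheduled events carry `"source": "aws.events"` and that HTTP API events use payload format 2.0 with a string `body`. `normalize_event` is a hypothetical helper, not something the original handler defines.

```python
import base64
import json


def normalize_event(event):
    """Return (trigger, payload) for the trigger types the handler expects.

    Assumes EventBridge scheduled events carry "source": "aws.events" and
    that API Gateway HTTP API events (payload format 2.0) carry a string
    "body" plus a "requestContext" key.
    """
    if event.get("source") == "aws.events":
        return "schedule", event.get("detail") or {}

    if "requestContext" in event:
        body = event.get("body") or "{}"
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        return "webhook", json.loads(body)

    # Anything else (for example a manual test invoke) passes through unchanged.
    return "direct", event
```

Inside `lambda_handler`, the result could be passed on as `run_agent(payload)`, with `trigger` used to pick a code path.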
```yaml
# serverless.yml (Serverless Framework)
service: ai-agent

provider:
  name: aws
  runtime: python3.12
  timeout: 300        # 5 minutes max
  memorySize: 512
  environment:
    OPENAI_API_KEY: ${ssm:/ai-agent/openai-key}

functions:
  newsletter:
    handler: handler.lambda_handler
    events:
      - schedule: cron(0 8 ? * MON,WED,FRI *)  # Mon/Wed/Fri 8am UTC
  webhook:
    handler: handler.lambda_handler
    events:
      - httpApi:
          path: /webhook
          method: post
```
### Google Cloud Run
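```bash
# For longer-running agents (up to 60 min)
# The key is shown inline as a placeholder; for real deployments,
# --set-secrets can pull the value from Secret Manager instead.
gcloud run deploy ai-agent \
  --source . \
  --region us-central1 \
  --memory 1Gi \
  --timeout 3600 \
  --set-env-vars "OPENAI_API_KEY=sk-..." \
  --no-allow-unauthenticated
```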
| Platform | Max Runtime | Cold Start | Cost per Run |
|---|---|---|---|
| AWS Lambda | 15 minutes | 1-5 seconds | $0.0001-0.01 |
| Google Cloud Run | 60 minutes | 2-10 seconds | $0.001-0.05 |
| Vercel Functions | 5 minutes (pro: 15) | | $0.0001-0.005 |
| Cloudflare Workers | 30 seconds (free) | | $0.00005 |
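Per-run cost on Lambda is driven mostly by memory size and duration. Here is a rough sketch of the arithmetic, assuming a rate of $0.0000166667 per GB-second and $0.20 per million requests. These rates are illustrative values that do not come from this article, so check the current price list. The sketch ignores any free tier, and the function names are hypothetical.

```python
# Assumed rates for illustration only; check the current AWS price list.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def lambda_cost_per_run(memory_mb, duration_s):
    """Cost of one invocation: memory (GB) x duration (s) x rate, plus the request fee."""
    gb_seconds = (memory_mb / 1024) * duration_s
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


def monthly_cost(memory_mb, duration_s, runs_per_month):
    """Total for a month of identical runs (no free tier applied)."""
    return lambda_cost_per_run(memory_mb, duration_s) * runs_per_month
```

Under those assumptions, a 512 MB function that runs for a full 300-second timeout comes to about $0.0025 per run (150 GB-seconds), which sits inside the Lambda range in the table.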
## Monitoring Your Deployed Agent
A deployed agent without monitoring is a liability. Here's the minimum monitoring stack:
### Health Check Endpoint
```python
from flask import Flask, jsonify
import psutil

app = Flask(__name__)

# get_uptime, get_last_run_timestamp, get_error_count and check_api_balance
# are helpers you implement for your own agent.
@app.route('/health')
def health():
    return jsonify({
        "status": "healthy",
        "uptime_hours": get_uptime(),
        "memory_mb": psutil.Process().memory_info().rss / 1024 / 1024,
        "last_run": get_last_run_timestamp(),
        "errors_24h": get_error_count(hours=24),
        "api_balance": check_api_balance()
    })
```
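The endpoint calls `get_uptime()`, `get_last_run_timestamp()` and `get_error_count()` without defining them. One possible shape for them is sketched below as an in-memory tracker that uses only the standard library. `AgentStats` and its method names are hypothetical, and the counters reset whenever the process restarts.

```python
import time
from collections import deque


class AgentStats:
    """In-memory counters behind the health fields (hypothetical helper)."""

    def __init__(self, now=time.time):
        self._now = now                       # injectable clock, handy for tests
        self._started = now()
        self._last_run = None
        self._errors = deque(maxlen=10_000)   # bounded list of error timestamps

    def record_run(self):
        """Call after each completed agent run."""
        self._last_run = self._now()

    def record_error(self):
        """Call whenever the agent logs an error."""
        self._errors.append(self._now())

    def uptime_hours(self):
        return (self._now() - self._started) / 3600

    def last_run_timestamp(self):
        return self._last_run

    def error_count(self, hours=24):
        """Number of recorded errors within the last `hours` hours."""
        cutoff = self._now() - hours * 3600
        return sum(1 for t in self._errors if t >= cutoff)
```

A module-level `stats = AgentStats()` could then back the endpoint, for example `get_uptime = stats.uptime_hours`. If the agent and the health server run as separate processes, the same values would have to live in a file or database instead.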
### Alert System
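```python
import os
import requests

# Assumed to be set as environment variables (e.g. via .env), never hard-coded
BOT_TOKEN = os.environ["BOT_TOKEN"]
OWNER_ID = os.environ["OWNER_ID"]
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK"]

def send_alert(message, level="warning"):
    """Send alert via Telegram (critical) or Slack (everything else)"""
    if level == "critical":
        # Telegram for immediate attention
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": OWNER_ID, "text": f"🚨 {message}"},
            timeout=10,  # don't let a slow alert call hang the agent
        )
    else:
        # Slack webhook for non-critical
        requests.post(SLACK_WEBHOOK, json={"text": f"⚠️ {message}"}, timeout=10)

# Alerts to configure:
# - Agent crash / restart
# - API balance below threshold
# - Error rate spike (3+ errors in 10 min)
# - Agent stuck (no activity for 2+ hours)
# - Cost spike (daily spend > 2x average)
```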
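Three of the alert conditions listed above are simple threshold checks. The sketch below writes them as pure functions over Unix timestamps in seconds and dollar amounts. The function names and default parameters are hypothetical, with the defaults mirroring the numbers in the list.

```python
def error_spike(error_times, now, threshold=3, window_s=600):
    """True if at least `threshold` errors occurred in the last `window_s` seconds."""
    recent = [t for t in error_times if now - t <= window_s]
    return len(recent) >= threshold


def agent_stuck(last_activity, now, max_idle_s=2 * 3600):
    """True if no activity has been recorded for `max_idle_s` seconds or more."""
    return now - last_activity >= max_idle_s


def cost_spike(today_spend, past_daily_spend, factor=2.0):
    """True if today's spend exceeds `factor` times the average of past days."""
    if not past_daily_spend:
        return False  # no baseline yet, so nothing to compare against
    average = sum(past_daily_spend) / len(past_daily_spend)
    return today_spend > factor * average
```

A scheduled monitoring script could evaluate these and call `send_alert(...)` whenever one returns `True`.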
### Log Management
```python
import logging
import os
from logging.handlers import RotatingFileHandler

# The handler does not create missing directories
os.makedirs('logs', exist_ok=True)

# Rotating file log with a consistent line format
handler = RotatingFileHandler(
    'logs/agent.log',
    maxBytes=10_000_000,  # 10MB per file
    backupCount=5         # Keep 5 rotated files
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))
logger = logging.getLogger('agent')
logger.setLevel(logging.INFO)  # default is WARNING, which would drop info() calls
logger.addHandler(handler)

# Log every significant action
logger.info("Scraping 12 RSS feeds")
logger.info("Scored 97 articles, top score: 28")
logger.warning("API rate limited, retrying in 30s")
logger.error("Beehiiv publish failed: 401 Unauthorized")
```
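If the logs will be parsed by a tool rather than read by eye, one option is to emit one JSON object per line. The sketch below uses only the standard library. `JsonFormatter` is a hypothetical class name, and the field names are an arbitrary choice.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback when logger.exception() was used.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)
```

It can replace the plain-text formatter with `handler.setFormatter(JsonFormatter())`, and rotation keeps working unchanged.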
## Production Hardening Checklist
### Security
- API keys in environment variables or secrets manager, never in code
- Non-root user for the agent process
- Firewall: only allow SSH (22) and necessary ports
- SSH key auth only, disable password login
- Auto-update OS security patches (`unattended-upgrades`)
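
For the first item, a small fail-fast loader makes a missing key show up at startup instead of in the middle of a run. This is a minimal sketch: `load_secrets` and `MissingSecret` are hypothetical names, and the variable names in the usage note are only examples.

```python
import os


class MissingSecret(RuntimeError):
    """Raised at startup when a required environment variable is absent."""


def load_secrets(names, environ=None):
    """Read required secrets from the environment, failing fast if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Report names only; never echo secret values into logs or errors.
        raise MissingSecret("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

Calling `load_secrets(["OPENAI_API_KEY"])` once at startup is enough. The optional `environ` argument exists only so the function can be tested without touching the real environment.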
### Reliability
- Process manager with auto-restart (systemd, Docker restart policy)
- Graceful shutdown handling (catch SIGTERM, finish current task)
- Exponential backoff on API errors (not infinite retry loops)
- Circuit breaker for external services (stop calling after N failures)
- Daily backup of agent state/memory to external storage
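
The backoff and circuit-breaker items can be sketched in a few lines. This is one illustrative shape with hypothetical names and defaults, not a specific library's API. Real code would catch only the errors that are worth retrying instead of every `Exception`.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()` until it succeeds, re-raising after `max_attempts` failures.

    The wait doubles after each failure, is capped at max_delay, and is
    scaled by random jitter so parallel agents do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # bounded: give up instead of looping forever
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))


class CircuitOpen(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing service after `max_failures` consecutive errors."""

    def __init__(self, max_failures=3, reset_after=300.0, now=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._now = now
        self._failures = 0
        self._opened_at = None

    def call(self, fn):
        if self._opened_at is not None:
            if self._now() - self._opened_at < self.reset_after:
                raise CircuitOpen("circuit open, skipping call")
            self._opened_at = None  # cool-down elapsed: allow one trial call
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._now()
            raise
        self._failures = 0  # any success closes the circuit again
        return result
```

The two compose: wrap the raw API call in `breaker.call(...)`, then wrap that in `retry_with_backoff(...)`. This way a service that stays down stops consuming retries.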
### Cost Control
- Daily API spend limit with hard cutoff
- Max steps per agent run (prevent infinite loops)
- Token counting before API calls (reject oversized prompts)
- Alert when daily spend exceeds 2x average
- Weekly cost report to the team
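
The step cap and the hard spend cutoff can be enforced with a small guard object. In this sketch, `RunGuard`, `BudgetExceeded`, the default limits and the per-1k-token prices passed to `charge` are all hypothetical. The guard covers a single run, so a daily limit would need the spend counter persisted across runs.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run hits its step cap or spend cap."""


class RunGuard:
    """Hard limits for one agent run: a step cap and a spend cap."""

    def __init__(self, max_steps=20, max_spend_usd=1.00):
        self.max_steps = max_steps
        self.max_spend_usd = max_spend_usd
        self.steps = 0
        self.spend_usd = 0.0

    def check_step(self):
        """Call before each agent step; raises once the step cap is reached."""
        if self.steps >= self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} reached")
        self.steps += 1

    def charge(self, input_tokens, output_tokens, usd_per_1k_in, usd_per_1k_out):
        """Add the cost of one API call; raises once the spend cap is crossed."""
        cost = input_tokens / 1000 * usd_per_1k_in + output_tokens / 1000 * usd_per_1k_out
        self.spend_usd += cost
        if self.spend_usd > self.max_spend_usd:
            raise BudgetExceeded(
                f"spend ${self.spend_usd:.4f} exceeds ${self.max_spend_usd:.2f}"
            )
        return cost
```

In an agent loop, `guard.check_step()` goes at the top of each iteration and `guard.charge(...)` goes after each model call, using the token counts the API reports.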
## Deployment Patterns by Use Case
| Agent Type | Best Deployment | Why |
|---|---|---|
| 24/7 autonomous agent | VPS + systemd | Always-on, persistent state |
| Scheduled pipeline | VPS + cron or serverless | Runs on schedule, sleeps between |
| Webhook-triggered | Serverless (Lambda/Cloud Run) | Pay-per-use, auto-scales |
| Multi-agent system | Docker Compose on VPS | Isolated containers, shared network |
| Customer-facing chatbot | Cloud Run or managed platform | Auto-scale with traffic |
| Development/testing | Local Docker | Reproducible environment |
## Key Takeaways
- **VPS + systemd is the simplest path** for always-on agents. $5-20/month, full control, and enough for most use cases.
- **Docker adds value** when you have complex dependencies, multiple agents, or need reproducibility across environments.
- **Serverless is cheaper for sporadic workloads** but has runtime limits (15 min for Lambda) that don't suit long-running agents.
- **Monitoring is not optional.** Health checks, alerts, and log rotation are the minimum. An unmonitored agent will fail silently.
- **Security basics matter.** Non-root user, env vars for secrets, firewall, SSH keys. Takes 30 minutes, prevents disasters.
- **Start simple, scale later.** A $5 VPS with cron jobs is a perfectly valid production deployment. Don't over-engineer until you need to.
### Deploy With Confidence
Our AI Agent Playbook includes Dockerfiles, systemd configs, monitoring templates, and deployment checklists for production agents.
[Get the Playbook — $29](https://paxrel.gumroad.com/l/ai-agent-playbook)
### Stay Updated on AI Agents
Deployment patterns, infrastructure tips, and production war stories. 3x/week, no spam.
[Subscribe to AI Agents Weekly](/newsletter.html)
Originally published on paxrel.com