Originally published on AIdeazz — cross-posted here with canonical link.
After burning through $3,000 in AWS credits last quarter running production agents, I made a decision that raised eyebrows: migrate everything to Oracle Cloud's free tier. Six months later, my multi-agent systems serve 400+ daily users without touching my credit card. Here's the infrastructure reality behind that choice.
The Economics of Running Always-On Agents
Production agents eat compute differently than traditional apps. My WhatsApp customer service bot processes 3,000 messages daily, each triggering Claude API calls, vector searches, and state management. The Telegram code review agent runs continuous background jobs. These aren't request-response microservices—they're persistent processes with memory.
On AWS, this meant:
- EC2 t3.small instances: $15/month each (needed 3 for redundancy)
- RDS Postgres: $25/month for the smallest production setup
- NAT Gateway: $45/month (the silent killer)
- Data transfer: $20-50/month depending on traffic
Oracle Cloud free tier gives you:
- 4 Arm-based Ampere A1 OCPUs with 24GB RAM (splittable across up to 4 VMs)
- 2 AMD-based Micro compute instances
- 2 Autonomous Databases (20GB each)
- 10TB outbound data transfer monthly
The math is straightforward: $0 vs $150+/month for equivalent resources. But the real story is in the operational details.
Autonomous Database: The Overlooked Agent Backbone
Everyone talks about LLMs and embeddings. Nobody talks about state management at scale. Oracle's Autonomous Database became my agent memory solution—not because it's fancy, but because it handles the boring parts automatically.
My agent architecture stores:
- Conversation history with vector embeddings
- User context and preferences
- Rate limiting counters
- Async job queues
- Checkpoint states for long-running workflows
The database self-tunes, auto-scales within free tier limits, and handles backups. No manual vacuuming, no index bloat, no 3am pages about connection pools. The built-in JSON support means I store Claude responses directly without ORM overhead:
INSERT INTO agent_memory (
    user_id,
    conversation,
    embedding,
    metadata
) VALUES (
    :user_id,
    JSON(:claude_response),
    :ada_embedding,
    JSON_OBJECT('model'  VALUE 'claude-3-sonnet',
                'tokens' VALUE :token_count)
);
The 20GB limit per database forces good hygiene. I partition old conversations to object storage, keeping only active embeddings hot. This constraint improved my architecture—infinite storage encourages lazy design.
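That partitioning step reduces to a small amount of glue. Here's a minimal sketch, assuming each conversation row carries a `last_seen` timestamp and a pluggable `upload(key, payload)` callable standing in for the OCI Object Storage client; the 30-day window and key layout are illustrative, not my production values:

```python
import json
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # assumption: 30-day hot window

def split_hot_cold(conversations, now=None):
    """Split conversation rows into hot (keep in Autonomous DB)
    and cold (archive to object storage) by last activity."""
    now = now or datetime.now(timezone.utc)
    hot, cold = [], []
    for row in conversations:
        (cold if now - row["last_seen"] > STALE_AFTER else hot).append(row)
    return hot, cold

def archive(cold_rows, upload):
    """Serialize cold rows and hand them to an uploader callable.
    `upload(key, payload)` is a stand-in for the real object-storage call."""
    for row in cold_rows:
        key = f"archive/{row['user_id']}/{row['last_seen']:%Y%m%d}.json"
        upload(key, json.dumps(row, default=str))
```

A scheduled job runs the split hourly, archives the cold rows, then deletes them from the hot table in the same transaction.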
mTLS and Network Security Without the Ceremony
Oracle enforces mTLS for Autonomous Database connections. Initially annoying, now essential for my distributed agent setup. Each agent VM gets its own wallet, preventing the security theater of hardcoded connection strings.
My setup:
- Generate wallet per agent service
- Mount as Kubernetes secrets (yes, Oracle free tier runs K3s fine)
- Rotate quarterly via simple automation
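Wiring a mounted wallet into python-oracledb (thin mode) comes down to pointing `config_dir` and `wallet_location` at the secret mount. The mount path and environment variable names below are this sketch's conventions, not Oracle's:

```python
import os

def wallet_pool_params(wallet_dir="/etc/oracle/wallet"):
    """Build python-oracledb create_pool kwargs for an mTLS wallet
    mounted as a Kubernetes secret. Env var names are project
    conventions, not anything the driver mandates."""
    return {
        "user": os.environ["DB_USER"],
        "password": os.environ["DB_PASS"],
        "dsn": os.environ["DB_DSN"],     # TNS alias from the wallet's tnsnames.ora
        "config_dir": wallet_dir,        # tnsnames.ora / sqlnet.ora location
        "wallet_location": wallet_dir,   # mTLS client certs (cwallet.sso)
        "min": 1, "max": 4, "increment": 1,
    }

# pool = oracledb.create_pool(**wallet_pool_params())
```

Quarterly rotation then means regenerating the wallet, updating the Kubernetes secret, and restarting the agent pods — no code changes.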
The network security model is refreshingly rigid. No public endpoints by default—you explicitly allow traffic. This forced me to properly architect agent communication through private subnets, eliminating an entire class of exposure risks.
Real example: My Groq router (balances requests across Groq/Claude based on load) runs on a private subnet, accessible only from agent VMs. External webhooks hit a reverse proxy with rate limiting. Simple topology, enforced by default constraints.
VM Shapes and the Reality of Agent Workloads
Oracle's free ARM instances outperform what you'd expect. The 4 OCPU ARM cores handle my Python agents better than AWS t3.medium instances. Here's the actual resource usage from production:
WhatsApp Business Agent (1 OCPU, 6GB RAM):
- Handles 100 concurrent conversations
- Vector search across 50k documents
- 15ms p95 response time to webhook
- CPU: 40% average, 80% peak
Telegram Code Review Agent (2 OCPU, 12GB RAM):
- Processes GitHub webhooks
- Runs AST analysis before LLM calls
- Manages diff queues for large PRs
- CPU: 60% average during business hours
Multi-Model Router (1 OCPU, 6GB RAM):
- Groq Llama for simple queries
- Claude for complex reasoning
- Tracks rate limits and fallbacks
- CPU: 25% average
The free tier's 200GB block storage seems limiting until you realize agents shouldn't store much locally. Conversation logs go to Autonomous DB, file uploads to object storage, everything else is ephemeral.
Keeping Agents Alive: The Boring Critical Path
Production agents die in predictable ways. Memory leaks from long-running Python processes. Webhook timeouts. Rate limit exhaustion. Database connection pool starvation. The infrastructure must handle these gracefully.
My monitoring stack on Oracle free tier:
- Systemd for process management (automatic restarts)
- Prometheus node exporter (1% resource overhead)
- Custom health checks every 30 seconds
- Dead letter queues in Autonomous DB
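The dead letter queue boils down to a wrapper around each message handler: retry transient failures, then park the message for later inspection. A self-contained sketch with an in-memory sink (production writes to a table in Autonomous DB; the retry count is illustrative):

```python
import time

def with_dead_letter(handler, dead_letters, max_attempts=3):
    """Wrap a message handler: retry on failure, and after
    max_attempts park the message in a dead-letter sink."""
    def wrapped(msg):
        for attempt in range(1, max_attempts + 1):
            try:
                return handler(msg)
            except Exception as exc:
                if attempt == max_attempts:
                    dead_letters.append({"msg": msg, "error": str(exc)})
                    return None
                time.sleep(0)  # backoff between attempts elided in this sketch
    return wrapped
```

A periodic job drains the dead-letter table, alerting when entries accumulate faster than they can be replayed.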
Example systemd unit that's saved me dozens of incidents:
[Unit]
StartLimitBurst=5
StartLimitIntervalSec=60

[Service]
Type=simple
Restart=always
RestartSec=10
MemoryMax=5G
MemoryAccounting=true
ExecStart=/home/ubuntu/venv/bin/python agent.py
StandardOutput=journal
StandardError=journal
The MemoryMax prevents runaway processes. StartLimitBurst stops crash loops from hammering APIs. Simple, boring, effective.
For distributed state, I use Autonomous DB's built-in job scheduler:
BEGIN
    DBMS_SCHEDULER.create_job(
        job_name        => 'cleanup_stale_conversations',
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'BEGIN cleanup_old_chats(); END;',
        repeat_interval => 'FREQ=HOURLY',
        enabled         => TRUE
    );
END;
No external cron, no Kubernetes jobs, no Lambda functions. The database handles it.
Integration Patterns That Actually Scale
The free tier constraints shaped better patterns. Limited compute means aggressive caching. Fixed database size means data lifecycle policies. No managed Kubernetes means simple deployment.
My standard agent template:
- FastAPI app with built-in health checks
- python-oracledb (thin mode) connections to Autonomous DB
- Redis-compatible caching (Valkey on separate VM)
- Webhook endpoints with exponential backoff
- Structured logging to local disk (rotated)
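The webhook backoff follows the usual doubling-with-a-cap schedule. A sketch of the delay calculation — the base, cap, and retry count here are illustrative, not the production values:

```python
import random

def backoff_delays(retries=5, base=0.5, cap=30.0, jitter=False):
    """Exponential backoff schedule for webhook redelivery:
    base * 2^attempt, capped, with optional full jitter."""
    delays = []
    for attempt in range(retries):
        d = min(cap, base * (2 ** attempt))
        if jitter:
            d = random.uniform(0, d)  # full jitter avoids thundering herds
        delays.append(d)
    return delays
```

In practice jitter stays on so a burst of failed deliveries doesn't retry in lockstep against an already-struggling upstream.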
Real code from production WhatsApp agent:
import os

import oracledb
from redis import Redis  # redis-py client, pointed at the Valkey VM

class AgentCore:
    def __init__(self):
        self.db = oracledb.create_pool(
            user=os.environ['DB_USER'],
            password=os.environ['DB_PASS'],
            dsn=os.environ['DB_DSN'],
            min=2, max=10, increment=1
        )
        self.cache = Redis(host='10.0.0.5', decode_responses=True)
        self.llm_router = LLMRouter(  # project-internal router class
            groq_key=os.environ['GROQ_KEY'],
            anthropic_key=os.environ['ANTHROPIC_KEY']
        )

    async def process_message(self, user_id: str, message: str):
        # Check rate limits
        if not await self._check_rate_limit(user_id):
            return "Please wait before sending another message"
        # Get conversation context
        context = await self._get_context(user_id)
        # Route to appropriate model
        response = await self.llm_router.complete(
            message=message,
            context=context,
            complexity=self._assess_complexity(message)
        )
        # Store in database
        await self._store_interaction(user_id, message, response)
        return response
Nothing clever, just solid patterns that survive production.
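The `_check_rate_limit` call hides a fixed-window counter. A self-contained version with the same semantics as the Valkey INCR + EXPIRE implementation — the limit and window below are illustrative:

```python
class FixedWindowLimiter:
    """In-memory sketch of the per-user rate limit check. Production
    uses Valkey INCR with an EXPIRE matching the window length."""

    def __init__(self, limit=10, window=60):
        self.limit, self.window = limit, window
        self._counts = {}  # (user_id, window_index) -> count

    def allow(self, user_id, now):
        """Return True if this user is still under limit for the
        current window; `now` is a timestamp in seconds."""
        key = (user_id, int(now // self.window))
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self.limit
```

Fixed windows allow a brief burst at window boundaries; that's an acceptable trade for a chat bot, where the limit exists to stop abuse rather than to meter billing.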
Migration Realities and Gotchas
Moving from AWS revealed hidden dependencies. RDS PostgreSQL has subtle differences from Autonomous DB. S3 APIs differ from Oracle Object Storage. Network topology requires rethinking.
Specific issues I hit:
- Database connections: Oracle uses wallets, not connection strings. Solution: Environment-specific initialization scripts.
- Object storage: Different signing process for presigned URLs. Solution: Abstraction layer for storage operations.
- Monitoring: No CloudWatch equivalent. Solution: Self-hosted Grafana on free tier compute.
- Secrets management: No AWS Secrets Manager. Solution: Encrypted files in object storage, keys in environment variables.
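The storage abstraction layer can be as small as an interface plus one adapter per backend. A sketch — the method names are this example's, not any SDK's, and the in-memory adapter stands in for wrappers around boto3 or the OCI client:

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """Thin abstraction so agent code doesn't care whether blobs
    live in S3 or OCI Object Storage."""

    @abstractmethod
    def put(self, key, data): ...

    @abstractmethod
    def share_url(self, key, ttl_seconds): ...

class MemoryStore(BlobStore):
    """In-memory stand-in used for tests. Real backends wrap boto3
    presigned URLs or OCI pre-authenticated requests behind share_url."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def share_url(self, key, ttl_seconds):
        return f"memory://{key}?ttl={ttl_seconds}"
```

Because S3 presigned URLs and OCI pre-authenticated requests are generated completely differently, hiding them behind one `share_url` call is what made the migration a config change instead of a rewrite.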
The migration took 3 weeks of nights and weekends. Not seamless, but manageable.
When Free Tier Isn't Enough
Oracle Cloud free tier has hard limits. You'll hit them with:
- More than 1000 concurrent users
- Real-time video/audio processing
- Large model fine-tuning
- Multi-region deployment
My escape hatches:
- Groq for high-volume inference ($0.10/million tokens vs Claude's $3)
- Cloudflare R2 for blob storage overflow
- Hetzner boxes for GPU workloads
- Oracle paid tier only for specific overages
The free tier handles 80% of my workload. The remaining 20% costs $50/month across providers—still 70% less than pure AWS.
The Verdict After Six Months
Oracle Cloud free tier works for production agents if you embrace its constraints. It's not about the free resources—it's about forced architectural discipline. Limited compute means efficient code. Fixed database size means data lifecycle management. No managed services means understanding your stack.
My agents serve real customers, handle production load, and maintain 99.9% uptime (measured, not promised). The infrastructure cost: $0 for Oracle resources, ~$50/month for LLM APIs and overflow compute.
For developers building agent systems: try the Oracle Cloud free tier for a proof of concept. The worst case? You learn infrastructure patterns that work anywhere. Best case? You run production workloads without AWS bills.
The future isn't about unlimited resources. It's about doing more with less, and Oracle's free tier accidentally enforces that discipline.