Elena Revicheva

Posted on Apr 24 • Originally published at aideazz.hashnode.dev

Why I Run Production AI Agents on Oracle Cloud's Free Tier

#ai #programming #machinelearning


I've been shipping AI agents on Oracle Cloud's free tier for over a year now. These are not hobby projects or proofs of concept but production systems handling real workloads. The setup handles everything from WhatsApp customer service bots to multi-agent research systems, all running on infrastructure that costs exactly $0/month.

The Always Free Reality Check

Oracle's Always Free tier isn't a trial. It's a permanent allocation: 2 AMD compute instances (1/8 OCPU, 1GB RAM each), 2 Autonomous Databases (20GB storage each), 10GB object storage, and critically — 10TB/month outbound data transfer. That last part matters when you're routing requests to Groq's inference API or streaming responses back to Telegram webhooks.

The VM shape (VM.Standard.E2.1.Micro) runs Ubuntu 22.04 comfortably. Yes, 1GB RAM per instance is tight. I run Node.js agents with --max-old-space-size=768 to leave headroom for the OS. Memory pressure forces good architecture decisions: stateless processes, external queues, aggressive garbage collection. These constraints make the system more robust, not less.
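
The headroom arithmetic behind that flag can be sketched as below. This is a rough illustration, assuming the 1GB instance exposes 1024MB and treating the flag value as megabytes. The function name is hypothetical. Note that --max-old-space-size caps only V8's old-generation heap, so the leftover has to cover the OS as well as Node's non-heap allocations.

```python
TOTAL_RAM_MB = 1024       # 1GB instance, as described in the text
NODE_OLD_SPACE_MB = 768   # value passed to --max-old-space-size


def headroom_mb(total_mb: int, old_space_mb: int) -> int:
    """Return the memory left over once V8's old-space cap is subtracted."""
    if old_space_mb >= total_mb:
        raise ValueError("heap cap must be smaller than total RAM")
    return total_mb - old_space_mb


headroom = headroom_mb(TOTAL_RAM_MB, NODE_OLD_SPACE_MB)
print(f"{headroom} MB left ({headroom / TOTAL_RAM_MB:.0%} of RAM)")
# prints "256 MB left (25% of RAM)"
```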

What Oracle doesn't advertise clearly: the Always Free allocation also covers Arm-based Ampere A1 capacity, up to 4 instances sharing 24GB of RAM in total, in addition to the AMD instances. The A1 shapes perform better for most agent workloads, but availability varies by region. I stick with AMD in Phoenix because I can provision them reliably.

Database Architecture That Actually Scales

Oracle Autonomous Database on the free tier changes the agent persistence game entirely. You get two options: Autonomous Transaction Processing (ATP) or Autonomous Data Warehouse (ADW). I use ATP for agent state management — it handles ACID transactions, automatic indexing, and connection pooling without configuration.

Here's what makes it production-grade: built-in mTLS authentication (no passwords floating around), automatic backups, and 99.95% uptime SLA even on free tier. The database scales to 20GB storage and 1 OCPU (with auto-scaling disabled on free tier, but you can manually adjust).

My agent architecture uses ATP for:

Conversation state persistence (user context, message history)
Task queues with row-level locking for multi-agent coordination
Configuration management with versioning
Audit logs with automatic partitioning by date
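
The task-queue item in the list above relies on one worker claiming a row so that no other agent picks it up. A minimal sketch of that claim step follows. It uses Python's built-in sqlite3 as a stand-in so it runs anywhere. On ATP the same idea would more likely be written with SELECT ... FOR UPDATE SKIP LOCKED. The table, columns, and worker names are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def claim_next_task(conn: sqlite3.Connection, worker: str) -> Optional[Tuple[int, str]]:
    """Claim the oldest pending task for `worker`, or return None if nothing is claimable."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, payload FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in WHERE keeps the claim safe if another worker got there first.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', owner = ? WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        return (row[0], row[1]) if cur.rowcount == 1 else None


# Stand-in database with two pending tasks (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT, "
    "status TEXT DEFAULT 'pending', owner TEXT)"
)
conn.executemany("INSERT INTO tasks (payload) VALUES (?)", [("summarize",), ("translate",)])
conn.commit()
```

Each call hands out a different task until the queue is empty, at which point the function returns None and the worker can back off.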

The JSON document store capability means I can store agent responses, API payloads, and structured data in the same table without schema migrations. Performance stays consistent up to about 300 requests/second on free tier — well above what most agent systems need.

Network Security Without the Theater

Oracle Cloud forces you to configure security properly from day one. Every compute instance sits behind a Virtual Cloud Network (VCN) with explicit Security Lists. No "allow all from 0.0.0.0/0" shortcuts that plague other cloud providers' defaults.

For agent deployments, I run this network topology:

Public subnet for the load balancer (accepting HTTPS only)
Private subnet for compute instances
Database accessible only from compute subnet
NAT gateway for outbound API calls to Groq/Anthropic
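
One way to keep a topology like this honest is to lint the ingress rules before applying them. The sketch below is an illustration only: the rule structure, subnet labels, CIDR ranges, and ports are assumptions, not an export of real Security Lists. It flags any rule that accepts traffic from anywhere, except HTTPS on the public subnet.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List

ANYWHERE = ip_network("0.0.0.0/0")


@dataclass(frozen=True)
class IngressRule:
    subnet: str        # "public", "private" or "database" (illustrative labels)
    source_cidr: str
    port: int


def open_to_world_violations(rules: List[IngressRule]) -> List[IngressRule]:
    """Return rules that accept traffic from anywhere, other than HTTPS on the public subnet."""
    return [
        r for r in rules
        if ip_network(r.source_cidr) == ANYWHERE
        and not (r.subnet == "public" and r.port == 443)
    ]


# Hypothetical rule set mirroring the topology described in the text.
RULES = [
    IngressRule("public", "0.0.0.0/0", 443),       # load balancer, HTTPS only
    IngressRule("private", "10.0.0.0/24", 8080),   # compute, reachable from the LB subnet
    IngressRule("database", "10.0.1.0/24", 1522),  # database, reachable from compute only
]

print(open_to_world_violations(RULES))  # prints []
```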

The mTLS requirement for Autonomous Database connections means even if someone breaches your compute instance, they can't connect to your database without the wallet files. Oracle's Identity and Access Management (IAM) provides API keys with granular permissions — my deployment scripts have write access to specific buckets only.

Boring? Yes. Secure? Also yes. This setup survived multiple penetration tests from clients without any findings in the infrastructure layer.

Deployment Patterns for Fault Tolerance

Running production systems on 1GB RAM instances requires specific patterns. Here's my standard agent deployment:

Process supervision: systemd for service management, not PM2 or Forever. systemd handles resource limits and automatic restarts natively, and its journal takes care of log rotation. My unit files set Restart=always with exponential backoff.

Memory management: Each agent process gets a hard memory limit via systemd's MemoryMax directive. When an agent exceeds the limit, the kernel's OOM killer terminates it and systemd restarts it. No memory leaks accumulating over weeks.
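
As a hedged illustration of the two patterns above, here is a small helper that renders a unit file with Restart=always and a MemoryMax cap. The service name, paths, and the 900MB limit are hypothetical. The sketch uses a fixed RestartSec delay: graduated restart delays (RestartSteps= and RestartMaxDelaySec=) only exist in systemd 254 and later, so they are left out.

```python
def render_unit(name: str, exec_start: str, memory_max_mb: int, restart_sec: int = 5) -> str:
    """Render a minimal systemd service unit with automatic restarts and a hard memory cap."""
    return "\n".join([
        "[Unit]",
        f"Description={name}",
        "After=network-online.target",
        "Wants=network-online.target",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        "Restart=always",
        f"RestartSec={restart_sec}",
        f"MemoryMax={memory_max_mb}M",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


# Hypothetical agent service; adjust the paths and limits to the real deployment.
unit = render_unit(
    "whatsapp-agent",
    "/usr/bin/node --max-old-space-size=768 /opt/agent/server.js",
    memory_max_mb=900,
)
print(unit)
```

Keeping MemoryMax above the V8 heap cap leaves room for Node's non-heap memory, so the process is only killed when it is genuinely leaking.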

Request routing: Nginx on each instance with upstream health checks. Even with 2 instances, proper load balancing matters. I use the least_conn method since agent requests have variable processing times.
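
To show why least-connections suits requests of uneven duration, here is a simplified model of the selection rule. It is a sketch, not Nginx's implementation: it ignores server weights and breaks ties by list order. The instance names and starting counts are made up.

```python
from typing import Dict


def pick_least_conn(active: Dict[str, int]) -> str:
    """Return the upstream with the fewest in-flight requests (first listed wins ties)."""
    return min(active, key=active.get)


def dispatch(active: Dict[str, int]) -> str:
    """Route one request and record it as in flight on the chosen upstream."""
    target = pick_least_conn(active)
    active[target] += 1
    return target


# instance-1 is busy with slow requests, so new work drains to instance-2 first.
in_flight = {"instance-1": 3, "instance-2": 1}
order = [dispatch(in_flight) for _ in range(3)]
print(order)  # prints ['instance-2', 'instance-2', 'instance-1']
```

Round-robin would have sent every other request to the already loaded instance, which is the behaviour least_conn avoids.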

State externalization: All conversation state lives in ATP. Agents fetch context on each request. Yes, this adds 5-10ms latency. No, users don't notice. The tradeoff enables horizontal scaling and clean restarts.
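
The fetch-on-every-request pattern can be sketched as a handler that keeps nothing in process memory between calls. The store interface and the in-memory stand-in below are hypothetical. In the setup described here, the store would be backed by ATP.

```python
from typing import Dict, List, Protocol


class ContextStore(Protocol):
    def load(self, user_id: str) -> List[str]: ...
    def append(self, user_id: str, message: str) -> None: ...


class InMemoryStore:
    """Stand-in for the external database so the sketch runs without one."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def load(self, user_id: str) -> List[str]:
        return list(self._data.get(user_id, []))

    def append(self, user_id: str, message: str) -> None:
        self._data.setdefault(user_id, []).append(message)


def handle_message(store: ContextStore, user_id: str, text: str) -> str:
    """Stateless handler: all context comes from the store and goes back to it."""
    history = store.load(user_id)          # fetch context on each request
    reply = f"(turn {len(history) // 2 + 1}) echo: {text}"
    store.append(user_id, text)            # persist the user message
    store.append(user_id, reply)           # persist the agent reply
    return reply


store = InMemoryStore()
print(handle_message(store, "u1", "hi"))     # prints "(turn 1) echo: hi"
print(handle_message(store, "u1", "again"))  # prints "(turn 2) echo: again"
```

Because the handler holds no state of its own, any instance can serve the second message, and a restart between the two calls loses nothing.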

Queue processing: Oracle Transactional Event Queues (TEQ) in ATP for async work. Built-in message ordering, automatic retries, and poison message handling. Free tier allows 1000 messages/second throughput.
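
Retry and poison-message handling can be illustrated with an in-process model. This is not the TEQ API, only a sketch of the behaviour: a message that keeps failing is retried up to a limit and then set aside. The function name, the attempt limit, and the sample messages are assumptions. Unlike a real ordered queue, this toy version requeues failures at the back.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def drain(queue: Deque[Tuple[str, int]], handler: Callable[[str], None],
          max_attempts: int = 3) -> List[str]:
    """Process (message, attempts) pairs; return messages that failed max_attempts times."""
    dead_letters: List[str] = []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(message)        # poison message: stop retrying
            else:
                queue.append((message, attempts))   # retry later
    return dead_letters


processed: List[str] = []
failures: List[str] = []


def handler(message: str) -> None:
    """Hypothetical worker that cannot process the message 'bad'."""
    if message == "bad":
        failures.append(message)
        raise ValueError("cannot process")
    processed.append(message)


dead = drain(deque([("ok", 0), ("bad", 0)]), handler)
print(processed, dead, len(failures))  # prints ['ok'] ['bad'] 3
```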

Real Workload Performance

Let me share actual metrics from production agents running on Oracle Cloud free tier:

WhatsApp Business Agent (customer service for an e-commerce client):

2 AMD instances load-balanced
Handles 8,000 messages/day average
P95 response time: 1.2 seconds (including Groq API calls)
Memory usage: 650-750MB per instance
Zero downtime in 6 months

Multi-Agent Research System (market analysis automation):

4 ARM A1 instances (6GB RAM each)
Coordinates 12 specialized agents
Processes 500 research requests/day
Uses 8GB in ATP for report storage
Costs: $0 infrastructure, ~$45/month API calls

Telegram Bot Cluster (various utility bots):

2 AMD instances
Serves 15,000 active users
200,000 requests/day
Webhook processing: <50ms P99
Bandwidth usage: 2TB/month (well under 10TB limit)

The Groq API integration deserves special mention. Their inference speed (70ms for Llama 3.1 8B) means network latency from Oracle Cloud regions becomes the bottleneck. I locate agents in Phoenix or San Jose for optimal Groq routing, which adds only 15-20ms RTT.
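
A back-of-the-envelope view of that latency budget, using only the figures quoted above. It assumes one network round trip per inference call and ignores connection setup and streaming. The function names are hypothetical.

```python
def round_trip_ms(inference_ms: float, rtt_ms: float) -> float:
    """Total time for one call: model inference plus one network round trip."""
    return inference_ms + rtt_ms


def network_share(inference_ms: float, rtt_ms: float) -> float:
    """Fraction of the total call time spent on the network."""
    return rtt_ms / round_trip_ms(inference_ms, rtt_ms)


for rtt in (15, 20):
    total = round_trip_ms(70, rtt)
    print(f"RTT {rtt}ms -> total {total}ms, network share {network_share(70, rtt):.0%}")
# prints "RTT 15ms -> total 85ms, network share 18%"
# prints "RTT 20ms -> total 90ms, network share 22%"
```

With inference this fast, every extra 10ms of RTT from a more distant region shows up almost directly in the per-call total, which is why region choice matters here.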

Operational Annoyances and Workarounds

Free tier limitations create specific operational challenges:

No auto-scaling: You can't dynamically add instances. Plan capacity properly. I over-provision slightly and use systemd resource limits to prevent one agent from starving others.

Backup restrictions: Free tier Autonomous DB includes automatic backups but no manual snapshots. I export critical data daily to Object Storage using Data Pump.

Monitoring gaps: No access to paid monitoring features. I ship logs to a self-hosted Grafana Loki instance on one of the compute instances. Basic but functional.

Region locks: Free tier resources must stay in the initially selected region. Choose carefully based on your user geography and API endpoint locations.

Support absence: No technical support on free tier. Oracle's documentation is comprehensive but assumes enterprise knowledge. Community forums are sparse compared to AWS/GCP.

Migration Path When You Outgrow Free

The brutal truth: you probably won't outgrow free tier as quickly as you think. I've seen teams burning $500/month on AWS for workloads that fit comfortably in Oracle's free allocation.

But when you do need to scale:

Pay-as-you-go upgrade: Keep free tier resources, add paid instances. Seamless transition, no migration required.
Flex shapes: E3/E4 Flex instances let you customize CPU/RAM ratios. Start with 1 OCPU / 8GB RAM for $40/month.
Autonomous Database scaling: Enable auto-scaling to handle traffic spikes. Costs ~$120/month per OCPU.
Multi-region: Replicate your free tier setup across regions for geographic distribution.

The architecture patterns from free tier (stateless processes, external state, properly scoped security lists) transfer directly to larger deployments. You're not learning "toy" patterns that break at scale.

Why This Stack for AI Agents

After building on AWS, GCP, and various PaaS providers, Oracle Cloud free tier hits a specific sweet spot for AI agent workloads:

Predictable resources: Not a trial that expires or credits that run out
Production features: mTLS, managed database, proper networking
Geographic coverage: 41 regions globally for API latency optimization
Bandwidth allocation: 10TB/month handles serious agent traffic
Database performance: ATP free tier outperforms many paid PostgreSQL instances

The constraints force architectural discipline. Every decision — from memory management to state externalization — improves the system. My agents run more reliably on Oracle's free tier than previous iterations on paid infrastructure.

Is it perfect? No. The documentation assumes enterprise experience. The console UI feels like 2015. Community resources are limited. But for shipping production AI agents without burning venture capital on cloud bills, Oracle Cloud's free tier delivers exactly what matters: reliable infrastructure that stays free.

— Elena Revicheva · AIdeazz · Portfolio
