Originally published on AIdeazz — cross-posted here with canonical link.
After burning through $3,000 in AWS credits last quarter running production agents, I made a decision that raised eyebrows: migrate everything to Oracle Cloud's free tier. Six months later, my multi-agent systems serve 400+ daily users without touching my credit card. Here's the infrastructure reality behind that choice.
The Economics of Running Always-On Agents
Production agents eat compute differently than traditional apps. My WhatsApp customer service bot processes 3,000 messages daily, each triggering Claude API calls, vector searches, and state management. The Telegram code review agent runs continuous background jobs. These aren't request-response microservices—they're persistent processes with memory.
On AWS, this meant:
- EC2 t3.small instances: $15/month each (needed 3 for redundancy)
- RDS Postgres: $25/month for the smallest production setup
- NAT Gateway: $45/month (the silent killer)
- Data transfer: $20-50/month depending on traffic
Oracle Cloud free tier gives you:
- 4 Arm-based Ampere A1 OCPUs with 24GB RAM (splittable across up to 4 VMs)
- 2 AMD-based Micro compute instances
- 2 Autonomous Databases (20GB each)
- 10TB outbound data transfer monthly
The math is straightforward: $0 vs $150+/month for equivalent resources. But the real story is in the operational details.
Autonomous Database: The Overlooked Agent Backbone
Everyone talks about LLMs and embeddings. Nobody talks about state management at scale. Oracle's Autonomous Database became my agent memory solution—not because it's fancy, but because it handles the boring parts automatically.
My agent architecture stores:
- Conversation history with vector embeddings
- User context and preferences
- Rate limiting counters
- Async job queues
- Checkpoint states for long-running workflows
The database self-tunes, auto-scales within free tier limits, and handles backups. No manual vacuuming, no index bloat, no 3am pages about connection pools. The built-in JSON support means I store Claude responses directly without ORM overhead:
INSERT INTO agent_memory (
    user_id,
    conversation,
    embedding,
    metadata
) VALUES (
    :user_id,
    JSON(:claude_response),
    :ada_embedding,
    JSON_OBJECT('model'  VALUE 'claude-3-sonnet',
                'tokens' VALUE :token_count)
);
The 20GB limit per database forces good hygiene. I partition old conversations to object storage, keeping only active embeddings hot. This constraint improved my architecture—infinite storage encourages lazy design.
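That partitioning step reduces to a small amount of glue. Here's a minimal sketch, assuming each conversation row carries a `last_seen` timestamp and a pluggable `upload(key, payload)` callable standing in for the OCI Object Storage client; the 30-day window and key layout are illustrative, not my production values:

```python
import json
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # assumption: 30-day hot window

def split_hot_cold(conversations, now=None):
    """Split conversation rows into hot (keep in Autonomous DB)
    and cold (archive to object storage) by last activity."""
    now = now or datetime.now(timezone.utc)
    hot, cold = [], []
    for row in conversations:
        (cold if now - row["last_seen"] > STALE_AFTER else hot).append(row)
    return hot, cold

def archive(cold_rows, upload):
    """Serialize cold rows and hand them to an uploader callable.
    `upload(key, payload)` is a stand-in for the real object-storage call."""
    for row in cold_rows:
        key = f"archive/{row['user_id']}/{row['last_seen']:%Y%m%d}.json"
        upload(key, json.dumps(row, default=str))
```

A scheduled job runs the split hourly, archives the cold rows, then deletes them from the hot table in the same transaction.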
mTLS and Network Security Without the Ceremony
Oracle enforces mTLS for Autonomous Database connections. Initially annoying, now essential for my distributed agent setup. Each agent VM gets its own wallet, preventing the security theater of hardcoded connection strings.
My setup:
- Generate wallet per agent service
- Mount as Kubernetes secrets (yes, Oracle free tier runs K3s fine)
- Rotate quarterly via simple automation
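Wiring a mounted wallet into python-oracledb (thin mode) comes down to pointing `config_dir` and `wallet_location` at the secret mount. The mount path and environment variable names below are this sketch's conventions, not Oracle's:

```python
import os

def wallet_pool_params(wallet_dir="/etc/oracle/wallet"):
    """Build python-oracledb create_pool kwargs for an mTLS wallet
    mounted as a Kubernetes secret. Env var names are project
    conventions, not anything the driver mandates."""
    return {
        "user": os.environ["DB_USER"],
        "password": os.environ["DB_PASS"],
        "dsn": os.environ["DB_DSN"],     # TNS alias from the wallet's tnsnames.ora
        "config_dir": wallet_dir,        # tnsnames.ora / sqlnet.ora location
        "wallet_location": wallet_dir,   # mTLS client certs (cwallet.sso)
        "min": 1, "max": 4, "increment": 1,
    }

# pool = oracledb.create_pool(**wallet_pool_params())
```

Quarterly rotation then means regenerating the wallet, updating the Kubernetes secret, and restarting the agent pods — no code changes.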
The network security model is refreshingly rigid. No public endpoints by default—you explicitly allow traffic. This forced me to properly architect agent communication through private subnets, eliminating an entire class of exposure risks.
Real example: My Groq router (balances requests across Groq/Claude based on load) runs on a private subnet, accessible only from agent VMs. External webhooks hit a reverse proxy with rate limiting. Simple topology, enforced by default constraints.
VM Shapes and the Reality of Agent Workloads
Oracle's free ARM instances outperform what you'd expect. The 4 OCPU ARM cores handle my Python agents better than AWS t3.medium instances. Here's the actual resource usage from production:
WhatsApp Business Agent (1 OCPU, 6GB RAM):
- Handles 100 concurrent conversations
- Vector search across 50k documents
- 15ms p95 response time to webhook
- CPU: 40% average, 80% peak
Telegram Code Review Agent (2 OCPU, 12GB RAM):
- Processes GitHub webhooks
- Runs AST analysis before LLM calls
- Manages diff queues for large PRs
- CPU: 60% average during business hours
Multi-Model Router (1 OCPU, 6GB RAM):
- Groq Llama for simple queries
- Claude for complex reasoning
- Tracks rate limits and fallbacks
- CPU: 25% average
The free tier's 200GB block storage seems limiting until you realize agents shouldn't store much locally. Conversation logs go to Autonomous DB, file uploads to object storage, everything else is ephemeral.
Keeping Agents Alive: The Boring Critical Path
Production agents die in predictable ways. Memory leaks from long-running Python processes. Webhook timeouts. Rate limit exhaustion. Database connection pool starvation. The infrastructure must handle these gracefully.
My monitoring stack on Oracle free tier:
- Systemd for process management (automatic restarts)
- Prometheus node exporter (1% resource overhead)
- Custom health checks every 30 seconds
- Dead letter queues in Autonomous DB
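The dead letter queue boils down to a wrapper around each message handler: retry transient failures, then park the message for later inspection. A self-contained sketch with an in-memory sink (production writes to a table in Autonomous DB; the retry count is illustrative):

```python
import time

def with_dead_letter(handler, dead_letters, max_attempts=3):
    """Wrap a message handler: retry on failure, and after
    max_attempts park the message in a dead-letter sink."""
    def wrapped(msg):
        for attempt in range(1, max_attempts + 1):
            try:
                return handler(msg)
            except Exception as exc:
                if attempt == max_attempts:
                    dead_letters.append({"msg": msg, "error": str(exc)})
                    return None
                time.sleep(0)  # backoff between attempts elided in this sketch
    return wrapped
```

A periodic job drains the dead-letter table, alerting when entries accumulate faster than they can be replayed.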
Example systemd unit that's saved me dozens of incidents:
[Unit]
StartLimitBurst=5
StartLimitIntervalSec=60

[Service]
Type=simple
Restart=always
RestartSec=10
MemoryMax=5G
MemoryAccounting=true
ExecStart=/home/ubuntu/venv/bin/python agent.py
StandardOutput=journal
StandardError=journal
The MemoryMax prevents runaway processes. StartLimitBurst stops crash loops from hammering APIs. Simple, boring, effective.
For distributed state, I use Autonomous DB's built-in job scheduler:
BEGIN
    DBMS_SCHEDULER.create_job(
        job_name        => 'cleanup_stale_conversations',
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'BEGIN cleanup_old_chats(); END;',
        repeat_interval => 'FREQ=HOURLY',
        enabled         => TRUE
    );
END;
No external cron, no Kubernetes jobs, no Lambda functions. The database handles it.
Integration Patterns That Actually Scale
The free tier constraints shaped better patterns. Limited compute means aggressive caching. Fixed database size means data lifecycle policies. No managed Kubernetes means simple deployment.
My standard agent template:
- FastAPI app with built-in health checks
- python-oracledb (thin mode) connections to Autonomous DB
- Redis-compatible caching (Valkey on separate VM)
- Webhook endpoints with exponential backoff
- Structured logging to local disk (rotated)
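The webhook backoff follows the usual doubling-with-a-cap schedule. A sketch of the delay calculation — the base, cap, and retry count here are illustrative, not the production values:

```python
import random

def backoff_delays(retries=5, base=0.5, cap=30.0, jitter=False):
    """Exponential backoff schedule for webhook redelivery:
    base * 2^attempt, capped, with optional full jitter."""
    delays = []
    for attempt in range(retries):
        d = min(cap, base * (2 ** attempt))
        if jitter:
            d = random.uniform(0, d)  # full jitter avoids thundering herds
        delays.append(d)
    return delays
```

In practice jitter stays on so a burst of failed deliveries doesn't retry in lockstep against an already-struggling upstream.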
Real code from production WhatsApp agent:
import os

import oracledb
from redis import Redis  # redis-py client, pointed at the Valkey VM

class AgentCore:
    def __init__(self):
        self.db = oracledb.create_pool(
            user=os.environ['DB_USER'],
            password=os.environ['DB_PASS'],
            dsn=os.environ['DB_DSN'],
            min=2, max=10, increment=1
        )
        self.cache = Redis(host='10.0.0.5', decode_responses=True)
        self.llm_router = LLMRouter(  # project-internal router class
            groq_key=os.environ['GROQ_KEY'],
            anthropic_key=os.environ['ANTHROPIC_KEY']
        )

    async def process_message(self, user_id: str, message: str):
        # Check rate limits
        if not await self._check_rate_limit(user_id):
            return "Please wait before sending another message"
        # Get conversation context
        context = await self._get_context(user_id)
        # Route to appropriate model
        response = await self.llm_router.complete(
            message=message,
            context=context,
            complexity=self._assess_complexity(message)
        )
        # Store in database
        await self._store_interaction(user_id, message, response)
        return response
Nothing clever, just solid patterns that survive production.
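The `_check_rate_limit` call hides a fixed-window counter. A self-contained version with the same semantics as the Valkey INCR + EXPIRE implementation — the limit and window below are illustrative:

```python
class FixedWindowLimiter:
    """In-memory sketch of the per-user rate limit check. Production
    uses Valkey INCR with an EXPIRE matching the window length."""

    def __init__(self, limit=10, window=60):
        self.limit, self.window = limit, window
        self._counts = {}  # (user_id, window_index) -> count

    def allow(self, user_id, now):
        """Return True if this user is still under limit for the
        current window; `now` is a timestamp in seconds."""
        key = (user_id, int(now // self.window))
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self.limit
```

Fixed windows allow a brief burst at window boundaries; that's an acceptable trade for a chat bot, where the limit exists to stop abuse rather than to meter billing.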
Migration Realities and Gotchas
Moving from AWS revealed hidden dependencies. RDS PostgreSQL has subtle differences from Autonomous DB. S3 APIs differ from Oracle Object Storage. Network topology requires rethinking.
Specific issues I hit:
- Database connections: Oracle uses wallets, not connection strings. Solution: Environment-specific initialization scripts.
- Object storage: Different signing process for presigned URLs. Solution: Abstraction layer for storage operations.
- Monitoring: No CloudWatch equivalent. Solution: Self-hosted Grafana on free tier compute.
- Secrets management: No AWS Secrets Manager. Solution: Encrypted files in object storage, keys in environment variables.
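The storage abstraction layer can be as small as an interface plus one adapter per backend. A sketch — the method names are this example's, not any SDK's, and the in-memory adapter stands in for wrappers around boto3 or the OCI client:

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """Thin abstraction so agent code doesn't care whether blobs
    live in S3 or OCI Object Storage."""

    @abstractmethod
    def put(self, key, data): ...

    @abstractmethod
    def share_url(self, key, ttl_seconds): ...

class MemoryStore(BlobStore):
    """In-memory stand-in used for tests. Real backends wrap boto3
    presigned URLs or OCI pre-authenticated requests behind share_url."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def share_url(self, key, ttl_seconds):
        return f"memory://{key}?ttl={ttl_seconds}"
```

Because S3 presigned URLs and OCI pre-authenticated requests are generated completely differently, hiding them behind one `share_url` call is what made the migration a config change instead of a rewrite.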
The migration took 3 weeks of nights and weekends. Not seamless, but manageable.
When Free Tier Isn't Enough
Oracle Cloud free tier has hard limits. You'll hit them with:
- More than 1000 concurrent users
- Real-time video/audio processing
- Large model fine-tuning
- Multi-region deployment
My escape hatches:
- Groq for high-volume inference ($0.10/million tokens vs Claude's $3)
- Cloudflare R2 for blob storage overflow
- Hetzner boxes for GPU workloads
- Oracle paid tier only for specific overages
The free tier handles 80% of my workload. The remaining 20% costs $50/month across providers—still 70% less than pure AWS.
The Verdict After Six Months
Oracle Cloud free tier works for production agents if you embrace its constraints. It's not about the free resources—it's about forced architectural discipline. Limited compute means efficient code. Fixed database size means data lifecycle management. No managed services means understanding your stack.
My agents serve real customers, handle production load, and maintain 99.9% uptime (measured, not promised). The infrastructure cost: $0 for Oracle resources, ~$50/month for LLM APIs and overflow compute.
For developers building agent systems: try the Oracle Cloud free tier for a proof of concept. The worst case? You learn infrastructure patterns that work anywhere. Best case? You run production workloads without AWS bills.
The future isn't about unlimited resources. It's about doing more with less, and Oracle's free tier accidentally enforces that discipline.