Originally published on AIdeazz — cross-posted here with canonical link.
When I tell other builders I run production AI agents on Oracle Cloud's free tier, I get the same look every time. The one that says "why would you do that to yourself?" But after shipping dozens of autonomous agents from Panama, I've learned something counterintuitive: Oracle's free infrastructure forces architectural decisions that make systems more reliable, not less.
The Always Free Reality Check
Oracle Cloud's Always Free tier isn't generous in the way developers expect. You get:
- 2 AMD-based Compute VMs (1/8 OCPU, 1 GB memory each)
- 2 Autonomous Databases (20 GB storage each)
- 10 GB object storage
- 10 TB outbound data transfer per month
That's it. No GPU access. No fancy vector databases. No managed Kubernetes. Just basic compute, storage, and Oracle's flagship autonomous database.
Most developers see these limits and move on. I saw constraints that would force me to build leaner agents. After burning through $3,000 in AWS bills for a client's "simple" chatbot that spiraled into complexity, I needed a different approach.
The revelation came when building a WhatsApp order-taking agent for a Panama City restaurant. Instead of spinning up a t3.medium instance and calling it a day, Oracle's limits forced me to think in terms of compute scheduling. The agent didn't need to be always-on — it needed to wake up, process messages in batches, update the database, and sleep.
This pattern of intermittent compute became the foundation for everything I've built since.
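The wake-process-sleep pattern can be sketched as a run-once entry point that a cron job or systemd timer triggers. The `fetch_pending`, `handle`, and `mark_done` callables here are hypothetical stand-ins for the WhatsApp inbox poll and the database writes, not the article's actual code.

```python
# Sketch of the wake -> batch -> persist -> exit pattern.
# fetch_pending, handle, and mark_done are illustrative stand-ins.

def run_batch(fetch_pending, handle, mark_done, batch_size=50):
    """Process at most one batch of messages, then return (and exit)."""
    messages = fetch_pending(limit=batch_size)
    if not messages:
        return 0  # nothing to do; the timer wakes us again later
    for msg in messages:
        reply = handle(msg)
        mark_done(msg["id"], reply)  # persist before moving on
    return len(messages)
```

Because the process exits after each batch, it holds memory only while there is actual work, which is what makes the 1 GB VM workable.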
Autonomous Database: The Overlooked Workhorse
Oracle's Autonomous Database changes how you think about persistence for agents. It's not just a managed database — it's a database that handles its own indexing, backup, patching, and performance tuning. For AI agents, this translates to:
Built-in JSON document store: Agent memory, conversation state, and tool outputs live naturally as JSON documents. No ORM complexity.
Automatic scaling: The free tier auto-scales between 0.02 and 0.2 OCPUs based on workload. Your agent's 3 AM batch processing doesn't consume your daytime capacity.
Native REST APIs: Every Autonomous Database exposes REST endpoints for your data. Your agent can query its memory without maintaining connection pools.
Here's what this looks like in practice. My document processing agent stores extraction results like this:
```python
# Simplified version of production code
async def store_extraction(self, doc_id: str, extraction: dict):
    await self.odb.insert_json(
        collection='extractions',
        document={
            '_id': doc_id,
            'timestamp': datetime.utcnow().isoformat(),
            'source': extraction['source'],
            'entities': extraction['entities'],
            'confidence': extraction['confidence_scores'],
            'model_used': self.current_model,
            'processing_time_ms': extraction['duration']
        }
    )
```
The database handles indexing on _id, timestamp, and any nested JSON paths I query frequently. No manual index management. No migration scripts.
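Reading those documents back can use a query-by-example filter in the SODA style that Autonomous Database's JSON store supports. The filter below is a sketch against the simplified client above; the `find` method name and the nested `confidence.overall` path are assumptions, not confirmed API.

```python
# A SODA-style query-by-example (QBE) filter for the 'extractions'
# collection. The nested path and the find() call are assumptions
# mirroring the simplified client shown above.

def low_confidence_filter(source: str, threshold: float) -> dict:
    """Match documents from one source whose overall confidence
    fell below a threshold, newest first."""
    return {
        "source": source,
        "confidence.overall": {"$lt": threshold},
        "$orderby": [{"path": "timestamp", "order": "desc"}],
    }

# e.g. docs = await self.odb.find('extractions',
#                                 low_confidence_filter('whatsapp', 0.8))
```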
mTLS by Default: Security Without the Ceremony
Every connection to Oracle Cloud infrastructure uses mutual TLS. This isn't optional — it's enforced at the infrastructure level. Your agents authenticate with client certificates, not just passwords or API keys.
This seemed like overkill until I had a client whose previous chatbot was compromised through a misconfigured Redis instance. The attacker injected prompt instructions through the cache layer. With mTLS, even if someone discovers your connection string, they can't connect without the client certificate.
Setting it up requires downloading a wallet file (a zip containing certificates and connection strings) and pointing your database drivers to it:
```python
# Oracle Autonomous Database connection via python-oracledb
connection = oracledb.connect(
    user="agent_user",
    password=os.environ['ODB_PASSWORD'],
    dsn="agentdb_high",
    config_dir="/opt/oracle/wallet",
    wallet_location="/opt/oracle/wallet",
    wallet_password=os.environ['WALLET_PASSWORD']
)
```
The friction of certificate management pays off when you're storing conversation history, extracted PII, or business logic. I've never had to write a security postmortem for an Oracle-hosted agent.
Process Supervision: The Boring Part That Matters
The free tier VMs are small, but they're actual VMs, not containers that might get recycled. This means you need real process supervision. Here's what I learned the hard way:
systemd is your friend: Every agent runs as a systemd service with automatic restarts. No custom process managers.
Memory limits are real: A 1 GB VM running Python leaves you about 600 MB for your agent after the OS. Memory leaks kill processes fast.
Disk fills up: 47 GB might seem like plenty until your agent starts logging every LLM response. Implement log rotation early.
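A minimal log-rotation setup, assuming stdlib `logging` rather than any particular logging library the agents might actually use, looks like this:

```python
import logging
from logging.handlers import RotatingFileHandler

# Rotating handler caps disk usage at roughly 5 x 10 MB so verbose
# LLM-response logging can't slowly fill the boot volume.
def make_agent_logger(path: str) -> logging.Logger:
    logger = logging.getLogger("agent")
    handler = RotatingFileHandler(path, maxBytes=10 * 1024 * 1024,
                                  backupCount=5)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

The same effect is achievable with a `logrotate` rule on the VM; the point is to decide on a cap before the first disk-full incident, not after.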
Here's a production systemd unit file for a Telegram agent:
```ini
[Unit]
Description=Telegram Support Agent
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=agent
WorkingDirectory=/opt/agents/telegram
Environment="PYTHONPATH=/opt/agents/telegram"
Environment="OCI_CONFIG_FILE=/home/agent/.oci/config"
ExecStart=/opt/agents/telegram/venv/bin/python -m agent.main
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Memory limits
MemoryMax=600M
MemorySwapMax=0

# CPU limits
CPUQuota=80%

[Install]
WantedBy=multi-user.target
```
The MemoryMax and CPUQuota settings prevent one misbehaving agent from taking down the entire VM. When the agent hits memory limits, systemd restarts it cleanly.
Multi-Model Routing on Minimal Resources
Running Claude, GPT-4, and Groq inference on 1 GB of RAM sounds impossible because it is. The free tier forces a routing architecture where your agents are coordinators, not inference hosts.
My production setup:
- Groq: High-volume, low-latency tasks (intent classification, entity extraction)
- Claude: Complex reasoning, nuanced responses, safety-critical decisions
- GPT-4: Fallback for specific capabilities (function calling, structured outputs)
The routing logic lives in a base agent class:
```python
class ModelRouter:
    def __init__(self):
        self.groq_client = Groq(api_key=os.environ['GROQ_KEY'])
        self.claude_client = anthropic.Anthropic(api_key=os.environ['CLAUDE_KEY'])
        self.openai_client = openai.OpenAI(api_key=os.environ['OPENAI_KEY'])

    async def route_completion(self, task_type: str, prompt: str, **kwargs):
        if task_type == 'classification':
            return await self._groq_complete(prompt, model='mixtral-8x7b-32768')
        elif task_type == 'analysis':
            return await self._claude_complete(prompt, model='claude-3-opus')
        elif task_type == 'function_call':
            return await self._openai_complete(prompt, model='gpt-4')
        else:
            # Fallback to cheapest option
            return await self._groq_complete(prompt)
```
This architecture means a single VM can handle hundreds of concurrent conversations by offloading inference to APIs while maintaining state locally.
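The "GPT-4 as fallback" idea generalizes to a provider chain: try the preferred model, fall back to the next on failure. This is a hedged sketch, not the production router; the `providers` callables stand in for async wrappers like `_groq_complete` or `_openai_complete`.

```python
import asyncio

# Sketch of a fallback chain: try each provider in order, with a
# brief backoff, and raise only if every one fails.
async def complete_with_fallback(prompt: str, providers: list,
                                 retries: int = 1):
    last_error = None
    for provider in providers:
        for _ in range(retries + 1):
            try:
                return await provider(prompt)
            except Exception as exc:  # outage, rate limit, timeout...
                last_error = exc
                await asyncio.sleep(0.5)  # brief backoff before retrying
    raise RuntimeError("all providers failed") from last_error
```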
Operational Lessons from Constraint
After two years of running agents on Oracle's free tier, patterns emerged:
Batch everything: Instead of processing messages as they arrive, accumulate them and process in batches. This lets you shut down agents between runs, saving memory.
State in database, not memory: Every piece of agent state lives in Autonomous Database. Agents can restart without losing context.
Monitoring through database: Since Autonomous Database includes performance insights, you can monitor agent behavior through SQL queries instead of external APM tools.
Geographic arbitrage: Oracle's free tier is available in all regions. I run agents in São Paulo for Latin American clients, Mumbai for Asian clients. Latency matters for conversational AI.
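The "state in database, not memory" rule above reduces to one discipline: write every mutation through to the document store before acting on it. A minimal sketch, using any dict-like store in place of the Autonomous Database collection:

```python
# Write-through conversation state. `store` is any dict-like JSON
# document client; in production the Autonomous Database collection
# plays this role, so a restarted agent recovers exactly where the
# previous process stopped.
class ConversationState:
    def __init__(self, store, conversation_id: str):
        self.store = store
        self.key = conversation_id
        # On (re)start, recover whatever the last process persisted.
        self.data = store.get(self.key, {"turns": []})

    def add_turn(self, role: str, text: str):
        self.data["turns"].append({"role": role, "text": text})
        self.store[self.key] = self.data  # persist before responding
```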
The most surprising outcome? These constraints led to more reliable systems. When you can't throw resources at problems, you build better architectures. My Oracle-hosted agents have better uptime than the over-provisioned AWS deployments they replaced.
Gotchas and Realistic Expectations
Oracle Cloud's free tier has sharp edges:
Account suspension: Oracle suspends idle accounts. You need real workloads running to keep your account active.
No support: Free tier accounts get no support. When region-wide issues occur (rare but happens), you wait like everyone else.
Upgrade pressure: Oracle regularly emails about upgrading. The emails look urgent but you can ignore them if you're within free tier limits.
API availability: Some OCI services aren't available in all regions on free tier. Check region availability before architecting.
Learning curve: Oracle's documentation assumes enterprise experience. Budget time for learning OCI concepts.
The Unexpected Benefit
Building on extreme constraints taught me to see infrastructure differently. When a client now asks for an AI agent, I don't start with "what's your budget?" I start with "what's the minimum compute needed?"
This shift matters because AI agent economics are broken. Most agents lose money on every interaction when you factor in infrastructure. By proving agents can run profitably on free infrastructure, we change the conversation from "can we afford AI?" to "what problems should AI solve?"
Oracle's free tier isn't for everyone. If you need GPUs, managed vector databases, or always-on compute, look elsewhere. But if you're building practical agents that solve real problems — document extraction, customer support, order processing — the constraints might be exactly what you need.
The restaurant WhatsApp agent I mentioned earlier? Still running on the same free tier account after 18 months. It's processed over 50,000 orders, saved the owner 20 hours per week, and costs exactly $0 in infrastructure.
That's the real lesson: sometimes the best infrastructure is the one that forces you to build better software.