My Oracle Cloud free tier instance sat running Nginx for three months. Four ARM cores, 24GB of RAM, 200 Mbps network — serving a static HTML page. That's like buying a Porsche to drive to the mailbox.
The free ARM tier is absurdly overpowered. So I started experimenting. Here are five non-obvious things you can actually run on it that justify the spec.
1. Self-Hosted LLM Inference with Ollama
What it is: Run a local LLM API endpoint that your apps can call instead of paying per-token to OpenAI.
Why this spec: 24GB RAM is the magic number. A 7B quantized model (Q4_K_M) fits in ~4.5GB, leaving headroom for a 13B model or multiple concurrent 3B models. llama.cpp has native ARM/NEON optimizations — inference is genuinely fast, not just "acceptable."
Tool: Ollama + Llama 3.2 3B or Mistral 7B
Setup:
- `curl -fsSL https://ollama.com/install.sh | sh` (the ARM binary installs cleanly)
- `ollama pull llama3.2`, then `ollama serve` to expose the API on port 11434
- Put Caddy in front as a reverse proxy with a bearer-token check

Or run it in Docker instead:
```bash
docker run -d \
  --name ollama \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  ollama/ollama
```
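Once it's serving, a quick smoke test against the native API (the port and the `llama3.2` tag match the setup above; adjust if yours differ):

```bash
# Non-streaming generation request against Ollama's native API.
# Run from the instance itself, or swap localhost for its public IP.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain ARM NEON in one sentence.",
  "stream": false
}'
```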
Resource usage: ~15–25% CPU during inference, ~5–8GB RAM while a 7B model is loaded (Ollama unloads idle models after a few minutes), near-zero when idle.
2. Remote Build Cache Server (Bazel / Turborepo)
What it is: A persistent cache layer that stores compiled artifacts so your CI or teammates never rebuild the same code twice.
Why this spec: Build caches are network-I/O bound, not compute-bound. The server sits at ~2% CPU most of the time and bursts briefly when serving cache hits. 24GB RAM means you can keep a massive in-memory index. Oracle's 200 Mbps network means cache fetches feel instant.
Tool: Bazel Remote Cache or Turborepo Remote Cache (ducktape)
Setup:
- Deploy `buchgr/bazel-remote` via Docker with `--max_size=20` (a 20GB cache on disk)
- Point your `~/.bazelrc` at it with `--remote_cache=http://YOUR_IP:9090` (full snippet below)
- Add digest auth; your team's CI gets cache hits in seconds instead of rebuilds that take minutes
```bash
docker run -d \
  -p 9090:9090 \
  -v /data/bazel-cache:/data \
  buchgr/bazel-remote \
  --max_size=20
```
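The client side is two lines of config. A minimal sketch, with `YOUR_IP` standing in for the instance's public address:

```bash
# Append remote-cache settings to the local Bazel config. YOUR_IP is a
# placeholder. --remote_upload_local_results pushes locally built
# artifacts back to the cache so teammates and CI can reuse them.
cat >> ~/.bazelrc <<'EOF'
build --remote_cache=http://YOUR_IP:9090
build --remote_upload_local_results=true
EOF
```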
Resource usage: 1–3% CPU idle, burst to 15% on parallel pushes, ~1–2GB RAM, predictable disk I/O.
3. Personal Data Pipeline (Airbyte OSS + DuckDB)
What it is: A self-hosted ETL platform that syncs data from APIs, databases, and SaaS tools into a local analytical store — no Fivetran bill.
Why this spec: Airbyte's orchestration containers are memory-hungry at startup (~6GB for the stack). Four ARM cores handle parallel sync workers without throttling. DuckDB runs analytics queries directly on Parquet files in memory — 24GB lets you process mid-size datasets without spilling to disk.
Tool: Airbyte OSS + DuckDB
Setup:
- `git clone https://github.com/airbytehq/airbyte && cd airbyte && ./run-ab-platform.sh`
- Configure sources (Postgres, Stripe, GitHub) and sync to local S3-compatible storage (MinIO)
- Query the synced Parquet files with DuckDB: `SELECT * FROM read_parquet('/data/stripe/*.parquet')`
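For something beyond `SELECT *`, here's a sketch of the kind of rollup this setup makes trivial; the stream path and column names are illustrative assumptions, not what Airbyte will emit by default:

```bash
# Monthly revenue rollup straight from the shell. The stream layout and
# column names (created, amount) are illustrative, not Airbyte defaults.
duckdb -c "
  SELECT date_trunc('month', created) AS month,
         sum(amount) / 100.0          AS revenue_usd
  FROM read_parquet('/data/stripe/*.parquet')
  GROUP BY 1
  ORDER BY 1;
"
```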
Resource usage: ~30–40% CPU during active syncs, ~8GB RAM for full Airbyte stack, near-zero between runs.
4. Game Server Orchestrator (Pterodactyl + Valheim)
What it is: A web-based panel for spinning up and managing multiple game servers — Minecraft, Valheim, Terraria — with one interface.
Why this spec: ARM binaries for popular game servers are now mainstream. Valheim dedicated server runs at ~1.5GB RAM; Minecraft Paper at ~2–4GB. With 24GB, you host 3–4 game worlds simultaneously. Four cores handle the tick loops without contention.
Tool: Pterodactyl Panel + Valheim or Paper Minecraft
Setup:
- Install the Pterodactyl panel (a standard PHP + MySQL setup), then its Wings daemon, which runs the game servers in Docker
- Import community ARM-compatible egg configs for Valheim and Minecraft
- Set per-server RAM limits in the panel UI; open each game's UDP ports in the firewall (2456–2458 for Valheim; see the example below)
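For the firewall step, the OS-level half looks something like this (a sketch assuming Ubuntu with `iptables-persistent`; the VCN side is covered in the gotchas section):

```bash
# Allow Valheim's UDP range at the OS level. This covers only one of
# Oracle's two firewall layers; add the same ingress rule to the VCN
# Security List or players still won't connect.
sudo iptables -I INPUT -p udp --dport 2456:2458 -j ACCEPT
sudo netfilter-persistent save   # assumes the iptables-persistent package
```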
Resource usage: 40–70% CPU under active player load, 10–18GB RAM for 3 concurrent servers, ~0 when empty.
5. Browser Automation Farm (Playwright via Browserless)
What it is: A pool of headless Chromium instances you can hit via API for scraping, screenshot generation, PDF rendering, or test execution.
Why this spec: Headless Chrome is brutally RAM-hungry — each instance eats 200–400MB. With 24GB, you run 20–40 concurrent sessions. ARM Chromium builds are now first-class. CPU usage spikes during JS-heavy page rendering but normalizes quickly.
Tool: Browserless (self-hosted) or raw Playwright server
Setup:
- `docker run -d -p 3000:3000 -e MAX_CONCURRENT_SESSIONS=20 ghcr.io/browserless/chromium`
- Hit `ws://YOUR_IP:3000` from Playwright: `chromium.connect({ wsEndpoint: ... })`
- Set the `TOKEN` env var and add an iptables rule restricting access to your CI IP range
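For one-shot jobs you can skip Playwright entirely and hit Browserless's REST endpoints. A sketch using its screenshot route, assuming the `TOKEN` you set on the container:

```bash
# One-shot screenshot through Browserless's REST API; no client library
# needed. TOKEN must match the container's TOKEN environment variable.
curl -X POST "http://YOUR_IP:3000/screenshot?token=$TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com"}' \
  -o example.png
```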
Resource usage: ~5% idle, 60–80% CPU during active sessions, up to 16GB RAM at 20 concurrent sessions.
Resource Comparison at a Glance
| Use Case | Avg CPU | RAM Usage | Idle Cost |
|---|---|---|---|
| LLM Inference (Ollama) | 20% | 5–8 GB | Very low |
| Build Cache Server | 2–3% | 1–2 GB | Near zero |
| Data Pipeline (Airbyte) | 35% | 8 GB | Low (scheduled) |
| Game Server Orchestrator | 55% | 10–18 GB | Medium |
| Browser Automation Farm | 65% | 10–16 GB | Low |
Oracle Cloud Gotchas Nobody Tells You
- The firewall has two layers: OS-level `iptables` AND Oracle's Security List in the VCN console. Opening a port in one and not the other will drive you insane. Always update both.
- ARM-only binaries aren't always obvious. Some Docker images silently pull x86 and run under emulation. Always check `docker inspect <image> | grep Architecture`. Multi-arch images are your friend.
- Idle termination is real. Oracle can reclaim "idle" Always Free instances. Run a lightweight cron job that pings an endpoint or does a small write every hour to prove activity (a sketch follows this list).
- No guaranteed IPv4 egress SLA. Outbound traffic is unmetered but not prioritized. If you're doing bulk scraping, expect variability.
- Account holds happen. Free-tier accounts with unusual traffic patterns (port-scanning signatures, bulk outbound) get flagged. Keep workloads clearly inbound-serving.
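For the idle-termination point, the keepalive can be as dumb as a cron entry; a sketch (tune the workload if Oracle's reclamation thresholds get stricter):

```bash
# Hourly keepalive: a small burst of CPU and disk writes so the instance
# registers activity. /etc/cron.d entries need the user field (root).
echo '0 * * * * root dd if=/dev/urandom of=/tmp/keepalive bs=1M count=64 && rm -f /tmp/keepalive' \
  | sudo tee /etc/cron.d/keepalive
```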
My Pick
If I could only run one thing: Ollama with a 7B model. The spec fit is almost suspiciously perfect: 24GB of RAM holds the model weights with room to breathe, ARM's NEON extensions deliver real inference speed, and the OpenAI-compatible endpoint drops into any existing OpenAI SDK call with an env var change (shown below). It turns a free server into a private, unlimited LLM backend. Everything else on this list is useful; this one is genuinely game-changing at a price of $0.
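Concretely, that env var change looks like this with the official OpenAI SDKs, which read both variables from the environment:

```bash
# Point existing OpenAI SDK code at the self-hosted endpoint. Ollama
# serves an OpenAI-compatible API under /v1; the key is a placeholder
# because the SDKs refuse to start without one.
export OPENAI_BASE_URL="http://YOUR_IP:11434/v1"
export OPENAI_API_KEY="not-needed-by-ollama"
```

Existing chat-completions code then runs against the free instance unchanged.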