I Had a Free Oracle Cloud ARM Box With 24GB RAM — So I Got Weird With It

My Oracle Cloud free tier instance sat running Nginx for three months. Four ARM cores, 24GB of RAM, 200 Mbps network — serving a static HTML page. That's like buying a Porsche to drive to the mailbox.

The free ARM tier is absurdly overpowered. So I started experimenting. Here are five non-obvious things you can actually run on it that justify the spec.


1. Self-Hosted LLM Inference with Ollama

What it is: Run a local LLM API endpoint that your apps can call instead of paying per-token to OpenAI.

Why this spec: 24GB RAM is the magic number. A 7B quantized model (Q4_K_M) fits in ~4.5GB, leaving headroom for a 13B model or multiple concurrent 3B models. llama.cpp has native ARM/NEON optimizations — inference is genuinely fast, not just "acceptable."

Tool: Ollama + Llama 3.2 3B or Mistral 7B

Setup:

  • curl -fsSL https://ollama.com/install.sh | sh — ARM binary installs cleanly
  • ollama pull llama3.2 then ollama serve to expose on port 11434
  • Reverse proxy via Caddy with a bearer token check in front (a sketch follows at the end of this section)
docker run -d \
  --name ollama \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  ollama/ollama
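
Once the container (or the native install) is up, a quick smoke test confirms the endpoint is live. This hits Ollama's standard /api/generate route; the prompt is just an example:

# Ask the local model a question; "stream": false returns a single JSON blob
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain ARM NEON in one sentence.",
  "stream": false
}'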

Resource usage: ~15–25% CPU during inference, ~5–8GB RAM at rest for a 7B model, near-zero when idle.
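
And for the Caddy bullet in the setup, a minimal sketch of the bearer-token gate. The domain and token here are placeholders; swap in your own:

# Hypothetical Caddyfile: reject requests without the expected token,
# proxy everything else to Ollama on localhost
sudo tee /etc/caddy/Caddyfile >/dev/null <<'EOF'
llm.example.com {
    @noauth not header Authorization "Bearer CHANGE_ME_LONG_RANDOM"
    respond @noauth 401
    reverse_proxy localhost:11434
}
EOF
sudo systemctl reload caddy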


2. Remote Build Cache Server (Bazel / Turborepo)

What it is: A persistent cache layer that stores compiled artifacts so your CI or teammates never rebuild the same code twice.

Why this spec: Build caches are network-I/O bound, not compute-bound. The server sits at ~2% CPU most of the time and bursts briefly when serving cache hits. 24GB RAM means you can keep a massive in-memory index. Oracle's 200 Mbps network means cache fetches feel instant.

Tool: Bazel Remote Cache or Turborepo Remote Cache (ducktors/turborepo-remote-cache)

Setup:

  • Deploy buchgr/bazel-remote via Docker with --max_size 20 (20GB cache on disk)
  • Point your ~/.bazelrc at --remote_cache=http://YOUR_IP:9090
  • Add auth via bazel-remote's htpasswd support; your team's CI then gets cache hits in seconds instead of rebuilds in minutes
docker run -d \
  -p 9090:9090 \
  -v /data/bazel-cache:/data \
  buchgr/bazel-remote \
  --max_size=20
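
The client side of the second bullet is two lines in your bazelrc. A sketch assuming the cache from the Docker command above (YOUR_IP is a placeholder):

# Append remote-cache settings to the user-level bazelrc
cat >> ~/.bazelrc <<'EOF'
build --remote_cache=http://YOUR_IP:9090
build --remote_upload_local_results=true
EOF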

Resource usage: 1–3% CPU idle, burst to 15% on parallel pushes, ~1–2GB RAM, predictable disk I/O.


3. Personal Data Pipeline (Airbyte OSS + DuckDB)

What it is: A self-hosted ETL platform that syncs data from APIs, databases, and SaaS tools into a local analytical store — no Fivetran bill.

Why this spec: Airbyte's orchestration containers are memory-hungry at startup (~6GB for the stack). Four ARM cores handle parallel sync workers without throttling. DuckDB runs analytics queries directly on Parquet files in memory — 24GB lets you process mid-size datasets without spilling to disk.

Tool: Airbyte OSS + DuckDB

Setup:

  • git clone https://github.com/airbytehq/airbyte && cd airbyte && ./run-ab-platform.sh
  • Configure sources (Postgres, Stripe, GitHub) and sync to local S3-compatible storage (MinIO)
  • Query synced Parquet files with DuckDB: SELECT * FROM read_parquet('/data/stripe/*.parquet')
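
As a concrete version of that last step, a sketch using the DuckDB CLI. The path and the column names (created, amount) are illustrative; they depend on how your sync lays out the data:

# Monthly revenue straight off the synced Parquet files, no warehouse needed
duckdb -c "
  SELECT date_trunc('month', to_timestamp(created)) AS month,
         sum(amount) / 100.0 AS revenue_usd
  FROM read_parquet('/data/stripe/*.parquet')
  GROUP BY 1 ORDER BY 1;
"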

Resource usage: ~30–40% CPU during active syncs, ~8GB RAM for full Airbyte stack, near-zero between runs.


4. Game Server Orchestrator (Pterodactyl + Valheim)

What it is: A web-based panel for spinning up and managing multiple game servers — Minecraft, Valheim, Terraria — with one interface.

Why this spec: ARM binaries for popular game servers are now mainstream. Valheim dedicated server runs at ~1.5GB RAM; Minecraft Paper at ~2–4GB. With 24GB, you host 3–4 game worlds simultaneously. Four cores handle the tick loops without contention.

Tool: Pterodactyl Panel + Valheim or Paper Minecraft

Setup:

  • Install the Pterodactyl panel and its Wings daemon: a standard Docker + MySQL setup
  • Import community ARM-compatible egg configs for Valheim and Minecraft
  • Set per-server RAM limits in the panel UI; firewall UDP ports per game (2456–2458 for Valheim)
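
For the firewall half of that last bullet, the OS-level rules look like this (mirror the same ports in Oracle's VCN Security List; see the gotchas below):

# Open Valheim's UDP port range at the OS layer
sudo iptables -I INPUT -p udp --dport 2456:2458 -j ACCEPT
# Persist across reboots (assumes the iptables-persistent package is installed)
sudo netfilter-persistent save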

Resource usage: 40–70% CPU under active player load, 10–18GB RAM for 3 concurrent servers, ~0 when empty.


5. Browser Automation Farm (Playwright via Browserless)

What it is: A pool of headless Chromium instances you can hit via API for scraping, screenshot generation, PDF rendering, or test execution.

Why this spec: Headless Chrome is brutally RAM-hungry — each instance eats 200–400MB. With 24GB, you run 20–40 concurrent sessions. ARM Chromium builds are now first-class. CPU usage spikes during JS-heavy page rendering but normalizes quickly.

Tool: Browserless (self-hosted) or raw Playwright server

Setup:

  • docker run -d -p 3000:3000 -e MAX_CONCURRENT_SESSIONS=20 ghcr.io/browserless/chromium
  • Hit ws://YOUR_IP:3000 from Playwright: chromium.connect({ wsEndpoint: ... })
  • Set TOKEN env var and add iptables rule to restrict access to your CI IP range
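
Beyond the WebSocket endpoint, Browserless exposes REST routes that make for an easy smoke test. A sketch assuming its /screenshot endpoint and the TOKEN you set on the container:

# Render a page to PNG without writing any Playwright code
curl -X POST "http://YOUR_IP:3000/screenshot?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  -o example.png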

Resource usage: ~5% idle, 60–80% CPU during active sessions, up to 16GB RAM at 20 concurrent sessions.


Resource Comparison at a Glance

Use Case                   Avg CPU   RAM Usage   Idle Cost
LLM Inference (Ollama)     20%       5–8 GB      Very low
Build Cache Server         2–3%      1–2 GB      Near zero
Data Pipeline (Airbyte)    35%       8 GB        Low (scheduled)
Game Server Orchestrator   55%       12–18 GB    Medium
Browser Automation Farm    65%       10–16 GB    Low

Oracle Cloud Gotchas Nobody Tells You

  • The firewall has two layers. OS-level iptables AND Oracle's Security List in the VCN console. Opening a port in one and not the other will drive you insane. Always update both.
  • ARM-only binaries aren't always obvious. Some Docker images silently pull x86 and run under emulation. Always check docker inspect <image> | grep Architecture. Multi-arch images are your friend.
  • Idle termination is real. Oracle can reclaim "idle" instances. Run a lightweight cron job that pings an endpoint or does a small write every hour to prove activity (a sketch follows this list).
  • No guaranteed IPv4 egress SLA. Outbound traffic is unmetered but not prioritized. If you're doing bulk scraping, expect variability.
  • Account holds happen. Free tier accounts with unusual traffic patterns (port scanning signatures, bulk outbound) get flagged. Keep workloads clearly inbound-serving.
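
For the idle-termination point, the keep-alive can be a single crontab line. A minimal sketch; the URL and log path are placeholders:

# Hourly: touch the network and the disk so the box never looks idle
( crontab -l 2>/dev/null; echo '0 * * * * curl -s https://example.com >/dev/null; date >> $HOME/keepalive.log' ) | crontab -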

My Pick

If I could only run one thing: Ollama with a 7B model. The spec fit is almost suspiciously perfect — 24GB RAM handles the model weights with room to breathe, ARM's NEON extensions give real inference speed, and the API-compatible endpoint drops into any existing OpenAI SDK call with one env var change. It turns a free server into a private, unlimited LLM backend. Everything else on this list is useful. This one is genuinely game-changing for the price of $0.
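
That one env var change looks like this, assuming a client that reads OPENAI_BASE_URL (the official Python SDK does). Ollama serves an OpenAI-compatible API under /v1 and ignores the key, but most SDKs insist it be non-empty:

# Point existing OpenAI SDK code at the free ARM box instead
export OPENAI_BASE_URL="http://YOUR_IP:11434/v1"
export OPENAI_API_KEY="ollama"   # placeholder; Ollama doesn't check it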
