# Oracle Cloud Always Free for crypto builders: the $0/mo infra stack for a solo dev
I've been running a Solana signal engine, a webhook-driven x402 paywall, and a public wallet scanner on essentially $0/mo of infrastructure for a few months now. Not $0.99, not a "free tier that flips to $20 in month 4" — actually free, and not in a way that falls over if somebody runs `ab -n 10000` at it.
This post is the field manual for the stack I ended up with. Everything here is either truly free ("Always Free" at Oracle), free-tier-with-a-credit-card-that-never-gets-charged (Cloudflare, Sentry, Grafana, BetterStack, Healthchecks), or open source running on the first two. No paid tiers. No trial expirations.
Code and config templates are in cipher-starter under MIT. The deeper ops material — systemd hardening, Cloudflare Tunnel ACLs, Neon migration cutover — is a chapter on cipher-x402 for $0.25 USDC. The earlier writeup covers what building the playbook itself taught me.
## Why Oracle, and why not something simpler
A fair question. You probably looked at DigitalOcean or Hetzner first.
The case for Oracle Always Free, for a crypto/quant solo-dev workload specifically:
- 4 OCPUs and 24 GB RAM on Ampere A1 ARM. That's 4x the CPU and 6x the RAM of any other "free forever" tier I know of. Free DigitalOcean isn't a thing, and Fly.io's free allowance gave way to a ~$5/mo minimum in late 2024. Oracle's is still there.
- ARM64. Everything I run compiles cleanly on ARM (Python, Node, Rust, pnpm, Postgres, Redis). The 20% price-performance win doesn't matter when it's already $0, but it does mean your container builds stay small.
- Block storage. 200 GB of boot + block volume free. More than enough for five years of OHLCV on 50 symbols + a SQLite WAL.
- Egress. 10 TB/mo egress free. Cloudflare + Mastodon + Nostr posting is noise relative to that.
The case against:
- Provisioning is a fight. Ampere A1 capacity is often exhausted in popular regions. You will see "Out of host capacity" errors. The workaround is scripted retry; see below.
- Support is ~none. Free tier gets community forums and that's it. You are SRE.
- Terms of service. Read them. The "no crypto mining" clause is broadly interpreted; I interpret it as "don't run a miner." Running a trading bot or signal service is fine, but I'm not a lawyer.
## Provisioning: the retry-script workaround
The honest tactic: `oci compute instance launch` in a loop until the region has Ampere A1 capacity. This is ugly but well-known.
```bash
#!/usr/bin/env bash
# launch_a1.sh - provision a 4-OCPU/24GB Ampere A1 until capacity appears.
set -euo pipefail

COMPARTMENT_ID="ocid1.compartment.oc1..aaaa"
AD="AD-1"                           # Availability Domain, e.g. "Abcd:CA-TORONTO-1-AD-1"
SUBNET_ID="ocid1.subnet.oc1..aaaa"
IMAGE_ID="ocid1.image.oc1..aaaa"    # Ubuntu 22.04 ARM
SSH_PUB="$(cat ~/.ssh/id_ed25519.pub)"

while true; do
  if oci compute instance launch \
      --compartment-id "$COMPARTMENT_ID" \
      --availability-domain "$AD" \
      --shape "VM.Standard.A1.Flex" \
      --shape-config '{"ocpus": 4, "memoryInGBs": 24}' \
      --image-id "$IMAGE_ID" \
      --subnet-id "$SUBNET_ID" \
      --assign-public-ip true \
      --metadata "{\"ssh_authorized_keys\": \"$SSH_PUB\"}" \
      --wait-for-state RUNNING 2>&1 | tee /tmp/oci.log; then
    echo "Launched."
    break
  fi
  if grep -q "Out of host capacity" /tmp/oci.log; then
    echo "No capacity, sleeping 60s..."
    sleep 60
  else
    echo "Unexpected error, aborting."
    break
  fi
done
```
I got an instance on attempt 23, after ~40 minutes of polling. Once provisioned, Oracle doesn't reclaim the instance unless it's idle for 7+ days, and even then you just re-provision. I keep my instance on a weekly cron that ssh's in and touches a file.
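That weekly keep-alive can be as small as one crontab line. A sketch, run from any always-on machine; `a1` is an assumed `~/.ssh/config` alias for the Oracle instance and the schedule is arbitrary:

```bash
# crontab -e on the machine doing the checking:
# Mondays 09:00 -- SSH in, touch a file, log the uptime so there's a paper trail.
0 9 * * 1  ssh a1 'touch ~/.keepalive && uptime' >> ~/keepalive.log 2>&1
```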
## The core systemd unit for a Solana RPC worker
Here's the pattern I use for every long-running process: a dedicated unix user, a hardened systemd unit, restart-on-failure, journal logging, and structured metrics exposed on localhost only.
```ini
# /etc/systemd/system/cipher-worker.service
[Unit]
Description=Cipher signal worker
After=network-online.target
Wants=network-online.target
# Restart rate-limiting (these are [Unit]-level options in current systemd)
StartLimitBurst=10
StartLimitIntervalSec=600

[Service]
Type=simple
User=cipher
Group=cipher
WorkingDirectory=/opt/cipher
Environment=PYTHONUNBUFFERED=1
Environment=CIPHER_DB=/var/lib/cipher/cipher.db
EnvironmentFile=/etc/cipher/env
ExecStart=/opt/cipher/.venv/bin/python -m cipher.worker
Restart=on-failure
RestartSec=5s

# Hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/cipher /var/log/cipher
PrivateTmp=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
LockPersonality=true
MemoryDenyWriteExecute=true
SystemCallArchitectures=native

# Limits
LimitNOFILE=65535
MemoryMax=8G
TasksMax=512

[Install]
WantedBy=multi-user.target
```
Two things worth highlighting for crypto work:

- `EnvironmentFile=/etc/cipher/env`. All secrets (Helius API key, signer keypair path, etc.) live in a `chmod 600`, root-owned file, not in the unit. Never bake secrets into the unit file — they leak into `systemctl cat` output, visible to any local user.
- `MemoryDenyWriteExecute=true`. Blocks any process in the unit from allocating writable+executable memory. Rules out most in-process shellcode injection paths. Requires that you're not using a JIT (no PyPy, no Node in JIT mode; standard CPython is fine).
Enable with:

```bash
sudo useradd --system --home /var/lib/cipher --shell /usr/sbin/nologin cipher
sudo install -d -o cipher -g cipher /var/lib/cipher /var/log/cipher
sudo install -d -o root -g root -m 700 /etc/cipher
sudo install -o root -g root -m 600 env /etc/cipher/env
sudo systemctl daemon-reload
sudo systemctl enable --now cipher-worker.service
```
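Once it's running, systemd can score the sandbox for you. This is a built-in command, not part of my stack:

```bash
# Prints a per-setting exposure report and an overall score; lower is safer.
# The hardening block above should pull the unit well below an unhardened
# service's typical 8-9 "UNSAFE" score.
systemd-analyze security cipher-worker.service
```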
## SQLite WAL: when to use it, when to migrate
SQLite with WAL is the best-kept secret in crypto infra. It gives you:

- Single-file database, no daemon.
- Concurrent readers with a single writer, atomic commits.
- `PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL;` gets you ~20k writes/sec on Ampere A1.
- Backups via `sqlite3 db ".backup /tmp/snap.db"` — hot, online, consistent.
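The same hot backup is available from pure Python via the stdlib `Connection.backup` API (equivalent to the shell `.backup` dot-command). A minimal sketch; the function name is mine:

```python
# Hot, consistent copy of a live (possibly WAL-mode) SQLite database.
import sqlite3

def hot_backup(src_path: str, dest_path: str) -> None:
    """Copy a live database to dest_path without blocking writers."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies pages incrementally; the resulting file is a consistent snapshot.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```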
The correct pragmas for a signal/trading workload:
```python
import sqlite3

def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path, timeout=30.0, isolation_level=None,
                           check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")   # not FULL; WAL makes NORMAL safe
    conn.execute("PRAGMA wal_autocheckpoint=1000")
    conn.execute("PRAGMA busy_timeout=30000")
    conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("PRAGMA temp_store=MEMORY")
    conn.execute("PRAGMA cache_size=-64000")    # 64 MB page cache
    return conn
```
The thresholds at which I migrate off SQLite:
| Metric | SQLite comfort zone | Verdict |
|---|---|---|
| Writes/sec, sustained | < 5k | OK |
| Writes/sec, peak | < 20k | OK |
| DB size | < 50 GB | OK |
| Concurrent writers | 1 | Always 1 |
| Reader count | < 100 | OK |
| Need replication / HA | No | Migrate |
| Need regional read replicas | No | Migrate |
| Analytical queries > 30 s | Rare | Migrate |
When you cross one of those, Neon's free Postgres tier (3 GB storage, 100 hrs compute-time/mo, branching) is the obvious jump because the migration is mostly mechanical: `sqlite3 db .dump | psql`, plus light dialect cleanup. No data model changes. Neon gives you branch-per-PR for free, which is genuinely useful.
Don't migrate preemptively. A signal engine on 50 symbols with minute candles back 5 years is ~6 GB of data and a few hundred writes/sec. SQLite will outlast your product-market fit.
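When that day comes, the cutover is roughly this. A sketch, not a runbook: `$DATABASE_URL` is assumed to be your Neon connection string, and SQLite's dump output usually needs its SQLite-only statements filtered before Postgres will swallow it:

```bash
# Dump from SQLite, drop SQLite-only statements, stream into Neon Postgres.
# For schemas with exotic types or large tables, a dedicated tool like
# pgloader is the safer route.
sqlite3 cipher.db .dump \
  | grep -v '^PRAGMA' \
  | psql "$DATABASE_URL"
```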
## Cloudflare Tunnel + no-open-ports pattern

Opening port 443 on a VM is how you end up in a Shodan breach report. The cleaner pattern:

- Run your service on `127.0.0.1:8080`. No public binding.
- Install `cloudflared`. Authenticate once, create a named tunnel.
- Map `api.yourdomain.com -> http://localhost:8080` in the tunnel config.
- Leave every inbound port closed. Outbound 443 only, to Cloudflare's edge.
Install and config on Ubuntu ARM:
```bash
# Install cloudflared
curl -L --output cloudflared.deb \
  https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-arm64.deb
sudo dpkg -i cloudflared.deb

# Auth (opens browser on your laptop)
cloudflared tunnel login

# Create tunnel
cloudflared tunnel create cipher-api
# Outputs a tunnel ID and writes a credentials file under ~/.cloudflared/<TUNNEL_ID>.json

# Config -- copy the credentials file to the path the config references
sudo mkdir -p /etc/cloudflared
sudo cp ~/.cloudflared/*.json /etc/cloudflared/cipher-api.json
sudo tee /etc/cloudflared/config.yml <<EOF
tunnel: cipher-api
credentials-file: /etc/cloudflared/cipher-api.json
ingress:
  - hostname: api.cryptomotifs.dev
    service: http://localhost:8080
    originRequest:
      connectTimeout: 10s
      noTLSVerify: false
  - service: http_status:404
EOF

# Route DNS
cloudflared tunnel route dns cipher-api api.cryptomotifs.dev

# Install as service
sudo cloudflared --config /etc/cloudflared/config.yml service install
sudo systemctl enable --now cloudflared
```
The iptables hardening to match:

```bash
# Accept rules FIRST, then flip the default policy. Doing it the other way
# around over SSH drops your own session before the ESTABLISHED rule exists.
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# Default-deny inbound except SSH
sudo iptables -P INPUT DROP
sudo iptables -P FORWARD DROP
sudo iptables -P OUTPUT ACCEPT

sudo netfilter-persistent save   # needs the iptables-persistent package
```
Now `nmap` from the outside reports one open port (22), which you can further lock down to a Cloudflare WARP CIDR or a VPN. No port scanners, no botnet PoC exploits, no "ssh root user brute-force" logs.
The crypto-specific win: your Solana RPC endpoint, your worker admin interface, and your webhook receiver are all behind a Cloudflare Access policy if you want (free for up to 50 users). Two-factor on your admin panel without writing a single line of auth code.
## Observability stack: Grafana + Sentry + BetterStack + Healthchecks.io

All four have generous free tiers, and together they cover the bases: metrics, logs, errors, and uptime.

- Grafana Cloud Free. 10k series metrics, 50 GB logs/mo, 14-day retention. Push via Prometheus remote-write.
- Sentry Free. 5k errors/mo, 1 team. Drop-in `sentry-sdk`.
- BetterStack (Logtail + Uptime) Free. 1 GB logs/mo, 10 monitors, 3-min check interval. Incident pages for free.
- Healthchecks.io Free. 20 cron checks, email + Slack + Telegram notifications. For every cron job, you register a check and `curl ${URL}` at the end of the job. Miss the ping, get an alert.
The glue is a single `observability.py`:

```python
import logging
import os
import socket
from contextlib import contextmanager

import sentry_sdk
from prometheus_client import Counter, Histogram, start_http_server

SERVICE = os.environ.get("SERVICE_NAME", "cipher-worker")
HOST = socket.gethostname()

# Sentry
if dsn := os.environ.get("SENTRY_DSN"):
    sentry_sdk.init(dsn=dsn, traces_sample_rate=0.1,
                    release=os.environ.get("GIT_SHA"))

# Prometheus (scraped by Grafana Agent into Grafana Cloud)
JOB_COUNT = Counter("cipher_jobs_total", "Jobs processed", ["job", "status"])
JOB_LAT = Histogram("cipher_job_seconds", "Job latency", ["job"])
start_http_server(9100, addr="127.0.0.1")

# Logs -> BetterStack via journald relay

@contextmanager
def job(name: str):
    with JOB_LAT.labels(job=name).time():
        try:
            yield
            JOB_COUNT.labels(job=name, status="ok").inc()
        except Exception:
            JOB_COUNT.labels(job=name, status="fail").inc()
            sentry_sdk.capture_exception()
            raise

def heartbeat(check_url: str) -> None:
    """Ping Healthchecks.io at the end of a cron job."""
    import urllib.request
    try:
        urllib.request.urlopen(check_url, timeout=5).read()
    except Exception:
        logging.exception("healthcheck ping failed")
```
Usage in your worker:

```python
import os
from observability import job, heartbeat

with job("pipeline_tick"):
    run_pipeline()
    heartbeat(os.environ["HC_PIPELINE_URL"])
```
Four services, one file, covers: exceptions (Sentry), metrics (Grafana), logs (BetterStack), aliveness (Healthchecks). The 9100 metrics port is on 127.0.0.1 only; Grafana Agent scrapes it locally and ships to the cloud.
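For reference, the local half of that pipeline is a few lines of Grafana Agent config. A sketch: the remote-write URL, instance ID, and API key are placeholders you get from your Grafana Cloud stack page, and the file path is an assumption:

```bash
# Minimal Grafana Agent config: scrape 127.0.0.1:9100, remote-write to the cloud.
sudo tee /etc/grafana-agent.yaml <<'EOF'
metrics:
  global:
    scrape_interval: 60s
  configs:
    - name: cipher
      scrape_configs:
        - job_name: cipher-worker
          static_configs:
            - targets: ['127.0.0.1:9100']
      remote_write:
        - url: https://<your-prometheus-endpoint>.grafana.net/api/prom/push
          basic_auth:
            username: "<grafana-cloud-instance-id>"
            password: "<grafana-cloud-api-key>"
EOF
```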
## The actual $/mo
| Component | Provider | $/mo |
|---|---|---|
| 4 OCPU / 24 GB / 200 GB ARM VM | Oracle Always Free | $0 |
| DNS + tunnel + CDN | Cloudflare | $0 |
| Error tracking | Sentry Free | $0 |
| Metrics + logs | Grafana Cloud Free | $0 |
| Uptime + incident pages | BetterStack Free | $0 |
| Cron watchdog | Healthchecks.io | $0 |
| Domain | — | ~$1 |
| Postgres (when needed) | Neon Free | $0 |
| Total | | ~$1 |
One dollar a month for a domain. Everything else is free in a way that doesn't bait-and-switch.
## Three gotchas that bit me

- Oracle block-volume snapshots are NOT in the free tier by default. You get 200 GB of active volume; snapshots count against a separate quota that flips to paid at 20 GB. Turn off auto-snapshot unless you need it, and prefer `sqlite3 .backup` → S3-compatible object storage (Cloudflare R2 Free: 10 GB).
- Cloudflare Tunnel + websockets works, but you must enable WebSockets on your Cloudflare dashboard under Network. It's off by default for some zones. Symptom: HTTP works, WS upgrades 502.
- Ampere A1 bare-metal vs. VM confusion. The Always Free lane is `VM.Standard.A1.Flex`. `BM.Standard.A1.160` looks free in the UI when you're poking around — it's not. Stick to `VM.Standard.A1.Flex` with max 4 OCPUs and 24 GB.
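For the first gotcha, the backup-to-R2 path can be sketched with the stock AWS CLI, since R2 speaks the S3 API. Bucket name, account ID, and paths here are assumptions:

```bash
# Nightly: hot-snapshot the live DB, compress, ship to Cloudflare R2.
sqlite3 /var/lib/cipher/cipher.db ".backup /tmp/cipher-snap.db"
gzip -f /tmp/cipher-snap.db
aws s3 cp /tmp/cipher-snap.db.gz \
  "s3://cipher-backups/cipher-$(date +%F).db.gz" \
  --endpoint-url "https://<ACCOUNT_ID>.r2.cloudflarestorage.com"
```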
## The playbook link
Full config templates (systemd unit, Cloudflare Tunnel, observability glue, SQLite pragmas, iptables script) are under MIT in cipher-starter. The deep chapter — the full threat model, KMS envelope pattern, Cloudflare Access policy, and Neon migration cutover runbook — is here behind x402 for $0.25 USDC on Base. The prior post walks through why building the playbook in public was the right move.
If you're running similar free-tier infra for crypto/quant, I'd love to compare notes — especially anyone running Redis Stack on A1 ARM (I've been putting off the migration because the free tier of Upstash is enough for my footprint, but I know a day is coming).