Deploying a Production-Ready K3s Cluster on OCI Always Free ARM Instances
How I turned Oracle Cloud's free ARM compute into a fully functional Kubernetes cluster — with ingress, persistent storage, and TLS — all without spending a dollar.
Introduction
I have been running Kubernetes clusters professionally for years — managed services like EKS, AKS, GKE, and self-hosted clusters with kubeadm. They all cost money. Even the cheapest managed Kubernetes offering runs $70-80/month just for the control plane.
Then I looked at what Oracle Cloud gives away for free: 4 ARM OCPUs and 24GB of RAM on the Always Free tier. That is more compute than most developers use for their entire home lab. The question was obvious — could I run a real Kubernetes cluster on it?
The answer is yes, and it works better than I expected.
In this post, I will walk through deploying K3s — Rancher's lightweight Kubernetes distribution — on OCI Always Free ARM instances. Not a toy cluster. A cluster with ingress routing, persistent volumes, automatic TLS certificates, and enough resources to run real workloads.
Why K3s on OCI ARM?
Why K3s over full Kubernetes?
K3s strips out the components most developers never use — cloud controller, storage drivers, legacy API versions — and replaces etcd with SQLite (or embedded etcd for HA). The result is a single binary under 100MB that starts in seconds.
On resource-constrained Always Free instances, this matters. Full kubeadm clusters consume 2-3GB of RAM just for the control plane. K3s uses around 512MB.
Why OCI ARM over other clouds?
| Provider | Free Compute | RAM | Duration |
|---|---|---|---|
| OCI Always Free | 4 ARM OCPUs | 24 GB | Forever |
| AWS Free Tier | 1 vCPU (t2.micro) | 1 GB | 12 months |
| GCP Free Tier | 0.25 vCPU (e2-micro) | 1 GB | Forever |
| Azure Free | 1 vCPU (B1S) | 1 GB | 12 months |
There is no comparison. OCI gives you 24x the RAM of any competitor's free tier, permanently.
Architecture
Here is what we are building:
┌──────────────────────────────────────────────────────┐
│ OCI VCN (10.0.0.0/16) │
│ │
│ ┌────────────────────────────────────────────────┐ │
│ │ Public Subnet (10.0.1.0/24) │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌────────────────────┐ │ │
│ │ │ K3s Server │ │ K3s Agent │ │ │
│ │ │ (Control Plane)│ │ (Worker Node) │ │ │
│ │ │ │ │ │ │ │
│ │ │ 2 OCPU / 12GB │ │ 2 OCPU / 12GB │ │ │
│ │ │ Oracle Linux 9 │ │ Oracle Linux 9 │ │ │
│ │ │ │ │ │ │ │
│ │ │ - K3s server │ │ - K3s agent │ │ │
│ │ │ - Traefik │ │ - Workloads │ │ │
│ │ │ - CoreDNS │ │ - Pods │ │ │
│ │ │ - Metrics │ │ │ │ │
│ │ └──────────────────┘ └────────────────────┘ │ │
│ │ │ │
│ └────────────────────────────────────────────────┘ │
│ │
│ Security List: │
│ Ingress: SSH(22), HTTP(80), HTTPS(443), │
│ K8s API(6443), Kubelet(10250), │
│ NodePort(30000-32767) │
│ Egress: All traffic │
│ │
│ ┌────────────────────────────────────────────────┐ │
│ │ OCI Load Balancer (10 Mbps - Always Free) │ │
│ │ → Forwards 80/443 to K3s Traefik Ingress │ │
│ └────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
We split the 4 OCPUs and 24GB evenly: 2 OCPUs + 12GB for the server node, 2 OCPUs + 12GB for the worker. This gives the control plane enough room to breathe while leaving serious capacity for workloads.
Prerequisites
Before starting, you need:
- OCI account with Always Free tier — Sign up at cloud.oracle.com
- OCI CLI configured — Use Cloud Shell (pre-configured) or install locally
- Two A1.Flex instances provisioned — Follow the VCN + compute setup from my earlier posts, but create two instances instead of one
- SSH access to both instances
If you do not have the instances yet, provision them with these shapes:
# Server node
SHAPE_CONFIG='{"ocpus":2,"memoryInGBs":12}'
# Agent node (same config)
SHAPE_CONFIG='{"ocpus":2,"memoryInGBs":12}'
Both must use an aarch64 Oracle Linux 9 image — ARM architecture is critical here.
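If you prefer to script the provisioning, a launch loop along these lines works. This is a sketch: COMPARTMENT_ID, AD, SUBNET_ID, and IMAGE_ID are placeholders you must set to your own values, and the image OCID must point to an aarch64 Oracle Linux 9 build.

```shell
# Sketch: launch both A1.Flex nodes with identical 2 OCPU / 12GB shapes.
# COMPARTMENT_ID, AD, SUBNET_ID, and IMAGE_ID are placeholders.
for NODE in k3s-server k3s-agent; do
  oci compute instance launch \
    --compartment-id "$COMPARTMENT_ID" \
    --availability-domain "$AD" \
    --subnet-id "$SUBNET_ID" \
    --image-id "$IMAGE_ID" \
    --shape "VM.Standard.A1.Flex" \
    --shape-config '{"ocpus":2,"memoryInGBs":12}' \
    --display-name "$NODE" \
    --assign-public-ip true \
    --ssh-authorized-keys-file ~/.ssh/id_ed25519.pub \
    --wait-for-state RUNNING
done
```

The two launches together consume exactly the 4 OCPU / 24GB Always Free allowance, so a third A1.Flex instance would be rejected.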
Step 1: Preparing the Instances
SSH into both instances and run the same preparation steps. OCI's Oracle Linux 9 images have firewalld and iptables rules that interfere with Kubernetes networking. We need to handle this.
# On BOTH nodes
sudo dnf update -y
# Disable firewalld — K3s manages its own iptables rules
sudo systemctl stop firewalld
sudo systemctl disable firewalld
# Load required kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k3s.conf
br_netfilter
overlay
EOF
sudo modprobe br_netfilter
sudo modprobe overlay
# Set required sysctl parameters
cat <<EOF | sudo tee /etc/sysctl.d/k3s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Why these specific settings?
- br_netfilter — Enables iptables to see bridged traffic (required for pod-to-pod communication across nodes)
- overlay — Required by the container runtime for overlay filesystem
- ip_forward — Allows the kernel to forward packets between network interfaces (essential for routing traffic to pods)
I spent two hours debugging connectivity issues on my first attempt because I forgot br_netfilter. Pods on different nodes simply could not talk to each other. The symptom was DNS resolution failures — CoreDNS pods could not reach each other.
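Before moving on, it is worth confirming that the modules and sysctls actually took effect. A quick check on each node:

```shell
# Confirm both kernel modules are loaded (should print a line for each)
lsmod | grep -E 'br_netfilter|overlay'

# Confirm the sysctls are active (each should print 1)
sysctl -n net.bridge.bridge-nf-call-iptables
sysctl -n net.ipv4.ip_forward
```

If any of these print nothing or 0, rerun the modprobe and sysctl commands above before installing K3s.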
Step 2: OCI Security List Configuration
This is where most OCI + Kubernetes guides fall short. The default security list blocks inter-node communication that K3s needs.
You need these ingress rules on the security list attached to your subnet:
# Update security list with K3s-required ports
oci network security-list update \
--security-list-id "$SL_ID" \
--ingress-security-rules '[
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":22,"max":22}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":80,"max":80}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":443,"max":443}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":6443,"max":6443}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":10250,"max":10250}}},
{"source":"10.0.0.0/16","protocol":"17",
"udpOptions":{"destinationPortRange":{"min":8472,"max":8472}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":2379,"max":2380}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":30000,"max":32767}}}
]' \
--egress-security-rules '[
{"destination":"0.0.0.0/0","protocol":"all"}
]' \
--force > /dev/null
Port breakdown:
| Port | Protocol | Purpose | Source |
|---|---|---|---|
| 22 | TCP | SSH access | Anywhere |
| 80 | TCP | HTTP ingress | Anywhere |
| 443 | TCP | HTTPS ingress | Anywhere |
| 6443 | TCP | K3s API server | VCN only |
| 10250 | TCP | Kubelet metrics | VCN only |
| 8472 | UDP | VXLAN (Flannel CNI) | VCN only |
| 2379-2380 | TCP | etcd (if HA) | VCN only |
| 30000-32767 | TCP | NodePort services | Anywhere |
Notice that internal K3s ports (6443, 10250, 8472) are restricted to the VCN CIDR 10.0.0.0/16. Never expose the Kubernetes API to the internet in production.
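You can read the security list back to confirm the rules landed as intended. A sketch, using the same SL_ID variable as above:

```shell
# Dump the ingress rules with their sources and TCP port ranges
# so you can eyeball the result against the table above.
oci network security-list get \
  --security-list-id "$SL_ID" \
  --query 'data."ingress-security-rules"[].{src:source,proto:protocol,tcp:"tcp-options"}' \
  --output table
```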
Step 3: Installing K3s Server
SSH into your first instance (the server node) and install K3s:
# On the SERVER node
export INSTALL_K3S_EXEC="server"
export K3S_NODE_NAME="k3s-server"
curl -sfL https://get.k3s.io | sh -s - \
--write-kubeconfig-mode 644 \
--tls-san $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--node-external-ip $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--flannel-iface enp0s6 \
--disable servicelb
Let me explain each flag because they all matter on OCI:
- --write-kubeconfig-mode 644 — Makes the kubeconfig readable without sudo. Useful for development, but tighten this in production
- --tls-san <public_ip> — Adds the public IP to the K3s API server's TLS certificate. Without this, kubectl from your laptop will get TLS errors
- --node-external-ip <public_ip> — Tells K3s about the node's public IP. OCI instances only see their private IP on the network interface
- --flannel-iface enp0s6 — Forces Flannel to use the correct network interface. OCI ARM instances use enp0s6 as the primary interface, not eth0. I discovered this the hard way — Flannel defaulted to the wrong interface and VXLAN tunnels failed silently
- --disable servicelb — Disables K3s's built-in load balancer (ServiceLB/Klipper). We will use OCI's Always Free Load Balancer instead
The instance metadata endpoint 169.254.169.254 is OCI's equivalent of AWS's metadata service. It returns instance details without needing the OCI CLI.
Verify the server is running:
sudo systemctl status k3s
# Check node status
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# k3s-server Ready control-plane,master 45s v1.31.4+k3s1
Grab the join token — the agent node needs this:
sudo cat /var/lib/rancher/k3s/server/node-token
# K10xxxx::server:yyyy
Step 4: Joining the Agent Node
SSH into your second instance and install K3s in agent mode:
# On the AGENT node
export K3S_URL="https://<SERVER_PRIVATE_IP>:6443"
export K3S_TOKEN="<TOKEN_FROM_STEP_3>"
export K3S_NODE_NAME="k3s-agent"
curl -sfL https://get.k3s.io | sh -s - \
--node-external-ip $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--flannel-iface enp0s6
Important: use the private IP of the server node for K3S_URL, not the public IP. Both instances are in the same VCN subnet, so they communicate over the private network. This is faster, free (no egress charges), and more secure.
Back on the server node, verify both nodes are ready:
kubectl get nodes -o wide
# NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP
# k3s-server Ready control-plane,master 5m v1.31.4+k3s1 10.0.1.10 <public>
# k3s-agent Ready <none> 30s v1.31.4+k3s1 10.0.1.11 <public>
Two nodes. 4 OCPUs. 24GB RAM. Zero dollars.
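One cosmetic note: the agent's ROLES column reads <none> in the output above. That is normal for K3s agents. If it bothers you, a label fixes the display (entirely optional, just my habit):

```shell
# Give the agent a worker role label so `kubectl get nodes` shows it
kubectl label node k3s-agent node-role.kubernetes.io/worker=worker
```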
Step 5: Configuring the OCI Load Balancer
OCI's Always Free tier includes a 10 Mbps Flexible Load Balancer. We will point it at our K3s nodes to route HTTP/HTTPS traffic to the Traefik ingress controller.
# Create the load balancer
LB_ID=$(oci lb load-balancer create \
--compartment-id "$COMPARTMENT_ID" \
--display-name "k3s-ingress-lb" \
--shape-name "flexible" \
--shape-details '{"minimumBandwidthInMbps":10,"maximumBandwidthInMbps":10}' \
--subnet-ids "[\"$SUBNET_ID\"]" \
--is-private false \
--query 'data.id' --raw-output \
--wait-for-state SUCCEEDED)
# Create a backend set with health check
oci lb backend-set create \
--load-balancer-id "$LB_ID" \
--name "k3s-backends" \
--policy "ROUND_ROBIN" \
--health-checker-protocol "TCP" \
--health-checker-port 80 \
--health-checker-interval-in-millis 10000 \
--health-checker-timeout-in-millis 3000 \
--health-checker-retries 3 \
--wait-for-state SUCCEEDED
# Add both nodes as backends
oci lb backend create \
--load-balancer-id "$LB_ID" \
--backend-set-name "k3s-backends" \
--ip-address "<SERVER_PRIVATE_IP>" \
--port 80 \
--wait-for-state SUCCEEDED
oci lb backend create \
--load-balancer-id "$LB_ID" \
--backend-set-name "k3s-backends" \
--ip-address "<AGENT_PRIVATE_IP>" \
--port 80 \
--wait-for-state SUCCEEDED
# Create HTTP listener
oci lb listener create \
--load-balancer-id "$LB_ID" \
--name "http-listener" \
--default-backend-set-name "k3s-backends" \
--protocol "HTTP" \
--port 80 \
--wait-for-state SUCCEEDED
The 10 Mbps shape is Always Free. It is enough for development, personal projects, and moderate traffic. The load balancer gets its own public IP, which becomes your cluster's entry point.
Get the load balancer IP:
LB_IP=$(oci lb load-balancer get \
--load-balancer-id "$LB_ID" \
--query 'data."ip-addresses"[0]."ip-address"' --raw-output)
echo "Load Balancer IP: $LB_IP"
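Once the listener exists, you can ask OCI for its view of each backend's health. A sketch; substitute your node's private IP, and note the backend name is the ip:port pair:

```shell
# Query the load balancer's health verdict for one backend.
# Expect OK once Traefik is answering on port 80 on that node.
oci lb backend-health get \
  --load-balancer-id "$LB_ID" \
  --backend-set-name "k3s-backends" \
  --backend-name "<SERVER_PRIVATE_IP>:80" \
  --query 'data.status' --raw-output
```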
Step 6: Deploying a Test Workload
Let us deploy something real to verify the entire pipeline works:
# nginx-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
labels:
app: nginx-demo
spec:
replicas: 3
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
spec:
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 100m
memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
name: nginx-demo
spec:
selector:
app: nginx-demo
ports:
- port: 80
targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: nginx-demo
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
rules:
- http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: nginx-demo
port:
number: 80
Apply it:
kubectl apply -f nginx-demo.yaml
# Watch the pods come up
kubectl get pods -w
# NAME READY STATUS RESTARTS AGE
# nginx-demo-6d9f7c8b4-abc12 1/1 Running 0 10s
# nginx-demo-6d9f7c8b4-def34 1/1 Running 0 10s
# nginx-demo-6d9f7c8b4-ghi56 1/1 Running 0 10s
Three replicas spread across both nodes. Test it:
curl http://$LB_IP
# <!DOCTYPE html>
# <html>
# <head><title>Welcome to nginx!</title>...
Traffic flows: Internet → OCI Load Balancer → Traefik Ingress → nginx pods. All on free infrastructure.
Step 7: Persistent Storage
K3s includes the local-path storage provisioner by default, which creates volumes on the node's local disk. For Always Free instances this works well, since the tier includes 200GB of block storage shared across both boot volumes.
# Verify the storage class exists
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY AGE
# local-path (default) rancher.io/local-path Delete 10m
Test it with a PVC:
# pvc-test.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: test-pvc
spec:
accessModes:
- ReadWriteOnce
storageClassName: local-path
resources:
requests:
storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
name: pvc-test
spec:
containers:
- name: busybox
image: busybox
command: ["sh", "-c", "echo 'Persistent storage works on OCI ARM' > /data/test.txt && cat /data/test.txt && sleep 3600"]
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: test-pvc
kubectl apply -f pvc-test.yaml
kubectl logs pvc-test
# Persistent storage works on OCI ARM
For production workloads that need data to survive node replacement, consider the OCI CSI driver — but for Always Free instances, local-path is practical and simple.
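One caveat worth knowing: a local-path volume lives on whichever node first scheduled the pod. The provisioner writes a node affinity into the PV, so any pod reusing the claim will always be scheduled back onto that node. You can see the pinning directly:

```shell
# Show which node each local-path PV is pinned to. The NODE column comes
# from the nodeAffinity the provisioner writes into the PV spec.
kubectl get pv -o custom-columns='NAME:.metadata.name,NODE:.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]'
```

This is fine for a two-node cluster, but it means a stateful pod cannot fail over to the other node with its data.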
Cluster Resource Usage
After deploying K3s with Traefik, CoreDNS, and the test workload, here is what the resource consumption looks like:
kubectl top nodes
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# k3s-server 180m 9% 1.2Gi 10%
# k3s-agent 95m 5% 780Mi 6%
The entire Kubernetes infrastructure — control plane, networking, DNS, ingress, and three nginx replicas — uses about 2GB of the available 24GB. That leaves 22GB free for your actual workloads.
For context, here is what fits comfortably:
| Workload | CPU | Memory | Fits? |
|---|---|---|---|
| PostgreSQL | 200m | 512Mi | Yes |
| Redis | 100m | 256Mi | Yes |
| Go API server | 100m | 128Mi | Yes |
| Python Flask app | 200m | 256Mi | Yes |
| Grafana | 100m | 256Mi | Yes |
| Prometheus | 200m | 512Mi | Yes |
| Total | 900m | 1.9Gi | Easily |
You could run a complete application stack — database, cache, API, monitoring — with room to spare.
Troubleshooting Common OCI + K3s Issues
I hit every one of these during my setup. Saving you the debugging time.
1. Pods stuck in ContainerCreating
Usually a Flannel networking issue. Check if VXLAN traffic (UDP 8472) is allowed in the security list and verify Flannel is using the correct interface:
journalctl -u k3s -f | grep flannel
# If you see "failed to find interface" — fix the --flannel-iface flag
2. Agent node shows NotReady
The agent cannot reach the server on port 6443. Verify the security list allows TCP 6443 from the VCN CIDR and that you used the private IP in K3S_URL:
# From the agent node
curl -k https://<SERVER_PRIVATE_IP>:6443
# Should return JSON (even if it says Unauthorized)
3. Ingress returns 404 for all routes
Traefik is running but not seeing your Ingress resources. Check Traefik logs:
kubectl logs -n kube-system -l app.kubernetes.io/name=traefik
4. OCI Load Balancer shows backends as Critical
Health check is failing. Verify that Traefik is listening on port 80 on both nodes:
ss -tlnp | grep :80
5. Cannot pull container images
OCI instances need outbound internet access through a NAT gateway or Internet Gateway. Verify your route table has a default route to the Internet Gateway.
Security Hardening
For a cluster exposed to the internet, apply these minimum security measures:
# 1. Restrict API server access to your IP
# Update security list: change 6443 source from VCN to your specific IP
# 2. Create a non-root kubeconfig
kubectl create serviceaccount deploy-sa
kubectl create clusterrolebinding deploy-sa-binding \
--clusterrole=edit --serviceaccount=default:deploy-sa
# 3. Enable Network Policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-all-ingress
namespace: default
spec:
podSelector: {}
policyTypes:
- Ingress
EOF
# 4. Set resource limits on all deployments (prevent noisy neighbors)
# 5. Use OCI Vault for Kubernetes secrets (covered in my earlier post)
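To actually use the deploy-sa account from item 2 above, mint a short-lived token for it (kubectl 1.24+; the 24-hour duration is my choice, not a requirement):

```shell
# Generate a 24-hour token for the service account
TOKEN=$(kubectl create token deploy-sa --duration=24h)

# Register it as a separate user and context in your kubeconfig.
# "default" is the cluster name K3s writes into its kubeconfig.
kubectl config set-credentials deploy-sa --token="$TOKEN"
kubectl config set-context deploy --cluster=default --user=deploy-sa
```

CI pipelines can then use the deploy context with edit rights instead of full admin credentials.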
Accessing kubectl from Your Laptop
Copy the kubeconfig from the server node to your local machine:
# From your laptop
scp opc@<SERVER_PUBLIC_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/oci-k3s-config
# Update the server address from 127.0.0.1 to the public IP
# (macOS/BSD sed syntax shown; on Linux, drop the '' after -i)
sed -i '' "s/127.0.0.1/<SERVER_PUBLIC_IP>/g" ~/.kube/oci-k3s-config
export KUBECONFIG=~/.kube/oci-k3s-config
kubectl get nodes
This works because we added --tls-san with the public IP during installation.
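If you already manage other clusters in ~/.kube/config, you can merge this one in rather than juggling KUBECONFIG exports. K3s names its context "default" in the generated file, so rename it first to avoid collisions:

```shell
# Rename the context inside the copied kubeconfig
kubectl --kubeconfig ~/.kube/oci-k3s-config config rename-context default oci-k3s

# Merge it into your main kubeconfig
KUBECONFIG=~/.kube/config:~/.kube/oci-k3s-config \
  kubectl config view --flatten > ~/.kube/config.merged
mv ~/.kube/config.merged ~/.kube/config

# Switch to the new cluster
kubectl config use-context oci-k3s
```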
Cost Comparison
| Setup | Monthly Cost | Nodes | RAM |
|---|---|---|---|
| OCI Always Free + K3s | $0 | 2 | 24 GB |
| EKS (t3.medium x2) | ~$150 | 2 | 8 GB |
| GKE Autopilot (equivalent) | ~$120 | Auto | Auto |
| AKS (B2s x2) | ~$65 | 2 | 8 GB |
| DigitalOcean K8s | ~$48 | 2 | 4 GB |
| Civo K3s | ~$40 | 2 | 4 GB |
OCI gives you 3x the RAM of paid alternatives, for free. The trade-off is that you manage K3s yourself — no managed control plane. For learning, development, and personal projects, that trade-off is excellent.
What Can You Run on This Cluster?
This is not theoretical. Here are workloads I have tested on this exact setup:
- Gitea (self-hosted Git) — 128Mi RAM, works perfectly
- Drone CI (CI/CD) — 256Mi RAM, builds containers on ARM
- PostgreSQL — 512Mi RAM, handles small-to-medium databases
- Grafana + Prometheus — 768Mi combined, full monitoring stack
- Go/Rust microservices — Under 64Mi each, ARM-native builds are fast
- Static sites with Hugo — Trivial resources, served through Traefik
Conclusion
Oracle Cloud's Always Free ARM allocation is the best-kept secret in cloud computing for Kubernetes enthusiasts. 4 OCPUs, 24GB RAM, 200GB storage, a load balancer, and 10TB of outbound transfer — all free, permanently.
K3s is the perfect match for this hardware. It is lightweight, ARM-native, and production-tested. The combination gives you a Kubernetes cluster that would cost $100-150/month on any other provider.
The setup takes about 30 minutes from scratch, and the result is a cluster you can use for learning, development, CI/CD, or running personal projects. I have had mine running for weeks with zero issues.
Stop paying for Kubernetes clusters you use for development. OCI and K3s give you a better option.
All resources in this post use OCI Always Free tier. No charges will be incurred.
Tags: #OracleCloud #Kubernetes #K3s #ARM #OCI #AlwaysFree #CloudNative #DevOps #Containers