Deploying a Production-Ready K3s Cluster on OCI Always Free ARM Instances
How I turned Oracle Cloud's free ARM compute into a fully functional Kubernetes cluster — with ingress, persistent storage, and TLS — all without spending a dollar.
Introduction
I have been running Kubernetes clusters professionally for years — managed services like EKS, AKS, GKE, and self-hosted clusters with kubeadm. They all cost money. Even the cheapest managed Kubernetes offering runs $70-80/month just for the control plane.
Then I looked at what Oracle Cloud gives away for free: 4 ARM OCPUs and 24GB of RAM on the Always Free tier. That is more compute than most developers use for their entire home lab. The question was obvious — could I run a real Kubernetes cluster on it?
The answer is yes, and it works better than I expected.
In this post, I will walk through deploying K3s — Rancher's lightweight Kubernetes distribution — on OCI Always Free ARM instances. Not a toy cluster. A cluster with ingress routing, persistent volumes, automatic TLS certificates, and enough resources to run real workloads.
Why K3s on OCI ARM?
Why K3s over full Kubernetes?
K3s strips out the components most developers never use — cloud controller, storage drivers, legacy API versions — and replaces etcd with SQLite (or embedded etcd for HA). The result is a single binary under 100MB that starts in seconds.
On resource-constrained Always Free instances, this matters. Full kubeadm clusters consume 2-3GB of RAM just for the control plane. K3s uses around 512MB.
Why OCI ARM over other clouds?
| Provider | Free Compute | RAM | Duration |
|---|---|---|---|
| OCI Always Free | 4 ARM OCPUs | 24 GB | Forever |
| AWS Free Tier | 1 vCPU (t2.micro) | 1 GB | 12 months |
| GCP Free Tier | 0.25 vCPU (e2-micro) | 1 GB | Forever |
| Azure Free | 1 vCPU (B1S) | 1 GB | 12 months |
There is no comparison. OCI gives you 24x the RAM of any competitor's free tier, permanently.
Architecture
Here is what we are building:
┌──────────────────────────────────────────────────────┐
│ OCI VCN (10.0.0.0/16) │
│ │
│ ┌────────────────────────────────────────────────┐ │
│ │ Public Subnet (10.0.1.0/24) │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌────────────────────┐ │ │
│ │ │ K3s Server │ │ K3s Agent │ │ │
│ │ │ (Control Plane)│ │ (Worker Node) │ │ │
│ │ │ │ │ │ │ │
│ │ │ 2 OCPU / 12GB │ │ 2 OCPU / 12GB │ │ │
│ │ │ Oracle Linux 9 │ │ Oracle Linux 9 │ │ │
│ │ │ │ │ │ │ │
│ │ │ - K3s server │ │ - K3s agent │ │ │
│ │ │ - Traefik │ │ - Workloads │ │ │
│ │ │ - CoreDNS │ │ - Pods │ │ │
│ │ │ - Metrics │ │ │ │ │
│ │ └──────────────────┘ └────────────────────┘ │ │
│ │ │ │
│ └────────────────────────────────────────────────┘ │
│ │
│ Security List: │
│ Ingress: SSH(22), HTTP(80), HTTPS(443), │
│ K8s API(6443), Kubelet(10250), │
│ NodePort(30000-32767) │
│ Egress: All traffic │
│ │
│ ┌────────────────────────────────────────────────┐ │
│ │ OCI Load Balancer (10 Mbps - Always Free) │ │
│ │ → Forwards 80/443 to K3s Traefik Ingress │ │
│ └────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
We split the 4 OCPUs and 24GB evenly: 2 OCPUs + 12GB for the server node, 2 OCPUs + 12GB for the worker. This gives the control plane enough room to breathe while leaving serious capacity for workloads.
Prerequisites
Before starting, you need:
- OCI account with Always Free tier — Sign up at cloud.oracle.com
- OCI CLI configured — Use Cloud Shell (pre-configured) or install locally
- Two A1.Flex instances provisioned — Follow the VCN + compute setup from my earlier posts, but create two instances instead of one
- SSH access to both instances
If you do not have the instances yet, provision them with these shapes:
# Server node
SHAPE_CONFIG='{"ocpus":2,"memoryInGBs":12}'
# Agent node (same config)
SHAPE_CONFIG='{"ocpus":2,"memoryInGBs":12}'
Both must use an aarch64 Oracle Linux 9 image — ARM architecture is critical here.
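If you prefer to script the provisioning, a launch loop along these lines works. This is a sketch: COMPARTMENT_ID, AD, SUBNET_ID, and IMAGE_ID are placeholders you must set to your own values, and the image OCID must point to an aarch64 Oracle Linux 9 build.

```shell
# Sketch: launch both A1.Flex nodes with identical 2 OCPU / 12GB shapes.
# COMPARTMENT_ID, AD, SUBNET_ID, and IMAGE_ID are placeholders.
for NODE in k3s-server k3s-agent; do
  oci compute instance launch \
    --compartment-id "$COMPARTMENT_ID" \
    --availability-domain "$AD" \
    --subnet-id "$SUBNET_ID" \
    --image-id "$IMAGE_ID" \
    --shape "VM.Standard.A1.Flex" \
    --shape-config '{"ocpus":2,"memoryInGBs":12}' \
    --display-name "$NODE" \
    --assign-public-ip true \
    --ssh-authorized-keys-file ~/.ssh/id_ed25519.pub \
    --wait-for-state RUNNING
done
```

The two launches together consume exactly the 4 OCPU / 24GB Always Free allowance, so a third A1.Flex instance would be rejected.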
Step 1: Preparing the Instances
SSH into both instances and run the same preparation steps. OCI's Oracle Linux 9 images have firewalld and iptables rules that interfere with Kubernetes networking. We need to handle this.
# On BOTH nodes
sudo dnf update -y
# Disable firewalld — K3s manages its own iptables rules
sudo systemctl stop firewalld
sudo systemctl disable firewalld
# Load required kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k3s.conf
br_netfilter
overlay
EOF
sudo modprobe br_netfilter
sudo modprobe overlay
# Set required sysctl parameters
cat <<EOF | sudo tee /etc/sysctl.d/k3s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Why these specific settings?
- br_netfilter — Enables iptables to see bridged traffic (required for pod-to-pod communication across nodes)
- overlay — Required by the container runtime for overlay filesystem
- ip_forward — Allows the kernel to forward packets between network interfaces (essential for routing traffic to pods)
I spent two hours debugging connectivity issues on my first attempt because I forgot br_netfilter. Pods on different nodes simply could not talk to each other. The symptom was DNS resolution failures — CoreDNS pods could not reach each other.
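Before moving on, it is worth confirming that the modules and sysctls actually took effect. A quick check on each node:

```shell
# Confirm both kernel modules are loaded (should print a line for each)
lsmod | grep -E 'br_netfilter|overlay'

# Confirm the sysctls are active (each should print 1)
sysctl -n net.bridge.bridge-nf-call-iptables
sysctl -n net.ipv4.ip_forward
```

If any of these print nothing or 0, rerun the modprobe and sysctl commands above before installing K3s.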
Step 2: OCI Security List Configuration
This is where most OCI + Kubernetes guides fall short. The default security list blocks inter-node communication that K3s needs.
You need these ingress rules on the security list attached to your subnet:
# Update security list with K3s-required ports
oci network security-list update \
--security-list-id "$SL_ID" \
--ingress-security-rules '[
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":22,"max":22}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":80,"max":80}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":443,"max":443}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":6443,"max":6443}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":10250,"max":10250}}},
{"source":"10.0.0.0/16","protocol":"17",
"udpOptions":{"destinationPortRange":{"min":8472,"max":8472}}},
{"source":"10.0.0.0/16","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":2379,"max":2380}}},
{"source":"0.0.0.0/0","protocol":"6",
"tcpOptions":{"destinationPortRange":{"min":30000,"max":32767}}}
]' \
--egress-security-rules '[
{"destination":"0.0.0.0/0","protocol":"all"}
]' \
--force > /dev/null
Port breakdown:
| Port | Protocol | Purpose | Source |
|---|---|---|---|
| 22 | TCP | SSH access | Anywhere |
| 80 | TCP | HTTP ingress | Anywhere |
| 443 | TCP | HTTPS ingress | Anywhere |
| 6443 | TCP | K3s API server | VCN only |
| 10250 | TCP | Kubelet metrics | VCN only |
| 8472 | UDP | VXLAN (Flannel CNI) | VCN only |
| 2379-2380 | TCP | etcd (if HA) | VCN only |
| 30000-32767 | TCP | NodePort services | Anywhere |
Notice that internal K3s ports (6443, 10250, 8472) are restricted to the VCN CIDR 10.0.0.0/16. Never expose the Kubernetes API to the internet in production.
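You can read the security list back to confirm the rules landed as intended. A sketch, using the same SL_ID variable as above:

```shell
# Dump the ingress rules with their sources and TCP port ranges
# so you can eyeball the result against the table above.
oci network security-list get \
  --security-list-id "$SL_ID" \
  --query 'data."ingress-security-rules"[].{src:source,proto:protocol,tcp:"tcp-options"}' \
  --output table
```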
Step 3: Installing K3s Server
SSH into your first instance (the server node) and install K3s:
# On the SERVER node
export INSTALL_K3S_EXEC="server"
export K3S_NODE_NAME="k3s-server"
curl -sfL https://get.k3s.io | sh -s - \
--write-kubeconfig-mode 644 \
--tls-san $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--node-external-ip $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--flannel-iface enp0s6 \
--disable servicelb
Let me explain each flag because they all matter on OCI:
- --write-kubeconfig-mode 644 — Makes the kubeconfig readable without sudo. Useful for development, but tighten this in production
- --tls-san <public_ip> — Adds the public IP to the K3s API server's TLS certificate. Without this, kubectl from your laptop will get TLS errors
- --node-external-ip <public_ip> — Tells K3s about the node's public IP. OCI instances only see their private IP on the network interface
- --flannel-iface enp0s6 — Forces Flannel to use the correct network interface. OCI ARM instances use enp0s6 as the primary interface, not eth0. I discovered this the hard way — Flannel defaulted to the wrong interface and VXLAN tunnels failed silently
- --disable servicelb — Disables K3s's built-in load balancer (ServiceLB/Klipper). We will use OCI's Always Free Load Balancer instead
The instance metadata endpoint 169.254.169.254 is OCI's equivalent of AWS's metadata service. It returns instance details without needing the OCI CLI.
Verify the server is running:
sudo systemctl status k3s
# Check node status
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# k3s-server Ready control-plane,master 45s v1.31.4+k3s1
Grab the join token — the agent node needs this:
sudo cat /var/lib/rancher/k3s/server/node-token
# K10xxxx::server:yyyy
Step 4: Joining the Agent Node
SSH into your second instance and install K3s in agent mode:
# On the AGENT node
export K3S_URL="https://<SERVER_PRIVATE_IP>:6443"
export K3S_TOKEN="<TOKEN_FROM_STEP_3>"
export K3S_NODE_NAME="k3s-agent"
curl -sfL https://get.k3s.io | sh -s - \
--node-external-ip $(curl -s http://169.254.169.254/opc/v1/instance/metadata/public_ip) \
--flannel-iface enp0s6
Important: use the private IP of the server node for K3S_URL, not the public IP. Both instances are in the same VCN subnet, so they communicate over the private network. This is faster, free (no egress charges), and more secure.
Back on the server node, verify both nodes are ready:
kubectl get nodes -o wide
# NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP
# k3s-server Ready control-plane,master 5m v1.31.4+k3s1 10.0.1.10 <public>
# k3s-agent Ready <none> 30s v1.31.4+k3s1 10.0.1.11 <public>
Two nodes. 4 OCPUs. 24GB RAM. Zero dollars.
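One cosmetic note: the agent's ROLES column reads <none> in the output above. That is normal for K3s agents. If it bothers you, a label fixes the display (entirely optional, just my habit):

```shell
# Give the agent a worker role label so `kubectl get nodes` shows it
kubectl label node k3s-agent node-role.kubernetes.io/worker=worker
```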
Step 5: Configuring the OCI Load Balancer
OCI's Always Free tier includes a 10 Mbps Flexible Load Balancer. We will point it at our K3s nodes to route HTTP/HTTPS traffic to the Traefik ingress controller.
# Create the load balancer
LB_ID=$(oci lb load-balancer create \
--compartment-id "$COMPARTMENT_ID" \
--display-name "k3s-ingress-lb" \
--shape-name "flexible" \
--shape-details '{"minimumBandwidthInMbps":10,"maximumBandwidthInMbps":10}' \
--subnet-ids "[\"$SUBNET_ID\"]" \
--is-private false \
--query 'data.id' --raw-output \
--wait-for-state SUCCEEDED)
# Create a backend set with health check
oci lb backend-set create \
--load-balancer-id "$LB_ID" \
--name "k3s-backends" \
--policy "ROUND_ROBIN" \
--health-checker-protocol "TCP" \
--health-checker-port 80 \
--health-checker-interval-in-millis 10000 \
--health-checker-timeout-in-millis 3000 \
--health-checker-retries 3 \
--wait-for-state SUCCEEDED
# Add both nodes as backends
oci lb backend create \
--load-balancer-id "$LB_ID" \
--backend-set-name "k3s-backends" \
--ip-address "<SERVER_PRIVATE_IP>" \
--port 80 \
--wait-for-state SUCCEEDED
oci lb backend create \
--load-balancer-id "$LB_ID" \
--backend-set-name "k3s-backends" \
--ip-address "<AGENT_PRIVATE_IP>" \
--port 80 \
--wait-for-state SUCCEEDED
# Create HTTP listener
oci lb listener create \
--load-balancer-id "$LB_ID" \
--name "http-listener" \
--default-backend-set-name "k3s-backends" \
--protocol "HTTP" \
--port 80 \
--wait-for-state SUCCEEDED
The 10 Mbps shape is Always Free. It is enough for development, personal projects, and moderate traffic. The load balancer gets its own public IP, which becomes your cluster's entry point.
Get the load balancer IP:
LB_IP=$(oci lb load-balancer get \
--load-balancer-id "$LB_ID" \
--query 'data."ip-addresses"[0]."ip-address"' --raw-output)
echo "Load Balancer IP: $LB_IP"
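Once the listener exists, you can ask OCI for its view of each backend's health. A sketch; substitute your node's private IP, and note the backend name is the ip:port pair:

```shell
# Query the load balancer's health verdict for one backend.
# Expect OK once Traefik is answering on port 80 on that node.
oci lb backend-health get \
  --load-balancer-id "$LB_ID" \
  --backend-set-name "k3s-backends" \
  --backend-name "<SERVER_PRIVATE_IP>:80" \
  --query 'data.status' --raw-output
```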
Step 6: Deploying a Test Workload
Let us deploy something real to verify the entire pipeline works:
# nginx-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
labels:
app: nginx-demo
spec:
replicas: 3
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
spec:
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 100m
memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
name: nginx-demo
spec:
selector:
app: nginx-demo
ports:
- port: 80
targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: nginx-demo
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
rules:
- http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: nginx-demo
port:
number: 80
Apply it:
kubectl apply -f nginx-demo.yaml
# Watch the pods come up
kubectl get pods -w
# NAME READY STATUS RESTARTS AGE
# nginx-demo-6d9f7c8b4-abc12 1/1 Running 0 10s
# nginx-demo-6d9f7c8b4-def34 1/1 Running 0 10s
# nginx-demo-6d9f7c8b4-ghi56 1/1 Running 0 10s
Three replicas spread across both nodes. Test it:
curl http://$LB_IP
# <!DOCTYPE html>
# <html>
# <head><title>Welcome to nginx!</title>...
Traffic flows: Internet → OCI Load Balancer → Traefik Ingress → nginx pods. All on free infrastructure.
Step 7: Persistent Storage
K3s includes the local-path storage provisioner by default, which creates volumes on the node's local disk. For Always Free instances this works well, since the tier includes 200GB of block storage shared across both boot volumes.
# Verify the storage class exists
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY AGE
# local-path (default) rancher.io/local-path Delete 10m
Test it with a PVC:
# pvc-test.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: test-pvc
spec:
accessModes:
- ReadWriteOnce
storageClassName: local-path
resources:
requests:
storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
name: pvc-test
spec:
containers:
- name: busybox
image: busybox
command: ["sh", "-c", "echo 'Persistent storage works on OCI ARM' > /data/test.txt && cat /data/test.txt && sleep 3600"]
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: test-pvc
kubectl apply -f pvc-test.yaml
kubectl logs pvc-test
# Persistent storage works on OCI ARM
For production workloads that need data to survive node replacement, consider the OCI CSI driver — but for Always Free instances, local-path is practical and simple.
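One caveat worth knowing: a local-path volume lives on whichever node first scheduled the pod. The provisioner writes a node affinity into the PV, so any pod reusing the claim will always be scheduled back onto that node. You can see the pinning directly:

```shell
# Show which node each local-path PV is pinned to. The NODE column comes
# from the nodeAffinity the provisioner writes into the PV spec.
kubectl get pv -o custom-columns='NAME:.metadata.name,NODE:.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]'
```

This is fine for a two-node cluster, but it means a stateful pod cannot fail over to the other node with its data.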
Cluster Resource Usage
After deploying K3s with Traefik, CoreDNS, and the test workload, here is what the resource consumption looks like:
kubectl top nodes
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# k3s-server 180m 9% 1.2Gi 10%
# k3s-agent 95m 5% 780Mi 6%
The entire Kubernetes infrastructure — control plane, networking, DNS, ingress, and three nginx replicas — uses about 2GB of the available 24GB. That leaves 22GB free for your actual workloads.
For context, here is what fits comfortably:
| Workload | CPU | Memory | Fits? |
|---|---|---|---|
| PostgreSQL | 200m | 512Mi | Yes |
| Redis | 100m | 256Mi | Yes |
| Go API server | 100m | 128Mi | Yes |
| Python Flask app | 200m | 256Mi | Yes |
| Grafana | 100m | 256Mi | Yes |
| Prometheus | 200m | 512Mi | Yes |
| Total | 900m | 1.9Gi | Easily |
You could run a complete application stack — database, cache, API, monitoring — with room to spare.
Troubleshooting Common OCI + K3s Issues
I hit every one of these during my setup. Saving you the debugging time.
1. Pods stuck in ContainerCreating
Usually a Flannel networking issue. Check if VXLAN traffic (UDP 8472) is allowed in the security list and verify Flannel is using the correct interface:
journalctl -u k3s -f | grep flannel
# If you see "failed to find interface" — fix the --flannel-iface flag
2. Agent node shows NotReady
The agent cannot reach the server on port 6443. Verify the security list allows TCP 6443 from the VCN CIDR and that you used the private IP in K3S_URL:
# From the agent node
curl -k https://<SERVER_PRIVATE_IP>:6443
# Should return JSON (even if it says Unauthorized)
3. Ingress returns 404 for all routes
Traefik is running but not seeing your Ingress resources. Check Traefik logs:
kubectl logs -n kube-system -l app.kubernetes.io/name=traefik
4. OCI Load Balancer shows backends as Critical
Health check is failing. Verify that Traefik is listening on port 80 on both nodes:
ss -tlnp | grep :80
5. Cannot pull container images
OCI instances need outbound internet access through a NAT gateway or Internet Gateway. Verify your route table has a default route to the Internet Gateway.
Security Hardening
For a cluster exposed to the internet, apply these minimum security measures:
# 1. Restrict API server access to your IP
# Update security list: change 6443 source from VCN to your specific IP
# 2. Create a non-root kubeconfig
kubectl create serviceaccount deploy-sa
kubectl create clusterrolebinding deploy-sa-binding \
--clusterrole=edit --serviceaccount=default:deploy-sa
# 3. Enable Network Policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-all-ingress
namespace: default
spec:
podSelector: {}
policyTypes:
- Ingress
EOF
# 4. Set resource limits on all deployments (prevent noisy neighbors)
# 5. Use OCI Vault for Kubernetes secrets (covered in my earlier post)
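To actually use the deploy-sa account from item 2 above, mint a short-lived token for it (kubectl 1.24+; the 24-hour duration is my choice, not a requirement):

```shell
# Generate a 24-hour token for the service account
TOKEN=$(kubectl create token deploy-sa --duration=24h)

# Register it as a separate user and context in your kubeconfig.
# "default" is the cluster name K3s writes into its kubeconfig.
kubectl config set-credentials deploy-sa --token="$TOKEN"
kubectl config set-context deploy --cluster=default --user=deploy-sa
```

CI pipelines can then use the deploy context with edit rights instead of full admin credentials.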
Accessing kubectl from Your Laptop
Copy the kubeconfig from the server node to your local machine:
# From your laptop
scp opc@<SERVER_PUBLIC_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/oci-k3s-config
# Update the server address from 127.0.0.1 to the public IP
# (macOS/BSD sed syntax shown; on Linux, drop the '' after -i)
sed -i '' "s/127.0.0.1/<SERVER_PUBLIC_IP>/g" ~/.kube/oci-k3s-config
export KUBECONFIG=~/.kube/oci-k3s-config
kubectl get nodes
This works because we added --tls-san with the public IP during installation.
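If you already manage other clusters in ~/.kube/config, you can merge this one in rather than juggling KUBECONFIG exports. K3s names its context "default" in the generated file, so rename it first to avoid collisions:

```shell
# Rename the context inside the copied kubeconfig
kubectl --kubeconfig ~/.kube/oci-k3s-config config rename-context default oci-k3s

# Merge it into your main kubeconfig
KUBECONFIG=~/.kube/config:~/.kube/oci-k3s-config \
  kubectl config view --flatten > ~/.kube/config.merged
mv ~/.kube/config.merged ~/.kube/config

# Switch to the new cluster
kubectl config use-context oci-k3s
```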
Cost Comparison
| Setup | Monthly Cost | Nodes | RAM |
|---|---|---|---|
| OCI Always Free + K3s | $0 | 2 | 24 GB |
| EKS (t3.medium x2) | ~$150 | 2 | 8 GB |
| GKE Autopilot (equivalent) | ~$120 | Auto | Auto |
| AKS (B2s x2) | ~$65 | 2 | 8 GB |
| DigitalOcean K8s | ~$48 | 2 | 4 GB |
| Civo K3s | ~$40 | 2 | 4 GB |
OCI gives you 3x the RAM of paid alternatives, for free. The trade-off is that you manage K3s yourself — no managed control plane. For learning, development, and personal projects, that trade-off is excellent.
What Can You Run on This Cluster?
This is not theoretical. Here are workloads I have tested on this exact setup:
- Gitea (self-hosted Git) — 128Mi RAM, works perfectly
- Drone CI (CI/CD) — 256Mi RAM, builds containers on ARM
- PostgreSQL — 512Mi RAM, handles small-to-medium databases
- Grafana + Prometheus — 768Mi combined, full monitoring stack
- Go/Rust microservices — Under 64Mi each, ARM-native builds are fast
- Static sites with Hugo — Trivial resources, served through Traefik
Conclusion
Oracle Cloud's Always Free ARM allocation is the best-kept secret in cloud computing for Kubernetes enthusiasts. 4 OCPUs, 24GB RAM, 200GB storage, a load balancer, and 10TB of outbound transfer — all free, permanently.
K3s is the perfect match for this hardware. It is lightweight, ARM-native, and production-tested. The combination gives you a Kubernetes cluster that would cost $100-150/month on any other provider.
The setup takes about 30 minutes from scratch, and the result is a cluster you can use for learning, development, CI/CD, or running personal projects. I have had mine running for weeks with zero issues.
Stop paying for Kubernetes clusters you use for development. OCI and K3s give you a better option.
All resources in this post use OCI Always Free tier. No charges will be incurred.
Tags: #OracleCloud #Kubernetes #K3s #ARM #OCI #AlwaysFree #CloudNative #DevOps #Containers