This post covers how I set up the on-premise Kubernetes cluster — picking a distribution, getting k3s running on CentOS, solving load balancing with MetalLB, and eventually replacing both MetalLB and the default CNI with Cilium.
Picking a Distribution
There are several lightweight Kubernetes distributions for on-prem setups like RKE, k0s, MicroK8s, and k3s. For a small cluster, I care mostly about simplicity and footprint. k3s fits that well — it’s a single binary under 100MB, easy to install, and doesn’t bring much overhead. It’s still stable enough for production use and handles multi-node setups without much complexity.
Setting Up k3s
Firewall First
Before installing anything, open the necessary ports on all nodes:
# Allow essential services
sudo firewall-cmd --permanent --add-service=ssh
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-service=https
# Trust pod and service networks
sudo firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 # Pods CIDR
sudo firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 # Services CIDR
# k3s-specific ports
sudo firewall-cmd --permanent --new-service=k3s
sudo firewall-cmd --permanent --service=k3s --set-description="K3s Firewall Rules"
sudo firewall-cmd --permanent --service=k3s --add-port=2379-2380/tcp # etcd
sudo firewall-cmd --permanent --service=k3s --add-port=6443/tcp # API server
sudo firewall-cmd --permanent --service=k3s --add-port=8472/udp # Flannel VXLAN
sudo firewall-cmd --permanent --service=k3s --add-port=10250-10252/tcp # Kubelet, scheduler, controller-manager metrics
sudo firewall-cmd --permanent --service=k3s --add-port=30000-32767/tcp # NodePort
sudo firewall-cmd --permanent --add-service=k3s
sudo firewall-cmd --reload
Master Node
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.31.1+k3s1" sh -s - server \
--cluster-init \
--disable=traefik
- `--cluster-init` initializes a new etcd-backed cluster.
- `--disable=traefik` removes the bundled Traefik ingress controller; I use NGINX for ingress instead.
After installation, grab the node token for the worker nodes:
sudo cat /var/lib/rancher/k3s/server/node-token
Worker Nodes
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.31.1+k3s1" \
K3S_TOKEN=<cluster_token> sh -s - server \
--server https://<master_ip>:6443 \
--disable=traefik
Note that I'm running the workers as server nodes, not agent nodes. This means all three nodes run the control plane, giving full HA with etcd replicated across all of them. With three etcd members, quorum is two, so the cluster keeps running if any one node goes down (but not two).
Verify
kubectl get nodes
All nodes should show as Ready.
NGINX Ingress
With Traefik disabled, install NGINX for ingress:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.9.1/deploy/static/provider/cloud/deploy.yaml
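Once the controller is running, Ingress resources target it via the `nginx` ingress class. A minimal sketch of routing a hostname to a backend (the hostname and service name here are placeholders, not from my cluster):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: nginx        # matches the installed NGINX controller
  rules:
  - host: app.example.local      # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service   # placeholder backend service
            port:
              number: 80
```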
Load Balancing: MetalLB
On cloud Kubernetes, creating a Service of type LoadBalancer automatically provisions a cloud load balancer. On-premise, Kubernetes has no implementation for this by default — services just sit in `<pending>` forever, waiting for an external IP that never comes.
MetalLB solves this. It implements the LoadBalancer service type for bare-metal clusters using ARP at Layer 2 — when a service gets an IP from the pool, MetalLB announces it on the local network so traffic routes to the right node.
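Concretely, once MetalLB is running, requesting an external IP is just a matter of declaring the service type. A sketch (the name and label are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-svc        # placeholder name
spec:
  type: LoadBalancer       # MetalLB assigns an IP from its pool
  selector:
    app: example           # placeholder label
  ports:
  - port: 80
    targetPort: 8080
```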
Install
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.10/config/manifests/metallb-native.yaml
Configure
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: pool
  namespace: metallb-system
spec:
  addresses:
  - 10.20.30.100-10.20.30.105
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2-advertisement
  namespace: metallb-system
spec:
  ipAddressPools:
  - pool
kubectl apply -f metallb.yaml
The IP range 10.20.30.100-10.20.30.105 is the actual range on my network. Make sure the IPs you use aren't in your DHCP server's allocation range or assigned to anything else.
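A small bash helper can sanity-check a candidate pool IP against a range before you commit to it (the DHCP range below is a made-up example; substitute your own):

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer for comparisons.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if $1 lies within the inclusive range $2..$3.
ip_in_range() {
  local ip lo hi
  ip=$(ip_to_int "$1"); lo=$(ip_to_int "$2"); hi=$(ip_to_int "$3")
  (( ip >= lo && ip <= hi ))
}

# Example: check a candidate pool IP against a hypothetical DHCP range.
if ip_in_range 10.20.30.102 10.20.30.150 10.20.30.250; then
  echo "collision with DHCP range"
else
  echo "ok"
fi
```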
MetalLB worked well. But then I started reading about Cilium.
Replacing MetalLB and Flannel with Cilium
Cilium is a networking, security, and load balancing solution for Kubernetes built on eBPF, a Linux kernel technology that lets you run sandboxed programs in kernel space without writing kernel modules. The main draws for me were the eBPF angle (genuinely interesting technology), the security features, and the observability tooling (Hubble and Tetragon). It can also replace both the CNI (Flannel, in k3s's case) and MetalLB, which simplifies the stack.
To be honest — at my current scale (small cluster, few users), I haven't noticed any measurable performance difference from the switch. I did this to learn and explore, not because I was hitting limits. But the observability alone has been worth it.
Reinstall k3s Without Flannel and kube-proxy
To use Cilium as the CNI, k3s needs to be installed without its default networking components. On the master node:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.31.1+k3s1" sh -s - server \
--cluster-init \
--disable=servicelb \
--disable=traefik \
--flannel-backend=none \
--disable-network-policy \
--disable-kube-proxy
On worker nodes:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.31.1+k3s1" \
K3S_TOKEN=<token> sh -s - server \
--server https://<master_ip>:6443 \
--disable=servicelb \
--disable=traefik \
--flannel-backend=none \
--disable-network-policy \
--disable-kube-proxy
- `--flannel-backend=none` removes the default CNI.
- `--disable-kube-proxy` removes kube-proxy, since Cilium replaces it.
- `--disable=servicelb` removes k3s's built-in service load balancer.
- `--disable-network-policy` removes the default network policy controller, which Cilium also replaces.
Install Cilium — Basic First
I started with a basic Cilium install to make sure networking worked before adding L2 load balancing:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.16.3 \
--namespace kube-system \
--set kubeProxyReplacement=true \
--set k8sServiceHost=127.0.0.1 \
--set k8sServicePort=6444
- `kubeProxyReplacement=true` tells Cilium to fully replace kube-proxy, using eBPF instead of iptables for service routing. (Older Cilium releases called this mode `strict`; since 1.16 the option is simply `true`.)
Key considerations for this setup:
- Localhost vs node IP: Using `127.0.0.1` for `k8sServiceHost` works well when Cilium runs as a DaemonSet on nodes that also run k3s (as in this small HA setup), because it hits the local k3s supervisor proxy directly.
- How it works: When Cilium points at `127.0.0.1:6444`, it talks to the local k3s agent, which maintains a dynamic list of all available server nodes.
- Failover: If the current server fails, the local proxy immediately reroutes Cilium's traffic to a healthy one.
- The benefit: Cilium stays connected to localhost, completely unaware of the backend failure. This keeps networking up through control-plane failures and removes the need for an external load balancer in front of the API servers.
Add L2 Load Balancing
Once I confirmed everything was working, I upgraded to enable L2 announcements — Cilium's built-in equivalent of MetalLB:
helm upgrade cilium cilium/cilium --version 1.16.3 \
--namespace kube-system \
--set operator.replicas=1 \
--set l2announcements.enabled=true \
--set externalIPs.enabled=true \
--set kubeProxyReplacement=true \
--set k8sServiceHost=127.0.0.1 \
--set k8sServicePort=6444 \
--set k8sClientRateLimit.qps=50 \
--set k8sClientRateLimit.burst=100
Then configure the announcement policy — which interfaces Cilium uses to announce IPs:
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  nodeSelector: {}
  interfaces:
  - ens192
  - '^eth[0-9]+'
  externalIPs: true
  loadBalancerIPs: true
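The `interfaces` entries are matched as regular expressions, so `'^eth[0-9]+'` covers any numbered eth device. A quick local check of which interface names such a pattern would select (the names here are examples):

```shell
#!/usr/bin/env bash
# Simulate the policy's interface matching for a few example names.
pattern='^eth[0-9]+'
for iface in ens192 eth0 eth12 wlan0 lo; do
  if [[ $iface =~ $pattern || $iface == ens192 ]]; then
    echo "$iface: announced"
  else
    echo "$iface: skipped"
  fi
done
```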
And the IP pool — same range I had in MetalLB:
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
  - start: 10.20.30.100
    stop: 10.20.30.105
kubectl apply -f cilium-l2-announcement-policy.yaml
kubectl apply -f cilium-load-balancer-ip-pool.yaml
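With the pool in place, LoadBalancer services get external IPs assigned automatically. A service can also request a specific address from the pool via annotation in recent Cilium versions; a sketch (the service name, label, and IP are examples):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-svc                        # placeholder name
  annotations:
    lbipam.cilium.io/ips: "10.20.30.101"   # must fall inside a defined pool
spec:
  type: LoadBalancer
  selector:
    app: example                           # placeholder label
  ports:
  - port: 80
    targetPort: 8080
```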
Enable Hubble
Hubble is Cilium's observability platform — real-time network flow monitoring with a UI. I use it occasionally to debug traffic and see what's happening in the cluster:
helm upgrade cilium cilium/cilium \
--namespace kube-system \
--reuse-values \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true
Current State
The cluster runs k3s on three CentOS VMs, all as server nodes with etcd for HA. Cilium handles CNI, kube-proxy replacement, and L2 load balancing. NGINX handles ingress. MetalLB is gone.