Achyuta Das

HA K8s cluster using Keepalived and HAProxy

Overview

A stacked HA cluster is a topology where the distributed data storage cluster provided by etcd is stacked on top of the cluster formed by the nodes managed by kubeadm that run control plane components.

Each control plane node runs an instance of the kube-apiserver, kube-scheduler, and kube-controller-manager. The kube-apiserver is exposed to worker nodes using a load balancer.

Each control plane node creates a local etcd member and this etcd member communicates only with the kube-apiserver of this node. The same applies to the local kube-controller-manager and kube-scheduler instances.

This topology couples the control planes and etcd members on the same nodes. It is simpler to set up than a cluster with external etcd nodes, and simpler to manage for replication.

Here's what happens in a 3-node stacked cluster:

Each control plane node runs:

  • etcd member
  • kube-apiserver, scheduler, controller-manager

So, you have:

  • 3 etcd members → quorum = 2
  • 3 API servers → load balanced (can handle 1 down)

If one node fails: You still have:

  • 2 etcd members → quorum maintained
  • 2 control plane instances → still available

This is the default topology deployed by kubeadm. A local etcd member is created automatically on control plane nodes when using kubeadm init and kubeadm join --control-plane.

Assumptions: This guide assumes you have bootstrapped a cluster with kubeadm before, so not every step is covered in detail.
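
Before setting anything up, it can help to confirm the usual kubeadm prerequisites on every node. The checks below are a minimal sketch and assume containerd as the container runtime; adjust them to your environment.

# Swap must be off for the kubelet (no output means no active swap)
swapon --show

# Container runtime should be active (assumes containerd)
systemctl is-active containerd

# Bridged traffic must be visible to iptables (requires the br_netfilter module)
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward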

Setting up the machines

To set up HAProxy + Keepalived for Kubernetes High Availability (HA) with 3 master nodes and a Virtual IP (VIP), follow this structured approach:

Masters: 10.238.40.162, 10.238.40.163, 10.238.40.164
VIP: 10.238.40.166

  • Install HAProxy + Keepalived on all 3 Masters
sudo apt update 
sudo apt install -y haproxy keepalived
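
To confirm both packages installed cleanly before configuring them, a quick version check is usually enough:

haproxy -v
keepalived --version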
  • HAProxy Configuration: Edit /etc/haproxy/haproxy.cfg on all 3 master nodes. The frontend listens on port 8443 because HAProxy runs on the same nodes as the kube-apiserver, which already occupies 6443:
global
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option httplog
    option dontlognull

frontend kubernetes-apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend kubernetes-apiserver

backend kubernetes-apiserver
    mode tcp
    balance roundrobin
    option tcp-check
    server master1 10.238.40.162:6443 check fall 3 rise 2
    server master2 10.238.40.163:6443 check fall 3 rise 2
    server master3 10.238.40.164:6443 check fall 3 rise 2
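
Before restarting the service, HAProxy can validate this file itself, which catches syntax errors without disturbing a running instance:

# Prints "Configuration file is valid" when the config parses cleanly
sudo haproxy -c -f /etc/haproxy/haproxy.cfg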
  • Keepalived Configuration: Only one node at a time will "own" the VIP (managed by Keepalived), but the configuration is present on all three. Edit /etc/keepalived/keepalived.conf on each master node:

Note: Change the state and priority values for each node:

  • Master1: priority 110 (MASTER)
  • Master2: priority 100 (BACKUP)
  • Master3: priority 90 (BACKUP)
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}

vrrp_script chk_haproxy {
    script "/bin/curl -f http://localhost:6443/healthz || exit 1"
    interval 2
    weight -2
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface enp19s0
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass k8s-ha-cluster
    }
    virtual_ipaddress {
        10.238.40.166/24
    }
    track_script {
        chk_haproxy
    }
}
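
Two values in this file are environment-specific: the interface name (enp19s0 above) and the VRRP password. Confirm which interface actually carries each node's IP before copying the configuration, for example:

# List interfaces and their addresses; use the one holding the node's 10.238.40.x IP
ip -br addr show

# Show which interface would be used to reach the VIP's subnet
ip route get 10.238.40.166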
  • Restart HAProxy and Keepalived
sudo systemctl restart haproxy keepalived
sudo systemctl enable haproxy keepalived
  • Validate the VIP appears on one node
ip addr show | grep 10.238.40.166

  • Check service status
sudo systemctl status haproxy
sudo systemctl status keepalived
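
Beyond the unit status, it is worth confirming that HAProxy is actually listening on the frontend port on every master, and watching Keepalived's VRRP state transitions in the journal:

# HAProxy should be bound to 8443 on each master
sudo ss -tlnp | grep 8443

# Recent Keepalived logs show the MASTER/BACKUP transitions for VI_1
sudo journalctl -u keepalived -n 20 --no-pager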

Bootstrap the cluster

  • Create a kubeadm-config.yaml file on the first master node. Make sure to use the VIP as the control plane endpoint and include the VIP in apiServer.certSANs.

Important: Set the advertiseAddress field in InitConfiguration to the IP of the master node where you run kubeadm init (10.238.40.162 here); the other masters join later with the kubeadm join command.

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.32.6
apiServer:
  certSANs:
    - "10.238.40.166"      # VIP
    - "127.0.0.1"           # Localhost
    - "0.0.0.0"             # Wildcard
    - "10.96.0.1"           # Kubernetes service IP
    - "10.238.40.162"
    - "10.238.40.163"
    - "10.238.40.164"
  extraArgs:
    authorization-mode: Node,RBAC
certificatesDir: /etc/kubernetes/pki
clusterName: pcai
controlPlaneEndpoint: "10.238.40.166:8443"
controllerManager:
  extraArgs:
    bind-address: 0.0.0.0
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
networking:
  dnsDomain: cluster.local
  podSubnet: "172.20.0.0/16"
  serviceSubnet: "172.30.0.0/16"
scheduler:
  extraArgs:
    bind-address: 0.0.0.0
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "10.238.40.162"
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
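
Optionally, a dry run can surface problems in this configuration (preflight failures, bad CIDRs, unreachable images) before anything is written to the node, and pre-pulling the images speeds up the real init:

sudo kubeadm config images pull --config kubeadm-config.yaml
sudo kubeadm init --dry-run --config kubeadm-config.yaml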
  • Initialize the cluster
kubeadm init --upload-certs --config kubeadm-config.yaml -v=5

Note: Save the output! It contains the join commands for control plane and worker nodes.

  • Configure kubectl access
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
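
A quick way to confirm kubectl is reaching the API server through the VIP rather than a single node's address:

kubectl cluster-info
# The server URL below should be https://10.238.40.166:8443
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'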
  • Install your choice of networking solution (Calico is used here)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
  • Wait for networking pods to be ready
kubectl wait --for=condition=ready pod -l k8s-app=calico-node -n kube-system --timeout=300s
  • Run the control plane join command (from the kubeadm init output) on the other master nodes.
kubeadm join 10.238.40.166:8443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --control-plane --certificate-key <cert-key>

Note: The certificate key is only valid for 2 hours. If it expires, generate a new one:

kubeadm init phase upload-certs --upload-certs
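
If the original join output is also lost, the token half of the command can be regenerated with a standard kubeadm subcommand and combined with the new certificate key:

# Prints a fresh join command (token + CA cert hash) for worker nodes
kubeadm token create --print-join-command

# For additional control plane nodes, append:
#   --control-plane --certificate-key <new-cert-key>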

Verification and Health Checks

After setting up all control plane nodes, verify the cluster health:

  • Check all nodes are ready
kubectl get nodes -o wide
  • Verify control plane components
kubectl get pods -n kube-system
  • Check etcd cluster health
kubectl exec -n kube-system etcd-<master-node-name> -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list
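
member list only shows membership; to confirm that every member is healthy and see which one currently holds leadership, the same etcdctl invocation supports endpoint-level queries:

kubectl exec -n kube-system etcd-<master-node-name> -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --cluster -w table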
  • Test VIP failover
# Stop keepalived on the master node that owns the VIP
sudo systemctl stop keepalived

# Verify VIP moves to another node
ip addr show | grep 10.238.40.166

# Test API access via VIP
curl -k https://10.238.40.166:8443/healthz

# Restart keepalived
sudo systemctl start keepalived
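
To see the failover from a client's point of view, you can poll the health endpoint from another machine while keepalived is stopped; a brief gap of a second or two while VRRP reconverges is expected:

# Prints one HTTP status code per second; expect 200 with at most a short gap during failover
while true; do
  curl -sk -o /dev/null -w "%{http_code}\n" --max-time 2 https://10.238.40.166:8443/healthz
  sleep 1
done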

Conclusion

Congratulations! You have successfully deployed a highly available Kubernetes cluster using a stacked etcd topology with HAProxy and Keepalived. This setup provides:

Key Benefits

  • High Availability: Automatic failover with no single point of failure
  • Load Distribution: Traffic distributed across all API servers via HAProxy
  • Automatic Recovery: Keepalived handles VIP failover in seconds
  • Simplified Architecture: Stacked topology reduces complexity compared to external etcd

Cluster Capabilities

With this 3-master node configuration:

  • Tolerates 1 node failure while maintaining full cluster functionality
  • Maintains etcd quorum with 2 out of 3 members
  • Continues serving API requests through the remaining healthy masters
  • Automatically fails over VIP to operational nodes
