Overview
A stacked HA cluster is a topology where the distributed data storage cluster provided by etcd is stacked on top of the cluster formed by the nodes managed by kubeadm that run control plane components.
Each control plane node runs an instance of the kube-apiserver, kube-scheduler, and kube-controller-manager. The kube-apiserver is exposed to worker nodes using a load balancer.
Each control plane node creates a local etcd member and this etcd member communicates only with the kube-apiserver of this node. The same applies to the local kube-controller-manager and kube-scheduler instances.
This topology couples the control planes and etcd members on the same nodes. It is simpler to set up than a cluster with external etcd nodes, and simpler to manage for replication.
Here's what happens in a 3-node stacked cluster:
Each control plane node runs:
- etcd member
- kube-apiserver, scheduler, controller-manager
So, you have:
- 3 etcd members → quorum = 2
- 3 API servers → load balanced (can handle 1 down)
If one node fails, you still have:
- 2 etcd members → quorum maintained
- 2 control plane instances → still available
This is the default topology deployed by kubeadm. A local etcd member is created automatically on control plane nodes when using `kubeadm init` and `kubeadm join --control-plane`.
Assumptions: You have bootstrapped a cluster using kubeadm before, as this document won't cover every step in detail.
Kubernetes Static Pods
Static Pods are Kubernetes Pods that are run by the kubelet on a single node and are not managed by the Kubernetes cluster itself. This means that whilst the Pod can appear within Kubernetes, it can't make use of a variety of Kubernetes functionality (such as the Kubernetes token or ConfigMap resources). The static Pod approach is primarily required for kubeadm because of the sequence of actions kubeadm performs during bootstrap. Ideally, we want kube-vip to be part of the Kubernetes cluster, but for various bits of functionality we also need kube-vip to provide an HA virtual IP as part of the installation.
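As a quick illustration of where static Pods live, a sketch assuming a kubeadm host with containerd and crictl configured (paths and file names below are the kubeadm defaults):
# Static Pod manifests are plain YAML files in the kubelet's staticPodPath
ls /etc/kubernetes/manifests/
# after bootstrap this typically contains etcd.yaml, kube-apiserver.yaml,
# kube-controller-manager.yaml, kube-scheduler.yaml and kube-vip.yaml
# the kubelet runs them directly, so kube-vip is visible to the container
# runtime even before the API server is reachable
crictl ps --name kube-vip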
Setting up the infrastructure
Prerequisites:
- containerd is installed on the node.
- kubeadm, kubelet, and kubectl are installed on the node.
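A quick sanity check that the prerequisites are in place (a sketch, assuming a systemd-managed containerd):
# containerd should be running and the Kubernetes tooling on the PATH
systemctl is-active containerd
kubeadm version -o short
kubelet --version
kubectl version --client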
To set up kube-vip for Kubernetes High Availability (HA) with 3 master nodes and a Virtual IP (VIP), follow this structured approach:
- Masters: 10.238.40.162, 10.238.40.163, 10.238.40.164
- VIP: 10.238.40.166
For this validation, I have used the static Pod method of setting up kube-vip on the cluster.
Cluster bootstrap
- Generate static pod manifest
export VIP=10.238.40.166
export INTERFACE=enp19s0
# at the time of writing this doc the latest version is v0.9.2
KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")
# we will be using kube-vip container to generate the manifest
alias kube-vip="ctr image pull ghcr.io/kube-vip/kube-vip:$KVVERSION; ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip /kube-vip"
mkdir -p /etc/kubernetes/manifests
kube-vip manifest pod \
--interface $INTERFACE \
--address $VIP \
--controlplane \
--services \
--arp \
--leaderElection | tee /etc/kubernetes/manifests/kube-vip.yaml
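Before bootstrapping, it is worth eyeballing the generated manifest to confirm the VIP and interface made it in (a quick optional check):
# the exported VIP and INTERFACE values should appear in the manifest
grep -E "image:|$VIP|$INTERFACE" /etc/kubernetes/manifests/kube-vip.yaml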
- Issue: acquiring the lease for leader election fails
E0710 19:09:28.208775 1 leaderelection.go:332] error retrieving resource lock kube-system/plndr-cp-lock: leases.coordination.k8s.io "plndr-cp-lock" is forbidden: User "kubernetes-admin" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"
There is an open issue with kube-vip needing `super-admin.conf` access to boot up. This is the case from kubeadm v1.29.x onwards.
`kubeadm init` does the following:
- writes an `admin.conf` that has no binding to `cluster-admin` yet
- writes a `super-admin.conf` that has `system:masters` (super user)
- creates the `cluster-admin` binding using the `super-admin.conf`
If kube-vip needs permissions during bootstrap, it can either:
- wait for `admin.conf` to receive permissions (not possible AFAIK, given the bootstrap sequence), or
- use `super-admin.conf` during bootstrap and then move to `admin.conf`.
Modify the static Pod manifest to use the `super-admin.conf`:
sed -i 's#path: /etc/kubernetes/admin.conf#path: /etc/kubernetes/super-admin.conf#' \
/etc/kubernetes/manifests/kube-vip.yaml
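A quick grep confirms the substitution took effect:
# the hostPath volume should now point at super-admin.conf
grep -n 'super-admin.conf' /etc/kubernetes/manifests/kube-vip.yaml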
- Initialize the first master node. Use the VIP as the cluster endpoint when running `kubeadm init`:
sudo kubeadm init --control-plane-endpoint="10.238.40.166:6443" --upload-certs --apiserver-cert-extra-sans "10.238.40.166,10.238.40.164,10.238.40.163,10.238.40.162,10.96.0.1,127.0.0.1,0.0.0.0" --apiserver-advertise-address 10.238.40.162 -v=5
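Once `kubeadm init` completes, a quick way to confirm the VIP is up and serving the API (a sketch, run on the first master):
export KUBECONFIG=/etc/kubernetes/admin.conf
# the VIP should now be bound to the configured interface
ip addr show dev enp19s0 | grep 10.238.40.166
# /version is readable anonymously on kubeadm clusters
curl -k https://10.238.40.166:6443/version
kubectl get pods -n kube-system -o wide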
- Modify the static Pod manifest to use the `admin.conf`. Once the first node is initialized, update the static Pod manifest, which will restart the kube-vip Pod. Access to the cluster will be lost for a couple of minutes.
sed -i 's#path: /etc/kubernetes/super-admin.conf#path: /etc/kubernetes/admin.conf#' \
/etc/kubernetes/manifests/kube-vip.yaml
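The kubelet notices the manifest change and recreates the kube-vip Pod; a quick check that it came back up with the new kubeconfig (the mirror Pod name ends in the node name, shown here as a placeholder):
crictl ps --name kube-vip
# <node-name> is a placeholder for the first master's hostname
kubectl -n kube-system logs kube-vip-<node-name> --tail=20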
- Copy the static Pod manifest to the other master nodes before adding them to the control plane:
scp /etc/kubernetes/manifests/kube-vip.yaml root@10.238.40.163:/etc/kubernetes/manifests
scp /etc/kubernetes/manifests/kube-vip.yaml root@10.238.40.164:/etc/kubernetes/manifests
- Run the control plane node join command (output of the kubeadm init) on the other master nodes.
kubeadm join 10.238.40.166:6443 --token <> \
--discovery-token-ca-cert-hash sha256:<> \
--control-plane --certificate-key <>
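After both joins complete, a final sanity check that all three control plane nodes and their stacked etcd members are up (a sketch):
# all three masters should be Ready
kubectl get nodes -o wide
# stacked topology: one etcd Pod per control plane node
kubectl -n kube-system get pods -l component=etcd -o wide
# the VIP keeps serving the API as long as etcd quorum (2 of 3) holds
curl -k https://10.238.40.166:6443/version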
Conclusion
Congratulations! You have successfully deployed a highly available Kubernetes cluster using a stacked etcd topology with kube-vip.
