Richard Kovacs

Multi node Kubernetes cluster for local development of Kube components

I'm one of the luckiest folks around: I get to live my dream and make money from Kubernetes development (and all of that for a single feature ticket). Huge shout-out to my team at Ondat for making this possible. I spent a few days setting up my development environment, and hopefully the "end" result will be useful for others too.

The story in short: the feature we would like to deliver requires a distributed environment, and it isn't possible to test it on a single node. The Kubernetes devs did a pretty nice job on how to run a single-node Kubernetes cluster from source code on your local machine. But what about multi-node clusters? There are several solutions for running a Kubernetes cluster on one machine, but they all use some sort of image (or the right one just didn't cross my path). But I don't want to build images, push them to a registry, start a cluster, pull images, and so on. There are too many BUTs, so let's solve the problem instead!

I started my project from Techiescamp's Vagrant setup, thanks for the great work. I use this environment only for running Kubernetes and compile the source code on my host Linux machine (way faster), but all the build tools are included, so if you would like to use it as a build environment, just Go ahead, and don't forget to increase the master node's resources. With this setup I'm able to test code changes in 6 minutes on my laptop on a multi-node cluster by executing 4 simple commands.

If you are not interested in the details, just follow the readme on how to start the cluster and skip the rest of the post.

Here are the things I had to change

The systemd resolver configures localhost as the resolver ...

... which created a forwarding loop, so CoreDNS wasn't able to start. I simply replaced /etc/resolv.conf with a static config. Simple but powerful.

systemctl disable systemd-resolved
systemctl stop systemd-resolved
rm -f /etc/resolv.conf
cat <<EOF > /etc/resolv.conf
nameserver 1.1.1.1
nameserver 8.8.8.8
EOF
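
To double-check that the loop is really gone once the cluster is up, you can grep the CoreDNS logs; this is only a sanity check and assumes the usual k8s-app=kube-dns label on the CoreDNS pods.

# Run after the cluster is up: the CoreDNS logs should not mention a detected loop.
kubectl -n kube-system logs -l k8s-app=kube-dns | grep -i loop || echo "no DNS loop detected"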

Load br_netfilter module at boot time

modprobe br_netfilter
echo "br_netfilter" >> /etc/modules

The containerd in Ubuntu's base repository is too old for the HEAD of Kubernetes ...

... and the right version depends on the Kubernetes version. I think the easiest (though not the most secure) way to install a specific version is to download the release binary.

wget https://github.com/containerd/containerd/releases/download/v1.6.8/containerd-1.6.8-linux-amd64.tar.gz
sudo tar Czxvf /usr/local containerd-1.6.8-linux-amd64.tar.gz
wget https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo mv containerd.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now containerd
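
A quick sanity check that the pinned version is actually the one running (both commands are standard containerd/systemd tooling):

# Client and daemon should both report v1.6.8 once the unit is active.
sudo ctr version
systemctl is-active containerd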

Download cfssl

    curl -Lo /usr/local/bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64
    chmod +x /usr/local/bin/cfssl

    curl -Lo /usr/local/bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64
    chmod +x /usr/local/bin/cfssljson

Somehow we have to share data between Kubernetes nodes

Vagrant has a built-in shared folder, but it would require synchronizing data two (or N+) times. That doesn't sound great, so I created an NFS share on the master node instead.

if [[ $(hostname) = ${MASTER_NAME} ]]; then
    mkdir -p /var/run/kubernetes
    apt install -y nfs-kernel-server make
    cat <<EOF > /etc/exports
/var/run/kubernetes  ${MASTER_IP}/24(rw,sync,no_subtree_check,all_squash,insecure)
EOF
    exportfs -a
    systemctl restart nfs-kernel-server
else
    apt install -y nfs-common
fi
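
From a worker node you can verify that the export is visible before mounting it; showmount is part of the nfs-common package installed above.

# Run on a worker node: the export list should contain /var/run/kubernetes.
showmount -e ${MASTER_IP}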

Disable firewall

systemctl disable --now ufw
ufw reset ||:
apt remove -y ufw

Shell setup

The vagrant user jumps to the source location on login

cat <<EOF >> /home/vagrant/.bashrc
(cd ${SOURCE} ; sudo su) ; exit
EOF

Bunch of environment variables

alias k=kubectl
export CNI_CONFIG_DIR=/tmp
export LOG_LEVEL=4
export ALLOW_PRIVILEGED=1
export ETCD_HOST=${MASTER_IP}
export API_SECURE_PORT=443
export API_HOST=${MASTER_IP}
export ADVERTISE_ADDRESS=${MASTER_IP}
export API_CORS_ALLOWED_ORIGINS=".*"
export KUBE_CONTROLLERS="*,bootstrapsigner,tokencleaner"
export KUBE_ENABLE_NODELOCAL_DNS=true
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
export WHAT="cmd/kube-proxy cmd/kube-apiserver cmd/kube-controller-manager cmd/kubelet cmd/kubeadm cmd/kube-scheduler cmd/kubectl cmd/kubectl-convert"
export POD_CIDR="172.16.0.0/16"
export CLUSTER_CIDR="172.0.0.0/8"
export SERVICE_CLUSTER_IP_RANGE="172.17.0.0/16"
export FIRST_SERVICE_CLUSTER_IP="172.17.0.1"
export KUBE_DNS_SERVER_IP="172.17.63.254"
export GOPATH=/vagrant/github.com/kubernetes/kubernetes
export GOROOT=/opt/go
export PATH=/opt/go/bin:${SOURCE}/third_party:${SOURCE}/third_party/etcd:${SOURCE}/_output/local/bin/linux/amd64:${PATH}

The command to start a single-node Kubernetes cluster is simple

Execute start on the master node.

start() {
    rm -rf /var/run/kubernetes/* ||:
    KUBELET_HOST=${MASTER_IP} HOSTNAME_OVERRIDE=${MASTER_NAME} ./hack/local-up-cluster.sh -O
}
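
The -O flag makes local-up-cluster.sh skip the build and reuse binaries that were already built, which is why I compile on the host beforehand. A minimal sketch of that build step, using the WHAT variable defined above:

# Run from the Kubernetes repository root; WHAT limits the build to the listed components.
make all WHAT="${WHAT}"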

Join token generation is where the magic happens :)

I had to dig deep into the kubeadm join command to figure out how to join a new node. Execute config on the master node.

Generate join command

kubeadm token create --print-join-command > /var/run/kubernetes/join.sh
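
The generated script is a single kubeadm join line; it looks roughly like this (the token and hash below are placeholders):

# Example content of /var/run/kubernetes/join.sh (values are placeholders):
kubeadm join 192.168.56.10:443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:...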

The kubeadm bootstrap client has to be able to read cluster-info

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeadm:bootstrap-signer-clusterinfo
  namespace: kube-public
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeadm:bootstrap-signer-clusterinfo
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:anonymous
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeadm:bootstrap-signer-clusterinfo
  namespace: kube-public
rules:
- apiGroups:
  - ''
  resources:
  - configmaps
  verbs:
  - get
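
The manifest above is applied just like the later ones; for example, saved to a file first (the file path here is only an example):

# Apply the Role/RoleBinding pair; the path is an example location in the shared folder.
kubectl apply -f /var/run/kubernetes/bootstrap-signer-clusterinfo.yaml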

Create cluster-info config map

It contains the kubeconfig used for joining. This config has some requirements because of the JWT token signature validation, so it is better to generate a fresh one.

    cat <<EOFI > /var/run/kubernetes/kubeconfig
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: $(base64 -iw0 /var/run/kubernetes/server-ca.crt)
    server: https://${MASTER_IP}:${API_SECURE_PORT}/
  name: ''
contexts: []
current-context: ''
kind: Config
preferences: {}
users: []
EOFI
    kubectl delete cm -n kube-public cluster-info |:
    kubectl create cm -n kube-public --from-file=/var/run/kubernetes/kubeconfig cluster-info
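
With the RBAC from the previous step in place, an anonymous client should be able to fetch the config map straight from the API server; a quick check using the same variables could look like this:

# An unauthenticated request should return the cluster-info config map.
curl -sk "https://${MASTER_IP}:${API_SECURE_PORT}/api/v1/namespaces/kube-public/configmaps/cluster-info"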

Kubelet needs lots of permissions, so I just gave them all

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubelet:operate
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubelet:operate
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:anonymous
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubelet:operate
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'

SECURITY ALERT!!! The kubelet is running with anonymous auth, so as you can see, I gave all rights to an unauthenticated user!!!

The bootstrap client of Kubeadm also needs some permission fixes

    token_id="$(cat /var/run/kubernetes/join.sh | awk '{print $5}' | cut -d. -f1)"

    cat <<EOFI | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeadm:bootstrap-signer-kubeadm-config
  namespace: kube-system
rules:
- apiGroups:
  - ''
  resourceNames:
  - kubeadm-config
  - kube-proxy
  - kubelet-config
  resources:
  - configmaps
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeadm:bootstrap-signer-kubeadm-config
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeadm:bootstrap-signer-kubeadm-config
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:bootstrap:${token_id}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeadm:bootstrap-signer-kubeadm-config
rules:
- apiGroups:
  - ''
  resources:
  - nodes
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeadm:bootstrap-signer-kubeadm-config
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeadm:bootstrap-signer-kubeadm-config
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:bootstrap:${token_id}
EOFI

Kubeadm fetches the ClusterConfig, so I had to prepare one

    cat <<EOFI > /var/run/kubernetes/ClusterConfiguration
apiServer:
  timeoutForControlPlane: 2m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: local-up-cluster
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: ${KUBE_VERSION}
networking:
  dnsDomain: cluster.local
  podSubnet: ${POD_CIDR}
  serviceSubnet: ${SERVICE_CLUSTER_IP_RANGE}
EOFI
    kubectl delete cm -n kube-system kubeadm-config |:
    kubectl create cm -n kube-system --from-file=/var/run/kubernetes/ClusterConfiguration kubeadm-config

Copy various config files to the shared folder

    last=$(ls /tmp -t | grep "local-up-cluster.sh." | head -1)
    if [[ "${last}" ]]; then
      cp -rf /tmp/${last}/* /var/run/kubernetes
    else
      cp -rf /tmp/kube* /var/run/kubernetes
    fi

Create necessary config maps

    cat /var/run/kubernetes/kube-proxy.yaml | sed -e "s/${MASTER_NAME}/''/" -e "s/${MASTER_IP}/${NODE_IP}/" > /var/run/kubernetes/config.conf
    kubectl delete cm -n kube-system kube-proxy |:
    kubectl create cm -n kube-system --from-file=/var/run/kubernetes/config.conf kube-proxy

    cp -f /var/run/kubernetes/kubelet.yaml /var/run/kubernetes/kubelet
    kubectl delete cm -n kube-system kubelet-config |:
    kubectl create cm -n kube-system --from-file=/var/run/kubernetes/kubelet kubelet-config

Finally refresh the shared folder and set permissions

    exportfs -a

    chmod -R a+rw /var/run/kubernetes/*

Join a member to the cluster

Execute member on the worker node.

Mount the shared volume if not mounted

mkdir -p /var/run/kubernetes ; mount | grep /var/run/kubernetes 1>/dev/null || mount ${MASTER_IP}:/var/run/kubernetes /var/run/kubernetes

Create systemd service units for kube-proxy and the kubelet

  cat <<EOFI > /etc/systemd/system/kube-proxy.service
[Unit]
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/vagrant/github.com/kubernetes/kubernetes/_output/local/bin/linux/amd64/kube-proxy \
--v=3 \
--config=/var/run/kubernetes/config.conf \
--master="https://${MASTER_IP}:${API_SECURE_PORT}"
Restart=on-failure
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
EOFI

  cat <<EOFI > /etc/systemd/system/kubelet.service
[Unit]
Wants=kube-proxy
After=kube-proxy

[Service]
ExecStart=/vagrant/github.com/kubernetes/kubernetes/_output/local/bin/linux/amd64/kubelet \
--address="${NODE_IP}" \
--hostname-override=$(hostname) \
--pod-cidr="${POD_CIDR}" \
--node-ip="${NODE_IP}" \
--register-node=true \
--v=3 \
--bootstrap-kubeconfig=/var/run/kubernetes/admin.kubeconfig \
--kubeconfig=/var/run/kubernetes/admin.kubeconfig \
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \
--client-ca-file=/var/run/kubernetes/client-ca.crt \
--config=/var/run/kubernetes/kubelet.yaml
Restart=no
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
EOFI

  systemctl daemon-reload

  systemctl restart kube-proxy
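
Note that only kube-proxy is restarted here; kubeadm join is expected to (re)start the kubelet service itself during its kubelet-start phase. If you also want systemd to bring both up after a reboot, enabling the units is enough (a small addition, not part of the original flow):

# Optional: start both services on boot as well.
systemctl enable kube-proxy kubelet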

Finally execute the generated script

sh /var/run/kubernetes/join.sh

Verify Kubernetes cluster

Please follow the readme on how to start the cluster.

# k get no -o wide
NAME            STATUS   ROLES    AGE     VERSION                                    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
master-node     Ready    <none>   3m51s   v1.27.0-alpha.0.367+1fe7f09b46850f-dirty   192.168.56.10   <none>        Ubuntu 22.04.1 LTS   5.15.0-57-generic   containerd://1.6.8
worker-node01   Ready    <none>   2m22s   v1.27.0-alpha.0.367+1fe7f09b46850f-dirty   192.168.56.11   <none>        Ubuntu 22.04.1 LTS   5.15.0-57-generic   containerd://1.6.8
# k get po -Ao wide
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE     IP              NODE            NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-57b57c56f-chbnq   1/1     Running   0          2m49s   192.168.0.3     master-node     <none>           <none>
kube-system   calico-node-hlxqg                         1/1     Running   0          2m49s   192.168.56.10   master-node     <none>           <none>
kube-system   calico-node-jx4gr                         1/1     Running   0          2m29s   192.168.56.11   worker-node01   <none>           <none>
kube-system   coredns-6846b5b5f-qqhx8                   1/1     Running   0          4m27s   192.168.0.2     master-node     <none>           <none>

Here are the things I still have to change

This project is evolving based on my requirements, so here are some potential improvements.

Join command times out

The cluster works well, but the health check at the end of the join command doesn't.
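
Until that is fixed, a possible workaround is to skip relying on the join command's own health wait and confirm readiness from the master instead; a sketch (the node name is just an example):

# Run on the master after the worker executed join.sh; adjust the node name and timeout.
kubectl wait --for=condition=Ready node/worker-node01 --timeout=180s
kubectl get nodes -o wide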
