DEV Community

Laurent Noireterre for Stack Labs

How to switch container runtime in a Kubernetes cluster

As you might know, Kubernetes has deprecated Docker as a container runtime (since version 1.20). The removal of Docker support (the dockershim component) was first planned for the 1.22 release in late 2021 and finally landed in the 1.24 release.

If you are using a managed Kubernetes cluster (like GKE, EKS or AKS), you shouldn't have a lot to handle and the switch should be pretty straightforward. But if you manage a cluster yourself (with kubeadm for example) and use Docker as the container runtime, you will have to handle that runtime switch sooner or later to keep receiving Kubernetes updates.

The aim of this post is not to dive into the reasons for this change, or into container runtime behaviour in a Kubernetes cluster, but to describe step by step how to switch your container runtime from Docker to any runtime that implements the Container Runtime Interface (CRI). If you need more details on the reasons that led to the Docker deprecation, you can read the Kubernetes blog post "Don't Panic: Kubernetes and Docker".

What to check in the first place

Apart from the changes linked to the Kubernetes installation itself, the impact on the workloads running in your cluster should be limited, if not non-existent. One of the few things you have to check is whether any of your container workloads uses Docker-in-Docker by mounting the Docker socket /var/run/docker.sock. In that case you will have to find an alternative (Kaniko for example) before switching from Docker to your new container runtime.
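To get a first idea of which workloads mount the Docker socket, a check along the following lines can help. This is only a sketch: the function names list_pod_hostpaths and filter_docker_sock are made up for this example, and it only detects hostPath volumes that point directly at the socket (not, for instance, a mount of the whole /var/run directory).

```shell
# Print one line per pod: "<namespace>/<pod><TAB><hostPath> <hostPath> ...".
# Requires kubectl access to the cluster.
list_pod_hostpaths() {
  kubectl get pods --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.volumes[*].hostPath.path}{"\n"}{end}'
}

# Read the lines produced above on stdin and keep only the pods whose
# hostPath volumes include /var/run/docker.sock or /run/docker.sock.
filter_docker_sock() {
  awk -F'\t' '(" " $2 " ") ~ / \/(var\/)?run\/docker\.sock / { print $1 }'
}

# Usage (on a machine with cluster access):
#   list_pod_hostpaths | filter_docker_sock
```

Any pod printed by this pipeline deserves a closer look before you uninstall Docker.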

It's also strongly advised to back up your data before proceeding with the container runtime switch!

Let's proceed with the changes!

Now that you are ready to apply the container runtime switch, let's get started. I will use containerd as the container runtime in this post, but the steps below can be adapted to any CRI-compatible container runtime (like CRI-O).

We will start with the worker nodes and finish with the control plane.

Worker nodes

The steps below have to be applied on each worker node.

1. First, cordon and drain the node so that no new workload is scheduled or executed on it during the procedure.

kubectl cordon <node_name>
kubectl drain <node_name>

Remark: if you have DaemonSets running on the node, use the --ignore-daemonsets flag so that the drain proceeds without evicting the pods managed by a DaemonSet (the drain command cannot evict them anyway, and fails without this flag). Don't worry, these pods will be restarted automatically by the kubelet with the new container runtime at the end of the procedure. If you have critical workloads managed by DaemonSets and don't want them running during the process, you can either restrict them with a nodeSelector on the DaemonSet or uninstall them and reinstall them at the end of the process.
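As an illustration of the remark above, the cordon and drain can be wrapped in a small helper. This is a sketch, not a required part of the procedure: the function name drain_node and the 300 second timeout are arbitrary choices for this example.

```shell
# Cordon a node, then drain it while leaving DaemonSet-managed pods in place.
# The timeout makes the drain give up instead of waiting forever on a pod
# that cannot be evicted.
drain_node() {
  node="$1"
  kubectl cordon "$node" &&
    kubectl drain "$node" --ignore-daemonsets --timeout=300s
}

# Usage:
#   drain_node <node_name>
```

If the drain stops because some pods use emptyDir volumes or are not managed by a controller, read the error message carefully before adding further flags, since those can delete local data.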

2. Once the node is drained, stop the kubelet service:

sudo systemctl stop kubelet
sudo systemctl status kubelet

3. Uninstall Docker.
I will not detail the commands here, as they depend on your Linux distribution and on the way you installed Docker. Just be careful: if you want to completely clean up Docker artifacts, you might have to remove some files manually (for example /var/lib/docker).

You can check the Docker documentation for help with uninstalling the engine.

4. Install containerd (same here, I'll let you choose your favorite installation method from the containerd documentation).

5. Enable and start the containerd service:

sudo systemctl enable containerd
sudo systemctl start containerd
sudo systemctl status containerd

6. Kubernetes communicates with containerd through its CRI plugin. Make sure this plugin is not disabled in your containerd installation: open the config file /etc/containerd/config.toml and check that "cri" does not appear in the disabled_plugins list (some packages ship with it disabled):

disabled_plugins = []

Then restart the containerd service if you changed the file:

sudo systemctl restart containerd
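If you have many nodes to convert, the check on disabled_plugins can be scripted. The snippet below is a minimal sketch: cri_plugin_disabled is a hypothetical helper name, and the pattern only handles a disabled_plugins list written on a single line.

```shell
# Succeed (exit status 0) if "cri" is listed in disabled_plugins in the given
# containerd config file, fail otherwise.
cri_plugin_disabled() {
  config="${1:-/etc/containerd/config.toml}"
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config"
}

# Usage:
#   if cri_plugin_disabled; then
#     echo "CRI plugin is disabled, edit /etc/containerd/config.toml" >&2
#   fi
```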

7. Edit the kubelet configuration file /var/lib/kubelet/kubeadm-flags.env and add the following flags to the KUBELET_KUBEADM_ARGS variable (adapt the container-runtime-endpoint path if needed). On Kubernetes 1.24 and later only the endpoint flag is needed, and the --container-runtime flag was removed in 1.27:

--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock
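The edit of kubeadm-flags.env can also be scripted, for example as follows. Treat this as a sketch under a few assumptions: the helper name add_containerd_flags is made up, the file is expected to contain a single line of the form KUBELET_KUBEADM_ARGS="...", and the endpoint is written with an explicit unix:// scheme. A backup copy with a .bak suffix is kept next to the file.

```shell
# Append the containerd flags to KUBELET_KUBEADM_ARGS in a kubeadm flags file.
# Does nothing if a runtime endpoint is already configured.
add_containerd_flags() {
  env_file="${1:-/var/lib/kubelet/kubeadm-flags.env}"
  flags='--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock'

  if grep -q -- '--container-runtime-endpoint' "$env_file"; then
    return 0
  fi

  # Keep a backup, then rewrite the file with the flags inserted just
  # before the closing double quote.
  cp "$env_file" "$env_file.bak" || return 1
  sed "s|^\(KUBELET_KUBEADM_ARGS=\".*\)\"\$|\1 $flags\"|" "$env_file.bak" > "$env_file"
}

# Usage (as root, with the kubelet stopped):
#   add_containerd_flags
```

Review the resulting file before starting the kubelet again, in particular if it already contained Docker-specific flags.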

8. Start the kubelet:

sudo systemctl start kubelet

9. Check that the node now reports the new runtime:

kubectl describe node <node_name>

You should see the container runtime version and name:

System Info:
  Machine ID:                 21a5dd31f86c4
  System UUID:                4227EF55-BA3BCCB57BCE
  Boot ID:                    77229747-9ea581ec6773
  Kernel Version:             3.10.0-1127.10.1.el7.x86_64
  OS Image:                   Red Hat Enterprise Linux Server 7.8 (Maipo)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.4.3
  Kubelet Version:            v1.20.2
  Kube-Proxy Version:         v1.20.2

10. Uncordon the node to mark it as schedulable again, and check that your pods are running:

kubectl uncordon <node_name>

That's it! Once all your pods have been restarted, you can proceed with the next worker node.
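When you work through many nodes, it helps to see at a glance which ones still report Docker. The sketch below reads the same "Container Runtime Version" value shown by kubectl describe node; the function names list_node_runtimes and nodes_still_on_docker are hypothetical.

```shell
# Print "<node><TAB><runtime version>" for every node in the cluster.
list_node_runtimes() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
}

# Read the lines produced above on stdin and print the nodes that still
# report a docker:// runtime.
nodes_still_on_docker() {
  awk -F'\t' '$2 ~ /^docker:\/\// { print $1 }'
}

# Usage:
#   list_node_runtimes | nodes_still_on_docker
```

An empty result means every node has been switched.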

Control Plane

The procedure to switch the container runtime on master nodes is exactly the same as on the worker nodes. However, you have to be careful if you run a single master node configuration. While the new container runtime pulls the kube-apiserver, etcd and coredns images and creates the corresponding containers, the cluster will be unavailable, and you won't be able to run kubectl commands either.
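On a single master setup, a small polling loop can tell you when the API server answers again. This is a sketch only: wait_for_apiserver is a made-up helper, the retry count and delay are arbitrary defaults, and it assumes the API server exposes the /readyz endpoint.

```shell
# Poll the API server until it reports ready, or give up after a number of
# attempts. Arguments: [attempts] [delay in seconds].
wait_for_apiserver() {
  attempts="${1:-60}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl get --raw='/readyz' >/dev/null 2>&1; then
      echo "API server is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "API server still unavailable after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_apiserver 60 5
```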

Here are some tips to help you follow the startup of the new container runtime and troubleshoot potential problems:

1. Use journalctl to follow the kubelet logs:

journalctl -u kubelet

2. Watch the containerd logs as well:

journalctl -u containerd

3. Use the crictl command to follow container creation:

crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps

4. At the end of the switch, check that your master nodes are using the new container runtime by running a describe command on them:

kubectl describe node <master_node_name>
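To avoid passing --runtime-endpoint on every crictl call from the tips above, the endpoint can be stored in the crictl configuration file. The helper below is a sketch: write_crictl_config is a hypothetical name, and the default path /etc/crictl.yaml and the 10 second timeout are assumptions to adapt to your installation.

```shell
# Write a minimal crictl configuration pointing at the containerd socket.
# Arguments: [target file] [endpoint].
write_crictl_config() {
  target="${1:-/etc/crictl.yaml}"
  endpoint="${2:-unix:///run/containerd/containerd.sock}"
  cat > "$target" <<EOF
runtime-endpoint: $endpoint
image-endpoint: $endpoint
timeout: 10
EOF
}

# Usage (as root):
#   write_crictl_config
#   crictl ps
```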

Congratulations! You are now running a Kubernetes cluster without Docker and are ready for future releases!

Discussion (2)

jackgit28

Very useful and concise, thank you!

I encountered one issue after performing the above on a cluster. The fix below resolved it for me, hope it comes in handy for others!

Environment (context):
Host nodes: Ubuntu 21.04 (amd64 arch), but I imagine it is much the same for any cluster
Moving from: k8s v1.21.5
Moving to: k8s v1.22.3
Cluster installed with: kubeadm

Problem:
Upgrading the k8s version with kubeadm failed as follows:
k8s-cp-node:~# kubeadm upgrade plan
k8s-cp-node:~# kubeadm upgrade apply vX.X.XX
Error:
"docker is required for container runtime: exec: "docker": executable file not found in $PATH"

Cause:
As the Docker runtime had been uninstalled, it was of course no longer present on the nodes. On top of that, a lingering annotation (applied by kubeadm on initial cluster install) was still pointing to dockershim's unix socket on each node, which was blocking the k8s version upgrade:

Check your nodes' cri-socket annotation with:
$ kubectl describe node | grep Annotations -A5
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
...

Fix:
I moved over to using containerd as my container runtime.

  1. Check the location of your new container runtime unix socket (runtime endpoint) before changing anything. Mine was exactly as per Laurent's main article above.

  2. Update your "cri-socket" node annotations (for ALL your nodes) before you upgrade the k8s version. Run the command below for each of your nodes, replacing <node_name> with the actual name of your node. This worked for me:
    $ kubectl annotate node <node_name> --overwrite kubeadm.alpha.kubernetes.io/cri-socket=/var/run/containerd/containerd.sock

  3. You can check the annotation(s) after changing them:
    $ kubectl describe node | grep Annotations -A5

  4. Proceed with your k8s cluster version upgrade as normal. You should no longer get complaints about a missing docker executable...

Other Notes
I also had a few static control plane pods (api-server etc.) stuck in "Pending" during the upgrade and had to force-delete them as follows, after which all was good...
$ kubectl -n kube-system delete pod kube-apiserver- --force --grace-period 0
(replace with your actual pod name in the command of course)
Static control plane pods are automatically re-created by the kubelet on the node when it sees they are missing.

Happy k8sing all!

Marc

I just installed Kubernetes (k3s) and it demanded that I install Docker as a prerequisite. If Docker is deprecated, this makes no sense!