A few weeks ago, the Kubernetes development team announced that they are deprecating Docker. This piece of news made the rounds through tech communities and social networks alike. Will Kubernetes clusters break, and if so, how will we run our applications? What should we do now? Today, we’ll examine all these questions and more.
Let’s start from the top. If you’re already familiar with Docker and Kubernetes and want to get to the juicy parts, skip ahead to the section on how the Dockershim deprecation impacts you.
Even though Docker is used as a synonym for containers, the reality is that containers existed long before Docker was a thing. Unix and Linux have had containers in some form or another since the late 70s, when chroot was introduced. Chroot allowed system admins to run programs in a kind-of (but not really) isolated filesystem. Later, the idea was refined and enhanced into container engines such as FreeBSD Jails, OpenVZ, and Linux Containers (LXC).
But what are containers?
A container is a logical partition where we can run applications isolated from the rest of the system. Each application gets its own private network and a virtual filesystem that is not shared with other containers or the host.
Running containerized applications is a lot more convenient than installing and configuring software by hand. For one thing, containers are portable: we can build an image on one server with the confidence that it will run on any other. Another advantage is that we can run multiple copies of the same program simultaneously without conflict or overlap, something that is really hard to do otherwise.
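As a quick sketch of that last point, here is how two copies of the same server can run side by side (this assumes Docker is installed and uses the public `nginx` image; the container names and ports are arbitrary):

```shell
# Each container gets its own filesystem and network namespace,
# so the only thing we need to vary is the host port mapping.
docker run -d --name web1 -p 8080:80 nginx
docker run -d --name web2 -p 8081:80 nginx

# Both copies now answer independently on their own ports:
curl -s localhost:8080 >/dev/null && echo "web1 is up"
curl -s localhost:8081 >/dev/null && echo "web2 is up"
```

Doing the same with two copies of a natively-installed web server would mean juggling config files, data directories, and PID files by hand.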
However, for all this to work, we need a container runtime, a piece of software capable of running containers.
Docker is the most popular container runtime — by a long shot. It shouldn’t be surprising, as it brought the concept of containers into the mainstream, which in turn inspired the creation of platforms like Kubernetes.
Before Docker, running containers was indeed possible, but it was hard work. Docker made things simple because it’s a complete tech stack that can:
- Manage container lifecycle.
- Proxy requests to and from the containers.
- Monitor and log container activity.
- Mount shared directories.
- Set resource limits on containers.
- Build images. The `Dockerfile` is the de-facto format for building container images.
- Push and pull images from registries.
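To make the image-building point concrete, here is a minimal, hypothetical `Dockerfile` for a small Python web app (the file names and dependency are placeholders):

```dockerfile
# Start from a slim Python base image
FROM python:3.9-slim
WORKDIR /app
# Add our (hypothetical) application code and install its dependency
COPY app.py .
RUN pip install flask
EXPOSE 8000
# Command to run when the container starts
CMD ["python", "app.py"]
```

Running `docker build -t myapp .` in the same directory turns this recipe into an image, which `docker push` can then upload to a registry.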
In its first iterations, Docker used Linux Containers (LXC) as the runtime backend. As the project evolved, LXC was replaced by containerd, Docker’s own implementation. A modern Docker installation is divided into two services:
- containerd, responsible for managing containers.
- dockerd, which does all the rest.
Kubernetes takes the idea of containers and turns it up a notch. Instead of running containerized applications in a single server, Kubernetes distributes them across a cluster of machines. Applications running in Kubernetes look and behave like a single unit, even though, in reality, they may consist of an arrangement of loosely-coupled containers.
Kubernetes adds distributed computing features on top of containers:
- Pods: pods are logical groups of containers that share resources like memory, CPU, storage, and network.
- Auto-scaling: Kubernetes can automatically adapt to changing workloads by starting and stopping pods as needed.
- Self-healing: containers are monitored and restarted on failure.
- Load-balancing: requests are distributed over the healthy available pods.
- Rollouts: Kubernetes supports automated rollouts and rollbacks, making otherwise complex procedures like canary and blue-green releases trivial.
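Several of these features come together in a single Kubernetes object. The following is a minimal, hypothetical Deployment (names and image are placeholders): the replica count gives us multiple load-balanced pods, the liveness probe drives self-healing, and the RollingUpdate strategy drives automated rollouts.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # Kubernetes keeps 3 pods running at all times
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate        # new versions roll out gradually, with rollback support
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.21
          ports:
            - containerPort: 80
          livenessProbe:       # failed probes trigger a container restart
            httpGet:
              path: /
              port: 80
```

Applying this with `kubectl apply -f deployment.yaml` is all it takes; the control plane handles scheduling, monitoring, and restarts from there.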
We can think of Kubernetes’ architecture as a combination of two planes:
- The control plane is the coordinating brain of the cluster. It has a controller that manages nodes and services, a scheduler that assigns pods to nodes, and the API server, which handles communication. Configuration and state are stored in a highly-available database called etcd.
- The worker nodes are the machines that run the containers. Each worker node runs a few components like the kubelet agent, a network proxy, and the container runtime. The default container runtime up to Kubernetes version v1.20 was Docker.
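You can see which runtime your worker nodes are using today, which is useful context for the rest of this article (this assumes `kubectl` is configured against your cluster):

```shell
# The wide output includes a CONTAINER-RUNTIME column,
# showing e.g. docker://20.10.x or containerd://1.4.x per node.
kubectl get nodes -o wide
```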
Before starting a container, we need to either build or download a container image, which is a filesystem packed with everything the application needs: code, binaries, configuration files, libraries, and dependencies.
The rise in popularity of containers showed the need for an open image standard. As a result, Docker Inc and CoreOS established the Open Container Initiative (OCI) in 2015, with the mission of producing vendor-neutral formats. The result of this effort was the creation of two standards:
- An image specification that defines the image binary format.
- A runtime specification that describes how to unpack and run a container. OCI maintains a reference implementation called runc. Both containerd and CRI-O use runc in the background to spawn containers.
The OCI standard brought interoperability among different container solutions. As a result, images built in one system can run in any other compliant stack.
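A sketch of what that interoperability looks like in practice, using Podman as an example of a second OCI-compliant engine (the registry and image names here are hypothetical):

```shell
# Build and publish an image with Docker...
docker build -t registry.example.com/myapp:1.0 .
docker push registry.example.com/myapp:1.0

# ...and run the very same image with a different engine.
# No conversion step is needed: both tools speak OCI.
podman run --rm registry.example.com/myapp:1.0
```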
Here is where things get a bit more technical. I said that each Kubernetes worker node needs a container runtime. In its original design, Docker was inseparable from Kubernetes because it was the only supported runtime.
Docker, however, was never designed to run inside Kubernetes. Realizing this problem, the Kubernetes developers eventually implemented an API called Container Runtime Interface (CRI). This interface allows us to choose among different container runtimes, making the platform more flexible and less dependent on Docker.
This change introduced a new difficulty for the Kubernetes team since Docker doesn’t know about or support the CRI. Hence, at the same time the API was introduced, they had to write an adaptor called Dockershim to translate CRI messages into Docker-specific commands.
While Docker was the first and only supported engine for a time, it was never on the long-term plans. Kubernetes version 1.20 deprecates Dockershim, kicking off the transition away from Docker.
Once the transition is done, the stack gets significantly smaller: instead of the kubelet talking to Docker through Dockershim, with Docker in turn delegating to containerd, the kubelet will talk directly to a CRI-compliant runtime such as containerd or CRI-O.
The result is less bloat and fewer dependencies needed on each of the worker nodes.
So, why the change?
Simply put, Docker is heavy. We get better performance with a lightweight container runtime like containerd or CRI-O. As a recent example, Google benchmarks have shown that containerd consumes less memory and CPU, and that pods start in less time than on Docker.
Besides, in some ways Docker itself can be considered technical debt. What Kubernetes needs from Docker is, in fact, the container runtime: containerd. The rest, at least as far as Kubernetes is concerned, is overhead.
Things are not as dramatic as they sound. Let’s preface this whole section by saying that the only thing that changes in v1.20 is that you’ll get a deprecation warning, and only if you’re running Docker. That’s all.
Can I still use Docker for development?
Yes, you absolutely can, now and in the foreseeable future. You see, Docker doesn’t run Docker-specific images; it runs OCI-compliant containers. As long as Docker continues using this format, Kubernetes will keep accepting them.
Can I still package my production apps with Docker?
Yes, for the same reasons as in the previous question. Applications packaged with Docker will continue to run — no change there. Thus, you can still build and test containers with the tools you know and love. You don’t need to change your CI/CD pipelines or switch to other image registries; Docker-produced images will continue to work in your cluster just as they always have.
What do I need to change?
Right now, nothing. If your cluster uses Docker as a runtime, you’ll get a deprecation warning after upgrading to v1.20. But the change is a clear signal from the Kubernetes community about the direction they want to take. It’s time to start planning for the future.
When is the change going to happen?
The plan is to have all Docker dependencies completely removed by v1.23 in late 2021.
When Dockershim goes away, what will happen?
At that point, Kubernetes cluster admins will be forced to switch to a CRI-compliant container runtime.
If you are an end-user, not a lot changes for you. Unless you are running some kind of node customization, you probably won’t have to do anything special, other than testing that your applications work with the new container runtime.
These are some of the things that will cause problems or break after upgrading to v1.23:
- Using Docker-specific logging and monitoring; that is, parsing Docker messages from a log or polling the Docker API.
- Relying on Docker-specific optimizations.
- Running scripts that rely on the `docker` command-line tool.
- Running Docker commands in privileged pods, for instance to build images with `docker build`. See projects like kaniko for alternative solutions.
- Using Docker-in-Docker setups.
- Running Windows containers. Containerd does work on Windows, but its support level is not yet up to par with Docker’s. The objective is to have stable Windows support in an upcoming containerd release.
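On the `docker build` point, here is a sketch of what a daemonless image build can look like with kaniko. This is a hypothetical pod: the Git repository and registry destination are placeholders, and a real setup would also mount registry credentials.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        # kaniko builds from a Dockerfile entirely in userspace:
        # no Docker daemon, no privileged mode required.
        - --dockerfile=Dockerfile
        - --context=git://github.com/example/repo.git   # hypothetical repo
        - --destination=registry.example.com/myapp:1.0  # hypothetical registry
```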
If you’re using a managed cluster on a cloud provider like AWS EKS, Google GKE, or Azure AKS, check that your cluster uses a supported runtime before Docker support goes away. Some cloud vendors are a few versions behind, so you may have more time to plan; check with your provider to be sure. To give an example, Google Cloud announced that they are changing the default runtime from Docker to containerd for all newly-created worker nodes, but you can still opt in to Docker.
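On GKE, for instance, the runtime is determined by the node image type. A sketch of both options (the cluster and pool names here are hypothetical):

```shell
# Create a node pool whose nodes use the containerd-based image...
gcloud container node-pools create containerd-pool \
  --cluster my-cluster --image-type=COS_CONTAINERD

# ...or explicitly opt back in to the Docker-based image.
gcloud container node-pools create docker-pool \
  --cluster my-cluster --image-type=COS
```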
If you run your own cluster: in addition to checking the points mentioned above, you will need to evaluate moving to another container runtime that is fully compatible with CRI. The Kubernetes docs explain the steps in detail.
Kubernetes is growing, but the change doesn’t need to be a traumatic experience. Most users won’t have to take any action. For those who do, there’s still time to test and plan.