Unlocking Docker: The Not-So-Obvious Stuff Made Simple

This article digs into Docker at a deeper level, focusing mostly on the non-trivial stuff.
Disclaimer: some of this might not come in handy day to day, but it is definitely interesting and simple at the same time.

Docker architecture

Docker uses a client-server architecture: your docker CLI is the client, sending commands to the Docker daemon process, dockerd. The daemon is the core engine responsible for managing all Docker objects (containers, images, volumes, networks).

When you run commands like docker build . or docker run nginx, the CLI parses your input and communicates over a REST API with the daemon. This communication usually happens over a Unix socket (e.g., /var/run/docker.sock) on your local machine, but it can also work remotely over TLS or SSH.
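
Since it really is just a REST API, you can talk to the daemon directly. A quick sketch, assuming the default socket path (the exact API version segment varies by Docker release):

```bash
# list running containers by hitting dockerd's API over the Unix socket,
# which is exactly what `docker ps` does under the hood
curl --unix-socket /var/run/docker.sock http://localhost/v1.43/containers/json

# the same endpoint without pinning an API version also works
curl --unix-socket /var/run/docker.sock http://localhost/containers/json
```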

The daemon coordinates multiple subsystems:
It uses containerd (the container runtime) to handle container lifecycle tasks, including starting, stopping, and pausing containers.
It interacts with Linux kernel features like namespaces (for isolation) and cgroups (for resource control), both covered in depth below, as well as security modules (seccomp, AppArmor) to ensure containers run securely and efficiently.
It orchestrates networking setup, volume mounts, image caches, and registry communication to pull and push images.

Because the daemon runs as a long-lived service, it manages containers persistently: depending on each container's restart policy, it can bring containers back up after a daemon restart or a system reboot.
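
That restart behavior is opt-in, per container. For example:

```bash
# restart the container automatically unless it was explicitly stopped,
# including after the daemon or the host comes back up
docker run -d --restart unless-stopped --name web nginx

# check which restart policy a container was given
docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' web
```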

Layering in docker:

Every time you build, pull, or push an image, Docker works with layers. These layers correspond to the instructions in your Dockerfile. When rebuilding, only layers that changed get rebuilt, which speeds things up. Chaining shell commands with && inside a single RUN instruction collapses them into one layer, reducing total layers and image size. Multi-stage builds take this further: separate build stages produce artifacts, and only what you copy into the final stage ends up in the shipped image.
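
A small sketch of both ideas together (the Go app and stage names here are made up for illustration):

```dockerfile
# build stage: full toolchain, never shipped
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# static binary, so it also runs on the alpine stage below
ENV CGO_ENABLED=0
# chaining with && keeps vet + build in a single layer
RUN go vet ./... && go build -o /out/app .

# runtime stage: only the compiled binary is copied over
FROM alpine:3.19
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["app"]
```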

How layering actually works:
Layers are stored using content-addressable storage, meaning each layer is identified and referenced by a cryptographic hash of its content, not just a sequential number or name. This hash ensures two identical layers, even from different images, are stored only once on disk, making image storage and transfer extremely efficient.

When pulling an image, Docker downloads layers by their hashes, then extracts each into its own directory on the host filesystem. When running a container, Docker uses a *union filesystem* (like OverlayFS) to stack these immutable layers one on top of the other and presents a single unified filesystem view inside the container. The writable container layer sits on top, capturing any changes you make at runtime.
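
You can see those stacked directories for yourself. A sketch, assuming the default overlay2 storage driver:

```bash
docker run -d --name layers-demo nginx

# the overlay2 driver exposes the stacked directories:
#   LowerDir  = the read-only image layers
#   UpperDir  = the container's writable layer
#   MergedDir = the unified view the container actually sees
docker inspect --format '{{ json .GraphDriver.Data }}' layers-demo
```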

Isolation in docker - namespaces and cgroups:

Ever wondered how Docker manages its containers with complete isolation, how it never lets these containers go rogue, or how it limits a container's resources so that it won't drain the host machine?
Docker uses a Linux kernel feature called namespaces to provide the isolated workspace we call a container. When you run a container, Docker creates a set of namespaces for it.

These namespaces provide a layer of isolation. Each aspect of a container runs in a separate namespace and its access is limited to that namespace.

When starting a container, Docker creates different namespace types:
PID (process isolation)
NET (network isolation)
MNT (filesystem isolation)
UTS (hostname/domain isolation)
IPC (inter-process communication isolation)
USER (user/group isolation)
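
You can observe these namespaces on a live container. A minimal sketch (the container name is arbitrary):

```bash
docker run -d --name ns-demo nginx

# find the PID of the container's main process on the host
pid=$(docker inspect --format '{{.State.Pid}}' ns-demo)

# each symlink here is one namespace that process lives in
sudo ls -l /proc/"$pid"/ns

# lsns lists namespaces host-wide, grouped with their processes
sudo lsns | grep "$pid"
```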

Cgroups in linux:
Control groups (cgroups) are a Linux kernel feature for managing and limiting resource usage, such as CPU, memory, disk I/O, and network bandwidth, at a fine-grained level. Think of cgroups as a hierarchical rulebook for resource control.

You can imagine a root system cgroup that branches into child cgroups, such as cpu and memory. When you set resource limits for a service, you write constraints into these cgroup files along with the service’s process IDs (PIDs).

Inheritance works like this: child cgroups inherit resource restrictions from their parents, meaning you cannot allow a child to exceed the parent's limits. However, child cgroups can impose stricter limits within the boundaries set by their parents, allowing precise resource allocation in a hierarchy.
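
You can build such a hierarchy by hand. A minimal sketch, assuming a cgroup v2 system (unified hierarchy mounted at /sys/fs/cgroup, with the cpu and memory controllers enabled in the parent's cgroup.subtree_control):

```bash
# create a child cgroup under the root
sudo mkdir /sys/fs/cgroup/demo

# cap memory at 200 MB
echo $((200 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/demo/memory.max

# cap CPU at 20%: 20000us of runtime per 100000us period
echo "20000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max

# move the current shell (and its children) into the cgroup
echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs
```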

Docker and Kubernetes rely heavily on cgroups to isolate containers and enforce resource limits. When you specify resource requests or limits in Kubernetes deployments, they ultimately get translated into cgroup settings applied by the container runtime.

How to utilize:
Interaction is not solely via files in /sys/fs/cgroup; higher-level tools like systemd and cgclassify can also manage cgroups, and systemd-cgls lets you inspect the hierarchy at a glance.
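
With Docker you rarely touch those files directly; the run flags do it for you. A sketch reproducing the limits discussed just below (200 MB of memory, 20% of one CPU):

```bash
docker run -d --name limited --memory=200m --cpus=0.2 nginx

# the limits as Docker recorded them: bytes of memory, and NanoCpus
docker inspect --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}' limited

# live usage versus the 200 MB cap
docker stats --no-stream limited
```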

While setting up this container, I had set the limits to 200 MB for memory and 20% of one CPU, which is exactly what the commands above report.

SHA in docker:

Ever faced a scenario where one of your services started misbehaving because the base image your whole application was built on got an update that broke the application? This happens when the same image tag gets republished: if the service restarts and its ImagePullPolicy is set to Always, it pulls what looks like the same tag, but the image behind it is no longer the same. Here is the solution.

Each image is also associated with a digest SHA — very much like a git commit. By specifying the SHA instead of the version tag, you can pull a specific version of the image. Something like docker pull ubuntu@sha256:sha_digest_here.
All in all, if you want to save yourself a headache, just pin your image to a SHA version like so:

```yaml
build:
  image: ubuntu@sha256:3235326357dfb65f1781dbc4df3b834546d8bf914e82cce58e6e6b676e23ce8f
  commands:
    - build_my_wonderful_app.sh
```

You can find the SHA by either pulling the image (it shows the digest once the image is pulled) or by inspecting the image.
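
Concretely (the repository name here is just an example):

```bash
# digests appear in `docker pull` output and in this listing
docker images --digests ubuntu

# or read the pinned digest straight from the image metadata
docker inspect --format '{{index .RepoDigests 0}}' ubuntu
```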

You can find all of these hashes through the docker inspect command.
In its output, you will notice three sha256 hashes, each of which corresponds to a different use case.
First, the Id field: the first 12 characters of this sha256 hash match the IMAGE ID column in the docker images output.

Second, the sha256 hash in the RepoDigests field of the docker inspect output is generated when the image is pushed to an image registry.

Third, the hashes under RootFS.Layers identify the individual filesystem layers that make up the image, including the layers inherited from its base image. Because a sha256 digest is computed from content, it is immutable: rebuilding or re-pushing an image with different content produces different hashes, while identical content always maps to the same digest.
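
A quick sketch pulling all three out of one image (any tag you have pulled works):

```bash
docker pull ubuntu:22.04

docker inspect --format '{{.Id}}' ubuntu:22.04                 # 1: image ID
docker inspect --format '{{json .RepoDigests}}' ubuntu:22.04   # 2: registry digest(s)
docker inspect --format '{{json .RootFS.Layers}}' ubuntu:22.04 # 3: layer digests
```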

In depth: everything that happens when you run a docker cmd:

As we know, at a high level Docker works on a client-server model. The command you type in your terminal is the Docker CLI client talking to the Docker daemon (dockerd), which is like the brain running on your machine or a remote server. This daemon is responsible for managing images, containers, networking, and storage.

Step 1: The Docker CLI Parses Your Command
When you run a Docker command — for example, docker run nginx — the Docker CLI first parses the command and its parameters. It checks if your syntax is correct and builds a structured request to send to the Docker daemon’s API.

Step 2: The Docker Client Sends a Request to dockerd
This request is sent seamlessly to the Docker daemon, whether on your local machine or a remote host. The daemon listens for these API calls and interprets your intent. The daemon is the heavyweight that does all the work; your CLI is the friendly middle layer making interactions easy.

Step 3: dockerd Handles the Request
Depending on the command, dockerd takes different actions. Let's take docker build . as an example (the docker run flow is what Steps 4 and 5 walk through):

For docker build .:
The daemon reads your build context (the . means the current directory; plenty of people type it without ever knowing that this is what it does) and looks for a Dockerfile inside it. It processes each Dockerfile instruction layer by layer, fetching base images if needed and assembling intermediate image layers. It uses content-addressed storage to reuse layers and optimize build times. Once complete, it stores the final image locally with a unique ID (and SHA digest).

Step 4: Docker Engine & Container Runtime Interaction
The Docker daemon relies on a container runtime underneath (containerd by default, which in turn invokes an OCI runtime such as runc) to handle the container lifecycle and OS-level features. This runtime manages Linux kernel features like namespaces for isolation and cgroups for resource limits.
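
You can ask the daemon which OCI runtime it is configured to hand containers to:

```bash
docker info --format '{{.DefaultRuntime}}'
# typically prints: runc
```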

Step 5: Container Starts and Runs
Once all the setup is done:
The container’s root filesystem is a writable layer stacked on the underlying image layers (read-only).
Its processes run isolated within their namespaces, meaning they see their own filesystem, network stack, and process tree.
Resource usage is controlled by the cgroups assigned to the container, ensuring no overuse beyond limits.
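
The writable layer is easy to see in action; docker diff lists exactly what a container changed on top of its image layers:

```bash
docker run -d --name diff-demo nginx
docker exec diff-demo touch /tmp/scratch.txt

# A = added, C = changed, D = deleted, relative to the image layers
docker diff diff-demo
```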

Hope you guys liked this info!
ping me via my socials for interesting discussions on DevOps (or anything random).
Thanks for reading!
