jeswin cyriac

What Actually Happens in the Linux Kernel When You Run docker run?

Most explanations of containers stop at "it's like a lightweight VM." I wanted to go deeper into what actually happens in the Linux
kernel when you run docker run.

I built an interactive blog that walks through it step by step, with animations you can click through:

Filesystem isolation

  • How OverlayFS stacks read-only image layers with a writable layer using copy-on-write
  • How pivot_root() swaps the root mount of the container's mount namespace, so the old host root can be unmounted and the process can't reach it anymore
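
To make the copy-on-write idea concrete, here is a minimal sketch in Python. It is a toy model, not how the kernel implements OverlayFS: layers are plain dicts, and the names ToyOverlay and WHITEOUT are made up for illustration. It only shows the lookup order (upper layer first, then the image layers), copy-up on first write, and whiteouts for deletes.

```python
WHITEOUT = object()  # hypothetical marker meaning "deleted in the upper layer"


class ToyOverlay:
    """Toy model of an overlay mount: read-only lower layers plus one writable upper layer."""

    def __init__(self, lower_layers):
        # lower_layers: list of dicts {path: contents}, topmost image layer first.
        self.lower_layers = lower_layers
        self.upper = {}

    def read(self, path):
        # The upper layer wins; a whiteout hides anything below it.
        if path in self.upper:
            if self.upper[path] is WHITEOUT:
                raise FileNotFoundError(path)
            return self.upper[path]
        for layer in self.lower_layers:
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # Copy-up: the first modification copies the file into the upper layer
        # and changes only that copy. The lower layers are never touched.
        self.upper[path] = self.read(path) + data

    def delete(self, path):
        self.read(path)  # raises if the file is not visible
        self.upper[path] = WHITEOUT


base = {"/etc/os-release": "debian\n", "/bin/sh": "<binary>"}
app = {"/app/config": "debug=false\n"}
container = ToyOverlay([app, base])

container.append("/app/config", "verbose=true\n")
container.delete("/bin/sh")
```

After these two calls the container sees the modified config and no /bin/sh, while the app and base dicts (the "image") are unchanged, which is why many containers can share the same image layers.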

Network isolation

  • How veth pairs are just two net_device structs pointing at each other in kernel memory
  • How the docker0 bridge switches packets using a MAC address hash table
  • How NAT with iptables MASQUERADE and conntrack lets containers reach the internet
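
The bridge and NAT bullets can be sketched the same way. This is a simplified model under my own assumptions: ToyBridge and ToyMasquerade are hypothetical names, the port names and addresses are placeholders, and the NAT always hands out a fresh host port, whereas the real MASQUERADE target tries to keep the original source port when it is free.

```python
class ToyBridge:
    """Learning switch: maps source MAC -> port and floods when the destination is unknown."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # forwarding database: MAC address -> port name

    def forward(self, in_port, src_mac, dst_mac):
        self.fdb[src_mac] = in_port  # learn which port the sender is behind
        out = self.fdb.get(dst_mac)
        if out is None:
            return sorted(self.ports - {in_port})  # unknown destination: flood
        return [] if out == in_port else [out]


class ToyMasquerade:
    """Source NAT backed by a connection-tracking table."""

    def __init__(self, host_ip, first_port=40000):
        self.host_ip = host_ip
        self.next_port = first_port
        self.by_inside = {}   # (container_ip, container_port) -> host_port
        self.by_outside = {}  # host_port -> (container_ip, container_port)

    def outbound(self, src_ip, src_port):
        # Rewrite the source address to the host's and remember the mapping.
        key = (src_ip, src_port)
        if key not in self.by_inside:
            self.by_inside[key] = self.next_port
            self.by_outside[self.next_port] = key
            self.next_port += 1
        return self.host_ip, self.by_inside[key]

    def inbound(self, dst_port):
        # Replies are matched against tracked connections; unknown ports give None.
        return self.by_outside.get(dst_port)


bridge = ToyBridge(["veth-a", "veth-b", "eth0"])
first = bridge.forward("veth-a", "02:42:ac:11:00:02", "02:42:ac:11:00:03")
reply = bridge.forward("veth-b", "02:42:ac:11:00:03", "02:42:ac:11:00:02")

nat = ToyMasquerade("203.0.113.10")
translated = nat.outbound("172.17.0.2", 51000)
restored = nat.inbound(translated[1])
```

The first frame is flooded because the bridge has not seen the destination MAC yet; the reply goes out a single port because the sender was learned on the way in. The NAT half shows why replies from the internet find their way back: the tracked mapping is looked up in reverse.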

Process isolation

  • How nsproxy inside task_struct holds pointers to the namespaces a process belongs to
  • How the same process has a different PID in each PID namespace that can see it, so getpid() returns the number from the caller's own namespace
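
Here is a rough model of the PID part. The kernel stores one number per namespace level for each process; the sketch below mimics that with a dict. PidNamespace and Task are illustrative classes of my own, not kernel structures, and returning 0 for "not visible from here" is my simplification.

```python
class PidNamespace:
    """Hands out PIDs starting at 1; child namespaces point at their parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.next_pid = 1

    def allocate(self):
        pid = self.next_pid
        self.next_pid += 1
        return pid


class Task:
    def __init__(self, ns):
        # A new task gets one number in its own namespace and one in every ancestor.
        self.ns = ns
        self.numbers = {}
        cur = ns
        while cur is not None:
            self.numbers[cur] = cur.allocate()
            cur = cur.parent

    def pid_in(self, ns):
        # 0 means the task is not visible from that namespace.
        return self.numbers.get(ns, 0)

    def getpid(self):
        # getpid() answers from the point of view of the task's own namespace.
        return self.pid_in(self.ns)


host = PidNamespace()
host_tasks = [Task(host) for _ in range(4)]  # pretend PIDs 1..4 are already taken

container_ns = PidNamespace(parent=host)
container_init = Task(container_ns)
later_host_task = Task(host)
```

The container's first process sees itself as PID 1, the host sees the same process under an ordinary host PID, and a host process created afterwards has no number at all inside the container's namespace.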

Resource control

  • How the OOM killer picks which process to kill using oom_badness scores
  • How CFS tracks vruntime per process to fairly share CPU
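
Both ideas fit in a few lines if you strip away the details. The sketch below is a simplification under my own assumptions: run_cfs uses a dict where the kernel uses a red-black tree, fixed time slices, and made-up task names; oom_badness follows the general shape of the kernel function (memory footprint in pages plus an oom_score_adj term) but skips the special cases.

```python
NICE_0_WEIGHT = 1024  # load weight the kernel gives a nice-0 task


def run_cfs(weights, slices, slice_ns=1_000_000):
    """Always run the task with the smallest vruntime; return how many slices each got."""
    vruntime = {name: 0 for name in weights}
    ran = {name: 0 for name in weights}
    for _ in range(slices):
        name = min(vruntime, key=lambda n: (vruntime[n], n))
        ran[name] += 1
        # Heavier tasks accumulate vruntime more slowly, so they get picked more often.
        vruntime[name] += slice_ns * NICE_0_WEIGHT // weights[name]
    return ran


def oom_badness(rss_pages, swap_pages, pgtable_pages, oom_score_adj, total_pages):
    """Simplified badness score; None means the process must never be killed."""
    if oom_score_adj == -1000:
        return None
    points = rss_pages + swap_pages + pgtable_pages
    points += oom_score_adj * (total_pages // 1000)
    return points


def pick_victim(procs, total_pages):
    # procs: {name: (rss_pages, swap_pages, pgtable_pages, oom_score_adj)}
    scores = {}
    for name, (rss, swap, pgt, adj) in procs.items():
        score = oom_badness(rss, swap, pgt, adj, total_pages)
        if score is not None:
            scores[name] = score
    return max(scores, key=scores.get)


shares = run_cfs({"a": 1024, "b": 2048}, slices=300)

victim = pick_victim(
    {
        "db": (300_000, 0, 0, 0),
        "cache": (200_000, 0, 0, 500),
        "sshd": (50_000, 0, 0, -1000),
    },
    total_pages=1_000_000,
)
```

With twice the weight, task b ends up with twice the CPU slices. On the OOM side, the cache process uses less memory than the database but is picked anyway because its positive oom_score_adj outweighs the difference, and the process at -1000 is never a candidate.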

The runtime chain

  • How docker run chains through dockerd → containerd → runc, and runc fires clone(), unshare(), pivot_root() then exits
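
At the bottom of that chain, asking for new namespaces comes down to OR-ing CLONE_NEW* flags together and passing them to clone() or unshare(). A small sketch, with caveats: the flag values are the ones from the Linux header <linux/sched.h>, but namespace_flags and try_unshare are helper names I made up, the set of namespaces in the usage line is an assumption rather than what runc is guaranteed to request, and os.unshare only exists in Python 3.12+ on Linux and normally needs root.

```python
import os

# Flag values from the Linux UAPI header <linux/sched.h>.
CLONE_FLAGS = {
    "mnt": 0x00020000,     # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts": 0x04000000,     # CLONE_NEWUTS
    "ipc": 0x08000000,     # CLONE_NEWIPC
    "user": 0x10000000,    # CLONE_NEWUSER
    "pid": 0x20000000,     # CLONE_NEWPID
    "net": 0x40000000,     # CLONE_NEWNET
}


def namespace_flags(names):
    """Combine namespace names into the bitmask that clone()/unshare() expect."""
    flags = 0
    for name in names:
        flags |= CLONE_FLAGS[name]
    return flags


def try_unshare(names):
    """Ask the kernel for new namespaces; return False if that is not possible here."""
    unshare = getattr(os, "unshare", None)  # missing on older Pythons and non-Linux
    if unshare is None:
        return False
    try:
        unshare(namespace_flags(names))
    except OSError:
        return False  # typically EPERM when not running as root
    return True


container_flags = namespace_flags(["mnt", "uts", "ipc", "pid", "net"])
```

try_unshare is deliberately not called here, since it would change the namespaces of whatever process runs it. Calling try_unshare(["uts"]) as root is a low-risk way to experiment: only the hostname view becomes private.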

Each topic has interactive diagrams where you can step through what's happening at the kernel level.

Check it out here: https://devopsagents.co/blog/containers-from-scratch#sharing-problem

Would love to hear feedback, especially if anything is inaccurate or could be explained better.
