I ran a simple command:
docker run -it ubuntu bash
But behind this… the Linux kernel created multiple isolation layers.
Containers are NOT magic.
They are just processes with boundaries enforced by the kernel.
Let’s break down what actually isolates your container.
⚠️ The Truth Most People Miss
Docker does NOT create isolation.
The Linux kernel does.
Docker → containerd → runc → kernel
At the lowest level, everything comes down to:
- Processes
- Namespaces
- Cgroups
🧠 Step 1: A Container is Just a Process
Run:
docker run -d ubuntu sleep 1000
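Now get its PID on the host (replace <container_id> with the ID printed by the previous command):
docker inspect --format '{{.State.Pid}}' <container_id>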
Example:
PID = 4321
👉 This is the actual process on the host
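One way to convince yourself that this PID is an ordinary host process is to walk up its parent chain in /proc. The sketch below assumes a Linux host with procfs mounted. The function name `ancestry` and the `PROC_ROOT` override are hypothetical names introduced here for illustration, not part of Docker.

```shell
#!/bin/sh
# Walk from a PID up through its ancestors, printing "PID COMMAND" per line.
# PROC_ROOT is a hypothetical override (default: /proc) so the walk can be
# tried against a fake directory tree as well as the real procfs.
ancestry() {
    pid=$1
    root=${PROC_ROOT:-/proc}
    while [ "$pid" -gt 0 ] && [ -r "$root/$pid/status" ]; do
        name=$(cat "$root/$pid/comm")
        echo "$pid $name"
        # The "PPid:" line of /proc/<pid>/status holds the parent PID.
        pid=$(awk '/^PPid:/ { print $2 }' "$root/$pid/status")
        pid=${pid:-0}
    done
}
```

Calling `ancestry 4321` with the PID from `docker inspect` lists the container's process first, then its parents. On a typical Docker host the direct parent is a containerd shim process rather than your shell.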
📁 Step 2: Where Isolation is Visible
Check:
ls -l /proc/4321/ns/
Output (abbreviated; the inode numbers are illustrative and will differ on your system):
pid -> pid:[4026531836]
net -> net:[4026532000]
mnt -> mnt:[4026531840]
uts -> uts:[4026531838]
ipc -> ipc:[4026531839]
user -> user:[4026531837]
cgroup -> cgroup:[4026531835]
🔥 Critical Insight
These are NOT regular files.
They are symbolic links that reference kernel namespace objects.
👉 /proc/<PID>/ns/ is just a window into kernel state
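Because each entry is a symbolic link, `readlink` is enough to pull out the namespace type and inode. This is a minimal sketch: `list_ns` is a hypothetical helper name, and reading another user's /proc/<PID>/ns/ directory usually requires root.

```shell
#!/bin/sh
# Print "TYPE INODE" for every namespace link in a directory such as
# /proc/4321/ns. Each link target has the form "net:[4026532000]".
list_ns() {
    dir=$1
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        inode=${target#*\[}     # drop everything up to and including "["
        inode=${inode%\]}       # drop the trailing "]"
        printf '%s %s\n' "${link##*/}" "$inode"
    done
}
```

For example, `sudo sh -c '. ./list_ns.sh; list_ns /proc/4321/ns'` (the script file name is an assumption) prints one line per namespace type.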
🧩 Step 3: What Happens During Container Creation
When you run:
docker run ubuntu
Internally:
dockerd → containerd → runc → clone()/unshare() → kernel
The kernel:
✔ Creates a process
✔ Attaches namespaces
✔ Applies cgroups
✔ Sets capabilities & security filters
🧱 Step 4: Namespace Isolation (Core Concept)
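By default, each container gets its own namespace for most of these types:

| Namespace | Purpose |
|-----------|---------|
| PID | Process ID isolation |
| NET | Network stack (interfaces, routes, ports) |
| MNT | Mount points (the container's view of the filesystem) |
| UTS | Hostname and domain name |
| IPC | System V IPC and POSIX message queues |
| USER | UID/GID mapping (shared with the host unless user namespace remapping or rootless mode is enabled) |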
🔬 Step 5: Proving Isolation
Run two containers:
docker run -d --name c1 ubuntu sleep 1000
docker run -d --name c2 ubuntu sleep 1000
Get PIDs:
docker inspect --format '{{.State.Pid}}' c1
docker inspect --format '{{.State.Pid}}' c2
Now compare:
ls -l /proc/<PID of c1>/ns/net
ls -l /proc/<PID of c2>/ns/net
Example:
net:[4026532000]
net:[4026532100]
💡 Golden Rule
Namespace identity = inode number
Same inode → shared namespace
Different inode → isolated namespace
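The rule is easy to script. The sketch below compares two namespace links by their `readlink` targets. `same_ns` is a hypothetical helper name, and it takes full link paths so that you choose which namespace type to compare.

```shell
#!/bin/sh
# Succeed (exit 0) when two namespace links point at the same namespace,
# fail with 1 when they differ, and with 2 when a link cannot be read.
same_ns() {
    a=$(readlink "$1") || return 2
    b=$(readlink "$2") || return 2
    [ "$a" = "$b" ]
}
```

Usage, assuming PID1 and PID2 hold the values from `docker inspect` and that you run it as root:

same_ns /proc/$PID1/ns/net /proc/$PID2/ns/net && echo shared || echo isolated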
⚠️ Step 6: Not Always New Namespaces
Example:
docker run --network=host ubuntu
👉 Result:
Container uses host network namespace
No isolation at network level
🔐 Step 7: Cgroups (Resource Isolation)
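Example (the stock ubuntu image does not ship stress, so a sleeping process is used here):
docker run -d --memory=200m --cpus=1 ubuntu sleep 1000
Check (cgroup v1):
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.limit_in_bytes
On cgroup v2 hosts the file is memory.max, typically under /sys/fs/cgroup/system.slice/docker-<container_id>.scope/ when the systemd cgroup driver is in use.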
👉 Controls:
CPU usage
Memory limits
OOM behavior
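It helps to know which numbers to expect in the cgroup files. The sketch below converts Docker's flag values into the raw units the kernel stores, assuming Docker's binary memory units and the default 100000 microsecond CFS period. All three function names are hypothetical helpers, not Docker commands.

```shell
#!/bin/sh
# Convert a Docker --memory value such as "200m" to bytes.
# Docker treats b/k/m/g as binary units (1m = 1024 * 1024 bytes).
mem_to_bytes() {
    value=$1
    num=${value%[bkmgBKMG]}     # strip one trailing unit letter, if any
    unit=${value#"$num"}        # whatever was stripped
    case $unit in
        ''|b|B) echo "$num" ;;
        k|K)    echo $((num * 1024)) ;;
        m|M)    echo $((num * 1024 * 1024)) ;;
        g|G)    echo $((num * 1024 * 1024 * 1024)) ;;
    esac
}

# Convert a --cpus value to a CFS quota in microseconds, assuming the
# default period of 100000 microseconds.
cpus_to_quota_us() {
    awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 100000 }'
}

# Print the memory-limit file name for a cgroup mount point.
# cgroup v2 exposes cgroup.controllers at its root; cgroup v1 does not.
mem_limit_file() {
    root=${1:-/sys/fs/cgroup}
    if [ -f "$root/cgroup.controllers" ]; then
        echo memory.max
    else
        echo memory.limit_in_bytes
    fi
}
```

With the flags from the example, `mem_to_bytes 200m` gives 209715200 and `cpus_to_quota_us 1` gives 100000, which are the values to look for in the container's memory and CPU cgroup files.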
🛡️ Step 8: Security Layers (Advanced)
Capabilities
docker run --cap-drop=ALL ubuntu
👉 Root inside the container, but stripped of every privileged capability
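You can verify this from the host: the CapEff line in /proc/<PID>/status is a hex bitmask of the process's effective capabilities, and it reads all zeros after --cap-drop=ALL. The sketch below tests a single bit in such a mask. `has_cap` is a hypothetical helper, and the mask 00000000a80425fb is what Docker's default capability set commonly decodes to, so treat it as an assumption about your setup.

```shell
#!/bin/sh
# Succeed when capability number $2 is set in the hex mask $1.
# Capability numbers come from <linux/capability.h>, for example
# CAP_CHOWN = 0, CAP_NET_RAW = 13, CAP_SYS_ADMIN = 21.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the effective capability mask from a status file
# such as /proc/4321/status.
cap_eff() {
    awk '/^CapEff:/ { print $2 }' "$1"
}
```

Usage:

has_cap "$(cap_eff /proc/4321/status)" 21 && echo "has CAP_SYS_ADMIN" || echo "no CAP_SYS_ADMIN"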
Seccomp
👉 Filters syscalls
Example: Docker's default profile blocks kexec_load (ptrace was also blocked on kernels older than 4.8)
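Whether a filter is actually attached shows up in the Seccomp line of /proc/<PID>/status, where 0 means disabled, 1 means strict mode, and 2 means filter mode. A minimal sketch for decoding it follows; `seccomp_mode` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the seccomp mode recorded in a status file such as /proc/4321/status.
seccomp_mode() {
    # Match only "Seccomp:", not the separate "Seccomp_filters:" line.
    mode=$(awk '/^Seccomp:/ { print $2 }' "$1")
    case $mode in
        0) echo disabled ;;
        1) echo strict ;;
        2) echo filter ;;
        *) echo unknown ;;
    esac
}
```

A container started with Docker's default settings would normally report filter here, while one started with --security-opt seccomp=unconfined would report disabled.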
AppArmor / SELinux
👉 Mandatory access control
💥 Reality Check (Most Important Section)
Containers are NOT fully isolated like VMs.
They share:
- The same host kernel
- The same kernel attack surface (system calls, drivers, kernel modules)
If the kernel is compromised → all containers are compromised.
🔬 Advanced Insight (Kernel-Level)
Namespaces are created using:
clone(CLONE_NEWNET | CLONE_NEWPID | CLONE_NEWNS | ...)
👉 Each flag creates a new isolation boundary
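You can try the same mechanism without Docker through the util-linux `unshare` command, whose --uts and --user options correspond to CLONE_NEWUTS and CLONE_NEWUSER. The sketch below assumes util-linux is installed and that unprivileged user namespaces are enabled. When either is missing it prints "unsupported" instead of failing. `new_uts_demo` is a hypothetical function name.

```shell
#!/bin/sh
# Compare this shell's UTS namespace with one created by unshare(1).
new_uts_demo() {
    host_ns=$(readlink /proc/self/ns/uts 2>/dev/null) \
        || { echo unsupported; return 0; }
    # --user --map-root-user lets an unprivileged user create the new
    # UTS namespace; readlink then runs inside it.
    child_ns=$(unshare --user --map-root-user --uts \
        readlink /proc/self/ns/uts 2>/dev/null) \
        || { echo unsupported; return 0; }
    echo "host:  $host_ns"
    echo "child: $child_ns"
}
```

When it works, the two lines show different uts inode numbers, which is the same evidence of a new boundary that the namespace links gave for Docker containers.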
🧠 Final Mental Model
Container = Process + Namespaces + Cgroups + Security Filters
NOT a virtual machine
NOT magic
🔥 Closing
Next time you run:
docker run nginx
Remember…
You didn’t start a container.
You asked the Linux kernel to create
a namespaced, resource-limited execution environment for a process.