Linux Internals Every DevOps Engineer Must Understand
(Let's go beyond “I know Linux”)
If you claim DevOps, Linux isn’t just an OS; it’s your runtime, debugger, firewall, scheduler, and autopsy report.
This article covers the Linux internals you are quietly being tested on.
1. Linux File System Internals: /proc & /sys
Linux exposes its own brain as files.
/proc – Process & Kernel Runtime View
- Virtual filesystem (procfs): contents are generated by the kernel on read, with no disk I/O
- Mounted at boot
- Reflects current kernel state
Key paths:
/proc/cpuinfo # CPU architecture, cores
/proc/meminfo # Memory stats
/proc/loadavg # Load average
/proc/<PID>/fd # Open file descriptors
/proc/<PID>/maps # Memory mapping
Production insight
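If a Java app is suspected of leaking file descriptors:
ls /proc/<PID>/fd | wc -l
Run it a few times. A count that keeps climbing toward the process's open-files limit (see /proc/<PID>/limits) points to a file descriptor leak.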
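The same check can be scripted so it can be sampled over time. This is a minimal sketch, not a monitoring tool: the function names and the `proc_root` parameter are assumptions made for illustration (the parameter only exists so the code can be pointed at a fake directory tree when testing).

```python
import os

def count_open_fds(pid, proc_root="/proc"):
    """Return the number of entries under <proc_root>/<pid>/fd."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    # Each entry is a symlink named after one open descriptor number.
    return len(os.listdir(fd_dir))

def soft_fd_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit from <pid>/limits, or None."""
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        for line in f:
            if line.startswith("Max open files"):
                soft = line.split()[3]  # columns: name (3 words), soft, hard, units
                return None if soft == "unlimited" else int(soft)
    return None
```

Calling `count_open_fds(pid)` every few seconds and comparing it with `soft_fd_limit(pid)` shows whether the count is trending toward the limit. Reading another user's process usually requires root.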
/sys – Hardware & Driver Control Layer
- Used by udev, drivers, containers
- Allows controlled kernel interaction
Example:
/sys/class/net/eth0/speed
/sys/block/sda/queue/scheduler
Great takeaway
- /proc = What is happening now
- /sys = How hardware & kernel are wired
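Reading /sys from code is just reading small text files. As a rough sketch, the helper below picks out the active I/O scheduler from the contents of a `queue/scheduler` file, where the kernel marks the active entry with square brackets. The function names are made up for this example.

```python
def read_sysfs(path):
    """Return the stripped contents of a sysfs attribute file."""
    with open(path) as f:
        return f.read().strip()

def active_scheduler(text):
    """Return the bracketed (active) entry from a queue/scheduler line."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed entry present
```

On a real host this would be used as `active_scheduler(read_sysfs("/sys/block/sda/queue/scheduler"))`, assuming a device named sda exists.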
2. Process Lifecycle: fork → exec → zombie
Understanding processes separates juniors from seniors.
The Lifecycle
- fork() → child process created
- exec() → program replaced
- wait() → parent collects exit status
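pid_t pid = fork();              /* returns 0 in the child, the child's PID in the parent */
if (pid == 0) {
    execl("/bin/java", "java", (char *)NULL);  /* child: replace the process image */
    _exit(127);                  /* reached only if exec failed */
}
waitpid(pid, NULL, 0);           /* parent: collect the exit status (reaps the child) */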
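The same three steps can be tried out from Python, whose `os` module exposes the underlying system calls directly. This is a small sketch under the assumption of a Unix host; `run_child` is a hypothetical helper name, not a standard function.

```python
import os
import sys

def run_child(argv):
    """fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed; exit without running parent cleanup
    # Parent: waitpid() collects the status, so no zombie is left behind.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal
```

For example, `run_child([sys.executable, "-c", "raise SystemExit(3)"])` returns the child's exit code. Skipping the `waitpid()` call is exactly what leaves a zombie behind.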
Zombies (Not Horror, Just Bad Parenting)
- Process finished execution
- Parent did not collect exit code
- PID still exists
Check:
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
Fix:
- Restart the parent (its zombies are then re-parented to PID 1 and reaped)
- Or fix the parent so it handles SIGCHLD and calls wait()
Interview gold line
“Zombies use almost no memory, but each one holds a PID, and enough of them can exhaust the PID space.”
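Zombie detection can also be done straight from /proc, since the third field of /proc/<PID>/stat is the process state. The sketch below parses that file's contents; the function names are illustrative. The one subtlety is that the command name sits in parentheses and may itself contain spaces or parentheses.

```python
def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<PID>/stat."""
    # comm may contain spaces or ')' itself, so split around the
    # first '(' and the last ')' instead of splitting on whitespace.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state

def is_zombie(stat_line):
    """True when the process state field is 'Z'."""
    return parse_stat(stat_line)[2] == "Z"
```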
3. Memory, CPU & Load Average (The Most Misunderstood Topic)
Load Average ≠ CPU Usage
uptime
# load average: 1.20, 0.90, 0.70
Means:
- Average number of tasks that are runnable or in uninterruptible sleep (usually waiting on disk I/O) over the last 1, 5, and 15 minutes
| Scenario | Meaning |
|---|---|
| Load = 4 on 4 cores | Fully utilized, little or no queueing |
| Load = 10 on 4 cores | Overloaded: tasks are queueing |
| High load, low CPU | Likely I/O bottleneck (tasks in uninterruptible sleep) |
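The table boils down to dividing load by core count. A minimal sketch of that arithmetic, parsing the text of /proc/loadavg (the function names are assumptions for this example):

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)

def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    return load / cores
```

On a live system, `cores` could come from `os.cpu_count()`. A per-core value well above 1.0 means tasks are queueing. A high value alongside idle CPUs suggests they are waiting on I/O rather than on CPU.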
Memory: Why “Free” Lies
free -m
Focus on:
- available, not free
- Linux aggressively uses spare memory for page cache, which it reclaims when applications need it
Myth to clear up:
“High memory usage is bad” ❌
“Unused memory is wasted memory” ✅
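The gap between "free" and "available" is easy to see by parsing /proc/meminfo, which is where `free` gets its numbers. A small sketch, with illustrative function names:

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info

def reclaimable_gap_kb(info):
    """Memory the kernel estimates it can hand back beyond what is strictly free."""
    return info["MemAvailable"] - info["MemFree"]
```

A large gap simply means the kernel is using otherwise idle memory as cache. That is the healthy case, not a leak.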
Real Debug Workflow
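vmstat 1
iostat -x 1
top # or htop
Together they let you correlate:
- CPU wait (the wa column in vmstat and top)
- Disk latency (the await columns in iostat -x)
- Run queue (the r column in vmstat)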
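The CPU wait figure these tools report is derived from counters in /proc/stat. As a rough sketch of the calculation, the code below takes two snapshots of that file's text and returns the share of CPU time spent in iowait between them. The function names are hypothetical, and real tools apply more careful accounting.

```python
def cpu_times(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat content as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")

def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Sum only user..steal, because guest time is already counted in user/nice.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0
```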
4. Networking Basics: Ports, Sockets & Reality
Port ≠ Process
A socket = IP + Port + Protocol
ss -tulnp
Example:
LISTEN 0 128 0.0.0.0:8080 java
Connection States You Must Know
| State | Meaning |
|---|---|
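| LISTEN | Waiting for incoming connections |
| ESTABLISHED | Active connection |
| TIME_WAIT | Normal close: the side that closed first waits before the socket is freed |
| CLOSE_WAIT | Peer closed; the local app has not called close() yet (a persistent pile-up is an app bug) |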
Warning sign
If you see many sockets stuck in CLOSE_WAIT → the application is leaking connections (it is not closing sockets the peer has already closed).
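Counting sockets per state is a quick way to spot that pattern. The sketch below tallies states from the text of /proc/net/tcp, where the fourth column is a hex state code. The function name is an assumption, and note that /proc/net/tcp covers IPv4 only (IPv6 sockets are in /proc/net/tcp6).

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp content."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts
```

A CLOSE_WAIT count that grows steadily between samples is the leak signature described above.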
5. Permissions & SELinux (Where “It Works on My VM” Dies)
Linux Permissions Refresher
-rwxr-x---
But permissions alone are not enough.
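For anyone rusty on reading a mode string like the one above, Python's standard `stat` module can decode it. This is a small illustrative sketch; `split_mode` is a made-up helper name.

```python
import stat

def split_mode(mode):
    """Split a st_mode value into (type, owner, group, others) strings."""
    s = stat.filemode(mode)  # e.g. 0o100750 -> "-rwxr-x---"
    return s[0], s[1:4], s[4:7], s[7:10]
```

For a regular file with mode 750, `split_mode(0o100750)` gives `("-", "rwx", "r-x", "---")`: the owner has full access, the group can read and execute, and others have nothing. For a real file, pass `os.stat(path).st_mode`.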
SELinux (Mandatory Access Control)
Modes:
getenforce
# Enforcing | Permissive | Disabled
Why prod apps fail:
- Correct permissions
- Wrong SELinux context
Fix properly:
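ausearch -m avc -ts recent # Find recent SELinux denials
semanage fcontext -a -t <TYPE> '<PATH>(/.*)?' # Record the correct context for the path
restorecon -Rv <PATH> # Apply the recorded context to existing files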
Senior rule
Never disable SELinux in production; fix the policy instead.
6. systemd: The Real Init System
systemd Is More Than “service start”
It handles:
- Process supervision
- Logging
- Dependency management
- Auto-restart
Example unit:
[Service]
ExecStart=/app/start.sh
Restart=always
MemoryMax=2G
Check failures:
journalctl -u myapp --since today
Why DevOps Engineers Love systemd
- Built-in watchdog
- CGroup resource limits
- Deterministic startup
How Interviewers Evaluate This Knowledge
They won’t ask:
“Explain /proc”
They’ll ask:
“Why is load high but CPU idle?”
Or:
“App restarted but port still busy”
If you understand internals, answers come naturally.
Linux is:
- Your observability platform
- Your runtime security layer
- Your truth source
Tools change.
Containers evolve.
Linux fundamentals compound forever.

















