
Shivakumar

Linux Memory Management

These are deep study notes on Linux memory management.

To understand this, we will use a simple "Library Analogy":

  • Hard Disk: The Library Bookshelf (Huge space, slow to find books).
  • RAM: The Study Desk (Small space, instant access to open books).
  • Process: The Student trying to read books.

1. Virtual Memory (The Great Illusion)

In Linux, applications are "lied" to.

  • The Concept: When you start a program (like Chrome), Linux tells it: "Here is a huge, contiguous, empty address space just for you!" (4 GB on a 32-bit system, far more on 64-bit.)
  • The Reality: The physical RAM is actually fragmented and shared by many other processes.
  • How it works:
      • The OS creates a Virtual Address Space (the fake memory) for each process.
      • It uses a hardware component called the MMU (Memory Management Unit) to translate these "fake" virtual addresses into "real" physical RAM addresses, using per-process page tables maintained by the kernel.

  • Why do we do this?
      • Security: One app cannot touch another app's memory because each process has its own page tables (a totally different map).
      • Efficiency: If two apps use the same library (like libc), Linux loads its read-only code into RAM once and maps it into both apps' virtual memory.
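
The translation step can be sketched with a toy page table. This is only an illustration under simplifying assumptions: real page tables are multi-level structures walked by hardware, not Python dictionaries. The names `translate`, `proc_a`, and `proc_b` and all frame numbers are made up for this example, and a 4 KB page size is assumed.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def translate(page_table: dict, vaddr: int) -> int:
    """Translate a virtual address using a toy single-level page table.

    page_table maps virtual page number -> physical frame number.
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # In a real system the MMU would raise a page fault here.
        raise LookupError(f"page fault at virtual address {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


# Two hypothetical processes, each with its own map.
proc_a = {0: 7, 1: 3}
proc_b = {0: 2, 1: 3}  # page 1 points at the same frame 3: a shared library page

print(hex(translate(proc_a, 0x0010)))  # 0x7010
print(hex(translate(proc_b, 0x0010)))  # 0x2010 (same virtual address, different RAM)
print(translate(proc_a, 0x1004) == translate(proc_b, 0x1004))  # True (shared frame)
```

The same virtual address lands in different physical frames for the two processes (isolation), while virtual page 1 resolves to one shared frame (sharing).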


2. Page Faults (The "Missing Book" Event)

Memory is divided into small chunks called Pages (usually 4 KB).
When a program accesses a memory address, the MMU checks the page table: "Is this page actually mapped to RAM right now?"

If the answer is NO, it triggers a Page Fault. This is a CPU exception (often loosely called an "interrupt") that pauses the program so the Kernel can fix it.

Types of Page Faults (Important for Interviews!)

| Type | Also Known As | Severity | What Happens? |
|---|---|---|---|
| Minor Fault | Soft Fault | Good / Fast | The data is already in RAM (maybe used by another app), but just needs to be "linked" to this process. Or, the app asked for new memory, and the OS is giving it a fresh zero-filled page. |
| Major Fault | Hard Fault | Bad / Slow | The data is NOT in RAM. The OS has to go to the Disk (Swap or File) to fetch it. This halts the program for milliseconds (which is an eternity for a CPU). |
| Invalid Fault | Segfault | Fatal | The program tried to access a memory address that doesn't exist or is forbidden. Result: Segmentation Fault (core dumped). |
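
To see minor faults in action, a process can read its own fault counters. The sketch below assumes a Unix-like system where Python's `resource` module is available. The helper names `fault_counts` and `touch_fresh_memory` are invented for this example. The counters are process-wide, so the interpreter's own activity is included and the numbers are approximate.

```python
import mmap
import resource


def fault_counts():
    """Return (minor, major) page-fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def touch_fresh_memory(n_pages: int) -> int:
    """Write one byte to each page of a new anonymous mapping.

    Returns how many minor faults the process accumulated while doing so.
    """
    page = mmap.PAGESIZE
    before_minor, _ = fault_counts()
    buf = mmap.mmap(-1, n_pages * page)  # anonymous memory, not backed by a file
    for i in range(n_pages):
        buf[i * page] = 1  # first write to a page: the kernel must supply a zeroed page
    after_minor, _ = fault_counts()
    buf.close()
    return after_minor - before_minor


minor, major = fault_counts()
print(f"so far: {minor} minor faults, {major} major faults")
print(f"touching 256 fresh pages added {touch_fresh_memory(256)} minor faults")
```

On a typical Linux system the second number is expected to be close to the number of pages touched, though some environments (containers, sandboxes) may report different values.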

3. Swap Memory (The Overflow Parking)

  • Definition: Swap is a space on your Hard Disk that is used as "Emergency RAM".
  • How it works:
      • When RAM is full, the Kernel looks for inactive pages (data that hasn't been used in a while).
      • It moves these inactive pages from RAM to the Disk (Swap Space). This is called Swapping Out.
      • If that program wakes up and needs that data, the Kernel moves it back from Disk to RAM. This is Swapping In.

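  • Performance Hit: RAM access takes on the order of 100 nanoseconds. A random read from a spinning disk takes on the order of 10 milliseconds, so swap on a hard disk is roughly 100,000x slower than RAM. SSDs narrow the gap but are still orders of magnitude slower.

  • Thrashing: This is a disaster scenario where the system spends nearly all of its time moving data between RAM and Disk (swap in/out) and almost none actually running the app. The server appears frozen.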
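
To check how much swap is in use, one option is to parse /proc/meminfo, which lists `SwapTotal` and `SwapFree` in kB. The sketch below is a minimal parser. The function names and the `SAMPLE` text are made up for illustration (the numbers are not from a real machine), and the live read only runs where /proc/meminfo exists.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse 'Key:   12345 kB' lines into {key: integer value}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(info: dict) -> int:
    """Swap currently in use, in kB."""
    return info["SwapTotal"] - info["SwapFree"]


# Hypothetical sample in the /proc/meminfo layout (illustrative numbers only).
SAMPLE = """MemTotal:       16000000 kB
MemFree:         1000000 kB
SwapTotal:       2000000 kB
SwapFree:        1500000 kB
"""

print(swap_used_kb(parse_meminfo(SAMPLE)))  # 500000

meminfo = Path("/proc/meminfo")
if meminfo.exists():  # only on Linux
    print("live swap used (kB):", swap_used_kb(parse_meminfo(meminfo.read_text())))
```

A swap-used figure that keeps climbing while the application slows down is a common hint that the system is heading toward thrashing.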

DevOps Config: swappiness

This is a kernel parameter that controls how aggressively Linux uses Swap.

  • Value: 0 to 100 (0 to 200 on kernel 5.8 and later).
  • vm.swappiness = 0: "Please do not swap application memory unless absolutely necessary (free memory is nearly exhausted)."
  • vm.swappiness = 100: "Aggressively move inactive data to swap to keep RAM free for file caching."
  • vm.swappiness = 60: (Default) A balanced approach.
  • Command: cat /proc/sys/vm/swappiness
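
The same value can be read from a script, for example in a health check. This is a small sketch: `read_swappiness` is a hypothetical helper name, and it returns None where the file is missing (for instance on non-Linux systems) instead of raising.

```python
from pathlib import Path


def read_swappiness(path: str = "/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


value = read_swappiness()
if value is None:
    print("vm.swappiness is not readable on this system")
else:
    print(f"vm.swappiness = {value}")
```

Changing the value requires root. It is typically done with sysctl (`sysctl -w vm.swappiness=10`) and made persistent through a file under /etc/sysctl.d/.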

4. OOM Killer (The Assassin)

OOM = Out Of Memory.

Imagine the Desk (RAM) is full and the Swap is full too. The Student wants to open one more book. The system cannot say "Yes", but it cannot say "No" either (because the kernel already promised the memory earlier, a policy called overcommit).

As a last resort, the Kernel calls the OOM Killer.

  • The Job: To save the system from crashing, the OOM Killer must kill a process to free up memory immediately.
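  • How does it choose the victim? It calculates an oom_score for every process (readable at /proc/<pid>/oom_score) based on:
      • Memory Usage: Who is eating the most RAM and swap? (Bigger targets are preferred.)
      • Process Age: Older kernels (before 2.6.36) preferred short-lived processes and slightly protected long-running services. Modern kernels ignore age. Instead, administrators can protect or expose a process by writing a value from -1000 to 1000 to /proc/<pid>/oom_score_adj.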

  • The Result: It usually kills the heaviest process (like your Database or Java App). You will see Killed in the terminal.
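
To see which processes the kernel currently considers the best victims, the scores can be read straight from /proc. The sketch below assumes a Linux-style /proc layout with `oom_score` and `comm` files per PID. `oom_candidates` is a made-up helper name, and it returns an empty list when the directory does not exist.

```python
from pathlib import Path


def oom_candidates(proc_root: str = "/proc", top: int = 5) -> list:
    """Return up to `top` (oom_score, pid, name) tuples, highest score first."""
    root = Path(proc_root)
    if not root.is_dir():
        return []
    rows = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            score = int((entry / "oom_score").read_text())
            name = (entry / "comm").read_text().strip()
        except (OSError, ValueError):
            continue  # the process exited or the file is unreadable
        rows.append((score, int(entry.name), name))
    return sorted(rows, reverse=True)[:top]


for score, pid, name in oom_candidates():
    print(f"{score:>6}  pid={pid:<8} {name}")
```

The process at the top of this list is the most likely to be killed first if memory runs out.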

How to debug OOM?

As a DevOps engineer, if a server crashes mysteriously, always check the logs:

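# Kernel ring buffer (may need sudo on some systems)
dmesg -T | grep -i "killed process"
# OR, from the system log (Debian/Ubuntu path; RHEL-family uses /var/log/messages)
grep -i "out of memory" /var/log/syslog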


Output Example (older kernels; the exact wording varies by kernel version): Out of memory: Kill process 1234 (java) score 850 or sacrifice child.
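
When scanning many log lines, it helps to pull out the PID and process name automatically. The sketch below uses a regular expression written to match the example line above, plus the "Killed process" wording. `parse_oom_line` is a hypothetical helper, and log formats differ between kernel versions, so treat the pattern as a starting point.

```python
import re

# Matches both "Kill process 1234 (java)" and "Killed process 1234 (java)".
OOM_PATTERN = re.compile(r"Kill(?:ed)? process (\d+) \(([^)]+)\)")


def parse_oom_line(line: str):
    """Return (pid, process_name) if the line reports an OOM kill, else None."""
    match = OOM_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


example = "Out of memory: Kill process 1234 (java) score 850 or sacrifice child."
print(parse_oom_line(example))  # (1234, 'java')
```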


Summary Table for Interview

| Concept | Explanation | Real-world Analogy |
|---|---|---|
| Virtual Memory | Maps fake addresses to real RAM. Isolation & Sharing. | Giving every student a fake "Desk #1" ticket, but guiding them to different real desks. |
| Page Fault | CPU exception raised when a page is not mapped in RAM. | Student reaches for a book but it's not on the desk. |
| Major Fault | Data must be read from Disk (Slow). | Student has to walk to the bookshelf to get the book. |
| Swap | Disk space used as fake RAM. | Placing books on the floor because the desk is full. |
| OOM Killer | Kernel mechanism that kills apps when RAM + Swap are full. | The Librarian throwing a student out because the room is too crowded. |
