Ripan Deuri

Posted on Mar 4

Understanding Linux Boot Memory Management

#architecture #computerscience #linux #tutorial

When the Linux kernel begins executing on ARM64 hardware, the CPU starts in a minimal environment. The Memory Management Unit (MMU) is disabled and the processor executes instructions using physical addresses directly.

Before Linux can use its normal virtual address space, the kernel must construct the page tables required for address translation. This work happens very early in the boot process, in arch/arm64/kernel/head.S.

During this phase the kernel performs three important tasks:

Construct minimal page tables
Create both identity and kernel virtual mappings
Enable the MMU and switch execution to the kernel's high virtual address space

This article explains how that process works, using concrete memory layouts and examples.

1. Boot Environment Assumptions

To make the discussion concrete, assume the following system configuration:

RAM start          : 0x80000000
Kernel load addr   : 0x80800000
Kernel size        : 30 MB
Kernel end         : 0x82600000

Other relevant parameters:

Page size          : 4 KB
L2 block size      : 2 MB
Virtual address size : 48 bits

The bootloader loads the kernel image into RAM at 0x80800000 and then jumps to the kernel entry point.
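As a quick sanity check on the numbers above, the sketch below recomputes the end address from the load address and size and confirms that the load address sits on a 2 MB boundary. The helper names are made up for illustration and are not kernel symbols.

```python
MB = 1 << 20

KERNEL_LOAD = 0x8080_0000   # load address assumed in this article
KERNEL_SIZE = 30 * MB       # image size assumed in this article
L2_BLOCK = 2 * MB

def kernel_end(load: int, size: int) -> int:
    """Return the first address past the image (exclusive end)."""
    return load + size

def is_aligned(addr: int, alignment: int) -> bool:
    """True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0

print(hex(kernel_end(KERNEL_LOAD, KERNEL_SIZE)))  # 0x82600000
print(is_aligned(KERNEL_LOAD, L2_BLOCK))          # True
```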

2. Physical Layout of the Kernel Image

After the bootloader loads the kernel, RAM contains the kernel image and its data sections.

Physical RAM
=======================================================

0x80000000  ───────────────────────────────────────────
             Start of RAM

0x80800000  ───────────────────────────────────────────
             Kernel _text

             Kernel code
             Kernel rodata
             Kernel data
             Kernel BSS

0x82500000  ───────────────────────────────────────────
             init_pg_dir region

0x82600000  ───────────────────────────────────────────
             End of kernel image

=======================================================

The kernel image contains multiple sections, including code, read-only data, writable data, and the BSS section. The BSS section stores zero-initialized global variables. The early page tables are allocated within the BSS region.

Why Early Page Tables Are Placed in BSS:

Early boot code cannot use dynamic memory allocation because the memory subsystem is not yet initialized. As a result, the kernel must reserve memory for early structures at build time.

The ARM64 kernel reserves a region for the root page tables, named init_pg_dir, and the linker script places it in the BSS section.

Because the memory already exists inside the kernel image, early boot code can simply reference it directly.

During boot the physical address of init_pg_dir becomes the base location where the kernel builds its early page tables.

3. Page Table Structure

With 4 KB pages and a 48-bit virtual address space, ARM64 uses four levels of page tables.

Virtual Address Bits [47:0]:
+-----+-----+-----+-----+------------+
| L0  | L1  | L2  | L3  |  Offset    |
|47:39|38:30|29:21|20:12|   11:0     |
+-----+-----+-----+-----+------------+
  9b    9b    9b    9b      12b
 (512) (512) (512) (512)   (4KB)

VA[47:39] → L0 index
VA[38:30] → L1 index
VA[29:21] → L2 index
VA[20:12] → L3 index

Each level contains 2^9 = 512 entries.
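The bit-field split above can be sketched in a few lines. This is an illustrative helper, not kernel code; the function name va_indices is an assumption, and it only covers the 4 KB page, 48-bit configuration used in this article.

```python
PAGE_SHIFT = 12       # 4 KB pages
BITS_PER_LEVEL = 9    # 512 entries per table

def va_indices(va: int) -> dict:
    """Split a virtual address into L0..L3 table indices and page offset."""
    va48 = va & ((1 << 48) - 1)          # keep VA[47:0]
    result = {}
    for level in range(4):
        # L0 uses bits 47:39, L1 38:30, L2 29:21, L3 20:12
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (3 - level)
        result[f"L{level}"] = (va48 >> shift) & 0x1FF
    result["offset"] = va48 & 0xFFF
    return result

print(va_indices(0x80800000))
# {'L0': 0, 'L1': 2, 'L2': 4, 'L3': 0, 'offset': 0}
```

For the kernel load address this gives L0 index 0, L1 index 2, and L2 index 4.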

Block mappings can be created at intermediate levels:

L1 block size = 1 GB
L2 block size = 2 MB

Early boot typically uses L2 block mappings because each descriptor covers 2 MB, so the kernel image can be mapped with a handful of entries and no L3 tables are needed.
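To make the trade-off concrete, the following sketch computes how much memory one entry covers at each level and how many entries a 30 MB image needs at L2 versus L3. The function names are illustrative assumptions.

```python
PAGE_SHIFT = 12
BITS_PER_LEVEL = 9
MB = 1 << 20

def entry_span(level: int) -> int:
    """Bytes of address space covered by one entry at the given level (0..3)."""
    return 1 << (PAGE_SHIFT + BITS_PER_LEVEL * (3 - level))

def entries_needed(size: int, level: int) -> int:
    """Entries required to cover `size` bytes, rounded up."""
    span = entry_span(level)
    return -(-size // span)   # ceiling division

print(entry_span(1) // (1 << 30))     # 1  (GB per L1 entry)
print(entry_span(2) // MB)            # 2  (MB per L2 entry)
print(entries_needed(30 * MB, 2))     # 15 L2 block entries
print(entries_needed(30 * MB, 3))     # 7680 L3 page entries
```

Mapping the same image with 4 KB pages would take 7680 L3 entries, which is 15 additional L3 tables, compared with 15 entries in a single L2 table.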

4. Early Page Table Memory Layout

The early page tables are placed sequentially in the BSS region.

Physical RAM
=============================================================

0x82500000  ── L0 table (TTBR0 - identity root)

0x82501000  ── L1 table (identity)

0x82502000  ── L2 table (identity)


0x82503000  ── L0 table (TTBR1 - kernel high VA root)

0x82504000  ── L1 table (kernel VA)

0x82505000  ── L2 table (kernel VA)

=============================================================

Each table contains 512 entries of 8 bytes each, so each table occupies exactly one page (4 KB). Each 8-byte entry holds either the physical address of a block or the physical address of the next-level table.

Total memory required:

6 tables × 4 KB = 24 KB
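The layout above follows directly from the base address and the table size. A small sketch, assuming the init_pg_dir base of 0x82500000 from this article and table labels invented for readability:

```python
INIT_PG_DIR = 0x8250_0000          # base address assumed in this article
ENTRIES_PER_TABLE = 512
ENTRY_SIZE = 8                     # bytes per descriptor
TABLE_SIZE = ENTRIES_PER_TABLE * ENTRY_SIZE

# Labels are illustrative; they follow the order shown in the diagram.
TABLES = ["idmap L0", "idmap L1", "idmap L2",
          "kernel L0", "kernel L1", "kernel L2"]

def table_layout(base: int) -> dict:
    """Assign each table a consecutive 4 KB page starting at base."""
    return {name: base + i * TABLE_SIZE for i, name in enumerate(TABLES)}

layout = table_layout(INIT_PG_DIR)
for name, addr in layout.items():
    print(f"{addr:#x}  {name}")

total = len(TABLES) * TABLE_SIZE
print(total // 1024, "KB")         # 24 KB
```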

5. Page Table Creation Steps (Simplified)

The early boot code in head.S constructs the minimal set of page tables required before the MMU is enabled.

5.1 Clearing Page Table Memory

The first step clears the memory used by init_pg_dir.

This ensures all entries start as invalid descriptors.

This is implemented using a loop that stores zero values across the reserved region.
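The effect of the clearing loop can be modeled on a byte buffer. This is only a simulation: a bytearray stands in for the reserved region, and the check relies on the fact that an ARM64 descriptor with bit 0 clear is invalid.

```python
import struct

REGION_SIZE = 6 * 4096   # six 4 KB tables, as in this article

def clear_tables(mem: bytearray) -> None:
    """Store a 64-bit zero into every 8-byte descriptor slot."""
    for off in range(0, len(mem), 8):
        struct.pack_into("<Q", mem, off, 0)

def valid_entries(mem: bytearray) -> int:
    """Count descriptors whose valid bit (bit 0) is set."""
    return sum(
        1 for off in range(0, len(mem), 8)
        if struct.unpack_from("<Q", mem, off)[0] & 1
    )

# Start from garbage to mimic memory of unknown content.
mem = bytearray(b"\xff" * REGION_SIZE)
before = valid_entries(mem)
clear_tables(mem)
after = valid_entries(mem)
print(before, after)   # 3072 0
```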

5.2 Creating the Identity Mapping

The kernel builds an identity mapping for the region containing the kernel image.

VA 0x80800000 → PA 0x80800000

Page table hierarchy:

L0 entry → L1 table
L1 entry → L2 table
L2 entries → 2 MB blocks

Since the kernel is 30 MB and starts on a 2 MB boundary, the L2 table needs exactly 15 block entries.

15 blocks × 2 MB = 30 MB

This mapping allows the CPU to continue executing the kernel immediately after the MMU is enabled.
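A minimal sketch of how the L2 entries for the identity map could be filled in. The descriptor encoding is deliberately reduced to the block type bits and the Access Flag; real descriptors also carry memory attributes, shareability, and permissions. The function name is hypothetical, and the loop assumes the region does not cross a 1 GB boundary, so a single L2 table suffices.

```python
BLOCK_SHIFT = 21
BLOCK_SIZE = 1 << BLOCK_SHIFT      # 2 MB

DESC_BLOCK = 0b01                  # bits[1:0] = 01: valid block descriptor
DESC_AF = 1 << 10                  # Access Flag

def build_idmap_l2(start: int, end: int) -> list:
    """Return a 512-entry L2 table identity-mapping [start, end) with 2 MB blocks."""
    l2 = [0] * 512                         # all entries invalid initially
    pa = start & ~(BLOCK_SIZE - 1)         # round down to a block boundary
    while pa < end:
        index = (pa >> BLOCK_SHIFT) & 0x1FF
        l2[index] = pa | DESC_AF | DESC_BLOCK
        pa += BLOCK_SIZE
    return l2

l2 = build_idmap_l2(0x8080_0000, 0x8260_0000)
used = [i for i, d in enumerate(l2) if d & 1]
print(used[0], used[-1], len(used))    # 4 18 15
print(hex(l2[4]))                      # 0x80800401
```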

5.3 Creating the Kernel Virtual Mapping

Linux does not run the kernel at low addresses. Instead, the kernel executes in the upper portion of the virtual address space.

Assuming a 48-bit address space, kernel virtual addresses start at 0xFFFF000000000000 (= PAGE_OFFSET).

Example kernel VA for PA 0x80800000: 0xFFFF000080800000.

The page tables create a mapping: VA 0xFFFF000080800000 → PA 0x80800000

This allows the same physical memory to appear at a high virtual address.
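The sketch below follows this article's simplified model, in which the high virtual address is PAGE_OFFSET plus the physical address, and computes the table indices for the resulting VA. Both function names are illustrative assumptions.

```python
PAGE_OFFSET = 0xFFFF_0000_0000_0000   # value used in this article for 48-bit VAs

def phys_to_kernel_va(pa: int) -> int:
    """Simplified model from this article: high VA = PAGE_OFFSET + PA."""
    return PAGE_OFFSET + pa

def table_index(va: int, level: int) -> int:
    """9-bit table index for the given level (0..3) with 4 KB pages."""
    shift = 12 + 9 * (3 - level)
    return (va >> shift) & 0x1FF

va = phys_to_kernel_va(0x8080_0000)
print(hex(va))                                      # 0xffff000080800000
print([table_index(va, lvl) for lvl in range(3)])   # [0, 2, 4]
```

Only VA[47:0] selects table entries, so under this model the high mapping uses the same L0, L1, and L2 indices as the identity mapping. What differs is the root table, which is chosen by the upper 16 bits of the address.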

So the layout becomes:

Physical RAM
===========================================================

0x80000000  ───────────────────────────────────────────────
             Start of RAM

0x80800000  ───────────────────────────────────────────────
             Kernel _text (bootloader loaded image)

             Kernel code
             Kernel rodata
             Kernel data
             Kernel BSS

0x82500000  ───────────────────────────────────────────────
             init_pg_dir region (inside BSS)

             Early Page Tables
             ──────────────────────────────────────────────

0x82500000  ── L0 table (TTBR0)  → Identity map root
                entry[0] → 0x82501000

0x82501000  ── L1 table (identity)
                entry[2] → 0x82502000

0x82502000  ── L2 table (identity)
                entry[4..18] → 2MB block

                Example:
                L2[4]  → PA 0x80800000
                L2[5]  → PA 0x80A00000
                ...
                L2[18] → PA 0x82400000

0x82503000  ── L0 table (TTBR1) → Kernel virtual root
                entry[0] → 0x82504000

0x82504000  ── L1 table (kernel VA)
                entry[...] → 0x82505000

0x82505000  ── L2 table (kernel VA)
             ──────────────────────────────────────────────
0x825FFFFF  ───────────────────────────────────────────────
             End of kernel image

0x82600000  ───────────────────────────────────────────────
             First free RAM after kernel
===========================================================

6. Why Dual Mapping Is Required

At the moment the MMU is enabled, the CPU is already executing instructions from the kernel.

For example:

PC = 0x80800100

Before enabling the MMU, this address is interpreted as a physical address.

After enabling the MMU, the CPU interprets the program counter as a virtual address.

If the page tables contain an identity mapping:

VA 0x80800100 → PA 0x80800100

the instruction fetch continues successfully.

Afterward, the kernel performs a branch to its intended virtual address:

0xFFFF000080800000

From that point onward, the kernel runs entirely in high virtual memory.

If the identity mapping did not exist, enabling the MMU would immediately cause a translation fault.

Example:

PC = 0x80800100

After enabling the MMU the CPU attempts to translate:

VA 0x80800100

If no mapping exists for that address, the CPU raises an instruction abort. This is why the identity mapping is required during boot.
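The difference between the two cases can be shown with a toy page walk. Nested dictionaries stand in for the L0, L1, and L2 tables, and a Python exception stands in for the abort; none of this is how hardware or the kernel represents tables, it only mirrors the lookup order.

```python
BLOCK_SIZE = 2 << 20   # 2 MB L2 block

class TranslationFault(Exception):
    """Stand-in for the abort raised when no valid descriptor is found."""

def walk(root: dict, va: int) -> int:
    """Translate va through a toy L0 -> L1 -> L2 hierarchy of dicts."""
    l0 = (va >> 39) & 0x1FF
    l1 = (va >> 30) & 0x1FF
    l2 = (va >> 21) & 0x1FF
    try:
        block_pa = root[l0][l1][l2]
    except KeyError:
        raise TranslationFault(hex(va)) from None
    return block_pa | (va & (BLOCK_SIZE - 1))

# Identity map for the kernel image: L0[0] -> L1[2] -> L2[4..18]
idmap = {0: {2: {i: 0x8000_0000 + i * BLOCK_SIZE for i in range(4, 19)}}}

pc = 0x8080_0100
print(hex(walk(idmap, pc)))        # 0x80800100

try:
    walk({}, pc)                   # empty tables: no identity mapping
except TranslationFault as fault:
    print("fault at", fault)       # fault at 0x80800100
```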

Once the page tables are created, the kernel configures the translation system registers.

The kernel installs the page table base addresses in the following registers:

TTBR0_EL1  → identity mapping tables
TTBR1_EL1  → kernel virtual mapping tables
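Which register the hardware uses depends on the upper bits of the virtual address. The sketch below assumes 48-bit virtual addresses in both halves and ignores address tagging; the function name is an illustrative assumption.

```python
VA_BITS = 48

def select_ttbr(va: int) -> str:
    """Pick the translation table base register from VA[63:48]."""
    top = (va >> VA_BITS) & 0xFFFF
    if top == 0x0000:
        return "TTBR0_EL1"     # low half: identity mapping during boot
    if top == 0xFFFF:
        return "TTBR1_EL1"     # high half: kernel virtual mapping
    raise ValueError(f"{va:#x} is outside both translation ranges")

print(select_ttbr(0x0000_0000_8080_0100))   # TTBR0_EL1
print(select_ttbr(0xFFFF_0000_8080_0000))   # TTBR1_EL1
```

This is why the two mappings can coexist: the identity-mapped program counter and the high kernel address are resolved through different root tables.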

Finally, the MMU is enabled. At this moment the CPU switches from physical addressing to virtual addressing.

After enabling the MMU, the kernel performs a branch to its virtual address:

0xFFFF000080800000

The page tables translate this address to the physical location of the kernel in RAM.

VA 0xFFFF000080800000 → PA 0x80800000

From this point forward, the kernel executes entirely in its high virtual address space.

Conclusion

Early in the boot process, the Linux kernel must construct its own memory translation environment before the MMU can be enabled. The code in head.S performs this task by building minimal page tables inside statically allocated memory.

Two mappings are created during this phase. An identity mapping ensures that execution continues safely when the MMU is first enabled, while a kernel virtual mapping allows the kernel to run in its intended high address space.

After the MMU is enabled and execution switches to the high virtual address, the kernel continues building the full virtual memory system used during normal operation. These early page tables therefore serve as the foundation for the entire memory management subsystem.

DEV Community