Ripan Deuri

Linux Kernel: Interrupt Handling (Part 2)

Introduction

This article traces the path of an IRQ on modern Linux for ARMv8-A, following a single interrupt through hardware exception entry, the kernel's assembly entry code, the IRQ and softirq subsystems, and finally back to user or kernel context.

Exception Levels and Execution Context

Exception Levels: Privilege Hierarchy

ARMv8-A organizes execution into four Exception Levels (EL0-EL3), with higher numbers indicating greater privilege.

EL0 (Application Execution Level): User space applications execute at EL0 with no access to privileged operations. Applications cannot configure MMU settings, interrupt controllers, or system timers. Every attempt to execute privileged instructions or access restricted registers triggers a synchronous exception to EL1.

EL1 (Operating System Kernel Level): The Linux kernel executes at EL1, with full control over system resources. The kernel can configure exception vectors, handle interrupts, and manage hardware resources.

EL2 and EL3 (optional): EL2 provides hypervisor support for virtualization, while EL3 implements the Secure Monitor for TrustZone.

The Dual Stack Mechanism at EL1

ARMv8-A defines two stack pointers that are available when executing at EL1: SP_EL0 and SP_EL1. The architecture itself does not assign them specific roles—Linux chooses how to use them.

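SP_EL0
While a task runs in user space, SP_EL0 is its user stack pointer. Once execution enters the kernel, current arm64 Linux saves the user value in the task's saved register frame and reuses SP_EL0 to hold a pointer to the current task_struct (this is what get_current() reads). Kernel code does not run with SPSel = 0; the vector entries for that mode (EL1t) are treated as unexpected.

SP_EL1
Linux runs all kernel code in EL1h mode (SPSel = 1), so SP_EL1 is the kernel's working stack pointer. It normally points into the current task's kernel stack, and the entry code repoints it at a per-CPU IRQ stack while hardware interrupts are handled. During exception entry, the hardware forces SPSel = 1, so the entry code always starts on whatever SP_EL1 currently holds.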

Stack selection is controlled by the SPSel bit in PSTATE:
SPSel = 0 → use SP_EL0
SPSel = 1 → use SP_EL1

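Why two stacks?
A separate stack pointer for the exception level means exception entry never depends on the stack pointer of less privileged code. An exception from EL0 lands on SP_EL1, which user space cannot modify, so the kernel always has trusted space for saving registers before continuing into its exception-handling code. Because Linux kernel code already runs on SP_EL1, an exception taken from the kernel continues on the same stack; protection against a corrupted or overflowing kernel stack comes from guard pages and a per-CPU overflow stack (CONFIG_VMAP_STACK) rather than from the SP_EL0/SP_EL1 split.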

Typical Execution Contexts

User space process (EL0, SP_EL0): Application code runs at EL0 using SP_EL0 (pointing to user-space stack).

Kernel process context (EL1, SP_EL1): After a system call or when handling process-specific kernel work, the kernel executes at EL1 on SP_EL1, which points into the process's kernel stack. SP_EL0 holds the current task pointer during this time.

Exception entry (EL1, SP_EL1): When an exception is taken to EL1, hardware automatically selects SP_EL1 regardless of the previous stack pointer selection.

Interrupt handler (EL1, per-CPU IRQ stack): After initial exception entry, the handler switches SP_EL1 to a dedicated per-CPU interrupt stack for processing.

Critical System Registers for Exception Handling

Exception Link Register (ELR_EL1)

The Exception Link Register holds the return address when an exception is taken to EL1.

When an exception occurs, hardware automatically saves the Program Counter (PC) to ELR_EL1.

The ERET (Exception Return) instruction uses ELR_EL1 to restore PC: PC ← ELR_EL1.

Saved Program Status Register (SPSR_EL1)

SPSR_EL1 preserves the complete processor state (PSTATE) that existed before the exception. This 64-bit register contains:

  • Condition flags (N, Z, C, V): Arithmetic operation results
  • DAIF bits: Debug, SError, IRQ, and FIQ mask bits
  • Current Exception Level: The EL from which the exception was taken
  • Execution state: AArch64 vs AArch32 mode
  • Stack pointer selection: SPSel bit value

When an exception occurs, hardware automatically saves SPSR_EL1 ← PSTATE.

ERET performs PSTATE ← SPSR_EL1, recovering all processor state including interrupt masks.
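
As a rough illustration of the fields listed above, the following sketch decodes a saved SPSR_EL1 value in plain C. The macro and function names are hypothetical (they are not the kernel's own helpers), and the layout assumes the exception was taken from AArch64 state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in SPSR_EL1 for an exception taken from AArch64 state. */
#define SPSR_N       (1ULL << 31)  /* negative flag */
#define SPSR_Z       (1ULL << 30)  /* zero flag */
#define SPSR_C       (1ULL << 29)  /* carry flag */
#define SPSR_V       (1ULL << 28)  /* overflow flag */
#define SPSR_D       (1ULL << 9)   /* debug exceptions masked */
#define SPSR_A       (1ULL << 8)   /* SError masked */
#define SPSR_I       (1ULL << 7)   /* IRQ masked */
#define SPSR_F       (1ULL << 6)   /* FIQ masked */
#define SPSR_AARCH32 (1ULL << 4)   /* M[4]: set if taken from AArch32 */

/* True if IRQs were masked in the interrupted context. */
static inline bool spsr_irq_masked(uint64_t spsr)
{
    return (spsr & SPSR_I) != 0;
}

/* Exception level the exception was taken from: M[3:2]. */
static inline unsigned int spsr_source_el(uint64_t spsr)
{
    assert(!(spsr & SPSR_AARCH32)); /* M[3:0] layout is AArch64-only */
    return (unsigned int)((spsr >> 2) & 0x3);
}

/* Stack pointer selection of the interrupted context: M[0]. */
static inline unsigned int spsr_spsel(uint64_t spsr)
{
    assert(!(spsr & SPSR_AARCH32));
    return (unsigned int)(spsr & 0x1);
}
```

For example, the value 0x3c5 decodes as EL1 with SPSel = 1 and all four DAIF bits set, while 0x0 decodes as EL0 with interrupts unmasked.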

Exception Syndrome Register (ESR_EL1)

ESR_EL1 provides detailed information about what caused the exception:

  • EC (Exception Class) [bits 31:26]: Categorizes the exception type (0x15 for SVC from AArch64, 0x24 for a data abort from a lower EL, 0x25 for a data abort from the current EL, etc.)
  • IL (Instruction Length) [bit 25]: 16-bit or 32-bit instruction
  • ISS (Instruction Specific Syndrome) [bits 24:0]: Exception-specific details

For interrupts, ESR_EL1 is not typically used, but for synchronous exceptions like page faults, the kernel reads ESR_EL1 to understand fault details (read vs write, translation fault vs permission fault, etc.).
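
The field layout above can be sketched as a handful of C helpers. The names are illustrative rather than the kernel's own macros; the data-abort helpers assume the standard ISS encoding in which bit 6 is WnR (write, not read) and bits 5:0 are the fault status code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ESR_EC_SVC64     0x15u  /* SVC executed in AArch64 state */
#define ESR_EC_DABT_LOW  0x24u  /* data abort from a lower EL */
#define ESR_EC_DABT_CUR  0x25u  /* data abort from the current EL */

/* Exception class, bits 31:26. */
static inline uint32_t esr_ec(uint64_t esr)
{
    return (uint32_t)((esr >> 26) & 0x3f);
}

/* IL, bit 25: true for a 32-bit instruction. */
static inline bool esr_il_32bit(uint64_t esr)
{
    return ((esr >> 25) & 0x1) != 0;
}

/* Instruction specific syndrome, bits 24:0. */
static inline uint32_t esr_iss(uint64_t esr)
{
    return (uint32_t)(esr & 0x1ffffff);
}

static inline bool esr_is_data_abort(uint64_t esr)
{
    uint32_t ec = esr_ec(esr);
    return ec == ESR_EC_DABT_LOW || ec == ESR_EC_DABT_CUR;
}

/* For data aborts only: ISS bit 6 (WnR) is set when a write faulted. */
static inline bool esr_dabt_is_write(uint64_t esr)
{
    assert(esr_is_data_abort(esr));
    return ((esr_iss(esr) >> 6) & 0x1) != 0;
}

/* For data aborts only: ISS bits 5:0 hold the fault status code. */
static inline uint32_t esr_dabt_fault_status(uint64_t esr)
{
    assert(esr_is_data_abort(esr));
    return esr_iss(esr) & 0x3f;
}
```

With these helpers, 0x56000000 decodes as an SVC from AArch64 with an ISS of zero, and 0x96000045 decodes as a data abort from the current EL caused by a write, with fault status 0x05.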

Vector Base Address Register (VBAR_EL1)

VBAR_EL1 holds the base address of the exception vector table for EL1. This 64-bit register must be 2KB-aligned and points to a table containing 16 exception entries, each 128 bytes (0x80) apart.

During boot, the kernel sets up its exception vector table and writes its address to VBAR_EL1.

Each exception triggers hardware to calculate: PC ← VBAR_EL1 + offset, where offset depends on exception origin and type.
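
The table geometry described in this section (2KB alignment, 16 entries, 128 bytes each) can be checked with a small sketch. The constant and function names are made up for illustration, and the base address used in the note below is an arbitrary example value, not a real kernel address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VECTOR_TABLE_ALIGN  0x800u  /* 2KB alignment required for VBAR_EL1 */
#define VECTOR_ENTRY_SIZE   0x80u   /* 128 bytes per entry */
#define VECTOR_ENTRY_COUNT  16u     /* 4 origins x 4 exception types */

/* True if the candidate base address satisfies the 2KB alignment rule. */
static inline bool vbar_is_aligned(uint64_t vbar)
{
    return (vbar & (VECTOR_TABLE_ALIGN - 1)) == 0;
}

/* Address the hardware branches to for table entry 0..15. */
static inline uint64_t vector_entry_address(uint64_t vbar, unsigned int index)
{
    assert(vbar_is_aligned(vbar));
    assert(index < VECTOR_ENTRY_COUNT);
    return vbar + (uint64_t)index * VECTOR_ENTRY_SIZE;
}
```

Sixteen entries of 128 bytes fill exactly 2KB, which is why the alignment requirement equals the table size. With a base of 0x80000, entry 9 sits at 0x80480.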

Current Exception Level Register (CurrentEL)

CurrentEL is a read-only register indicating the current exception level in bits [3:2]. The kernel uses this primarily during early boot and debugging.

GIC Architecture and Interrupt Delivery

See Linux Kernel: Interrupt (Part 1) for details on the GIC.

GIC Components

The ARM Generic Interrupt Controller (GIC) consists of two main components:

GIC Distributor (GICD): Manages all interrupt sources, handles priority arbitration, and routes interrupts to CPU interfaces. Configuration registers include:

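  • GICD_ISENABLER[n]: Set-enable registers, one bit per interrupt (32 interrupts per register); writing 1 to a bit enables that interrupt
  • GICD_ICENABLER[n]: Clear-enable registers with the same layout; writing 1 to a bit disables that interrupt
  • GICD_IPRIORITYR[n]: Priority registers, one 8-bit field per interrupt
  • GICD_ITARGETSR[n]: Target CPU mask, one 8-bit field per interrupt (GICv2); GICv3 with affinity routing uses GICD_IROUTER[n] instead
  • GICD_ICFGR[n]: Configuration registers, one 2-bit field per interrupt selecting edge or level triggering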
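
Because these register banks pack several interrupts into each 32-bit register, software has to turn an interrupt ID into a register offset and a bit position. The sketch below shows that arithmetic using the architectural bank offsets from the GIC distributor map; the struct and function names are hypothetical. On GICv3 the settings for interrupt IDs below 32 live in the redistributor rather than the distributor, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets of the register banks from the distributor base. */
#define GICD_ISENABLER   0x0100u
#define GICD_ICENABLER   0x0180u
#define GICD_IPRIORITYR  0x0400u
#define GICD_ICFGR       0x0c00u

struct gicd_field {
    uint32_t offset; /* byte offset of the 32-bit register */
    uint32_t shift;  /* bit position of the field in that register */
    uint32_t width;  /* field width in bits */
};

/* Locate the field for intid in a bank with bits_per_irq bits per interrupt. */
static inline struct gicd_field gicd_locate(uint32_t bank, uint32_t bits_per_irq,
                                            uint32_t intid)
{
    struct gicd_field f;
    uint32_t per_reg;

    assert(bits_per_irq == 1 || bits_per_irq == 2 || bits_per_irq == 8);
    assert(intid < 1020u); /* IDs 1020-1023 are reserved special values */

    per_reg = 32u / bits_per_irq;
    f.offset = bank + (intid / per_reg) * 4u;
    f.shift = (intid % per_reg) * bits_per_irq;
    f.width = bits_per_irq;
    return f;
}

static inline struct gicd_field gicd_enable_field(uint32_t intid)
{
    return gicd_locate(GICD_ISENABLER, 1, intid);
}

static inline struct gicd_field gicd_disable_field(uint32_t intid)
{
    return gicd_locate(GICD_ICENABLER, 1, intid);
}

static inline struct gicd_field gicd_priority_field(uint32_t intid)
{
    return gicd_locate(GICD_IPRIORITYR, 8, intid);
}

static inline struct gicd_field gicd_config_field(uint32_t intid)
{
    return gicd_locate(GICD_ICFGR, 2, intid);
}
```

For interrupt ID 42, the enable bit is bit 10 of the register at offset 0x104, the priority byte starts at bit 16 of the register at 0x428, and the configuration field starts at bit 20 of the register at 0xc08.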

GIC CPU Interface: Per-CPU component that signals interrupts to each CPU core and handles interrupt acknowledgment. Key registers (accessed via system registers in GICv3):

  • ICC_IAR1_EL1: Interrupt Acknowledge Register (read to get pending interrupt ID)
  • ICC_EOIR1_EL1: End of Interrupt Register (write to signal completion)
  • ICC_PMR_EL1: Priority Mask Register (sets priority threshold)
  • ICC_IGRPEN1_EL1: Interrupt Group 1 Enable

Interrupt Types and Triggering

Peripheral Interrupt Configuration: Each interrupt source can be configured as edge-triggered or level-sensitive in GICD_ICFGR registers. This configuration must match the actual hardware behavior of the peripheral device.

Edge-Triggered Interrupts: The GIC detects a rising edge on the interrupt line and latches this event. Even if the peripheral immediately de-asserts the interrupt line, the GIC keeps the interrupt pending, which prevents lost interrupts for brief pulses. The latched pending state persists until software acknowledges the interrupt by reading ICC_IAR1_EL1.

Level-Triggered Interrupts: The GIC continuously monitors the interrupt line level. As long as the peripheral holds the line asserted, the interrupt remains pending. Software must service the interrupt and clear the condition at the peripheral device; otherwise, the interrupt will re-trigger immediately.
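
The difference between the two trigger modes can be modelled in a few lines of C. This is a deliberately simplified software model with hypothetical names: it tracks only the pending state and ignores the GIC's active state, priorities, and enable bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum trigger_mode { TRIGGER_LEVEL, TRIGGER_EDGE };

struct irq_input {
    enum trigger_mode mode;
    bool line;     /* current level of the wire from the peripheral */
    bool latched;  /* edge latch, used only in TRIGGER_EDGE mode */
};

/* The peripheral drives the wire high or low. */
static inline void irq_input_drive(struct irq_input *irq, bool level)
{
    assert(irq != NULL);
    if (irq->mode == TRIGGER_EDGE && !irq->line && level)
        irq->latched = true;  /* rising edge seen: remember it */
    irq->line = level;
}

/* Pending follows the latch for edge mode and the wire for level mode. */
static inline bool irq_input_pending(const struct irq_input *irq)
{
    assert(irq != NULL);
    return irq->mode == TRIGGER_EDGE ? irq->latched : irq->line;
}

/* Software acknowledges the interrupt; only the edge latch is cleared. */
static inline void irq_input_acknowledge(struct irq_input *irq)
{
    assert(irq != NULL);
    irq->latched = false;
}
```

In this model a short pulse on an edge-triggered input stays pending until it is acknowledged, whereas a level-triggered input is still pending after the acknowledge if the peripheral keeps the line high.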

GIC to CPU Signaling

Regardless of how peripheral interrupts are configured (edge or level), the GIC CPU Interface presents a level-triggered signal to the CPU. When the GIC has a pending, enabled interrupt with sufficient priority:

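  1. The GIC CPU Interface asserts the IRQ line to the CPU
  2. The IRQ line remains asserted for as long as the GIC has a pending interrupt of sufficient priority that software has not yet acknowledged
  3. The CPU's exception logic monitors this level signal each cycle
  4. When the CPU reads ICC_IAR1_EL1, the acknowledged interrupt becomes active and the GIC de-asserts the IRQ line, unless another pending interrupt of sufficient priority is waiting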

This design allows the GIC to aggregate multiple interrupts and convert edge events into a stable level signal that the CPU can process when ready.

Interrupt Masking

CPU-Level Masking (PSTATE.I): When PSTATE.I=1, the CPU ignores the IRQ signal from the GIC. The GIC continues to assert the signal, but the CPU's exception entry logic is blocked. This is a local, per-CPU mask affecting all interrupts to that CPU.

GIC-Level Disabling (GICD_ICENABLER): Writing to GICD_ICENABLER disables a specific interrupt at the GIC Distributor. The GIC will not forward that interrupt to any CPU interface. The peripheral can still assert the interrupt line, but the GIC blocks it.

Priority Masking (ICC_PMR_EL1): Sets a priority threshold. Lower numbers mean higher priority, and an interrupt is signaled only if its priority value is numerically lower than the threshold; interrupts whose value is equal to or greater than ICC_PMR_EL1 are masked at the GIC CPU interface.

These are independent mechanisms: PSTATE.I masks all interrupts at the CPU architectural level with no GIC interaction, while GICD/ICC registers control masking within the GIC hardware.
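
A minimal sketch of how the three masks combine, written as two predicates: one for the GIC side and one for the CPU side. The struct and function names are invented for illustration, and the model assumes an interrupt is signaled only when its priority value is strictly lower than the ICC_PMR_EL1 threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct irq_source {
    bool    pending;   /* the GIC has latched or sees the interrupt */
    bool    enabled;   /* enable bit set via GICD_ISENABLER */
    uint8_t priority;  /* lower value means higher priority */
};

/* GIC side: does this interrupt drive the IRQ line to the CPU? */
static inline bool gic_signals_irq(const struct irq_source *irq, uint8_t pmr)
{
    assert(irq != NULL);
    return irq->pending && irq->enabled && irq->priority < pmr;
}

/* CPU side: is the IRQ exception actually taken? */
static inline bool cpu_takes_irq(bool irq_line, bool pstate_i)
{
    return irq_line && !pstate_i;
}
```

The predicates are kept separate on purpose: with PSTATE.I set, gic_signals_irq() can still return true while cpu_takes_irq() returns false, which matches the description of the GIC continuing to assert a signal that the CPU does not act on.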

Exception Entry Mechanism

Hardware-Automated Exception Entry

When the CPU recognizes an interrupt (IRQ line asserted from GIC and PSTATE.I=0), it performs these actions atomically in hardware:

ELR_EL1 ← PC                    // Save return address
SPSR_EL1 ← PSTATE               // Save processor state
PSTATE.I ← 1                    // Mask IRQs
PSTATE.F ← 1                    // Mask FIQs
PSTATE.D ← 1                    // Mask debug exceptions
PSTATE.A ← 1                    // Mask SError
PSTATE.IL ← 0                   // Clear illegal state
SPSel ← 1                       // Switch to SP_EL1
PC ← VBAR_EL1 + vector_offset   // Jump to exception handler

These operations happen in silicon—no software involvement. The exception entry is non-maskable once triggered; nothing can interrupt this sequence.
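
To make the sequence concrete, here is a small software model of exception entry and return. It is only a sketch: real hardware does this without any code, PSTATE is not a single register (it is modelled here in SPSR layout for convenience), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSTATE_DAIF    0x3c0ULL       /* D, A, I, F mask bits (bits 9:6) */
#define PSTATE_IL      (1ULL << 20)   /* illegal execution state */
#define PSTATE_M_MASK  0xfULL         /* M[3:0]: EL and SP selection */
#define PSTATE_EL1H    0x5ULL         /* EL1 using SP_EL1 */

struct cpu_model {
    uint64_t pc;
    uint64_t pstate;    /* modelled in SPSR layout */
    uint64_t elr_el1;
    uint64_t spsr_el1;
    uint64_t vbar_el1;
};

/* Model of the hardware steps for an exception taken to EL1. */
static inline void take_exception_to_el1(struct cpu_model *cpu,
                                         uint64_t vector_offset)
{
    assert(cpu != NULL);
    assert(vector_offset < 0x800 && (vector_offset % 0x80) == 0);

    cpu->elr_el1 = cpu->pc;          /* save return address */
    cpu->spsr_el1 = cpu->pstate;     /* save processor state */
    cpu->pstate |= PSTATE_DAIF;      /* mask debug, SError, IRQ, FIQ */
    cpu->pstate &= ~PSTATE_IL;       /* clear illegal state */
    cpu->pstate = (cpu->pstate & ~PSTATE_M_MASK) | PSTATE_EL1H;
    cpu->pc = cpu->vbar_el1 + vector_offset;
}

/* Model of ERET from EL1: restore PC and PSTATE from the saved copies. */
static inline void exception_return(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    cpu->pc = cpu->elr_el1;
    cpu->pstate = cpu->spsr_el1;
}
```

Calling take_exception_to_el1() and then exception_return() leaves pc and pstate exactly as they were before the exception, which is the property that makes interrupts transparent to the interrupted code.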

Vector Offset Calculation

The vector_offset depends on two factors:

Exception Origin:

  • Current EL with SP_EL0: Offsets 0x000-0x180
  • Current EL with SP_ELx: Offsets 0x200-0x380
  • Lower EL (AArch64): Offsets 0x400-0x580
  • Lower EL (AArch32): Offsets 0x600-0x780

Exception Type (within each origin group):

  • Synchronous: +0x000
  • IRQ: +0x080
  • FIQ: +0x100
  • SError: +0x180

Examples:

  • Interrupt from user space (EL0): VBAR_EL1 + 0x480 (Lower EL + IRQ offset)
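  • Interrupt from kernel (EL1, using SP_EL1): VBAR_EL1 + 0x280 (Current EL with SPx + IRQ offset). This is the normal case for Linux, which runs kernel code with SPSel = 1
  • Interrupt from EL1 code using SP_EL0: VBAR_EL1 + 0x080 (Current EL with SP0 + IRQ offset). Linux does not run kernel code in this mode and treats these entries as unexpected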

This distinction allows handlers to know the context they interrupted, enabling different handling strategies for user vs kernel interrupts.
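
The offset rules above reduce to adding a group base and a type offset. The sketch below expresses that in C for exceptions taken to EL1; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of each exception type within its origin group. */
enum exc_type {
    EXC_SYNC   = 0x000,
    EXC_IRQ    = 0x080,
    EXC_FIQ    = 0x100,
    EXC_SERROR = 0x180
};

/*
 * Offset from VBAR_EL1 for an exception taken to EL1.
 * from_el:      exception level that was interrupted (0 or 1)
 * from_spx:     interrupted EL1 code had SPSel = 1 (ignored for EL0)
 * from_aarch32: interrupted EL0 code ran in AArch32 (ignored for EL1)
 */
static inline uint32_t el1_vector_offset(unsigned int from_el, bool from_spx,
                                         bool from_aarch32, enum exc_type type)
{
    uint32_t group;

    assert(from_el <= 1);
    if (from_el == 1)
        group = from_spx ? 0x200u : 0x000u;      /* current EL */
    else
        group = from_aarch32 ? 0x600u : 0x400u;  /* lower EL */

    return group + (uint32_t)type;
}
```

An IRQ from 64-bit user space gives 0x480, an IRQ from EL1 code running on SP_EL1 gives 0x280, and an IRQ from EL1 code running on SP_EL0 gives 0x080.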


The next post, Linux Kernel: Interrupt Handling - Code Walk Through (Part 3), breaks down the execution of the interrupt vector in more detail.
