In the previous lecture, we took a macroscopic look at the structure of the Linux graphics stack. Today, we narrow our focus and dive deep into the i915 driver. We'll see how, when the kernel discovers an Intel GPU, the driver builds its entire complex software kingdom from scratch, step by step.
Initialization is one of the most challenging parts of a driver: it must be compatible with integrated graphics from 20 years ago while also supporting the latest high-performance discrete graphics cards.
1. PCI Probing: The First Greeting
The starting point for everything is i915_pci_probe in i915_pci.c. When the Linux kernel's PCI core finds a device that matches an entry in the driver's ID table (every entry carries Intel's Vendor ID, 0x8086), this function is called.
1.1 Device Identification via pciidlist
Intel has hundreds of GPU models (from the i830 to Alder Lake, and the latest DG2/MTL). The driver cannot write a separate function for each model. i915's approach is to use a massive mapping table, pciidlist:
    // i915_pci.c
    static const struct pci_device_id pciidlist[] = {
        INTEL_I915G_IDS(INTEL_VGA_DEVICE, &i915g_info),
        INTEL_KBL_GT2_IDS(INTEL_VGA_DEVICE, &kbl_gt2_info),
        INTEL_DG2_IDS(INTEL_VGA_DEVICE, &dg2_info),
        // ...
    };
The intel_device_info structure here acts like the GPU's "genetic blueprint," defining which features the model supports (e.g., does it have an LLC cache, how many media engines, the version of the display engine, etc.).
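The table-driven idea can be sketched in plain userspace C. This is a minimal illustration, not the kernel's code: the struct names, the two info entries, and the device IDs 0x1111 and 0x2222 are all made up for the example, and the real intel_device_info carries far more fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed stand-in for intel_device_info. */
struct demo_device_info {
    const char *name;
    bool has_llc;             /* shares the CPU's last-level cache? */
    bool require_force_probe; /* pre-release hardware? */
};

/* Hypothetical stand-in for struct pci_device_id. */
struct demo_pci_id {
    uint16_t vendor;
    uint16_t device;
    const struct demo_device_info *info;
};

static const struct demo_device_info demo_old_info = { "demo-old", false, false };
static const struct demo_device_info demo_new_info = { "demo-new", true, true };

/* The device IDs below are invented for this example. */
static const struct demo_pci_id demo_idlist[] = {
    { 0x8086, 0x1111, &demo_old_info },
    { 0x8086, 0x2222, &demo_new_info },
    { 0, 0, NULL } /* zeroed terminator marks the end of the table */
};

/* Walk the table and return the blueprint for a matching device, or NULL. */
const struct demo_device_info *demo_match(uint16_t vendor, uint16_t device)
{
    for (const struct demo_pci_id *id = demo_idlist; id->vendor != 0; id++) {
        if (id->vendor == vendor && id->device == device)
            return id->info;
    }
    return NULL;
}
```

Calling demo_match(0x8086, 0x2222) returns the "demo-new" blueprint, while an unknown device ID returns NULL, which corresponds to the driver never being asked to probe that device. Adding support for a new model then means adding a table row and an info struct rather than a new code path.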
1.2 Force Probe
For hardware that has not yet been officially released, i915 will not bind by default. Such platforms carry a require_force_probe flag in their intel_device_info. Unless the user adds i915.force_probe=xxxx (where xxxx is the device's PCI ID) to the kernel parameters, the driver logs a message and declines to bind. This protects users from a driver that is still in development and potentially unstable.
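A rough userspace sketch of that gate follows. It assumes the parameter is either "*" or a comma-separated list of hexadecimal device IDs; the function names and the -1 return value are hypothetical, and the real driver's parsing and error code may differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: does the force_probe string name this device?
 * Accepts "*" or a comma-separated list of hex IDs such as "1111,2222".
 */
bool demo_force_probe_listed(const char *param, uint16_t device_id)
{
    if (param == NULL || *param == '\0')
        return false;
    if (strcmp(param, "*") == 0)
        return true;

    const char *p = param;
    while (*p != '\0') {
        char *end;
        unsigned long val = strtoul(p, &end, 16);

        /* A token matches only if it parsed and ends at ',' or end of string. */
        if (end != p && val == device_id && (*end == ',' || *end == '\0'))
            return true;

        /* Skip ahead to the next comma-separated token, if any. */
        p = strchr(p, ',');
        if (p == NULL)
            break;
        p++;
    }
    return false;
}

/* Returns 0 to keep probing, -1 to decline (a kernel driver would return an errno). */
int demo_check_force_probe(bool require_force_probe, const char *param,
                           uint16_t device_id)
{
    if (require_force_probe && !demo_force_probe_listed(param, device_id))
        return -1;
    return 0;
}
```

With this sketch, a device flagged as requiring force probe is accepted for "1111,2222" or "*" when its ID is 0x2222, and declined for an empty string. Devices without the flag pass regardless of the parameter.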
2. The Initialization Blueprint: i915_driver_probe
Once the PCI handshake is successful, the core logic shifts to i915_driver_probe in i915_driver.c.
This is a grand process, which can be divided into five key phases:
Phase 1: Early Probe
In this phase, the driver creates the core structure drm_i915_private (often abbreviated as i915 in the code), which represents the entire GPU instance. It also initializes spinlocks, mutexes, and basic workqueues.
Phase 2: MMIO Probe (Register Mapping)
The driver maps the GPU's registers into the CPU's memory address space via the PCI BAR space.
- Key Operation: Calls i915_driver_mmio_probe.
- Significance: From this point on, the driver can "talk" to the hardware using intel_uncore_read() and intel_uncore_write().
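To make "registers mapped into the address space" concrete, here is a minimal userspace model. It is only a sketch: the demo_ names are hypothetical, an ordinary array stands in for the mapped PCI BAR, and the real intel_uncore_read() and intel_uncore_write() do considerably more than index a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BAR_SIZE 4096u /* hypothetical size of the register window, in bytes */

/* Hypothetical stand-in for intel_uncore: just the mapped window. */
struct demo_uncore {
    volatile uint32_t *regs; /* base of the mapped register window */
    size_t size;             /* window size in bytes */
};

/* In the kernel the base comes from mapping a PCI BAR; here it is plain memory. */
static uint32_t demo_fake_bar[DEMO_BAR_SIZE / sizeof(uint32_t)];

void demo_uncore_init(struct demo_uncore *uncore)
{
    uncore->regs = demo_fake_bar;
    uncore->size = sizeof(demo_fake_bar);
}

/* Read one 32-bit register at a byte offset from the window base. */
uint32_t demo_uncore_read(const struct demo_uncore *uncore, uint32_t offset)
{
    assert(offset % 4 == 0 && offset < uncore->size);
    return uncore->regs[offset / 4];
}

/* Write one 32-bit register at a byte offset from the window base. */
void demo_uncore_write(struct demo_uncore *uncore, uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0 && offset < uncore->size);
    uncore->regs[offset / 4] = value;
}
```

The volatile qualifier is the important detail: it tells the compiler that every access must really happen, because on real hardware a register read or write can have side effects that ordinary memory does not.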
Phase 3: HW Probe (Hardware Assessment)
The driver begins probing the hardware's actual parameters, rather than relying solely on the static intel_device_info:
- Determines the GPU's core frequency.
- Probes memory regions: whether there is only system memory or also dedicated video memory (LMEM).
- Probes GT (Graphics Technology) topology: how many slices, how many EUs (Execution Units) exist on the hardware.
Phase 4: GEM & Display (Core Construction)
This is the heaviest part:
- GEM Initialization: Establishes the memory manager and sets up the GGTT (Global Graphics Translation Table).
- Display Initialization: Probes display connectors (HDMI/DP) and reads screen EDID.
Phase 5: Registration (Official Listing)
Finally, i915_driver_register is called to register the device with the DRM core, which creates the cardX and renderDX nodes under /dev/dri/ (for example, card0 and renderD128). Only at this point can userspace applications (such as the X server or games) actually see and use this GPU.
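The ordering of the five phases can be sketched as a small state machine. This is a simplified userspace model under two assumptions: each phase is reduced to a flag, and a failed phase undoes the earlier ones in reverse order, which is the usual convention for kernel probe functions. All demo_ names are hypothetical.

```c
#include <stddef.h>

/* The five phases described above, in the order they run. */
enum demo_phase {
    DEMO_EARLY_PROBE,
    DEMO_MMIO_PROBE,
    DEMO_HW_PROBE,
    DEMO_GEM_DISPLAY,
    DEMO_REGISTER,
    DEMO_NUM_PHASES
};

/* Hypothetical per-device state, tracking only which phases completed. */
struct demo_device {
    int fail_at;               /* phase forced to fail, or -1 for none */
    int done[DEMO_NUM_PHASES]; /* 1 once a phase has completed */
};

static int demo_phase_init(struct demo_device *dev, int phase)
{
    if (phase == dev->fail_at)
        return -1; /* simulated failure */
    dev->done[phase] = 1;
    return 0;
}

static void demo_phase_fini(struct demo_device *dev, int phase)
{
    dev->done[phase] = 0;
}

/* Returns 0 on success; on failure returns -1 with every earlier phase undone. */
int demo_driver_probe(struct demo_device *dev)
{
    for (int phase = 0; phase < DEMO_NUM_PHASES; phase++) {
        if (demo_phase_init(dev, phase) != 0) {
            /* Unwind in reverse order, newest completed phase first. */
            while (--phase >= 0)
                demo_phase_fini(dev, phase);
            return -1;
        }
    }
    return 0;
}
```

The point of the strict ordering is dependency: hardware probing needs MMIO access, GEM and display setup need the probed parameters, and registration comes last so that userspace never sees a half-initialized device.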
3. Fundamentals of Hardware Interaction: Uncore and Forcewake
During initialization, you will frequently see the word uncore in the code. This is the core abstraction i915 uses to handle hardware access.
3.1 What is "Uncore"?
In modern SoC architectures, besides the Core that performs rendering calculations, there are many shared, common components (like power management, memory controllers, etc.). These are referred to as "Uncore". In i915, the intel_uncore structure is responsible for managing all MMIO register reads and writes.
3.2 Waking the Sleeping Silicon: Forcewake
To save power, different parts (power domains) of an Intel GPU enter deep sleep when idle. If you read a register in a sleeping domain directly, the read might return all ones (0xFFFFFFFF), or even cause the machine to hang.
The Forcewake mechanism ensures safe access:
- When the driver needs to access a register, it first calls intel_uncore_forcewake_get().
- The driver sends a "wake-up" request to the hardware.
- The hardware informs the driver it is awake via an acknowledgment register (ACK register).
- The driver completes its read/write operations and finally calls intel_uncore_forcewake_put() to allow the hardware to go back to sleep.
This "request-wait-operate" pattern is the fundamental guarantee of i915's interaction with the hardware.
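The sequence can be modeled in userspace as follows. This is a toy sketch, not the driver's implementation: the demo_ names are hypothetical, the "hardware" is a function that copies the request bit into the ACK bit, and the reference count and retry limit are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one power domain; a real driver uses MMIO registers. */
struct demo_fw_domain {
    unsigned int wake_count; /* outstanding get() calls */
    bool request;            /* driver's wake request bit */
    bool ack;                /* hardware's acknowledgment bit */
    uint32_t reg;            /* a register living inside the domain */
};

/* Simulated hardware: mirrors the request bit into the ACK bit. */
static void demo_hw_step(struct demo_fw_domain *d)
{
    d->ack = d->request;
}

/* Request wake-up and poll for the ACK; returns false if it never arrives. */
bool demo_forcewake_get(struct demo_fw_domain *d)
{
    if (d->wake_count++ > 0)
        return true; /* already held awake by an earlier caller */

    d->request = true;
    for (int tries = 0; tries < 100; tries++) {
        demo_hw_step(d);
        if (d->ack)
            return true;
    }

    /* Timed out: drop our reference and withdraw the request. */
    d->wake_count--;
    d->request = false;
    return false;
}

/* Drop one reference; the last put() lets the domain sleep again. */
void demo_forcewake_put(struct demo_fw_domain *d)
{
    assert(d->wake_count > 0);
    if (--d->wake_count == 0) {
        d->request = false;
        demo_hw_step(d);
    }
}

/* Reads return all ones while the domain is asleep, as described above. */
uint32_t demo_read(const struct demo_fw_domain *d)
{
    return d->ack ? d->reg : 0xFFFFFFFFu;
}
```

In this model a read before demo_forcewake_get() yields 0xFFFFFFFF, a read between get and put yields the real register value, and a read after the final demo_forcewake_put() yields 0xFFFFFFFF again. The reference count lets nested callers share one wake-up instead of toggling the hardware on every access.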
Summary
Initialization is a tightly orchestrated symphony: confirming identity through PCI probing, establishing communication via MMIO mapping, ensuring the hardware is awake with Forcewake, and finally delivering functionality through GEM and Display.

