DEV Community

Deleon Karen


Part 3: The Core of Memory Management: Allocation and Residency

If WDDM were an operating system, Video Memory Management (VidMm) would be its heart. With WDDM 2.0 and later, memory management shifted fundamentally from "OS-applied patching" to "driver/application-managed state."

Physical Perspective: Implementation of Memory Segments

The driver describes the GPU's physical memory layout to the OS through "memory segments." This is primarily accomplished through two calls to DxgkDdiQueryAdapterInfo.

Sequence diagram of the two‑phase memory segment query between Dxgkrnl (VidMm) and the miniport driver via DxgkDdiQueryAdapterInfo. Phase 1 retrieves the segment count; Phase 2 populates the DXGK_SEGMENTDESCRIPTOR3 array describing VRAM and Aperture segments.

Implementation Flow:

  1. First Call (Get Count):
    • The OS sends DXGKQAITYPE_QUERYSEGMENT (or DXGKQAITYPE_QUERYSEGMENT3) with no descriptor array attached.
    • The driver populates only DXGK_QUERYSEGMENTOUT3.NbSegment (for example, 2 for one VRAM segment plus one aperture segment).
  2. Second Call (Populate Descriptors):
    • The OS allocates space for the DXGK_SEGMENTDESCRIPTOR3 array.
    • The driver populates the specific parameters for each segment.
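
The count-then-populate handshake above can be sketched as a small standalone C model. The types and the function name here (SegmentDescriptor, QuerySegmentOut, query_segments) are simplified stand-ins invented for illustration, not the real WDK definitions, and the segment sizes are made up. Only the two-call pattern itself comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for DXGK_SEGMENTDESCRIPTOR3 / DXGK_QUERYSEGMENTOUT3.
   Field names are illustrative and do not match the real WDK layout. */
typedef struct {
    uint64_t base_address;
    uint64_t size;
    int      is_aperture;
} SegmentDescriptor;

typedef struct {
    uint32_t           nb_segment;   /* segment count */
    SegmentDescriptor *descriptors;  /* NULL on the first call */
} QuerySegmentOut;

enum { SEGMENT_COUNT = 2 };          /* assumed: 1 VRAM + 1 aperture */

/* Hypothetical driver-side handler for the segment query.
   Returns 0 on success, -1 if the caller's array is too small. */
int query_segments(QuerySegmentOut *out)
{
    assert(out != NULL);

    /* Phase 1: no array supplied, so report only the count. */
    if (out->descriptors == NULL) {
        out->nb_segment = SEGMENT_COUNT;
        return 0;
    }

    /* Phase 2: the caller allocated the array, so fill each entry. */
    if (out->nb_segment < SEGMENT_COUNT)
        return -1;

    /* Local VRAM: 4 GiB starting at GPU address 0 (made-up values). */
    out->descriptors[0] = (SegmentDescriptor){ 0x0, 4ull << 30, 0 };
    /* Aperture: 1 GiB window placed right after VRAM (made-up values). */
    out->descriptors[1] = (SegmentDescriptor){ 4ull << 30, 1ull << 30, 1 };
    return 0;
}
```

A caller would invoke query_segments once with descriptors set to NULL, allocate nb_segment entries, and then call it again with the array attached.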

Key Parameter Analysis:

  • BaseAddress / Size:
    • For Local VRAM: This is the physical starting address as seen internally by the GPU.
    • For Aperture Segment: This is the window starting address the GPU uses to access system memory.
  • CpuTranslatedAddress:
    • If the GPU's physical memory is mapped into the CPU's address space via a PCIe BAR, the driver needs to provide CpuTranslatedAddress, the CPU-side address of that mapping.
    • WDDM 2.0+ Optimization: Even without a large BAR, dynamic mapping via CpuHostAperture is possible, with VidMm managing the mappings automatically.
  • Flags:
    • Aperture: Marks the segment as a "window" into system memory.
    • PopulatedFromSystemMemory: Marks whether the physical backing store of this segment is essentially system RAM.
    • CpuVisible: Tells the OS whether the CPU can directly read/write this segment's memory.

Development Guidance: Mark PopulatedFromSystemMemory accurately on every segment it applies to. Misreporting system memory as local VRAM skews the OS's memory accounting, including the Total Graphics Memory calculation and any page file sizing derived from it.
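
To see why one wrong flag matters, here is a deliberately simplified accounting model. It is not the formula Windows uses; the flag constants, the Segment struct, and tally_segments are hypothetical names for illustration. The only idea taken from the text is that segments flagged as Aperture or PopulatedFromSystemMemory are backed by system RAM and must not be counted as dedicated video memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits. The real DXGK_SEGMENTFLAGS is a bitfield struct. */
enum {
    SEG_APERTURE                     = 1u << 0,
    SEG_POPULATED_FROM_SYSTEM_MEMORY = 1u << 1,
    SEG_CPU_VISIBLE                  = 1u << 2
};

typedef struct {
    uint64_t size;
    unsigned flags;
} Segment;

typedef struct {
    uint64_t dedicated;      /* bytes counted as local VRAM */
    uint64_t system_backed;  /* bytes that are really system RAM */
} MemoryTotals;

/* Toy accounting: split segment sizes by what physically backs them. */
MemoryTotals tally_segments(const Segment *segs, size_t count)
{
    MemoryTotals totals = { 0, 0 };
    const unsigned system_mask =
        SEG_APERTURE | SEG_POPULATED_FROM_SYSTEM_MEMORY;

    assert(segs != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (segs[i].flags & system_mask)
            totals.system_backed += segs[i].size;
        else
            totals.dedicated += segs[i].size;
    }
    return totals;
}
```

In this model, dropping the flag from a 1 GiB system-backed segment inflates the dedicated total by 1 GiB, which is the kind of discrepancy the guidance above warns about.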

Creating Allocations: DxgkDdiCreateAllocation

When an application requests resources, the OS calls this DDI.

  • Responsibility: The driver must calculate the size and alignment requirements for the resource and return DXGK_ALLOCATIONINFO.
  • WDDM 2.0 Change: Drivers no longer need to record a "Patch Location List," as resources are now accessed via virtual addresses.
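
The size and alignment calculation is hardware specific, but its general shape can be sketched as follows. The 256-byte pitch alignment, the 64 KiB allocation granularity, and the AllocationInfo struct are assumptions made up for this example; a real driver derives these values from its hardware and reports them through DXGK_ALLOCATIONINFO.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Toy stand-in for the fields a driver reports per allocation. */
typedef struct {
    uint32_t pitch;      /* bytes per row after padding */
    uint64_t size;       /* total bytes to reserve */
    uint64_t alignment;  /* required base-address alignment */
} AllocationInfo;

/* Hypothetical hardware rules, chosen only for illustration. */
enum {
    PITCH_ALIGN = 256,        /* rows padded to 256 bytes */
    ALLOC_ALIGN = 64 * 1024   /* allocations sized in 64 KiB units */
};

/* Compute size and alignment for a linear 2D surface. */
AllocationInfo describe_surface(uint32_t width, uint32_t height,
                                uint32_t bytes_per_pixel)
{
    AllocationInfo info;
    info.pitch = (uint32_t)align_up((uint64_t)width * bytes_per_pixel,
                                    PITCH_ALIGN);
    info.size = align_up((uint64_t)info.pitch * height, ALLOC_ALIGN);
    info.alignment = ALLOC_ALIGN;
    return info;
}
```

Under these assumed rules, a 1920x1080 surface at 4 bytes per pixel needs no row padding (7680 is already a multiple of 256), and its total size is rounded up to the next 64 KiB boundary.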

Engineering Focus: GPU Virtual Addressing (GPUVA)

This is a core feature of WDDM 2.0.

Overview of GPU Virtual Addressing (GPUVA) in WDDM 2.0. Each process has its own GPU virtual address space. The VidMm and driver page tables map virtual addresses to physical pages residing on hardware segments such as local VRAM or system memory accessed through an aperture over PCIe.

  • Concept: Each process has a 48-bit (or larger) virtual address space.
  • Benefit: UMD can use fixed addresses directly in the command stream, eliminating the need for the OS to modify instructions before submission.
  • Driver Responsibility: KMD needs to implement page table operations (via UpdatePageTable operations within DxgkDdiBuildPagingBuffer).

Residency Mechanism: MakeResident and Eviction

Before WDDM 2.0, the OS automatically ensured all resources referenced by a command buffer were in video memory. Now, this responsibility is handed over to the driver (primarily UMD).

State diagram of allocation residency and eviction in WDDM 2.0+. UMD explicitly requests residency via MakeResident (reference count +1). Under memory pressure, the OS may evict the allocation to system memory. Evicted allocations become non‑resident and must be made resident again before rendering.

  • MakeResident (Triggered by UMD): The driver explicitly tells the OS, "The upcoming operations need these resources; please keep them in video memory."
  • Eviction (OS Policy): When video memory runs low, the OS applies a policy such as LRU to choose which allocations to page out to system memory, starting with those that are no longer required to be resident.
  • Development Guidance:
    • Do Not Over-Reside: MakeResident calls that push the process past its budget can fail.
    • Handle Eviction Notifications: In WDDM 3.2, if a resource requires special handling (e.g., decompression) before eviction, the NotifyEviction flag can be utilized.

Advanced Perspective: WDDM vs. Linux (GEM/TTM)

If you are familiar with Linux kernel driver development, you will find WDDM's VidMm shares similarities with Linux's GEM/TTM, but there are core philosophical differences.

| Feature | WDDM 2.0+ (VidMm) | Linux (GEM/TTM) |
|---|---|---|
| Basic Unit | Allocation | Buffer Object (BO) |
| Memory Abstraction | Segments: Memory, Aperture, System | Regions/Placements: VRAM, GTT, System |
| Residency Logic | Explicit Residency: UMD actively maintains device residency list | Validation Logic: Kernel ensures BO is in the correct Region upon command submission |
| Address Binding | GPUVA: UMD allocates fixed virtual address, OS handles page table updates | Relocations (Traditional GEM): OS modifies instruction stream; GPUVA (Modern): Similar to WDDM |
| Migration/Paging | OS-Driven: VidMm decides when to page, driver executes BuildPagingBuffer | Driver-Led: TTM provides framework, driver implements specific migration logic (Move) |
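
The "Address Binding" row is easier to picture with the legacy mechanism written out. The sketch below models relocation-style patching in the abstract; the Relocation struct and apply_relocations are hypothetical names, and neither the pre-2.0 WDDM patch location list nor any Linux driver's relocation format looks exactly like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One patch request: "write the address of this allocation here". */
typedef struct {
    size_t   command_offset;  /* index of the 64-bit word to patch */
    uint32_t allocation;      /* index into the allocation address list */
    uint64_t delta;           /* byte offset inside the allocation */
} Relocation;

/* Legacy-style binding: before submission, the kernel writes the
   current address of each referenced allocation into the commands. */
void apply_relocations(uint64_t *commands, size_t command_count,
                       const Relocation *relocs, size_t reloc_count,
                       const uint64_t *allocation_addresses)
{
    for (size_t i = 0; i < reloc_count; ++i) {
        assert(relocs[i].command_offset < command_count);
        commands[relocs[i].command_offset] =
            allocation_addresses[relocs[i].allocation] + relocs[i].delta;
    }
}
```

With GPUVA this pass disappears: the UMD writes a stable virtual address into the command stream once, and only the page tables change when the allocation moves.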

Key Differences:

  • Boundary of Responsibility: WDDM 2.0+ delegates more of the decision about "which resources need to be resident" to the UMD (User Mode Driver) to reduce kernel call overhead. In contrast, traditional Linux GEM/TTM drivers validate the BO list on every command submission (for example, each execbuffer call).
  • Page Table Management: GPU virtual address management in WDDM is highly standardized, with the OS deeply involved in the page table lifecycle; under Linux, different drivers (e.g., i915 vs. AMDGPU) have greater freedom in their page table implementations.

Advanced Topic: IOMMU and Hardware Isolation

For modern drivers (WDDM 2.4+), the IOMMU is no longer transparent. It provides security through GPU isolation and, through DMA remapping, solves addressing problems on high-end servers with more than 1TB of physical memory.

  • Core Challenge: Shifting from using physical addresses (MDL) to logical addresses (ADL).
  • DDI That Must Be Implemented: DxgkDdiBeginExclusiveAccess (ensures hardware is quiet during IOMMU switches).

Deep Dive: Please refer to the dedicated document WDDM Advanced: IOMMU and DMA Remapping.

Developer Advice: Accuracy of Video Memory Statistics

The OS heavily relies on the video memory statistics reported by the driver.

  • Reporting Mechanism: Ensure DXGK_DRIVERCAPS in DxgkDdiQueryAdapterInfo correctly returns parameters like MaxSharedSystemMemory.
  • Common Issue: If the driver misreports the memory size, it will cause the OS to incorrectly configure page file settings, leading to memory pressure crashes that are difficult to debug.
