If WDDM is an operating system, then Video Memory Management (VidMm) is its heart. With WDDM 2.0+, the logic of memory management underwent a fundamental shift from "OS-applied patching" to "driver/application-managed state."
Physical Perspective: Implementation of Memory Segments
The driver describes the GPU's physical memory layout to the OS through "memory segments." This is primarily accomplished through two calls to DxgkDdiQueryAdapterInfo.
Implementation Flow:
- First Call (Get Count):
  - The OS sends DXGKQAITYPE_QUERYSEGMENT (or DXGKQAITYPE_QUERYSEGMENT3).
  - The driver only populates DXGK_QUERYSEGMENTOUT3.NbSegment (e.g., with 1 VRAM segment and 1 Aperture segment, it returns 2).
- Second Call (Populate Descriptors):
  - The OS allocates space for the DXGK_SEGMENTDESCRIPTOR3 array.
  - The driver populates the specific parameters for each segment.
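The two-call flow above can be sketched in plain C. This is a minimal model, not WDK code: SegmentDescriptor and QuerySegmentOut are simplified stand-ins for DXGK_SEGMENTDESCRIPTOR3 and DXGK_QUERYSEGMENTOUT3, and QuerySegments, EnumerateSegments, the sizes, and the BAR address are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins; field names follow the article, but the
   layouts are NOT the real WDK definitions. */
typedef struct {
    uint64_t BaseAddress;
    uint64_t CpuVisibleAddress;
    uint64_t Size;
    unsigned Aperture : 1;
    unsigned CpuVisible : 1;
    unsigned PopulatedFromSystemMemory : 1;
} SegmentDescriptor;

typedef struct {
    uint32_t NbSegment;
    SegmentDescriptor *pSegmentDescriptor; /* NULL on the first call */
} QuerySegmentOut;

#define SEGMENT_COUNT 2u /* hypothetical: 1 VRAM + 1 aperture segment */

/* Hypothetical driver-side handler for the segment query. */
int QuerySegments(QuerySegmentOut *out)
{
    assert(out != NULL);

    /* First call: no descriptor array yet, report only the count. */
    if (out->pSegmentDescriptor == NULL) {
        out->NbSegment = SEGMENT_COUNT;
        return 0;
    }

    /* Second call: the caller's array must be large enough. */
    if (out->NbSegment < SEGMENT_COUNT)
        return -1;

    SegmentDescriptor *d = out->pSegmentDescriptor;
    memset(d, 0, SEGMENT_COUNT * sizeof *d);

    /* Local VRAM, CPU-visible through a (made-up) BAR address. */
    d[0].BaseAddress = 0;
    d[0].Size = 256ull << 20;
    d[0].CpuVisibleAddress = 0xE0000000ull;
    d[0].CpuVisible = 1;

    /* Aperture segment: a window onto system memory. */
    d[1].BaseAddress = 0;
    d[1].Size = 1ull << 30;
    d[1].Aperture = 1;

    out->NbSegment = SEGMENT_COUNT;
    return 0;
}

/* Stand-in for the OS side: ask for the count, then for the descriptors. */
uint32_t EnumerateSegments(SegmentDescriptor *storage, uint32_t capacity)
{
    QuerySegmentOut out = { 0, NULL };
    if (QuerySegments(&out) != 0 || out.NbSegment > capacity)
        return 0;
    out.pSegmentDescriptor = storage;
    if (QuerySegments(&out) != 0)
        return 0;
    return out.NbSegment;
}
```

Keeping the count in one place (here SEGMENT_COUNT) is the point of the sketch: both calls must agree, otherwise the second call writes past or short of the array the caller sized from the first.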
Key Parameter Analysis:
- BaseAddress / Size:
  - For Local VRAM: This is the physical starting address as seen internally by the GPU.
  - For Aperture Segment: This is the window starting address the GPU uses to access system memory.
- CpuVisibleAddress:
  - If the GPU's physical memory is mapped into the CPU's address space via a PCIe BAR, the driver needs to provide CpuVisibleAddress.
  - WDDM 2.0+ Optimization: Even without a large BAR, dynamic mapping via CpuHostAperture is possible, with VidMm handling paging automatically.
- Flags:
  - Aperture: Marks the segment as a "window" into system memory.
  - PopulatedFromSystemMemory: Marks whether the physical backing store of this segment is essentially system RAM.
  - CpuVisible: Tells the OS whether the CPU can directly read/write this segment's memory.
Development Guidance: Be absolutely accurate in marking which segments are PopulatedFromSystemMemory. Misreporting system memory as local VRAM will cause significant discrepancies in the OS page file and the Total Graphics Memory calculation formula.
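To see why a single wrong flag skews the totals, here is a small illustrative model. It is an assumption-laden sketch: SegmentInfo, MemoryTotals, and ClassifySegments are hypothetical names, and the classification rule is a simplification for illustration, not the formula the OS actually applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a segment descriptor. */
typedef struct {
    uint64_t Size;
    unsigned Aperture : 1;
    unsigned PopulatedFromSystemMemory : 1;
} SegmentInfo;

typedef struct {
    uint64_t DedicatedVideoMemory; /* backed by on-board VRAM */
    uint64_t SystemBackedMemory;   /* ultimately backed by system RAM */
} MemoryTotals;

/* Illustrative rule: anything flagged as an aperture or as populated
   from system memory is counted as system-backed. */
MemoryTotals ClassifySegments(const SegmentInfo *segs, size_t count)
{
    MemoryTotals totals = { 0, 0 };
    assert(segs != NULL || count == 0);

    for (size_t i = 0; i < count; ++i) {
        if (segs[i].Aperture || segs[i].PopulatedFromSystemMemory)
            totals.SystemBackedMemory += segs[i].Size;
        else
            totals.DedicatedVideoMemory += segs[i].Size;
    }
    return totals;
}
```

With a 2 GiB local segment, a 1 GiB segment carved out of system RAM, and a 4 GiB aperture, this model reports 2 GiB dedicated. Clear PopulatedFromSystemMemory on the middle segment and the dedicated figure silently becomes 3 GiB, which is the kind of discrepancy the guidance above warns about.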
Creating Allocations: DxgkDdiCreateAllocation
When an application requests resources, the OS calls this DDI.
- Responsibility: The driver must calculate the size and alignment requirements for the resource and return DXGK_ALLOCATIONINFO.
- WDDM 2.0 Change: Drivers no longer need to record a "Patch Location List," as resources are now accessed via virtual addresses.
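The size and alignment calculation might look like the following for a simple linear surface. This is a sketch under stated assumptions: the 256-byte pitch alignment and 4 KiB allocation alignment are made-up hardware constraints, and AllocationLayout, AlignUp, and ComputeLinearSurfaceLayout are hypothetical helpers. Only the resulting Size and Alignment values correspond to what a driver would report in DXGK_ALLOCATIONINFO.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hardware constraints; real values are device-specific. */
#define PITCH_ALIGNMENT      256u
#define ALLOCATION_ALIGNMENT 4096u

typedef struct {
    uint64_t Size;      /* total bytes to report for the allocation */
    uint32_t Alignment; /* required base-address alignment in bytes */
    uint32_t Pitch;     /* bytes per row; driver-private bookkeeping */
} AllocationLayout;

/* Round value up to the next multiple of a power-of-two alignment. */
uint64_t AlignUp(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Layout of an untiled (linear) 2D surface. */
AllocationLayout ComputeLinearSurfaceLayout(uint32_t width,
                                            uint32_t height,
                                            uint32_t bytesPerPixel)
{
    AllocationLayout layout;
    uint64_t pitch = AlignUp((uint64_t)width * bytesPerPixel, PITCH_ALIGNMENT);

    layout.Pitch = (uint32_t)pitch;
    layout.Alignment = ALLOCATION_ALIGNMENT;
    layout.Size = AlignUp(pitch * height, ALLOCATION_ALIGNMENT);
    return layout;
}
```

For a 1366x768 surface at 4 bytes per pixel, the 5464-byte row is padded to a 5632-byte pitch, giving a total size of 4,325,376 bytes. Tiled or compressed formats would need a different, hardware-specific calculation.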
Engineering Focus: GPU Virtual Addressing (GPUVA)
This is a core feature of WDDM 2.0.
- Concept: Each process has its own GPU virtual address space, typically 48 bits or larger depending on the hardware.
- Benefit: UMD can use fixed addresses directly in the command stream, eliminating the need for the OS to modify instructions before submission.
- Driver Responsibility: KMD needs to implement page table operations (via UpdatePageTable operations within DxgkDdiBuildPagingBuffer).
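As a rough picture of what a page table update involves, the sketch below splits a 48-bit GPU virtual address into per-level indices and fills a run of leaf entries. Everything about the format is assumed for illustration: 4 KiB pages, four levels of 9 bits each, and a PTE with a valid bit in bit 0 are hypothetical, as are the names PageTableIndex and UpdateLeafEntries. In a real driver the entries come from VidMm and the writes are typically encoded into the paging buffer for the GPU to execute; the CPU-side loop here only models the effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 4 KiB pages, 4 levels x 9 bits = 48-bit VA. */
#define PAGE_SHIFT 12u
#define INDEX_BITS 9u
#define LEVELS     4u
#define ENTRIES_PER_TABLE (1u << INDEX_BITS)

/* Hypothetical PTE format: bit 0 = valid, bits 12+ = page address. */
typedef uint64_t Pte;
#define PTE_VALID 1ull

/* Index into the table at the given level (0 = leaf level). */
uint32_t PageTableIndex(uint64_t gpuVa, unsigned level)
{
    assert(level < LEVELS);
    return (uint32_t)((gpuVa >> (PAGE_SHIFT + INDEX_BITS * level))
                      & (ENTRIES_PER_TABLE - 1));
}

/* Map `count` consecutive pages starting at firstPhysicalAddress
   into one leaf table, beginning at startIndex. */
void UpdateLeafEntries(Pte *table, uint32_t startIndex, uint32_t count,
                       uint64_t firstPhysicalAddress)
{
    assert(table != NULL);
    assert((uint64_t)startIndex + count <= ENTRIES_PER_TABLE);
    assert((firstPhysicalAddress & ((1ull << PAGE_SHIFT) - 1)) == 0);

    for (uint32_t i = 0; i < count; ++i) {
        uint64_t page = firstPhysicalAddress + ((uint64_t)i << PAGE_SHIFT);
        table[startIndex + i] = page | PTE_VALID;
    }
}
```

Because the UMD picks the virtual address once and only the table entries change when memory moves, the command stream itself never needs patching.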
Residency Mechanism: MakeResident and Eviction
Before WDDM 2.0, the OS automatically ensured all resources referenced by a command buffer were in video memory. Now, this responsibility is handed over to the driver (primarily UMD).
- MakeResident (Triggered by UMD): The driver explicitly tells the OS, "The upcoming operations need these resources; please keep them in video memory."
- Eviction (OS Policy): When video memory is low, the OS uses algorithms like LRU to move resources that are no longer marked resident out to system memory.
- Development Guidance:
  - Do Not Over-Reside: MakeResident calls that exceed the process budget can fail.
  - Handle Eviction Notifications: In WDDM 3.2, if a resource requires special handling (e.g., decompression) before eviction, the NotifyEviction flag can be utilized.
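One way a UMD can avoid over-residing is to track its own resident byte count against the budget before asking the OS for more. The following is a minimal bookkeeping sketch, not the actual residency API: ResidencyTracker, TryMakeResident, and Evict are hypothetical names, and in practice the budget value would come from the OS rather than a constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UMD-side bookkeeping for one process. */
typedef struct {
    uint64_t Budget;        /* bytes the process may keep resident */
    uint64_t ResidentBytes; /* bytes currently requested resident */
} ResidencyTracker;

/* Returns false (and changes nothing) if the request would exceed
   the budget; the caller should evict something and retry. */
bool TryMakeResident(ResidencyTracker *tracker, uint64_t bytes)
{
    assert(tracker != NULL);
    assert(tracker->ResidentBytes <= tracker->Budget);

    if (bytes > tracker->Budget - tracker->ResidentBytes)
        return false;
    tracker->ResidentBytes += bytes;
    return true;
}

/* Give back bytes that no longer need to be resident. */
void Evict(ResidencyTracker *tracker, uint64_t bytes)
{
    assert(tracker != NULL);
    assert(bytes <= tracker->ResidentBytes);
    tracker->ResidentBytes -= bytes;
}
```

Checking locally first keeps the failure path cheap: the driver can choose what to evict itself instead of discovering the problem through a failed call.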
Advanced Perspective: WDDM vs. Linux (GEM/TTM)
If you are familiar with Linux kernel driver development, you will find WDDM's VidMm shares similarities with Linux's GEM/TTM, but there are core philosophical differences.
| Feature | WDDM 2.0+ (VidMm) | Linux (GEM/TTM) |
|---|---|---|
| Basic Unit | Allocation | Buffer Object (BO) |
| Memory Abstraction | Segments: Memory, Aperture, System | Regions/Placements: VRAM, GTT, System |
| Residency Logic | Explicit Residency: UMD actively maintains device residency list | Validation Logic: Kernel ensures BO is in the correct Region upon command submission |
| Address Binding | GPUVA: UMD allocates fixed virtual address, OS handles page table updates | Relocations (Traditional GEM): OS modifies instruction stream; GPUVA (Modern): Similar to WDDM |
| Migration/Paging | OS-Driven: VidMm decides when to page, driver executes BuildPagingBuffer | Driver-Led: TTM provides framework, driver implements specific migration logic (Move) |
Key Differences:
- Boundary of Responsibility: WDDM 2.0+ delegates more decision-making power regarding "which resources need to be resident" to the UMD (User Mode Driver) to reduce kernel call overhead. In contrast, traditional Linux TTM validates the BO list during each execbuffer submission.
- Page Table Management: GPU virtual address management in WDDM is highly standardized, with the OS deeply involved in the page table lifecycle; under Linux, different drivers (e.g., i915 vs. AMDGPU) have greater freedom in their page table implementations.
Advanced Topic: IOMMU and Hardware Isolation
For modern drivers (WDDM 2.4+), the IOMMU is no longer transparent. It provides security (GPU isolation) and, through DMA remapping, also solves addressing issues on high-end servers with more than 1 TB of physical memory.
- Core Challenge: Shifting from using physical addresses (MDL) to logical addresses (ADL).
- DDI That Must Be Implemented: DxgkDdiBeginExclusiveAccess (ensures hardware is quiet during IOMMU switches).
Deep Dive: Please refer to the dedicated document WDDM Advanced: IOMMU and DMA Remapping.
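To make the MDL-to-ADL shift slightly more concrete, the sketch below walks a list of logical page numbers that may be either one contiguous run or a scattered array. LogicalAddressList is a hypothetical stand-in loosely modeled on the idea of an address descriptor list, not the WDK structure, and LogicalAddressOfPage and the 4 KiB page size are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOGICAL_PAGE_SHIFT 12u /* assumes 4 KiB pages */

/* Hypothetical stand-in for a logical address descriptor list. */
typedef struct {
    uint32_t PageCount;
    bool Contiguous;         /* true: one run starting at BasePageNumber */
    uint64_t BasePageNumber; /* used when Contiguous is true */
    const uint64_t *Pages;   /* used when Contiguous is false */
} LogicalAddressList;

/* Logical (device-visible) address of the index-th page in the list. */
uint64_t LogicalAddressOfPage(const LogicalAddressList *adl, uint32_t index)
{
    assert(adl != NULL && index < adl->PageCount);
    assert(adl->Contiguous || adl->Pages != NULL);

    uint64_t pageNumber = adl->Contiguous
        ? adl->BasePageNumber + index
        : adl->Pages[index];
    return pageNumber << LOGICAL_PAGE_SHIFT;
}
```

The key habit is that the driver programs the GPU with whatever addresses this list yields and never assumes they equal CPU physical addresses.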
Developer Advice: Accuracy of Video Memory Statistics
The OS heavily relies on the video memory statistics reported by the driver.
- Reporting Mechanism: Ensure DXGK_DRIVERCAPS in DxgkDdiQueryAdapterInfo correctly returns parameters like MaxSharedSystemMemory.
- Common Issue: If the driver misreports the memory size, it will cause the OS to incorrectly configure page file settings, leading to memory pressure crashes that are difficult to debug.


