Deleon Karen

Posted on Jun 2

Part 2: Architecture — The "Duet" of Host and Virtual Machine

#architecture #infrastructure #microsoft #systems

In the previous part, we covered the basic concepts of GPU-P. Today, we'll dive deep into its internal architecture. If a traditional graphics driver is a solo performance, then GPU-P is a precisely choreographed "duet": the Host and the Guest (virtual machine) have clearly defined roles, working together closely through VMBus.

Component Model: An Unbalanced "Bilocation"

Under the GPU-P architecture, the driver components are split between two worlds — the host and the virtual machine. Interestingly, the components in these two worlds are asymmetric.

Host: The All-Powerful "Brain"

The host possesses the complete graphics subsystem stack and serves as the ultimate resource manager:

Full-Featured KMD (Kernel Mode Driver): Interacts directly with the physical GPU hardware.
VidMm (Video Memory Manager): Controls the allocation and scheduling of all video memory.
VidSch (Scheduler): Decides which virtual machine's tasks can enter the GPU hardware queues.
UMD (User Mode Driver): Used for the host's own rendering tasks.

Guest (Virtual Machine): The Lean "Executor"

Inside the virtual machine, things are very "slim":

UMD Only: An adapted user mode driver runs inside the VM, directly facing applications (like games or AI frameworks).
No KMD: The manufacturer's kernel mode driver code does not run inside the VM.
No VidMm or VidSch: The virtual machine is not responsible for managing video memory or hardware scheduling.

This design drastically reduces the attack surface of the virtual machine, while also avoiding the overhead of running complex video memory management logic repeatedly in every VM.
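To make the asymmetry concrete, here is a minimal sketch that models the two component lists above as sets and computes what the guest lacks. The `Partition` class and the `host_only` helper are illustrative names of my own, not part of any Windows API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical model of one side of the GPU-P split."""
    name: str
    components: frozenset


# Component lists taken from the host and guest descriptions above.
HOST = Partition("host", frozenset({"KMD", "VidMm", "VidSch", "UMD"}))
GUEST = Partition("guest", frozenset({"UMD"}))


def host_only(host: Partition, guest: Partition) -> list:
    """Return the components that exist on the host but not in the guest."""
    return sorted(host.components - guest.components)


print(host_only(HOST, GUEST))  # ['KMD', 'VidMm', 'VidSch']
```

Everything in that difference is vendor kernel code or global resource management, which is exactly what the design keeps out of the VM.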

Virtual Render Device (VRD): The Ingenious "Impostor"

Since there is no KMD in the virtual machine, how does the Windows operating system know to load a display driver? This is the work of the VRD (Virtual Render Device).

The VRD is a "shadow driver" on the Guest side. Its main responsibilities are:

Deceiving the OS: Exposing a virtual graphics device (visible in Device Manager), making the VM's operating system think "I have hardware," thereby triggering the loading of Dxgkrnl.sys (the DirectX Graphics Kernel).
Guiding the Load: It acts as the trigger for loading the Guest-side UMD.

On the host side, each virtual machine corresponds to a VMWP.exe (VM Worker Process). Within this process runs a library called vrdumed.dll, which serves as the "back-end support" for the VRD, responsible for emulating this virtual device on the host side.

Communication Bridge: VMBus and Parameter Marshalling

For the UMD in the virtual machine to render an image, its commands must ultimately be passed to the host's hardware. How does this "dialogue" happen between them?

VMBus

VMBus, provided by Hyper-V, is the underlying communication mechanism for this architecture. It functions like a dedicated expressway, allowing data to travel rapidly between the virtual machine and the host.

Parameter Marshalling

When an application inside the VM calls a DirectX API, the following process occurs:

Interception: The Guest-side Dxgkrnl receives the UMD's call requests (Thunk calls).
Marshalling: Dxgkrnl packages the parameters and data packets of these calls into individual Messages.
Transmission: These messages are sent to the host via VMBus.
Execution: The host-side Dxgkrnl receives the messages, unpacks them, and passes them to the physical driver for execution.
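The four steps above can be sketched in a few lines. This is only an illustration of the marshalling idea: the header layout, the JSON payload, the `deque` standing in for a VMBus channel, and the thunk name `CreateAllocation` are all assumptions made for the example, not the real wire format used by Dxgkrnl.

```python
import json
import struct
from collections import deque

# Hypothetical header: message id and payload length, both 32-bit little-endian.
HEADER = struct.Struct("<II")


def marshal(msg_id: int, thunk: str, params: dict) -> bytes:
    """Guest side: pack one thunk call into a single byte message."""
    payload = json.dumps({"thunk": thunk, "params": params}).encode("utf-8")
    return HEADER.pack(msg_id, len(payload)) + payload


def unmarshal(message: bytes):
    """Host side: recover the message id, thunk name, and parameters."""
    msg_id, length = HEADER.unpack_from(message, 0)
    raw = message[HEADER.size:HEADER.size + length]
    body = json.loads(raw.decode("utf-8"))
    return msg_id, body["thunk"], body["params"]


def host_execute(thunk: str, params: dict) -> str:
    """Stand-in for handing the call to the physical driver."""
    return f"{thunk} executed with {params}"


channel = deque()  # stand-in for a VMBus channel

# Guest: intercept the thunk call, marshal it, transmit it.
channel.append(marshal(1, "CreateAllocation", {"size": 4096}))

# Host: receive the message, unpack it, execute it.
msg_id, thunk, params = unmarshal(channel.popleft())
result = host_execute(thunk, params)
print(msg_id, result)
```

The point of the length-prefixed header is that the host can read a message without knowing in advance which call it carries; the real protocol presumably uses compact binary structures rather than JSON.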

Optimization Mechanism: To prevent VMBus congestion, the Guest-side Dxgkrnl retains some "local objects" (such as handle mappings for Allocations and Devices), and communication across the boundary only occurs when hardware execution is truly required.
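A rough sketch of this idea, assuming a simple handle table: lookups are answered from guest-local state, and only a submission produces a message. The class `GuestDxgkrnlSketch` and its methods are hypothetical names, and the host handle is assumed to have been obtained earlier.

```python
class GuestDxgkrnlSketch:
    """Hypothetical guest-side object table; not the real Dxgkrnl interface."""

    def __init__(self, send):
        self._send = send      # callable standing in for a VMBus send
        self._handles = {}     # guest handle -> host handle (local objects)
        self._next_handle = 1

    def register(self, host_handle: int) -> int:
        """Record a mapping locally and hand back a guest-side handle."""
        guest_handle = self._next_handle
        self._next_handle += 1
        self._handles[guest_handle] = host_handle
        return guest_handle

    def lookup(self, guest_handle: int) -> int:
        """Resolved from the local table; no message crosses the boundary."""
        return self._handles[guest_handle]

    def submit(self, guest_handle: int, command: str) -> None:
        """Requires hardware execution, so this one does cross the boundary."""
        self._send({"handle": self._handles[guest_handle], "command": command})


sent = []  # collects every message that would travel over VMBus
guest = GuestDxgkrnlSketch(sent.append)

h = guest.register(0xABC)
guest.lookup(h)
guest.lookup(h)
print(len(sent))  # 0

guest.submit(h, "draw")
print(len(sent))  # 1
```

Two lookups cost zero messages; only the submission shows up in `sent`.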

Resource Ownership: Why Doesn't the Guest Have Video Memory Management?

A common question is: Why not let the virtual machine manage the video memory allocated to it?

The answer is: Physical video memory is a globally unified resource.
If each virtual machine believed it had its own independent video memory manager, they would "fight" over the same physical addresses. In GPU-P:

Host Coordination: The host's VidMm has a God's-eye view of the whole system. Based on each VM's configuration (like MinPartitionVRAM / MaxPartitionVRAM), it partitions the physical video memory among different VMs.
Guest Request: When a virtual machine needs memory, it must initiate a "loan" request to the Host via VMBus.
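The "loan" model can be illustrated with a toy allocator. The policy below is an assumption chosen for the example: each VM is guaranteed its minimum, is capped at its maximum, and is otherwise served from whatever is left. The real VidMm policy is not described here, and the class name, method names, and megabyte units are all hypothetical.

```python
class HostVidMmSketch:
    """Toy central allocator; an assumed policy, not the real VidMm."""

    def __init__(self, total_mb: int):
        self.total_mb = total_mb
        self.limits = {}    # vm -> (min_mb, max_mb)
        self.granted = {}   # vm -> MB currently on loan

    def add_vm(self, vm: str, min_mb: int, max_mb: int) -> None:
        """Admit a VM only if its minimum can still be guaranteed."""
        reserved = sum(lo for lo, _ in self.limits.values())
        if min_mb > max_mb or reserved + min_mb > self.total_mb:
            raise ValueError(f"cannot guarantee {min_mb} MB for {vm}")
        self.limits[vm] = (min_mb, max_mb)
        self.granted[vm] = 0

    def available_to(self, requester: str) -> int:
        """Memory the requester may still take without eating into the
        guaranteed minimum of any other VM."""
        committed = 0
        for vm, (lo, _) in self.limits.items():
            if vm == requester:
                committed += self.granted[vm]
            else:
                committed += max(self.granted[vm], lo)
        return self.total_mb - committed

    def request(self, vm: str, size_mb: int) -> bool:
        """Guest 'loan' request: granted only within the VM's cap and
        within what the host can spare."""
        _, hi = self.limits[vm]
        if self.granted[vm] + size_mb > hi:
            return False
        if size_mb > self.available_to(vm):
            return False
        self.granted[vm] += size_mb
        return True


host = HostVidMmSketch(total_mb=6144)
host.add_vm("vm-a", min_mb=1024, max_mb=4096)
host.add_vm("vm-b", min_mb=1024, max_mb=4096)

print(host.request("vm-a", 4096))  # True
print(host.request("vm-b", 4096))  # False
print(host.request("vm-b", 2048))  # True
```

Because a single object owns the bookkeeping, two VMs can never be handed the same memory, which is the property the host-side VidMm provides in GPU-P.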

This "centralized authority" architecture ensures that even when multiple VMs are running under high load, the system will not suffer a host-wide GPU reset (TDR, Timeout Detection and Recovery) or blue screen due to video memory conflicts.

Conclusion

The architecture of GPU-P showcases Microsoft's art of balancing performance and security in virtualization: through VRD deception, VMBus transmission, and highly centralized control on the Host side, efficient GPU resource sharing is achieved.

However, in such a shared environment, how do we ensure that a virtual machine cannot illegally access the host's memory? That is the topic we will discuss in our next part: IOMMU and Security Isolation.
