The idea is to build something like this:
A GPU virtualization layer that lets you run GPU apps locally while the code actually executes in the cloud, keeping your data local.
Functionality:

- vGPU is a virtualization layer for a GPU
- your local app "runs" on the local vGPU
- the local app decrypts the actual local data and sends the (CUDA) instructions to the remote GPU-Coordinator
- the GPU-Coordinator distributes the instructions to multiple real GPUs
- then it sends the results back to the vGPU, which passes them to the local app
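To make the flow concrete, here is a minimal sketch of what could cross the wire between the vGPU and the GPU-Coordinator. The message types, field names, and the serde/bincode framing are my own assumptions for illustration, not the project's actual protocol:

```rust
use serde::{Deserialize, Serialize};

/// One forwarded GPU call. Mirrors a tiny subset of the CUDA driver API;
/// the variants and fields are illustrative, not rvirt-gpu's real protocol.
#[derive(Serialize, Deserialize, Debug)]
enum GpuInstruction {
    /// Allocate `size` bytes of device memory; the reply carries a handle.
    MemAlloc { size: usize },
    /// Copy host data (already decrypted locally) into device memory.
    MemcpyHtoD { handle: u64, data: Vec<u8> },
    /// Launch a kernel by name with a grid/block configuration.
    LaunchKernel {
        name: String,
        grid: (u32, u32, u32),
        block: (u32, u32, u32),
        args: Vec<u64>, // handles of device buffers
    },
    /// Copy results back from device memory.
    MemcpyDtoH { handle: u64, len: usize },
}

/// What the coordinator sends back to the vGPU.
#[derive(Serialize, Deserialize, Debug)]
enum GpuReply {
    Handle(u64),
    Data(Vec<u8>),
    Done,
    Error(String),
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The vGPU would serialize each intercepted call and write it to a
    // TLS stream; plain bytes are shown here to keep the sketch short.
    let call = GpuInstruction::LaunchKernel {
        name: "vec_add".into(),
        grid: (256, 1, 1),
        block: (128, 1, 1),
        args: vec![1, 2, 3],
    };
    let frame: Vec<u8> = bincode::serialize(&call)?;
    println!("would send {} bytes over TLS", frame.len());
    Ok(())
}
```

Note that in this scheme only the instruction frames carry data, and only data the app has chosen to upload as device buffers.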
The advantage is that your private data never leaves your network in plain text. Only the actual GPU instructions (CUDA instructions) are sent over the wire, encrypted with TLS.
I know it will be slow, but in cases where the data flow is small compared to the processing time, it could be a reasonable compromise for the security it gives you.
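A quick back-of-envelope check of that tradeoff; the link speed, data size, and GPU throughput below are assumptions picked only to show when remote execution amortizes:

```rust
/// Rough break-even estimate: remote execution pays off when the time
/// spent computing dwarfs the time spent moving data. All numbers are
/// illustrative assumptions, not measurements.
fn main() {
    let bytes_transferred = 256.0 * 1024.0 * 1024.0; // 256 MiB of instructions/buffers
    let link_bps = 1e9 / 8.0;                        // 1 Gbit/s link, in bytes/s
    let flops_needed = 5e13;                         // 50 TFLOPs of work
    let gpu_flops = 1e13;                            // a 10 TFLOPS remote GPU

    let t_transfer = bytes_transferred / link_bps; // ~2.1 s
    let t_compute = flops_needed / gpu_flops;      // ~5.0 s

    // The remote round trip is worthwhile roughly when t_compute dominates.
    println!("transfer: {t_transfer:.1}s, compute: {t_compute:.1}s");
    println!("compute/transfer ratio: {:.1}", t_compute / t_transfer);
}
```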
Also, because the instructions are distributed across multiple GPUs when possible, in some cases it could even offer better performance than running locally.
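A minimal sketch of how the GPU-Coordinator might fan work out to several workers; the round-robin policy and thread-per-GPU structure are assumptions of mine, and the wiki's implementation ideas may differ:

```rust
use std::sync::mpsc;
use std::thread;

/// A unit of work received from a vGPU (an opaque instruction frame here).
struct Job(Vec<u8>);

fn main() {
    // One channel per real GPU worker; a real coordinator would hold a
    // network connection per GPU node instead of a local thread.
    let gpu_count: usize = 4;
    let mut senders = Vec::new();
    let mut handles = Vec::new();

    for gpu_id in 0..gpu_count {
        let (tx, rx) = mpsc::channel::<Job>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            for job in rx {
                // Stand-in for actually replaying the CUDA calls on GPU `gpu_id`.
                println!("gpu {gpu_id}: executing {} bytes", job.0.len());
            }
        }));
    }

    // Fan incoming instruction frames out across the GPUs, round-robin.
    for (i, payload) in (0..16).map(|i| (i, vec![0u8; 1024])) {
        senders[i % gpu_count].send(Job(payload)).unwrap();
    }

    drop(senders); // close the channels so the workers exit
    for h in handles {
        h.join().unwrap();
    }
}
```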
Schema: https://github.com/radumarias/rvirt-gpu/blob/main/website/resources/schema2.png
Implementation ideas: https://github.com/radumarias/rvirt-gpu/wiki/Implementation
Top comments (1)
I’ve fielded this question a few times from customers who need heavy GPU compute but can’t move sensitive data off‑premises. The key is to use GPU virtualization—technologies like NVIDIA vGPU or SR‑IOV to carve a physical GPU into multiple virtual devices that live in the cloud or a remote data center. You then establish a secure, high‑speed tunnel (VPN or dedicated link) so your application streams only the model weights or tensor data to the remote GPU, while all of your raw data remains on local storage. AceCloud’s GPUaaS supports this pattern: we provide the virtual GPU endpoint and secure networking, you keep your data vault‑locked on your side, and all the heavy lifting happens off‑site without compromising compliance or performance.
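As a footnote to the pattern the commenter describes, a minimal sketch of the client side: raw records are read and preprocessed locally, and only the derived tensors are written to the remote endpoint. The file name, host, and normalization step are all hypothetical, and a real deployment would run this over the VPN/TLS tunnel mentioned above rather than plain TCP:

```rust
use std::io::Write;
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // Sensitive raw records stay on local disk; only derived tensors leave.
    let raw = std::fs::read("records.bin")?; // hypothetical local data file
    let tensors: Vec<f32> = raw.iter().map(|&b| b as f32 / 255.0).collect();

    // Hypothetical remote vGPU endpoint, reached through the secure tunnel.
    let mut remote = TcpStream::connect("vgpu.example.internal:9000")?;
    for chunk in tensors.chunks(1024) {
        // Ship only the derived tensor data, never `raw` itself.
        let bytes: Vec<u8> = chunk.iter().flat_map(|f| f.to_le_bytes()).collect();
        remote.write_all(&bytes)?;
    }
    Ok(())
}
```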