The Hidden Pitfalls of ONNXRuntime GPU Setup


In developing DeskPai, our all-in-one media toolkit, we integrated onnxruntime-gpu to accelerate AI inference via CUDA. Everything seemed to be set up correctly—our Python call to ort.get_available_providers() listed GPU support.

But we quickly learned the hard way: that doesn't mean it's actually working.

This post explains what went wrong, how to truly verify your ONNXRuntime GPU installation, and why upgrading cuDNN (not downgrading ORT or CUDA) turned out to be the cleanest fix.


Environment Setup

Here’s our actual setup at the time of debugging:

$ nvidia-smi
...
Driver Version: 550.144.03     CUDA Version: 12.4
GPU: NVIDIA GeForce RTX 4090   Memory: 24GB
OS: Ubuntu 20.04
$ nvcc --version
Cuda compilation tools, release 12.4, V12.4.99
$ python -c "import onnxruntime as ort; print(ort.__version__)"
1.20.1

The Trap: get_available_providers() Is Not Enough

Like most developers, we started by checking the available ONNXRuntime providers:

import onnxruntime as ort
print(ort.get_available_providers())

And got:

['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

Looks good, right?

Then we tried running a model... and hit:

libcudnn_adv.so.9: cannot open shared object file: No such file or directory
Failed to create CUDAExecutionProvider.

This was the moment we realized: ONNXRuntime can detect CUDA, but still fail at runtime.
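
One way to surface such failures early is to raise ONNXRuntime's log verbosity before creating a session, so provider-initialization errors are printed in full instead of being buried. Here is a minimal sketch using the standard SessionOptions API ("model.onnx" is a placeholder path):

import onnxruntime as ort

# 0 = VERBOSE; provider-initialization problems (like a missing libcudnn)
# are then logged in detail rather than failing quietly.
so = ort.SessionOptions()
so.log_severity_level = 0

session = ort.InferenceSession(
    "model.onnx", sess_options=so, providers=["CUDAExecutionProvider"]
)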


Root Cause: cuDNN Mismatch

We confirmed CUDA was installed. But a system-wide check showed we only had cuDNN 8:

$ find /usr -name "libcudnn*.so*"
# only libcudnn.so.8 found

ONNXRuntime 1.20, however, requires cuDNN 9.

This exposed a common misconception:

Installing the CUDA Toolkit does not install cuDNN.

cuDNN is a separate SDK with its own versioning and compatibility rules.
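
A quick way to see which cuDNN the dynamic linker actually resolves from Python is to load it with ctypes. This is a sketch, not part of our original debugging session; the version encoding shown in the comment is how cuDNN 9 reports itself:

import ctypes

# Raises OSError if no cuDNN 9 runtime is on the linker path.
cudnn = ctypes.CDLL("libcudnn.so.9")

# cuDNN 9 encodes major*10000 + minor*100 + patch, e.g. 91002 for 9.10.2.
cudnn.cudnnGetVersion.restype = ctypes.c_size_t
print(cudnn.cudnnGetVersion())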


How to Properly Verify ONNXRuntime GPU Support

We built a minimal test around a one-node ONNX model to verify actual runtime GPU support.

Step 1: Create minimal.onnx

import onnx
from onnx import helper, TensorProto

node = helper.make_node("Identity", ["input"], ["output"])
graph = helper.make_graph(
    [node],
    "MinimalGraph",
    [helper.make_tensor_value_info("input", TensorProto.FLOAT, [1])],
    [helper.make_tensor_value_info("output", TensorProto.FLOAT, [1])]
)
model = helper.make_model(graph)
onnx.save(model, "minimal.onnx")
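
Optionally, validate the file before testing it; this small sketch uses the standard onnx.checker API:

import onnx

# Structural validation: raises if the model we just wrote is malformed.
model = onnx.load("minimal.onnx")
onnx.checker.check_model(model)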

Step 2: Load with CUDA

import onnxruntime as ort
session = ort.InferenceSession("minimal.onnx", providers=["CUDAExecutionProvider"])
print(session.get_providers())

If this crashes or falls back to CPU, your GPU backend isn’t functional.
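
To make the check fail loudly rather than silently, assert the active provider and run one real inference. A sketch (the assertion message is ours, not ONNXRuntime's):

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("minimal.onnx", providers=["CUDAExecutionProvider"])

# If CUDA initialization failed, ORT can fall back to CPU; fail loudly instead.
assert "CUDAExecutionProvider" in session.get_providers(), "CUDA backend not active"

# One real inference forces the CUDA/cuDNN libraries to actually load and run.
print(session.run(None, {"input": np.array([1.0], dtype=np.float32)}))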


Why We Upgraded cuDNN Instead of Downgrading ONNXRuntime

At first, we considered downgrading ONNXRuntime to 1.18 to match cuDNN 8.

But ONNXRuntime 1.18 unexpectedly threw cuBLASLt version errors, since it expected an older version than what was bundled with CUDA 12.4.

Fixing that would require downgrading the entire CUDA toolkit, which is invasive and risky for a stable dev environment.

So instead, we upgraded to cuDNN 9.10.2, which is compatible with ONNXRuntime 1.20.1 and our current CUDA 12.4 stack.

This was cleaner, safer, and more future-proof (especially for TensorRT 9 compatibility).


Installing cuDNN 9 via .deb (APT Local Repository)

We followed NVIDIA's local repo install method:

Step 1: Download the cuDNN 9.10.2 .deb file

From the NVIDIA cuDNN archive.

Step 2: Add the local repo

sudo dpkg -i cudnn-local-repo-ubuntu2004-9.10.2_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2004-9.10.2/*.gpg /usr/share/keyrings/
sudo apt-key add /usr/share/keyrings/cudnn-*-keyring.gpg
sudo apt update

Step 3: Install cuDNN 9

sudo apt install -y \
  libcudnn9-cuda-12=9.10.2.21-1 \
  libcudnn9-dev-cuda-12=9.10.2.21-1 \
  libcudnn9-headers-cuda-12=9.10.2.21-1

⚠️ Warning: This will uninstall libcudnn8-dev. cuDNN 8 and 9 dev headers cannot coexist.


Can cuDNN 8 and 9 Coexist?

Component                 Coexistence Allowed?
Runtime libraries (.so)   ✅ Yes
Development headers       ❌ No
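
As a quick sanity check of the first row (a sketch, assuming both runtime packages are still installed), the two runtime sonames can be loaded side by side in a single process:

import ctypes

# Distinct sonames, opened with RTLD_LOCAL by default, so symbols stay separate.
ctypes.CDLL("libcudnn.so.8")
ctypes.CDLL("libcudnn.so.9")
print("both cuDNN runtimes resolved")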

If you need both for development (e.g., TensorRT 8 and ONNXRuntime), isolate them with Docker containers.


Register cuDNN 9 with the System

Make sure the dynamic linker recognizes the new libraries:

sudo ldconfig
ldconfig -p | grep libcudnn.so.9

If not found:

echo "/usr/lib/x86_64-linux-gnu" | sudo tee /etc/ld.so.conf.d/cudnn9.conf
sudo ldconfig

Summary of What We Learned

Action                        Why It Matters
Built minimal ONNX test       Validates runtime GPU inference, not just detection
Verified cuDNN compatibility  Crucial for ONNXRuntime >= 1.19
Avoided ORT downgrade         Prevented cuBLASLt and CUDA version conflicts
Upgraded cuDNN to 9.10.2      Resolved runtime failures, future-proofed stack
Used .deb + ldconfig method   Clean install and reliable shared object loading

Final Advice

✅ Don’t stop at get_available_providers().
🧪 Always run an actual inference on GPU to validate your setup.

This approach saved us time and frustration—and made our DeskPai deployment stable across environments.
