<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Desk Pai</title>
    <description>The latest articles on DEV Community by Desk Pai (@deskpai).</description>
    <link>https://dev.to/deskpai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3283349%2F7dfe8722-2fcd-47c9-847c-f508dc94a5eb.png</url>
      <title>DEV Community: Desk Pai</title>
      <link>https://dev.to/deskpai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/deskpai"/>
    <language>en</language>
    <item>
      <title>The Hidden Pitfalls of ONNXRuntime GPU Setup</title>
      <dc:creator>Desk Pai</dc:creator>
      <pubDate>Sat, 21 Jun 2025 21:39:42 +0000</pubDate>
      <link>https://dev.to/deskpai/the-hidden-pitfalls-of-onnxruntime-gpu-setup-4kb7</link>
      <guid>https://dev.to/deskpai/the-hidden-pitfalls-of-onnxruntime-gpu-setup-4kb7</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi9uehjflep55maywgrtl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi9uehjflep55maywgrtl.png" alt="Image description" width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In developing &lt;strong&gt;DeskPai&lt;/strong&gt;, our all-in-one media toolkit, we integrated &lt;code&gt;onnxruntime-gpu&lt;/code&gt; to accelerate AI inference via CUDA. Everything &lt;em&gt;seemed&lt;/em&gt; to be set up correctly—our Python call to &lt;code&gt;ort.get_available_providers()&lt;/code&gt; listed GPU support.&lt;/p&gt;

&lt;p&gt;But we quickly learned the hard way: &lt;strong&gt;that doesn't mean it's actually working&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This post explains what went wrong, how to &lt;em&gt;truly&lt;/em&gt; verify your ONNXRuntime GPU installation, and why upgrading cuDNN (not downgrading ORT or CUDA) turned out to be the cleanest fix.&lt;/p&gt;




&lt;h2&gt;Environment Setup&lt;/h2&gt;

&lt;p&gt;Here’s our actual setup at the time of debugging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nvidia-smi
...
Driver Version: 550.144.03     CUDA Version: 12.4
GPU: NVIDIA GeForce RTX 4090   Memory: 24GB
OS: Ubuntu 20.04
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nvcc &lt;span class="nt"&gt;--version&lt;/span&gt;
Cuda compilation tools, release 12.4, V12.4.99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import onnxruntime as ort; print(ort.__version__)"&lt;/span&gt;
1.20.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
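

&lt;p&gt;One more build-level check worth knowing: &lt;code&gt;ort.get_device()&lt;/code&gt; reports whether the installed package was &lt;em&gt;built&lt;/em&gt; with GPU support. Like the provider list below, it says nothing about whether CUDA and cuDNN will actually load at runtime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import onnxruntime as ort

# "GPU" only means the wheel was built with CUDA support;
# it does not prove the CUDA/cuDNN libraries can be loaded
print(ort.get_device())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;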






&lt;h2&gt;The Trap: &lt;code&gt;get_available_providers()&lt;/code&gt; Is Not Enough&lt;/h2&gt;

&lt;p&gt;Like most developers, we started by checking the available ONNXRuntime providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnxruntime&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_available_providers&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And got:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TensorrtExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CUDAExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CPUExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks good, right?&lt;/p&gt;

&lt;p&gt;Then we tried running a model... and hit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;libcudnn_adv.so.9: cannot open shared object file: No such file or directory
Failed to create CUDAExecutionProvider.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was the moment we realized: &lt;strong&gt;ONNXRuntime can &lt;em&gt;detect&lt;/em&gt; CUDA, but still fail at runtime.&lt;/strong&gt;&lt;/p&gt;
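
&lt;p&gt;If you hit this, turning on verbose logging makes the failure visible instead of letting ONNXRuntime fall back to CPU quietly. A minimal sketch (the model path is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import onnxruntime as ort

# 0 = VERBOSE; the default (2 = WARNING) can bury the EP load error
ort.set_default_logger_severity(0)

session = ort.InferenceSession("your_model.onnx",
                               providers=["CUDAExecutionProvider"])
# A CPU-only list here means the CUDA EP failed to initialize
print(session.get_providers())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;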




&lt;h2&gt;Root Cause: cuDNN Mismatch&lt;/h2&gt;

&lt;p&gt;We confirmed CUDA was installed. But a system-wide check showed we only had cuDNN 8:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;find /usr &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"libcudnn*.so*"&lt;/span&gt;
&lt;span class="c"&gt;# only libcudnn.so.8 found&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But ONNXRuntime 1.20 requires &lt;strong&gt;cuDNN 9&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This exposed a common misconception:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Installing the CUDA Toolkit does not install cuDNN.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;cuDNN is a separate SDK with its own versioning and compatibility rules.&lt;/p&gt;
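
&lt;p&gt;A quick way to see which cuDNN your process can actually load is to probe the shared libraries directly. A minimal sketch using &lt;code&gt;ctypes&lt;/code&gt; (the sonames assume a standard Linux install):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ctypes

# cudnnGetVersion() returns an integer encoding major/minor/patch
for soname in ("libcudnn.so.9", "libcudnn.so.8"):
    try:
        lib = ctypes.CDLL(soname)
        print(soname, "loads, version", lib.cudnnGetVersion())
    except OSError:
        print(soname, "not found")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;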




&lt;h2&gt;How to Properly Verify ONNXRuntime GPU Support&lt;/h2&gt;

&lt;p&gt;We built a &lt;strong&gt;minimal test&lt;/strong&gt; using a 1-node ONNX model to verify actual runtime GPU support.&lt;/p&gt;

&lt;h3&gt;Step 1: Create &lt;code&gt;minimal.onnx&lt;/code&gt;&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnx&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;onnx&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;

&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Identity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MinimalGraph&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_tensor_value_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_tensor_value_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;onnx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal.onnx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 2: Load with CUDA&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnxruntime&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;InferenceSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal.onnx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;providers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUDAExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_providers&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this crashes or falls back to CPU, your GPU backend isn’t functional.&lt;/p&gt;
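
&lt;p&gt;To go one step further, run an actual inference and assert that the CUDA provider really took priority. A sketch of the full check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("minimal.onnx",
                               providers=["CUDAExecutionProvider"])

# get_providers() lists providers in priority order;
# anything other than CUDA first means a silent CPU fallback
assert session.get_providers()[0] == "CUDAExecutionProvider", "fell back to CPU"

output = session.run(None, {"input": np.zeros(1, dtype=np.float32)})
print("GPU inference OK:", output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;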




&lt;h2&gt;Why We Upgraded cuDNN Instead of Downgrading ONNXRuntime&lt;/h2&gt;

&lt;p&gt;At first, we considered downgrading ONNXRuntime to 1.18 to match cuDNN 8.&lt;/p&gt;

&lt;p&gt;But ONNXRuntime 1.18 unexpectedly threw &lt;strong&gt;cuBLASLt version errors&lt;/strong&gt;: it expected an older cuBLASLt than the one bundled with CUDA 12.4.&lt;/p&gt;

&lt;p&gt;Fixing that would require downgrading the &lt;strong&gt;entire CUDA toolkit&lt;/strong&gt;, which is invasive and risky for a stable dev environment.&lt;/p&gt;

&lt;p&gt;So instead, we upgraded to &lt;strong&gt;cuDNN 9.10.2&lt;/strong&gt;, which is compatible with ONNXRuntime 1.20.1 and our current CUDA 12.4 stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This was cleaner, safer, and more future-proof&lt;/strong&gt; (especially for TensorRT 9 compatibility).&lt;/p&gt;




&lt;h2&gt;Installing cuDNN 9 via &lt;code&gt;.deb&lt;/code&gt; (APT Local Repository)&lt;/h2&gt;

&lt;p&gt;We followed NVIDIA's local repo install method:&lt;/p&gt;

&lt;h3&gt;Step 1: Download the cuDNN 9.10.2 &lt;code&gt;.deb&lt;/code&gt; file&lt;/h3&gt;

&lt;p&gt;From &lt;a href="https://developer.nvidia.com/rdp/cudnn-archive#a-collapse9102" rel="noopener noreferrer"&gt;NVIDIA cuDNN archive&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Step 2: Add the local repo&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dpkg &lt;span class="nt"&gt;-i&lt;/span&gt; cudnn-local-repo-ubuntu2004-9.10.2_1.0-1_amd64.deb
&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /var/cudnn-local-repo-ubuntu2004-9.10.2/&lt;span class="k"&gt;*&lt;/span&gt;.gpg /usr/share/keyrings/
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-key add /usr/share/keyrings/cudnn-&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="nt"&gt;-keyring&lt;/span&gt;.gpg
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 3: Install cuDNN 9&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1 &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-dev-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1 &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-headers-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Warning&lt;/strong&gt;: This will uninstall &lt;code&gt;libcudnn8-dev&lt;/code&gt;. cuDNN 8 and 9 dev headers cannot coexist.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;Can cuDNN 8 and 9 Coexist?&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Coexistence Allowed?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime libraries (&lt;code&gt;.so&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development headers&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need both for development (e.g., TensorRT 8 and ONNXRuntime), isolate them with &lt;strong&gt;Docker containers&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;Register cuDNN 9 with the System&lt;/h2&gt;

&lt;p&gt;Make sure the dynamic linker recognizes the new libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ldconfig
ldconfig &lt;span class="nt"&gt;-p&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;libcudnn.so.9
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If not found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"/usr/lib/x86_64-linux-gnu"&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/ld.so.conf.d/cudnn9.conf
&lt;span class="nb"&gt;sudo &lt;/span&gt;ldconfig
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
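

&lt;p&gt;After re-running &lt;code&gt;ldconfig&lt;/code&gt;, you can confirm from Python that the dynamic linker now resolves cuDNN 9. A minimal sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ctypes

# Raises OSError if the linker still cannot find the library
ctypes.CDLL("libcudnn.so.9")
print("libcudnn.so.9 resolved by the dynamic linker")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;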






&lt;h2&gt;Summary of What We Learned&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Built minimal ONNX test&lt;/td&gt;
&lt;td&gt;Validates runtime GPU inference, not just detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verified cuDNN compatibility&lt;/td&gt;
&lt;td&gt;Crucial for ONNXRuntime &amp;gt;= 1.19&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avoided ORT downgrade&lt;/td&gt;
&lt;td&gt;Prevented cuBLASLt and CUDA version conflicts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Upgraded cuDNN to 9.10.2&lt;/td&gt;
&lt;td&gt;Resolved runtime failures, future-proofed stack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Used &lt;code&gt;.deb&lt;/code&gt; + &lt;code&gt;ldconfig&lt;/code&gt; method&lt;/td&gt;
&lt;td&gt;Clean install and reliable shared object loading&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;Final Advice&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ Don’t stop at &lt;code&gt;get_available_providers()&lt;/code&gt;.&lt;br&gt;
🧪 Always run an actual inference on GPU to validate your setup.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This approach saved us time and frustration—and made our DeskPai deployment stable across environments.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
