In developing DeskPai, our all-in-one media toolkit, we integrated onnxruntime-gpu to accelerate AI inference via CUDA. Everything seemed to be set up correctly: our Python call to ort.get_available_providers() listed GPU support.
But we quickly learned the hard way that a listed provider doesn't mean the GPU backend is actually working.
This post explains what went wrong, how to truly verify your ONNXRuntime GPU installation, and why upgrading cuDNN (not downgrading ORT or CUDA) turned out to be the cleanest fix.
Environment Setup
Here’s our actual setup at the time of debugging:
$ nvidia-smi
...
Driver Version: 550.144.03 CUDA Version: 12.4
GPU: NVIDIA GeForce RTX 4090 Memory: 24GB
OS: Ubuntu 20.04
$ nvcc --version
Cuda compilation tools, release 12.4, V12.4.99
$ python -c "import onnxruntime as ort; print(ort.__version__)"
1.20.1
The Trap: get_available_providers() Is Not Enough
Like most developers, we started by checking the available ONNXRuntime providers:
import onnxruntime as ort
print(ort.get_available_providers())
And got:
['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
Looks good, right?
Then we tried running a model... and hit:
libcudnn_adv.so.9: cannot open shared object file: No such file or directory
Failed to create CUDAExecutionProvider.
This was the moment we realized: ONNXRuntime can detect CUDA, but still fail at runtime.
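Looking back, the mismatch is easy to surface earlier: get_available_providers() only reports what the installed onnxruntime-gpu package was built with, while the CUDA/cuDNN shared libraries are loaded only when a session is created. Here is a minimal sketch of how the failure can be made explicit (the model path is a placeholder; set_default_logger_severity(0) enables verbose logging):

```python
import onnxruntime as ort

# get_available_providers() is static package metadata; the CUDA/cuDNN
# libraries are only loaded when a session is actually created.
ort.set_default_logger_severity(0)  # 0 = verbose, shows provider bring-up details

# "model.onnx" is a placeholder for whatever model you are loading.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# If cuDNN failed to load, ORT falls back and only the CPU provider remains.
print(session.get_providers())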
Root Cause: cuDNN Mismatch
We confirmed CUDA was installed. But a system-wide check showed we only had cuDNN 8:
$ find /usr -name "libcudnn*.so*"
# only libcudnn.so.8 found
But ONNXRuntime 1.20 requires cuDNN 9.
This exposed a common misconception:
Installing the CUDA Toolkit does not install cuDNN.
cuDNN is a separate SDK with its own versioning and compatibility rules.
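Because cuDNN ships separately, it is worth checking which cuDNN runtime the dynamic loader can actually see, independent of ONNXRuntime. Below is a small sketch using ctypes and cuDNN's cudnnGetVersion() entry point (it assumes the library is on the default loader path):

```python
import ctypes

# Probe for a cuDNN runtime on the loader path, newest major version first.
for soname in ("libcudnn.so.9", "libcudnn.so.8"):
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        continue
    # cudnnGetVersion() returns the version as a single integer
    # (major/minor/patch packed together: 8xxx for cuDNN 8, 9xxxx for cuDNN 9).
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    print(f"{soname}: cudnnGetVersion() = {lib.cudnnGetVersion()}")
    break
else:
    print("No cuDNN runtime found by the dynamic loader")
```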
How to Properly Verify ONNXRuntime GPU Support
We built a minimal test using a 1-node ONNX model to verify actual runtime GPU support.
Step 1: Create minimal.onnx
import onnx
from onnx import helper, TensorProto

# A single Identity node: one float32 element in, the same element out.
node = helper.make_node("Identity", ["input"], ["output"])
graph = helper.make_graph(
    [node],
    "MinimalGraph",
    [helper.make_tensor_value_info("input", TensorProto.FLOAT, [1])],
    [helper.make_tensor_value_info("output", TensorProto.FLOAT, [1])],
)
model = helper.make_model(graph)
onnx.save(model, "minimal.onnx")
Step 2: Load with CUDA
import onnxruntime as ort
session = ort.InferenceSession("minimal.onnx", providers=["CUDAExecutionProvider"])
print(session.get_providers())
If this crashes or falls back to CPU, your GPU backend isn’t functional.
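To go one step further than just loading the session, we can run an actual inference pass and assert that the CUDA provider, not a CPU fallback, is serving it. A sketch against the minimal.onnx model from Step 1 (CPUExecutionProvider is listed only so a broken GPU backend shows up as a clean assertion instead of a crash):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "minimal.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# get_providers() reports what was actually created for this session;
# if cuDNN failed to load, CUDAExecutionProvider will be missing here.
active = session.get_providers()
assert active[0] == "CUDAExecutionProvider", f"GPU backend not active, got: {active}"

# Run a real inference pass so the CUDA execution path is exercised end to end.
output = session.run(None, {"input": np.array([1.0], dtype=np.float32)})
print("GPU inference OK:", output[0])
```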
Why We Upgraded cuDNN Instead of Downgrading ONNXRuntime
At first, we thought of downgrading ONNXRuntime to 1.18 to match cuDNN 8.
But ONNXRuntime 1.18 unexpectedly threw cuBLASLt version errors, since it expected an older version than what was bundled with CUDA 12.4.
Fixing that would require downgrading the entire CUDA toolkit, which is invasive and risky for a stable dev environment.
So instead, we upgraded to cuDNN 9.10.2, which is compatible with ONNXRuntime 1.20.1 and our current CUDA 12.4 stack.
This was cleaner, safer, and future-proof (especially for TensorRT 9 compatibility).
Installing cuDNN 9 via .deb (APT Local Repository)
We followed NVIDIA's local repo install method:
Step 1: Download the cuDNN 9.10.2 .deb file
From the NVIDIA cuDNN archive.
Step 2: Add the local repo
sudo dpkg -i cudnn-local-repo-ubuntu2004-9.10.2_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2004-9.10.2/*.gpg /usr/share/keyrings/
sudo apt-key add /usr/share/keyrings/cudnn-*-keyring.gpg
sudo apt update
Step 3: Install cuDNN 9
sudo apt install -y \
  libcudnn9-cuda-12=9.10.2.21-1 \
  libcudnn9-dev-cuda-12=9.10.2.21-1 \
  libcudnn9-headers-cuda-12=9.10.2.21-1
⚠️ Warning: This will uninstall libcudnn8-dev. cuDNN 8 and 9 dev headers cannot coexist.
Can cuDNN 8 and 9 Coexist?
| Component | Coexistence Allowed? |
| --- | --- |
| Runtime libraries (.so) | ✅ Yes |
| Development headers | ❌ No |
If you need both for development (e.g., TensorRT 8 and ONNXRuntime), isolate with Docker containers.
Register cuDNN 9 with the System
Make sure the dynamic linker recognizes the new libraries:
sudo ldconfig
ldconfig -p | grep libcudnn.so.9
If not found:
echo "/usr/lib/x86_64-linux-gnu" | sudo tee /etc/ld.so.conf.d/cudnn9.conf
sudo ldconfig
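As a final sanity check, you can confirm from Python that the exact shared objects ONNXRuntime complained about are now resolvable by the loader (a small sketch; the library names are taken from the error message above):

```python
import ctypes

# libcudnn_adv.so.9 is the library named in the original runtime error.
for soname in ("libcudnn.so.9", "libcudnn_adv.so.9"):
    try:
        ctypes.CDLL(soname)
        print(f"{soname}: loads OK")
    except OSError as exc:
        print(f"{soname}: still not found ({exc})")
```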
Summary of What We Learned
| Action | Why It Matters |
| --- | --- |
| Built a minimal ONNX test | Validates runtime GPU inference, not just detection |
| Verified cuDNN compatibility | Crucial for ONNXRuntime >= 1.19 |
| Avoided an ORT downgrade | Prevented cuBLASLt and CUDA version conflicts |
| Upgraded cuDNN to 9.10.2 | Resolved runtime failures, future-proofed the stack |
| Used the .deb + ldconfig method | Clean install and reliable shared object loading |
Final Advice
✅ Don't stop at get_available_providers().
🧪 Always run an actual inference on GPU to validate your setup.
This approach saved us time and frustration—and made our DeskPai deployment stable across environments.