<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Desk Pai</title>
    <description>The latest articles on DEV Community by Desk Pai (@deskpai).</description>
    <link>https://dev.to/deskpai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3283349%2F7dfe8722-2fcd-47c9-847c-f508dc94a5eb.png</url>
      <title>DEV Community: Desk Pai</title>
      <link>https://dev.to/deskpai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/deskpai"/>
    <language>en</language>
    <item>
      <title>The Hidden Pitfalls of ONNXRuntime GPU Setup</title>
      <dc:creator>Desk Pai</dc:creator>
      <pubDate>Sat, 21 Jun 2025 21:39:42 +0000</pubDate>
      <link>https://dev.to/deskpai/the-hidden-pitfalls-of-onnxruntime-gpu-setup-4kb7</link>
      <guid>https://dev.to/deskpai/the-hidden-pitfalls-of-onnxruntime-gpu-setup-4kb7</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi9uehjflep55maywgrtl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi9uehjflep55maywgrtl.png" alt="Image description" width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In developing &lt;strong&gt;DeskPai&lt;/strong&gt;, our all-in-one media toolkit, we integrated &lt;code&gt;onnxruntime-gpu&lt;/code&gt; to accelerate AI inference via CUDA. Everything &lt;em&gt;seemed&lt;/em&gt; to be set up correctly—our Python call to &lt;code&gt;ort.get_available_providers()&lt;/code&gt; listed GPU support.&lt;/p&gt;

&lt;p&gt;But we quickly learned the hard way: &lt;strong&gt;that doesn't mean it's actually working&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This post explains what went wrong, how to &lt;em&gt;truly&lt;/em&gt; verify your ONNXRuntime GPU installation, and why upgrading cuDNN (not downgrading ORT or CUDA) turned out to be the cleanest fix.&lt;/p&gt;




&lt;h2&gt;Environment Setup&lt;/h2&gt;

&lt;p&gt;Here’s our actual setup at the time of debugging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nvidia-smi
...
Driver Version: 550.144.03     CUDA Version: 12.4
GPU: NVIDIA GeForce RTX 4090   Memory: 24GB
OS: Ubuntu 20.04
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nvcc &lt;span class="nt"&gt;--version&lt;/span&gt;
Cuda compilation tools, release 12.4, V12.4.99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import onnxruntime as ort; print(ort.__version__)"&lt;/span&gt;
1.20.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
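

&lt;p&gt;One more build-level check worth knowing: &lt;code&gt;ort.get_device()&lt;/code&gt; reports whether the installed package was &lt;em&gt;built&lt;/em&gt; with GPU support. Like the provider list below, it says nothing about whether CUDA and cuDNN will actually load at runtime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import onnxruntime as ort

# "GPU" only means the wheel was built with CUDA support;
# it does not prove the CUDA/cuDNN libraries can be loaded
print(ort.get_device())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;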






&lt;h2&gt;The Trap: &lt;code&gt;get_available_providers()&lt;/code&gt; Is Not Enough&lt;/h2&gt;

&lt;p&gt;Like most developers, we started by checking the available ONNXRuntime providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnxruntime&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_available_providers&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And got:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TensorrtExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CUDAExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CPUExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks good, right?&lt;/p&gt;

&lt;p&gt;Then we tried running a model... and hit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;libcudnn_adv.so.9: cannot open shared object file: No such file or directory
Failed to create CUDAExecutionProvider.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was the moment we realized: &lt;strong&gt;ONNXRuntime can &lt;em&gt;detect&lt;/em&gt; CUDA, but still fail at runtime.&lt;/strong&gt;&lt;/p&gt;
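
&lt;p&gt;If you hit this, turning on verbose logging makes the failure visible instead of letting ONNXRuntime fall back to CPU quietly. A minimal sketch (the model path is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import onnxruntime as ort

# 0 = VERBOSE; the default (2 = WARNING) can bury the EP load error
ort.set_default_logger_severity(0)

session = ort.InferenceSession("your_model.onnx",
                               providers=["CUDAExecutionProvider"])
# A CPU-only list here means the CUDA EP failed to initialize
print(session.get_providers())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;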




&lt;h2&gt;Root Cause: cuDNN Mismatch&lt;/h2&gt;

&lt;p&gt;We confirmed CUDA was installed. But a system-wide check showed we only had cuDNN 8:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;find /usr &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"libcudnn*.so*"&lt;/span&gt;
&lt;span class="c"&gt;# only libcudnn.so.8 found&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But ONNXRuntime 1.20 requires &lt;strong&gt;cuDNN 9&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This exposed a common misconception:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Installing the CUDA Toolkit does not install cuDNN.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;cuDNN is a separate SDK with its own versioning and compatibility rules.&lt;/p&gt;
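
&lt;p&gt;A quick way to see which cuDNN your process can actually load is to probe the shared libraries directly. A minimal sketch using &lt;code&gt;ctypes&lt;/code&gt; (the sonames assume a standard Linux install):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ctypes

# cudnnGetVersion() returns an integer encoding major/minor/patch
for soname in ("libcudnn.so.9", "libcudnn.so.8"):
    try:
        lib = ctypes.CDLL(soname)
        print(soname, "loads, version", lib.cudnnGetVersion())
    except OSError:
        print(soname, "not found")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;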




&lt;h2&gt;How to Properly Verify ONNXRuntime GPU Support&lt;/h2&gt;

&lt;p&gt;We built a &lt;strong&gt;minimal test&lt;/strong&gt; using a 1-node ONNX model to verify actual runtime GPU support.&lt;/p&gt;

&lt;h3&gt;Step 1: Create &lt;code&gt;minimal.onnx&lt;/code&gt;&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnx&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;onnx&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;

&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Identity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MinimalGraph&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_tensor_value_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_tensor_value_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TensorProto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;onnx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal.onnx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 2: Load with CUDA&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;onnxruntime&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;InferenceSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal.onnx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;providers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUDAExecutionProvider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_providers&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this crashes or falls back to CPU, your GPU backend isn’t functional.&lt;/p&gt;
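
&lt;p&gt;To go one step further, run an actual inference and assert that the CUDA provider really took priority. A sketch of the full check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("minimal.onnx",
                               providers=["CUDAExecutionProvider"])

# get_providers() lists providers in priority order;
# anything other than CUDA first means a silent CPU fallback
assert session.get_providers()[0] == "CUDAExecutionProvider", "fell back to CPU"

output = session.run(None, {"input": np.zeros(1, dtype=np.float32)})
print("GPU inference OK:", output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;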




&lt;h2&gt;Why We Upgraded cuDNN Instead of Downgrading ONNXRuntime&lt;/h2&gt;

&lt;p&gt;At first, we considered downgrading ONNXRuntime to 1.18 to match cuDNN 8.&lt;/p&gt;

&lt;p&gt;But ONNXRuntime 1.18 unexpectedly threw &lt;strong&gt;cuBLASLt version errors&lt;/strong&gt;: it expected an older cuBLASLt than the one bundled with CUDA 12.4.&lt;/p&gt;

&lt;p&gt;Fixing that would require downgrading the &lt;strong&gt;entire CUDA toolkit&lt;/strong&gt;, which is invasive and risky for a stable dev environment.&lt;/p&gt;

&lt;p&gt;So instead, we upgraded to &lt;strong&gt;cuDNN 9.10.2&lt;/strong&gt;, which is compatible with ONNXRuntime 1.20.1 and our current CUDA 12.4 stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This was cleaner, safer, and more future-proof&lt;/strong&gt; (especially for TensorRT 9 compatibility).&lt;/p&gt;




&lt;h2&gt;Installing cuDNN 9 via &lt;code&gt;.deb&lt;/code&gt; (APT Local Repository)&lt;/h2&gt;

&lt;p&gt;We followed NVIDIA's local repo install method:&lt;/p&gt;

&lt;h3&gt;Step 1: Download the cuDNN 9.10.2 &lt;code&gt;.deb&lt;/code&gt; file&lt;/h3&gt;

&lt;p&gt;From &lt;a href="https://developer.nvidia.com/rdp/cudnn-archive#a-collapse9102" rel="noopener noreferrer"&gt;NVIDIA cuDNN archive&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Step 2: Add the local repo&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dpkg &lt;span class="nt"&gt;-i&lt;/span&gt; cudnn-local-repo-ubuntu2004-9.10.2_1.0-1_amd64.deb
&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /var/cudnn-local-repo-ubuntu2004-9.10.2/&lt;span class="k"&gt;*&lt;/span&gt;.gpg /usr/share/keyrings/
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-key add /usr/share/keyrings/cudnn-&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="nt"&gt;-keyring&lt;/span&gt;.gpg
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 3: Install cuDNN 9&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1 &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-dev-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1 &lt;span class="se"&gt;\&lt;/span&gt;
  libcudnn9-headers-cuda-12&lt;span class="o"&gt;=&lt;/span&gt;9.10.2.21-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Warning&lt;/strong&gt;: This will uninstall &lt;code&gt;libcudnn8-dev&lt;/code&gt;. cuDNN 8 and 9 dev headers cannot coexist.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;Can cuDNN 8 and 9 Coexist?&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Coexistence Allowed?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime libraries (&lt;code&gt;.so&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development headers&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need both for development (e.g., TensorRT 8 and ONNXRuntime), isolate them with &lt;strong&gt;Docker containers&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;Register cuDNN 9 with the System&lt;/h2&gt;

&lt;p&gt;Make sure the dynamic linker recognizes the new libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ldconfig
ldconfig &lt;span class="nt"&gt;-p&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;libcudnn.so.9
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If not found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"/usr/lib/x86_64-linux-gnu"&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/ld.so.conf.d/cudnn9.conf
&lt;span class="nb"&gt;sudo &lt;/span&gt;ldconfig
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
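

&lt;p&gt;After re-running &lt;code&gt;ldconfig&lt;/code&gt;, you can confirm from Python that the dynamic linker now resolves cuDNN 9. A minimal sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ctypes

# Raises OSError if the linker still cannot find the library
ctypes.CDLL("libcudnn.so.9")
print("libcudnn.so.9 resolved by the dynamic linker")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;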






&lt;h2&gt;Summary of What We Learned&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Built minimal ONNX test&lt;/td&gt;
&lt;td&gt;Validates runtime GPU inference, not just detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verified cuDNN compatibility&lt;/td&gt;
&lt;td&gt;Crucial for ONNXRuntime &amp;gt;= 1.19&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avoided ORT downgrade&lt;/td&gt;
&lt;td&gt;Prevented cuBLASLt and CUDA version conflicts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Upgraded cuDNN to 9.10.2&lt;/td&gt;
&lt;td&gt;Resolved runtime failures, future-proofed stack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Used &lt;code&gt;.deb&lt;/code&gt; + &lt;code&gt;ldconfig&lt;/code&gt; method&lt;/td&gt;
&lt;td&gt;Clean install and reliable shared object loading&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;Final Advice&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ Don’t stop at &lt;code&gt;get_available_providers()&lt;/code&gt;.&lt;br&gt;
🧪 Always run an actual inference on GPU to validate your setup.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This approach saved us time and frustration—and made our DeskPai deployment stable across environments.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
