<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Coopernicus</title>
    <description>The latest articles on DEV Community by Coopernicus (@coopernicus01).</description>
    <link>https://dev.to/coopernicus01</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3910810%2Fd9e9170b-c166-4ae5-b245-6032088e425a.jpg</url>
      <title>DEV Community: Coopernicus</title>
      <link>https://dev.to/coopernicus01</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/coopernicus01"/>
    <language>en</language>
    <item>
      <title>I thought I found a cheap H100. I was wrong.</title>
      <dc:creator>Coopernicus</dc:creator>
      <pubDate>Tue, 05 May 2026 01:04:41 +0000</pubDate>
      <link>https://dev.to/coopernicus01/i-thought-i-found-a-cheap-h100-i-was-wrong-5bid</link>
      <guid>https://dev.to/coopernicus01/i-thought-i-found-a-cheap-h100-i-was-wrong-5bid</guid>
      <description>&lt;p&gt;I thought I found a great deal on an H100.&lt;/p&gt;

&lt;p&gt;~$2.50/hour. Way cheaper than what I’d seen elsewhere.&lt;/p&gt;

&lt;p&gt;On paper, it looked like a no-brainer.&lt;/p&gt;

&lt;p&gt;It wasn’t.&lt;/p&gt;




&lt;h2&gt;The mistake I made&lt;/h2&gt;

&lt;p&gt;Like most people, I compared GPU providers based on:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;hourly price&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s how every pricing page is structured.&lt;/p&gt;

&lt;p&gt;So naturally, that’s how we evaluate them.&lt;/p&gt;

&lt;p&gt;But after actually running workloads, it became obvious:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;the hourly rate is one of the least important numbers.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;What actually matters: cost per &lt;em&gt;useful&lt;/em&gt; compute&lt;/h2&gt;

&lt;p&gt;The real question isn’t:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How much does this GPU cost per hour?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It’s:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How much does it cost to get the result I want?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Training run. Inference throughput. Completed job.&lt;/p&gt;

&lt;p&gt;Once you look at it that way, things change fast.&lt;/p&gt;




&lt;h2&gt;Where the extra cost comes from&lt;/h2&gt;

&lt;p&gt;Here are the biggest ones I’ve seen:&lt;/p&gt;




&lt;h3&gt;1. Idle GPUs (this adds up fast)&lt;/h3&gt;

&lt;p&gt;GPUs are rarely fully utilized.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;jobs wait on data
&lt;/li&gt;
&lt;li&gt;pipelines stall
&lt;/li&gt;
&lt;li&gt;you overprovision “just in case”
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your GPU is sitting idle 30–40% of the time, your “cheap” instance isn’t cheap anymore.&lt;/p&gt;
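&lt;p&gt;A quick way to see this is to price the hour of &lt;em&gt;work&lt;/em&gt; rather than the hour of rental. The numbers below are made up for illustration, but the formula is just the listed rate divided by utilization:&lt;/p&gt;

```python
# Hypothetical rates: a "cheap" $2.50/hr GPU vs. a $3.50/hr one.
# Effective cost per utilized hour = hourly rate / utilization.

def effective_hourly_cost(hourly_rate, utilization):
    """Cost per hour of actual GPU work, given fractional utilization."""
    return hourly_rate / utilization

cheap = effective_hourly_cost(2.50, 0.60)   # idle 40% of the time
pricey = effective_hourly_cost(3.50, 0.95)  # well-utilized

print(round(cheap, 2))   # 4.17
print(round(pricey, 2))  # 3.68
```

&lt;p&gt;At 60% utilization, the "cheap" instance already costs more per useful hour than the pricier, well-utilized one.&lt;/p&gt;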




&lt;h3&gt;2. Data movement (way bigger than people expect)&lt;/h3&gt;

&lt;p&gt;At small scale, compute dominates.&lt;/p&gt;

&lt;p&gt;At larger scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;dataset transfers
&lt;/li&gt;
&lt;li&gt;checkpoint syncing
&lt;/li&gt;
&lt;li&gt;cross-region traffic
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These costs quietly pile up.&lt;/p&gt;

&lt;p&gt;In some setups, they can rival or even exceed compute costs.&lt;/p&gt;
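&lt;p&gt;Here's a back-of-the-envelope sketch with assumed numbers (the $0.09/GB egress rate is just a common order of magnitude; check your provider's actual pricing):&lt;/p&gt;

```python
# Rough sketch, all inputs assumed: daily egress cost from syncing
# checkpoints out of region vs. the daily compute bill for one GPU.

EGRESS_PER_GB = 0.09   # assumed cloud egress rate, $/GB
CHECKPOINT_GB = 50     # checkpoint size
SYNCS_PER_DAY = 24     # hourly checkpoint sync
GPU_HOURLY = 2.50      # the "cheap" H100 rate

daily_egress = EGRESS_PER_GB * CHECKPOINT_GB * SYNCS_PER_DAY
daily_compute = GPU_HOURLY * 24

print(round(daily_egress, 2))   # 108.0
print(round(daily_compute, 2))  # 60.0
```

&lt;p&gt;In this made-up setup, moving data costs nearly twice as much as the GPU itself, and none of it shows up on the pricing page.&lt;/p&gt;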




&lt;h3&gt;3. Retries + interruptions&lt;/h3&gt;

&lt;p&gt;Stuff fails.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;spot instances get reclaimed
&lt;/li&gt;
&lt;li&gt;jobs crash
&lt;/li&gt;
&lt;li&gt;pipelines restart
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every retry:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;wastes progress
&lt;/li&gt;
&lt;li&gt;extends runtime
&lt;/li&gt;
&lt;li&gt;increases total cost
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cheap infra that fails more often = expensive infra.&lt;/p&gt;
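&lt;p&gt;You can put a number on this with a toy model (my assumptions, not any provider's data): each attempt either completes or gets interrupted with probability p and loses its progress, so the expected number of attempts is 1 / (1 - p):&lt;/p&gt;

```python
# Toy retry model: expected attempts = 1 / (1 - interrupt_prob),
# which multiplies the total cost of the job directly.

def expected_job_cost(hourly_rate, job_hours, interrupt_prob):
    """Expected cost of a job when failed attempts lose all progress."""
    expected_attempts = 1.0 / (1.0 - interrupt_prob)
    return hourly_rate * job_hours * expected_attempts

reliable = expected_job_cost(3.50, 10, 0.05)  # on-demand, rarely fails
spot = expected_job_cost(2.50, 10, 0.40)      # cheap spot, often reclaimed

print(round(reliable, 2))  # 36.84
print(round(spot, 2))      # 41.67
```

&lt;p&gt;With a 40% interruption rate, the $2.50 spot instance ends up costing more per finished job than the $3.50 reliable one. Checkpointing softens this, but the direction holds.&lt;/p&gt;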




&lt;h3&gt;4. Operational overhead&lt;/h3&gt;

&lt;p&gt;This one’s less obvious, but real:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;time spent debugging infra
&lt;/li&gt;
&lt;li&gt;managing clusters
&lt;/li&gt;
&lt;li&gt;fixing deployment issues
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A slightly more expensive provider that “just works” can be cheaper overall.&lt;/p&gt;




&lt;h2&gt;Why this keeps happening&lt;/h2&gt;

&lt;p&gt;Hourly pricing is simple.&lt;/p&gt;

&lt;p&gt;It’s easy to compare.&lt;/p&gt;

&lt;p&gt;And it looks precise.&lt;/p&gt;

&lt;p&gt;But it hides most of the variables that actually drive cost.&lt;/p&gt;




&lt;h2&gt;A better way to think about it&lt;/h2&gt;

&lt;p&gt;Instead of comparing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;$/hour&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’ve started thinking in terms of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cost per training run
&lt;/li&gt;
&lt;li&gt;cost per 1M inferences
&lt;/li&gt;
&lt;li&gt;cost per completed job
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And asking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how utilized is the GPU actually?
&lt;/li&gt;
&lt;li&gt;how often do jobs fail?
&lt;/li&gt;
&lt;li&gt;how much data is moving around?
&lt;/li&gt;
&lt;/ul&gt;
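&lt;p&gt;Those three questions combine into a single "cost per completed job" number. A minimal sketch, with every input an assumption you'd measure for your own workload:&lt;/p&gt;

```python
# Sketch of cost per completed job: utilization inflates the compute
# bill, interruptions multiply it, and data movement adds on top.

def cost_per_completed_job(hourly_rate, job_hours, utilization,
                           interrupt_prob, egress_cost):
    compute = hourly_rate * job_hours / utilization
    retries = 1.0 / (1.0 - interrupt_prob)
    return compute * retries + egress_cost

a = cost_per_completed_job(2.50, 10, 0.60, 0.40, 15.0)  # "cheap" provider
b = cost_per_completed_job(3.50, 10, 0.95, 0.05, 5.0)   # pricier, reliable

print(round(a, 2))  # 84.44
print(round(b, 2))  # 43.78
```

&lt;p&gt;With these (made-up) inputs, the provider with the lower sticker price comes out roughly 2× more expensive per finished job.&lt;/p&gt;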




&lt;h2&gt;The takeaway&lt;/h2&gt;

&lt;p&gt;The cheapest GPU on paper is often not the cheapest in practice.&lt;/p&gt;

&lt;p&gt;And the difference can easily be 2× depending on how things are set up.&lt;/p&gt;




&lt;p&gt;I’ve been digging into this while building tools to compare real GPU/cloud costs across providers.&lt;/p&gt;

&lt;p&gt;Curious how others are thinking about this.&lt;/p&gt;

&lt;p&gt;Are you still comparing providers by hourly price, or looking at full workload cost?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
