<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: whisprer</title>
    <description>The latest articles on DEV Community by whisprer (@whisprer).</description>
    <link>https://dev.to/whisprer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3774695%2F4bc28a6c-d312-4927-9deb-c5c0d7b70986.png</url>
      <title>DEV Community: whisprer</title>
      <link>https://dev.to/whisprer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/whisprer"/>
    <language>en</language>
    <item>
      <title>How Boredom + A New PC Led to a 64x Faster Prime Sieve in Rust (The "Dopamine optimization" Story)</title>
      <dc:creator>whisprer</dc:creator>
      <pubDate>Mon, 16 Feb 2026 04:21:59 +0000</pubDate>
      <link>https://dev.to/whisprer/how-boredom-a-new-pc-led-to-a-64x-faster-prime-sieve-in-rust-the-dopamine-optimization-story-1mj2</link>
      <guid>https://dev.to/whisprer/how-boredom-a-new-pc-led-to-a-64x-faster-prime-sieve-in-rust-the-dopamine-optimization-story-1mj2</guid>
      <description>&lt;p&gt;The Origin Story: It Started With a "New Toy"&lt;br&gt;
About six months ago, I was sitting at my desk with a serious case of coder’s block. I didn’t want to grind through existing bugs; I needed a win. A dopamine hit.&lt;/p&gt;

&lt;p&gt;I also happened to be staring at a new (to me) monster of a workstation: 4.3 GHz i7, 64 GB RAM, Quadro GPU. It was a beast, and I realized I had never actually pushed it. I wanted to see the metal glow. I wanted to make the fans spin.&lt;/p&gt;

&lt;p&gt;So I set a challenge: Write the absolute fastest, most CPU-punishing code possible in C++.&lt;/p&gt;

&lt;p&gt;Math is heavy, so I landed on prime generation. I found a paper on optimizing the Sieve of Eratosthenes, dug into SIMD intrinsics (AVX2 and AVX-512), and spent a frantic weekend obsessing over cache lines and cycle counts. I managed to beat the standard library implementations, watched my CPU hit 100% usage... and then, just like that, the dopamine faded.&lt;/p&gt;

&lt;p&gt;"Cool. Next."&lt;/p&gt;

&lt;p&gt;I shelved the code and forgot about it.&lt;/p&gt;

&lt;p&gt;The Flashback&lt;br&gt;
Fast forward to three days ago. I was compiling a Rust project, watching the dependencies scroll by.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Compiling rand...
Compiling primes...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait.&lt;/p&gt;

&lt;p&gt;It hit me like a brick. I had an entire codebase of highly optimized, battle-tested prime generation logic sitting in a dusty C++ repo. I had already solved the hard parts: the bit twiddling, the cache locality, the memory layout.&lt;/p&gt;

&lt;p&gt;Why wasn't I using it in Rust?&lt;/p&gt;

&lt;p&gt;The Frenzy: Porting to Rust&lt;br&gt;
The fear of losing the idea before I could implement it kicked in. I launched into a 12-hour coding fugue state.&lt;/p&gt;

&lt;p&gt;The goal wasn't just a port; it was a reimagining. In C++, I was raw-dogging pointers. In Rust, I wanted that same performance but with safety and better ergonomics.&lt;/p&gt;

&lt;p&gt;I ended up with Primer—a crate that is:&lt;/p&gt;

&lt;p&gt;Bit-Packed: Uses 1 bit per odd number (effectively 0.5 bits per number).&lt;/p&gt;

&lt;p&gt;Odd-Only: We hardcode 2 and ignore even numbers entirely.&lt;/p&gt;

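&lt;p&gt;Intrinsics-Powered: Uses trailing_zeros (which typically compiles to a single tzcnt or bsf instruction on x86-64) to jump straight to the next set bit in a word instead of testing bits one at a time.&lt;/p&gt;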

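&lt;p&gt;Kernighan-Iterated: After each hit we clear the lowest set bit with word &amp;amp;= word - 1, so the scan loop runs once per set bit and never touches the zeros in between.&lt;/p&gt;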
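
&lt;p&gt;To make the four points above concrete, here is a minimal sketch of an odd-only, bit-packed sieve that is read back with trailing_zeros and Kernighan's bit-clearing trick. It is not Primer's actual code: the function name primes_up_to, the "set bit means composite" convention, and the single unsegmented buffer are all assumptions made for illustration.&lt;/p&gt;

<![CDATA[
```rust
/// Illustrative sketch only (hypothetical helper, not Primer's API).
/// Returns every prime <= n using an odd-only, bit-packed sieve.
/// Bit i stands for the odd number 2*i + 1; a set bit means "composite".
fn primes_up_to(n: usize) -> Vec<usize> {
    if n < 2 {
        return Vec::new();
    }
    // 2 is hardcoded; even numbers are never stored.
    let mut primes = vec![2usize];
    if n < 3 {
        return primes;
    }

    // One bit per odd number in 1..=n, packed into 64-bit words.
    let bits = (n + 1) / 2;
    let mut composite = vec![0u64; (bits + 63) / 64];
    composite[0] |= 1; // index 0 is the number 1, which is not prime

    // Cross off odd multiples of each odd prime p, starting at p*p.
    let mut i = 1usize; // index of the number 3
    while (2 * i + 1) * (2 * i + 1) <= n {
        if (composite[i / 64] & (1u64 << (i % 64))) == 0 {
            let p = 2 * i + 1;
            let mut j = (p * p - 1) / 2; // index of p*p
            while j < bits {
                composite[j / 64] |= 1u64 << (j % 64);
                j += p; // moving the index by p moves the number by 2p
            }
        }
        i += 1;
    }

    // Read the sieve back one word at a time.
    for (w, &word) in composite.iter().enumerate() {
        let mut candidates = !word; // flip so that set bits mean "prime"
        while candidates != 0 {
            // trailing_zeros gives the position of the lowest set bit.
            let idx = w * 64 + candidates.trailing_zeros() as usize;
            if idx >= bits {
                break; // padding bits past the end of the range
            }
            primes.push(2 * idx + 1);
            // Kernighan's trick: clear the lowest set bit.
            candidates &= candidates - 1;
        }
    }
    primes
}
```
]]>

&lt;p&gt;In this sketch, primes_up_to(30) hardcodes 2 and then visits only the nine set bits for 3 through 29. A segmented variant would reuse one small fixed-size buffer per block of the range instead of allocating bits for the whole range at once.&lt;/p&gt;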

&lt;p&gt;The Result: 64x Faster &amp;amp; 95x Smaller&lt;br&gt;
I didn't expect the Rust compiler to play this nice, but the results blew me away.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Benchmark (n = 50,000,000):&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Standard Vec&amp;lt;bool&amp;gt; implementation: ~47 MB RAM (one byte per number)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Primer (Segmented Buffer): ~32 KB RAM&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Speed: Up to 64x faster than the primes crate in bulk generation.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Why This Matters (Beyond the Speed)&lt;br&gt;
This project wasn't born from a need for primes. It was born from boredom and a desire to see a computer work.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffw71wqvoifiat8zaeaop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffw71wqvoifiat8zaeaop.png" alt=" " width="679" height="248"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sometimes the best code doesn't come from a ticket or a spec sheet. It comes from staring at a blinking cursor at 2 AM, wondering, "Can I make this go faster?"&lt;/p&gt;

&lt;p&gt;And then accidentally building the fastest thing in the room.&lt;/p&gt;

&lt;p&gt;Check out the code (and the benchmarks) on GitHub:&lt;br&gt;
👉 &lt;code&gt;https://github.com/whisprer/primer&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;(Also, if you are an embedded Rust dev working with an ESP32 or a Raspberry Pi, this 32 KB footprint is specifically for you. Go wild.)&lt;/p&gt;

</description>
      <category>rust</category>
      <category>performance</category>
      <category>systems</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
