<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zipporah P.</title>
    <description>The latest articles on DEV Community by Zipporah P. (@_915f6e7fh).</description>
    <link>https://dev.to/_915f6e7fh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3651437%2Fcb3c8327-b16d-4115-a1d2-32b8abf4479c.png</url>
      <title>DEV Community: Zipporah P.</title>
      <link>https://dev.to/_915f6e7fh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_915f6e7fh"/>
    <language>en</language>
    <item>
      <title>From Pixels to Precision</title>
      <dc:creator>Zipporah P.</dc:creator>
      <pubDate>Tue, 10 Feb 2026 11:44:11 +0000</pubDate>
      <link>https://dev.to/_915f6e7fh/from-pixels-to-precision-425h</link>
      <guid>https://dev.to/_915f6e7fh/from-pixels-to-precision-425h</guid>
      <description>&lt;h3&gt;
  
  
  How Algorithmic Insight and Scalable Architecture Turn Noisy SEM Images into Reliable Data
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcwdpq7b183qcleyavxq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcwdpq7b183qcleyavxq.jpg" alt="SEM image before and after denoising" width="768" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Working with SEM images means accepting noise as a given.&lt;br&gt;&lt;br&gt;
Noise is not just a visual artifact - it directly affects measurements, downstream analysis, and scientific conclusions.&lt;/p&gt;

&lt;p&gt;This work was carried out as part of an intensive &lt;strong&gt;Applied Materials &amp;amp; Extra-Tech bootcamp&lt;/strong&gt;, where the challenge went far beyond choosing the “right” denoising model.&lt;br&gt;&lt;br&gt;
I would like to thank my mentors &lt;strong&gt;Roman Kris&lt;/strong&gt; and &lt;strong&gt;Mor Baram&lt;/strong&gt; from Applied Materials for their technical guidance, critical questions, and constant push toward practical, production-level thinking, as well as &lt;strong&gt;Shmuel Fine&lt;/strong&gt; and &lt;strong&gt;Sara Shimon&lt;/strong&gt; from Extra-Tech for their support and teaching throughout the process.&lt;/p&gt;

&lt;p&gt;Our goal was not simply to “clean images”, but to &lt;strong&gt;build a system that treats noise as an algorithmic challenge and solves it at scale&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
This is a story about denoising - and about the infrastructure that makes it reliable.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Who is this for?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post is written for engineers and researchers working with SEM data, image processing pipelines, or machine learning systems that need to operate at scale.&lt;/p&gt;


&lt;h2&gt;
  
  
  Noise Is Not One Problem - It’s Many
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-noise-image.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-noise-image.jpg" alt="Examples of different types of SEM noise" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;SEM noise rarely follows a single distribution.&lt;br&gt;&lt;br&gt;
In practice, it often combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gaussian-like noise&lt;/li&gt;
&lt;li&gt;Texture-dependent artifacts&lt;/li&gt;
&lt;li&gt;Frame-to-frame variability within the same dataset&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Classical denoising methods provide a natural starting point:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mean / Gaussian filters&lt;/strong&gt; - effective for uniform noise, but blur fine details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Median / Bilateral filters&lt;/strong&gt; - preserve edges, struggle with complex noise&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BM3D / NLM&lt;/strong&gt; - high-quality results, at the cost of heavy computation and careful tuning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each method solves part of the problem - and introduces new trade-offs.&lt;/p&gt;
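&lt;p&gt;To make these trade-offs concrete, here is a minimal sketch (not the pipeline's actual code) comparing two classical baselines on a synthetic noisy frame; the ramp image and noise level are illustrative stand-ins for real SEM data:&lt;/p&gt;

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

rng = np.random.default_rng(0)

# A smooth synthetic "structure" plus additive Gaussian noise,
# standing in for a real SEM frame.
clean = np.linspace(0.0, 1.0, 128)[None, :] * np.ones((128, 1))
noisy = clean + rng.normal(0.0, 0.1, clean.shape)

def mse(img, ref):
    """Mean squared error against the clean reference."""
    return float(np.mean((img - ref) ** 2))

# Two classical baselines: each suppresses noise at some cost in detail.
gauss = gaussian_filter(noisy, sigma=1.5)   # strong smoothing, blurs fine detail
median = median_filter(noisy, size=3)       # preserves edges better, weaker on heavy noise

results = {
    "noisy": mse(noisy, clean),
    "gaussian": mse(gauss, clean),
    "median": mse(median, clean),
}
```

&lt;p&gt;On this uniform-noise example both filters reduce the error; on textured regions the same settings would start erasing real detail, which is exactly the trade-off described above.&lt;/p&gt;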


&lt;h2&gt;
  
  
  When Deep Learning Stops Being a Silver Bullet
&lt;/h2&gt;

&lt;p&gt;Deep learning models such as &lt;strong&gt;UNet and DRUNet&lt;/strong&gt; significantly changed the denoising landscape.&lt;br&gt;&lt;br&gt;
They learn noise patterns directly from data rather than relying on fixed assumptions.&lt;/p&gt;

&lt;p&gt;However:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They require high-quality training data&lt;/li&gt;
&lt;li&gt;They are computationally expensive&lt;/li&gt;
&lt;li&gt;They are not always optimal for every noise regime&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Replacing classical methods entirely was never the goal.&lt;br&gt;&lt;br&gt;
Instead, we aimed to &lt;strong&gt;use deep learning exactly where it provides the most value&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Hybrid Pipeline: Let Each Method Do What It Does Best
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fz78fu5jd92ga206kz3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fz78fu5jd92ga206kz3.png" alt="Diagram of classical and deep learning SEM denoising pipeline" width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The pipeline was designed as a sequence of informed decisions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Classical filtering&lt;/strong&gt; to stabilize and reduce uniform noise&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deep learning models&lt;/strong&gt; to handle complex, non-linear noise patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality metrics at every stage&lt;/strong&gt; to evaluate edges, texture, and detail preservation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Rather than blindly emitting a single output, the system &lt;strong&gt;selects the best result based on measurable criteria&lt;/strong&gt;.&lt;/p&gt;
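&lt;p&gt;The selection step can be sketched as follows. This is a simplified illustration: the candidate filters and the metric are stand-ins, and PSNR against a clean reference is only usable here because the example is synthetic (a production pipeline would rely on reference-free quality metrics):&lt;/p&gt;

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

rng = np.random.default_rng(1)
clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = clean + rng.normal(0.0, 0.08, clean.shape)

def psnr(img, ref):
    """Peak signal-to-noise ratio against a reference frame."""
    err = np.mean((img - ref) ** 2)
    return 10.0 * np.log10(1.0 / err)

# Candidate outputs from different pipeline branches (stand-ins for
# the classical filters and the deep-learning models).
candidates = {
    "gaussian": gaussian_filter(noisy, sigma=1.2),
    "median": median_filter(noisy, size=3),
}

# Score every candidate and keep the measurably best one.
scores = {name: psnr(img, clean) for name, img in candidates.items()}
best = max(scores, key=scores.get)
```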

&lt;p&gt;At this point, a new challenge emerged:&lt;br&gt;&lt;br&gt;
How does this pipeline behave at scale?&lt;/p&gt;


&lt;h2&gt;
  
  
  Algorithms Don’t Scale - Systems Do
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-architecture-diagram.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-architecture-diagram.jpg" alt="Multi-server and worker pool architecture for SEM processing" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To support large datasets and multiple users, the algorithm needed a solid architectural backbone:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multi-client / multi-server design&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worker pools&lt;/strong&gt; executing pipeline stages in parallel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External object storage (S3 / MinIO)&lt;/strong&gt; for intermediate results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis-based caching&lt;/strong&gt; to reduce I/O overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relational database&lt;/strong&gt; for job state, metrics, and recovery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The denoising logic remained the same -&lt;br&gt;&lt;br&gt;
but performance, stability, and throughput changed dramatically.&lt;/p&gt;


&lt;h2&gt;
  
  
  Parallelism Done Right
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-parallelism-diagram.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/path-to-parallelism-diagram.jpg" alt="Parallel processing and worker pool pipeline diagram" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not all parallelism is equal. The system exploits parallelism across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt; - maximizing throughput on large datasets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline stages&lt;/strong&gt; - overlapping CPU- and GPU-heavy tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution models&lt;/strong&gt; - threads for native/CUDA workloads, processes to bypass Python’s GIL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in predictable runtimes and efficient resource utilization.&lt;/p&gt;
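&lt;p&gt;A minimal sketch of the image-level parallelism, assuming a hypothetical per-image worker function; threads suit native/CUDA stages that release the GIL, and swapping in &lt;code&gt;ProcessPoolExecutor&lt;/code&gt; sidesteps the GIL for pure-Python stages:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

def denoise(image_id):
    """Stand-in for one pipeline run; a real worker would invoke
    the classical filters and the model here (names hypothetical)."""
    return image_id, f"denoised-{image_id}"

image_ids = [f"img-{i:03d}" for i in range(8)]

# Image-level parallelism: each worker owns one frame end to end,
# so throughput scales with the pool size.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(denoise, image_ids))
```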


&lt;h2&gt;
  
  
  Measure First. Optimize Later.
&lt;/h2&gt;

&lt;p&gt;Before optimizing anything, we benchmarked:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Per-image latency&lt;/li&gt;
&lt;li&gt;Overall throughput&lt;/li&gt;
&lt;li&gt;CPU and GPU utilization&lt;/li&gt;
&lt;li&gt;I/O overhead&lt;/li&gt;
&lt;li&gt;Impact of concurrent users&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Bottlenecks often live outside the model.&lt;/p&gt;
&lt;/blockquote&gt;
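&lt;p&gt;A basic version of this measurement harness looks like the sketch below; the workload is a hypothetical stand-in, but the pattern of recording per-image latency alongside overall throughput is what surfaces bottlenecks outside the model:&lt;/p&gt;

```python
import time

def process(image):
    """Stand-in for one denoising pass (hypothetical workload)."""
    return sum(image) / len(image)

images = [[float(j) for j in range(256)] for _ in range(50)]

latencies = []
start = time.perf_counter()
for img in images:
    t0 = time.perf_counter()
    process(img)
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

throughput = len(images) / elapsed            # images per second
p50 = sorted(latencies)[len(latencies) // 2]  # median per-image latency
```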


&lt;h2&gt;
  
  
  Why Caching Changed Everything
&lt;/h2&gt;

&lt;p&gt;Intermediate results are cached using structured keys:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;image_id&amp;gt;:&amp;lt;version&amp;gt;:&amp;lt;stage&amp;gt;:&amp;lt;config_hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instant reuse of previous computations&lt;/li&gt;
&lt;li&gt;True stop-and-resume capabilities&lt;/li&gt;
&lt;li&gt;Cross-server result sharing&lt;/li&gt;
&lt;/ul&gt;
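&lt;p&gt;A hypothetical construction of such a key, with a plain dict standing in for Redis; hashing the stage configuration ensures a cached result is only reused when the exact parameters match:&lt;/p&gt;

```python
import hashlib
import json

def cache_key(image_id, version, stage, config):
    """Build the structured key; the hash pins the exact parameters."""
    blob = json.dumps(config, sort_keys=True).encode()
    config_hash = hashlib.sha256(blob).hexdigest()[:12]
    return f"{image_id}:{version}:{stage}:{config_hash}"

cache = {}  # stand-in for Redis

def run_stage(image_id, stage, config, compute):
    key = cache_key(image_id, "v1", stage, config)
    if key in cache:            # hit: reuse the previous computation
        return cache[key]
    cache[key] = compute()      # miss: compute once and store
    return cache[key]

result = run_stage("img-001", "gaussian", {"sigma": 1.5}, lambda: "filtered")
```

&lt;p&gt;Because the key encodes image, version, stage, and configuration, any worker on any server can resolve it, which is what makes cross-server reuse and stop-and-resume possible.&lt;/p&gt;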

&lt;p&gt;In practice, this &lt;strong&gt;eliminated hours of redundant processing&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A strong algorithm needs a system that supports it&lt;/li&gt;
&lt;li&gt;Hybrid approaches outperform single-method solutions&lt;/li&gt;
&lt;li&gt;Metrics are part of the algorithm, not an afterthought&lt;/li&gt;
&lt;li&gt;Caching and parallelism are force multipliers&lt;/li&gt;
&lt;li&gt;Good architecture allows algorithms to shine&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The core of SEM denoising lies in algorithms that maximize measurable quality metrics while minimizing information loss.&lt;/p&gt;

&lt;p&gt;By combining algorithmic insight, deep learning, and scalable architecture, noisy SEM images become reliable data.&lt;/p&gt;

&lt;p&gt;That is the journey from &lt;strong&gt;Pixels to Precision&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Closing Thought:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you’re facing similar challenges in image processing or machine learning systems,&lt;br&gt;&lt;br&gt;
I’d be happy to hear how you approach noise and scalability in your own pipelines.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>architecture</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
