<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: miriam K</title>
    <description>The latest articles on DEV Community by miriam K (@miriam_k).</description>
    <link>https://dev.to/miriam_k</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3666262%2F1734c7e1-cb88-4484-8a94-e0498f4dff62.png</url>
      <title>DEV Community: miriam K</title>
      <link>https://dev.to/miriam_k</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/miriam_k"/>
    <language>en</language>
    <item>
      <title>SEMNR: Why I Stopped Trusting "Clean" Images (And Treated Metrics as Guardrails)</title>
      <dc:creator>miriam K</dc:creator>
      <pubDate>Wed, 17 Dec 2025 06:49:10 +0000</pubDate>
      <link>https://dev.to/miriam_k/semnr-why-i-stopped-trusting-clean-images-and-treated-metrics-as-guardrails-3l3f</link>
      <guid>https://dev.to/miriam_k/semnr-why-i-stopped-trusting-clean-images-and-treated-metrics-as-guardrails-3l3f</guid>
      <description>&lt;p&gt;&lt;em&gt;This work was carried out as part of an intensive Applied Materials &amp;amp; Extra-Tech bootcamp, where the challenge went far beyond choosing the “right” denoising model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I would like to thank my mentors Roman Kris and Mor Baram from Applied Materials for their technical guidance, critical questions, and constant push toward practical, production-level thinking, as well as Shmuel Fine and Sara Shimon from Extra-Tech for their support and teaching throughout the process.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In classical image processing, "clean" is a compliment. In semiconductor SEM denoising, "clean" is often a lie.&lt;/p&gt;

&lt;p&gt;The obvious goal of a denoiser is to remove noise. But in scientific and industrial imaging, the actual objective is &lt;strong&gt;evidence preservation&lt;/strong&gt;. Microscopic edges of a conductor, the subtle texture of a silicon surface, or a tiny defect—these signals carry critical meaning.&lt;/p&gt;

&lt;p&gt;A denoiser can easily make an image look pleasant to the human eye while silently scrubbing away the very details that change the entire analysis.&lt;/p&gt;

&lt;p&gt;Building &lt;strong&gt;SEMNR&lt;/strong&gt; taught me a hard lesson: standard evaluation methods were a trap. I didn't need a leaderboard to brag about; I needed engineering guardrails. Here is how I moved from chasing high scores to building a trust profile for my data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpvhav67alv0usz82p9m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpvhav67alv0usz82p9m.png" alt="High Score vs. High Trust: The middle image has a better PSNR score but blurred the critical edges of the wafer lines." width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;High Score vs. High Trust: The middle image has a better PSNR score but blurred the critical edges of the wafer lines. &lt;br&gt;
The right image (SEMNR) preserves the sharp structure and original texture, even if it's less smoothly "clean".&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Defining What I Refuse to Lose&lt;/h2&gt;

&lt;p&gt;Before training a single model, I defined exactly what I refused to lose. Metric selection became an active engineering decision, not just a passive acceptance of default tools.&lt;/p&gt;

&lt;p&gt;I found that aggressive noise reduction often fights directly against preserving structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Metrics that reward smoothness&lt;/strong&gt; (standard PSNR is the usual example) actively encourage &lt;strong&gt;over-smoothing&lt;/strong&gt;. PSNR is a direct function of mean squared error, and averaging over plausible textures lowers that error, so the model learns to blur textures just to get a better score.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics that ignore texture&lt;/strong&gt; give the model permission to "hallucinate" details that aren't there, or worse, to wipe out real defects that are critical for quality control.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To validate this, I ran "stress tests"—applying artificial blur, over-sharpening, and artifacts to SEM samples—to see which metrics flagged issues and which stayed silent. The results were wildly inconsistent. Often, &lt;strong&gt;PSNR improved while the image actually became less analytically useful.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That instantly killed the "single hero number" idea for me.&lt;/p&gt;
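
&lt;p&gt;A minimal sketch of one such stress test, under loud assumptions: a synthetic stripe pattern stands in for real SEM data, a 3x3 box blur stands in for an over-smoothing denoiser, and &lt;code&gt;edge_retention&lt;/code&gt; is a hypothetical helper, not one of the metrics used in SEMNR. It only illustrates how PSNR can rise while edge contrast collapses.&lt;/p&gt;

```python
import numpy as np


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)


def box_blur3(image):
    """3x3 box blur with edge padding (stand-in for an over-smoothing model)."""
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    total = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + height, dx:dx + width]
    return total / 9.0


def edge_retention(reference, test):
    """Hypothetical helper: fraction of the reference step height that
    survives at the reference edge locations (1.0 means fully preserved)."""
    ref_step = np.diff(reference, axis=1)
    test_step = np.diff(test, axis=1)
    edges = np.abs(ref_step) > 0.3  # columns where the clean image jumps
    aligned = test_step[edges] * np.sign(ref_step[edges])
    return float(np.mean(aligned) / np.mean(np.abs(ref_step[edges])))


# Synthetic "wafer lines": vertical stripes, 16 pixels wide (made-up values).
columns = np.arange(64)
stripes = np.where((columns // 16) % 2 == 0, 0.2, 0.8)
clean = np.tile(stripes, (64, 1))

rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 0.15, clean.shape)
blurred = box_blur3(noisy)

for name, image in [("noisy", noisy), ("blurred", blurred)]:
    print(f"{name:8s} PSNR={psnr(clean, image):5.2f} dB  "
          f"edge retention={edge_retention(clean, image):.2f}")
```

&lt;p&gt;On this toy pattern the blurred image scores a higher PSNR than the noisy one while keeping only about a third of the original step height at each edge.&lt;/p&gt;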

&lt;h2&gt;The Stack: Profiles Over Scores&lt;/h2&gt;

&lt;p&gt;Instead of chasing one perfect number, I built a metric profile. Think of it as a QA toolkit where each metric has a specific job description:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqt53y9g8fvz1c9m4pnq3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqt53y9g8fvz1c9m4pnq3.png" alt="A delicate balance: The goal is to maximize the total area of the chart, not just one spike." width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;A delicate balance: The goal is to maximize the total area of the chart, not just one spike. Notice how boosting PSNR (Fidelity) often comes at the direct expense of Texture Realism (DISTS).&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PSNR (The Anchor):&lt;/strong&gt; Measures pixel-level fidelity (how close the output pixel values are to the reference image). It is my baseline, but I never trust it alone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSIM (The Structural Engineer):&lt;/strong&gt; Ensures the "skeleton" of the image remains intact (checking macroscopic structures like contact holes or vias).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FSIM (The Edge Guardian):&lt;/strong&gt; Critical in SEM. It compares phase congruency and gradient magnitude, so it reacts to sharp transitions between materials and flags edges that are being blurred out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DISTS (The Texture Specialist):&lt;/strong&gt; Captures realism using deep learning features. This is the metric that prevents the "plastic" look and preserves natural grain.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CNR (The Pragmatist):&lt;/strong&gt; Reflects practical Contrast-to-Noise detectability. It asks: can a computer vision algorithm now spot a defect against the background more easily?&lt;/li&gt;
&lt;/ul&gt;
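
&lt;p&gt;CNR is the simplest entry in the profile to compute by hand. The sketch below assumes one common definition (absolute difference of region means divided by the background standard deviation); the masks and pixel values are made up for illustration, and the definition used in SEMNR may differ.&lt;/p&gt;

```python
import numpy as np


def contrast_to_noise_ratio(image, feature_mask, background_mask):
    """CNR = |mean(feature) - mean(background)| / std(background)."""
    feature = image[feature_mask]
    background = image[background_mask]
    return float(abs(feature.mean() - background.mean()) / background.std())


# Toy scene: a brighter 8x8 "defect" on a flat background (values are made up).
scene = np.full((32, 32), 0.3)
feature_mask = np.zeros((32, 32), dtype=bool)
feature_mask[12:20, 12:20] = True
scene[feature_mask] = 0.5
background_mask = ~feature_mask

rng = np.random.default_rng(0)
quiet = scene + rng.normal(0.0, 0.05, scene.shape)  # lower noise level
loud = scene + rng.normal(0.0, 0.10, scene.shape)   # doubled noise level

cnr_quiet = contrast_to_noise_ratio(quiet, feature_mask, background_mask)
cnr_loud = contrast_to_noise_ratio(loud, feature_mask, background_mask)
print(f"CNR at sigma=0.05: {cnr_quiet:.2f}")
print(f"CNR at sigma=0.10: {cnr_loud:.2f}")
```

&lt;p&gt;The same feature becomes harder to detect as the background noise grows, which is what the lower CNR reports.&lt;/p&gt;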

&lt;h2&gt;Where Metrics Disagree – Finding the Debug Signals&lt;/h2&gt;

&lt;p&gt;The most valuable engineering insights didn't arrive when all metrics went up together. They came when metrics &lt;strong&gt;disagreed&lt;/strong&gt;. I learned to read these conflicts as distinct debugging signals for model behavior:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;PSNR ⬆️ / FSIM ⬇️:&lt;/strong&gt; A clear sign of &lt;strong&gt;over-smoothing&lt;/strong&gt;. The model is aggressively cleaning noise but erasing high-frequency edge information.&lt;/li&gt;
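&lt;li&gt; &lt;strong&gt;SSIM Stable / DISTS ⬇️:&lt;/strong&gt; The general structure is fine, but I am experiencing &lt;strong&gt;texture drift&lt;/strong&gt;. The surface is losing its authentic material character. (DISTS is read here as a similarity score where higher is better; the raw DISTS distance would rise in this case.)&lt;/li&gt;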
&lt;li&gt; &lt;strong&gt;PSNR ⬆️ / CNR ⬇️:&lt;/strong&gt; I am technically closer to the ground truth pixels, but I have lost local contrast, making features harder to interpret.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx69lwlfk7pfly037fz9e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx69lwlfk7pfly037fz9e.png" alt="The logic behind the scenes: The flowchart I used to flag failures." width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The logic behind the scenes: The flowchart I used to flag failures that the human eye (or PSNR alone) might initially miss.&lt;/em&gt;&lt;/p&gt;
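
&lt;p&gt;The three conflict patterns above can be sketched as a small rule function. This is an illustration, not the flowchart's actual implementation: the function name, the tolerance values, and the flag labels are all assumptions, and every metric is assumed to be oriented so that higher is better.&lt;/p&gt;

```python
# Hypothetical tolerances: how large a change must be before it counts as a
# real move. Illustrative assumptions, not thresholds from the article.
DEFAULT_TOLERANCE = {
    "psnr": 0.1, "ssim": 0.005, "fsim": 0.005, "dists": 0.005, "cnr": 0.05,
}


def diagnose(delta, tolerance=None):
    """Map metric changes (candidate minus baseline) to failure flags.

    Every metric is assumed to be oriented so that higher is better;
    a raw DISTS distance would need its sign flipped first.
    """
    tol = DEFAULT_TOLERANCE if tolerance is None else tolerance

    def rose(name):
        return delta[name] > tol[name]

    def fell(name):
        return -delta[name] > tol[name]

    def stable(name):
        return tol[name] >= abs(delta[name])

    flags = []
    if rose("psnr") and fell("fsim"):
        flags.append("over-smoothing")
    if stable("ssim") and fell("dists"):
        flags.append("texture drift")
    if rose("psnr") and fell("cnr"):
        flags.append("lost local contrast")
    return flags


# Example: PSNR improved, but edge and texture scores dropped.
example = {"psnr": 0.8, "ssim": 0.001, "fsim": -0.02, "dists": -0.03, "cnr": 0.1}
print(diagnose(example))  # ['over-smoothing', 'texture drift']
```

&lt;p&gt;Returning a list rather than a single verdict keeps the profile idea intact: one model update can trip several guardrails at once.&lt;/p&gt;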

&lt;h2&gt;Closing: Shifting from Beauty to Trust&lt;/h2&gt;

&lt;p&gt;In SEMNR, this process changed my guiding question from "Is this image clean?" to &lt;strong&gt;"Is this image trustworthy?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By building an evaluation stack that uses specific metrics as guardrails against specific failures (like edge blurring), I turned model evaluation from a beauty contest into an engineering safety system.&lt;/p&gt;

&lt;p&gt;In the world of scientific and industrial data, my job isn't to beautify reality, but to reveal it with minimal interference. Sometimes, that means leaving a little bit of natural "noise" behind—just to make sure the truth stays in the picture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fikedyghj7e3xd8l62v1a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fikedyghj7e3xd8l62v1a.png" alt="The difference is in the micro-details: A zoom-in on a defect." width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The difference is in the micro-details: A zoom-in on a defect at the edge of a structure. Left: A standard model erased the defect along with the noise. Right (SEMNR): The noise is cleared, but the critical defect is preserved sharply.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>science</category>
    </item>
  </channel>
</rss>
