<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pallab Roy</title>
    <description>The latest articles on DEV Community by Pallab Roy (@pallab_roy).</description>
    <link>https://dev.to/pallab_roy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3860275%2Fa489f8b6-cb00-40bc-a4a5-888bce35fa3a.png</url>
      <title>DEV Community: Pallab Roy</title>
      <link>https://dev.to/pallab_roy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pallab_roy"/>
    <language>en</language>
    <item>
      <title>Stop Optimizing for MSE: Why Your Business Metrics Matter More Than Your Loss Function</title>
      <dc:creator>Pallab Roy</dc:creator>
      <pubDate>Sat, 04 Apr 2026 03:10:14 +0000</pubDate>
      <link>https://dev.to/pallab_roy/stop-optimizing-for-mse-why-your-business-metrics-matter-more-than-your-loss-function-f7e</link>
      <guid>https://dev.to/pallab_roy/stop-optimizing-for-mse-why-your-business-metrics-matter-more-than-your-loss-function-f7e</guid>
      <description>&lt;p&gt;As developers, we are trained to worship the leaderboards. We see a lower &lt;strong&gt;Mean Squared Error (MSE)&lt;/strong&gt; or a higher &lt;strong&gt;R-squared&lt;/strong&gt;, and we think we’ve won. &lt;/p&gt;

&lt;p&gt;But after half a decade in the industry—transitioning from a full-stack developer to an AI-native software engineer building Gen AI predictors for French hospitality giants—I’ve learned a hard truth: &lt;strong&gt;Your stakeholders don't care about your loss function. They care about their bottom line&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trap: When "Accurate" Models Fail the Business
&lt;/h2&gt;

&lt;p&gt;In the &lt;strong&gt;Regression Thinking Framework&lt;/strong&gt;, we learn that the loss function is just a "badness score". Most of us default to MSE because the math is "beautiful" and smooth. &lt;/p&gt;

&lt;p&gt;However, MSE is symmetric: it penalizes over-prediction and under-prediction of the same size identically (squaring only makes it chase large errors, not costly ones). In the real world, being "off" by 10 units isn't always equal. &lt;/p&gt;

&lt;h3&gt;
  
  
  The Food Delivery Disaster
&lt;/h3&gt;

&lt;p&gt;Imagine you are building a model to predict food delivery times. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scenario A (Early):&lt;/strong&gt; The model predicts 30 mins; it arrives in 20. The customer is happy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scenario B (Late):&lt;/strong&gt; The model predicts 30 mins; it arrives in 40. The customer is angry, demands a refund, and leaves a 1-star review.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The MSE Problem:&lt;/strong&gt; A standard MSE loss penalizes being 10 minutes early and 10 minutes late exactly the same. If you optimize for MSE, you are essentially telling the business that customer churn is no more expensive than a pleasant surprise.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Strategic Shift: Loss Functions are Business Decisions
&lt;/h2&gt;

&lt;p&gt;One of the most important "Thinking Frameworks" I use today is recognizing that &lt;strong&gt;the loss function is a business decision, not a technical one&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Business Need&lt;/th&gt;
&lt;th&gt;Technical Metric (Internal)&lt;/th&gt;
&lt;th&gt;Business Metric (Stakeholder)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inventory Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RMSE&lt;/td&gt;
&lt;td&gt;% of Stockouts vs. Overstock cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Medical Dosage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MAE&lt;/td&gt;
&lt;td&gt;Patient Safety Margin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Financial Forecasting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MAPE&lt;/td&gt;
&lt;td&gt;Rupee Impact per Quarter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In my current project, predicting goods prices for restaurants, a "small" error in predicting the price of high-volume items like onions is far more catastrophic than a "large" error on a rare spice. We had to move beyond simple MSE to ensure the model respected the &lt;strong&gt;asymmetric costs&lt;/strong&gt; of the restaurant's wallet.&lt;/p&gt;




&lt;h2&gt;
  
  
  3 Ways to Align Your Model with Reality
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Build an Asymmetric Loss
&lt;/h3&gt;

&lt;p&gt;If being late costs more than being early, tell your model. By penalizing under-prediction more heavily than over-prediction, you build a model that "under-promises and over-delivers". This isn't just math; it's a customer service strategy built into code.&lt;/p&gt;
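&lt;p&gt;Here's a minimal sketch of that idea in plain Python. The 3x late penalty is an arbitrary illustration; the real weight should come from what a late delivery actually costs your business:&lt;/p&gt;

```python
def asymmetric_loss(y_true, y_pred, late_weight=3.0):
    """Squared error that charges more when we under-predict (the delivery is late)."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        err = actual - predicted       # positive err: we promised less time than it took
        if err > 0:
            total += late_weight * err ** 2   # late: refund, 1-star review
        else:
            total += err ** 2                 # early: pleasant surprise, cheap
    return total / len(y_true)

# Predicted 30 min; arrived in 40 (late) vs. arrived in 20 (early):
print(asymmetric_loss([40], [30]))   # 300.0 -- being late hurts 3x more
print(asymmetric_loss([20], [30]))   # 100.0
```

&lt;p&gt;Gradient-boosting libraries such as XGBoost and LightGBM accept custom objectives, so the same asymmetry can be pushed directly into training rather than bolted on afterwards.&lt;/p&gt;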

&lt;h3&gt;
  
  
  2. The "Within X%" Rule
&lt;/h3&gt;

&lt;p&gt;Stakeholders rarely understand what an RMSE of 45.2 means. Instead, report: &lt;em&gt;"95% of our predictions are within +/- 10% of the actual cost"&lt;/em&gt;. This is a metric a CEO can make a decision on.&lt;/p&gt;
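&lt;p&gt;This metric takes a few lines and no libraries. A quick sketch, with made-up numbers:&lt;/p&gt;

```python
def within_pct(y_true, y_pred, pct=0.10):
    """Share of predictions landing within +/- pct of the actual value."""
    hits = sum(
        1 for actual, predicted in zip(y_true, y_pred)
        if pct * abs(actual) >= abs(predicted - actual)
    )
    return hits / len(y_true)

actuals = [100, 200, 50, 80]
preds = [105, 210, 70, 78]
print(f"{within_pct(actuals, preds):.0%} of predictions are within +/- 10%")
# 75% of predictions are within +/- 10%
```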

&lt;h3&gt;
  
  
  3. Compare Against the "Human" Baseline
&lt;/h3&gt;

&lt;p&gt;In every project—from my early days building HR management systems to my recent Gen AI work—I always compare the model against the current manual process. If your model has a slightly higher MSE but results in &lt;strong&gt;20% fewer stockouts&lt;/strong&gt; than the manual Excel sheet, you’ve won.&lt;/p&gt;
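&lt;p&gt;A toy version of that comparison, with invented demand numbers: the model's plan is not perfectly accurate, yet it runs out of stock far less often than the flat manual plan:&lt;/p&gt;

```python
def stockout_rate(demand, stocked):
    """Fraction of periods where actual demand exceeded what was stocked."""
    return sum(1 for d, s in zip(demand, stocked) if d > s) / len(demand)

demand      = [120, 90, 150, 110, 130]
manual_plan = [100, 100, 100, 100, 100]   # the Excel sheet: one flat guess
model_plan  = [125, 85, 140, 115, 135]    # imperfect, but shaped like demand

print("manual:", stockout_rate(demand, manual_plan))   # 0.8
print("model: ", stockout_rate(demand, model_plan))    # 0.4
```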




&lt;h2&gt;
  
  
  Final Thoughts: The Evolution of a Developer
&lt;/h2&gt;

&lt;p&gt;When I was 8, I got my first low-spec PC and installed every piece of software I could find just to see what it could do. I learned by breaking things and fixing them. &lt;/p&gt;

&lt;p&gt;In AI, we "break" the business when we optimize for the wrong metrics. Don't be the developer who delivers a mathematically "perfect" model that loses the company money. Be the strategist who uses &lt;strong&gt;Thinking Frameworks&lt;/strong&gt; to solve human problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What business metric are you actually trying to move? Stop looking at the loss curve and start looking at the impact.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;I wrote a full breakdown of how to spot data leakage before it kills your production code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12"&gt;Read the whole thing here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>analytics</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The Silent Killer of AI Projects: How to Spot Data Leakage Before It Kills Your Production Code</title>
      <dc:creator>Pallab Roy</dc:creator>
      <pubDate>Sat, 04 Apr 2026 02:14:26 +0000</pubDate>
      <link>https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12</link>
      <guid>https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12</guid>
      <description>&lt;p&gt;We’ve all been there. You’ve spent weeks cleaning data, engineering features, and tuning your model. You hit "Run," and the results are breathtaking: &lt;strong&gt;99.8% accuracy.&lt;/strong&gt; You celebrate. You might even start drafting the "Project Success" email to your stakeholders.&lt;/p&gt;

&lt;p&gt;But then, you deploy to production, and the model collapses. It’s not just performing poorly; it’s guessing.&lt;/p&gt;

&lt;p&gt;Welcome to the world of &lt;strong&gt;Data Leakage&lt;/strong&gt;. In my journey from a middle-class Bengali home—where I used to tear down motors and speakers to see how they worked—to building predictive Gen AI tools for French hotels, I’ve learned that the "guts" of a model matter more than the shiny exterior.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Data Leakage?
&lt;/h2&gt;

&lt;p&gt;Data leakage occurs when your training data accidentally contains information from the future, or information that simply won't be available at the moment you need to make a real-world prediction. &lt;/p&gt;

&lt;p&gt;It’s like giving a student the answer key inside the exam paper. They aren't learning the concepts; they are just reading the answers.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Hospital Readmission" Trap
&lt;/h3&gt;

&lt;p&gt;Imagine you are building a model to predict if a patient will be readmitted to the hospital. You include a feature: &lt;em&gt;"Follow-up appointment scheduled"&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Leak:&lt;/strong&gt; That appointment is usually scheduled &lt;strong&gt;after&lt;/strong&gt; the decision to discharge or readmit is made. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Result:&lt;/strong&gt; The model "predicts" the readmission perfectly because it sees the scheduled appointment that only exists &lt;em&gt;because&lt;/em&gt; the patient was readmitted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcakmpadjwf40pty23je.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcakmpadjwf40pty23je.png" alt="Image" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3 Red Flags That Your Code is "Cheating"
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The "Too Good to be True" Metric
&lt;/h3&gt;

&lt;p&gt;If your R-squared is 0.99 or your RMSE is near zero on your first attempt, don't celebrate—investigate. In the &lt;strong&gt;Regression Thinking Framework&lt;/strong&gt;, we call this a "warning sign," not a success. Check for any feature that has a suspiciously high correlation (&amp;gt;0.95) with your target.&lt;/p&gt;
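&lt;p&gt;You can run this check with nothing but the standard library. A sketch (the feature names and values are hypothetical):&lt;/p&gt;

```python
def pearson(xs, ys):
    """Plain Pearson correlation, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def suspicious_features(features, target, threshold=0.95):
    """Names of features that track the target a little too perfectly."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) > threshold]

target = [10, 20, 30, 40, 50]
features = {
    "marketing_spend": [3, 1, 4, 1, 5],                  # noisy, plausibly predictive
    "invoice_total":   [10.1, 19.9, 30.2, 40.0, 49.8],   # only exists after the sale: leak
}
print(suspicious_features(features, target))   # ['invoice_total']
```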

&lt;h3&gt;
  
  
  2. The Time-Traveler's Split
&lt;/h3&gt;

&lt;p&gt;One of the biggest mistakes I see is using a &lt;strong&gt;Random 80/20 Split&lt;/strong&gt; on time-series data. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Error:&lt;/strong&gt; A random split lets the model train on rows from next month while being tested on rows from last week. It gets to see the future it is supposed to predict. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Fix:&lt;/strong&gt; Use a &lt;strong&gt;Time-Based Split&lt;/strong&gt;. Train on months 1–10 and test on months 11–12. This mimics the real world, where the future is always unknown.&lt;/li&gt;
&lt;/ul&gt;
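&lt;p&gt;scikit-learn's &lt;code&gt;TimeSeriesSplit&lt;/code&gt; handles this for cross-validation; for a single split, a few lines of plain Python are enough. A sketch (the &lt;code&gt;month&lt;/code&gt; key and the 80/20 cut are illustrative):&lt;/p&gt;

```python
def time_based_split(rows, train_frac=0.8):
    """Train on the past, test on the most recent slice -- never shuffle time."""
    ordered = sorted(rows, key=lambda r: r["month"])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]

rows = [{"month": m, "sales": 100 + m} for m in range(1, 13)]
train, test = time_based_split(rows)
print([r["month"] for r in train])   # months 1-9 (12 * 0.8 = 9.6 -> 9 rows)
print([r["month"] for r in test])    # months 10-12
```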

&lt;h3&gt;
  
  
  3. The "Post-Event" Feature
&lt;/h3&gt;

&lt;p&gt;In my current work with Gen AI predicting fruit and vegetable prices for restaurants, we scrape news data to label and summarize trends. If we included the "Final Market Price" as a feature to predict the "Expected Price," the model would be useless. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rule of Thumb:&lt;/strong&gt; Ask yourself: &lt;em&gt;"Will I actually have this specific piece of data at 9:00 AM on the day I need the prediction?"&lt;/em&gt; If the answer is no, delete the feature.&lt;/li&gt;
&lt;/ul&gt;
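&lt;p&gt;That rule of thumb can even be encoded as a tiny audit. Everything here (feature names and availability times) is hypothetical:&lt;/p&gt;

```python
# When does each candidate feature actually exist? (HH:MM, 24-hour clock)
AVAILABLE_AT = {
    "yesterday_close_price":  "06:00",   # published overnight: safe
    "morning_news_sentiment": "08:30",   # scraped before opening: safe
    "final_market_price":     "18:00",   # known only after close: leak
}

def usable_features(prediction_time="09:00"):
    """Keep only the features that already exist when the prediction is made."""
    return [name for name, available in AVAILABLE_AT.items()
            if prediction_time >= available]

print(usable_features())   # ['yesterday_close_price', 'morning_news_sentiment']
```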




&lt;h2&gt;
  
  
  The Diagnostic Protocol: How to Protect Your Code
&lt;/h2&gt;

&lt;p&gt;Before you ship, run this "Audit":&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feature Audit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flag any feature that wouldn't exist at the time of prediction.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Correlation Check&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Identify features that "explain" the target too perfectly.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Split Strategy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;TimeSeriesSplit&lt;/code&gt; for temporal data or &lt;code&gt;GroupKFold&lt;/code&gt; for customer-based data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feature Importance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;If a "suspicious" feature is in your Top 3, investigate it immediately.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
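&lt;p&gt;For the customer-based row of that table, scikit-learn's &lt;code&gt;GroupKFold&lt;/code&gt; generalizes the idea; a minimal single-split sketch of the same principle, with hypothetical data:&lt;/p&gt;

```python
def group_holdout(rows, test_groups):
    """Hold out entire customers so the same customer never appears on both sides."""
    train = [r for r in rows if r["customer"] not in test_groups]
    test = [r for r in rows if r["customer"] in test_groups]
    return train, test

rows = [
    {"customer": "A", "order": 1}, {"customer": "A", "order": 2},
    {"customer": "B", "order": 3}, {"customer": "C", "order": 4},
]
train, test = group_holdout(rows, test_groups={"C"})
print(len(train), len(test))   # 3 1
```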




&lt;h2&gt;
  
  
  Final Thoughts: Curiosity is Your Best Defense
&lt;/h2&gt;

&lt;p&gt;When I was a kid, I didn't just play with toys; I wanted to know the "functionality behind the cool toy". Engineering is the same. Don't just look at the accuracy score; look at the &lt;strong&gt;why&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The most dangerous models aren't the ones that fail; they're the ones that give you &lt;strong&gt;confidently wrong answers&lt;/strong&gt; because they were allowed to cheat during training. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have you ever been burned by a 99% accuracy model that failed in production? Let’s discuss in the comments.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>automation</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
