<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pallab Roy</title>
    <description>The latest articles on DEV Community by Pallab Roy (@pallab_roy).</description>
    <link>https://dev.to/pallab_roy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3860275%2Fa489f8b6-cb00-40bc-a4a5-888bce35fa3a.png</url>
      <title>DEV Community: Pallab Roy</title>
      <link>https://dev.to/pallab_roy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pallab_roy"/>
    <language>en</language>
    <item>
      <title>Stop Optimizing for MSE: Why Your Business Metrics Matter More Than Your Loss Function</title>
      <dc:creator>Pallab Roy</dc:creator>
      <pubDate>Sat, 04 Apr 2026 03:10:14 +0000</pubDate>
      <link>https://dev.to/pallab_roy/stop-optimizing-for-mse-why-your-business-metrics-matter-more-than-your-loss-function-f7e</link>
      <guid>https://dev.to/pallab_roy/stop-optimizing-for-mse-why-your-business-metrics-matter-more-than-your-loss-function-f7e</guid>
      <description>&lt;p&gt;As developers, we are trained to worship the leaderboards. We see a lower &lt;strong&gt;Mean Squared Error (MSE)&lt;/strong&gt; or a higher &lt;strong&gt;R-squared&lt;/strong&gt;, and we think we’ve won. &lt;/p&gt;

&lt;p&gt;But after half a decade in the industry—transitioning from a full-stack developer to an AI-native software engineer building Gen AI predictors for French hospitality giants—I’ve learned a hard truth: &lt;strong&gt;Your stakeholders don't care about your loss function. They care about their bottom line&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trap: When "Accurate" Models Fail the Business
&lt;/h2&gt;

&lt;p&gt;In the &lt;strong&gt;Regression Thinking Framework&lt;/strong&gt;, we learn that the loss function is just a "badness score". Most of us default to MSE because the math is "beautiful" and smooth. &lt;/p&gt;

&lt;p&gt;However, MSE is symmetric: it penalizes over-prediction and under-prediction of the same size identically (squaring only makes it chase large errors, not costly ones). In the real world, being "off" by 10 units isn't always equal. &lt;/p&gt;

&lt;h3&gt;
  
  
  The Food Delivery Disaster
&lt;/h3&gt;

&lt;p&gt;Imagine you are building a model to predict food delivery times. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scenario A (Early):&lt;/strong&gt; The model predicts 30 mins; it arrives in 20. The customer is happy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scenario B (Late):&lt;/strong&gt; The model predicts 30 mins; it arrives in 40. The customer is angry, demands a refund, and leaves a 1-star review.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The MSE Problem:&lt;/strong&gt; A standard MSE loss penalizes being 10 minutes early and 10 minutes late exactly the same. If you optimize for MSE, you are essentially telling the business that customer churn is no more expensive than a pleasant surprise.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Strategic Shift: Loss Functions are Business Decisions
&lt;/h2&gt;

&lt;p&gt;One of the most important "Thinking Frameworks" I use today is recognizing that &lt;strong&gt;the loss function is a business decision, not a technical one&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Business Need&lt;/th&gt;
&lt;th&gt;Technical Metric (Internal)&lt;/th&gt;
&lt;th&gt;Business Metric (Stakeholder)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inventory Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RMSE&lt;/td&gt;
&lt;td&gt;% of Stockouts vs. Overstock cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Medical Dosage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MAE&lt;/td&gt;
&lt;td&gt;Patient Safety Margin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Financial Forecasting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MAPE&lt;/td&gt;
&lt;td&gt;Rupee Impact per Quarter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In my current project, predicting goods prices for restaurants, a "small" error in predicting the price of high-volume items like onions is far more catastrophic than a "large" error on a rare spice. We had to move beyond simple MSE to ensure the model respected the &lt;strong&gt;asymmetric costs&lt;/strong&gt; of the restaurant's wallet.&lt;/p&gt;




&lt;h2&gt;
  
  
  3 Ways to Align Your Model with Reality
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Build an Asymmetric Loss
&lt;/h3&gt;

&lt;p&gt;If being late costs more than being early, tell your model. By penalizing under-prediction more heavily than over-prediction, you build a model that "under-promises and over-delivers". This isn't just math; it's a customer service strategy built into code.&lt;/p&gt;
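&lt;p&gt;Here's a minimal sketch of that idea in plain Python. The 3x late penalty is an arbitrary illustration; the real weight should come from what a late delivery actually costs your business:&lt;/p&gt;

```python
def asymmetric_loss(y_true, y_pred, late_weight=3.0):
    """Squared error that charges more when we under-predict (the delivery is late)."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        err = actual - predicted       # positive err: we promised less time than it took
        if err > 0:
            total += late_weight * err ** 2   # late: refund, 1-star review
        else:
            total += err ** 2                 # early: pleasant surprise, cheap
    return total / len(y_true)

# Predicted 30 min; arrived in 40 (late) vs. arrived in 20 (early):
print(asymmetric_loss([40], [30]))   # 300.0 -- being late hurts 3x more
print(asymmetric_loss([20], [30]))   # 100.0
```

&lt;p&gt;Gradient-boosting libraries such as XGBoost and LightGBM accept custom objectives, so the same asymmetry can be pushed directly into training rather than bolted on afterwards.&lt;/p&gt;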

&lt;h3&gt;
  
  
  2. The "Within X%" Rule
&lt;/h3&gt;

&lt;p&gt;Stakeholders rarely understand what an RMSE of 45.2 means. Instead, report: &lt;em&gt;"95% of our predictions are within +/- 10% of the actual cost"&lt;/em&gt;. This is a metric a CEO can make a decision on.&lt;/p&gt;
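&lt;p&gt;This metric takes a few lines and no libraries. A quick sketch, with made-up numbers:&lt;/p&gt;

```python
def within_pct(y_true, y_pred, pct=0.10):
    """Share of predictions landing within +/- pct of the actual value."""
    hits = sum(
        1 for actual, predicted in zip(y_true, y_pred)
        if pct * abs(actual) >= abs(predicted - actual)
    )
    return hits / len(y_true)

actuals = [100, 200, 50, 80]
preds = [105, 210, 70, 78]
print(f"{within_pct(actuals, preds):.0%} of predictions are within +/- 10%")
# 75% of predictions are within +/- 10%
```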

&lt;h3&gt;
  
  
  3. Compare Against the "Human" Baseline
&lt;/h3&gt;

&lt;p&gt;In every project—from my early days building HR management systems to my recent Gen AI work—I always compare the model against the current manual process. If your model has a slightly higher MSE but results in &lt;strong&gt;20% fewer stockouts&lt;/strong&gt; than the manual Excel sheet, you’ve won.&lt;/p&gt;
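&lt;p&gt;A toy version of that comparison, with invented demand numbers: the model's plan is not perfectly accurate, yet it runs out of stock far less often than the flat manual plan:&lt;/p&gt;

```python
def stockout_rate(demand, stocked):
    """Fraction of periods where actual demand exceeded what was stocked."""
    return sum(1 for d, s in zip(demand, stocked) if d > s) / len(demand)

demand      = [120, 90, 150, 110, 130]
manual_plan = [100, 100, 100, 100, 100]   # the Excel sheet: one flat guess
model_plan  = [125, 85, 140, 115, 135]    # imperfect, but shaped like demand

print("manual:", stockout_rate(demand, manual_plan))   # 0.8
print("model: ", stockout_rate(demand, model_plan))    # 0.4
```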




&lt;h2&gt;
  
  
  Final Thoughts: The Evolution of a Developer
&lt;/h2&gt;

&lt;p&gt;When I was 8, I got my first low-spec PC and installed every piece of software I could find just to see what it could do. I learned by breaking things and fixing them. &lt;/p&gt;

&lt;p&gt;In AI, we "break" the business when we optimize for the wrong metrics. Don't be the developer who delivers a mathematically "perfect" model that loses the company money. Be the strategist who uses &lt;strong&gt;Thinking Frameworks&lt;/strong&gt; to solve human problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What business metric are you actually trying to move? Stop looking at the loss curve and start looking at the impact.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;I wrote a full breakdown of how to spot data leakage before it kills your production code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12"&gt;Read the whole thing here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>analytics</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The Silent Killer of AI Projects: How to Spot Data Leakage Before It Kills Your Production Code</title>
      <dc:creator>Pallab Roy</dc:creator>
      <pubDate>Sat, 04 Apr 2026 02:14:26 +0000</pubDate>
      <link>https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12</link>
      <guid>https://dev.to/pallab_roy/the-silent-killer-of-ai-projects-how-to-spot-data-leakage-before-it-kills-your-production-code-2f12</guid>
      <description>&lt;p&gt;We’ve all been there. You’ve spent weeks cleaning data, engineering features, and tuning your model. You hit "Run," and the results are breathtaking: &lt;strong&gt;99.8% accuracy.&lt;/strong&gt; You celebrate. You might even start drafting the "Project Success" email to your stakeholders.&lt;/p&gt;

&lt;p&gt;But then, you deploy to production, and the model collapses. It’s not just performing poorly; it’s guessing.&lt;/p&gt;

&lt;p&gt;Welcome to the world of &lt;strong&gt;Data Leakage&lt;/strong&gt;. In my journey from a middle-class Bengali home—where I used to tear down motors and speakers to see how they worked—to building predictive Gen AI tools for French hotels, I’ve learned that the "guts" of a model matter more than the shiny exterior.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Data Leakage?
&lt;/h2&gt;

&lt;p&gt;Data leakage occurs when your training data accidentally contains information from the future, or information that simply won't be available at the moment you need to make a real-world prediction. &lt;/p&gt;

&lt;p&gt;It’s like giving a student the answer key inside the exam paper. They aren't learning the concepts; they are just reading the answers.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Hospital Readmission" Trap
&lt;/h3&gt;

&lt;p&gt;Imagine you are building a model to predict if a patient will be readmitted to the hospital. You include a feature: &lt;em&gt;"Follow-up appointment scheduled"&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Leak:&lt;/strong&gt; That appointment is usually scheduled &lt;strong&gt;after&lt;/strong&gt; the decision to discharge or readmit is made. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Result:&lt;/strong&gt; The model "predicts" the readmission perfectly because it sees the scheduled appointment that only exists &lt;em&gt;because&lt;/em&gt; the patient was readmitted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcakmpadjwf40pty23je.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcakmpadjwf40pty23je.png" alt="Image" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3 Red Flags That Your Code is "Cheating"
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The "Too Good to be True" Metric
&lt;/h3&gt;

&lt;p&gt;If your R-squared is 0.99 or your RMSE is near zero on your first attempt, don't celebrate—investigate. In the &lt;strong&gt;Regression Thinking Framework&lt;/strong&gt;, we call this a "warning sign," not a success. Check for any feature that has a suspiciously high correlation (&amp;gt;0.95) with your target.&lt;/p&gt;
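&lt;p&gt;You can run this check with nothing but the standard library. A sketch (the feature names and values are hypothetical):&lt;/p&gt;

```python
def pearson(xs, ys):
    """Plain Pearson correlation, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def suspicious_features(features, target, threshold=0.95):
    """Names of features that track the target a little too perfectly."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) > threshold]

target = [10, 20, 30, 40, 50]
features = {
    "marketing_spend": [3, 1, 4, 1, 5],                  # noisy, plausibly predictive
    "invoice_total":   [10.1, 19.9, 30.2, 40.0, 49.8],   # only exists after the sale: leak
}
print(suspicious_features(features, target))   # ['invoice_total']
```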

&lt;h3&gt;
  
  
  2. The Time-Traveler's Split
&lt;/h3&gt;

&lt;p&gt;One of the biggest mistakes I see is using a &lt;strong&gt;Random 80/20 Split&lt;/strong&gt; on time-series data. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Error:&lt;/strong&gt; A random split lets the model train on rows from next month while being tested on rows from last week. It gets to see the future it is supposed to predict. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Fix:&lt;/strong&gt; Use a &lt;strong&gt;Time-Based Split&lt;/strong&gt;. Train on months 1–10 and test on months 11–12. This mimics the real world, where the future is always unknown.&lt;/li&gt;
&lt;/ul&gt;
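&lt;p&gt;scikit-learn's &lt;code&gt;TimeSeriesSplit&lt;/code&gt; handles this for cross-validation; for a single split, a few lines of plain Python are enough. A sketch (the &lt;code&gt;month&lt;/code&gt; key and the 80/20 cut are illustrative):&lt;/p&gt;

```python
def time_based_split(rows, train_frac=0.8):
    """Train on the past, test on the most recent slice -- never shuffle time."""
    ordered = sorted(rows, key=lambda r: r["month"])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]

rows = [{"month": m, "sales": 100 + m} for m in range(1, 13)]
train, test = time_based_split(rows)
print([r["month"] for r in train])   # months 1-9 (12 * 0.8 = 9.6 -> 9 rows)
print([r["month"] for r in test])    # months 10-12
```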

&lt;h3&gt;
  
  
  3. The "Post-Event" Feature
&lt;/h3&gt;

&lt;p&gt;In my current work with Gen AI predicting fruit and vegetable prices for restaurants, we scrape news data to label and summarize trends. If we included the "Final Market Price" as a feature to predict the "Expected Price," the model would be useless. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rule of Thumb:&lt;/strong&gt; Ask yourself: &lt;em&gt;"Will I actually have this specific piece of data at 9:00 AM on the day I need the prediction?"&lt;/em&gt; If the answer is no, delete the feature.&lt;/li&gt;
&lt;/ul&gt;
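&lt;p&gt;That rule of thumb can even be encoded as a tiny audit. Everything here (feature names and availability times) is hypothetical:&lt;/p&gt;

```python
# When does each candidate feature actually exist? (HH:MM, 24-hour clock)
AVAILABLE_AT = {
    "yesterday_close_price":  "06:00",   # published overnight: safe
    "morning_news_sentiment": "08:30",   # scraped before opening: safe
    "final_market_price":     "18:00",   # known only after close: leak
}

def usable_features(prediction_time="09:00"):
    """Keep only the features that already exist when the prediction is made."""
    return [name for name, available in AVAILABLE_AT.items()
            if prediction_time >= available]

print(usable_features())   # ['yesterday_close_price', 'morning_news_sentiment']
```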




&lt;h2&gt;
  
  
  The Diagnostic Protocol: How to Protect Your Code
&lt;/h2&gt;

&lt;p&gt;Before you ship, run this "Audit":&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feature Audit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flag any feature that wouldn't exist at the time of prediction.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Correlation Check&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Identify features that "explain" the target too perfectly.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Split Strategy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;TimeSeriesSplit&lt;/code&gt; for temporal data or &lt;code&gt;GroupKFold&lt;/code&gt; for customer-based data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feature Importance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;If a "suspicious" feature is in your Top 3, investigate it immediately.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
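&lt;p&gt;For the customer-based row of that table, scikit-learn's &lt;code&gt;GroupKFold&lt;/code&gt; generalizes the idea; a minimal single-split sketch of the same principle, with hypothetical data:&lt;/p&gt;

```python
def group_holdout(rows, test_groups):
    """Hold out entire customers so the same customer never appears on both sides."""
    train = [r for r in rows if r["customer"] not in test_groups]
    test = [r for r in rows if r["customer"] in test_groups]
    return train, test

rows = [
    {"customer": "A", "order": 1}, {"customer": "A", "order": 2},
    {"customer": "B", "order": 3}, {"customer": "C", "order": 4},
]
train, test = group_holdout(rows, test_groups={"C"})
print(len(train), len(test))   # 3 1
```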




&lt;h2&gt;
  
  
  Final Thoughts: Curiosity is Your Best Defense
&lt;/h2&gt;

&lt;p&gt;When I was a kid, I didn't just play with toys; I wanted to know the "functionality behind the cool toy". Engineering is the same. Don't just look at the accuracy score; look at the &lt;strong&gt;why&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The most dangerous models aren't the ones that fail; they're the ones that give you &lt;strong&gt;confidently wrong answers&lt;/strong&gt; because they were allowed to cheat during training. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have you ever been burned by a 99% accuracy model that failed in production? Let’s discuss in the comments.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>automation</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
