
Chanchal Singh

Day 5: Is Your Model Actually Good? - Evaluation Metrics

You prepare for an exam.

You take a mock test.
You get 72 marks.

Now the real question is not:

“Did I pass or fail?”

The real question is:

“How good is 72?”

Is it:

  • Better than before?
  • Good enough?
  • Just lucky?

That’s exactly what model evaluation is about.


Why We Need Evaluation

A model can always give predictions.

But predictions alone mean nothing.

We must ask:

  • Can I trust this model?
  • Will it work on new data?
  • Is it learning patterns or memorizing data?

Evaluation answers these questions.


R-squared (R²): The Most Popular Metric

Imagine this.

You’re trying to predict house prices.

Before using ML, your best guess is:

“All houses cost around ₹50 lakh.”

That’s your baseline.

Now your model predicts different prices for different houses.


R² asks:

“How much better is your model compared to this dumb guess?”


R² in simple words

R² tells you how much of the variation in the data your model explains.


  • R² = 0.80 → model explains 80% of the pattern
  • R² = 0.20 → model explains very little
  • R² = 1.00 → perfect fit (rare, and usually suspicious)
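
Here's a minimal sketch of that comparison in Python. The prices are made up for illustration; the point is that R² measures your model's squared errors against the squared errors of the "always guess the mean" baseline:

```python
# A minimal sketch of what R² measures (illustrative numbers, not real data).
import numpy as np

actual = np.array([45, 52, 60, 48, 55])      # actual prices (₹ lakh)
predicted = np.array([47, 50, 58, 49, 56])   # model's predictions

baseline = actual.mean()                     # the "dumb guess": one price for every house

ss_res = np.sum((actual - predicted) ** 2)   # model's squared errors
ss_tot = np.sum((actual - baseline) ** 2)    # baseline's squared errors

r2 = 1 - ss_res / ss_tot                     # R² = 1 − SS_res / SS_tot
print(f"R² = {r2:.2f}")                      # ≈ 0.90 → much better than the baseline
```

In practice you'd call sklearn.metrics.r2_score(actual, predicted), which computes the same thing.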

Important Truth About R²

A high R² does not always mean a good model.

Why?

  • It can overfit
  • It can memorize
  • It can fail on new data

That’s why we never trust R² alone.


Residuals: Listening to the Model’s Mistakes

Residual = actual value − predicted value.

Think of residuals as the model's complaints.

If residuals look:

  • Random → model is healthy
  • Patterned → model is missing something

Residual plots help us see:

“Is the model behaving logically?”
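
A quick way to check is a residual plot: predictions on the x-axis, residuals on the y-axis. Here's a small sketch with hypothetical numbers, using matplotlib:

```python
# A small sketch of a residual plot (hypothetical numbers).
import numpy as np
import matplotlib.pyplot as plt

actual = np.array([45, 52, 60, 48, 55])
predicted = np.array([47, 50, 58, 49, 56])

residuals = actual - predicted               # residual = actual − predicted

plt.scatter(predicted, residuals)            # residuals vs. predictions
plt.axhline(0, color="red", linestyle="--")  # the "zero mistake" line
plt.xlabel("Predicted price (₹ lakh)")
plt.ylabel("Residual")
plt.title("Healthy residuals scatter randomly around zero")
plt.show()
```

If the points form a curve, a funnel, or any visible shape, the model is missing something.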


Standard Error (SE): How Confident Is the Model?

Imagine two friends predicting house prices.

Friend A:

  • Usually wrong by ₹5,000

Friend B:

  • Usually wrong by ₹50,000

Who do you trust more?

Standard Error tells you:

“On average, how far predictions fall from the truth.”

Lower SE = more reliable model.
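
Here's a rough sketch of that comparison, measuring the typical miss as root-mean-squared error (one common way to compute it; the friends' numbers are invented):

```python
# Comparing two "friends" by the typical size of their errors.
# (Measured here as root-mean-squared error; prices in ₹ lakh.)
import numpy as np

actual = np.array([50.0, 55.0, 60.0])

friend_a = np.array([50.05, 54.95, 60.05])   # off by ~0.05 lakh (₹5,000) each time
friend_b = np.array([50.5, 54.5, 60.5])      # off by ~0.5 lakh (₹50,000) each time

def typical_error(pred):
    return np.sqrt(np.mean((actual - pred) ** 2))  # typical distance from the truth

print(f"Friend A: ₹{typical_error(friend_a) * 100_000:,.0f}")  # ≈ ₹5,000
print(f"Friend B: ₹{typical_error(friend_b) * 100_000:,.0f}")  # ≈ ₹50,000
```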


Train vs Test Performance (Very Important)

If:

  • Training accuracy is very high
  • Testing accuracy is low

That means:

The model memorized instead of learning.

This is how we detect overfitting.

Like a student who learns answers by heart but fails when the questions change: the model knows the past too well, but can't handle anything new.
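
Here's a minimal sketch of that check, assuming scikit-learn and a deliberately memorization-prone model (a decision tree with unlimited depth) on synthetic data:

```python
# Detecting overfitting by comparing train vs. test scores (synthetic data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X.ravel() + rng.normal(0, 2, size=200)   # simple pattern + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeRegressor().fit(X_train, y_train)  # unlimited depth → can memorize

print("Train R²:", model.score(X_train, y_train))  # ≈ 1.0 (memorized, noise and all)
print("Test R²:", model.score(X_test, y_test))     # noticeably lower → overfitting
```

A big gap between the two scores is the red flag.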


Tiny Real-Life Thought 🧠

If someone always scores high in practice tests
but fails in the real exam —

you know something is wrong.

Same with ML models.


3-Line Takeaway

  • Evaluation tells you whether a model is trustworthy
  • R² shows how much variation the model explains
  • SE shows how reliable the predictions are

What’s Coming Next 👀

Now the big question:

Why do some models fail even when metrics look good?

That leads us to:

👉 Day 6 — Why Linear Regression Breaks (Assumptions & Multicollinearity)

I love breaking down complex topics into simple, easy-to-understand explanations so everyone can follow along. If you're into learning AI in a beginner-friendly way, make sure to follow for more!

Connect on LinkedIn: https://www.linkedin.com/in/chanchalsingh22/
Connect on YouTube: https://www.youtube.com/@Brains_Behind_Bots
