Okay, real talk.
When my classmates were installing TensorFlow and building "AI projects" for their resumes, I was sitting in the library reading about probability distributions and hypothesis testing.
They laughed. I felt behind.
Fast forward to our final year and now I'm the one they're texting at 11 PM asking "bro what even is overfitting?"
Funny how that works.
## What I Noticed Among My Peers
There's a pattern I've watched play out all semester. A friend jumps into a PyTorch tutorial, builds something that "works," and then hits a wall the moment anything breaks or needs explaining.
They can run the model. They just can't think about it.
Here's what I keep hearing:
- "Why is my model performing perfectly on training data but terrible on test data?"
- "What does this ROC curve actually mean?"
- "Why does the learning rate matter so much?"
- "What do you mean the data has bias — it's just numbers?"
These aren't framework problems. These are statistics problems. And no amount of `model.fit()` will fix them if you don't understand what's happening underneath.
The hard truth? Most ML tutorials teach you to drive a car without explaining how the engine works. Fun until something breaks.
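That first question above is easy to reproduce. Here's a minimal sketch (the synthetic dataset and the unconstrained decision tree are chosen purely for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with deliberately noisy labels (flip_y), so a model
# that memorizes the training set cannot generalize.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unconstrained tree memorizes the training data (high variance).
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"train accuracy: {train_acc:.2f}")  # perfect on training data
print(f"test accuracy:  {test_acc:.2f}")   # noticeably lower on test data
```

The gap between those two numbers is the whole story: the model learned the noise, not the signal.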
## Why Statistics Is Secretly the Foundation of AI
Here's something nobody tells you in year one:
Machine Learning is just applied statistics with better marketing.
I'm only slightly joking.
Every ML concept you struggle with maps almost perfectly to a statistics concept:
| What ML calls it | What it actually is |
|---|---|
| Overfitting | High variance in your model |
| Regularization | Penalizing complexity to reduce variance |
| Logistic Regression | Probability estimation using the sigmoid function |
| Loss Function | A formalized way of measuring error distributions |
| Gradient Descent | Optimization over a cost surface (calculus + stats) |
| Evaluation Metrics | Hypothesis testing applied to model predictions |
When my professor introduced logistic regression, half the class stared blankly. I understood it immediately, because I already knew that the sigmoid function outputs a probability between 0 and 1. I'd spent weeks thinking about probability. It clicked in seconds.
## Realizations That Changed How I See ML
Here are three moments where my statistics background saved me:
### 1. Probability → Logistic Regression
Logistic regression doesn't predict a class. It predicts the probability of a class. If you've never thought carefully about what probability means (conditional probability, Bayes' theorem, odds ratios), that distinction will confuse you endlessly.
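A quick way to see this in plain NumPy (the weights here are made up purely for illustration):

```python
import numpy as np

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression computes z = w.x + b, then squashes it.
# These coefficients are hypothetical, chosen just to demo the math.
w, b = np.array([1.5, -2.0]), 0.3
x = np.array([0.8, 0.2])

p = sigmoid(w @ x + b)   # P(class = 1 | x)
odds = p / (1 - p)       # the "odds" statisticians talk about
print(round(p, 3))
print(round(odds, 3))
```

Note that the odds equal e^z exactly, which is why logistic regression coefficients are read as log-odds.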
### 2. Distributions → Data Modeling
Before you model anything, you need to understand what your data looks like. Is it normally distributed? Skewed? Heavy-tailed? This changes everything — which algorithm you use, how you preprocess, whether your assumptions hold.
My peers skip this step. Then their model "doesn't work" and they have no idea why.
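Checking this takes a few lines. A sketch with SciPy, where a log-normal sample stands in for typical skewed real-world data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_data = rng.normal(loc=50, scale=5, size=1000)
skewed_data = rng.lognormal(mean=3, sigma=0.8, size=1000)  # heavy right tail

for name, data in [("normal", normal_data), ("skewed", skewed_data)]:
    skew = stats.skew(data)
    _, p = stats.normaltest(data)  # H0: sample comes from a normal dist.
    print(f"{name}: skewness={skew:.2f}, normality-test p-value={p:.3f}")
```

Two lines of output, and you already know whether "just take the mean" or "fit a linear model" is a reasonable starting point.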
### 3. Variance → Overfitting
Overfitting sounds mysterious until you realize it's literally the definition of high variance: your model is too sensitive to the training data. Understanding the bias-variance tradeoff isn't an advanced topic. It's a statistics 101 concept that makes ML model evaluation finally make sense.
Once you see it, you can't unsee it.
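One way to see variance directly: fit the same noisy points with a modest polynomial and an extravagant one (pure NumPy, toy data):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.3, 20)  # noisy samples
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test)                             # true curve

results = {}
for degree in (3, 12):
    coefs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```

The flexible model always wins on training error; whether it wins on test error is exactly the bias-variance question.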
## If You're Starting Today, Follow This Roadmap
You don't need years. You need the right order.
### Step 1 — Statistics Basics (3–4 weeks)
- Mean, median, variance, standard deviation
- Probability and conditional probability
- Bayes' theorem (yes, now not later)
- Distributions: normal, binomial, Poisson
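Bayes' theorem is worth a worked example this early. The classic one: a test that's 99% accurate for a condition only 1% of people have (these are the standard textbook numbers, not real data):

```python
# P(disease | positive test) via Bayes' theorem:
#   P(D|+) = P(+|D) * P(D) / P(+)
p_disease = 0.01             # prior: 1% of the population
p_pos_given_disease = 0.99   # sensitivity
p_pos_given_healthy = 0.01   # false-positive rate

# Law of total probability: overall chance of testing positive.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {p_disease_given_pos:.2f}")  # 0.50, not 0.99
```

A "99% accurate" test gives a coin-flip posterior. That same base-rate logic is why accuracy on imbalanced data is misleading.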
### Step 2 — Data Analysis Thinking (2–3 weeks)
- Exploratory Data Analysis (EDA)
- Correlation vs. causation (this will save your life)
- Hypothesis testing and p-values
- Handling missing data, outliers, skewness
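Hypothesis testing is a few lines once you know what the null hypothesis is. A sketch with SciPy, where the synthetic samples stand in for, say, two model variants' accuracy across runs:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.80, scale=0.05, size=30)  # e.g. model A accuracies
group_b = rng.normal(loc=0.90, scale=0.05, size=30)  # e.g. model B accuracies

# H0: the two groups have the same mean.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t={t_stat:.2f}, p={p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the difference is unlikely to be chance.")
```

That "is this difference real or just noise?" question is exactly what you'll ask every time you compare two models.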
### Step 3 — Python + Data Tools (2–3 weeks)
- NumPy, Pandas, Matplotlib, Seaborn
- Practice EDA on real datasets (Kaggle is your friend)
- Learn to question your data before modeling it
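A first-pass EDA habit, sketched on a toy DataFrame (the columns are made up; in practice this would come from `pd.read_csv`):

```python
import numpy as np
import pandas as pd

# Toy dataset with the usual problems baked in: a NaN and an outlier.
df = pd.DataFrame({
    "age": [22, 25, np.nan, 31, 120, 28],
    "income": [30_000, 42_000, 38_000, 52_000, 41_000, np.nan],
})

print(df.describe())    # distribution summary: mean, std, quartiles
print(df.isna().sum())  # missing values per column
print(df["age"].skew()) # skewness hints at outliers / heavy tails
```

Three lines, and you've already caught an impossible age of 120 and two missing values before any model sees the data.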
### Step 4 — Machine Learning Fundamentals (4–6 weeks)
- Linear and logistic regression (you'll now actually understand these)
- Decision trees, k-NN, SVMs
- Bias-variance tradeoff, cross-validation, regularization
- scikit-learn
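The cross-validation bullet above, as a minimal scikit-learn sketch (iris is just a convenient built-in dataset):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: five train/test splits, five scores.
# The spread tells you about variance, not just average accuracy.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

Reporting the spread alongside the mean is the statistics habit showing up in ML clothing.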
### Step 5 — Deep Learning (ongoing)
- Neural networks, backpropagation, activation functions
- CNNs, RNNs, Transformers
- PyTorch or TensorFlow — now you're ready
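Even the "deep" part is less mysterious than it sounds. A forward pass of a toy one-hidden-layer network in plain NumPy (the shapes and random weights are arbitrary), just to show that the neural part is matrix multiplies plus activation functions:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

# Made-up sizes: 4 inputs -> 3 hidden units -> 1 output.
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

x = rng.normal(size=4)
h = relu(W1 @ x + b1)                      # hidden activations
y_hat = 1 / (1 + np.exp(-(W2 @ h + b2)))   # sigmoid output = probability
print(y_hat)
```

The output layer is literally the logistic regression from Step 4. Frameworks add autograd and GPUs, not new math.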
The difference? By Step 4, you won't just be copying code. You'll be reasoning about your models.
## Actionable Advice (From One Student to Another)
If you take nothing else from this, take these:
- Don't skip EDA. Ever. Look at your data before you model it. Always.
- Learn to read a confusion matrix before you learn to build a neural network.
- Understand what a p-value is — ML evaluation metrics are essentially the same idea, dressed differently.
- Build intuition before building models. StatQuest on YouTube is criminally underrated for this.
- Kaggle notebooks are your best textbook. Read others' EDA sections obsessively.
- When your model fails, think statistically first. Bad data distribution, data leakage, and class imbalance cause more failures than bad architectures.
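Two of the bullets above (confusion matrices and class imbalance) fit in one sketch: a classifier that always predicts the majority class on a synthetic 95/5 split looks great on accuracy while catching zero positives:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

rng = np.random.default_rng(0)
y_true = rng.choice([0, 1], size=1000, p=[0.95, 0.05])  # 95/5 imbalance
y_pred = np.zeros(1000, dtype=int)                      # always predict "0"

print(f"accuracy: {accuracy_score(y_true, y_pred):.2f}")  # looks impressive
print(confusion_matrix(y_true, y_pred))                   # rows = true class
print(f"recall on class 1: {recall_score(y_true, y_pred):.2f}")
```

The confusion matrix exposes in one glance what the accuracy number hides: an entire column of missed positives.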
## The Conclusion I Wish Someone Had Given Me
There's a version of you that learns TensorFlow in week one, builds a "95% accurate" model on an imbalanced dataset, puts it on your resume, and doesn't realize what went wrong until an interview.
There's another version that spends a few extra weeks understanding why things work.
That second version isn't slower. They're just building on solid ground.
The flashy frameworks come and go. The math doesn't.
And honestly? Once the statistics clicked, AI stopped feeling like magic and started feeling like a puzzle I actually knew how to solve. That shift in confidence — from copying tutorials to genuinely understanding — is worth every extra week.
AI isn't magic — it's mostly statistics wearing a hoodie.
Start there. The rest will follow.
## 📌 TL;DR
- Many CS students jump into ML frameworks without understanding the statistical foundations underneath
- Concepts like overfitting, gradient descent, loss functions, and evaluation metrics are fundamentally statistics concepts
- Learning probability, distributions, variance, and hypothesis testing first makes ML dramatically easier to understand and debug
- Suggested order: Statistics → Data Analysis → Python Tools → ML → Deep Learning
- Build intuition before you build models
## 💬 Discussion Question
Did you learn statistics before ML, or did you dive straight into frameworks?
Looking back, do you think the order mattered? Drop your experience in the comments. I'm genuinely curious how different paths shaped how people think about models. 👇
If this helped you, consider sharing it with a classmate who just installed PyTorch for the first time. They might need this more than they know. 😄



