
zkaria gamal

Building Strong ML Foundations: Chapter 2 - Classification is Now Live

A few weeks ago I published Chapter 1 of my hands-on AI tutorial series, focused on Regression. Today, I'm excited to share that Chapter 2: Classification is complete.

This series isn't just another collection of notebook tutorials. I'm building it to truly understand how these algorithms work under the hood — implementing them from scratch where it makes sense, comparing them properly, and focusing on concepts that actually matter in interviews and real projects.

What’s in Chapter 2

I implemented and analyzed five core classification algorithms:

  • Logistic Regression (implemented from scratch with NumPy, plus scikit-learn version)
  • K-Nearest Neighbors (KNN) Classifier
  • Random Forest Classifier
  • XGBoost Classifier
  • Support Vector Classifier (SVC) with different kernels
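To give a flavor of the from-scratch approach, here is a minimal sketch of binary logistic regression trained with batch gradient descent in NumPy. This is an illustrative sketch, not the repository's actual implementation; function names and hyperparameters are my own choices.

```python
import numpy as np

def sigmoid(z):
    # Logistic function, clipped for numerical stability
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train_logistic_regression(X, y, lr=0.1, epochs=1000):
    """Fit binary logistic regression with batch gradient descent on log loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)               # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples   # gradient of log loss w.r.t. weights
        grad_b = np.mean(p - y)              # gradient w.r.t. bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy usage: two well-separated Gaussian clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
w, b = train_logistic_regression(X, y)
preds = (sigmoid(X @ w + b) >= 0.5).astype(int)
print("train accuracy:", (preds == y).mean())
```

The key insight the from-scratch version makes visible: the gradient of the log loss is simply `X.T @ (p - y)`, the same form as linear regression's gradient with the prediction passed through a sigmoid.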

Key Focus Areas

This chapter goes deeper than just training models. I spent a lot of time on:

  • Visualizing decision boundaries for each algorithm
  • Understanding probability estimates and calibration
  • Bias-variance tradeoff in classification problems
  • Precision vs Recall — one of the most important topics for ML interviews. I dedicated a good portion of the chapter to explaining when to optimize for precision, when to prioritize recall, and how to use the F1-score effectively depending on the problem.
  • Confusion matrices, ROC-AUC, and proper model evaluation
  • Why ensemble methods (Random Forest and XGBoost) so often outperform single models

Everything is implemented cleanly using NumPy, scikit-learn, and XGBoost, with real datasets and detailed explanations.
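As a quick taste of the kind of comparison the chapter runs, here is a scikit-learn-only sketch (XGBoost would slot in analogously) contrasting a single decision tree with a random forest under cross-validation. The dataset and settings are my own illustrative choices, not necessarily those used in the chapter.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

# 5-fold CV accuracy: averaging many decorrelated trees reduces variance
tree_acc = cross_val_score(tree, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()
print("single tree:  ", round(tree_acc, 3))
print("random forest:", round(forest_acc, 3))
```

On most runs the forest edges out the single tree — a concrete instance of the bias-variance tradeoff the chapter explores.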

You can check out the full chapter here:

https://github.com/zkzkGamal/hands-on-ai-tutorial/tree/main/ml_fundamentals/chapter2

Chapter 1 (Regression) is available in the same repository.

Why I’m Doing This Publicly

I got tired of only knowing how to call model.fit() without understanding what was happening inside. This project is my way of forcing myself to learn deeply while creating a resource that can help others who want the same.

If you're a developer transitioning into ML, preparing for machine learning interviews, or simply want stronger fundamentals, I believe this series can be useful.

What's Next?

I'm planning Chapter 3 soon. I'm thinking about Dimensionality Reduction (PCA, t-SNE, UMAP) or Advanced Model Evaluation & Hyperparameter Tuning. Let me know in the comments what you'd like to see next.

Feedback is always welcome — whether it's about the code, explanations, or structure.

Happy to connect if you're on a similar learning journey.
