I Want To Learn Programming

Posted on • Originally published at iwtlp.com

Machine learning from scratch: what to build before using scikit-learn

model.fit(X, y) is one line, and that is the problem. scikit-learn is excellent, but reaching for it before you understand what it does leaves you able to call models without being able to reason about them. Build a few core pieces by hand first, and the rest of machine learning becomes readable.

1. Linear regression

Start here because everything else generalizes from it. A linear model is just prediction = weights · features + bias, where · is the dot product. Building it teaches you what a model parameter is and what "fitting" means: finding the weights that make predictions close to the truth.

def predict(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b
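To see the function do something concrete, here is a minimal sketch that applies it to made-up numbers. The weights, bias, and the predict_batch helper are illustrative assumptions, not values from a trained model.

```python
def predict(x, w, b):
    # Dot product of weights and features, plus the bias term.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_batch(X, w, b):
    # Hypothetical helper: apply predict to every row of a dataset.
    return [predict(x, w, b) for x in X]


# Assumed example parameters: two features, one bias.
w = [2.0, -1.0]
b = 0.5

single = predict([3.0, 4.0], w, b)                    # 2*3 + (-1)*4 + 0.5
batch = predict_batch([[3.0, 4.0], [0.0, 0.0]], w, b)
print(single, batch)
```

With all-zero features the prediction is just the bias, which is a handy sanity check when you write your own version.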

2. A loss function

You cannot improve what you cannot measure. A loss function scores how wrong the predictions are (mean squared error for regression). Once you have written one, "training" stops being magic: it is just making this number go down.

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)
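A tiny worked example makes the score less abstract. The sketch below uses invented predictions and targets: the errors are 1 and -2, so the squared errors are 1 and 4, and their mean is 2.5.

```python
def mse(preds, ys):
    # Mean of the squared differences between predictions and targets.
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Assumed toy data, chosen so the arithmetic is easy to follow by hand.
preds = [3.0, 5.0]
ys = [2.0, 7.0]

loss = mse(preds, ys)     # (1**2 + (-2)**2) / 2
perfect = mse(ys, ys)     # identical lists give zero loss
print(loss, perfect)
```

Because the errors are squared, the miss of 2 contributes four times as much as the miss of 1. That is why mean squared error punishes large mistakes heavily.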

3. Gradient descent

This is the engine under almost every model. Compute the gradient, which points in the direction that increases the loss fastest, then nudge the weights a small step the opposite way, and repeat. Building it by hand is the single most clarifying thing you can do in machine learning, because neural networks train with variants of this same loop.

w -= learning_rate * gradient   # the whole idea, one line
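Putting the pieces together, here is one possible sketch of a full training loop for the linear model with mean squared error, in plain Python. The function names gradient_step and fit, the toy data (y = 2x + 1), the learning rate, and the step count are all assumptions picked for illustration.

```python
def predict(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def gradient_step(X, ys, w, b, learning_rate):
    n = len(ys)
    errors = [predict(x, w, b) - y for x, y in zip(X, ys)]
    # For MSE: d(loss)/dw_j = (2/n) * sum(error_i * x_ij)
    grad_w = [2 / n * sum(e * x[j] for e, x in zip(errors, X))
              for j in range(len(w))]
    # For MSE: d(loss)/db = (2/n) * sum(error_i)
    grad_b = 2 / n * sum(errors)
    # Step against the gradient to reduce the loss.
    new_w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b


def fit(X, ys, learning_rate=0.05, steps=2000):
    # Start from all-zero parameters and repeat the update.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(steps):
        w, b = gradient_step(X, ys, w, b, learning_rate)
    return w, b


# Assumed toy data generated from y = 2x + 1, with no noise.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ys = [2 * x[0] + 1 for x in X]

w, b = fit(X, ys)
print(w, b)  # should end up close to [2.0] and 1.0
```

Try raising the learning rate well above the default here. The loss starts growing instead of shrinking, which is the quickest way to feel why that one number matters.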

4. Classification and the sigmoid

Move from predicting a number to predicting a probability. Logistic regression wraps the linear model in a sigmoid, which squashes the output into the range 0 to 1, and swaps mean squared error for a different loss (cross-entropy). Now you understand the difference between regression and classification at the level of the math, not the API.
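As a rough sketch of those two pieces in plain Python: the sigmoid, and binary cross-entropy over a list of predicted probabilities. The function names, the eps clipping constant, and the branch that avoids overflow for very negative inputs are choices made here for illustration.

```python
import math


def sigmoid(z):
    # Squash any real number into the range 0 to 1.
    # Branching on the sign keeps math.exp from overflowing.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w, b):
    # The linear model from before, wrapped in a sigmoid.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def binary_cross_entropy(probs, labels, eps=1e-12):
    # Average negative log-likelihood of the true labels (0 or 1).
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never happens
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# A score of zero means "no idea": probability exactly one half.
p = predict_proba([1.0, 1.0], [0.0, 0.0], 0.0)
loss = binary_cross_entropy([p], [1])
print(p, loss)
```

A confident wrong answer, such as a probability near 0 for a true label of 1, produces a very large loss, while a confident right answer costs almost nothing.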

5. Softmax for multiple classes

Generalize from yes/no to "which of K classes," turning a vector of scores into probabilities that sum to one. This is the output layer of nearly every neural-network classifier you will ever use.
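Here is a minimal sketch of softmax in plain Python. Subtracting the largest score before exponentiating is a common stability trick and does not change the result. The example scores are made up.

```python
import math


def softmax(scores):
    # Shift by the max so math.exp never sees a huge positive number.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    # Normalize so the outputs are positive and sum to one.
    return [e / total for e in exps]


# Assumed raw scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The highest score always gets the highest probability, and equal scores get equal probabilities, so softmax keeps the ranking while making the numbers comparable.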

Why this pays off

When you later call scikit-learn or a deep learning framework, you will know what fit is doing, why the learning rate matters, what overfitting looks like, and how to debug a model that will not learn. The library becomes a convenience, not a mystery. People who skip this step plateau quickly because they cannot reason about what they cannot see.

Build it, then use the library

The AI and Deep Learning track builds exactly this path: linear models, loss functions, gradient descent, classification, and softmax. Everything is written from scratch in plain Python and NumPy before any framework, and graded in your browser. The first project is free.

Build the engine once. Then the whole field opens up.
