Linear regression is one of those foundational machine learning algorithms that every data practitioner encounters early on. It’s simple, intuitive, and surprisingly powerful for understanding relationships between variables. But while most tutorials show you how to implement it using libraries like scikit-learn, building it from scratch is where the real learning happens.
And honestly? It’s way simpler than it looks.
In this article, we'll walk through the full process of implementing linear regression using nothing but Python, NumPy, and some essential math. The goal is to help you understand exactly what's happening behind the scenes, step by step, in a friendly, conversational way, like you're learning alongside a buddy.
Let’s dive in.
What Is Linear Regression (in Simple Terms)?
Linear regression tries to model the relationship between a dependent variable (like house price) and one or more independent variables (like size, rooms, or location).
But let’s simplify this with a relatable example:
Imagine you’re trying to predict someone’s weight based on their height.
If you plot a bunch of height-weight data points on a graph, they tend to form a somewhat straight-line pattern.
Linear regression tries to draw the best possible line through these points.
That line is represented as:
y = mx + b
Where:
m = slope
b = intercept
x = input
y = predicted output
The goal is to find the best m and b.
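For example, with made-up numbers, a slope of m = 0.9 and an intercept of b = −70 would predict a weight of 0.9 × 180 − 70 = 92 kg for someone who is 180 cm tall. Those values are purely illustrative; finding the best ones from the data is exactly what the algorithm does.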
Why Learn Linear Regression From Scratch?
Sure, libraries handle the hard parts. But building it manually gives you:
A deeper understanding of how machine learning works.
A clear view of gradient descent and loss minimization.
More confidence when tuning or debugging models.
A strong base for learning advanced algorithms.
Think of it like learning to drive with a manual car—once you get the mechanics, everything else feels easier.
Core Concepts Behind the Algorithm
Before we code, let’s understand the math intuitively.
- Hypothesis Function
Our model’s prediction:
ŷ = mx + b
This is what the model uses to predict values.
- Loss Function (Mean Squared Error)
We measure how “bad” our predictions are:
MSE = (1/n) Σ (y − ŷ)²
Lower MSE = better model.
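In NumPy, the computation is a one-liner. Here's a minimal sketch using made-up prediction values:

```python
import numpy as np

y_true = np.array([3.0, 4.0, 2.0, 4.0, 5.0])
y_hat = np.array([2.8, 3.2, 3.6, 4.0, 4.4])  # made-up predictions, just for illustration

mse = np.mean((y_true - y_hat) ** 2)
print(mse)  # 0.72, the average of the squared errors
```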
- Gradient Descent
This is the algorithm that adjusts m and b to minimize the MSE.
Gradients:
∂MSE/∂m = −(2/n) Σ x(y − ŷ)

∂MSE/∂b = −(2/n) Σ (y − ŷ)
We update:
m = m − α · ∂MSE/∂m

b = b − α · ∂MSE/∂b
Where:
α = learning rate
Small α = slow learning
Big α = unstable updates
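To make the update rule concrete, here's a rough sketch of a single gradient descent step, using the same toy data we'll build in Step 2 below; the starting values and learning rate are arbitrary:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 4.0, 2.0, 4.0, 5.0])

m, b, alpha = 0.0, 0.0, 0.01   # arbitrary starting point and learning rate
n = float(len(X))

y_hat = m * X + b
dm = (-2/n) * np.sum(X * (y - y_hat))  # ∂MSE/∂m
db = (-2/n) * np.sum(y - y_hat)        # ∂MSE/∂b

m -= alpha * dm
b -= alpha * db
print(m, b)  # both parameters nudged toward the best-fit values
```

The full training loop simply repeats this step many times.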
Let’s Implement Linear Regression from Scratch in Python
We’ll build everything step by step.
🔧 Step 1: Import Dependencies
```python
import numpy as np
```
NumPy handles all the math cleanly.
🔧 Step 2: Create a Simple Dataset
Example dataset for demonstration:
```python
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 4, 2, 4, 5], dtype=float)
```
You can replace this with any dataset—these numbers just make it easy to visualize.
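If you'd rather start from real data, loading a file with NumPy works the same way. Here's a hypothetical sketch assuming a two-column CSV named heights_weights.csv with a header row:

```python
# Hypothetical: swap the toy arrays for a two-column CSV (height, weight).
data = np.loadtxt("heights_weights.csv", delimiter=",", skiprows=1)
X = data[:, 0]  # first column as the input feature
y = data[:, 1]  # second column as the target
```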
🔧 Step 3: Define the Linear Regression Class
We’ll create a clean, reusable class.
```python
class LinearRegressionScratch:
    def __init__(self, learning_rate=0.01, epochs=1000):
        self.lr = learning_rate
        self.epochs = epochs
        self.m = 0  # slope
        self.b = 0  # intercept

    def predict(self, X):
        return self.m * X + self.b
```
We initialize slope, intercept, and learning parameters.
🔧 Step 4: Train Using Gradient Descent
```python
    # This method lives inside the LinearRegressionScratch class.
    def fit(self, X, y):
        n = float(len(X))
        for _ in range(self.epochs):
            y_pred = self.predict(X)

            # Compute gradients of the MSE with respect to m and b
            dm = (-2/n) * np.sum(X * (y - y_pred))
            db = (-2/n) * np.sum(y - y_pred)

            # Update parameters in the opposite direction of the gradient
            self.m -= self.lr * dm
            self.b -= self.lr * db
```
This loop updates m and b until the error minimizes.
🔧 Step 5: Train and Test the Model
```python
model = LinearRegressionScratch(learning_rate=0.01, epochs=1000)
model.fit(X, y)

print("Slope (m):", model.m)
print("Intercept (b):", model.b)
print("Predictions:", model.predict(X))
```
You’ll see that the model approximates a best-fit line.
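One quick sanity check (not part of the walkthrough itself) is to compare against NumPy's closed-form least-squares fit:

```python
# Compare our gradient descent result with the exact least-squares solution.
m_exact, b_exact = np.polyfit(X, y, 1)
print("Closed-form slope:", m_exact)      # roughly 0.4 for this toy dataset
print("Closed-form intercept:", b_exact)  # roughly 2.4 for this toy dataset
```

If gradient descent is working, model.m and model.b should land close to these values.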
How Do We Know It’s Working?
After training:
Predictions start closely matching actual values.
Slope and intercept stabilize.
Loss keeps decreasing with each epoch.
If your model performs oddly:
Reduce the learning rate.
Increase epochs.
Normalize data if values differ drastically.
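If you want to watch the loss fall, here's a minimal sketch for tracking MSE over time. It reuses the class above with epochs=1 and calls fit() repeatedly, which works because the class never resets m and b between calls:

```python
# Track MSE per epoch by running one epoch at a time.
model = LinearRegressionScratch(learning_rate=0.01, epochs=1)
losses = []
for _ in range(1000):
    model.fit(X, y)  # one gradient descent step per call, since epochs=1
    losses.append(np.mean((y - model.predict(X)) ** 2))

print("First MSE:", losses[0])
print("Last MSE:", losses[-1])  # should be noticeably smaller
```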
Going One Step Further: Visualizing the Regression Line
Here’s a simple addition if you want to visualize:
```python
import matplotlib.pyplot as plt

plt.scatter(X, y, color='blue')   # the raw data points
plt.plot(X, model.predict(X))     # the fitted regression line
plt.show()
```
This helps you visually confirm the line fits the data well.
Advantages of Building Linear Regression Yourself
There are some great benefits:
- You understand the math behind ML models
No more treating algorithms as black boxes.
- You learn gradient descent deeply
This will help when learning:
Logistic regression
Neural networks
Deep learning optimization
- You practice writing clean, modular ML code
- You gain confidence to handle real-world data
Where Linear Regression Is Actually Useful
Even though it’s simple, linear regression is widely used:
Predicting sales growth trends
Estimating housing prices
Forecasting demand
Weight-height or age-income relationships
Early-stage predictive modeling
It’s especially helpful when interpretability matters.
Adding Regularization (Bonus Insight)
In real-world scenarios, data can be noisy. That’s where regularization comes in.
Two popular types:
L1 (Lasso) → can shrink some coefficients all the way to zero (handy for feature selection)
L2 (Ridge) → shrinks coefficient magnitudes without zeroing them out
These improve model generalization.
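We won't build regularization into the class here, but as a rough sketch, L2 regularization only changes the slope gradient by adding a penalty term. The lam parameter below is a hypothetical regularization strength, and the intercept is left unpenalized, which is the usual convention:

```python
import numpy as np

def ridge_gradient_step(m, b, X, y, lr=0.01, lam=0.1):
    """One L2-regularized gradient descent step (illustrative sketch)."""
    n = float(len(X))
    y_pred = m * X + b
    dm = (-2/n) * np.sum(X * (y - y_pred)) + 2 * lam * m  # penalty term shrinks m toward zero
    db = (-2/n) * np.sum(y - y_pred)                      # intercept left unpenalized
    return m - lr * dm, b - lr * db
```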
Common Mistakes Beginners Make
- Using a high learning rate
This leads to overshooting the minimum.
- Not normalizing input features
Gradient descent behaves much better with normalized data (see the standardization sketch after this list).
- Training for too few epochs
Slowly decreasing error is normal—patience helps.
- Forgetting to check assumptions
Linear regression assumes:
Linearity
No multicollinearity
Constant variance
Normally distributed errors
These aren’t strict rules, but ignoring them affects performance.
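To make the normalization point above concrete, here's a minimal standardization sketch that reuses the same X and class from earlier; the learning rate is just an illustrative choice:

```python
# Standardize the feature to zero mean and unit variance before training.
X_mean, X_std = X.mean(), X.std()
X_scaled = (X - X_mean) / X_std

model = LinearRegressionScratch(learning_rate=0.1, epochs=1000)
model.fit(X_scaled, y)

# Apply the same scaling to any new input before predicting.
new_x = (np.array([6.0]) - X_mean) / X_std  # x = 6 on the original scale
print(model.predict(new_x))
```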
Practical Tips for Better Models
Here are some real-world tips that professionals use:
Start with a small learning rate and adjust gradually.
Visualize loss every few epochs for debugging.
Use vectorized operations to speed up training.
Always inspect your data—bad inputs → bad outputs.
Think of linear regression like cooking. Even with a simple recipe, the ingredients and technique matter a lot.
Scaling to Multiple Features (Multivariate Regression)
So far, we handled only one input variable.
Real-world datasets usually have many features.
The multivariate formula becomes:
ŷ = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
Where w becomes a vector.
With NumPy, implementing multivariate regression simply requires:
Converting X to a 2D matrix
Using vectorized gradient descent
Once you grasp the single-variable case, scaling up is straightforward.
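Here's a rough vectorized sketch of what that looks like. It's a minimal illustration with made-up data, not a drop-in replacement for the class above:

```python
import numpy as np

def fit_multivariate(X, y, lr=0.01, epochs=5000):
    """Gradient descent for y ≈ X @ w + b, where X has shape (n_samples, n_features)."""
    n, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        y_pred = X @ w + b
        error = y - y_pred
        dw = (-2/n) * (X.T @ error)   # one gradient entry per weight
        db = (-2/n) * np.sum(error)
        w -= lr * dw
        b -= lr * db
    return w, b

# Made-up example: y was generated as x1 + 2*x2, so w should approach [1, 2] and b should approach 0.
X_multi = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y_multi = np.array([5.0, 4.0, 11.0, 10.0, 15.0])
print(fit_multivariate(X_multi, y_multi))
```

The only real changes from the single-variable version are that w is now a vector and the gradients are computed with matrix operations instead of scalar sums.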
Why Developers on dev.to Love This Approach
Because:
It deepens foundational knowledge.
Helps in interviews (companies love asking about gradient descent!).
Builds intuition before moving into frameworks like TensorFlow or PyTorch.
Makes debugging ML pipelines easier.
Understanding the mechanics empowers you to create better, cleaner ML solutions.
Final Thoughts: Linear Regression Teaches You How ML Thinks
Learning to implement linear regression from scratch is more than a coding exercise.
It teaches you:
How models learn
How errors decrease
What optimization means
Why math and code work together
Once you get this foundation, the jump to complex algorithms—neural networks, transformers, reinforcement learning—feels much less intimidating.
If you're on a machine learning journey, this is the perfect starting point.