Introduction
Predicting house prices is a common problem in data science. A house’s price is influenced by many factors such as its size, number of bedrooms, distance to the city, and nearby amenities. However, real-world datasets also contain noise — features that do not truly affect the price.
In this article, we use a house price dataset to explain:
- Ordinary Least Squares (OLS)
- Ridge Regression (L2 Regularization)
- Lasso Regression (L1 Regularization)
The goal is to understand how these models work, why regularization is needed, and which model to choose in practice, using simple explanations and visual support.
1. Loading the Dataset
We begin by loading the dataset into Python using common data science libraries. This step allows us to inspect the data and understand what features are available.
The dataset contains 500 houses and includes:
- House characteristics (size, bedrooms, bathrooms)
- Location information (distance to city)
- Social features (schools nearby)
- Some noisy or weak features that do not meaningfully affect the price
Figure 1: Loading the house price dataset and displaying sample rows.
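In code, this loading step might look something like the sketch below. The filename house_prices.csv is an assumption for illustration; the actual file and column names may differ.

```python
import pandas as pd

# Load the house price dataset (filename is an assumption for illustration)
df = pd.read_csv("house_prices.csv")

# Inspect the shape and a few sample rows
print(df.shape)   # expected: (500, number_of_columns)
print(df.head())
```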

2. Understanding the Data
Before building any model, it is important to understand the data.
Important Features
- size_sqm – size of the house
- bedrooms – number of bedrooms
- bathrooms – number of bathrooms
- distance_to_city – distance from city center
- schools_nearby – number of nearby schools
Noisy / Weak Features
- paint_color_code
- random_id_feature
- weather_noise
- street_code
Target Variable
- price – the value we want to predict
Some features clearly make sense, while others are random and should not affect house price.
3. Splitting Features and Target
Next, we separate:
- Input features (X) – all columns except price
- Target (y) – the house price
This prepares the data for training machine learning models.
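Assuming the DataFrame df from the loading step and a price column as described above, the separation might look like this:

```python
# Input features: every column except the target
X = df.drop(columns=["price"])

# Target: the house price we want to predict
y = df["price"]
```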
4. Train–Test Split
To evaluate model performance properly, we split the dataset into:
- Training data – used to train the model
- Testing data – used to evaluate performance on unseen data
This step is critical for detecting overfitting.
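A minimal sketch of this split with scikit-learn, continuing from the X and y defined above (the 80/20 ratio and random seed are typical choices, not values from the original article):

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the houses as unseen test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```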
5. Ordinary Least Squares (OLS)
What is OLS?
Ordinary Least Squares is the most basic linear regression method. It finds model coefficients by minimizing the sum of squared differences between actual house prices and predicted prices.
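With scikit-learn, a plain OLS fit on the training split might look like the sketch below (it assumes the X_train and y_train variables from the previous step):

```python
from sklearn.linear_model import LinearRegression

# Fit ordinary least squares on the training data
ols = LinearRegression()
ols.fit(X_train, y_train)

# Inspect the learned coefficients, including those on noisy features
for name, coef in zip(X_train.columns, ols.coef_):
    print(f"{name}: {coef:.3f}")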
Why OLS Can Overfit
OLS:
- Uses all features
- Assigns coefficients to both useful and noisy variables
- Can give large weights to irrelevant features
In our dataset, OLS may treat random_id_feature as important, even though it has no real meaning.
6. Detecting Overfitting with OLS
To check overfitting, we compare:
- Training performance
- Testing performance
If training accuracy is high but test accuracy is much lower, the model is overfitting.
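One simple way to make this comparison is to look at the R² score on both splits, as in the sketch below (it assumes the ols model and the train/test variables from earlier steps):

```python
from sklearn.metrics import r2_score

# R^2 on training data vs. unseen test data
train_r2 = r2_score(y_train, ols.predict(X_train))
test_r2 = r2_score(y_test, ols.predict(X_test))

print(f"Train R^2: {train_r2:.3f}")
print(f"Test  R^2: {test_r2:.3f}")
# A large gap between the two scores suggests overfitting
```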
7. Regularization: Why We Need It
Regularization adds a penalty to the loss function to control model complexity.
It helps:
- Reduce overfitting
- Shrink large coefficients
- Improve performance on unseen data
Two common regularization techniques are Ridge and Lasso regression.
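In equation form, both methods add a penalty term to the ordinary least-squares loss, with a parameter α controlling how strongly coefficients are penalized:

```latex
\text{Ridge:}\quad \min_{\beta}\; \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \;+\; \alpha \sum_{j=1}^{p} \beta_j^2
\qquad
\text{Lasso:}\quad \min_{\beta}\; \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \;+\; \alpha \sum_{j=1}^{p} |\beta_j|
```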
8. Ridge Regression (L2 Regularization)
How Ridge Works
Ridge regression adds an L2 penalty, which penalizes large coefficients by squaring them.
Ridge:
- Shrinks all coefficients
- Keeps all features
- Reduces sensitivity to noise
Ridge on Our Dataset
In our house price data:
- Important features still have strong influence
- Noisy features receive very small coefficients
- No feature is completely removed
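A hedged sketch of fitting Ridge on this data is shown below. The value alpha=1.0 is just a starting point, not the article's tuned value; in practice it would be chosen with cross-validation.

```python
from sklearn.linear_model import Ridge

# L2-penalized regression; alpha controls how strongly coefficients are shrunk
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

# Every feature keeps a (possibly tiny) non-zero coefficient
for name, coef in zip(X_train.columns, ridge.coef_):
    print(f"{name}: {coef:.3f}")
```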
9. Lasso Regression (L1 Regularization)
How Lasso Works
Lasso regression uses an L1 penalty, which can shrink coefficients to zero.
This means:
- Unimportant features are removed
- The model becomes simpler and easier to interpret
Lasso on Our Dataset
After applying Lasso:
- Features like size_sqm and bedrooms remain
- Noisy features such as weather_noise and random_id_feature are set to zero
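A corresponding Lasso sketch is below; again, alpha=0.1 is an illustrative value rather than a tuned one, and the variables come from the earlier steps:

```python
from sklearn.linear_model import Lasso

# L1-penalized regression; sufficiently weak features are driven exactly to zero
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

# List the features Lasso kept (non-zero coefficients)
kept = [name for name, coef in zip(X_train.columns, lasso.coef_) if coef != 0]
print("Features kept by Lasso:", kept)
```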
10. Ridge vs Lasso Comparison
| Aspect | Ridge Regression | Lasso Regression |
|---|---|---|
| Regularization | L2 | L1 |
| Feature selection | No | Yes |
| Handles noise | Shrinks | Removes |
| Interpretability | Lower | Higher |
11. Model Evaluation Using Residuals
Residuals are the differences between:
- Actual house prices
- Predicted house prices
By plotting residuals:
- Random scatter → good model
- Clear patterns → poor model fit
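A simple residual plot might be produced as follows (using the ridge model from above purely as an example; any of the fitted models could be checked the same way):

```python
import matplotlib.pyplot as plt

# Residuals = actual prices minus predicted prices on the test set
predictions = ridge.predict(X_test)
residuals = y_test - predictions

plt.scatter(predictions, residuals, alpha=0.6)
plt.axhline(0, color="red", linestyle="--")
plt.xlabel("Predicted price")
plt.ylabel("Residual")
plt.title("Residuals vs. predicted prices")
plt.show()
```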
12. Choosing the Right Model
If all features are believed to matter, choose Ridge Regression:
- Keeps all features
- Controls overfitting
- Works well with correlated variables
If only a few features are important, choose Lasso Regression:
- Removes noisy features
- Produces a simpler model
- Easier to explain and interpret
Conclusion
Using the house price dataset, we observe that:
- OLS is simple but prone to overfitting
- Ridge regression improves stability by shrinking coefficients
- Lasso regression simplifies the model by removing irrelevant features