1. Ordinary Least Squares (OLS)
A method used in linear regression to estimate model parameters by minimizing the sum of squared errors between actual and predicted values.
Loss = ∑(y − ŷ)²
Why OLS can overfit:
Works poorly with many features
Sensitive to noise
Performs badly when features are correlated
Fits training data too closely
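To see the "fits training data too closely" point concretely, here is a minimal NumPy sketch on synthetic data (all numbers illustrative): with nearly as many features as samples, OLS drives the training error close to zero even though most features carry no signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 samples, 15 features: many features relative to data invites overfitting
X = rng.normal(size=(20, 15))
true_beta = np.zeros(15)
true_beta[:3] = [2.0, -1.0, 0.5]          # only 3 features actually matter
y = X @ true_beta + rng.normal(scale=1.0, size=20)

# OLS: choose beta to minimize the sum of squared errors ||y - X beta||^2
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

train_mse = np.mean((y - X @ beta_ols) ** 2)
print(f"training MSE: {train_mse:.4f}")   # far below the noise variance of 1.0
```

The training MSE lands well below the true noise variance, which is exactly the symptom of fitting noise rather than signal.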
2. Regularization
Regularization reduces overfitting by adding a penalty term to the loss function.
Why it helps:
Controls model complexity
Reduces large coefficients
Improves performance on unseen data
3. Ridge Regression (L2 Regularization)
Loss Function:
Loss = ∑(y − ŷ)² + λ∑β²
Key Points:
Shrinks coefficients toward zero
Reduces variance
Keeps all features
Works well with multicollinearity
Does not perform feature selection
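The ridge solution has a closed form, β = (XᵀX + λI)⁻¹Xᵀy, which makes the key points easy to demonstrate. A small NumPy sketch on synthetic data (values illustrative), including two nearly identical columns to simulate multicollinearity:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 10
X = rng.normal(size=(n, p))
# make column 1 an almost exact copy of column 0 (multicollinearity)
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)
y = 3.0 * X[:, 0] + rng.normal(size=n)

def ridge(X, y, lam):
    # closed-form ridge solution: (X^T X + lam I)^(-1) X^T y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols   = ridge(X, y, 0.0)    # lam = 0 recovers plain OLS
beta_ridge = ridge(X, y, 10.0)

print("OLS   coefficient norm:", np.linalg.norm(beta_ols))
print("ridge coefficient norm:", np.linalg.norm(beta_ridge))
print("exact zeros in ridge  :", np.sum(beta_ridge == 0))   # none
```

The ridge coefficients are smaller in norm (shrinkage, lower variance) but none is exactly zero, so every feature is kept and no feature selection happens.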
4. Lasso Regression (L1 Regularization)
Loss Function:
Loss = ∑(y − ŷ)² + λ∑|β|
Key Points:
Can reduce coefficients to exactly zero
Performs automatic feature selection
Produces simpler, more interpretable models
5. Ridge vs Lasso Comparison
Ridge: Uses L2 penalty, shrinks coefficients, keeps all features, good when all variables matter.
Lasso: Uses L1 penalty, sets some coefficients to zero, performs feature selection, good when few variables matter.
6. House Price Prediction Example
Features:
House size
Bedrooms
Distance to city
Nearby schools
Noisy variables
If all features affect price → Ridge Regression
Keeps all features
Reduces overfitting
Handles correlated variables
If only few features matter → Lasso Regression
Removes irrelevant features
Improves interpretability
Reduces noise
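Putting the decision rule above into practice, here is a hedged scikit-learn sketch (synthetic house data; all coefficients, α values, and feature distributions are invented for illustration). Price is made to depend only on size and distance, so lasso should prune the rest while ridge keeps everything:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(3)
n = 200
size     = rng.normal(150, 40, n)              # house size (m^2)
bedrooms = rng.integers(1, 6, n).astype(float)
distance = rng.normal(10, 3, n)                # distance to city (km)
schools  = rng.integers(0, 5, n).astype(float) # nearby schools
noise1   = rng.normal(size=n)                  # noisy, irrelevant variables
noise2   = rng.normal(size=n)

X = np.column_stack([size, bedrooms, distance, schools, noise1, noise2])
# in this synthetic setup, only size and distance drive the price
price = 2000.0 * size - 5000.0 * distance + rng.normal(scale=20000.0, size=n)

ridge = Ridge(alpha=1.0).fit(X, price)
lasso = Lasso(alpha=5000.0).fit(X, price)

print("ridge coefs:", np.round(ridge.coef_, 1))  # every feature kept
print("lasso coefs:", np.round(lasso.coef_, 1))  # noise terms driven to 0
```

Ridge assigns every feature some weight; lasso zeroes out the pure-noise columns, matching the "only a few features matter" branch of the decision rule.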
7. Model Evaluation
Detecting Overfitting:
High training accuracy + low test accuracy → Overfitting
Similar train & test error → Good model
Role of Residuals:
Show prediction errors
Help detect patterns and outliers
Random residuals = good model
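The train-vs-test check above can be sketched in a few lines of NumPy (synthetic data again): fit OLS on a small training set with many features, then compare training error against error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(4)
n_train, n_test, p = 30, 200, 20

X_train = rng.normal(size=(n_train, p))
X_test  = rng.normal(size=(n_test, p))
beta_true = np.zeros(p)
beta_true[0] = 1.0                      # only one feature matters
y_train = X_train @ beta_true + rng.normal(size=n_train)
y_test  = X_test  @ beta_true + rng.normal(size=n_test)

beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

mse_train = np.mean((y_train - X_train @ beta) ** 2)
mse_test  = np.mean((y_test  - X_test  @ beta) ** 2)
print(f"train MSE: {mse_train:.3f}  |  test MSE: {mse_test:.3f}")

# residuals on held-out data: should look patternless for a good model
residuals = y_test - X_test @ beta
```

The large gap (low train error, much higher test error) is the overfitting signature described above; plotting `residuals` against predictions would be the next step for spotting patterns or outliers.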
Summary
OLS - Simple but overfits
Ridge - Shrinks coefficients
Lasso - Selects features
Regularization - Improves generalization