What is Linear Regression?
Linear Regression is a fundamental statistical machine learning algorithm that models the linear relationship between a dependent variable ($y$) and one or more independent variables ($x$). The goal is to fit a straight line (or a hyperplane in multiple dimensions) that minimizes the overall prediction error on the training data.
Note 1.1: Linearly related features exhibit a correlation where a change in one variable results in a proportional change in the other (e.g., as $x$ increases, $y$ also tends to increase, or vice versa). Linear Regression works best when this relationship is approximately linear.
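One quick way to check whether two variables are approximately linearly related is to compute their Pearson correlation coefficient. The sketch below is only an illustration: the data is synthetic and the variable names are made up for this example.

```python
import numpy as np

# Hypothetical toy data: y is roughly 3*x + 2 plus a little random noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.5, size=x.shape)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 or -1 suggests a strong linear relationship. A value near 0 means a straight line is likely a poor fit, although the variables can still be related in a nonlinear way.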
Note 1.2: Regression is the process of predicting a continuous or real value (e.g., 265.34, 10.231).
How does it work?
Linear Regression models the relationship by defining a linear function, often called the hypothesis $h(x)$, which calculates the predicted value.
For Multiple Linear Regression (more than one feature), this line is represented as:
$$\hat{y} = h(x) = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$
$\hat{y}$ is the predicted value (the model’s output).
$w_0$ is the y-intercept (the bias term).
$w_1, w_2, \dots, w_n$ are the coefficients or weights for each feature $x_1, x_2, \dots, x_n$.
The model’s task is to find the optimal set of weights ($w_0, w_1, \dots, w_n$) that best fit the data.
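In code, the predicted value is the bias plus the weighted sum of the features, which is a dot product. The following is a minimal sketch; the function name `predict` and the numbers are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import numpy as np

def predict(features, weights, bias):
    """Return bias + sum_j weights[j] * features[j] for a single sample."""
    return float(np.dot(weights, features) + bias)

# Hypothetical values for a model with two features.
weights = np.array([2.0, -1.0])
bias = 0.5
sample = np.array([3.0, 4.0])

# 0.5 + 2.0 * 3.0 + (-1.0) * 4.0 = 2.5
y_hat = predict(sample, weights, bias)
print(y_hat)
```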
How to Measure the Model Performance?
The performance of a regression model is measured using a cost function (or loss function), which quantifies the “error” or “cost” of the model’s predictions. The most commonly used cost function for Linear Regression is the Mean Squared Error (MSE).
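Mean Squared Error (MSE) is calculated by averaging the squared differences between the predicted values and the actual values:
$$\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2$$
Where $\hat{y}^{(i)}$ is the predicted value and $y^{(i)}$ is the actual value for the $i$-th of the $m$ training examples.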
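MSE is popular because it is differentiable everywhere, which suits gradient-based optimization, and because the squaring operation penalizes larger errors more heavily. The trade-off is that the model becomes sensitive to outliers.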
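The effect of squaring is easy to see on a small example. The sketch below uses made-up numbers and a hypothetical helper named `mean_squared_error`; it compares two sets of predictions, one with small errors spread out and one with a single large error.

```python
import numpy as np

def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and targets."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((predicted - actual) ** 2))

actual = [3.0, 5.0, 7.0]
small_errors = [2.5, 5.0, 7.5]   # errors: -0.5, 0.0, 0.5
one_outlier = [3.0, 5.0, 11.0]   # errors: 0.0, 0.0, 4.0

mse_small = mean_squared_error(small_errors, actual)    # (0.25 + 0 + 0.25) / 3
mse_outlier = mean_squared_error(one_outlier, actual)   # (0 + 0 + 16) / 3
print(mse_small, mse_outlier)
```

The single error of 4 contributes 16 to the sum, so it dominates the cost even though the other two predictions are exact.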
How does the Model Learn?
The model learns by iteratively adjusting its weights ($w_0, w_1, \dots, w_n$) to minimize the cost function using the Gradient Descent optimization algorithm.
Gradient Descent works by calculating the gradient (the slope) of the cost function with respect to each weight. This gradient indicates the direction of the steepest increase in error. The weights are then updated by moving in the opposite direction of the gradient. The weight update rules for Linear Regression using MSE are:
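For each feature weight ($w_j$, where $j = 1, \dots, n$):
$$w_j := w_j - \alpha \cdot \frac{2}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x_j^{(i)}$$
For the intercept ($w_0$):
$$w_0 := w_0 - \alpha \cdot \frac{2}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)$$
$\alpha$ (alpha) is the learning rate, a hyperparameter that controls the step size during each iteration.
$i$ is the data index, running over the $m$ training examples.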
This process is repeated over many iterations, called epochs, allowing the model to gradually converge on the optimal weights.
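The update described above can also be sketched for all weights at once with NumPy. This is a minimal sketch, assuming batch gradient descent on MSE with the gradient scaled by 2 over the number of samples; the function name `gradient_step`, the learning rate, and the toy data are made up for illustration.

```python
import numpy as np

def gradient_step(x, y, weights, bias, learning_rate):
    """Apply one batch gradient descent update for the MSE cost."""
    m = x.shape[0]
    errors = x @ weights + bias - y             # prediction minus target, shape (m,)
    weight_grad = (2.0 / m) * (x.T @ errors)    # one partial derivative per feature
    bias_grad = (2.0 / m) * errors.sum()
    new_weights = weights - learning_rate * weight_grad
    new_bias = bias - learning_rate * bias_grad
    return new_weights, new_bias

# Hypothetical noise-free data generated from y = 2 * x + 1.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

weights, bias = np.zeros(1), 0.0
for _ in range(2000):
    weights, bias = gradient_step(x, y, weights, bias, learning_rate=0.05)

print(weights, bias)
```

On this data the weight approaches 2 and the bias approaches 1, the values used to generate it. Both parameters are updated from the same `errors` vector, so the update is simultaneous.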
The code to implement Linear Regression from scratch is provided below.
import numpy as np


class linear_regression:
    def __init__(self):
        self.weights = []
        self.bias = 0.0
        self.learning_rate = 0.001

    def fit(self, x, y, epochs):
        # x: 2-D sequence of shape (data_size, number_of_features); y: 1-D targets
        data_size = len(x)
        number_of_features = len(x[0])
        x = np.array(x)
        y = np.array(y)
        self.weights = np.zeros(number_of_features)
        for epoch in range(epochs):
            # Accumulate the MSE gradient over the whole dataset
            derivatives = [0.0] * number_of_features
            bias_derivative = 0.0
            for pos in range(data_size):
                prediction = sum(self.weights[i] * x[pos][i] for i in range(number_of_features)) + self.bias
                for i in range(number_of_features):
                    derivatives[i] += (2 / data_size) * (prediction - y[pos]) * x[pos][i]
                bias_derivative += (2 / data_size) * (prediction - y[pos])
            # Move every parameter one step against its gradient
            for i in range(number_of_features):
                self.weights[i] -= self.learning_rate * derivatives[i]
            self.bias -= self.learning_rate * bias_derivative
            # Safety check for numerical stability: stop training if the weights diverge
            if any(np.isnan(w) or np.isinf(w) or abs(w) > 1e10 for w in self.weights):
                return

    def predict(self, x):
        # x is a single sample: a 1-D sequence of feature values
        return sum(self.weights[i] * x[i] for i in range(len(self.weights))) + self.bias
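The safety check in `fit` exists because gradient descent can diverge when the learning rate is too large for the data. The sketch below illustrates this on a tiny made-up dataset with one feature; the helper name `train_and_score` and both learning rates are assumptions chosen for this example, not recommended values.

```python
import numpy as np

def train_and_score(learning_rate, epochs=100):
    """Run batch gradient descent on toy data and return the final MSE."""
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = 2.0 * x + 1.0                      # hypothetical noise-free targets
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        errors = weight * x + bias - y     # computed once, used for both updates
        weight -= learning_rate * 2.0 * np.mean(errors * x)
        bias -= learning_rate * 2.0 * np.mean(errors)
    return float(np.mean((weight * x + bias - y) ** 2))

small_step_mse = train_and_score(0.05)   # error shrinks on this data
large_step_mse = train_and_score(0.3)    # each step overshoots and the error grows
print(small_step_mse, large_step_mse)
```

With all parameters at zero, the MSE on this data starts at 21. The smaller learning rate drives it close to zero, while the larger one ends far above the starting value. If training stops early or the weights become very large, lowering the learning rate or scaling the features is usually the first thing to try.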