*Simple Linear Regression*
Simple Linear Regression is a statistical and machine learning model that describes a linear relationship between one independent variable (a feature) and one dependent variable (the output). The goal is to find the straight line that best fits the data, so that the output value can be predicted from the input as accurately as possible.
Why is Simple Linear Regression used?
- To predict an outcome from a single input variable.
- To analyze linear relationships in data.
- To build models that are quick to fit and easy to interpret.
- To make predictions and understand trends.
Model equation:
y = a + bx
- y = dependent variable (what you want to predict)
- x = independent variable (input)
- a = y-intercept (the value of y when x = 0, where the line crosses the y-axis)
- b = slope (how much y increases or decreases when x increases by one unit)
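To make the roles of a and b concrete, here is a minimal sketch of the equation as a Python function. The function name `predict` and the coefficient values are hypothetical, chosen only for illustration.

```python
def predict(x, a, b):
    """Return the predicted y for input x on the line y = a + b*x."""
    return a + b * x

# Hypothetical coefficients, used only to illustrate the equation
a, b = 2.0, 0.5

at_zero = predict(0, a, b)            # at x = 0 the prediction equals the intercept a
step = predict(1, a, b) - at_zero     # a one-unit increase in x changes y by the slope b

print(f"Prediction at x = 0: {at_zero}")
print(f"Change in y per unit of x: {step}")
```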
How does it work?
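The "Least Squares Method" is used to choose the line. For each data point, the error (residual) is the vertical distance between the actual output value and the value the line predicts. The method picks the intercept a and slope b that make the sum of the squared errors as small as possible, which gives the line that fits the relationship between input and output most closely.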
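As a rough illustration of what "least squares" measures, the sketch below computes the sum of squared errors for two candidate lines. The data points, the two candidate lines, and the helper name `sum_squared_errors` are all made up for this example; the least squares line is the one that makes this sum smallest among all possible lines.

```python
def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between each point and the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, for illustration only
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

# Two hypothetical candidate lines
sse_line_1 = sum_squared_errors(xs, ys, a=0.0, b=2.0)  # y = 0 + 2x
sse_line_2 = sum_squared_errors(xs, ys, a=1.0, b=1.0)  # y = 1 + 1x

print(f"Line 1 (y = 0 + 2x): {sse_line_1}")
print(f"Line 2 (y = 1 + 1x): {sse_line_2}")
```

Here the first line has the smaller sum, so of these two candidates it fits the points better.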
Use Cases and Applications:
- Estimating the price of a house from its size
- Analyzing the relationship between sales and advertising
- Evaluating the relationship between student study time and results
- Predicting disease symptoms and patient conditions in medicine
- Relating rainfall to crop production in agriculture
Example:
Let's say we want to know how the price (y) increases when the size of the house (x) increases. We fit a linear regression to some house data. Suppose the model finds that the price increases by 5,000 rupees per square foot on average and that the base price is 1,00,000 rupees. The model is then:
y = 100000 + 5000x
The estimated price of a 100-square-foot house is then:
100000 + 5000 × 100 = 6,00,000 rupees
In this way, the price of a house can be estimated from its size.
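The same calculation can be sketched in Python. The constant and function names here are hypothetical, and the sizes other than 100 square feet are extra inputs added only to show the pattern.

```python
BASE_PRICE = 100_000     # intercept a, in rupees
PRICE_PER_SQFT = 5_000   # slope b, in rupees per square foot

def estimate_price(size_sqft):
    """Estimate the house price from the model y = 100000 + 5000x."""
    return BASE_PRICE + PRICE_PER_SQFT * size_sqft

# 100 sq ft is the case worked above; 150 and 200 are illustrative extras
for size in (100, 150, 200):
    print(f"{size} sq ft: {estimate_price(size)} rupees")
```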
Example:
Let's say you have some data on the size (in square feet) and price (in thousands of taka) of several houses:
Size (x): 50, 60, 70, 80, 90
Price (y): 150, 180, 210, 240, 270
Steps:
- First find the means of x and y (x̄ and ȳ).
- Find the slope b: b = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)²
- Find the intercept a: a = ȳ - b x̄
- Model: y = a + b x
Python code:
# Sample data
x = [50, 60, 70, 80, 90]
y = [150, 180, 210, 240, 270]
# Calculate means
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)
# Calculate slope (b)
numerator = sum((xi - x_mean)*(yi - y_mean) for xi, yi in zip(x, y))
denominator = sum((xi - x_mean)**2 for xi in x)
b = numerator / denominator
# Calculate intercept (a)
a = y_mean - b * x_mean
print(f"Regression Equation: y = {a:.2f} + {b:.2f}x")
# Predict price for 75 sqft
x_new = 75
y_pred = a + b * x_new
print(f"Predicted price for {x_new} sqft: {y_pred:.2f} thousand")
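Running this code gives you the equation of a straight line that you can use to predict the price for a new size. For this data, x̄ = 70 and ȳ = 210, so b = 3000 / 1000 = 3 and a = 210 - 3 × 70 = 0. The fitted line is y = 0 + 3x, and the predicted price for 75 square feet is 225 thousand taka.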