DEV Community

Harsimranjit Singh

Understanding Simple Linear Regression: Predicting with straight lines

Today, we embark on an exciting journey into Simple Linear Regression.

Introduction to Linear Regression

Linear Regression is a statistical method that models the relationship between two or more variables by fitting a linear equation to observed data.
Imagine plotting height on the x-axis and weight on the y-axis. Linear regression helps us find the best-fitting line that captures the general trend of how weight changes with height.

Types of Linear Regression

  • Simple Linear Regression: when there is only one independent variable.

  • Multiple Linear Regression: when there are multiple independent variables.

Simple Linear Regression

In simple linear regression, the linear equation is:

y = mx + b

  • y is the dependent variable (output)
  • x is the independent variable (input)
  • m is the slope of the line
  • b is the intercept

Our objective is to find the values of m and b that minimize the loss function, allowing us to create the best-fitting line for our data.
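To make "loss" concrete, the most common choice for regression is the mean squared error (MSE): the average of the squared gaps between each observed y and the line's prediction. A minimal sketch, using hypothetical toy data (not the article's dataset):

```python
import numpy as np

# Hypothetical toy data: four (x, y) observations
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

def mse(m, b, x, y):
    """Mean squared error of the line y_hat = m*x + b."""
    y_hat = m * x + b
    return np.mean((y - y_hat) ** 2)

print(mse(2.0, 1.0, x, y))  # a line close to the trend -> small error
print(mse(0.0, 5.0, x, y))  # a flat line -> much larger error
```

The best-fitting line is simply the (m, b) pair for which this number is as small as possible.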

Finding the values of m and b

The values of m and b can be found with two methods:

  • Closed-Form Solution (OLS): Ordinary Least Squares computes m and b directly from the data in a single step.

  • Non-Closed-Form Solution (Approximation): Gradient Descent iteratively nudges m and b toward the values that minimize the loss.
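Both routes can be sketched in a few lines of NumPy. The data below is a hypothetical stand-in; the OLS formulas (m = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)², b = ȳ − m·x̄) are the standard closed-form result, and the loop applies the MSE gradients:

```python
import numpy as np

# Hypothetical toy data standing in for (CGPA, package) pairs
x = np.array([6.8, 7.5, 8.1, 8.9, 9.4])
y = np.array([3.0, 3.6, 4.1, 4.8, 5.2])

# --- Closed-form solution (OLS) ---
x_mean, y_mean = x.mean(), y.mean()
m_ols = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b_ols = y_mean - m_ols * x_mean

# --- Gradient descent (iterative approximation) ---
m, b = 0.0, 0.0
learning_rate = 0.01
for _ in range(50_000):
    y_hat = m * x + b
    dm = -2 * np.mean(x * (y - y_hat))  # dMSE/dm
    db = -2 * np.mean(y - y_hat)        # dMSE/db
    m -= learning_rate * dm
    b -= learning_rate * db

print(m_ols, b_ols)  # closed-form answer
print(m, b)          # gradient descent lands on (almost) the same values
```

On a one-feature problem like this the closed form is simpler, but gradient descent scales to models where no closed form exists.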

Implementing Simple Linear Regression

Let's walk through an example using a dataset where the CGPA of students serves as the input variable, and we aim to predict the student's salary package in Lakhs Per Annum (LPA).

  • Import Necessary Modules:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
  • Preparing Data for Training:
df = pd.read_csv('placement.csv')
x = df.iloc[:,0:1] #input
y = df.iloc[:,1:] #output

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=2)
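As an aside, if placement.csv isn't available, the steps above can still be reproduced with a tiny hypothetical stand-in that has the same two columns (the numbers below are made up):

```python
import pandas as pd

# Hypothetical stand-in for placement.csv: CGPA as input, Package (LPA) as output
df = pd.DataFrame({
    'CGPA':    [6.8, 7.5, 8.1, 8.9, 9.4, 5.9, 7.0, 8.4],
    'Package': [3.0, 3.6, 4.1, 4.8, 5.2, 2.6, 3.3, 4.3],
})
df.to_csv('placement.csv', index=False)
print(df.head())
```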
  • Training the Model:
lr = LinearRegression()  #creating object of LinearRegression 
lr.fit(X_train, y_train)  #train the model on training data
  • Predicting:
prediction = lr.predict(X_test.iloc[0].values.reshape(1, 1))  # predict the package for the first test sample

  • Visualizing:
plt.scatter(df['CGPA'], df['Package'])
plt.plot(X_train, lr.predict(X_train), color='red')
plt.xlabel('CGPA (Input)')
plt.ylabel('Package (LPA - Output)')
plt.show()

(Scatter plot of CGPA vs Package with the fitted regression line in red)

  • Evaluating Model Accuracy
from sklearn.metrics import r2_score
r2Score = r2_score(y_test, lr.predict(X_test))
print("r2 score: ", r2Score)

Conclusion

Today, we have taken our first step into the world of simple linear regression. We have learned about its fundamentals, the equation, and how to implement it using scikit-learn.

Simple Linear Regression is a cool way to see how two things are related using a straight line. Imagine how studying might affect your grades - that's a relationship!

Next time, we will dive deeper into the detailed theory behind simple linear regression.
