DEV Community

Sean Atukorala
Sean Atukorala

Posted on

1

Linear Regression in Machine Learning Explained

Ever wonder what the terms Regression and Linear Regression mean in a machine learning context? If so, keep reading to find out!

Image of a Barn and Fence in Autumn

What is Regression in Machine Learning?

Regression is primarily a technique for analyzing the relationship between independent variables/features and a dependent variable/outcome. Regression is a method most frequently used to solve supervised machine learning problems.

Regression modeling, which as the name implies uses regression, consists of building a mapping function such that input variables map to a continuous output variable. The input variables are independent and the continuous output variable is dependent in this case. This type of modelling is mainly useful in two ways: (1) To predict the outcome of new and unseen input data (2) To forecast and predict gaps in missing data.

What does Linear Regression mean in Machine Learning?

Linear regression is a type of machine learning model. It is most easily explained as a linear equation that combines a specific set of input values, which are independent variables, to the predicted output value, which is a dependent variable.

Although linear regression is originally a concept derived from the field of statistics, it is used extensively in machine learning for the purposes of understanding the relationship between independent input variables and a dependent output variable.

There are many different types of linear regression, the most notable of which are simple linear regression(where there is only a single input variable) and multiple linear regression(where there are multiple input variables).

The main points of differentiation between different types of linear regression are concerning the number of independent variables and the type of relationship between the independent and dependent variables.

Linear regression can be defined as a simple linear equation due to the fact that ultimately, it is a line drawn across a set of data points that is designed to model those data points in the most accurately possible way.

Here is what a linear equation looks like:

y = ax + c
Enter fullscreen mode Exit fullscreen mode

Where a represents the slope of the line and the c represents its y-intercept.

Example of Linear Regression:

In order to better understand linear regression, let's go over an example. Here is a graph showing the heights of fathers in comparison to their sons:

Figure 1: Comparison of fathers' heights to their sons

Figure 1: Comparison of fathers' heights to their sons

Here is a code snippet using the R programming language that will calculate the linear regression line for the graph above:

library(tidyverse)
library(HistData)
library(caret)

# Obtain the dataset data and rename the `childHeight` column to `son`
galton_heights <- GaltonFamilies %>%
  filter(childNum == 1 & gender == "male") %>%
  select(father, childHeight) %>%
  rename(son = childHeight)

y <- galton_heights$son
test_index <- createDataPartition(y, times = 1, p = 0.5, list = FALSE)

# Divide the original dataset into train and test sets
train_set <- galton_heights %>% slice(-test_index)
test_set <- galton_heights %>% slice(test_index)


# Fit linear regression model
fit <- lm(son ~ father, data = train_set)

# Plot the graph and the linear regression line
plot(galton_heights, pch = 16, col = "blue")
abline(fit, col="red", lwd=5)
Enter fullscreen mode Exit fullscreen mode

And here is the linear regression line that the above code produces:

Figure 2: Linear regression line in graph

Figure 2: Linear regression line in graph

Ok, that's it for this blog post on what Linear Regression is. I hope you found it useful!

Conclusion

Thanks for reading this blog post!

If you have any questions or concerns please feel free to post a comment in this post and I will get back to you if I find the time.

If you found this article helpful please share it and make sure to follow me on Twitter and GitHub, connect with me on LinkedIn and subscribe to my YouTube channel.

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

Top comments (1)

Collapse
 
devmehta profile image
Dev Mehta β€’

Great post buddy. If you want to also understand Gradient Descent and how linear regression works behind the scenes with visual learning of the topic, I would recommend checking out this blog post. Also, learning about basic linear algebra and calculus would help new developers getting into this field :)

A Workflow Copilot. Tailored to You.

Pieces.app image

Our desktop app, with its intelligent copilot, streamlines coding by generating snippets, extracting code from screenshots, and accelerating problem-solving.

Read the docs

πŸ‘‹ Kindness is contagious

Dive into an ocean of knowledge with this thought-provoking post, revered deeply within the supportive DEV Community. Developers of all levels are welcome to join and enhance our collective intelligence.

Saying a simple "thank you" can brighten someone's day. Share your gratitude in the comments below!

On DEV, sharing ideas eases our path and fortifies our community connections. Found this helpful? Sending a quick thanks to the author can be profoundly valued.

Okay