Deepak Raj

Posted on • Originally published at codeperfectplus.herokuapp.com

# Gradient Descent: The Mother of All Algorithms?

This article was originally published at Codeperfectplus.

#### Introduction

Gradient descent is an optimization algorithm that minimizes a function by iteratively moving in the direction of steepest descent, that is, opposite to the gradient, until it reaches a minimum. It is the backbone of many machine-learning algorithms. Gradient descent was originally proposed by Cauchy in 1847.
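As a minimal sketch of the idea, the snippet below minimizes a function of a single variable. The function $f(x) = (x - 3)^2$, the starting point, the learning rate, and the helper name `gradient_descent_1d` are all assumptions chosen for illustration; they are not from the original article.

```python
def gradient_descent_1d(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# Illustrative function (an assumption): f(x) = (x - 3)^2, so f'(x) = 2 * (x - 3).
def f_prime(x):
    return 2 * (x - 3)

x_min = gradient_descent_1d(f_prime, x0=0.0)
print(x_min)  # very close to 3, the minimizer of f
```

Each step shrinks the distance to the minimum by a constant factor here, which is why a fixed number of steps is enough for this simple function.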

When a function depends on two or more variables, the vector of its partial derivatives, one per variable, is called the gradient.
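To illustrate, the sketch below approximates the gradient of a two-variable function with central differences. The function $f(x, y) = x^2 + 3y$ and the helper name `numerical_gradient` are assumptions made for this example only.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        # One partial derivative per input variable.
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

# Illustrative function (an assumption): f(x, y) = x^2 + 3y,
# whose exact gradient is (2x, 3).
def f(p):
    x, y = p
    return x ** 2 + 3 * y

g = numerical_gradient(f, [2.0, 1.0])
print(g)  # approximately [4.0, 3.0]
```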

#### Loss/Error Function

In simple linear regression, we have a single input and need to find the line that best fits the data. The best-fit line is the one for which the total prediction error over all data points is as small as possible. The error for a point is the vertical distance between that point and the regression line.

E.g., consider the relationship between hours of study and marks obtained. The goal is to find how the marks a student obtains depend on the hours of study. For that, we have to find a linear equation relating these two variables.

$y_{predict} = b_0 + b_1 x$

The error is the difference between the predicted and actual values. To reduce the error and find the best-fit line, we have to find values for $b_0$ and $b_1$.

$\text{loss} = (y_p - y_a)^2$

Here $y_p$ is the predicted value and $y_a$ is the actual value. For the best-fit line, $b_0$ and $b_1$ must take the values that minimize this error summed over all data points.
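To make the loss concrete, here is a small sketch that totals the squared error of a candidate line over a data set. The hours and marks values, both candidate lines, and the helper name `sum_squared_error` are made up for illustration; they are not from the original article.

```python
# Made-up illustrative data: hours studied and marks obtained.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
marks = [52.0, 58.0, 61.0, 70.0, 74.0]

def sum_squared_error(b0, b1, xs, ys):
    """Total squared error of the line y = b0 + b1 * x over all points."""
    return sum((b0 + b1 * x - y) ** 2 for x, y in zip(xs, ys))

# Two hypothetical candidate lines; the lower total error fits better.
error_a = sum_squared_error(40.0, 5.0, hours, marks)
error_b = sum_squared_error(46.0, 5.6, hours, marks)
print(error_a, error_b)
```

On this toy data the second line has a much smaller total error, so it is the better fit of the two.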

Gradient descent uses the derivatives of the loss function to find where the sum of squared errors is lowest. The process of finding the optimal values for m (the coefficient, $b_1$ above) and b (the intercept, $b_0$ above) that minimize the error/loss function is called gradient descent. It finds the minimum by taking steps from an initial guess until it reaches the best value.
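The derivatives that gradient descent relies on can be written out explicitly. The following is a sketch under one assumption: the cost is the mean of the squared errors over $N$ data points $(x_i, y_i)$, which is consistent with the $-2/N$ factor in the code that follows. The symbol $\alpha$ for the learning rate is also introduced here for illustration.

$E(m, b) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - (m x_i + b) \right)^2$

$\frac{\partial E}{\partial m} = -\frac{2}{N} \sum_{i=1}^{N} x_i \left( y_i - (m x_i + b) \right)$

$\frac{\partial E}{\partial b} = -\frac{2}{N} \sum_{i=1}^{N} \left( y_i - (m x_i + b) \right)$

Each iteration then moves both parameters a small step against their derivatives:

$m \leftarrow m - \alpha \frac{\partial E}{\partial m}, \qquad b \leftarrow b - \alpha \frac{\partial E}{\partial b}$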


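```python
from sklearn.datasets import make_regression

# Synthetic data: 100 samples with a single feature and a little noise.
X, y = make_regression(n_samples=100, n_features=1, noise=0.1)
X = X.ravel()  # flatten from shape (100, 1) to (100,) so it lines up with y

def LinearRegression(X, y, m_current=0, b_current=0, epochs=2000, learning_rate=0.001):
    N = float(len(y))

    for i in range(epochs):
        # Predictions with the current slope and intercept.
        y_current = m_current * X + b_current
        # Partial derivatives of the mean squared error with respect to m and b.
        m_gradient = (-2 / N) * sum(X * (y - y_current))
        b_gradient = (-2 / N) * sum(y - y_current)
        # Step against the gradient.
        m_current = m_current - (learning_rate * m_gradient)
        b_current = b_current - (learning_rate * b_gradient)
    return m_current, b_current

m, b = LinearRegression(X, y)
print(m, b)
```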
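One way to sanity-check a gradient descent fit is to compare it with the closed-form least-squares solution for a single input. The sketch below does this in plain Python on a handful of made-up points; the data, the helper names `fit_gradient_descent` and `fit_closed_form`, and the chosen learning rate and epoch count are all assumptions for illustration, not part of the original article.

```python
def fit_gradient_descent(xs, ys, epochs=1000, learning_rate=0.05):
    """Fit y = m * x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        m_gradient = (-2 / n) * sum(x * r for x, r in zip(xs, residuals))
        b_gradient = (-2 / n) * sum(residuals)
        m -= learning_rate * m_gradient
        b -= learning_rate * b_gradient
    return m, b

def fit_closed_form(xs, ys):
    """Ordinary least squares slope and intercept for a single input."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx
    return m, y_mean - m * x_mean

# Made-up illustrative points lying roughly on y = 3x + 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-5.1, -1.9, 1.0, 4.2, 6.8]

m_gd, b_gd = fit_gradient_descent(xs, ys)
m_cf, b_cf = fit_closed_form(xs, ys)
print(m_gd, b_gd)  # gradient descent estimate
print(m_cf, b_cf)  # closed-form estimate; the two should nearly match
```

If the two results differ noticeably, the usual suspects are too few epochs or a learning rate that is too small or too large for the scale of the inputs.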