Deepak Raj

Posted on Jul 16, 2020 • Edited on Sep 5, 2023

Gradient Descent: The Mother of All Algorithms?

#machinelearning #datascience #python #programming

Introduction to Gradient Descent

Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction to find the minima of the function. it' the backbone of a machine-learning algorithm.Gradient descent is originally proposed by Cauchy in 1847. It is an important and widely used algorithm in machine learning.

When we have two or more derivative of the same function, they are called Gradient

Loss/Error Function

In simple linear regression when we have a single input. and we have to obtain a line that best fits the data. The best fit line is the one for which total prediction error (all data points) are as small as possible. Error is the distance between the point to the regression line.

E.g: Relationship between hours of study and marks obtained. The goal is to find the relationship between the hours of study and marks obtained by the student. for that, we have to find a linear equation between these two variables.

y_(predict) = b_0 + b_1.x

Error is the difference between predicted and actual value. to reduce the error and find the best fit line we have to find value for bo and b1.

lossfunction = (y_p-y_a)^2

for finding a best fit line value of bo and b1 must be that minimize the error.error is the difference between predicted and actual output.

Now that we have a derivative, gradient descent will use it to find where the sum of squared is lowest. The process of finding the optimal value for m(coefficient) and b(intercept) to reduce the error/lost function is called Gradient Descent. Gradient descent finds the minimum value by taking steps from an initial guess until it reaches the best value.

Gradient Descent Implementation in Python


from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=1, noise=0.1)

def LinearRegression(X, y, m_current=0, b_current=0, epochs=2000, learning_rate = 0.001):
    N = float(len(y))

    for i in range(epochs):
        y_current = m_current * X + b_current
        m_gradient = (-2/N) * sum(y-y_current) * X 
        b_gradient = (-2/N) * sum(y-y_current)
        m_current = m_current - (learning_rate * m_gradient)
        b_current = b_current - (learning_rate * b_gradient)  
    return m_current, b_current

m, b = LinearRegression(X, y) 
print(m , b)

Deepak Raj

Founder SystemGuard | Building a complete server monitoring and threat detection tool

DEV Community

Gradient Descent: The Mother of All Algorithms?

Introduction to Gradient Descent

Loss/Error Function

Gradient Descent Implementation in Python

Deepak Raj

Top comments (0)

Read next

Top 7 Data Careers You Should Know About in 2025

AI Breakthroughs: Language Models Can Now Control Computer Interfaces Like Humans

Behavioral Questions in AI Interviews: 2025 Insights

Cómo crear un Wallpaper dinámico con la Hora y Fecha usando Python

Introduction to Gradient Descent

Loss/Error Function

Gradient Descent Implementation in Python

Deepak RajFollow

Read next

Top 7 Data Careers You Should Know About in 2025

AI Breakthroughs: Language Models Can Now Control Computer Interfaces Like Humans

Behavioral Questions in AI Interviews: 2025 Insights

Cómo crear un Wallpaper dinámico con la Hora y Fecha usando Python

Deepak Raj