Suppose you're building a neural network, maybe even a deep and complex one. You’ve set up the layers, initialized weights, and defined activation functions. But here's a question:
Can this neural network make accurate predictions without tuning?
The short answer: No.
Building a neural network is just the start; the real magic lies in fine-tuning its parameters (i.e., weights and biases) so it actually learns from the data.
Why Do We Need to Optimize Weights and Biases?
To make better predictions, we want the network to minimize the loss and maximize accuracy.
Fine-tuning means updating weights and biases over multiple epochs using feedback from the output (loss) to improve the next prediction. This continues until we reach a point of minimal loss.
But how exactly do we update the weights and biases?
Early (Inefficient) Ideas for Updating Weights
1. Random Weights & Biases
Try random values, compute the loss, and repeat the process until the lowest loss is achieved.
❌ This is inefficient and slow.
2. Guided Random Tweaks
Start with random weights → calculate the loss → try new weights close to the previous ones → keep going if the loss decreases → stop if it doesn't.
✅ Better than the first, but still not optimal.
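To see why this still struggles, here's a minimal sketch of the guided-random-tweak idea. Everything in it is illustrative: the data is random, the "network" is a single linear neuron, and the loss is mean squared error.

```python
import numpy as np

# Guided random tweaks (not how real training works): nudge the weights
# randomly and keep the nudge only if the loss drops.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))            # 100 made-up samples, 4 features
y_true = rng.normal(size=(100, 1))

def predict(X, w, b):
    return X @ w + b                     # a single linear neuron, for simplicity

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

w = rng.normal(size=(4, 1))
b = np.zeros((1,))
best_loss = mse(predict(X, w, b), y_true)

for _ in range(1000):
    # tweak the current weights slightly and keep the change only if the loss improves
    w_try = w + rng.normal(scale=0.01, size=w.shape)
    b_try = b + rng.normal(scale=0.01, size=b.shape)
    loss = mse(predict(X, w_try, b_try), y_true)
    if loss < best_loss:
        w, b, best_loss = w_try, b_try, loss

print("loss after random tweaking:", best_loss)
```

Even on this tiny problem, most random nudges get rejected. We need a smarter way to pick the direction and size of each update.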
Now let’s go one level higher...
Enter: Backpropagation + Gradient Descent
Let’s say we want to go downhill on a loss curve to reach the lowest point (minimum loss). To do this efficiently, we need to know:
The direction to move in → determined by the slope (derivative)
How much to move → controlled by the learning rate
This is where calculus enters the picture.
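To make that concrete, here's a toy one-dimensional example: a made-up loss curve L(w) = (w − 3)², where the derivative tells us which way is downhill and the learning rate decides how far we step.

```python
# A toy 1D example: minimize L(w) = (w - 3)**2 by walking downhill.
# The slope (derivative) gives the direction; the learning rate sets the step size.

def loss(w):
    return (w - 3) ** 2

def dloss_dw(w):
    return 2 * (w - 3)          # derivative of (w - 3)**2

w = 10.0                        # start somewhere random on the curve
learning_rate = 0.1

for step in range(25):
    w = w - learning_rate * dloss_dw(w)   # move against the slope

print(w, loss(w))               # w slides toward 3, the minimum of the curve
```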
Gradient Descent — The Update Rule
To minimize the loss L, we update the weights and biases using:
w = w − η · ∂L/∂w
b = b − η · ∂L/∂b
Where:
∂L/∂w = the derivative of the loss with respect to the weight
∂L/∂b = the derivative of the loss with respect to the bias
η = the learning rate (how big a step we take)
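In code, the update rule is a single line per parameter. This is only a shape-level sketch: the gradients below are placeholders, because computing the real ones is exactly what backpropagation (coming up next) is for.

```python
import numpy as np

eta = 0.05                               # example learning rate

w = np.random.randn(3, 4)                # hidden-layer weights (3 neurons, 4 inputs)
b = np.zeros(3)                          # hidden-layer biases

# Placeholder gradients, just to show the update; in practice these come from backpropagation.
dL_dw = np.full_like(w, 0.1)
dL_db = np.full_like(b, 0.1)

w = w - eta * dL_dw                      # w = w − η · ∂L/∂w
b = b - eta * dL_db                      # b = b − η · ∂L/∂b
```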
But how do we get these derivatives?
That’s where Backpropagation comes in.
Backpropagation: Going Back to Learn Better
Let’s take an example:
A neural network with 3 neurons in the hidden layer
A single output neuron in the final layer
Loss function: Mean Squared Error →
L = (y_pred − y_true)²
To update the weights, we apply the chain rule from calculus to compute the gradients.
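Before writing the gradients down, it helps to pin down the forward pass they refer to. The sketch below assumes 4 inputs, a sigmoid activation in the hidden layer, and a linear output neuron with weights v and bias c; those specific choices (and the sample values) are illustrative, since only the layer sizes and the MSE loss are fixed above.

```python
import numpy as np

# Forward pass for the example network: 4 inputs -> 3 hidden neurons -> 1 output.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 0.3, 0.8])     # one input sample (made-up values)
y_true = 1.0

W = np.random.randn(3, 4)               # hidden weights: row i holds w_i1..w_i4
b = np.zeros(3)                         # hidden biases b1, b2, b3
v = np.random.randn(3)                  # output-layer weights (assumed name)
c = 0.0                                 # output-layer bias (assumed name)

z = W @ x + b                           # z1, z2, z3
a = sigmoid(z)                          # a1, a2, a3
y_pred = v @ a + c                      # network output y
L = (y_pred - y_true) ** 2              # Mean Squared Error for one sample
```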
Gradients for Hidden Layer Weights
Neuron 1
∂L/∂w11 = ∂L/∂y · ∂y/∂a1 · ∂a1/∂z1 · ∂z1/∂w11
∂L/∂w12 = ∂L/∂y · ∂y/∂a1 · ∂a1/∂z1 · ∂z1/∂w12
∂L/∂w13 = ∂L/∂y · ∂y/∂a1 · ∂a1/∂z1 · ∂z1/∂w13
∂L/∂w14 = ∂L/∂y · ∂y/∂a1 · ∂a1/∂z1 · ∂z1/∂w14
Neuron 2
∂L/∂w21 = ∂L/∂y · ∂y/∂a2 · ∂a2/∂z2 · ∂z2/∂w21
∂L/∂w22 = ∂L/∂y · ∂y/∂a2 · ∂a2/∂z2 · ∂z2/∂w22
∂L/∂w23 = ∂L/∂y · ∂y/∂a2 · ∂a2/∂z2 · ∂z2/∂w23
∂L/∂w24 = ∂L/∂y · ∂y/∂a2 · ∂a2/∂z2 · ∂z2/∂w24
Neuron 3
∂L/∂w31 = ∂L/∂y · ∂y/∂a3 · ∂a3/∂z3 · ∂z3/∂w31
∂L/∂w32 = ∂L/∂y · ∂y/∂a3 · ∂a3/∂z3 · ∂z3/∂w32
∂L/∂w33 = ∂L/∂y · ∂y/∂a3 · ∂a3/∂z3 · ∂z3/∂w33
∂L/∂w34 = ∂L/∂y · ∂y/∂a3 · ∂a3/∂z3 · ∂z3/∂w34
Note: For example, w21 flows through neuron 2, so we apply the chain rule through neuron 2's pre-activation (z2) and activation (a2).
Gradients for Biases
Bias of Neuron 1
∂L/∂b1 = ∂L/∂y · ∂y/∂a1 · ∂a1/∂z1 · ∂z1/∂b1
Bias of Neuron 2
∂L/∂b2 = ∂L/∂y · ∂y/∂a2 · ∂a2/∂z2 · ∂z2/∂b2
Bias of Neuron 3
∂L/∂b3 = ∂L/∂y · ∂y/∂a3 · ∂a3/∂z3 · ∂z3/∂b3
🎯 We calculate these gradients by moving backward through the network — and that’s why the algorithm is called Backpropagation.
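Continuing the forward-pass sketch from above (same variables, same assumed sigmoid activation and linear output), the chain-rule products translate directly into a few lines of code:

```python
# Backward pass: each gradient is the chain-rule product, written in the same
# order as the formulas (loss -> output -> activation -> pre-activation -> weight).

dL_dy = 2 * (y_pred - y_true)           # ∂L/∂y for MSE
dy_da = v                               # ∂y/∂a_i: the output-layer weight on neuron i
da_dz = a * (1 - a)                     # ∂a_i/∂z_i for the assumed sigmoid
dz_dW = x                               # ∂z_i/∂w_ij: the input that weight multiplies

# Hidden-layer weight gradients: one row per neuron, one column per input weight.
dL_dW = (dL_dy * dy_da * da_dz)[:, None] * dz_dW[None, :]

# Bias gradients: ∂z_i/∂b_i = 1, so the last factor drops out.
dL_db = dL_dy * dy_da * da_dz
```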
Applying Gradient Descent
Once we compute the gradients, we use the gradient descent update rule:
w = w − η · ∂L/∂w
b = b − η · ∂L/∂b
Here η is the learning rate (for example, 0.05), and you can adjust it.
After applying the update:
✅ Loss decreases
✅ Accuracy improves
Repeat this process across epochs to gradually optimize the model.
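Putting it all together, here's a minimal training loop that reuses the variables from the sketches above (sigmoid, x, y_true, W, b, v, c) and repeats forward pass → backpropagation → gradient-descent update for a number of epochs. It trains on a single made-up sample, purely to show the mechanics:

```python
eta = 0.05                               # learning rate (example value)

for epoch in range(100):
    # forward pass
    z = W @ x + b
    a = sigmoid(z)
    y_pred = v @ a + c
    L = (y_pred - y_true) ** 2

    # backward pass (chain rule, as above)
    dL_dy = 2 * (y_pred - y_true)
    grad_hidden = dL_dy * v * a * (1 - a)
    dL_dW = grad_hidden[:, None] * x[None, :]
    dL_db = grad_hidden
    dL_dv = dL_dy * a                    # output-layer weight gradients
    dL_dc = dL_dy                        # output-layer bias gradient

    # gradient descent updates
    W -= eta * dL_dW
    b -= eta * dL_db
    v -= eta * dL_dv
    c -= eta * dL_dc

    if epoch % 20 == 0:
        print(f"epoch {epoch}: loss = {L:.4f}")
```

Watching the printed loss shrink toward zero is exactly the "walk downhill" picture from earlier.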
Before vs After Optimization
📍 Before Gradient Descent: You're sitting somewhere randomly on the loss curve.
📍 After One Update: You’ve moved closer to the local minimum.
This is the power of gradient descent — it helps your model learn how to learn.
Why This Matters
The heart of deep learning is optimization.
And the heart of optimization is:
Backpropagation + Gradient Descent
Without these, neural networks would just be complex calculators spitting out random values.
Thanks to backpropagation, networks can learn from their mistakes, and thanks to gradient descent, they can improve continuously.