Machine learning comes with many rules to learn, but one worth keeping in mind from the start is the chain rule.
As the word "chain" suggests, you can picture a chain reaction: changing one thing causes a change in a second thing, which in turn causes a change in a third.
The Core Concept: Links
Let’s take a shoe-size example.
Think of three measurements:
- Weight
- Height
- Shoe size
For this example, assume the following relationships between these measurements:
- Weight predicts height
- Height predicts shoe size
If you want to know how much a change in weight affects shoe size, the chain rule says you multiply the derivatives of these links together.
So, we can express this as:
Change in Shoe Size per Weight = (Change in Shoe Size per Height) × (Change in Height per Weight)
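This "multiply the links" rule can be checked numerically for any smooth functions, not just our shoe example. Here is a minimal sketch: `f` and `g` below are arbitrary illustrative functions (not from the article), and a central finite difference approximates each derivative.

```python
import math

# Numeric check of the chain rule: d/dx f(g(x)) = f'(g(x)) * g'(x).
# f and g are arbitrary smooth functions, chosen only for illustration.

def deriv(fn, x, eps=1e-6):
    """Central finite-difference approximation of d(fn)/dx at x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

g = lambda x: x ** 2   # inner link
f = math.sin           # outer link

x = 1.3
lhs = deriv(lambda t: f(g(t)), x)      # derivative of the composition
rhs = deriv(f, g(x)) * deriv(g, x)     # product of the links' derivatives

print(abs(lhs - rhs) < 1e-5)  # True: both sides agree
```

Swapping in any other differentiable `f` and `g` gives the same agreement, which is exactly what the rule promises.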
Plugging in Values
Let’s plug in some example values to understand this better.
Assume that for every 1 unit increase in weight, height increases by 2 units.
Change in height / Change in weight = 2 / 1 = 2
Now assume that for every 1 unit increase in height, shoe size increases by 1/4 unit.
Change in shoe size / Change in height = (1/4) / 1 = 1/4
According to the chain rule:
Change in shoe size per weight = 1/4 × 2 = 1/2
This means that for every 1 unit increase in weight, shoe size increases by 1/2 unit.
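The arithmetic above can be written out directly. This sketch uses the article's assumed linear links (height = 2 × weight, shoe size = height / 4); the starting weight of 70 is an arbitrary illustrative value.

```python
# Assumed linear links from the example above.
d_height_d_weight = 2.0    # height gains 2 units per unit of weight
d_shoe_d_height = 0.25     # shoe size gains 1/4 unit per unit of height

# Chain rule: multiply the derivatives of the links.
d_shoe_d_weight = d_shoe_d_height * d_height_d_weight
print(d_shoe_d_weight)  # 0.5

# Sanity check: add 1 unit of weight and shoe size rises by 0.5.
shoe = lambda weight: 0.25 * (2.0 * weight)
print(shoe(71.0) - shoe(70.0))  # 0.5
```

Because both links are linear here, the slopes are constant and the check works at any starting weight.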
So, that is basically the chain rule.
Wrapping Up
The chain rule is a fundamental building block that prepares us for concepts like backpropagation.
In the next part of this article, we’ll move on to gradient descent.
If you’ve ever struggled with repetitive tasks, obscure commands, or debugging headaches, this platform is here to make your life easier. It’s free, open-source, and built with developers in mind.
👉 Explore the tools: FreeDevTools
👉 Star the repo: freedevtools