In the previous article, we set up the chain rule equation for the derivative of the cross entropy with respect to the bias ( b_3 ).
In this article, we solve it step by step.
Let us solve the first part.
We begin by computing the derivative of the cross entropy with respect to the predicted probability for Setosa.
We use a familiar formula:
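The formula here is presumably the derivative of the natural logarithm. That is an assumption, based on cross entropy usually being defined with the natural log:

$$
\frac{d}{dx}\,\ln(x) = \frac{1}{x}
$$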
Applying this here gives
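Written out, and assuming the cross entropy for a Setosa observation is ( CE = -\ln(p_{\text{Setosa}}) ), where ( p_{\text{Setosa}} ) is shorthand chosen here for the predicted probability for Setosa, the step looks like this:

$$
\frac{d\,CE}{d\,p_{\text{Setosa}}} = \frac{d}{d\,p_{\text{Setosa}}}\bigl(-\ln(p_{\text{Setosa}})\bigr) = -\frac{1}{p_{\text{Setosa}}}
$$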
Now let us solve the second part:
We start by writing the softmax equation for the predicted probability:
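In symbols, assuming the three Iris classes (Setosa, Versicolor, Virginica) and writing ( \text{Raw} ) for each raw output value (notation chosen here for illustration), softmax can be sketched as:

$$
p_{\text{Setosa}} = \frac{e^{\text{Raw}_{\text{Setosa}}}}{e^{\text{Raw}_{\text{Setosa}}} + e^{\text{Raw}_{\text{Versicolor}}} + e^{\text{Raw}_{\text{Virginica}}}}
$$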
Taking the derivative with respect to the raw output for Setosa gives
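For softmax, this derivative takes the standard form below, where ( p_{\text{Setosa}} ) and ( \text{Raw}_{\text{Setosa}} ) are shorthand chosen here for the predicted probability and the raw output for Setosa:

$$
\frac{d\,p_{\text{Setosa}}}{d\,\text{Raw}_{\text{Setosa}}} = p_{\text{Setosa}}\,\bigl(1 - p_{\text{Setosa}}\bigr)
$$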
We will use this result in the chain rule.
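As a quick sanity check, the sketch below compares the standard softmax derivative, p(1 - p), with a finite-difference estimate. The raw output values are made up for illustration and are not taken from the article's network.

```python
import math

def softmax(raw):
    """Turn a list of raw output values into probabilities that sum to 1."""
    largest = max(raw)
    exps = [math.exp(r - largest) for r in raw]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for (Setosa, Versicolor, Virginica)
raw = [1.43, -0.40, 0.23]
p_setosa = softmax(raw)[0]

# Analytic derivative of p_setosa with respect to the raw output for Setosa
analytic = p_setosa * (1.0 - p_setosa)

# Central finite difference: nudge only the Setosa raw output
h = 1e-6
p_up = softmax([raw[0] + h, raw[1], raw[2]])[0]
p_down = softmax([raw[0] - h, raw[1], raw[2]])[0]
numeric = (p_up - p_down) / (2 * h)

print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed values should agree to several decimal places if the formula is right.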
Now let us solve the final part:
This is the derivative of the raw output for Setosa with respect to the bias ( b_3 ).
Taking the derivative with respect to ( b_3 ):
- The derivative of the blue bent surface with respect to ( b_3 ) is 0, since it is independent of ( b_3 ).
- The derivative of the orange bent surface with respect to ( b_3 ) is also 0.
- The derivative of ( b_3 ) with respect to ( b_3 ) is 1.
This gives
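Put as equations, and treating the raw output for Setosa as the sum of the two bent surfaces and the bias, as the bullets describe (( \text{Blue} ) and ( \text{Orange} ) are placeholder names chosen here for those surfaces):

$$
\text{Raw}_{\text{Setosa}} = \text{Blue} + \text{Orange} + b_3
$$

$$
\frac{d\,\text{Raw}_{\text{Setosa}}}{d\,b_3} = 0 + 0 + 1 = 1
$$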
Now that all parts are computed, we return to the chain rule.
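Multiplying the three parts together, the chain rule can be written as below. The symbols are shorthand chosen here (( p_{\text{Setosa}} ) for the predicted probability, ( \text{Raw}_{\text{Setosa}} ) for the raw output), and each factor is the standard result for natural-log cross entropy and softmax:

$$
\frac{d\,CE}{d\,b_3}
= \frac{d\,CE}{d\,p_{\text{Setosa}}} \times \frac{d\,p_{\text{Setosa}}}{d\,\text{Raw}_{\text{Setosa}}} \times \frac{d\,\text{Raw}_{\text{Setosa}}}{d\,b_3}
= -\frac{1}{p_{\text{Setosa}}} \times p_{\text{Setosa}}\,\bigl(1 - p_{\text{Setosa}}\bigr) \times 1
$$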
When the observed data corresponds to Setosa, the predicted probability for Setosa is used to compute the cross-entropy loss. The expression above is therefore the derivative of the cross entropy with respect to ( b_3 ).
Simplifying this expression gives
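The simplification is a cancellation of ( p_{\text{Setosa}} ), assuming the three factors are ( -1/p_{\text{Setosa}} ), ( p_{\text{Setosa}}(1 - p_{\text{Setosa}}) ), and 1:

$$
-\frac{1}{p_{\text{Setosa}}} \times p_{\text{Setosa}}\,\bigl(1 - p_{\text{Setosa}}\bigr) \times 1
= -\bigl(1 - p_{\text{Setosa}}\bigr)
= p_{\text{Setosa}} - 1
$$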
So, when the predicted probability for Setosa is used to compute the cross entropy, the derivative of the cross entropy with respect to ( b_3 ) is
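In symbols, with ( p_{\text{Setosa}} ) as shorthand for the predicted probability for Setosa, the standard result for softmax combined with natural-log cross entropy is:

$$
\frac{d\,CE}{d\,b_3} = p_{\text{Setosa}} - 1
$$

A minimal sketch to check this numerically follows. It assumes ( b_3 ) feeds only the raw output for Setosa, and that the cross entropy is the negative natural log of the predicted probability. The surface heights, the other raw outputs, and the value of `b3` are made-up illustration values, not numbers from the article.

```python
import math

def softmax(raw):
    """Turn a list of raw output values into probabilities that sum to 1."""
    largest = max(raw)
    exps = [math.exp(r - largest) for r in raw]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_setosa(b3, blue=0.9, orange=-0.3,
                         raw_versicolor=-0.40, raw_virginica=0.23):
    """Return (cross entropy, p_setosa) when the observed species is Setosa.

    blue and orange stand in for the heights of the two bent surfaces;
    all default values are hypothetical.
    """
    raw_setosa = blue + orange + b3  # raw output = two surfaces + bias
    p_setosa = softmax([raw_setosa, raw_versicolor, raw_virginica])[0]
    return -math.log(p_setosa), p_setosa

b3 = 0.5  # hypothetical bias value
loss, p_setosa = cross_entropy_setosa(b3)

# Analytic derivative from the chain rule: p_setosa - 1
analytic = p_setosa - 1.0

# Central finite difference of the loss with respect to b3
h = 1e-6
loss_up = cross_entropy_setosa(b3 + h)[0]
loss_down = cross_entropy_setosa(b3 - h)[0]
numeric = (loss_up - loss_down) / (2 * h)

print("analytic:", analytic)
print("numeric: ", numeric)
```

Because the predicted probability lies between 0 and 1, this derivative is always negative, so a gradient descent step would increase ( b_3 ) and push the predicted probability for Setosa upward.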
In the next article, we will continue by applying the same process for Virginica.