Discussion on: Build a flexible Neural Network with Backpropagation in Python

Replies for: Nice, but never seems to converge on array([[ 0.92, 0.86, 0.89]]). What's a good learning rate for the W update step? It should probably get sm...

Hey! I'm not very well-versed in calculus, but are you sure that would be the derivative? As I understand it, self.sigmoid(s) * (1 - self.sigmoid(s)) takes the input s, runs it through the sigmoid function, and then uses that output as the input to the derivative. I tested it out and it works, but if I run the code the way it is right now (using the derivative in the article), I get a super low loss and it's more or less accurate after ~100k training iterations.

I'd really love to know what's actually wrong. Could you explain why the derivative is wrong, perhaps from a calculus perspective?

Haytam Zanid • Aug 13 '18

There is nothing wrong with your derivative. max is talking about the actual derivative definition, but he's forgetting that you already calculated sigmoid(s) and stored it in the layers, so there is no need to calculate it again when using the derivative.
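
To make the point concrete, here is a small sketch (not the article's code; the function names below are made up for illustration). It assumes the layer output a = sigmoid(s) was already stored during the forward pass, so the derivative can be computed directly from a. Running sigmoid on the stored output a second time gives a different, incorrect value.

```python
import numpy as np

def sigmoid(s):
    # Standard logistic function.
    return 1.0 / (1.0 + np.exp(-s))

def sigmoid_prime_from_input(s):
    # Textbook definition: takes the raw pre-activation s.
    a = sigmoid(s)
    return a * (1.0 - a)

def sigmoid_prime_from_output(a):
    # Shortcut: takes the already-activated value a = sigmoid(s).
    return a * (1.0 - a)

s = np.array([[-2.0, 0.0, 3.0]])  # hypothetical pre-activations
a = sigmoid(s)                    # what the forward pass would store

# Both routes give the same derivative.
same = np.allclose(sigmoid_prime_from_input(s), sigmoid_prime_from_output(a))

# Applying sigmoid again to the stored output does NOT give the same result.
double_applied = sigmoid(a) * (1.0 - sigmoid(a))
differs = not np.allclose(double_applied, sigmoid_prime_from_output(a))

print(same, differs)
```

So whether a derivative function should call sigmoid internally depends on what you pass in: the raw s or the stored activation.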

Justin Chang • Oct 22 '17

The derivation for the sigmoid prime function can be found here.
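
For readers without the link, the standard derivation goes roughly as follows (this is the usual textbook argument and may not match the linked page step for step). Starting from the definition of the sigmoid and applying the chain rule:

$$
\sigma(s) = \frac{1}{1 + e^{-s}}, \qquad
\sigma'(s) = \frac{e^{-s}}{(1 + e^{-s})^{2}}
= \frac{1}{1 + e^{-s}} \cdot \frac{e^{-s}}{1 + e^{-s}}
= \sigma(s)\,\bigl(1 - \sigma(s)\bigr)
$$

The last step uses $1 - \sigma(s) = \frac{e^{-s}}{1 + e^{-s}}$, which is why the derivative can be written purely in terms of the sigmoid's output.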