PyTorch for Neural Networks Part 9: Taking Steps Toward Better Predictions

#ai #machinelearning

In the previous article, we went through the optimization loop and passed all three training inputs through the model.

In this article, we will explore the additional steps we need to take when the total loss is not yet small enough.

Taking a Step Toward a Better Bias

If total_loss is still too large, we need to take a small step toward a better value for final_bias.

We do this using:

optimizer.step()

In the previous article, we saw that:

loss.backward()

calculates derivatives and stores them inside the model parameters.

The optimizer then reads these stored derivatives to decide in which direction, and by how much, to move each parameter.
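
To make the update concrete, here is a minimal sketch of a single step. It assumes plain stochastic gradient descent (SGD) with a learning rate of 0.1, and it uses a standalone final_bias tensor and a made-up label instead of the model from this series, so the numbers are purely illustrative.

```python
import torch
from torch.optim import SGD

# Hypothetical stand-in for the model's trainable parameter.
final_bias = torch.tensor(0.0, requires_grad=True)

# Assumption: plain SGD with a learning rate of 0.1.
optimizer = SGD([final_bias], lr=0.1)

# Made-up target value, used only for this illustration.
label = torch.tensor(1.0)

loss = (final_bias - label) ** 2
loss.backward()  # stores d(loss)/d(final_bias) = 2 * (0 - 1) = -2 in final_bias.grad

grad_before_step = float(final_bias.grad)

# For plain SGD the update is: final_bias = final_bias - lr * grad
optimizer.step()

value_after_step = float(final_bias)
print("Gradient:", grad_before_step)
print("final_bias after one step:", value_after_step)
```

With these assumptions the derivative is negative, so the step moves final_bias upward, toward the label. Other optimizers apply different update rules, but they read the same stored derivatives.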

Clearing Old Derivatives

After updating the model, we need to clear the stored derivatives.

We do this using:

optimizer.zero_grad()

Here is the updated training loop:

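for epoch in range(100):
    # Reset the running total of the loss at the start of every epoch.
    total_loss = 0

    # Pass each training input through the model, one at a time.
    for iteration in range(len(inputs)):
        input_i = inputs[iteration]
        label_i = labels[iteration]

        output_i = model(input_i)

        # Squared error between the prediction and the known label.
        loss = (output_i - label_i) ** 2

        # Add this input's derivative to the ones already stored.
        loss.backward()

        total_loss += float(loss)

    # Stop early once the predictions are close enough to the labels.
    if total_loss < 0.0001:
        print("Num steps: " + str(epoch))
        break

    # Update final_bias, then clear the derivatives for the next epoch.
    optimizer.step()
    optimizer.zero_grad()

    print(
        "Step: "
        + str(epoch)
        + " Final Bias: "
        + str(model.final_bias.data)
        + "\n"
    )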
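
The loop above relies on the model, inputs, labels, and optimizer defined in earlier articles. If you want something you can run on its own, here is a rough sketch of the same loop. TinyModel, the data values, and the choice of SGD with a learning rate of 0.1 are all assumptions made for this illustration, not the network from this series.

```python
import torch
import torch.nn as nn
from torch.optim import SGD


class TinyModel(nn.Module):
    """Hypothetical stand-in model: output = input + final_bias."""

    def __init__(self):
        super().__init__()
        # The only trainable parameter, starting at 0.
        self.final_bias = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return x + self.final_bias


# Made-up data, chosen so that the best value for final_bias is 2.0.
inputs = torch.tensor([0.0, 0.5, 1.0])
labels = torch.tensor([2.0, 2.5, 3.0])

model = TinyModel()
optimizer = SGD(model.parameters(), lr=0.1)  # assumption: plain SGD

for epoch in range(100):
    total_loss = 0.0

    for iteration in range(len(inputs)):
        output_i = model(inputs[iteration])
        loss = (output_i - labels[iteration]) ** 2
        loss.backward()  # adds this input's derivative to the stored total
        total_loss += float(loss)

    if total_loss < 0.0001:
        print("Num steps: " + str(epoch))
        break

    optimizer.step()       # move final_bias using the summed derivatives
    optimizer.zero_grad()  # clear them before the next epoch

print("Final bias, after optimization: " + str(model.final_bias.data))
```

With this toy setup, final_bias should settle close to 2.0 well before the 100-epoch limit is reached.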

Why Do We Need `zero_grad()`?

We clear the derivatives because of how PyTorch works.

Earlier, we saw that:

loss.backward()

accumulates derivatives.

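Within a single epoch, this is exactly what we want: the three calls to loss.backward() add up the derivatives for all three training inputs before we take a step.

But if we do not clear them afterward, the derivatives computed in the next epoch will be added on top of the ones left over from the previous epoch.

The optimizer would then update final_bias using stale derivatives, which leads to incorrect updates.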

So after every optimization step, we reset the stored derivatives using:

optimizer.zero_grad()
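
Here is a small sketch of that accumulation. It uses a standalone final_bias tensor and a made-up loss, and it assumes an SGD optimizer, so treat it as an illustration and not as part of the training script. Depending on the PyTorch version, zero_grad() either sets the stored derivative to zero or replaces it with None, so the check below allows for both.

```python
import torch
from torch.optim import SGD

# Hypothetical stand-in for the model's trainable parameter.
final_bias = torch.tensor(0.0, requires_grad=True)
optimizer = SGD([final_bias], lr=0.1)  # assumption: plain SGD


def one_backward_pass():
    # Made-up loss whose derivative at final_bias = 0 is -2.
    loss = (final_bias - 1.0) ** 2
    loss.backward()


one_backward_pass()
grad_after_one = float(final_bias.grad)

# A second backward pass adds to the stored derivative
# instead of replacing it.
one_backward_pass()
grad_after_two = float(final_bias.grad)

# Clearing resets the stored derivative (to None or to zero,
# depending on the PyTorch version).
optimizer.zero_grad()
grad_cleared = final_bias.grad is None or float(final_bias.grad) == 0.0

# After clearing, a new backward pass starts from a clean slate.
one_backward_pass()
grad_fresh = float(final_bias.grad)

print("After one backward pass:", grad_after_one)
print("After two backward passes:", grad_after_two)
print("Cleared by zero_grad():", grad_cleared)
print("After clearing and one more pass:", grad_fresh)
```

The second value is twice the first even though final_bias never changed, which is the stale build-up that zero_grad() prevents between epochs.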

Tracking the Final Bias

At the end of each epoch, we print:

the current epoch number
the current value of final_bias

This allows us to track how final_bias changes during training.

The process continues until:

the total loss becomes very small, or
we finish all 100 epochs.

Printing the Final Optimized Bias

Once training is complete, we can print the final optimized value for final_bias.

print(
    "Final bias, after optimization: "
    + str(model.final_bias.data)
)

In the next article, we will run this script, watch how the value of final_bias changes over time, and look at the final result.
