<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rijul Rajesh</title>
    <description>The latest articles on DEV Community by Rijul Rajesh (@rijultp).</description>
    <link>https://dev.to/rijultp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1207862%2Ff06197aa-d585-4225-94a6-86243238376f.png</url>
      <title>DEV Community: Rijul Rajesh</title>
      <link>https://dev.to/rijultp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rijultp"/>
    <language>en</language>
    <item>
      <title>PyTorch for Neural Networks Part 10: Completing Training and Verifying the Results</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Wed, 10 Jun 2026 19:17:41 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-10-completing-training-and-verifying-the-results-4c7l</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-10-completing-training-and-verifying-the-results-4c7l</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-9-taking-steps-toward-better-predictions-483c"&gt;previous article&lt;/a&gt;, we completed the implementation for optimizing &lt;code&gt;final_bias&lt;/code&gt; using &lt;strong&gt;gradient descent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this final part, we will run the code and observe what happens &lt;strong&gt;before and after optimization&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Before Optimization
&lt;/h2&gt;

&lt;p&gt;When we first run the code, the value of &lt;code&gt;final_bias&lt;/code&gt; is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the starting value before any optimization takes place.&lt;/p&gt;

&lt;p&gt;If we graph the model at this stage, we get the following result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9n9fxlkq4dqvz9vid6u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9n9fxlkq4dqvz9vid6u.png" alt=" " width="764" height="562"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, the model does &lt;strong&gt;not fit the training data very well&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The predictions are far from the values we want, which means the loss is relatively large.&lt;/p&gt;




&lt;h2&gt;
  
  
  Watching Gradient Descent Update the Bias
&lt;/h2&gt;

&lt;p&gt;As the training loop runs, we can observe &lt;code&gt;final_bias&lt;/code&gt; changing at each step of &lt;strong&gt;gradient descent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With every epoch:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The model calculates the loss.&lt;/li&gt;
&lt;li&gt;Backpropagation computes the derivatives.&lt;/li&gt;
&lt;li&gt;The optimizer updates &lt;code&gt;final_bias&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The process repeats.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Over time, the model gradually moves toward a better value for &lt;code&gt;final_bias&lt;/code&gt;.&lt;/p&gt;
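
&lt;p&gt;The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the PyTorch script from this series: the toy model (&lt;code&gt;prediction = 2 * dose + bias&lt;/code&gt;), the data, the learning rate, and the names &lt;code&gt;tolerance&lt;/code&gt; and &lt;code&gt;steps_taken&lt;/code&gt; are all hypothetical, and the derivative is written out by hand instead of being computed by backpropagation.&lt;/p&gt;

```python
# Plain-Python sketch of the four-step loop (not the article's PyTorch script).
# Hypothetical toy model: prediction = 2 * dose + bias, with the best bias at 3.0.

inputs = [0.0, 0.5, 1.0]      # hypothetical doses
labels = [3.0, 4.0, 5.0]      # generated as 2 * dose + 3
bias = 0.0                    # starting value, before any optimization
learning_rate = 0.1           # hypothetical step size
tolerance = 0.0001            # stop once the total loss is this small
steps_taken = 0

for epoch in range(100):
    total_loss = 0.0
    grad = 0.0
    for dose, label in zip(inputs, labels):
        prediction = 2.0 * dose + bias
        total_loss += (prediction - label) ** 2   # 1. calculate the loss
        grad += 2.0 * (prediction - label)        # 2. derivative with respect to bias
    if tolerance > total_loss:
        break                                     # the fit is good enough, stop
    bias -= learning_rate * grad                  # 3. update the bias
    steps_taken += 1                              # 4. repeat with the new bias

print(steps_taken, bias)
```

&lt;p&gt;In this toy setup the bias moves from 0.0 toward 3.0 and the loop stops well before 100 epochs. The numbers differ from the article's run because the model and data here are made up.&lt;/p&gt;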




&lt;h2&gt;
  
  
  After Optimization
&lt;/h2&gt;

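&lt;p&gt;After &lt;strong&gt;34 steps&lt;/strong&gt;, the &lt;strong&gt;total loss becomes very small&lt;/strong&gt;: it drops below the &lt;code&gt;0.0001&lt;/code&gt; threshold checked in the training loop, so the loop reaches &lt;code&gt;break&lt;/code&gt; and the optimization process stops.&lt;/p&gt;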

&lt;p&gt;The final optimized value becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;16.0019&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felalz5j7wiys5d6fb32g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felalz5j7wiys5d6fb32g.png" alt=" " width="671" height="328"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Verifying the Result
&lt;/h2&gt;

&lt;p&gt;Finally, we can verify that the optimized model now fits the training data correctly by graphing the updated outputs.&lt;/p&gt;

&lt;p&gt;Using the graphing code from earlier, we get:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyb90p1g6kdfv6mi49u1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyb90p1g6kdfv6mi49u1.png" alt=" " width="716" height="574"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now the model fits the training data much better.&lt;/p&gt;

&lt;p&gt;This shows that &lt;strong&gt;gradient descent successfully optimized &lt;code&gt;final_bias&lt;/code&gt;&lt;/strong&gt;, allowing the neural network to learn the correct relationship from the training data.&lt;/p&gt;

&lt;p&gt;So that is it for this small introduction to building and training neural networks with PyTorch.&lt;/p&gt;

&lt;p&gt;We will explore more areas related to coding neural networks in the coming articles.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" alt=" " width="360" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;git-lrc&lt;/a&gt; fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/p&gt;

&lt;p&gt;Feedback and contributions are welcome! It's online, source-available, and ready for anyone to use.&lt;/p&gt;

&lt;p&gt;Give it a ⭐ &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;star on GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 9: Taking Steps Toward Better Predictions</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Tue, 09 Jun 2026 19:14:51 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-9-taking-steps-toward-better-predictions-483c</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-9-taking-steps-toward-better-predictions-483c</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-8-training-with-multiple-inputs-fo"&gt;previous article&lt;/a&gt;, we went through the optimization loop and passed all &lt;strong&gt;three training inputs&lt;/strong&gt; through the model.&lt;/p&gt;

&lt;p&gt;In this article, we will explore the additional steps we need to take when the &lt;strong&gt;total loss is not yet small enough&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Taking a Step Toward a Better Bias
&lt;/h2&gt;

&lt;p&gt;If &lt;code&gt;total_loss&lt;/code&gt; is still too large, we need to take a small step toward a better value for &lt;code&gt;final_bias&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We do this using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the previous article, we saw that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;calculates the derivatives and stores them in each model parameter's &lt;code&gt;.grad&lt;/code&gt; attribute.&lt;/p&gt;

&lt;p&gt;The optimizer can then use these stored derivatives to determine the &lt;strong&gt;correct direction to update the parameter&lt;/strong&gt;.&lt;/p&gt;
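
&lt;p&gt;As a rough illustration of what such an update looks like, here is a plain-Python sketch of a single gradient descent step. It assumes a basic update rule (parameter minus learning rate times derivative); the optimizer configured earlier in this series may behave differently. The helper &lt;code&gt;sgd_step&lt;/code&gt; and all numbers are hypothetical.&lt;/p&gt;

```python
# Plain-Python sketch of one gradient descent update (not PyTorch's optimizer code).
# All names and numbers are hypothetical and chosen only for illustration.

def sgd_step(param: float, grad: float, learning_rate: float) -> float:
    """Move the parameter a small step against the sign of its derivative."""
    return param - learning_rate * grad

final_bias = 0.0       # hypothetical starting value
stored_grad = -12.0    # hypothetical derivative left behind by backward()
learning_rate = 0.1    # hypothetical step size

# A negative derivative means the loss falls as the bias grows,
# so the update increases the bias.
final_bias = sgd_step(final_bias, stored_grad, learning_rate)
print(final_bias)
```

&lt;p&gt;The sign of the stored derivative decides the direction of the step, and the learning rate decides its size.&lt;/p&gt;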




&lt;h2&gt;
  
  
  Clearing Old Derivatives
&lt;/h2&gt;

&lt;p&gt;After updating the model, we need to clear the stored derivatives.&lt;/p&gt;

&lt;p&gt;We do this using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the updated training loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;input_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;label_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;label_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.0001&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Num steps: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; Final Bias: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_bias&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why Do We Need &lt;code&gt;zero_grad()&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;We clear the derivatives because of how PyTorch works.&lt;/p&gt;

&lt;p&gt;Earlier, we saw that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;accumulates derivatives&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This means that if we do not clear them, the derivatives calculated during the next epoch will be added to the old derivatives left over from the previous epoch.&lt;/p&gt;

&lt;p&gt;That would lead to incorrect updates.&lt;/p&gt;
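
&lt;p&gt;To see the effect, here is a plain-Python sketch that runs the same toy training twice, once resetting the stored derivative after every update and once never resetting it. The toy model (the prediction is just the bias), the data, and the helpers &lt;code&gt;total_gradient&lt;/code&gt; and &lt;code&gt;train&lt;/code&gt; are hypothetical; the sketch only mimics the accumulate, step, and reset pattern and is not PyTorch code.&lt;/p&gt;

```python
# Plain-Python sketch of why stored derivatives must be reset between updates.
# Hypothetical toy model: the prediction is just the bias, and every label is 3.0.

labels = [3.0, 3.0, 3.0]
learning_rate = 0.1

def total_gradient(bias: float) -> float:
    """Derivative of the summed squared residuals with respect to the bias."""
    return sum(2.0 * (bias - label) for label in labels)

def train(reset_between_epochs: bool, epochs: int = 5) -> float:
    bias = 0.0
    stored_grad = 0.0
    for _ in range(epochs):
        # Like backward(): add the new derivatives to whatever is already stored.
        stored_grad += total_gradient(bias)
        # Like optimizer.step(): move the bias using the stored derivative.
        bias -= learning_rate * stored_grad
        if reset_between_epochs:
            # Like optimizer.zero_grad(): forget the derivatives just used.
            stored_grad = 0.0
    return bias

with_reset = train(reset_between_epochs=True)
without_reset = train(reset_between_epochs=False)
print(with_reset, without_reset)
```

&lt;p&gt;With the reset, the bias moves steadily toward 3.0. Without it, stale derivatives keep pushing the bias after it has already passed the target, so after the same number of epochs it ends up much further away.&lt;/p&gt;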

&lt;p&gt;So after every optimization step, we reset the stored derivatives using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Tracking the Final Bias
&lt;/h2&gt;

&lt;p&gt;At the end of each epoch, we print:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the current &lt;strong&gt;epoch number&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;the current value of &lt;strong&gt;&lt;code&gt;final_bias&lt;/code&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows us to track how &lt;code&gt;final_bias&lt;/code&gt; changes during training.&lt;/p&gt;

&lt;p&gt;The process continues until:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the &lt;strong&gt;total loss becomes very small&lt;/strong&gt;, or&lt;/li&gt;
&lt;li&gt;we finish all &lt;strong&gt;100 epochs&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Printing the Final Optimized Bias
&lt;/h2&gt;

&lt;p&gt;Once training is complete, we can print the final optimized value for &lt;code&gt;final_bias&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Final bias, after optimization: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_bias&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;In the next article, we will run the script, watch how the value of &lt;code&gt;final_bias&lt;/code&gt; changes over time, and look at the final result.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 8: Training with Multiple Inputs</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Mon, 08 Jun 2026 19:24:32 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-8-training-with-multiple-inputs-fo</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-8-training-with-multiple-inputs-fo</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-7-training-with-loss-and-derivatives-1j8c"&gt;previous article&lt;/a&gt;, we began the optimization loop using the &lt;strong&gt;first input value&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this article, we will continue the same process for the remaining inputs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Processing the Second Training Point
&lt;/h2&gt;

&lt;p&gt;We start by selecting the &lt;strong&gt;input dose&lt;/strong&gt; and the &lt;strong&gt;label&lt;/strong&gt; for the second point in the training dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpvrtjd9ynphslrjyigd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpvrtjd9ynphslrjyigd.png" alt=" " width="571" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we run this second input through the model to get a &lt;strong&gt;predicted output&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Calculating the Loss
&lt;/h2&gt;

&lt;p&gt;Now, we calculate the &lt;strong&gt;loss&lt;/strong&gt;, which in this case is the &lt;strong&gt;squared residual&lt;/strong&gt; between the predicted value and the observed value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdd1890fyne5k2u8d2df.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdd1890fyne5k2u8d2df.png" alt=" " width="552" height="569"&gt;&lt;/a&gt;&lt;/p&gt;
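
&lt;p&gt;As a small worked example of a squared residual, the sketch below uses made-up numbers; the predicted and observed values are hypothetical and are not taken from the training data in this series.&lt;/p&gt;

```python
# Worked example of a squared residual with hypothetical values.
predicted = 0.8    # hypothetical model output
observed = 1.0     # hypothetical label

residual = predicted - observed      # how far the prediction is from the label
squared_residual = residual ** 2     # squaring removes the sign

print(squared_residual)
```

&lt;p&gt;Squaring makes the loss positive whether the prediction is too high or too low, and it penalizes large misses more than small ones.&lt;/p&gt;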

&lt;h2&gt;
  
  
  Calculating the Derivative
&lt;/h2&gt;

&lt;p&gt;After calculating the loss, we call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This calculates the &lt;strong&gt;derivative of the loss function with respect to &lt;code&gt;final_bias&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;However, there is something important to understand here.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;loss.backward()&lt;/code&gt; does &lt;strong&gt;not replace&lt;/strong&gt; the derivative from the previous training point.&lt;/p&gt;

&lt;p&gt;Instead, it &lt;strong&gt;adds to it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The derivative from the &lt;strong&gt;first point&lt;/strong&gt; is remembered.&lt;/li&gt;
&lt;li&gt;The derivative from the &lt;strong&gt;second point&lt;/strong&gt; is calculated.&lt;/li&gt;
&lt;li&gt;Both derivatives are &lt;strong&gt;added together&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, &lt;code&gt;loss.backward()&lt;/code&gt; &lt;strong&gt;accumulates derivatives&lt;/strong&gt; each time we go through the nested loop.&lt;/p&gt;
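
&lt;p&gt;The accumulation can be mimicked in plain Python. In this sketch the toy model (&lt;code&gt;prediction = dose + bias&lt;/code&gt;), the three data points, and the variable &lt;code&gt;accumulated_grad&lt;/code&gt; are hypothetical, and the derivative is written out by hand; it only illustrates the adding behavior and is not how PyTorch computes derivatives internally.&lt;/p&gt;

```python
# Plain-Python sketch of derivative accumulation across three training points.
# Hypothetical toy model and data: prediction = dose + bias.

inputs = [0.0, 0.5, 1.0]     # hypothetical doses
labels = [1.0, 2.0, 1.5]     # hypothetical observed values
bias = 0.0

accumulated_grad = 0.0
for dose, label in zip(inputs, labels):
    prediction = dose + bias
    # The derivative of (prediction - label) ** 2 with respect to bias
    # is 2 * (prediction - label).
    point_grad = 2.0 * (prediction - label)
    accumulated_grad += point_grad   # added to the stored value, not replaced
    print(point_grad, accumulated_grad)
```

&lt;p&gt;After the last point, the stored value equals the sum of the three per-point derivatives, which is the derivative of the total loss with respect to the bias.&lt;/p&gt;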

&lt;p&gt;After this, we add the new loss value to &lt;code&gt;total_loss&lt;/code&gt;, just like before.&lt;/p&gt;

&lt;h2&gt;
  
  
  Processing the Third and Final Input
&lt;/h2&gt;

&lt;p&gt;Now, let us process the &lt;strong&gt;third and final training point&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Again, we calculate the &lt;strong&gt;squared residual&lt;/strong&gt; for the last point.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5d7eptd5vhxw7o1a5fw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5d7eptd5vhxw7o1a5fw.png" alt=" " width="716" height="548"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When we call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PyTorch adds the derivative for this final point to the derivatives from the previous two points.&lt;/p&gt;

&lt;p&gt;Then, we add this squared residual to &lt;code&gt;total_loss&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At this stage, we are done processing all the training points for one epoch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Checking Whether Training Should Stop
&lt;/h2&gt;

&lt;p&gt;After the inner loop finishes, we check whether &lt;code&gt;total_loss&lt;/code&gt; has dropped below a small threshold (&lt;code&gt;0.0001&lt;/code&gt; in the code below).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;input_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;label_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;label_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.0001&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Num steps: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;total_loss&lt;/code&gt; becomes very small, it indicates that the model fits the training data well.&lt;/p&gt;

&lt;p&gt;At that point, we can stop training.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;break&lt;/code&gt; statement exits the optimization loop and ends the training process.&lt;/p&gt;

&lt;p&gt;However, if the loss is still too large, there are a few more steps we need to perform.&lt;/p&gt;

&lt;p&gt;We will explore those in the next article.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 7: Training with Loss and Derivatives</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 07 Jun 2026 20:21:14 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-7-training-with-loss-and-derivatives-1j8c</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-7-training-with-loss-and-derivatives-1j8c</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-6-understanding-epochs-and-loss-19c8"&gt;previous article&lt;/a&gt;, we explored concepts such as &lt;strong&gt;total loss&lt;/strong&gt; and &lt;strong&gt;epochs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now, we will continue with the training process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;input_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;label_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;label_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

        &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Running Through the Training Data
&lt;/h2&gt;

&lt;p&gt;We start with a &lt;strong&gt;nested &lt;code&gt;for&lt;/code&gt; loop&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This loop runs each data point from the training dataset through the model and calculates the &lt;strong&gt;total loss&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The inner loop starts with the &lt;strong&gt;first training point&lt;/strong&gt; and determines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the input value (or &lt;strong&gt;dose&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;the known output value (or &lt;strong&gt;effectiveness&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We store these values in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;input_i&lt;/span&gt;
&lt;span class="n"&gt;label_i&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting the Predicted Output
&lt;/h2&gt;

&lt;p&gt;Next, we run the input dose through the neural network:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives us the &lt;strong&gt;predicted output&lt;/strong&gt; from the model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Calculating the Loss
&lt;/h2&gt;

&lt;p&gt;Now, we calculate the difference between the &lt;strong&gt;predicted output&lt;/strong&gt; and the &lt;strong&gt;known label&lt;/strong&gt; using a &lt;strong&gt;loss function&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this example, we use the &lt;strong&gt;squared residual&lt;/strong&gt;, which is simply the square of the difference between the predicted value and the known value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_i&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;label_i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predicted output = &lt;strong&gt;0&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Known label = &lt;strong&gt;0&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the loss would be &lt;strong&gt;0&lt;/strong&gt;, which means the prediction is perfect.&lt;/p&gt;
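&lt;p&gt;As a quick check of the arithmetic, the squared residual can be sketched in plain Python, without PyTorch. The helper name &lt;code&gt;squared_residual&lt;/code&gt; and the second pair of values (a prediction of 17 against a label of 1) are illustrative assumptions, not part of the article's code.&lt;/p&gt;

```python
def squared_residual(predicted, label):
    """Square of the difference between a prediction and its known label."""
    return (predicted - label) ** 2


# The example from the text: a perfect prediction gives zero loss.
perfect_loss = squared_residual(0.0, 0.0)

# A hypothetical poor prediction: 17 predicted, 1 expected.
poor_loss = squared_residual(17.0, 1.0)

print(perfect_loss, poor_loss)
```

&lt;p&gt;Because the residual is squared, the loss is never negative, and it grows quickly as the prediction moves away from the label.&lt;/p&gt;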




&lt;h2&gt;
  
  
  Calculating Derivatives with Backpropagation
&lt;/h2&gt;

&lt;p&gt;After calculating the loss, we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



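&lt;p&gt;This computes the &lt;strong&gt;derivative of the loss function&lt;/strong&gt; with respect to every parameter that has &lt;code&gt;requires_grad=True&lt;/code&gt; and adds it to that parameter's &lt;code&gt;.grad&lt;/code&gt; attribute.&lt;/p&gt;

&lt;p&gt;In our case, the only such parameter is the &lt;strong&gt;final bias&lt;/strong&gt;, so the derivative tells us in which direction, and by roughly how much, it should change to reduce the loss.&lt;/p&gt;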
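&lt;p&gt;To get a feel for what kind of number this derivative is, here is a minimal sketch in plain Python rather than PyTorch's autograd. It assumes the prediction is a ReLU applied to a fixed sum plus &lt;code&gt;final_bias&lt;/code&gt;. The names &lt;code&gt;pre_bias_sum&lt;/code&gt;, &lt;code&gt;loss_for_bias&lt;/code&gt; and &lt;code&gt;analytic_gradient&lt;/code&gt;, and the values 17.0 and 1.0, are hypothetical. The sketch compares the chain-rule derivative with a finite-difference estimate.&lt;/p&gt;

```python
def loss_for_bias(pre_bias_sum, final_bias, label):
    """Squared residual when the output is relu(pre_bias_sum + final_bias)."""
    output = max(0.0, pre_bias_sum + final_bias)
    return (output - label) ** 2


def analytic_gradient(pre_bias_sum, final_bias, label):
    """Chain rule: d(loss)/d(final_bias) = 2 * (output - label) * relu_slope."""
    output = max(0.0, pre_bias_sum + final_bias)
    relu_slope = 0.0 if output == 0.0 else 1.0  # ReLU is flat where it outputs 0
    return 2.0 * (output - label) * relu_slope


# Hypothetical values: everything before the bias sums to 17, the label is 1.
pre_bias_sum, final_bias, label = 17.0, 0.0, 1.0

gradient = analytic_gradient(pre_bias_sum, final_bias, label)

# Central finite difference as an independent check of the derivative.
step = 1e-6
numeric = (
    loss_for_bias(pre_bias_sum, final_bias + step, label)
    - loss_for_bias(pre_bias_sum, final_bias - step, label)
) / (2 * step)

print(gradient, numeric)
```

&lt;p&gt;A positive derivative means that increasing &lt;code&gt;final_bias&lt;/code&gt; would increase the loss, so gradient descent moves &lt;code&gt;final_bias&lt;/code&gt; in the opposite direction, toward a smaller value.&lt;/p&gt;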




&lt;h2&gt;
  
  
  Tracking the Total Loss
&lt;/h2&gt;

&lt;p&gt;Finally, we add the loss value to &lt;code&gt;total_loss&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;total_loss&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows us to keep track of &lt;strong&gt;how well the model fits the entire training dataset&lt;/strong&gt; during each epoch.&lt;/p&gt;

&lt;p&gt;We have only gone through the &lt;strong&gt;first training point&lt;/strong&gt; so far.&lt;/p&gt;

&lt;p&gt;We still need to repeat this process for the remaining training points.&lt;/p&gt;

&lt;p&gt;We will continue that in the next article.&lt;/p&gt;







&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" alt=" " width="360" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;git-lrc&lt;/a&gt; fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/p&gt;

&lt;p&gt;Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.&lt;/p&gt;

&lt;p&gt;Give it a ⭐ &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;star on GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Identity Federation Explained: Understanding SSO, OAuth, OIDC, and SAML</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sat, 06 Jun 2026 20:14:00 +0000</pubDate>
      <link>https://dev.to/rijultp/identity-federation-explained-understanding-sso-oauth-oidc-and-saml-2b3m</link>
      <guid>https://dev.to/rijultp/identity-federation-explained-understanding-sso-oauth-oidc-and-saml-2b3m</guid>
      <description>&lt;p&gt;Have you ever clicked &lt;strong&gt;"Sign in with Google"&lt;/strong&gt; instead of creating a new account for a website?&lt;/p&gt;

&lt;p&gt;If so, you've already used &lt;strong&gt;Identity Federation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Identity Federation allows applications to trust a central identity source, known as an &lt;strong&gt;Identity Provider (IdP)&lt;/strong&gt;, to authenticate users. Instead of managing separate usernames and passwords for every application, users can sign in once and access multiple services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identity Federation in Action
&lt;/h2&gt;

&lt;p&gt;Imagine your company uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slack&lt;/li&gt;
&lt;li&gt;Jira&lt;/li&gt;
&lt;li&gt;Salesforce&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without identity federation, you would need separate credentials for each application.&lt;/p&gt;

&lt;p&gt;With identity federation, you sign in once with your company account, and each application trusts that authentication. This makes signing in smoother and, with fewer passwords to manage, more secure.&lt;/p&gt;

&lt;p&gt;This capability is known as &lt;strong&gt;Single Sign-On (SSO)&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Components
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Identity Provider (IdP)&lt;/strong&gt;&lt;br&gt;
The system that verifies a user's identity. Examples include Microsoft Entra ID, Okta, Google Identity, and Keycloak.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service Provider (SP)&lt;/strong&gt;&lt;br&gt;
The application a user wants to access, such as Slack, Jira, or Salesforce.&lt;/p&gt;

&lt;p&gt;When an application claims to support &lt;strong&gt;federation&lt;/strong&gt;, it means it can integrate with external Identity Providers and trust them for user authentication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Federation Protocols
&lt;/h2&gt;

&lt;h3&gt;
  
  
  OAuth 2.0
&lt;/h3&gt;

&lt;p&gt;OAuth 2.0 is an authorization framework that allows applications to access resources on a user's behalf without requiring their password.&lt;/p&gt;

&lt;p&gt;For example, a photo-printing app can request access to your Google Photos account. After you approve the request, Google grants the application limited access without exposing your password.&lt;/p&gt;
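&lt;p&gt;As a rough sketch of the first step in such a flow, the application sends the user to the provider's authorization endpoint with a handful of query parameters. The parameter names (&lt;code&gt;response_type&lt;/code&gt;, &lt;code&gt;client_id&lt;/code&gt;, &lt;code&gt;redirect_uri&lt;/code&gt;, &lt;code&gt;scope&lt;/code&gt;, &lt;code&gt;state&lt;/code&gt;) are the standard OAuth 2.0 authorization-request parameters. The endpoint URL, client ID, redirect URI, and scope value below are placeholders, not real Google values, and the function name is hypothetical.&lt;/p&gt;

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(endpoint, client_id, redirect_uri, scope, state):
    """Build the URL the user's browser is sent to for consent."""
    query = urlencode(
        {
            "response_type": "code",  # ask for an authorization code
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,  # random value echoed back to protect against CSRF
        }
    )
    return f"{endpoint}?{query}"


state = secrets.token_urlsafe(16)
url = build_authorization_url(
    endpoint="https://idp.example.com/authorize",  # hypothetical endpoint
    client_id="photo-printing-app",  # hypothetical client ID
    redirect_uri="https://printing.example.com/callback",  # hypothetical URI
    scope="photos.read",  # hypothetical scope name
    state=state,
)

# Parse the URL back to confirm which parameters were sent.
params = parse_qs(urlparse(url).query)
print(params["scope"], params["response_type"])
```

&lt;p&gt;The password never appears in this exchange: the user authenticates with the provider directly, and the application only receives a code that it can later exchange for a limited-access token.&lt;/p&gt;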

&lt;h3&gt;
  
  
  OpenID Connect (OIDC)
&lt;/h3&gt;

&lt;p&gt;OpenID Connect (OIDC) is built on top of OAuth 2.0 and adds user authentication.&lt;/p&gt;

&lt;p&gt;When you click &lt;strong&gt;"Sign in with Google"&lt;/strong&gt;, OIDC lets the application verify who you are through a signed ID token and obtain basic profile information, such as your name or email address.&lt;/p&gt;
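&lt;p&gt;In OIDC, the identity information arrives as an ID token, a JSON Web Token made of three base64url-encoded parts separated by dots. The sketch below builds a sample token and reads its claims to show that structure. The issuer, subject, client ID, and email are made up, the signature is a placeholder, and the helper names are hypothetical. It deliberately skips signature verification, which a real application must perform before trusting any claim.&lt;/p&gt;

```python
import base64
import json


def b64url_encode(data):
    """Base64url-encode bytes and drop the trailing padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Restore the padding that JWTs omit, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def decode_claims_unverified(id_token):
    """Read the payload of a JWT WITHOUT checking its signature."""
    header_b64, payload_b64, signature_b64 = id_token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Made-up claims of the kind an ID token typically carries.
sample_claims = {
    "iss": "https://idp.example.com",  # who issued the token
    "sub": "user-123",  # stable identifier for the user
    "aud": "my-client-id",  # the application the token is meant for
    "email": "ada@example.com",
}
header = {"alg": "RS256", "typ": "JWT"}

sample_token = ".".join(
    [
        b64url_encode(json.dumps(header).encode("utf-8")),
        b64url_encode(json.dumps(sample_claims).encode("utf-8")),
        "signature-placeholder",  # not a real signature
    ]
)

claims = decode_claims_unverified(sample_token)
print(claims["sub"], claims["email"])
```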

&lt;h3&gt;
  
  
  SAML
&lt;/h3&gt;

&lt;p&gt;Security Assertion Markup Language (SAML) is an XML-based standard used to exchange authentication information between an Identity Provider and a Service Provider.&lt;/p&gt;

&lt;p&gt;SAML is commonly used in enterprise environments to enable Single Sign-On across multiple business applications.&lt;/p&gt;

&lt;p&gt;To summarize, Identity Federation lets organizations centralize authentication and provide Single Sign-On across multiple applications.&lt;/p&gt;

&lt;p&gt;The most common technologies involved are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OAuth 2.0&lt;/strong&gt; for authorization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OIDC&lt;/strong&gt; for authentication and identity information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SAML&lt;/strong&gt; for enterprise Single Sign-On&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" alt=" " width="360" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;git-lrc&lt;/a&gt; fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/p&gt;

&lt;p&gt;Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.&lt;/p&gt;

&lt;p&gt;Give it a ⭐ &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;star on Github&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 6: Understanding Epochs and Loss</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Fri, 05 Jun 2026 12:10:59 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-6-understanding-epochs-and-loss-19c8</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-6-understanding-epochs-and-loss-19c8</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-5-preparing-the-model-for-training-7b2"&gt;previous article&lt;/a&gt;, we prepared everything needed to optimize our neural network and find the ideal value for the &lt;strong&gt;final bias&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this article, we will begin implementing the optimization process.&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating the Optimizer
&lt;/h2&gt;

&lt;p&gt;First, we create an &lt;strong&gt;optimizer object&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We will use &lt;strong&gt;Stochastic Gradient Descent (SGD)&lt;/strong&gt; to optimize &lt;code&gt;final_bias&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To optimize &lt;code&gt;final_bias&lt;/code&gt;, we pass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to &lt;code&gt;SGD&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;PyTorch will automatically optimize every parameter where:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In our case, only &lt;code&gt;final_bias&lt;/code&gt; has &lt;code&gt;requires_grad=True&lt;/code&gt;, so that is the only parameter that will be updated during training.&lt;/p&gt;

&lt;p&gt;Here, &lt;code&gt;lr&lt;/code&gt; stands for &lt;strong&gt;learning rate&lt;/strong&gt;, which is set to &lt;strong&gt;0.1&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The learning rate controls how large each update step is during optimization.&lt;/p&gt;
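&lt;p&gt;As a rough illustration of what the learning rate does, a single plain gradient descent update subtracts the learning rate times the gradient from the parameter, which is the rule &lt;code&gt;SGD&lt;/code&gt; follows when momentum and weight decay are left at zero. This sketch is plain Python, not PyTorch's implementation. The function name &lt;code&gt;sgd_step&lt;/code&gt; and the gradient value 32.0 are made up for the example.&lt;/p&gt;

```python
def sgd_step(parameter, gradient, learning_rate):
    """One plain gradient descent update: move against the gradient."""
    return parameter - learning_rate * gradient


final_bias = 0.0
gradient = 32.0  # hypothetical derivative of the loss with respect to final_bias

# The same gradient produces a step ten times larger with lr=0.1 than with lr=0.01.
small_step = sgd_step(final_bias, gradient, learning_rate=0.01)
large_step = sgd_step(final_bias, gradient, learning_rate=0.1)

print(small_step, large_step)
```

&lt;p&gt;A larger learning rate reaches a good value in fewer steps but can overshoot it, while a smaller one is more cautious and needs more epochs.&lt;/p&gt;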




&lt;h2&gt;
  
  
  Understanding Epochs
&lt;/h2&gt;

&lt;p&gt;Before continuing, there is one important term we need to understand: &lt;strong&gt;epoch&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;epoch&lt;/strong&gt; is one complete pass through the entire training dataset.&lt;/p&gt;

&lt;p&gt;In this example, our training data contains &lt;strong&gt;3 data points&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every time all 3 training points are passed through the model once, we call it &lt;strong&gt;one epoch&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Running the Optimization Loop
&lt;/h2&gt;

&lt;p&gt;We can now start the optimization process using a &lt;code&gt;for&lt;/code&gt; loop that counts the number of epochs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop will run the training process &lt;strong&gt;100 times&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In other words, the model will see the full training dataset &lt;strong&gt;100 times&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tracking the Loss
&lt;/h2&gt;

&lt;p&gt;Next, we initialize a variable called &lt;code&gt;total_loss&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This stores the &lt;strong&gt;loss&lt;/strong&gt;, which is a measure of &lt;strong&gt;how well the model fits the training data&lt;/strong&gt;: the lower the loss, the better the fit.&lt;/p&gt;

&lt;p&gt;To better understand &lt;code&gt;total_loss&lt;/code&gt;, let us look at an example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73pg5l47jy5ipccu3mp1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73pg5l47jy5ipccu3mp1.png" alt=" " width="573" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the figure above, the unoptimized model fits the training data poorly.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;residuals&lt;/strong&gt; (the differences between what the model predicts and what we know is true) are large.&lt;/p&gt;

&lt;p&gt;Because the residuals are large, the &lt;strong&gt;loss will also be relatively large&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;Now imagine the model improves and fits the training data more closely.&lt;/p&gt;

&lt;p&gt;The residuals become smaller.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaccd8h5vcht1qt0bi5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaccd8h5vcht1qt0bi5n.png" alt=" " width="739" height="603"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this case, the &lt;strong&gt;loss becomes smaller&lt;/strong&gt; because the model predictions are closer to the correct values.&lt;/p&gt;




&lt;p&gt;So, during each epoch, we use &lt;code&gt;total_loss&lt;/code&gt; to keep track of &lt;strong&gt;how well the model fits the training data&lt;/strong&gt;.&lt;/p&gt;
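&lt;p&gt;The link between residuals and total loss can be sketched with a few made-up numbers. The labels and both sets of predictions below are hypothetical, as is the helper name &lt;code&gt;total_squared_loss&lt;/code&gt;, and the sketch assumes the loss for each point is its squared residual.&lt;/p&gt;

```python
def total_squared_loss(predictions, labels):
    """Sum of squared residuals over every training point."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels))


labels = [0.0, 1.0, 0.0]  # hypothetical known values for 3 training points

poor_predictions = [0.0, 17.0, 0.0]   # large residual on the middle point
better_predictions = [0.0, 1.5, 0.0]  # much smaller residual

poor_total = total_squared_loss(poor_predictions, labels)
better_total = total_squared_loss(better_predictions, labels)

print(poor_total, better_total)
```

&lt;p&gt;Comparing this total from one epoch to the next is a simple way to see whether training is moving the model toward the data.&lt;/p&gt;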

&lt;p&gt;We will continue building the optimization process in the next article.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" alt=" " width="360" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;git-lrc&lt;/a&gt; fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/p&gt;

&lt;p&gt;Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.&lt;/p&gt;

&lt;p&gt;Give it a ⭐ &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;star on Github&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 5: Preparing the Model for Training</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Thu, 04 Jun 2026 12:05:01 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-5-preparing-the-model-for-training-7b2</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-5-preparing-the-model-for-training-7b2</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-4-testing-the-neural-network-2j5f"&gt;previous article&lt;/a&gt;, we created input values for our neural network and tested it out.&lt;/p&gt;

&lt;p&gt;In this article, we will start preparing for &lt;strong&gt;backpropagation&lt;/strong&gt; so that we can handle cases where the &lt;strong&gt;final bias is unknown&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating a Trainable Version of the Neural Network
&lt;/h2&gt;

&lt;p&gt;To begin, we will create a copy of our neural network.&lt;/p&gt;

&lt;p&gt;Let us call it &lt;code&gt;MyBasicNN_train&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We name it this way because this version of the neural network will be &lt;strong&gt;trained&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN_train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w01&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;40.8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w10&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;12.6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b10&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w11&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;2.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_bias&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Making the Final Bias Trainable
&lt;/h2&gt;

&lt;p&gt;Here, we initialize &lt;code&gt;final_bias&lt;/code&gt; with a value of &lt;strong&gt;0.0&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We also set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;for &lt;code&gt;final_bias&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This tells PyTorch that this parameter should be &lt;strong&gt;optimized during training&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Unlike the other weights and biases, which remain fixed, the final bias will be updated using &lt;strong&gt;backpropagation&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Visualizing the Untrained Model
&lt;/h2&gt;

&lt;p&gt;Now, let us visualize what this untrained model looks like.&lt;/p&gt;

&lt;p&gt;This time, we will use &lt;code&gt;MyBasicNN_train&lt;/code&gt; instead of &lt;code&gt;MyBasicNN&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN_train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;output_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_doses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;whitegrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lineplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_doses&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;output_values&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;green&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;linewidth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.5&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Effectiveness&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dose&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;output_values&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separates the tensor from PyTorch’s gradient tracking system so that we can safely plot the values.&lt;/p&gt;

&lt;p&gt;The result looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtx0x04pt4zhl959te5c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtx0x04pt4zhl959te5c.png" alt=" " width="768" height="564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Training Is Needed
&lt;/h2&gt;

&lt;p&gt;In the original graph, we had:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Effectiveness = 1&lt;/strong&gt; when &lt;strong&gt;Dose = 0.5&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;which is the correct value.&lt;/p&gt;

&lt;p&gt;However, in this new graph, we get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Effectiveness = 17&lt;/strong&gt; when &lt;strong&gt;Dose = 0.5&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;which is clearly far too high.&lt;/p&gt;

&lt;p&gt;This tells us that the &lt;strong&gt;final bias is not correct&lt;/strong&gt; and needs to be optimized.&lt;/p&gt;

&lt;p&gt;To fix this, we will train the neural network.&lt;/p&gt;




&lt;h2&gt;
  
  
  Training Data
&lt;/h2&gt;

&lt;p&gt;To train the model, we need two things:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Input values
&lt;/h3&gt;

&lt;p&gt;We will use three input doses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Labels
&lt;/h3&gt;

&lt;p&gt;These are the correct output values we want the model to learn:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With these inputs and labels, we are ready to start optimizing the &lt;strong&gt;final bias&lt;/strong&gt;.&lt;/p&gt;
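
&lt;p&gt;One way to hold this training data in code is as two tensors of equal length. This is only a sketch; the variable names &lt;code&gt;inputs&lt;/code&gt; and &lt;code&gt;labels&lt;/code&gt; are assumptions chosen for illustration.&lt;/p&gt;

```python
import torch

# The three doses and their desired effectiveness values from the text above.
inputs = torch.tensor([0.0, 0.5, 1.0])
labels = torch.tensor([0.0, 1.0, 0.0])

# Pair each dose with the label the model should learn for it.
for dose, label in zip(inputs, labels):
    print(dose.item(), label.item())
```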

&lt;p&gt;In the next article, we will explore how to do this using &lt;strong&gt;backpropagation&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c6mz17iiajj885fmxgb.png" alt=" " width="360" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;git-lrc&lt;/a&gt; fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.&lt;/p&gt;

&lt;p&gt;Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.&lt;/p&gt;

&lt;p&gt;Give it a ⭐ &lt;a href="https://github.com/HexmosTech/git-lrc" rel="noopener noreferrer"&gt;star on Github&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 4: Testing the Neural Network</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Tue, 02 Jun 2026 19:30:55 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-4-testing-the-neural-network-2j5f</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-4-testing-the-neural-network-2j5f</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-3-forward-passes-1ccn"&gt;previous article&lt;/a&gt;, we defined the &lt;strong&gt;forward pass&lt;/strong&gt; for our neural network.&lt;/p&gt;

&lt;p&gt;Now, we will provide inputs to the network so that we can test whether it works correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating Input Values
&lt;/h2&gt;

&lt;p&gt;To create a sequence of input values, we can define them like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;input_doses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we use the PyTorch function &lt;code&gt;linspace()&lt;/code&gt; to create a tensor containing &lt;strong&gt;11 evenly spaced values between 0 and 1&lt;/strong&gt;, including both endpoints.&lt;/p&gt;

&lt;p&gt;The resulting tensor is stored in a variable called &lt;code&gt;input_doses&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can print &lt;code&gt;input_doses&lt;/code&gt; to see what it looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.0000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.7000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="mf"&gt;0.9000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0000&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
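
&lt;p&gt;To see why 11 steps give a spacing of 0.1, note that 11 points leave 10 gaps between 0 and 1. A small sketch that checks this (the name &lt;code&gt;spacing&lt;/code&gt; is introduced here only for illustration):&lt;/p&gt;

```python
import torch

input_doses = torch.linspace(start=0, end=1, steps=11)

# 11 points from 0 to 1 leave 10 equal gaps between neighbours.
spacing = (1 - 0) / (11 - 1)

print(len(input_doses))  # 11
print(spacing)           # 0.1
```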



&lt;h2&gt;
  
  
  Creating the Neural Network
&lt;/h2&gt;

&lt;p&gt;Now, we need to create an instance of our neural network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Passing Inputs Through the Neural Network
&lt;/h2&gt;

&lt;p&gt;Next, we pass the input values through the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;output_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_doses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Calling &lt;code&gt;model(input_doses)&lt;/code&gt; invokes the module's &lt;code&gt;__call__&lt;/code&gt; method, which in turn runs the &lt;code&gt;forward()&lt;/code&gt; method that we defined earlier.&lt;/p&gt;

&lt;p&gt;The outputs from the neural network are stored in the variable &lt;code&gt;output_values&lt;/code&gt;.&lt;/p&gt;
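
&lt;p&gt;The link between calling the model and &lt;code&gt;forward()&lt;/code&gt; can be checked with a toy module. &lt;code&gt;Doubler&lt;/code&gt; below is a hypothetical class, not part of this series; it exists only to keep the sketch self-contained.&lt;/p&gt;

```python
import torch
import torch.nn as nn


class Doubler(nn.Module):
    """Hypothetical toy module that doubles its input."""

    def forward(self, input):
        return input * 2


toy_model = Doubler()
doses = torch.tensor([0.0, 0.5, 1.0])

via_call = toy_model(doses)             # the usual way to run a module
via_forward = toy_model.forward(doses)  # calling forward() directly

print(torch.equal(via_call, via_forward))  # True
```

&lt;p&gt;Calling the model directly is generally preferred, because it also runs any hooks registered on the module.&lt;/p&gt;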




&lt;h2&gt;
  
  
  Plotting the Results
&lt;/h2&gt;

&lt;p&gt;Now, let us visualize the outputs using a graph.&lt;/p&gt;

&lt;p&gt;For this, we will use &lt;strong&gt;Seaborn&lt;/strong&gt; and &lt;strong&gt;Matplotlib&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;whitegrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lineplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_doses&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;output_values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;green&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;linewidth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.5&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we set labels for the graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Effectiveness&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dose&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives us the following result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffb14j7fuqkav0hnq30n2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffb14j7fuqkav0hnq30n2.png" alt=" " width="568" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can try out the code yourself in &lt;a href="https://colab.research.google.com/drive/1L4lngCNSbojTn4GlIyXoBsWh7BAHlmeQ?usp=sharing" rel="noopener noreferrer"&gt;this Colab notebook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the next article, we will explore a scenario where the &lt;strong&gt;final bias is unknown&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We will solve this problem by implementing &lt;strong&gt;backpropagation in code&lt;/strong&gt;.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 3: Forward Passes</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Mon, 01 Jun 2026 19:18:57 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-3-forward-passes-1ccn</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-3-forward-passes-1ccn</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-2-initializing-weights-and-biases-n7i"&gt;previous article&lt;/a&gt;, we defined an example neural network and started creating variables for the &lt;strong&gt;weights and biases&lt;/strong&gt; using PyTorch.&lt;/p&gt;

&lt;p&gt;In this article, we will continue further by exploring how to perform &lt;strong&gt;forward passes&lt;/strong&gt; through the neural network.&lt;/p&gt;

&lt;p&gt;For reference, this is the neural network that we are going to recreate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3lto81noak6djghc0rw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3lto81noak6djghc0rw.png" alt=" " width="800" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, let's begin!&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating the First Calculation
&lt;/h2&gt;

&lt;p&gt;We will start by creating &lt;code&gt;input_to_top_relu&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_to_top_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we take the input, multiply it by the weight &lt;code&gt;w00&lt;/code&gt;, and add the bias &lt;code&gt;b00&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This gives us the input for the first &lt;strong&gt;ReLU activation function&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Applying the ReLU Activation Function
&lt;/h2&gt;

&lt;p&gt;Next, we pass &lt;code&gt;input_to_top_relu&lt;/code&gt; through the &lt;strong&gt;ReLU activation function&lt;/strong&gt; using &lt;code&gt;F.relu()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_to_top_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt;
    &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_top_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;F&lt;/code&gt; comes from the following import:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn.functional&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This module gives us access to activation functions such as &lt;strong&gt;ReLU&lt;/strong&gt;.&lt;/p&gt;
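
&lt;p&gt;As a quick illustration of what ReLU does, the sketch below applies &lt;code&gt;F.relu()&lt;/code&gt; to a few sample values (chosen here only for demonstration). Negative values become 0 and everything else passes through unchanged.&lt;/p&gt;

```python
import torch
import torch.nn.functional as F

# Sample values chosen only for demonstration.
values = torch.tensor([-1.0, 0.0, 2.0])

# ReLU keeps positive values and replaces negative ones with 0.
activated = F.relu(values)

print(activated)  # tensor([0., 0., 2.])
```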




&lt;h2&gt;
  
  
  Scaling the ReLU Output
&lt;/h2&gt;

&lt;p&gt;Now we use &lt;code&gt;top_relu_output&lt;/code&gt; to calculate &lt;code&gt;scaled_top_relu_output&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here, we simply multiply the output by another weight.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_to_top_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt;
    &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_top_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w01&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Repeating the Same Process for the Bottom Path
&lt;/h2&gt;

&lt;p&gt;We perform the same set of operations for the bottom part of the network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_to_top_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt;
    &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_top_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w01&lt;/span&gt;

    &lt;span class="n"&gt;input_to_bottom_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w10&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b10&lt;/span&gt;
    &lt;span class="n"&gt;bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_bottom_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w11&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Combining Everything Together
&lt;/h2&gt;

&lt;p&gt;Finally, we add the outputs from the top and bottom paths along with the final bias.&lt;/p&gt;

&lt;p&gt;Then, we pass the result through one final &lt;strong&gt;ReLU activation function&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_to_top_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt;
    &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_top_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_top_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;top_relu_output&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w01&lt;/span&gt;

    &lt;span class="n"&gt;input_to_bottom_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w10&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b10&lt;/span&gt;
    &lt;span class="n"&gt;bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_bottom_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scaled_bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bottom_relu_output&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w11&lt;/span&gt;

    &lt;span class="n"&gt;input_to_final_relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;scaled_top_relu_output&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;scaled_bottom_relu_output&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_bias&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_to_final_relu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
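
&lt;p&gt;To try the forward pass on its own, here is a condensed, self-contained sketch of the whole class. Only &lt;code&gt;w00 = 1.7&lt;/code&gt; is stated in this series at this point; every other weight and bias value below is an assumption chosen for illustration, so treat the printed number as an example rather than a reference result.&lt;/p&gt;

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyBasicNN(nn.Module):
    def __init__(self):
        super().__init__()
        # w00 = 1.7 comes from the series; all other values are assumed.
        self.w00 = nn.Parameter(torch.tensor(1.7), requires_grad=False)
        self.b00 = nn.Parameter(torch.tensor(-0.85), requires_grad=False)
        self.w01 = nn.Parameter(torch.tensor(-40.8), requires_grad=False)
        self.w10 = nn.Parameter(torch.tensor(12.6), requires_grad=False)
        self.b10 = nn.Parameter(torch.tensor(0.0), requires_grad=False)
        self.w11 = nn.Parameter(torch.tensor(2.7), requires_grad=False)
        self.final_bias = nn.Parameter(torch.tensor(-16.0), requires_grad=False)

    def forward(self, input):
        # Top path: weight and bias, ReLU, then scale.
        top = F.relu(input * self.w00 + self.b00) * self.w01
        # Bottom path: the same three steps with its own parameters.
        bottom = F.relu(input * self.w10 + self.b10) * self.w11
        # Combine both paths with the final bias and apply the last ReLU.
        return F.relu(top + bottom + self.final_bias)


model = MyBasicNN()
output_at_half = model(torch.tensor(0.5))
print(output_at_half)
```

&lt;p&gt;With these assumed values, a dose of 0.5 gives an output close to 1, while doses of 0 and 1 give 0.&lt;/p&gt;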






&lt;h2&gt;
  
  
  What We Have Built So Far
&lt;/h2&gt;

&lt;p&gt;At this point, our neural network has &lt;strong&gt;two methods&lt;/strong&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;__init__()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This method is used to &lt;strong&gt;initialize the weights and biases&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;forward()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This method performs a &lt;strong&gt;forward pass&lt;/strong&gt; through the neural network.&lt;/p&gt;

&lt;p&gt;It takes an input value and calculates the output using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;weights&lt;/li&gt;
&lt;li&gt;biases&lt;/li&gt;
&lt;li&gt;activation functions&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Now that we have built the forward pass, the next step is to test whether everything works correctly by plugging in some input values.&lt;/p&gt;

&lt;p&gt;We will explore that in the next article.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 2: Initializing Weights and Biases</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Sun, 31 May 2026 20:11:34 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-2-initializing-weights-and-biases-n7i</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-2-initializing-weights-and-biases-n7i</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/pytorch-for-neural-networks-part-1-writing-your-first-neural-network-in-pytorch-37lc"&gt;previous article&lt;/a&gt;, we got started with expressing a neural network in the form of Python code.&lt;/p&gt;

&lt;p&gt;In this article, we will continue building on that.&lt;/p&gt;

&lt;p&gt;This is the neural network that we will recreate using code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5jkl14aa87wzovh50ig.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5jkl14aa87wzovh50ig.png" alt=" " width="800" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see the weights and biases shown in the diagram above. Let us now add them to our code.&lt;/p&gt;

&lt;p&gt;We start with the basic neural network class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our first weight has the value &lt;strong&gt;1.70&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We can represent it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we create a new attribute called &lt;code&gt;w00&lt;/code&gt; and wrap its value in &lt;code&gt;nn.Parameter&lt;/code&gt;, which makes it a &lt;strong&gt;neural network parameter&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When we define a weight as a parameter, PyTorch treats it as part of the neural network and gives us the option to optimize it during training.&lt;/p&gt;

&lt;p&gt;Since this value is stored as a &lt;strong&gt;tensor&lt;/strong&gt;, the neural network can take advantage of features such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;automatic differentiation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;accelerated mathematical operations&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are unfamiliar with tensors, check out my &lt;a href="https://dev.to/rijultp/tensors-explained-part-1-how-ai-systems-represent-data-mbj"&gt;earlier article on tensors&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Since we do not need to optimize this weight, we set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;requires_grad&lt;/code&gt; is short for &lt;strong&gt;requires gradient&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;By setting it to &lt;code&gt;False&lt;/code&gt;, we tell PyTorch not to compute a gradient for this parameter, so its value stays fixed during optimization.&lt;/p&gt;
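
&lt;p&gt;As a minimal sketch of what this flag changes, the snippet below builds a tiny module with one frozen and one trainable parameter and runs a single backward pass. The class name &lt;code&gt;TinyModule&lt;/code&gt;, the attribute names &lt;code&gt;frozen&lt;/code&gt; and &lt;code&gt;trainable&lt;/code&gt;, and the input value are made up for illustration and are not part of the network in this series.&lt;/p&gt;

```python
import torch
import torch.nn as nn


class TinyModule(nn.Module):
    """Hypothetical two-parameter module used only for this illustration."""

    def __init__(self):
        super().__init__()
        # Frozen: PyTorch will not compute a gradient for this value.
        self.frozen = nn.Parameter(torch.tensor(1.7), requires_grad=False)
        # Trainable: PyTorch tracks operations on this value.
        self.trainable = nn.Parameter(torch.tensor(0.0), requires_grad=True)


model = TinyModule()
x = torch.tensor(2.0)  # arbitrary input value

# A toy computation that uses both parameters.
output = model.frozen * x + model.trainable
output.backward()

frozen_grad = model.frozen.grad        # no gradient was computed
trainable_grad = model.trainable.grad  # d(output)/d(trainable) = 1

print("frozen grad:", frozen_grad)
print("trainable grad:", trainable_grad)
```

&lt;p&gt;The frozen parameter ends up with no gradient at all, so an optimizer has nothing to apply to it, while the trainable one receives a gradient as usual.&lt;/p&gt;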

&lt;p&gt;In a similar way, we can define the rest of the weights and biases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w00&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b00&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w01&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;40.8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w10&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;12.6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b10&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w11&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;2.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_bias&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;16.&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;requires_grad&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
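
&lt;p&gt;One way to check that all seven values were registered is to loop over &lt;code&gt;named_parameters()&lt;/code&gt;, a method every &lt;code&gt;nn.Module&lt;/code&gt; provides. The sketch below repeats the class so that it runs on its own; the &lt;code&gt;summary&lt;/code&gt; list is just an illustrative helper.&lt;/p&gt;

```python
import torch
import torch.nn as nn


class MyBasicNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Same fixed values as in the article; none of them is trainable.
        self.w00 = nn.Parameter(torch.tensor(1.7), requires_grad=False)
        self.b00 = nn.Parameter(torch.tensor(-0.85), requires_grad=False)
        self.w01 = nn.Parameter(torch.tensor(-40.8), requires_grad=False)
        self.w10 = nn.Parameter(torch.tensor(12.6), requires_grad=False)
        self.b10 = nn.Parameter(torch.tensor(0.0), requires_grad=False)
        self.w11 = nn.Parameter(torch.tensor(2.7), requires_grad=False)
        self.final_bias = nn.Parameter(torch.tensor(-16.), requires_grad=False)


model = MyBasicNN()

# Collect (name, value, requires_grad) for every registered parameter.
summary = [
    (name, param.item(), param.requires_grad)
    for name, param in model.named_parameters()
]

for row in summary:
    print(row)
```

&lt;p&gt;Each attribute wrapped in &lt;code&gt;nn.Parameter&lt;/code&gt; shows up in the listing, in the order it was assigned, with &lt;code&gt;requires_grad&lt;/code&gt; reported as &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;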






&lt;p&gt;Now that we have initialized the weights and biases, the next step is to create a &lt;strong&gt;forward pass&lt;/strong&gt; through the neural network.&lt;/p&gt;

&lt;p&gt;The forward pass defines how the input moves through the network using these weights and biases.&lt;/p&gt;

&lt;p&gt;To handle this logic, we need to define another method called &lt;code&gt;forward()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We will cover that in the next article.&lt;/p&gt;








</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>PyTorch for Neural Networks Part 1: Writing Your First Neural Network in PyTorch</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Fri, 29 May 2026 21:16:24 +0000</pubDate>
      <link>https://dev.to/rijultp/pytorch-for-neural-networks-part-1-writing-your-first-neural-network-in-pytorch-37lc</link>
      <guid>https://dev.to/rijultp/pytorch-for-neural-networks-part-1-writing-your-first-neural-network-in-pytorch-37lc</guid>
      <description>&lt;p&gt;In my previous series of articles, we mainly explored the &lt;strong&gt;theory behind various neural network concepts&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this new series, we will focus on &lt;strong&gt;putting that knowledge into practice using code&lt;/strong&gt;. This will be a fun way to turn what we have learned into something more practical.&lt;/p&gt;

&lt;p&gt;We will start with the basics and build things step by step.&lt;/p&gt;

&lt;p&gt;For this article, we will be using the following modules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Importing PyTorch
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;torch&lt;/code&gt; is used to create &lt;strong&gt;tensors&lt;/strong&gt;, which store all the numerical data in neural networks, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;raw input data&lt;/li&gt;
&lt;li&gt;weights&lt;/li&gt;
&lt;li&gt;biases&lt;/li&gt;
&lt;/ul&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This module helps us define and build neural network components.&lt;/p&gt;

&lt;p&gt;It also allows us to make weights and biases part of the neural network.&lt;/p&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn.functional&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This module gives us access to various &lt;strong&gt;activation functions&lt;/strong&gt; and other useful operations.&lt;/p&gt;
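
&lt;p&gt;As a small illustration, the snippet below applies ReLU, one of the activation functions exposed by this module, to a few arbitrary values. ReLU is chosen here only as an example; the input numbers are made up.&lt;/p&gt;

```python
import torch
import torch.nn.functional as F

# Arbitrary sample values: one negative, one zero, one positive.
inputs = torch.tensor([-1.0, 0.0, 2.0])

# ReLU replaces negative values with zero and keeps the rest unchanged.
activated = F.relu(inputs)

print(activated)
```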






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torch.optim&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SGD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SGD&lt;/code&gt;, which stands for &lt;strong&gt;Stochastic Gradient Descent&lt;/strong&gt;, is an optimization algorithm used to fit the neural network to data.&lt;/p&gt;
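
&lt;p&gt;To give a rough idea of how the optimizer is used, here is a sketch of a single update step on one value. The variable &lt;code&gt;bias&lt;/code&gt;, the target of 3.0, and the learning rate of 0.1 are assumptions picked for illustration, not values from this series.&lt;/p&gt;

```python
import torch
from torch.optim import SGD

# A single trainable value, starting at 0.
bias = torch.tensor(0.0, requires_grad=True)

# The optimizer is given the values it is allowed to update.
optimizer = SGD([bias], lr=0.1)

# Toy loss: squared distance between the value and a target of 3.
loss = (bias - 3.0) ** 2

loss.backward()        # d(loss)/d(bias) = 2 * (0 - 3) = -6
optimizer.step()       # new value = 0 - 0.1 * (-6) = 0.6
optimizer.zero_grad()  # clear the gradient before any further step

updated_value = bias.item()
print(updated_value)
```

&lt;p&gt;One step moves the value from 0 toward the target; repeating the same three calls in a loop is what training amounts to.&lt;/p&gt;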




&lt;h2&gt;
  
  
  Creating a Neural Network
&lt;/h2&gt;

&lt;p&gt;Now let us begin building our neural network.&lt;/p&gt;

&lt;p&gt;When creating a neural network in PyTorch, we usually start by creating a class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we create a class named &lt;code&gt;MyBasicNN&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This class inherits from a PyTorch class called &lt;code&gt;nn.Module&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;By inheriting from &lt;code&gt;nn.Module&lt;/code&gt;, our class gains all the functionality needed to behave like a neural network in PyTorch.&lt;/p&gt;




&lt;h2&gt;
  
  
  Initializing the Neural Network
&lt;/h2&gt;

&lt;p&gt;Next, we define the initialization method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyBasicNN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we define the constructor (&lt;code&gt;__init__&lt;/code&gt;) for our neural network.&lt;/p&gt;

&lt;p&gt;The line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;calls the initialization method of the parent class &lt;code&gt;nn.Module&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This ensures that all the necessary PyTorch functionality is properly set up for our neural network.&lt;/p&gt;
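
&lt;p&gt;To see why this call matters, the sketch below deliberately leaves it out. The class name &lt;code&gt;BrokenNN&lt;/code&gt; is hypothetical and exists only to show the failure: without the parent initialization, PyTorch has nowhere to register a parameter and refuses the assignment.&lt;/p&gt;

```python
import torch
import torch.nn as nn


class BrokenNN(nn.Module):
    """Hypothetical class that skips the parent initialization on purpose."""

    def __init__(self):
        # super().__init__() is intentionally missing here.
        self.w00 = nn.Parameter(torch.tensor(1.7))


try:
    BrokenNN()
    error_message = None
except AttributeError as exc:
    # PyTorch rejects the parameter assignment with an AttributeError.
    error_message = str(exc)

print(error_message)
```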




&lt;h2&gt;
  
  
  What Comes Next?
&lt;/h2&gt;

&lt;p&gt;The next step is to initialize the &lt;strong&gt;weights and biases&lt;/strong&gt; for our neural network.&lt;/p&gt;

&lt;p&gt;Before doing that, we first need an example problem so we know what kind of neural network we want to build.&lt;/p&gt;

&lt;p&gt;We will explore that in the next article.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Tensors Explained Part 2: Why Tensors Are Useful</title>
      <dc:creator>Rijul Rajesh</dc:creator>
      <pubDate>Fri, 29 May 2026 03:52:37 +0000</pubDate>
      <link>https://dev.to/rijultp/tensors-explained-part-2-why-tensors-are-useful-31g1</link>
      <guid>https://dev.to/rijultp/tensors-explained-part-2-why-tensors-are-useful-31g1</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/rijultp/tensors-explained-part-1-how-ai-systems-represent-data-mbj"&gt;previous article&lt;/a&gt;, we started with a brief introduction to &lt;strong&gt;tensors&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this article, we will explore &lt;strong&gt;why tensors are useful&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Tensors Matter
&lt;/h2&gt;

&lt;p&gt;Unlike plain scalars, arrays, matrices, and multi-dimensional arrays, tensors are designed to take advantage of &lt;strong&gt;hardware acceleration&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tensors do not just store data in different shapes.&lt;/p&gt;

&lt;p&gt;They are also designed to perform mathematical operations on that data &lt;strong&gt;efficiently and quickly&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tensors and Hardware Acceleration
&lt;/h2&gt;

&lt;p&gt;Tensors can take advantage of &lt;strong&gt;GPUs (Graphics Processing Units)&lt;/strong&gt;, which many of us use in our day-to-day devices.&lt;/p&gt;

&lt;p&gt;GPUs are very good at performing many mathematical calculations in parallel, making them useful for training neural networks.&lt;/p&gt;
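
&lt;p&gt;As a sketch of how this looks in PyTorch (the library is an assumption here, since this article does not name one), a tensor can be created on a GPU when one is available and on the CPU otherwise. The matrix values are arbitrary.&lt;/p&gt;

```python
import torch

# Use a CUDA GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a small matrix directly on the chosen device.
a = torch.ones(2, 2, device=device)

# The matrix multiplication runs on whichever device holds the tensor.
product = a @ a

print(product.device, product.cpu().tolist())
```

&lt;p&gt;The same code runs unchanged on either kind of hardware; only the &lt;code&gt;device&lt;/code&gt; value differs.&lt;/p&gt;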

&lt;p&gt;There is also specialized hardware called &lt;strong&gt;TPUs (Tensor Processing Units)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;TPUs are specifically designed to work with tensors and help neural networks run even faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automatic Differentiation
&lt;/h2&gt;

&lt;p&gt;Another important use case of tensors is in &lt;strong&gt;backpropagation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In neural networks, we estimate the optimal weights and biases using backpropagation, which computes the derivatives that gradient descent needs to update them.&lt;/p&gt;

&lt;p&gt;This process requires calculating many derivatives and applying the &lt;strong&gt;chain rule&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of manually calculating all these derivatives, tensor frameworks can handle this automatically using something called &lt;strong&gt;automatic differentiation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This means that even as neural networks become more complex, tensors help manage the difficult mathematical calculations behind the scenes.&lt;/p&gt;
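
&lt;p&gt;A minimal sketch of automatic differentiation, assuming PyTorch as the tensor framework: the function and the input value below are made up for illustration. The derivative is worked out by hand in the comment and then computed by the framework.&lt;/p&gt;

```python
import torch

# Mark x as a value we want a derivative for.
x = torch.tensor(3.0, requires_grad=True)

# y = (2x + 1)^2, a composition that needs the chain rule.
y = (2 * x + 1) ** 2

# Ask the framework to compute dy/dx automatically.
y.backward()

# By hand: dy/dx = 2 * (2x + 1) * 2 = 2 * 7 * 2 = 28 at x = 3.
gradient = x.grad.item()
print(gradient)
```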

&lt;p&gt;So that is it for tensors.&lt;/p&gt;

&lt;p&gt;In the next article, we will explore another topic.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
