<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: dattran1999</title>
    <description>The latest articles on DEV Community by dattran1999 (@dattran1999).</description>
    <link>https://dev.to/dattran1999</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F203670%2F30ac41e7-2f46-4913-b1f4-46d82df778f6.png</url>
      <title>DEV Community: dattran1999</title>
      <link>https://dev.to/dattran1999</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dattran1999"/>
    <language>en</language>
    <item>
      <title>How Neural Networks Work</title>
      <dc:creator>dattran1999</dc:creator>
      <pubDate>Thu, 25 Feb 2021 11:40:29 +0000</pubDate>
      <link>https://dev.to/dattran1999/how-neural-networks-work-dma</link>
      <guid>https://dev.to/dattran1999/how-neural-networks-work-dma</guid>
      <description>&lt;h1&gt;
  
  
  Teaching Philosophy
&lt;/h1&gt;

&lt;p&gt;I know that there are A LOT of tutorials/blog posts on neural networks already (some of my favourites include the 3B1B series on YouTube),&lt;br&gt;
but I am a big advocate of learning by doing. So this series will not just present a bunch of information to you, &lt;br&gt;
but will actually ask you to implement the things we cover in each post.&lt;/p&gt;
&lt;h1&gt;
  
  
  Introduction to Neural Networks
&lt;/h1&gt;



&lt;p&gt;The inspiration for &lt;strong&gt;Artificial Neural Networks&lt;/strong&gt; (or neural networks for short) comes from biological neural networks. But I haven't had a biology class &lt;br&gt;
since high school, so I have no idea how a biological neural network works :) but I bet it looks something like this:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1063%2F0%2Au-AnjlGU9IxM5_Ju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1063%2F0%2Au-AnjlGU9IxM5_Ju.png" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this tutorial, we will go through the &lt;em&gt;primitive&lt;/em&gt; building block of an Artificial Neural Network, which is the perceptron.&lt;/p&gt;
&lt;h2&gt;
  
  
  Assumed maths knowledge
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Functions&lt;/li&gt;
&lt;li&gt;Coordinate geometry
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;
  
  
  Perceptron
&lt;/h1&gt;

&lt;p&gt;The perceptron and its learning rule are not popular anymore, but they are a great starting point for building an understanding of how everything works.&lt;/p&gt;

&lt;p&gt;The goal of a perceptron is to &lt;strong&gt;classify&lt;/strong&gt; sets of points.&lt;/p&gt;
&lt;h2&gt;
  
  
  How a perceptron works
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Definition&lt;/strong&gt;: A perceptron is a function that takes several inputs, and produces one output.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpythonmachinelearning.pro%2Fwp-content%2Fuploads%2F2017%2F09%2FSingle-Perceptron.png.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpythonmachinelearning.pro%2Fwp-content%2Fuploads%2F2017%2F09%2FSingle-Perceptron.png.webp" alt="img"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Formula&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fy%3Df%28w_0%2Bw_1x_1%2Bw_2x_2%2Bw_3x_3%29" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fy%3Df%28w_0%2Bw_1x_1%2Bw_2x_2%2Bw_3x_3%29" alt="\Large y=f(w_0+w_1x_1+w_2x_2+w_3x_3)"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
where the w's are the weights (w_0 is the constant term, often called the bias), f is the activation function (explained below), the x's are the inputs, and y is the output.&lt;/p&gt;

&lt;p&gt;This is basically taking a weighted sum of the inputs (a first-degree polynomial) and putting it into some function called the activation function.&lt;/p&gt;
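
&lt;p&gt;The formula above can be sketched in Python roughly as follows. This is only an illustration: the name perceptron, the convention that weights[0] holds w_0, and passing the activation in as a parameter are assumptions made for this sketch, not something fixed by the post.&lt;/p&gt;

```python
from typing import Callable, Sequence


def perceptron(inputs: Sequence[float],
               weights: Sequence[float],
               activation: Callable[[float], float]) -> float:
    """Compute y = f(w_0 + w_1*x_1 + ... + w_n*x_n).

    Assumed convention: weights[0] is w_0 (the constant term) and
    weights[1:] pair up with the inputs in order.
    """
    if len(weights) != len(inputs) + 1:
        raise ValueError("expected exactly one more weight than inputs")
    weighted_sum = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return activation(weighted_sum)


# With the identity function as a stand-in activation, the output is just
# the weighted sum: 0.5 + 1*1 - 1*2 + 2*3 = 5.5
print(perceptron([1.0, 2.0, 3.0], [0.5, 1.0, -1.0, 2.0], lambda z: z))
```

&lt;p&gt;Keeping the activation as a parameter means the same sketch works for whichever activation function is plugged in later.&lt;/p&gt;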

&lt;p&gt;As mentioned above, the goal of a perceptron is to &lt;strong&gt;classify&lt;/strong&gt; sets of points.&lt;/p&gt;
&lt;h2&gt;
  
  
  Weights of perceptron and Classification
&lt;/h2&gt;

&lt;p&gt;To understand the importance of weights, it's useful to think about the case where we only have 2 inputs. &lt;br&gt;
Considering only the part where we multiply inputs by weights and sum them up, we have:&lt;br&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fw_0%2Bw_1x_1%2Bw_2x_2" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fw_0%2Bw_1x_1%2Bw_2x_2" alt="w_0+w_1x_1+w_2x_2"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
Notice that this expression is very similar to the standard form of a linear equation, which is:&lt;br&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fw_0%2Bw_1x_1%2Bw_2x_2%3D0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3Fw_0%2Bw_1x_1%2Bw_2x_2%3D0" alt="w_0+w_1x_1+w_2x_2=0"&gt;&lt;/a&gt;  &lt;/p&gt;
&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;Consider the following diagram, where we want to classify point A and point B (i.e. find a way to separate them).&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2F3QvSpEX.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2F3QvSpEX.png" alt="linear equation"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
In the diagram, the line has equation&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3D0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3D0" alt="-2x_1+x_2-1=0"&gt;&lt;/a&gt;&lt;br&gt;
Visually, it's clear that the line separates the two points. Below is the mathematical explanation.&lt;/p&gt;

&lt;p&gt;From coordinate geometry, we know that any point "above" or "to the left" of this line (e.g. point A) will satisfy &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3E0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3E0" alt="-2x_1+x_2-1&amp;gt;0"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
and any point "below" or "to the right" of this line (e.g. point B) will satisfy &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3C0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3C0" alt="-2x_1+x_2-1&amp;lt;0"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;With that straight line, we have successfully separated point A and point B into 2 classes. &lt;br&gt;
But so far we only know which side of the line each point is on, not a class label. To turn that sign into a label, we need the &lt;strong&gt;activation function&lt;/strong&gt;.&lt;/p&gt;
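
&lt;p&gt;To make the sign check concrete, here is a small sketch that plugs both points into the left-hand side of the line equation. The coordinates for A and B are assumed to be the ones used in the exercise at the end of this post, and the helper name weighted_sum is made up for this illustration.&lt;/p&gt;

```python
def weighted_sum(point, w_0, w_1, w_2):
    """Return w_0 + w_1*x_1 + w_2*x_2 for a point (x_1, x_2)."""
    x_1, x_2 = point
    return w_0 + w_1 * x_1 + w_2 * x_2


# Assumed coordinates for A and B (taken from the exercise at the end of the post).
point_a = (-0.7, 2.7)
point_b = (1.5, 1.1)

# Line from the diagram: -1 - 2*x_1 + x_2 = 0, i.e. (w_0, w_1, w_2) = (-1, -2, 1).
value_a = weighted_sum(point_a, -1, -2, 1)
value_b = weighted_sum(point_b, -1, -2, 1)

print(value_a > 0)  # True: A is on the positive side of the line
print(0 > value_b)  # True: B is on the negative side of the line
```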
&lt;h3&gt;
  
  
  Importance of Weights
&lt;/h3&gt;

&lt;p&gt;It is important to note that with different weights, we might not be able to separate point A and point B. One such example is the line &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F0x_1%2Bx_2%3D0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F0x_1%2Bx_2%3D0" alt="0x_1+x_2=0"&gt;&lt;/a&gt;, which is a horizontal line passing through (0,0). Both points lie above that line, so it cannot separate them.&lt;/p&gt;
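
&lt;p&gt;A quick way to see the difference between the two lines is to check, for each set of weights, which points land on the positive side. This is a rough sketch: the coordinates for A and B are assumed to be the ones from the exercise at the end of this post, and the names side_value and candidate_weights are invented for the illustration.&lt;/p&gt;

```python
def side_value(point, w_0, w_1, w_2):
    """Return w_0 + w_1*x_1 + w_2*x_2; its sign tells which side of the line the point is on."""
    x_1, x_2 = point
    return w_0 + w_1 * x_1 + w_2 * x_2


# Assumed coordinates for A and B, taken from the exercise at the end of the post.
points = {"A": (-0.7, 2.7), "B": (1.5, 1.1)}

# (w_0, w_1, w_2) for the line in the diagram and for the horizontal line x_2 = 0.
candidate_weights = {"diagram line": (-1, -2, 1), "horizontal line": (0, 0, 1)}

for label, weights in candidate_weights.items():
    # True means the point is on the positive side of that line.
    positive = {name: side_value(p, *weights) > 0 for name, p in points.items()}
    print(label, positive)
```

&lt;p&gt;For the diagram line the two points end up on different sides, while for the horizontal line both end up on the positive side.&lt;/p&gt;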

&lt;p&gt;So the question to ask is: how do we find weights that will correctly classify the points we have? The answer is perceptron learning, which we will cover in the next post.&lt;/p&gt;
&lt;h2&gt;
  
  
  Activation function
&lt;/h2&gt;

&lt;p&gt;More often than not, we want the output to be in the range 0 to 1 only, to denote whether that perceptron is activated or not. &lt;br&gt;
So we need some function, called an activation function, to do that for us. &lt;br&gt;
One simple way to achieve that is to use the &lt;em&gt;Heaviside&lt;/em&gt; step function, which converts all negative numbers to 0, and all non-negative numbers (positive numbers and 0) to 1.&lt;/p&gt;

&lt;p&gt;Coming back to our example, point A satisfies &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3E0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flatex.codecogs.com%2Fsvg.latex%3F-1-2x_1%2Bx_2%3E0" alt="-2x_1+x_2-1&amp;gt;0"&gt;&lt;/a&gt;&lt;br&gt;
Hence putting that value into the Heaviside function outputs 1. Similarly, the value for point B is negative, so the Heaviside function outputs 0.&lt;/p&gt;

&lt;p&gt;Therefore, we have correctly classified points A and B mathematically.&lt;/p&gt;
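
&lt;p&gt;As a minimal sketch of this step, the snippet below applies a hand-written Heaviside function to the weighted sum for each point. The function names heaviside and classify are chosen for this illustration, and the coordinates for A and B are assumed to be the ones from the exercise at the end of this post.&lt;/p&gt;

```python
def heaviside(z: float) -> int:
    """Return 0 for negative z and 1 otherwise (0 itself maps to 1)."""
    return 1 if z >= 0 else 0


def classify(point, w_0, w_1, w_2) -> int:
    """Apply the Heaviside function to w_0 + w_1*x_1 + w_2*x_2."""
    x_1, x_2 = point
    return heaviside(w_0 + w_1 * x_1 + w_2 * x_2)


# Line from the diagram: (w_0, w_1, w_2) = (-1, -2, 1).
print(classify((-0.7, 2.7), -1, -2, 1))  # point A: prints 1
print(classify((1.5, 1.1), -1, -2, 1))   # point B: prints 0
```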
&lt;h1&gt;
  
  
  Sum Up
&lt;/h1&gt;

&lt;p&gt;The weights determine whether a straight line (or a plane in higher dimensions) can separate the points into classes. Only certain sets of weights will be able to separate the points.&lt;/p&gt;

&lt;p&gt;The activation function is just a function that maps all points fitting a certain criterion (here, lying on one side of the line) to the same output.&lt;/p&gt;
&lt;h1&gt;
  
  
  Exercise
&lt;/h1&gt;

&lt;p&gt;Write a function that takes a list of coordinate pairs, a list of classes, and the weights, and determines whether the given weights classify every point into its class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_correct_weights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w_0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w_2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="c1"&gt;# example from above
&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="c1"&gt;# classes[i] is class of coords[i]
&lt;/span&gt;&lt;span class="n"&gt;classes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;is_correct_weights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# True
&lt;/span&gt;&lt;span class="nf"&gt;is_correct_weights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# False
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;NOTE: you can do it in any language you want, but Python is recommended, since we will use it much more later on.  &lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
