<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Petro Liashchynskyi</title>
    <description>The latest articles on DEV Community by Petro Liashchynskyi (@liashchynskyi).</description>
    <link>https://dev.to/liashchynskyi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F40306%2F3b174e3f-3094-44d4-b9f1-00ed011c9d44.jpg</url>
      <title>DEV Community: Petro Liashchynskyi</title>
      <link>https://dev.to/liashchynskyi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/liashchynskyi"/>
    <language>en</language>
    <item>
      <title>NestJS CODEX: auxiliaries for CRUD using Mongoose with transactions support</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Sat, 04 Nov 2023 09:45:00 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/nestjs-codex-auxiliaries-for-crud-using-mongoose-with-transactions-support-6m1</link>
      <guid>https://dev.to/liashchynskyi/nestjs-codex-auxiliaries-for-crud-using-mongoose-with-transactions-support-6m1</guid>
      <description>&lt;p&gt;I've been working with Nest for quite a long. Setting up base stuff such as CRUD services, etc. is a bit of exhausting. So I've created a project that can help you with that. &lt;/p&gt;

&lt;p&gt;The repository provides a robust CRUD service using NestJS and Mongoose, designed to simplify the development of database interactions, with built-in transaction support via Async Local Storage.&lt;/p&gt;

&lt;h1&gt;
  
  
  Features
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;CRUD Operations: Simplify create, read, update, and delete operations using Mongoose.&lt;/li&gt;
&lt;li&gt;Transaction Management: Handle transactions smoothly and reliably in your services.&lt;/li&gt;
&lt;li&gt;Async Local Storage: Utilize Async Local Storage for context management throughout the life of a request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more information, please refer to &lt;a href="https://github.com/liashchynskyi/nestjs-codex/"&gt;https://github.com/liashchynskyi/nestjs-codex/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post was originally posted on my blog - &lt;a href="https://liashchynskyi.net/posts/nestjs-codex-auxiliaries-for-crud-using-mongoose-with-transactions-support"&gt;https://liashchynskyi.net/posts/nestjs-codex-auxiliaries-for-crud-using-mongoose-with-transactions-support&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nestjs</category>
      <category>webdev</category>
      <category>mongoose</category>
      <category>crud</category>
    </item>
    <item>
      <title>Sending posts from WordPress site to your Telegram channel</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Sat, 08 Feb 2020 11:20:45 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/sending-posts-from-wordpress-site-to-your-telegram-channel-aia</link>
      <guid>https://dev.to/liashchynskyi/sending-posts-from-wordpress-site-to-your-telegram-channel-aia</guid>
      <description>&lt;p&gt;Hi! I'm gonna show you how to send a message (a WordPress post) to a Telegram channel. All you need is a public Telegram channel and a bot. The latter can be created via &lt;strong&gt;BotFather&lt;/strong&gt; in Telegram.&lt;/p&gt;

&lt;p&gt;Steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a public channel and a Telegram BOT (via BotFather).&lt;/li&gt;
&lt;li&gt;Save the bot token.&lt;/li&gt;
&lt;li&gt;Add the bot as an administrator of the previously created channel.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As the final step, add the following code to your &lt;code&gt;functions.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;
&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;telegram_send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="nv"&gt;$new_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$old_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$post&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="nv"&gt;$new_status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'publish'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;  &lt;span class="nv"&gt;$old_status&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'publish'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;$post&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;post_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'post'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$apiToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"TOKEN"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s1"&gt;'chat_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'@channel_name'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'text'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Read more: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nf"&gt;get_permalink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$post&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="no"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;
   &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file_get_contents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="nv"&gt;$apiToken&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage?"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;http_build_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nf"&gt;add_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="s1"&gt;'transition_post_status'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'telegram_send_message'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Do remember to replace &lt;code&gt;TOKEN&lt;/code&gt; with your actual bot token and &lt;code&gt;channel_name&lt;/code&gt; with the name of the channel.&lt;/p&gt;
&lt;/blockquote&gt;
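If you want to verify the bot and channel before touching `functions.php`, the same `sendMessage` request the PHP snippet builds can be constructed from Node.js. The `TOKEN` and `@channel_name` values are placeholders, just as in the PHP code:

```javascript
// Build the same Telegram sendMessage request the PHP snippet issues.
// TOKEN and @channel_name are placeholders - substitute your own values.
const apiToken = 'TOKEN';
const params = new URLSearchParams({
  chat_id: '@channel_name',
  text: '\nRead more: https://example.com/my-post',
});
const url = `https://api.telegram.org/bot${apiToken}/sendMessage?${params}`;
console.log(url);

// Uncomment to actually send (Node 18+ ships a global fetch):
// fetch(url).then((r) => r.json()).then(console.log);
```

If the bot is set up correctly, the real request returns a JSON body with `"ok": true`.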

&lt;p&gt;Now all new posts will also be published in your Telegram channel.&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>php</category>
      <category>telegram</category>
      <category>web</category>
    </item>
    <item>
      <title>My first and not very successful experience with AWS</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Tue, 19 Nov 2019 18:42:06 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/my-first-and-not-very-successful-experience-with-aws-482l</link>
      <guid>https://dev.to/liashchynskyi/my-first-and-not-very-successful-experience-with-aws-482l</guid>
      <description>&lt;p&gt;Hello, today I'm gonna tell you a story about my experience with Amazon Web Services.&lt;/p&gt;

&lt;p&gt;I started using it not so long ago. What is AWS? Amazon provides an on-demand cloud computing platform: VM instances, databases, load balancers, and more. And it's so cool 😁&lt;/p&gt;

&lt;p&gt;If you create an account for the first time, you can use several AWS services "free" for 12 months. This is called the &lt;strong&gt;Free Tier&lt;/strong&gt;, and it restricts the use of certain services. For example, use of the &lt;em&gt;EC2&lt;/em&gt; (virtual machines) service is limited to 750 hours per month: you will not be charged for the first 750 hours of running your VM, and that's it. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unfortunately, there's a thing I didn't know about when using EC2.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Standard vs Unlimited
&lt;/h1&gt;

&lt;p&gt;If you create an EC2 instance of the &lt;em&gt;t2&lt;/em&gt; or &lt;em&gt;t3&lt;/em&gt; type, there is a thing enabled by default called the &lt;em&gt;unlimited feature&lt;/em&gt;. And it was bad in my case 💩&lt;/p&gt;

&lt;p&gt;Ok, let's figure it out. Every EC2 VM has its own bound on CPU utilization, called the &lt;em&gt;baseline&lt;/em&gt;. Imagine you are using a VM for heavy tasks and the CPU is loaded above the baseline, maybe even at 100%. In that case (if you have the &lt;em&gt;unlimited feature&lt;/em&gt; enabled), you'll pay for the extra CPU utilization. &lt;/p&gt;

&lt;p&gt;But... if you have the &lt;em&gt;standard feature&lt;/em&gt; enabled, your CPU utilization will not exceed the baseline. If the CPU load rises above the baseline, it will immediately be throttled back down - great, because there is no additional cost.&lt;/p&gt;

&lt;p&gt;Unfortunately, I didn't know that and paid almost 9 dollars. And that despite the fact that I had used only 552 of my 750 free hours, and my average monthly workload was.., attention, &lt;strong&gt;zero&lt;/strong&gt;. I don't know how costs are computed on AWS 😆 and I don't know why they charged me $9. &lt;strong&gt;But the money was refunded to me.&lt;/strong&gt; Thanks to Melanie from AWS tech support ♥️.&lt;/p&gt;

&lt;h1&gt;
  
  
  Thoughts
&lt;/h1&gt;

&lt;p&gt;When creating VMs in EC2, please disable the &lt;em&gt;unlimited feature&lt;/em&gt; if you don't really need it enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EC2 -&amp;gt; Instances -&amp;gt; Actions -&amp;gt; Instance Settings -&amp;gt; Change T2/T3 Unlimited -&amp;gt; Disable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Hope this helps some of you 😉 Stay awesome! Thanks!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazon</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Small, fast and simple Python CLI image converter for CNNs</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Tue, 30 Jul 2019 15:34:32 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/small-fast-and-simple-python-cli-image-converter-for-cnns-27co</link>
      <guid>https://dev.to/liashchynskyi/small-fast-and-simple-python-cli-image-converter-for-cnns-27co</guid>
      <description>&lt;p&gt;&lt;a href="https://camo.githubusercontent.com/993fb4553c12b46fa81bfe4ae4931b33b6c7e892/68747470733a2f2f692e696d6775722e636f6d2f4b49693433315a2e706e67" class="article-body-image-wrapper"&gt;&lt;img src="https://camo.githubusercontent.com/993fb4553c12b46fa81bfe4ae4931b33b6c7e892/68747470733a2f2f692e696d6775722e636f6d2f4b49693433315a2e706e67" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hello, people) I've been working on a CLI tool that helps with dataset augmentation and image conversion for CNNs, GANs, or anything else that needs images as input data. &lt;/p&gt;

&lt;p&gt;Here it is &lt;a href="https://github.com/liashchynskyi/rudi"&gt;https://github.com/liashchynskyi/rudi&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Support the repo with a star if you like it! Thanks, I hope this tool will help you 😀&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>python</category>
    </item>
    <item>
      <title>Intro to CUDA technology</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Wed, 19 Jun 2019 15:04:05 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/intro-to-cuda-technology-79g</link>
      <guid>https://dev.to/liashchynskyi/intro-to-cuda-technology-79g</guid>
      <description>&lt;p&gt;Hello again! Let's talk about CUDA and how it can help you speed up data processing. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;No code today! Only theory  😎 &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Imagine a world without CUDA: you would still be training your neural networks forever 🙁 So, what the heck is CUDA?&lt;/p&gt;




&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia (&lt;a href="https://en.wikipedia.org/wiki/CUDA"&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Before we begin, you should understand the following terms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;device&lt;/strong&gt; - the video card itself (the GPU) - runs commands received from the CPU&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;host&lt;/strong&gt; - the central processor (CPU) - runs certain tasks on the &lt;strong&gt;device&lt;/strong&gt;, allocates memory, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kernel&lt;/strong&gt; - a function (task) that will be run by the &lt;strong&gt;device&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CUDA allows you to implement algorithms using an extended syntax of the &lt;em&gt;C language&lt;/em&gt;. The CUDA architecture lets you control GPU instruction flow and manage the device's memory. All in your hands, bro! Be careful. &lt;/p&gt;

&lt;p&gt;Good news - this technology is supported by &lt;a href="https://en.wikipedia.org/wiki/CUDA#Language_bindings"&gt;several&lt;/a&gt; languages. Choose the best one 😉&lt;/p&gt;

&lt;h2&gt;
  
  
  Magic? No 😮
&lt;/h2&gt;

&lt;p&gt;Let's find out how code is launched on the GPU.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;host&lt;/strong&gt; allocates some memory on the &lt;strong&gt;device&lt;/strong&gt;;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;host&lt;/strong&gt; copies the data from its own memory to the &lt;strong&gt;device's&lt;/strong&gt; memory;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;host&lt;/strong&gt; launches the &lt;strong&gt;kernel&lt;/strong&gt; on the &lt;strong&gt;device&lt;/strong&gt;;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;device&lt;/strong&gt; executes that &lt;strong&gt;kernel&lt;/strong&gt;;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;host&lt;/strong&gt; copies the results from the &lt;strong&gt;device's&lt;/strong&gt; memory to its own memory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nYGRMyhZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://upload.wikimedia.org/wikipedia/commons/thumb/5/59/CUDA_processing_flow_%2528En%2529.PNG/300px-CUDA_processing_flow_%2528En%2529.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nYGRMyhZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://upload.wikimedia.org/wikipedia/commons/thumb/5/59/CUDA_processing_flow_%2528En%2529.PNG/300px-CUDA_processing_flow_%2528En%2529.PNG" alt="Processing flow on CUDA" width="300" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 1 (allocating memory) is not shown in the figure, but steps 1 and 2 can be thought of as merged.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Cb60VCtj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.imgur.com/IT0sgzh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Cb60VCtj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.imgur.com/IT0sgzh.png" alt="CUDA Runtime" width="450" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The CPU interacts with the GPU through the &lt;em&gt;CUDA Runtime API, CUDA Driver API, and CUDA Libraries&lt;/em&gt;. The main difference between the Runtime and Driver APIs is pretty simple: the level of abstraction. &lt;/p&gt;

&lt;p&gt;The Runtime API (&lt;em&gt;RAPI&lt;/em&gt;) is more abstract, i.e. more user-friendly. The Driver API (&lt;em&gt;DAPI&lt;/em&gt;) is a low-level API at the driver level. In general, RAPI is an abstract wrapper over DAPI. You can use either of them. From my experience, DAPI is more difficult to use because you have to think about low-level things, and that's not fun 😑.&lt;/p&gt;

&lt;p&gt;And you should understand another thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the time spent launching the kernel is greater than the time the kernel actually runs, you'll get &lt;strong&gt;zero efficiency&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Anyway, the point is that launching tasks and allocating memory on the GPU takes time, so &lt;strong&gt;you shouldn't run "easy" tasks on it&lt;/strong&gt;. Easy tasks can be performed by your CPU in milliseconds.&lt;/p&gt;

&lt;p&gt;Should you run a kernel on the GPU when the CPU can compute it more quickly? Actually, no... Why? Let's find out!&lt;/p&gt;

&lt;h2&gt;
  
  
  Hardware
&lt;/h2&gt;

&lt;p&gt;The GPU's architecture is built a bit differently from the CPU's. Since graphics processors were originally used only for graphical calculations involving independent parallel data processing, the GPU is designed for parallel computing from the ground up. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZxbT7GKp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/http://docs.nvidia.com/cuda/cuda-c-programming-guide/graphics/gpu-devotes-more-transistors-to-data-processing.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZxbT7GKp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/http://docs.nvidia.com/cuda/cuda-c-programming-guide/graphics/gpu-devotes-more-transistors-to-data-processing.png" alt="GPU Arch" width="431" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The GPU is designed to handle a large number of threads (elementary parallel processes).&lt;/p&gt;

&lt;p&gt;As you can see, the GPU consists of many &lt;a href="https://en.wikipedia.org/wiki/Arithmetic_logic_unit"&gt;ALUs&lt;/a&gt; merged into several groups with shared memory. This approach can boost performance, but it is sometimes hard to program in that style. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In order to achieve the best acceleration, you must think about the strategy of memory accessing and take into account the GPU features.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The GPU is oriented toward heavy tasks on large volumes of data and consists of a streaming processor array (SPA), which includes texture processor clusters (TPCs). A TPC consists of a set of streaming multiprocessors (SMs), each of which includes several streaming processors (SPs), or cores (a modern GPU can have more than 1024 cores).&lt;/p&gt;

&lt;p&gt;GPU cores work on the &lt;a href="https://en.wikipedia.org/wiki/SIMD"&gt;SIMD&lt;/a&gt; principle, but a bit differently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9zuJBrRc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.redd.it/s008j9ibbfpx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9zuJBrRc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.redd.it/s008j9ibbfpx.jpg" alt="mem" width="558" height="695"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SPs can work on different data, but they must execute the same command at the same moment in time: different threads execute the same instruction.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pXEU6S5e--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.imgur.com/aNYwTFf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pXEU6S5e--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.imgur.com/aNYwTFf.png" alt="SM" width="614" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As a result, the GPU has effectively become a device that implements the stream computing model: there are streams of input and output data, consisting of identical elements that can be processed independently of each other.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UaidCWE_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://i.imgur.com/DQ3A1iY.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UaidCWE_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://i.imgur.com/DQ3A1iY.gif" alt="Kernel" width="545" height="175"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Compute capabilities
&lt;/h2&gt;

&lt;p&gt;Every single GPU has its own productivity coefficient, or &lt;em&gt;compute capability&lt;/em&gt;: a quantitative characteristic of the speed of certain operations on the graphics processor. Nvidia calls this the &lt;strong&gt;Compute Capability Version&lt;/strong&gt;. A higher version is better than a lower one 😁&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Compute Capability Version&lt;/th&gt;
&lt;th&gt;GPU Chip&lt;/th&gt;
&lt;th&gt;Videocard&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;G80, G92, G92b, G94, G94b&lt;/td&gt;
&lt;td&gt;GeForce 8800GTX/Ultra, Tesla C/D/S870, FX4/5600, 360M, GT 420&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.1&lt;/td&gt;
&lt;td&gt;G86, G84, G98, G96, G96b, G94, G94b, G92, G92b&lt;/td&gt;
&lt;td&gt;GeForce 8400GS/GT, 8600GT/GTS, 8800GT/GTS, 9400GT, 9600 GSO, 9600GT, 9800GTX/GX2, 9800GT, GTS 250, GT 120/30/40, FX 4/570, 3/580, 17/18/3700, 4700x2, 1xxM, 32/370M, 3/5/770M, 16/17/27/28/36/37/3800M, NVS420/50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.2&lt;/td&gt;
&lt;td&gt;GT218, GT216, GT215&lt;/td&gt;
&lt;td&gt;GeForce 210, GT 220/40, FX380 LP, 1800M, 370/380M, NVS 2/3100M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.3&lt;/td&gt;
&lt;td&gt;GT200, GT200b&lt;/td&gt;
&lt;td&gt;GeForce GTX 260, GTX 275, GTX 280, GTX 285, GTX 295, Tesla C/M1060, S1070, Quadro CX, FX 3/4/5800&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.0&lt;/td&gt;
&lt;td&gt;GF100, GF110&lt;/td&gt;
&lt;td&gt;GeForce (GF100) GTX 465, GTX 470, GTX 480, Tesla C2050, C2070, S/M2050/70, Quadro Plex 7000, Quadro 4000, 5000, 6000, GeForce (GF110) GTX 560 TI 448, GTX570, GTX580, GTX590&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;........&lt;/td&gt;
&lt;td&gt;.........&lt;/td&gt;
&lt;td&gt;........&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;GM107, GM108&lt;/td&gt;
&lt;td&gt;GeForce GTX 750 Ti, GeForce GTX 750, GeForce GTX 860M, GeForce GTX 850M, GeForce 840M, GeForce 830M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;........&lt;/td&gt;
&lt;td&gt;.........&lt;/td&gt;
&lt;td&gt;........&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can find the whole list &lt;a href="https://en.wikipedia.org/wiki/CUDA#GPUs_supported"&gt;here&lt;/a&gt;. The Compute Capability Version describes a lot of parameters, such as the number of threads per block, the maximum number of threads and blocks, the warp size, and &lt;a href="https://en.wikipedia.org/wiki/CUDA#Version_features_and_specifications"&gt;more&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Threads, blocks and grids
&lt;/h2&gt;

&lt;p&gt;CUDA uses a lot of separate threads for computing. All of them are grouped in a hierarchy: &lt;strong&gt;&lt;em&gt;grid / block / thread&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3BmYysde--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://i.imgur.com/NpL4VNC.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3BmYysde--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://i.imgur.com/NpL4VNC.gif" width="524" height="526"&gt;&lt;/a&gt;Blocks struct&lt;/p&gt;

&lt;p&gt;The top layer - the &lt;em&gt;grid&lt;/em&gt; - corresponds to the kernel and unites all the threads executing that kernel. A grid is a 1D or 2D array of &lt;em&gt;blocks&lt;/em&gt;. Each block is a 1D, 2D, or 3D array of &lt;em&gt;threads&lt;/em&gt; and represents a completely independent set of coordinated threads. &lt;strong&gt;Threads from different blocks cannot interact with each other&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Above, I mentioned a difference from the SIMD architecture. There is also a concept called a &lt;strong&gt;warp&lt;/strong&gt;: a group of 32 threads (the exact size depends on the GPU architecture, but it is almost always 32). Only threads within the same warp can be physically executed at the same moment in time, and threads of different warps can be at different stages of program execution. This method of data processing is called &lt;strong&gt;SIMT&lt;/strong&gt; (Single Instruction - Multiple Threads). Warp management is carried out at the hardware level.&lt;/p&gt;

&lt;h2&gt;
  
  
  In some cases the GPU is slower than the CPU, but why?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Don't try to run easy tasks on your GPU&lt;/strong&gt;. I'm gonna explain why. Two definitions first:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Delay&lt;/em&gt; (latency) - the waiting time between requesting a particular resource and accessing it;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Bandwidth&lt;/em&gt; (throughput) - the number of operations performed per unit of time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, the main question is: why does a graphics processor sometimes stumble? Let's find out! &lt;/p&gt;

&lt;p&gt;We have two cars:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;passenger van - speed &lt;em&gt;120 km/h&lt;/em&gt;, capacity of 9 people;&lt;/li&gt;
&lt;li&gt;bus - speed &lt;em&gt;90 km/h&lt;/em&gt;, capacity of 30 people.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If an operation is moving one person over a certain distance - let it be 1 kilometer - then the delay (the time it takes one person to travel 1 km) for the first car is &lt;em&gt;3600/120 = 30s&lt;/em&gt;, and the bandwidth is &lt;em&gt;9/30 = 0.3&lt;/em&gt; people per second.&lt;br&gt;
For the bus, the delay is &lt;em&gt;3600/90 = 40s&lt;/em&gt;, and the bandwidth is &lt;em&gt;30/40 = 0.75&lt;/em&gt; people per second.&lt;/p&gt;

&lt;p&gt;Thus the CPU is the passenger van and the GPU is the bus: the GPU has a high delay, but also a large bandwidth. &lt;/p&gt;
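The post is code-free on the CUDA side, but the van/bus arithmetic itself can be sanity-checked in a few lines of JavaScript:

```javascript
// Reproduce the van/bus delay vs. bandwidth numbers from the analogy.
function stats(speedKmH, capacity) {
  const latencySec = 3600 / speedKmH;      // time for one person to travel 1 km
  const bandwidth = capacity / latencySec; // people delivered per second
  return { latencySec, bandwidth };
}

const van = stats(120, 9);  // CPU-like: low latency, low bandwidth
const bus = stats(90, 30);  // GPU-like: high latency, high bandwidth

console.log(van); // { latencySec: 30, bandwidth: 0.3 }
console.log(bus); // { latencySec: 40, bandwidth: 0.75 }
```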

&lt;blockquote&gt;
&lt;p&gt;If, for your task, the delay of each particular operation matters less than the number of operations per second, it is worth considering the use of the GPU.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Thoughts
&lt;/h2&gt;

&lt;p&gt;The distinctive features of the GPU (compared to the CPU) are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an architecture aimed at maximizing the speed of computing textures and complex graphic objects;&lt;/li&gt;
&lt;li&gt;peak throughput that is typically much higher than that of a CPU;&lt;/li&gt;
&lt;li&gt;a specialized pipelined architecture that makes the GPU much more effective at processing graphical information than the CPU.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my opinion, the main disadvantage is that this technology is supported only by Nvidia GPUs.&lt;/p&gt;

&lt;p&gt;The GPU will not always give you a speedup for a given algorithm. Therefore, before using the GPU for computing, think carefully about whether it is necessary in your case. You can use a graphics card for complex calculations - working with graphics or images, engineering calculations, etc. - but &lt;strong&gt;do not use the GPU for simple tasks&lt;/strong&gt; (of course you can, but the efficiency will be zero).&lt;/p&gt;

&lt;p&gt;See ya! And remember:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When using a GPU, it's much easier to slow a program down than to speed it up.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>beginners</category>
      <category>cuda</category>
      <category>machinelearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How neural network works? Let's figure it out</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Sat, 19 Jan 2019 13:29:12 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/how-neural-network-works-lets-figure-it-out-32o0</link>
      <guid>https://dev.to/liashchynskyi/how-neural-network-works-lets-figure-it-out-32o0</guid>
      <description>&lt;p&gt;Hey, what's up 😁 In my &lt;a href="https://dev.to/liashchynskyi/creating-of-neural-network-using-javascript-in-7minutes-o21"&gt;previous article&lt;/a&gt; I described how to build a neural network from scratch using only JavaScript. Today, at the request of several people, I'll try to explain the mathematical principles of neural networks. Bro, you'll finally understand what's under the hood of that monster!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;And first, I'm gonna tell you another secret: there's no magic, only math 😵&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article is based on my &lt;a href="https://dev.to/liashchynskyi/creating-of-neural-network-using-javascript-in-7minutes-o21"&gt;previous one&lt;/a&gt;. If you haven't read it yet, now is the time! I will use the same formulas and try to explain them. Let's go! &lt;/p&gt;

&lt;h1&gt;
  
  
  Preparation
&lt;/h1&gt;

&lt;p&gt;I'm gonna solve &lt;a href="https://en.wikipedia.org/wiki/Exclusive_or" rel="noopener noreferrer"&gt;XOR&lt;/a&gt; again 😅 It's not a joke, bro! Many data science books start by solving it 😎 Once more, let me remind you of the XOR truth table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Inputs&lt;/th&gt;
&lt;th&gt;Outputs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0 0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0 1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1 0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1 1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;To demonstrate, let's use the following neural network structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1w2zxmaskv8paawp0nl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1w2zxmaskv8paawp0nl.png" alt="nn structure" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we have &lt;strong&gt;2&lt;/strong&gt; neurons in the input layer, &lt;strong&gt;4&lt;/strong&gt; in the hidden layer, and &lt;strong&gt;1&lt;/strong&gt; in the output layer.&lt;/p&gt;

&lt;h1&gt;
  
  
  Weights initialization
&lt;/h1&gt;

&lt;p&gt;The main goal of neural network training is adjusting the weights to minimize the output error. In most cases, the weights are initialized randomly, and during training they are adjusted by the backpropagation algorithm.&lt;/p&gt;

&lt;p&gt;So, let's initialize the weights randomly from the &lt;code&gt;[0, 1]&lt;/code&gt; range.&lt;/p&gt;
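As a sketch, random initialization for the 2-4-1 network above can be written in a few lines of JavaScript. The matrix shapes follow the layer sizes from this article; note that `Math.random()` draws from [0, 1), which is close enough to the [0, 1] range used here:

```javascript
// Randomly initialize weights for the 2-4-1 network described above.
// Math.random() samples from [0, 1), approximating the article's [0, 1] range.
function randomMatrix(rows, cols) {
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => Math.random())
  );
}

const weightsInputHidden = randomMatrix(4, 2);  // 4 hidden neurons, 2 inputs each
const weightsHiddenOutput = randomMatrix(1, 4); // 1 output neuron, 4 hidden inputs

console.log(weightsInputHidden.length, weightsInputHidden[0].length); // 4 2
```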

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcw9sic6owyu01y2lj489.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcw9sic6owyu01y2lj489.png" alt="weights" width="281" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Graphically, it looks like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8plank30mfm39ozqo0wj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8plank30mfm39ozqo0wj.png" alt="weights-init" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Forward propagation
&lt;/h1&gt;

&lt;p&gt;Ok, let's compute the neuron inputs. To save time I'll use only one input case: &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt;, so the expected output is &lt;code&gt;1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The formula:&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tx24mzrzd2y5pom64xf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tx24mzrzd2y5pom64xf.png" alt="net" width="233" height="90"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, for the first neuron in the hidden layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;net1_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;

&lt;span class="cm"&gt;/**

i = 1..n, n = 2 (2 neurons in the input layer)

0: the value of the first input element
1: the value of the second input element

0.2: the weight from the first input neuron to the first hidden one
0.6: the weight from the second input neuron to the first hidden one

Understand, bro? 😏 

*/&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the second neuron and the rest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;net2_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="nx"&gt;net3_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;span class="nx"&gt;net4_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need one more thing - an activation function. I'll use the &lt;strong&gt;sigmoid&lt;/strong&gt;.&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa93ncbmrtlhxloayunjp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa93ncbmrtlhxloayunjp.png" alt="sigmoid" width="800" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The formula and derivative:&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnh5y83d5ixzjjtd49cr7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnh5y83d5ixzjjtd49cr7.png" alt="sigm" width="474" height="110"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzve5rj57fx53y4cku2z9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzve5rj57fx53y4cku2z9.png" alt="deriv" width="372" height="78"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
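&lt;p&gt;If you want to play with the numbers yourself, here is a minimal runnable version of the same two functions (same names, &lt;code&gt;f&lt;/code&gt; and &lt;code&gt;deriv&lt;/code&gt;, as in the snippet above):&lt;/p&gt;

```javascript
// Sigmoid activation and its derivative, as used throughout this article
const f = (x) => 1 / (1 + Math.exp(-x));
const deriv = (x) => f(x) * (1 - f(x));

console.log(f(0.6).toFixed(4));      // 0.6457 (the article rounds it to 0.64)
console.log(deriv(0.78).toFixed(4)); // 0.2155 (the article rounds it to 0.21)
```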



&lt;p&gt;Now we apply our activation to each computed &lt;strong&gt;net&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;output1_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;net1_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.64&lt;/span&gt;
&lt;span class="nx"&gt;output2_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;net2_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.66&lt;/span&gt;
&lt;span class="nx"&gt;output3_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;net3_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.71&lt;/span&gt;
&lt;span class="nx"&gt;output4_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;net4_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.57&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've got the output values for each neuron in the hidden layer. Graphically, it looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ilxl2iocjdz1fg156n8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ilxl2iocjdz1fg156n8.png" alt="w-hidden" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we have the outputs of the hidden layer neurons, we can calculate the output of the output layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;net_o&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.66&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.71&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.57&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.28&lt;/span&gt;
&lt;span class="nx"&gt;output_o&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;net_o&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.28&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.78&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here we go.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv8nbtraaulpnsugneny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv8nbtraaulpnsugneny.png" alt="out" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;
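&lt;p&gt;The whole forward pass above can be written as a couple of loops. Here is a minimal sketch using the example weights from the pictures (the names &lt;code&gt;w_ih&lt;/code&gt;, &lt;code&gt;w_ho&lt;/code&gt; and &lt;code&gt;forward&lt;/code&gt; are mine; it skips the intermediate rounding, so the final value comes out slightly different):&lt;/p&gt;

```javascript
// Forward pass of the 2-4-1 network, with the example weights from the pictures
const f = (x) => 1 / (1 + Math.exp(-x));

// w_ih[j] = weights from the two inputs into hidden neuron j
const w_ih = [[0.2, 0.6], [0.5, 0.7], [0.4, 0.9], [0.8, 0.3]];
// w_ho[j] = weight from hidden neuron j to the single output neuron
const w_ho = [0.6, 0.7, 0.3, 0.4];

function forward(input) {
  const hidden = w_ih.map((w) => f(w[0] * input[0] + w[1] * input[1]));
  const net_o = hidden.reduce((sum, h, j) => sum + h * w_ho[j], 0);
  return { hidden, output: f(net_o) };
}

console.log(forward([0, 1]).output.toFixed(3)); // 0.786 without rounding (0.78 in the text)
```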

&lt;h1&gt;
  
  
  Back propagation
&lt;/h1&gt;

&lt;p&gt;Bro, look at the output value. What do you see? &lt;code&gt;0.78&lt;/code&gt;, right? If you remember the XOR table, we should have got &lt;code&gt;1&lt;/code&gt; for the &lt;code&gt;0 1&lt;/code&gt; case, but we've got &lt;code&gt;0.78&lt;/code&gt;. That difference is called the error. Let's calculate it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Output error and delta
&lt;/h3&gt;

&lt;p&gt;The formula:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnuqdieb9i5ugdch0b33s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnuqdieb9i5ugdch0b33s.png" alt="error" width="278" height="60"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;target&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;output_o&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mf"&gt;0.78&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.22&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need to calculate the &lt;strong&gt;delta error&lt;/strong&gt; - the value by which we'll adjust the weights.&lt;/p&gt;

&lt;p&gt;The formula:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zmwy949xofjw2tr9zki.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zmwy949xofjw2tr9zki.png" alt="delta" width="336" height="76"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can use &lt;a href="https://keisan.casio.com/exec/system/15157249643425" rel="noopener noreferrer"&gt;this&lt;/a&gt; site for sigmoid derivative calculation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output_o&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.22&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.21&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.22&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
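&lt;p&gt;The same two steps in runnable form (note that, following the walkthrough's convention, &lt;code&gt;deriv&lt;/code&gt; is applied to the already-activated output):&lt;/p&gt;

```javascript
// Output error and delta for the walkthrough's numbers
const f = (x) => 1 / (1 + Math.exp(-x));
const deriv = (x) => f(x) * (1 - f(x));

const target = 1;
const output_o = 0.78;                       // forward-pass output from above
const error = target - output_o;             // 0.22
const delta_error = deriv(output_o) * error; // ≈ 0.047 (the article truncates to 0.04)
console.log(error.toFixed(2), delta_error.toFixed(3));
```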



&lt;h3&gt;
  
  
  Hidden error and delta
&lt;/h3&gt;

&lt;p&gt;Let's do the same for each neuron in the hidden layer. The formula is a little bit different.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslleflud66s59x6amoz4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslleflud66s59x6amoz4.png" alt="error-hidden" width="428" height="75"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We need to calculate the error for each neuron. Remember it, bro. Let's get started!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;error1_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.024&lt;/span&gt;
&lt;span class="nx"&gt;error2_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.028&lt;/span&gt;
&lt;span class="nx"&gt;error3_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.012&lt;/span&gt;
&lt;span class="nx"&gt;error4_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.016&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And again the delta!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3lj622n27fpaa1x3dfa1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3lj622n27fpaa1x3dfa1.png" alt="delta" width="398" height="80"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;delta_error1_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output1_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;error1_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.024&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.22&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.024&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.005&lt;/span&gt;
&lt;span class="nx"&gt;delta_error2_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output2_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;error2_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.66&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.028&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.224&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.028&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.006&lt;/span&gt;
&lt;span class="nx"&gt;delta_error3_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output3_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;error3_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.71&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.012&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.220&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.012&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt;
&lt;span class="nx"&gt;delta_error4_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output4_h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;error4_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deriv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.57&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.016&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.23&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.016&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.003&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
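&lt;p&gt;The four error/delta pairs above are really just one loop over the hidden neurons. A minimal sketch with the walkthrough's rounded values (the article truncates the last two deltas to 0.002 and 0.003; straight rounding gives 0.003 and 0.004):&lt;/p&gt;

```javascript
// Hidden-layer errors and deltas, computed in one loop
const f = (x) => 1 / (1 + Math.exp(-x));
const deriv = (x) => f(x) * (1 - f(x));

const delta_error = 0.04;                   // output delta, rounded as in the text
const w_ho = [0.6, 0.7, 0.3, 0.4];          // hidden-to-output weights
const outputs_h = [0.64, 0.66, 0.71, 0.57]; // hidden activations

const errors_h = w_ho.map((w) => delta_error * w);
const deltas_h = errors_h.map((e, j) => deriv(outputs_h[j]) * e);

console.log(errors_h.map((e) => e.toFixed(3))); // ['0.024', '0.028', '0.012', '0.016']
console.log(deltas_h.map((d) => d.toFixed(3))); // ['0.005', '0.006', '0.003', '0.004']
```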



&lt;h3&gt;
  
  
  The time has come! 😎
&lt;/h3&gt;

&lt;p&gt;Now we have all the variables we need to update the weights. The formulas look like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dq2oyk0sfovuv4p6ms5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dq2oyk0sfovuv4p6ms5.png" alt="wetights" width="669" height="127"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's start with the &lt;em&gt;hidden&lt;/em&gt;-to-&lt;em&gt;output&lt;/em&gt; weights.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;

&lt;span class="nx"&gt;hidden_to_output_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;output1_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6000256&lt;/span&gt;
&lt;span class="nx"&gt;hidden_to_output_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;output2_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.66&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7000264&lt;/span&gt;
&lt;span class="nx"&gt;hidden_to_output_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;output3_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.71&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3000284&lt;/span&gt;
&lt;span class="nx"&gt;hidden_to_output_4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;output4_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.57&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.4000228&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The new values are very close to the old weights, because we chose a very small learning rate. It's an important hyperparameter: choose it too small and your network will be training for years 😄 Choose it too large and your network will train faster, but its accuracy on new data may suffer. So you have to pick it carefully. A typical range to start with is between &lt;code&gt;2e-5&lt;/code&gt; and &lt;code&gt;1e-3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Ok, let's do the same for the &lt;em&gt;input&lt;/em&gt;-to-&lt;em&gt;hidden&lt;/em&gt; synapses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;//for the first hidden neuron&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error1_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.005&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error1_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.005&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.600005&lt;/span&gt;

&lt;span class="c1"&gt;//for the second one&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error2_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.006&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error2_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.006&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.700006&lt;/span&gt;

&lt;span class="c1"&gt;//for the third one&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error3_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error3_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.900002&lt;/span&gt;

&lt;span class="c1"&gt;//for the fourth one&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error4_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.003&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
&lt;span class="nx"&gt;input_to_hidden_8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;input_1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;delta_error4_h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;learning_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.003&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.300003&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it! Finally 😉&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;Oh, we finally got through all the math! But we only did it for one training set - &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt;. For the problem we're solving (XOR) there are &lt;strong&gt;4&lt;/strong&gt; training sets (see the table above). That means you have to repeat the calculations we just did for each training set! Brrr, that's terrible 😑 Too much math 😆&lt;/p&gt;

&lt;p&gt;So, in machine learning, one forward propagation step (from the input layer to the output) plus one backward step (from the output layer to the input) for a single training set is called an &lt;strong&gt;iteration&lt;/strong&gt;. Another important term is &lt;strong&gt;epoch&lt;/strong&gt;: the epoch counter increases every time all of the training sets have passed through the network. In our case we have 4 training sets, so &lt;strong&gt;4&lt;/strong&gt; iterations equal &lt;strong&gt;1&lt;/strong&gt; epoch. Understand, bro? 🤗 In general, more epochs usually means higher accuracy, and fewer epochs means lower accuracy.&lt;/p&gt;
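&lt;p&gt;The iteration/epoch bookkeeping can be sketched in plain JavaScript (the &lt;code&gt;trainOneSet&lt;/code&gt; function here is a hypothetical stand-in for the forward + backward pass we did above):&lt;/p&gt;

```javascript
// One iteration = one forward + one backward pass for a single training set.
// One epoch    = one iteration for every training set.
const trainingSets = [
  { input: [0, 0], target: 0 },
  { input: [0, 1], target: 1 },
  { input: [1, 0], target: 1 },
  { input: [1, 1], target: 0 },
];

let iterations = 0;
let epochs = 0;

function trainOneSet(set) {
  // hypothetical placeholder for one forward + backward pass
  iterations += 1;
}

for (let epoch = 0; epoch < 3; epoch++) {
  for (const set of trainingSets) trainOneSet(set); // 4 iterations...
  epochs += 1;                                      // ...make 1 epoch
}
// After running: 3 epochs, 12 iterations in total.
```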

&lt;p&gt;That's it. No magic, only math. Hope you've understood it, bro 😊 See ya! Happy coding 😇&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>neuralnetworks</category>
      <category>math</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Creating of neural network using JavaScript in 7 minutes!</title>
      <dc:creator>Petro Liashchynskyi</dc:creator>
      <pubDate>Sat, 12 Jan 2019 12:58:53 +0000</pubDate>
      <link>https://dev.to/liashchynskyi/creating-of-neural-network-using-javascript-in-7minutes-o21</link>
      <guid>https://dev.to/liashchynskyi/creating-of-neural-network-using-javascript-in-7minutes-o21</guid>
      <description>&lt;p&gt;Hey, what's up 😁 Today, i'm gonna tell you how to build a simple neural network with JavaScript by your own with no &lt;em&gt;AI frameworks&lt;/em&gt;. Let's go!&lt;/p&gt;

&lt;p&gt;For a good understanding, you need to know these things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OOP, JS, ES6;&lt;/li&gt;
&lt;li&gt;basic math;&lt;/li&gt;
&lt;li&gt;basic linear algebra.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Simple theory
&lt;/h1&gt;

&lt;p&gt;A neural network is a collection of &lt;strong&gt;neurons&lt;/strong&gt; connected by &lt;strong&gt;synapses&lt;/strong&gt;. A neuron can be represented as a function that receives some input values and produces an output as a result.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6hs3hvt66ieyu16uuzz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6hs3hvt66ieyu16uuzz.png" alt="Simple neuron" width="326" height="81"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every single synapse has its own &lt;strong&gt;weight&lt;/strong&gt;. So, the main elements of a neural net are neurons connected into layers in a specific way.&lt;/p&gt;

&lt;p&gt;Every neural net has an input layer, at least one hidden layer, and an output layer. When each neuron in a layer is connected to all neurons in the next layer, the network is called a multilayer perceptron (MLP). If a neural net has more than one hidden layer, it's called a Deep Neural Network (DNN).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vqs5g2efgy3lotjhf6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vqs5g2efgy3lotjhf6x.png" alt="DNN" width="597" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The picture represents a DNN of type &lt;strong&gt;6–4–3–1&lt;/strong&gt;, meaning 6 neurons in the input layer, 4 in the first hidden layer, 3 in the second one, and 1 in the output layer.&lt;/p&gt;




&lt;h1&gt;
  
  
  Forward propagation
&lt;/h1&gt;

&lt;p&gt;A neuron can have one or more inputs, which can themselves be the outputs of other neurons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0eulnpfk3kngyg4ujyn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0eulnpfk3kngyg4ujyn.png" alt="Synapses" width="360" height="208"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;X1 and X2 - input data;&lt;/li&gt;
&lt;li&gt;w1, w2 - weights;&lt;/li&gt;
&lt;li&gt;f(x1, x2) - activation function;&lt;/li&gt;
&lt;li&gt;Y - output value.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, we can describe all the stuff above with a mathematical formula:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p4kytvg5glf4h5ws89w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3p4kytvg5glf4h5ws89w.png" alt="Neuron input" width="233" height="90"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The formula describes the neuron input value. In this formula: &lt;strong&gt;n - number of inputs, x - input value, w - weight, b - bias&lt;/strong&gt; (we won't use the bias yet; the only thing you should know for now is that it always equals 1).&lt;/p&gt;

&lt;p&gt;As you can see, we need to multiply each input value by its weight and sum up the products. The resulting sum &lt;strong&gt;&lt;em&gt;net&lt;/em&gt;&lt;/strong&gt; is then passed through the activation function. &lt;strong&gt;&lt;em&gt;The same operation needs to be applied to each neuron in our neural net.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
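&lt;p&gt;As a quick sketch, the weighted sum plus activation for a single neuron might look like this in plain JavaScript (the names are illustrative, not from the article's code):&lt;/p&gt;

```javascript
// Sigmoid activation: squashes any real number into (0, 1).
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// net = sum over inputs of x_i * w_i (bias omitted, as in the formula above);
// the neuron's output is the activation applied to net.
function neuronOutput(inputs, weights) {
  const net = inputs.reduce((sum, x, i) => sum + x * weights[i], 0);
  return sigmoid(net);
}

neuronOutput([1, 0], [0.5, 0.9]); // sigmoid(0.5) ≈ 0.622
```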

&lt;p&gt;Finally, you know what forward propagation is.&lt;/p&gt;




&lt;h1&gt;
  
  
  Backward propagation (or backpropagation or just backprop)
&lt;/h1&gt;

&lt;p&gt;Backprop is a powerful algorithm, first introduced in 1970. &lt;a href="http://neuralnetworksanddeeplearning.com/chap2.html" rel="noopener noreferrer"&gt;Read more about how it works.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Backprop consists of several steps you need to apply to each neuron in your neural net.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First of all, you need to calculate the error of the output layer of the neural net.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8aohg5yke1bupikpdenb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8aohg5yke1bupikpdenb.png" alt="error formula" width="278" height="60"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;target&lt;/strong&gt;  -  the true value, &lt;strong&gt;output&lt;/strong&gt;  -  the actual output of the neural net.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The second step is calculating the &lt;strong&gt;&lt;em&gt;delta error value.&lt;/em&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqauirk5r7r8sn4w9nao.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqauirk5r7r8sn4w9nao.png" alt="delta error" width="336" height="76"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;f'&lt;/strong&gt;  -  derivative of activation function.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next, calculate the error of the hidden layer neurons.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfmy02i6edqqicn0xvts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfmy02i6edqqicn0xvts.png" alt="hidden neuron error" width="428" height="75"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;synapse&lt;/strong&gt;  -  the weight of the connection between a hidden neuron and an output neuron.&lt;/p&gt;

&lt;p&gt;Then we calculate the &lt;strong&gt;&lt;em&gt;delta&lt;/em&gt;&lt;/strong&gt; again, but now for the hidden layer neurons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxln9rubgzgm0zaxclgvg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxln9rubgzgm0zaxclgvg.png" alt="hidden delta" width="398" height="80"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;output&lt;/strong&gt;  -  output value of a neuron in a hidden layer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It's time to update the weights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosk16bbrecjzxxw991fn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosk16bbrecjzxxw991fn.png" alt="weights update" width="669" height="127"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;lrate&lt;/strong&gt;  -  learning rate.&lt;/p&gt;
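&lt;p&gt;Using small made-up numbers (not the ones from the images above), the four backprop steps for one output neuron and one hidden neuron could be sketched like this:&lt;/p&gt;

```javascript
// For sigmoid, f'(net) can be written via the neuron's output: out * (1 - out).
const dsigmoid = (out) => out * (1 - out);

const lrate = 0.5;

// Step 1: error of the output layer.
const target = 1;
const output = 0.6;
const error = target - output; // 0.4

// Step 2: delta of the output neuron.
const deltaOut = error * dsigmoid(output); // 0.4 * 0.24 = 0.096

// Step 3: error, then delta, of a hidden neuron connected by `synapse`.
const synapse = 0.4;
const hiddenOutput = 0.55;
const hiddenError = deltaOut * synapse; // 0.0384
const deltaHidden = hiddenError * dsigmoid(hiddenOutput);

// Step 4: update the weight; the "input" to this synapse is hiddenOutput.
const newSynapse = synapse + hiddenOutput * deltaOut * lrate; // 0.4264
```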

&lt;p&gt;Buddies, we just used the simplest backprop algorithm and gradient descent 😯. If you wanna dive deeper, watch this video.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/Ilg3gGewQ5U"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;And that's it. We’re done with all math. Just code it!!!&lt;/p&gt;




&lt;h1&gt;
  
  
  Practice
&lt;/h1&gt;

&lt;p&gt;So, we’ll create an MLP to solve the XOR problem (really, man? 😯).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;From the simplest things to the hardest, bro. All in good time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Input and output for XOR:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37mtbfpsbetmm8esv7pm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37mtbfpsbetmm8esv7pm.png" alt="XOR" width="268" height="208"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’ll use the &lt;strong&gt;Node.js&lt;/strong&gt; platform and the &lt;strong&gt;math.js&lt;/strong&gt; library (which is similar to &lt;strong&gt;numpy&lt;/strong&gt; in &lt;strong&gt;Python&lt;/strong&gt;). Run these commands in your terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;mlp &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;mlp 
npm init 
npm &lt;span class="nb"&gt;install &lt;/span&gt;babel-cli babel-preset-env mathjs

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
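&lt;p&gt;One detail not shown above: for &lt;code&gt;babel-cli&lt;/code&gt; with &lt;code&gt;babel-preset-env&lt;/code&gt; to compile the ES6 code, you’d typically also add a &lt;code&gt;.babelrc&lt;/code&gt; file in the project root:&lt;/p&gt;

```json
{
  "presets": ["env"]
}
```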


&lt;p&gt;Let’s create a file called &lt;code&gt;activations.js&lt;/code&gt; that will contain our activation function definitions. In our example we’ll use the classical sigmoid function (oldschool, bro).&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
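&lt;p&gt;The embedded gist may not show up in this feed. A minimal sketch of what &lt;code&gt;activations.js&lt;/code&gt; could contain - the sigmoid plus its derivative, which backprop will need:&lt;/p&gt;

```javascript
// activations.js - sigmoid and its derivative (a sketch, not the actual gist).
// Note: dsigmoid takes the neuron's OUTPUT rather than its net input,
// because for sigmoid f'(net) = out * (1 - out).
const sigmoid = (x) => 1 / (1 + Math.exp(-x));
const dsigmoid = (out) => out * (1 - out);

module.exports = { sigmoid, dsigmoid };
```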



&lt;p&gt;Then let’s create an &lt;code&gt;nn.js&lt;/code&gt; file that contains the &lt;code&gt;NeuralNetwork&lt;/code&gt; class implementation.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
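&lt;p&gt;Again, the gist isn’t visible in the feed. Since the original code isn’t shown here, below is a hypothetical plain-JS skeleton of such a class (the real implementation uses math.js matrices, so treat this only as an illustration of the shape):&lt;/p&gt;

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

class NeuralNetwork {
  constructor(inputSize, hiddenSize, outputSize) {
    // Random weight matrices; each row holds the weights of one neuron
    // in the next layer.
    const rand = (rows, cols) =>
      Array.from({ length: rows }, () =>
        Array.from({ length: cols }, () => Math.random() * 2 - 1));
    this.wih = rand(hiddenSize, inputSize);  // input  -> hidden weights
    this.who = rand(outputSize, hiddenSize); // hidden -> output weights
  }

  // Forward pass through one layer: out_j = sigmoid(sum_i w_ji * x_i).
  layerOutput(weights, inputs) {
    return weights.map((row) =>
      sigmoid(row.reduce((sum, w, i) => sum + w * inputs[i], 0)));
  }
}

const nn = new NeuralNetwork(2, 2, 1);
const hidden = nn.layerOutput(nn.wih, [0, 1]);
const out = nn.layerOutput(nn.who, hidden); // one value in (0, 1)
```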


&lt;p&gt;It seems that something is missing… oh, exactly! We need to make our network &lt;code&gt;trainable&lt;/code&gt;.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
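&lt;p&gt;In case this gist doesn’t render either, here is a self-contained sketch of what a &lt;code&gt;train&lt;/code&gt; method could look like for a single-hidden-layer net, following the four backprop steps from the theory section (a plain-JS approximation, not the author’s math.js code):&lt;/p&gt;

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));
const dsigmoid = (out) => out * (1 - out);

class NeuralNetwork {
  constructor(inputSize, hiddenSize, outputSize, lrate = 0.5) {
    const rand = (rows, cols) =>
      Array.from({ length: rows }, () =>
        Array.from({ length: cols }, () => Math.random() * 2 - 1));
    this.wih = rand(hiddenSize, inputSize);
    this.who = rand(outputSize, hiddenSize);
    this.lrate = lrate;
  }

  layerOutput(weights, inputs) {
    return weights.map((row) =>
      sigmoid(row.reduce((s, w, i) => s + w * inputs[i], 0)));
  }

  // One iteration: forward pass, then the four backprop steps.
  train(inputs, targets) {
    const hidden = this.layerOutput(this.wih, inputs);
    const outputs = this.layerOutput(this.who, hidden);

    // Steps 1-2: output error and delta.
    const deltaOut = outputs.map((o, j) => (targets[j] - o) * dsigmoid(o));

    // Step 3: hidden error (flows back through the synapses), then delta.
    const deltaHidden = hidden.map((h, i) => {
      const err = deltaOut.reduce((s, d, j) => s + d * this.who[j][i], 0);
      return err * dsigmoid(h);
    });

    // Step 4: w_new = w_old + input * delta * lrate.
    this.who.forEach((row, j) =>
      row.forEach((w, i) => (row[i] = w + hidden[i] * deltaOut[j] * this.lrate)));
    this.wih.forEach((row, i) =>
      row.forEach((w, k) => (row[k] = w + inputs[k] * deltaHidden[i] * this.lrate)));
  }
}
```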


&lt;p&gt;And just add a &lt;code&gt;predict&lt;/code&gt; method for producing results.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
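&lt;p&gt;If the &lt;code&gt;predict&lt;/code&gt; gist doesn’t render: conceptually it is just forward propagation, input to hidden to output (sketch with illustrative names):&lt;/p&gt;

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));
const layerOutput = (weights, inputs) =>
  weights.map((row) => sigmoid(row.reduce((s, w, i) => s + w * inputs[i], 0)));

// predict = forward propagation: input -> hidden -> output.
function predict(wih, who, input) {
  return layerOutput(who, layerOutput(wih, input));
}

predict([[0.5, 0.9], [0.3, 0.7]], [[0.2, 0.8]], [1, 0]); // one value in (0, 1)
```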


&lt;p&gt;Finally, let’s create an &lt;code&gt;index.js&lt;/code&gt; file where everything we created above will be joined together.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
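&lt;p&gt;A sketch of how &lt;code&gt;index.js&lt;/code&gt; might wire things together (the &lt;code&gt;NeuralNetwork&lt;/code&gt; import is commented out because the class and file names are assumptions; the stub only shows the shape of the training loop):&lt;/p&gt;

```javascript
// index.js - hypothetical wiring; class/file names are assumptions.
// const { NeuralNetwork } = require('./nn');

// XOR training data from the table above.
const data = [
  { input: [0, 0], target: [0] },
  { input: [0, 1], target: [1] },
  { input: [1, 0], target: [1] },
  { input: [1, 1], target: [0] },
];

// Stub with the same interface, so the loop below actually runs.
const nn = { train(input, target) {}, predict(input) { return [0.5]; } };

// Many epochs; each epoch is one iteration per training set.
for (let epoch = 0; epoch < 1000; epoch++) {
  for (const { input, target } of data) nn.train(input, target);
}
data.forEach(({ input }) => console.log(input, nn.predict(input)));
```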


&lt;p&gt;Predictions from our neural net:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kn25jdyntnkuonwlnp0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kn25jdyntnkuonwlnp0.png" alt="Predictions" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;As you can see, the error of the network approaches zero with each epoch. But you know what? I’ll tell you a secret: it won’t actually reach zero, bro. Getting there would take a very long time. It won’t happen. Never.&lt;/p&gt;

&lt;p&gt;Finally, we see results that are very close to the expected output. The simplest neural net, but it works!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source code is available on my &lt;a href="https://github.com/liashchynskyi/skynet" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://codeguida.com/post/1418" rel="noopener noreferrer"&gt;Original article&lt;/a&gt; posted by me in my native language.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>neuralnetworks</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
