In this post I will explain what neural networks are and how they work, and finish with the steps to build a simple neural network.
🤔 What is a neural network?
A neural network is, in its most basic form, a set of procedures that receives a specific kind of input and produces a response to it.
Let's understand it better with a visual example:
Neural networks loosely emulate the basics of how human brains work, and how information is passed between neurons, in order to take on more human-like tasks such as pattern recognition.
The most remarkable feature of neural networks, compared to other code or scripts, is that they become good at their task over time, through training. They are far more flexible in what they can do, but that flexibility brings its own limitations.
❓ But what does it look like on the inside?
First, we need to know the components of a neural network and what they look like:
- Neurons (Nodes): These are the basic units that process information. Each neuron receives one or more inputs, processes them and produces an output value (usually from 0 to 1).
- Layers: A group of one or more nodes.
  - Input layer: Where the input is introduced into the network.
  - Hidden layer(s): Where most of the processing and computing takes place.
  - Output layer: Where the final result of the calculation is shown.
- Weights: Values that adjust the importance of every input for a neuron. They can be visualized as the value of the path between two nodes (they usually go from -1 to 1).
- Bias: An additional value added to the weighted input of every neuron.
Now let's take a closer look at a neuron's parts and how it behaves.
- Inputs:
  - If the neuron is in the input layer, this value will be a number from 0 to 1 given by the user or by another script.
  - Otherwise, it is the aggregation of the values of every previous neuron connected to the current one, each multiplied by the weight of its connection, plus the bias.
- Activation function: Decides how strongly a neuron should be activated based on its inputs.
- Value Algorithm: Takes the input value * weight pairs of every incoming connection, adds the bias, and converts the result into a decimal value ranging from 0 to 1.
Note: I did not use a real algorithm for this example
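As a minimal sketch of the neuron described above, here is one way it could be written in Python. It assumes the sigmoid function as the activation, which is only one common choice for getting a 0 to 1 value and is not necessarily what the example above used. The names `sigmoid` and `neuron_output` and the sample numbers are made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs: list[float], weights: list[float], bias: float) -> float:
    """Sum every input * weight pair, add the bias, then apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# Hypothetical values, chosen only for illustration.
value = neuron_output(inputs=[0.5, 0.9], weights=[0.4, -0.2], bias=0.1)
print(round(value, 3))
```

With all inputs at 0 and a bias of 0 the weighted sum is 0, so this neuron outputs exactly 0.5, the midpoint of the sigmoid.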
🧠 How does it learn?
Now that we know what a neural network looks like on the inside, let's see how it learns and improves over time.
In order for the AI to evolve there are two main points of focus, each with some parameters involved:
- Epoch or generation: Every time a full loop of training has taken place. Each epoch is one step of the evolution.
  - Number of Epochs: An AI usually trains for a specified number of epochs, although it can be set to keep training until manually stopped.
  - Population: The number of instances of the AI that will be trained in every epoch. An epoch ends when the whole population has finished its training.
  - Entropy: The percentage of randomness that will be added to the collected data for the next generation.
  - Survivability: Sets how much of the population will "survive" into the next generation.
- Score, Fitness or Feed: For the AI to know whether it is doing well or badly, it is given a score. How the score is calculated depends on the kind of training.
  - Fail state: A fail state can be configured, which awards the minimum score and so "punishes" the AI. This allows for finer tuning of the AI, or for forcing it into certain behaviours. For example, if a walking AI falls over, a fail state can be given.
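To see how these parameters fit together, here is a rough sketch of one way to build each new generation. It is an illustration under assumptions rather than a reference implementation: entropy is read as the largest random change applied to a weight, survivability as the fraction of the population kept as parents, and the `score` function with its target values is a toy invented for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    epochs: int = 50             # number of generations to run
    population: int = 20         # instances evaluated in every epoch
    entropy: float = 0.1         # assumed: max random change per weight
    survivability: float = 0.25  # assumed: fraction kept as parents


FAIL_SCORE = 0.0  # minimum score, awarded when a fail state is reached


def score(weights):
    """Toy fitness: the closer to the hypothetical target, the better."""
    target = [0.5, -0.3]
    if any(abs(w) > 2 for w in weights):  # fail state: a weight ran away
        return FAIL_SCORE
    error = sum((w - t) ** 2 for w, t in zip(weights, target))
    return 1.0 / (1.0 + error)


def next_generation(scored, config, rng):
    """Build the next population from a list of (score, weights) pairs."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    keep = max(1, int(len(ranked) * config.survivability))
    survivors = [weights for _, weights in ranked[:keep]]
    children = [list(w) for w in survivors]  # survivors carry over unchanged
    while len(children) < config.population:
        parent = rng.choice(survivors)
        # Mutation: nudge every weight by a random amount bounded by entropy.
        children.append(
            [w + rng.uniform(-config.entropy, config.entropy) for w in parent]
        )
    return children


rng = random.Random(0)  # fixed seed so runs are repeatable
config = TrainingConfig()
population = [
    [rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(config.population)
]
first_best = max(score(w) for w in population)
for _ in range(config.epochs):
    scored = [(score(w), w) for w in population]
    population = next_generation(scored, config, rng)
last_best = max(score(w) for w in population)
print(first_best, last_best)
```

Because the survivors are copied into the next generation unchanged, the best score in this sketch can never get worse from one epoch to the next.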
✨ The Magic of Evolution ✨
Now that we have all the important elements of a neural network, we can finally get into the nuts and bolts of how it evolves over time. Let's go through it step by step:
(Disclaimer: there are other ways to train a neural network; this is just a commonly used outline.)
- Initialization: The neural network is started with some weights and biases (usually randomized).
- Forward Propagation:
  - The input data is fed into the input layer.
  - In each neuron of the hidden layer, every input is multiplied by its weight, and the products are summed together with the neuron's bias.
  - The resulting sum is passed through the activation function to produce an output.
  - This process is repeated for each hidden layer until the output layer is reached, where the final result is obtained.
- Error calculation: The output is compared with the expected result using an error (loss) function.
- Backpropagation:
  - In the population-based training described in this post, the data from the best performers of the current generation (accounting for survivability) is gathered.
  - In gradient-based training, which is what the term backpropagation strictly refers to, the gradient of the error with respect to the weights and biases is calculated instead.
  - Either way, the weights and biases are adjusted for every element of the next epoch to minimize the error (taking entropy into account in the population-based case).
- Repetition: The forward and backward propagation are repeated for many iterations (epochs) until the last epoch is reached or the process is stopped.
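The first three steps (initialization, forward propagation and error calculation) can be sketched in a few lines of Python. This is a simplified illustration, assuming fully connected layers, a sigmoid activation and mean squared error as the error measure; none of these choices come from a specific library, and the function names are invented for the example. The weight-update step is left out.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(layer_sizes, rng):
    """Initialization: random weights and biases in [-1, 1] for each layer."""
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        network.append({
            # One row of weights per neuron, one weight per incoming value.
            "weights": [[rng.uniform(-1, 1) for _ in range(n_in)]
                        for _ in range(n_out)],
            "biases": [rng.uniform(-1, 1) for _ in range(n_out)],
        })
    return network


def forward(network, inputs):
    """Forward propagation: feed the values through every layer in turn."""
    values = inputs
    for layer in network:
        values = [
            sigmoid(sum(v * w for v, w in zip(values, neuron_weights)) + bias)
            for neuron_weights, bias in zip(layer["weights"], layer["biases"])
        ]
    return values


def mean_squared_error(outputs, expected):
    """Error calculation: average squared difference from the expected result."""
    return sum((o - e) ** 2 for o, e in zip(outputs, expected)) / len(outputs)


rng = random.Random(42)  # fixed seed so runs are repeatable
network = init_network([3, 2, 1], rng)  # 3 inputs, 2 hidden neurons, 1 output
outputs = forward(network, [0.2, 0.7, 1.0])
error = mean_squared_error(outputs, [1.0])
print(outputs, error)
```

The `error` value is what the adjustment step would then try to reduce, whether by picking the best performers or by following gradients.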
Now that we know how neural networks are structured and trained, let's finally see how to build one step by step!
How to create a neural network
- Configure Inputs: We need to convert whatever input we want to feed into the network to a numerical value from 0 to 1. We will need as many input nodes (neurons) as there are inputs.
  - For images, the usual way is to have an input for every pixel and convert the RGB values of each pixel into values from 0 to 1.
  - When training an AI car, you can have an input for every sensor on the car, plus the car's current speed and position.
- Configure the Hidden Layer(s): We can set up as many hidden layers, with as many nodes and connections, as we wish. We need to take into account that training time increases as the system gets more complex, but a system that is too simple can be inaccurate. A common rule of thumb is the following:
  The optimal size of the hidden layer is usually between the size of the input layer and the size of the output layer.
- Configure Output: We will need as many output neurons as pieces of information we want to extract from the network at once. For complex networks it can be convenient to label every output node clearly.
  - For binary results such as yes/no or "this is a cat"/"this is a dog", a single neuron is enough.
  - For the previous example of an AI car, the outputs can be three nodes: left steering, right steering and speed.
- Set up the parameters:
  - Population and Epochs determine how long the AI will take to learn, as well as how heavily it uses our PC. The greater those values, the better trained our AI will be, but the longer training will take or the more computing power it will need.
  - Both Entropy and Survivability should be kept quite small, since applying too much randomness may keep the AI from ever learning properly.
  - The Score system also has to be configured in this step, although it can be changed after a training run in case the AI does not behave as intended.
- Get a dataset: Now we need to create or obtain all the data the AI will train with. It can be gathered manually, obtained from online repositories or imported from an API or program.
- Training!: We can finally run the code! Let the magic happen and watch as the AI learns!
- Rinse and repeat: Now it's time to look at our creation and ask ourselves: does the AI make mistakes? Can we improve on it?
  The answers to both questions will be a YES on our first attempt, so this is where we tweak parameters, check the dataset, change the Score system and repeat the training until we get a result that satisfies us.
  Note: It's advisable to save previously trained neural networks, or at least the best one so far, in case we don't find a way to improve on it by changing things around.
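The input, hidden layer and output steps above can be sketched with a few small helpers. These are illustrative assumptions, not fixed rules: each pixel becomes a single input by averaging its RGB channels, the hidden layer size is simply the midpoint between the input and output sizes, and the "cat"/"dog" labels and the 0.5 threshold are picked for the example.

```python
def pixels_to_inputs(pixels):
    """Turn (r, g, b) pixels with 0-255 channels into one 0-1 input per pixel."""
    return [(r + g + b) / (3 * 255) for r, g, b in pixels]


def hidden_size(n_inputs, n_outputs):
    """One size between the input and output layer sizes: their midpoint."""
    return (n_inputs + n_outputs) // 2


def binary_label(output_value, threshold=0.5):
    """Read a single output neuron as a two-way answer."""
    return "cat" if output_value >= threshold else "dog"


# A hypothetical three-pixel "image": white, black and pure red.
pixels = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
inputs = pixels_to_inputs(pixels)

# One input node per pixel, a midpoint-sized hidden layer, one output node.
layer_sizes = [len(inputs), hidden_size(len(inputs), 1), 1]

print(inputs)
print(layer_sizes)
print(binary_label(0.8))
```

Averaging the channels throws away colour information; if colour matters, using one input per channel (three per pixel) is an equally reasonable choice.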
Last words ~
So that's it! I hope my explanations were easy to follow and that you learned something from this post :D
Also don't hesitate to contact me if you see any wrong information or weird explanations since I am no expert myself and I'm just getting into using neural networks. I am eager to improve this post as I learn more about them myself!