Neural networks are a hot topic in the technology industry today, partly because they make a cameo appearance in many everyday devices. From your phone's camera to an Alexa to even a toothbrush, companies and organizations are jumping on the AI hype train. Whether some of these are appropriate uses for neural networks and AI is up for debate (and you certainly won't see me wonder here why anyone would possibly need neural networks for brushing their teeth), but understanding how neural networks work will not only give you a bullet point on your resume, it will also help you know when to use them in real-world situations.
Of course, before you can code neural networks in any language or toolkit, first you must understand what they are.
What is this AI thing anyway?
Artificial Intelligence (AI) refers to machines attempting to do tasks that humans would otherwise do. There are many branches within AI, such as robotics and data mining, but because it's easy to get lost in such a broad subject, in this post we will focus on one specific subset: machine learning. Neural networks are part of the machine learning discipline, so it helps to get a basic understanding of that first.
If you go to Wikipedia, they have this definition for machine learning:
Machine learning (ML) is the study of computer algorithms that improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so.
In other words, we are feeding large amounts of training data into a system in order to teach it which data is correct or incorrect, true or false, etc. The idea is similar to using flashcards to memorize words or definitions and then testing yourself with true or false questions. Eventually, you will be able to identify words and maybe even their synonyms without the help of flashcards, and similarly in machine learning the system will be able to correctly classify objects without the need for training data or human intervention.
Machine learning algorithms commonly use either supervised learning, where the system learns from labeled examples to identify a target, or unsupervised learning, where it looks for patterns or groupings in a bunch of unlabeled objects. Other approaches exist but are less used in business settings.
Now back to the point:
A neural network is made of simple entities that are chained together in layers to perform one processing function. By stacking these entities, which we call neurons (and which, by the way, work quite differently from brain neurons despite the shared name), into layers, we can perform complex processing using neurons that each do something simple.
Neural networks are composed of three kinds of layers. There is an input layer full of neurons that contain the input split into chunks, one or more hidden layers whose neurons each process the values coming from the previous layer, and an output layer with neurons storing the output.
Neural networks are used in machine learning algorithms to do the actual classification. Each layer has several neurons, and each neuron processes a fragment of the data. The input layer splits the input data into chunks in an application-defined way. Each hidden layer then processes the chunks it receives and produces outputs for the next layer, until the result is eventually transmitted to the output layer.
Actually, a neuron's computation is very simple. It takes a numerical input, multiplies it by a weight value, and then passes the result as an output to a neuron in the next layer. The key is that all the neurons have different, but predictable, weight values.
As you saw in the image above, the layers do not all have to have the same number of neurons. As a consequence, a neuron may take more than one input from the previous layer and send its output to more than one neuron in the next layer.
Now, what happens when a neuron takes multiple inputs? Each input is multiplied by a weight, most likely a different weight value for each input, and the weighted inputs are added together. We also add a constant number called a bias to that sum. By the way, every neuron has its own bias, even single-input neurons.
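The weighted sum and bias described above can be sketched in a few lines of Python. The function name and all the numbers are made up for illustration, and the sketch covers only the sum-and-bias step, not the function that a neuron applies afterwards.

```python
def weighted_sum_plus_bias(inputs, weights, bias):
    """Multiply each input by its weight, add the products together, then add the bias."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    total = sum(x * w for x, w in zip(inputs, weights))
    return total + bias


# A neuron with three inputs (hypothetical values, chosen only for this example).
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]
bias = 0.5

# 1.0*0.5 + 2.0*(-1.0) + 3.0*2.0 = 4.5, and adding the bias gives 5.0
print(weighted_sum_plus_bias(inputs, weights, bias))  # prints 5.0
```

A single-input neuron is just the special case where both lists contain one element.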
Finally, each neuron has a special function, usually called an activation function, that takes the sum of the weighted inputs plus the bias as a single argument and produces the neuron's output. We will see such functions in the next section.
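Written as a formula, the whole computation of one neuron looks roughly like this. The symbols are my own choice of notation for this post: x_i are the n inputs, w_i the matching weights, b the bias, f the neuron's function, and y the output.

```latex
z = \sum_{i=1}^{n} w_i x_i + b, \qquad y = f(z)
```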
Here are some practical examples of these neuron functions, usually called activation functions, that you will encounter in production machine learning programs. There are many more activation functions besides the ones listed here.
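As an illustration, here is a sketch of three widely used functions of this kind, usually called activation functions: sigmoid, tanh, and ReLU. The selection, the helper names, and the numbers are my own assumptions for this example, and real toolkits ship their own optimized versions.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    # Two branches so that math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def tanh(z):
    """Squash any number into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive numbers through unchanged and replace negative numbers with 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# The same hypothetical neuron evaluated with each activation function.
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, neuron([1.0, 2.0], [0.5, -1.0], 0.25, fn))
```

Swapping the function changes the range of values a neuron can output, which is one reason different layers in a network may use different ones.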
Here is another image that hopefully will help you understand what a neuron does.
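To tie the pieces together, here is a minimal sketch of data flowing through a tiny network: two inputs, a hidden layer of three neurons, and an output layer of one neuron. It assumes that every neuron receives all the outputs of the previous layer and that every neuron uses the sigmoid function. The weights and biases are invented for the example, not trained values.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def layer_forward(inputs, layer):
    """Run one layer: each neuron weights all inputs, adds its bias, applies sigmoid."""
    outputs = []
    for weights, bias in layer:
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        outputs.append(sigmoid(z))
    return outputs


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one and return the final outputs."""
    values = inputs
    for layer in layers:
        values = layer_forward(values, layer)
    return values


# Each neuron is a (weights, bias) pair; all numbers are hypothetical.
hidden_layer = [
    ([0.2, -0.4], 0.1),
    ([0.7, 0.3], -0.2),
    ([-0.5, 0.9], 0.0),
]
output_layer = [
    ([0.6, -0.1, 0.8], 0.05),  # three weights, one per hidden neuron
]

result = network_forward([1.0, 0.5], [hidden_layer, output_layer])
print(result)  # a list with one number between 0 and 1
```

Notice that the layers have different sizes: each hidden neuron takes two inputs, while the single output neuron takes three.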
Thanks for reading. In the next few posts, we will learn more about neural networks and will also explore some machine learning toolkits and learn how to do ML programming in various languages.