The human brain was the model for developing neural networks because it can learn, understand, and solve complex problems. This understanding of the brain's nature gave birth to the concept of Artificial Neural Networks (ANN), first proposed by Warren McCulloch (a neuroscientist) and Walter Pitts (a logician) in 1943. To better understand how the McCulloch and Pitts (MCP) model works, take a lesson from biological science and look at how the brain itself works.
In a biological neuron, as seen above, the dendrites are responsible for receiving messages/electrical signals from other cells, much like an input device. The soma, like the CPU, is responsible for processing these signals and returning an output/response, and the axon conveys this output to other neurons and organs in the body.
Now, the MCP model was a simplified implementation of the biological neuron. The MCP neuron computes an aggregated sum, ∑, of its inputs, compares it to a fixed threshold, θ, and returns an output of one if the sum reaches the threshold and zero otherwise. This output may correspond to a particular action, such as classifying an email as spam or not; a minimal code sketch of this idea follows the list of limitations below. However, the MCP model had its limitations, which were:
- It only accepted binary inputs (0 and 1).
- It gave every input equal importance, which failed to reflect the workings of life events because some things matter more than others.
- It could not handle problems that are not linearly separable.
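Here is that sketch of the MCP neuron in Python. The function name and the example threshold are illustrative, not part of the original 1943 formulation:

```python
# A minimal MCP-style neuron: binary inputs, equal importance, fixed threshold.
def mcp_neuron(binary_inputs, threshold):
    total = sum(binary_inputs)              # aggregated sum of the inputs
    return 1 if total >= threshold else 0   # fire (1) only when the threshold is reached

print(mcp_neuron([1, 0, 1], threshold=2))   # 1 -> the neuron fires
print(mcp_neuron([1, 0, 0], threshold=2))   # 0 -> the neuron stays silent
```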
These limitations led to the proposal of the Perceptron by Frank Rosenblatt, an American psychologist, in 1958.
Perceptron: The Single-Layer Neural Network.
The Perceptron neural network is a supervised learning algorithm for binary linear classification. It consists of four parts:
- Inputs.
- Weights and Biases.
- Activation function.
- Output.
Here is how a Perceptron neural network works:
The inputs are multiplied by their respective weights and summed together with the bias, w0. The result is called the weighted sum. This equation represents the math behind it:

$$(1 \times w_0) + (x_1 \times w_1) + (x_2 \times w_2) + \dots + (x_n \times w_n)$$

What is the purpose of weights? Weights show the importance of each input: the higher the weight of an input, the more effect it has on the final result. Assume you are training a perceptron to classify whether an email is spam or not. If an input x1, which represents an email being sent to multiple accounts, has a weight w1 of 10, it will have a greater effect on the final output of the perceptron.
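To see the effect of weights numerically, here is a tiny sketch with made-up numbers. The inputs, weights, and bias below are hypothetical, chosen only to show how a large weight dominates the sum:

```python
# Hypothetical spam features: x1 = "sent to many accounts", x2 = "contains a greeting".
x_inputs = [1.0, 1.0]
weights = [10.0, 0.5]   # w1 is much larger, so x1 dominates the weighted sum
bias = -4.0             # this plays the role of w0 in the formula above

weighted_sum = bias + sum(x * w for x, w in zip(x_inputs, weights))
print(weighted_sum)     # 6.5 -> driven almost entirely by the heavily weighted x1
```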
After computing the weighted sum, the perceptron passes it to the activation function. The activation function is a non-linear function that converts the weighted sum into a new range of values, for example, 0 to 1 (sigmoid function), -1 to +1 (tanh function), or 0 to infinity (ReLU function). This conversion helps the perceptron learn complexity within the data and decide whether or not to perform an action, also referred to as a neuron deciding whether to fire.
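For reference, here is a small sketch of the three activation functions mentioned above; the sample value is arbitrary and only used to show the output ranges:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))   # squashes any real number into (0, 1)

def tanh(z):
    return math.tanh(z)             # squashes any real number into (-1, 1)

def relu(z):
    return max(0.0, z)              # clips negative values to 0, unbounded above

z = 2.0  # an arbitrary weighted sum
print(sigmoid(z), tanh(z), relu(z))
```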
How about the purpose of the bias? The bias is a constant value, similar to the intercept c in the equation of a straight line, $$y = mx + c$$, that shifts the activation to the left or right, making it crucial in deciding whether a neuron fires or not. Take a look at this example: after computing the weighted sum for a set of inputs, you arrive at a value of -0.35, but the neuron will not perform any action (fire) for any value less than zero. To ensure that the neuron fires for this set of inputs, you add a bias, say 1, to the weighted sum, giving a value of 0.65 and causing the neuron to fire.
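The same arithmetic in a couple of lines, using the numbers from the example above:

```python
weighted_sum = -0.35   # below zero, so on its own the neuron would not fire
bias = 1.0

shifted = weighted_sum + bias
print(shifted)                                    # 0.65
print("fire" if shifted > 0 else "do not fire")   # fire
```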
Perceptron Implementation In Python.
In this section, you will see how to implement a perceptron neural network that decides if it fires or not based on its inputs.
```python
x_inputs = [1.2, 0.8, -0.7]
w_weights = [1, 0.87, -0.2]

def activating_step_function(weighted_sum):
    # Step activation: 1 if the weighted sum is positive, otherwise 0.
    if weighted_sum > 0:
        return 1
    else:
        return 0

def neuron_fire_or_not(value_activation_function):
    # Decide the neuron's action based on the activation value.
    if value_activation_function <= 0:
        return "Do Not Fire Neuron"
    else:
        return "Fire Neuron"

def perceptron():
    # Compute the weighted sum of the inputs and their weights.
    weighted_sum = 0
    for x, w in zip(x_inputs, w_weights):
        weighted_sum += x * w
        print(weighted_sum)  # show the running sum on every iteration
    value_activation_function = activating_step_function(weighted_sum)
    return neuron_fire_or_not(value_activation_function)

if __name__ == "__main__":
    print(perceptron())
```
In the demo above, the weighted sum is computed in the `perceptron` function by summing the inputs multiplied by their respective arbitrarily chosen weights. The weighted sum is then passed to the `activating_step_function`, which returns 0 for a weighted sum less than or equal to zero and 1 if it is greater than zero. The value from the `activating_step_function` is passed on to the `neuron_fire_or_not` function, which fires the neuron for every value greater than zero. Run the code and see what happens.
Output:

```
1.2
1.896
2.036
Fire Neuron
```
By printing the weighted sum on every iteration, you can see that it reaches a final value of 2.036. Since the final sum is greater than 0, the neuron is activated (fires).
Multilayer Perceptron.
The perceptron neural network, like the MCP model, had its limitations. For one, as a binary classifier it could only return binary values, 0 and 1, and it could be trained only on linearly separable problems.
Assume the red and blue dots are two different classes of a problem you are trying to solve, for example, spam classification. The red dots represent spam emails and the blue dots represent non-spam emails. The graph on the left is linearly separable because you can draw a straight line to separate the two classes. But in the image on the right, there is no way to draw a single straight line that distinctly separates the two classes. A single perceptron neural network can solve the problem on the left but not the one on the right.
These limitations led to the development of the Multilayer Perceptron (MLP) neural network. The MLP comprises two or more single perceptrons combined. It has three layers: the input layer, the hidden layer, and the output layer, with each node in a layer (the circles you see in the diagram above) fully connected to every node in the next layer, making it a fully-connected neural network. With this structure, a multilayer perceptron has multiple weights and biases at each layer of the network. This added sophistication and complexity allows a multilayer perceptron to solve both linearly and non-linearly separable problems.
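To see why stacking perceptrons helps, here is a minimal sketch of a two-layer network that computes XOR, a classic example of a problem that a single straight line cannot separate. The weights and thresholds below are hand-picked for illustration, not learned:

```python
def step(z):
    # Step activation, as in the perceptron demo above.
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer: two perceptrons with hand-picked weights and biases.
    h1 = step(x1 + x2 - 0.5)    # behaves like OR
    h2 = step(x1 + x2 - 1.5)    # behaves like AND
    # Output layer: one perceptron combining the hidden outputs.
    return step(h1 - h2 - 0.5)  # OR and not AND -> XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_mlp(a, b))  # prints 0, 1, 1, 0
```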
Recap.
- The human brain inspired the first model of a neural network - the McCulloch and Pitts (MCP) model.
- Frank Rosenblatt invented the Perceptron neural network in 1958 to overcome the limitations of the MCP model.
- The Perceptron comprises inputs, weights and biases, an activation function, and an output.
- The Perceptron was also limited because it could only output binary values (0 and 1) and solve only linearly separable problems.
- The Multilayer Perceptron (MLP) overcame the limitations of the single-layer Perceptron by combining two or more perceptrons.
- The MLP has an input layer, hidden layer(s), and an output layer.
Thank you for reading this article; if you enjoyed it, be sure to like it.
You can follow me on Twitter and LinkedIn to learn more about the world of machine learning and artificial intelligence in general, and if you have any questions, kindly leave a comment below.