Image Classification with CNNs – Part 4: Dealing with Variations in Input

#ai #machinelearning

In the previous article, we processed the input through the neural network and correctly predicted the O shape.

Now let's do the same for the X shape.

If we repeat the same process of creating the feature map and then applying max pooling, we will obtain the following result.

Now let us see what happens when we shift the image of the letter X one pixel to the right.

Let's calculate the feature map.
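As a rough sketch of this step, the snippet below computes a feature map in plain Python by sliding a 3x3 filter over a 6x6 image and adding a bias. The pixel grid, the filter values, the bias of -2, and the helper names `shift_right` and `feature_map` are all illustrative assumptions, not the values shown in this series.

```python
# Hypothetical 6x6 image with a small X in the top-left corner
# (1 = dark pixel, 0 = background). Not the article's actual pixels.
image = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Hypothetical 3x3 filter shaped like an X, with an assumed bias.
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
BIAS = -2


def shift_right(img, pixels=1):
    """Move every row to the right, filling the left edge with background."""
    width = len(img[0])
    return [[0] * pixels + row[: width - pixels] for row in img]


def feature_map(img, k, bias=0):
    """Slide the filter over the image; each output is a dot product plus bias."""
    kh, kw = len(k), len(k[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [
        [
            sum(img[r + i][c + j] * k[i][j] for i in range(kh) for j in range(kw)) + bias
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


original_map = feature_map(image, kernel, BIAS)
shifted_map = feature_map(shift_right(image), kernel, BIAS)

for row in shifted_map:
    print(row)
```

With these made-up values, the strongest response moves from column 0 to column 1 of the first row once the image is shifted. The later steps have to absorb exactly this kind of change.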

Now let's pass it through the ReLU activation function.
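ReLU itself is simple enough to sketch in a couple of lines: every negative value becomes 0 and everything else passes through unchanged. The feature-map values below are assumed for illustration and are not taken from the article's figures.

```python
def relu(fmap):
    """Replace every negative value with 0 and keep the rest unchanged."""
    return [[max(0, value) for value in row] for row in fmap]


# Hypothetical feature-map values, chosen only for illustration.
feature_map = [
    [-2, 3, -2, 0],
    [0, -2, 0, -2],
    [-2, 0, -2, -1],
    [-2, -2, -2, -2],
]

activated = relu(feature_map)
for row in activated:
    print(row)
```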

Now we can apply max pooling.
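Max pooling can be sketched as follows, assuming a 2x2 window that moves 2 cells at a time so the windows do not overlap. The window size, the stride, and the input values are assumptions made for this example.

```python
def max_pool(fmap, size=2, stride=2):
    """Keep only the largest value inside each size x size window."""
    rows, cols = len(fmap), len(fmap[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [fmap[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical output of the ReLU step, chosen only for illustration.
activated = [
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled = max_pool(activated)
print(pooled)
```

Because each window keeps only its maximum, the exact cell in which the 3 appears inside the top-left window does not affect the pooled result.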

Now, let's take the input nodes and process them through the neural network.
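To make this step concrete, here is a minimal sketch that flattens a pooled output into input nodes and feeds them to one output node per letter. It assumes a single fully connected layer, which may be simpler than the network used in this series, and the weights and biases are picked by hand for illustration rather than learned.

```python
# Hypothetical pooled output, flattened row by row into input nodes.
pooled = [[3, 0], [0, 0]]
inputs = [value for row in pooled for value in row]

# Made-up weights and biases for a single fully connected layer with one
# output node per letter. These are NOT the trained values from the series.
weights = {
    "O": [-0.1, 0.3, 0.2, 0.4],
    "X": [0.4, -0.2, -0.3, -0.1],
}
biases = {"O": 0.1, "X": -0.2}


def output_node(x, w, b):
    """Weighted sum of the inputs plus a bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


outputs = {label: output_node(inputs, weights[label], biases[label]) for label in weights}

# The predicted letter is the one whose output node has the largest value.
prediction = max(outputs, key=outputs.get)
print(outputs, prediction)
```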

We can observe that the output value for the letter X is now much closer to 1 than the output value for the letter O, which is -0.2, so the shifted image is still classified as X.

In any convolutional neural network, no matter how complex it is, the model will consist of the following components:

Filters (also called kernels), which are applied to the input by convolution
Applying an activation function to the filter output
Pooling the output of the activation function
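The three components can be chained together in a short, self-contained sketch. It runs a hypothetical X image and a copy shifted one pixel to the right through filter, ReLU, and max pooling, then compares the pooled results. The image, the filter, the bias, and the function names are assumptions made for this example, not the article's actual values.

```python
# Hypothetical 6x6 image with a small X in the top-left corner.
image = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Hypothetical X-shaped filter and assumed bias.
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
BIAS = -2


def shift_right(img, pixels=1):
    """Move every row to the right, filling the left edge with background."""
    width = len(img[0])
    return [[0] * pixels + row[: width - pixels] for row in img]


def apply_filter(img, k, bias):
    """Component 1: slide the filter over the image to build a feature map."""
    kh, kw = len(k), len(k[0])
    return [
        [
            sum(img[r + i][c + j] * k[i][j] for i in range(kh) for j in range(kw)) + bias
            for c in range(len(img[0]) - kw + 1)
        ]
        for r in range(len(img) - kh + 1)
    ]


def relu(fmap):
    """Component 2: the activation function sets negative values to 0."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Component 3: keep the largest value in each non-overlapping window."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def run(img):
    """Filter, then ReLU, then max pooling."""
    return max_pool(relu(apply_filter(img, kernel, BIAS)))


pooled_original = run(image)
pooled_shifted = run(shift_right(image))
print(pooled_original)
print(pooled_shifted)
```

With these assumed values, both images pool to the same result because the one-pixel shift keeps the strongest response inside the same 2x2 window. With the same made-up data, a two-pixel shift moves the peak into the neighbouring window, so pooling provides tolerance to small shifts rather than complete position independence.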

These components together allow convolutional neural networks to handle variations in image position while still making correct predictions.

That is the end of this series. We will explore recurrent neural networks next.

