Knowing Convolution Basics

#deeplearning #convolutionalneuralnetwork #convolution

In this article, we look at grayscale images, colour images and the process of convolution.

Grayscale image

A grayscale image is an image represented using only shades of grey. The intensity of each pixel is stored as an 8-bit integer from 0 to 255, where 0 is black and 255 is white. A grayscale image uses only one channel.
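To make the single-channel idea concrete, here is a minimal sketch in plain Python. The 3x3 pixel values are made up for illustration and do not come from any real image.

```python
# Hypothetical 3x3 grayscale image: one intensity value (0-255) per pixel.
gray = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]

height = len(gray)        # number of rows
width = len(gray[0])      # number of columns
darkest = min(min(row) for row in gray)
brightest = max(max(row) for row in gray)

print(height, width, darkest, brightest)
```

Because there is only one channel, a single number per pixel is enough to describe the whole image.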

Colour image

Coloured images are constructed by combining red, green and blue (RGB) in variable proportions, which is why these 3 colours are called the primary colours. Each pixel of a colour image has three channels: the R channel, the G channel and the B channel, each with its own intensity value ranging from 0 to 255.
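The three channels can be sketched the same way. In the example below, the 2x2 image and the helper name split_channels are hypothetical and chosen only for illustration.

```python
# Hypothetical 2x2 colour image: each pixel is an (R, G, B) triple, 0-255 each.
colour = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def split_channels(img):
    """Return the R, G and B channels as three separate 2-D lists."""
    return tuple([[px[ch] for px in row] for row in img] for ch in range(3))

red, green, blue = split_channels(colour)
print(red)
print(green)
print(blue)
```

Each channel on its own looks just like a grayscale image, which is why the convolution described later can be applied channel by channel.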

What is a convolution

Convolution is the process of multiplying each pixel in a region of the image by the corresponding value of the filter (also called a kernel) and then adding all of the products to get a single number. Repeating this at every position of the filter gives the output image representation.

Now let us look at an example of convolution.

We pass a 6x6 input through a 3x3 filter (here we are using a vertical filter) and get a 4x4 output.
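The 4x4 size follows from the usual no-padding rule: output side = (input side - filter side) / stride + 1. The sketch below assumes a 3x3 filter, which is the size implied by the nine products in the worked example, and the function name output_size is only illustrative.

```python
def output_size(n, f, stride=1):
    """Side length of the output for an n x n input and an f x f filter, no padding."""
    return (n - f) // stride + 1

print(output_size(6, 3))  # 4
```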

Now let us look at how each of the entries in the output is obtained.

We place the filter on top of the input, starting at the top left corner. At each position we multiply the corresponding entries and add them together; the result is the corresponding output entry. Here we take the stride value as 1. That is, we move the filter 1 column to the right after each calculation, and when it reaches the end of a row we move it 1 row down and start again from the left. This process goes on till the filter reaches the bottom right corner.
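The sliding procedure can be sketched in a few lines of plain Python. This is a minimal sketch, not a library implementation: the function name convolve2d, the 6x6 input (bright left half, dark right half) and the [1, 0, -1] vertical edge filter are all assumptions made for illustration and are not the values from the figures.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a 2-D image (lists of lists), no padding."""
    k = len(kernel)
    rows = (len(image) - k) // stride + 1
    cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(rows):                 # move down one step at a time
        out_row = []
        for c in range(cols):             # move right one step at a time
            top, left = r * stride, c * stride
            total = 0
            for i in range(k):
                for j in range(k):
                    # multiply corresponding entries and accumulate the sum
                    total += image[top + i][left + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output

# Hypothetical 6x6 input: left half bright (10), right half dark (0).
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
# A common vertical edge filter, assumed here for illustration.
kernel = [[1, 0, -1] for _ in range(3)]

result = convolve2d(image, kernel)
for row in result:
    print(row)
```

With these assumed values every output row is [0, 30, 30, 0]: the non-zero entries appear exactly where the filter window straddles the vertical edge in the input.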

The Convolution operation: The part of the input to be convolved with the filter in each step is highlighted.

The 1st output entry:
1(2)+1(0)+1(-1)+1(1)+1(0)+1(-2)+1(2)+1(0)+1(-1)

=2-1+1-2+2-1

=1
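The arithmetic above can be checked with a short sketch. It assumes that in each product the first factor is the input pixel and the number in brackets is the filter weight, and that the products are listed row by row; under that reading the top-left patch is all ones and the filter rows are [2, 0, -1], [1, 0, -2] and [2, 0, -1].

```python
# Top-left 3x3 patch of the input, as read from the worked example (assumed).
patch = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
# Filter weights, as read from the bracketed values (assumed row-by-row order).
vertical_filter = [
    [2, 0, -1],
    [1, 0, -2],
    [2, 0, -1],
]

products = [patch[i][j] * vertical_filter[i][j] for i in range(3) for j in range(3)]
first_entry = sum(products)

print(products)     # [2, 0, -1, 1, 0, -2, 2, 0, -1]
print(first_entry)  # 1
```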