DEV Community

Varun Pusarla

Object Detection using Regions with CNN features

Object detection is one of the most active areas of computer vision right now. Research in this field is moving at a fast pace, and the results are impressive.
But what exactly is object detection?

Object detection deals with identifying and locating objects of certain classes in an image. Some of the most successful object detection algorithms to date are:

  1. A series of region-based CNNs: R-CNN, Fast R-CNN, Faster R-CNN
  2. YOLO
  3. SSD

In this article we’re going to look at R-CNN and the series of improvements made to it.

1. Regions with CNN (R-CNN):

This algorithm finds specific regions in an image that are likely to contain objects (using Selective Search) and forwards each of them through a CNN to extract features. The extracted features are then used to predict the class of each region and the bounding box around it.

Basically, R-CNN involves the following steps:

  1. Around 2000 bottom-up region proposals are extracted from an input image.
  2. Regardless of the size or aspect ratio of a candidate region, all pixels in a tight bounding box around it are warped to the fixed input size the CNN requires. A large CNN then computes features for each warped proposal.
  3. Each region is classified using class-specific linear SVMs.
  4. The algorithm also predicts four offset values per region to make the bounding box more precise.
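
Steps 2 and 4 above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `warp_region` and `apply_offsets` are hypothetical, the image is a nested list rather than a real tensor, nearest-neighbour sampling is an assumed stand-in for whatever resizing a real pipeline uses, and the offset formula follows the parameterisation described in the R-CNN paper (reference 2), with boxes given as centre, width and height.

```python
import math

def warp_region(image, box, out_size):
    """Crop box (x0, y0, x1, y1; end-exclusive) from a 2D image and resize it
    to out_size x out_size with nearest-neighbour sampling, ignoring the
    original aspect ratio (step 2)."""
    x0, y0, x1, y1 = box
    crop_h, crop_w = y1 - y0, x1 - x0
    warped = []
    for i in range(out_size):
        src_y = y0 + (i * crop_h) // out_size  # source row for output row i
        row = [image[src_y][x0 + (j * crop_w) // out_size]
               for j in range(out_size)]
        warped.append(row)
    return warped

def apply_offsets(proposal, offsets):
    """Refine a proposal (cx, cy, w, h) with four predicted offsets
    (tx, ty, tw, th) (step 4). The centre shifts in proportion to the box
    size; width and height are scaled exponentially."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = offsets
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))
```

With all four offsets at zero the proposal is returned unchanged, which is why the regressor only has to learn small corrections to the proposals it is given.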


Drawbacks:

  1. Training is slow: around 2000 region proposals have to be processed per image, which takes a huge amount of time.
  2. Inference is slow: it takes around 47 s per image with VGG16, so real-time detection is not possible.

2. Fast R-CNN:

Fast R-CNN is an improved successor to the R-CNN algorithm. It makes several changes that make it faster and more accurate than R-CNN.
The main problem with R-CNN was that it performs a separate CNN pass for each region proposal without sharing computation. Fast R-CNN improves on this by forwarding the whole image through the CNN only once.

It involves the following steps:

  1. The entire image is passed forward through a CNN to generate a convolutional feature map.
  2. Regions of interest (RoIs), still proposed by selective search, are located on the convolutional feature map, and an RoI pooling layer is applied to reshape them all to the same size. Each pooled proposal is then passed to fully connected layers.
  3. A softmax layer and a linear regression layer are then used in parallel to output the classes and bounding boxes.
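
The RoI pooling in step 2 can be illustrated with a minimal pure-Python sketch. It assumes a single-channel feature map stored as a nested list, an RoI already expressed in feature-map cells, and a hypothetical function name `roi_max_pool`; a real implementation works on multi-channel tensors and handles the scaling from image to feature-map coordinates.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool an RoI (x0, y0, x1, y1; end-exclusive, in feature-map cells)
    into a fixed out_h x out_w grid, whatever the RoI's own size is.
    Assumes the RoI is non-empty and lies inside the feature map."""
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [ys, ye); every bin keeps at least one row.
        ys = y0 + (i * roi_h) // out_h
        ye = max(ys + 1, y0 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * roi_w) // out_w
            xe = max(xs + 1, x0 + ((j + 1) * roi_w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled

# Two RoIs of different shapes both come out as 2 x 2 grids.
feature_map = [[r * 5 + c for c in range(5)] for r in range(4)]
pooled_a = roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2)
pooled_b = roi_max_pool(feature_map, (1, 1, 4, 3), 2, 2)
```

Because every RoI is reduced to the same grid size, the fully connected layers that follow can process proposals of any shape, and the expensive convolutional pass is shared across all of them.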


Drawbacks:

Although Fast R-CNN improves on R-CNN in both speed and accuracy, it still uses selective search to generate region proposals, which is a time-consuming process.

3. Faster R-CNN:

Faster R-CNN is a further advancement over Fast R-CNN. The major difference between the two is that Faster R-CNN uses a Region Proposal Network (RPN) instead of selective search to generate regions of interest.

Faster R-CNN involves the following steps:

  1. The entire image is passed forward through a CNN to generate a convolutional feature map (just as in Fast R-CNN).
  2. Regions of interest are identified by applying a Region Proposal Network (RPN) to this feature map, which returns object proposals along with their objectness scores.
  3. An RoI pooling layer is applied to the proposals to bring them to the same size, and they are then passed to fully connected layers.
  4. A softmax layer and a linear regression layer are applied on top to classify the proposals and output the bounding boxes.
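
An RPN typically returns many overlapping proposals, so the objectness scores from step 2 are used to decide which ones to keep. The sketch below shows one common way to do that, greedy non-maximum suppression based on intersection over union (IoU). The function names, the `(x0, y0, x1, y1)` box format, and the default `iou_threshold` and `top_k` values are illustrative assumptions, not values taken from this article.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def select_proposals(boxes, scores, iou_threshold=0.7, top_k=300):
    """Return indices of the proposals to keep: visit boxes from highest to
    lowest objectness score and drop any box that overlaps an already kept
    box by more than iou_threshold. Stops once top_k boxes are kept."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
            if len(kept) == top_k:
                break
    return kept

# The second box overlaps the first heavily and has a lower score.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
kept = select_proposals(boxes, scores=[0.9, 0.8, 0.7])
```

Only the surviving proposals need to go through RoI pooling and the classification layers, which keeps the later stages cheap.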


Faster R-CNN is the fastest and most accurate of the object detection algorithms discussed in this article. Object detection is not limited to region-based CNNs, though. There are many other algorithms, such as YOLO, SSD and RetinaNet, which we will discuss in upcoming articles.

References:

  1. http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf
  2. https://arxiv.org/pdf/1311.2524.pdf
  3. https://arxiv.org/pdf/1504.08083.pdf
