DEV Community

How Automatic Image Moderation Tools Use AI to Reduce Data Biases

To the uninitiated it may look like magic, but automated image moderation rests on well-established deep learning techniques.

Since AlexNet was introduced in 2012, neural architectures for computer vision have made great strides in performance. While these deep learning techniques can beat human performance on some tasks, they are heavily data-dependent, which is why it is essential to be extra vigilant about biased data.
In this article, we discuss dataset bias and focus on an interesting technique that can help diagnose and alleviate it: Grad-CAM.

Training dataset & Dataset bias
Automated image moderation algorithms are trained to spot characteristic elements related to certain situations, actions, body parts or positions (in order to detect specific content such as porn, violence or obscene gestures). To do so, they need to analyze thousands of pictures gathered in a training dataset. But a dataset can easily be biased: while a gigantic amount of data is generated every day, only a small part of it is accessible, and there is no way to be sure that its distribution is truly representative of the environment in which the model will be used.

For example, a face detection dataset made of 7000 images of men and 3000 images of women is biased with respect to reality. A model trained on this dataset is therefore likely to perform worse at detecting women's faces in a real-world situation.
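The imbalance in this example, and its possible effect on detection, can be made measurable with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the detector results are made-up numbers used only to show the calculation, not measurements from a real model.

```python
from collections import Counter

def group_shares(groups):
    """Return the fraction of samples belonging to each group."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def recall_by_group(groups, detected):
    """Return, per group, the fraction of faces the detector found.

    groups   -- group label of each test image (every image contains a face)
    detected -- True if the detector found the face in that image
    """
    hits, totals = Counter(), Counter()
    for g, d in zip(groups, detected):
        totals[g] += 1
        hits[g] += int(d)
    return {g: hits[g] / totals[g] for g in totals}

# Training-set composition from the example above.
train_groups = ["men"] * 7000 + ["women"] * 3000
print(group_shares(train_groups))  # {'men': 0.7, 'women': 0.3}

# Hypothetical detector results on a balanced test set (made-up numbers).
test_groups = ["men"] * 100 + ["women"] * 100
detected = [True] * 95 + [False] * 5 + [True] * 80 + [False] * 20
print(recall_by_group(test_groups, detected))  # {'men': 0.95, 'women': 0.8}
```

Reporting the metric per group rather than as a single average is what makes the gap visible: an overall recall of 87.5% would hide the difference between the two groups.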

Comprehensive test set
To detect biases and improve the algorithm, it must be evaluated on a dataset that is as comprehensive and diverse as possible. It is therefore essential to gather a test set that shares no images with the training set, ideally drawn from different sources. At PicPurify, our go-to technique to bootstrap a test set is to extract a portion of a commercially usable dataset such as Google's Open Images dataset and to verify it manually.
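One simple way to check that a test set really is distinct from the training set is to compare content hashes of the image files. The sketch below is an illustration under assumptions, not a description of PicPurify's pipeline: the function names are hypothetical, and images are passed as in-memory bytes to keep the example self-contained.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Hash the raw bytes of an image file."""
    return hashlib.sha256(data).hexdigest()

def find_leaks(train_images, test_images):
    """Return names of test images whose bytes also appear in the training set.

    Both arguments map a file name to the raw file bytes.
    """
    train_hashes = {content_hash(b) for b in train_images.values()}
    return sorted(name for name, b in test_images.items()
                  if content_hash(b) in train_hashes)

# Toy data standing in for real image files.
train = {"a.jpg": b"\x01\x02", "b.jpg": b"\x03"}
test = {"x.jpg": b"\x03", "y.jpg": b"\x04"}
print(find_leaks(train, test))  # ['x.jpg']
```

An exact hash only catches byte-identical copies. Resized or re-encoded duplicates would need a similarity measure such as a perceptual hash, which is outside the scope of this sketch.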

Case study: PicPurify’s obscene gesture classifier
To train our obscene gesture classifier, we first built a dataset of people giving the middle finger to the camera. Then, we used this dataset to train a neural network to classify images into positive and negative categories.
After we evaluated the model on a separate test set, the misclassified images gave us a first insight into the biases learned by the model: it initially tended to be confused by innocuous hand gestures (pointing index fingers, peace signs). Once we identified this trend, we gathered datasets of innocuous hand gestures and retrained our model.
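This kind of error analysis can be sketched as a simple tally of false positives. The example assumes that each misclassified image has been given a short descriptive tag during manual review; the record layout and the tags are hypothetical.

```python
from collections import Counter

def false_positive_tags(records):
    """Count reviewer tags among negatives that the model flagged as positive.

    Each record is (true_label, predicted_label, tag), with labels 0 or 1 and
    tag a short description assigned during manual review.
    """
    return Counter(tag for true, pred, tag in records if true == 0 and pred == 1)

# Made-up evaluation records for illustration.
records = [
    (0, 1, "pointing index"),
    (0, 1, "peace sign"),
    (0, 1, "pointing index"),
    (0, 0, "open hand"),
    (1, 1, "middle finger"),
    (1, 0, "middle finger"),
]
print(false_positive_tags(records).most_common())
# [('pointing index', 2), ('peace sign', 1)]
```

The most frequent tags point to the kind of images worth adding to the training set before retraining.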

To improve our model's performance, we use a visualization tool for convolutional neural networks: Grad-CAM (Gradient-weighted Class Activation Mapping). An interesting property of convolutional neural networks is their ability to retain spatial information through most of their layers. Grad-CAM makes good use of this property: it uses the gradients of the class score with respect to a convolutional layer's feature maps to generate a heat map highlighting the image regions that contributed most to the model's decision.
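The core Grad-CAM computation is short enough to sketch in plain Python. The example assumes that the feature maps of one convolutional layer and the gradients of the class score with respect to them have already been extracted (in practice a deep learning framework's automatic differentiation provides both). It follows the published Grad-CAM formula and is not PicPurify's implementation; the toy inputs are invented.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map from one convolutional layer.

    activations -- feature maps A[k][i][j] of the chosen layer
    gradients   -- d(class score)/dA[k][i][j], same shape as activations
    Returns an H x W map scaled to [0, 1] (all zeros if nothing is positive).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight: gradient averaged over all spatial positions.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    # Weighted sum of the feature maps, followed by a ReLU.
    cam = [[max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(width)] for i in range(height)]
    peak = max(map(max, cam))
    if peak == 0:
        return cam
    return [[v / peak for v in row] for row in cam]

# Toy layer with two 2x2 feature maps (invented values).
acts = [[[1, 0], [0, 0]],   # channel 0 fires in the top-left corner
        [[0, 0], [0, 2]]]   # channel 1 fires in the bottom-right corner
grads = [[[1, 1], [1, 1]],      # channel 0 raises the class score
         [[-1, -1], [-1, -1]]]  # channel 1 lowers it
print(grad_cam(acts, grads))  # [[1.0, 0.0], [0.0, 0.0]]
```

Only the top-left cell lights up, because the ReLU discards regions whose features argue against the class. In a real setting the resulting map is upsampled to the input resolution and overlaid on the image.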

This tool adds important information to misclassified images, giving us better insight into what caused the bias. For example, we were able to detect that our model misclassified an image because it was confused by a pen that looked a bit like a raised middle finger. With this technique, we can understand what the model misinterpreted. The next step is to gather as many similar images as possible and retrain the model.
