Detecting social distancing with use of Azure Custom Vision

Cecilia Rundberg
Cognitive Science student, Summer intern at Stratiteq

I am a student at the University of Gothenburg studying Cognitive Science. I have spent the last couple of weeks as a summer intern at Stratiteq, working on an AI project about detecting social distancing with drone surveillance.

Social distancing is important in our society nowadays, and it would certainly be useful to develop autonomous tools that help us keep a safe distance. In this post I will explain how I trained a custom model and used it to calculate the distance between people.

The model was trained with Azure Custom Vision, an AI service that allows easy customisation and training of custom models. It was then tested via a custom-made web page with JavaScript code. The drone used for this project was a DJI Mavic Mini.

Photo by Stefano Lombardo on Unsplash

The first step is to make sure that the model we are creating in Custom Vision is trained to detect a person and to distinguish one from other objects that appear. To do this we need enough data to train on; we used the Aerial Semantic Segmentation Drone Dataset found on Kaggle, together with a few test pictures we took with the drone at our Stratiteq after-work event. In total I used 100 pictures for the training.

Kaggle dataset

When you upload a picture, the still untrained model will suggest what it thinks are objects, and you then need to tag them correctly; in this case we point out and tag all the people in each picture with "person". After doing this the model can be trained and tested. For each iteration you will get a performance measure consisting of precision, recall and mAP, standing for:

  • Precision – the fraction of relevant instances among the retrieved instances
  • Recall – the fraction of the total amount of relevant instances that were retrieved
  • mAP (mean average precision) – overall object detector performance across all tags
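As an illustration of how the first two metrics relate to detection counts (this computation is done by the service; the numbers below are made up):

```javascript
// Illustrative only: precision and recall from detection counts.
// tp = correctly detected people, fp = boxes that were not people,
// fn = people the model missed.
function precision(tp, fp) {
  return tp / (tp + fp);
}

function recall(tp, fn) {
  return tp / (tp + fn);
}

// Example: 8 people found correctly, 2 false detections, 2 missed.
console.log(precision(8, 2)); // 0.8
console.log(recall(8, 2));    // 0.8
```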

Training results

A good starting point for creating the custom-made web page is to use Microsoft’s Quickstart: Analyze a remote image using the REST API and JavaScript in Computer Vision. We can easily modify this quick start example for our own calculations.

Besides defining the subscription key and the endpoint URL, we need to change the quickstart example to use the Custom Vision endpoint. Both can be seen in the Custom Vision dashboard under "Prediction URL".

var uriBase = endpoint + "customvision/v3.0/Prediction/…";

We also need to set the custom header “Prediction-Key” for our request.

xhrObj.setRequestHeader("Prediction-Key", ""); // insert your prediction key here
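For reference, the pieces of the prediction request can be assembled as below. The project ID, iteration name and key are placeholders, not real values; copy the actual URL segments from your own "Prediction URL" page, as the exact path can differ between API versions.

```javascript
// Sketch: build the settings for a Custom Vision "detect from URL" call.
// projectId, iterationName and predictionKey are placeholders -- take the
// real values from your project's "Prediction URL" page.
function buildPredictionRequest(endpoint, projectId, iterationName,
                                predictionKey, imageUrl) {
  return {
    url: endpoint + "customvision/v3.0/Prediction/" + projectId +
         "/detect/iterations/" + iterationName + "/url",
    method: "POST",
    headers: {
      "Prediction-Key": predictionKey,   // authenticates the request
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ Url: imageUrl }) // picture to analyze
  };
}

var req = buildPredictionRequest(
  "https://westeurope.api.cognitive.microsoft.com/",
  "<project-id>", "<iteration-name>", "<prediction-key>",
  "https://example.blob.core.windows.net/test/picture1.jpg");
console.log(req.url);
```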

Custom Vision will analyze the pictures we send and return prediction data from our trained model. For our testing purposes we uploaded the pictures to Azure Blob Storage.

In order to calculate the distance from the result data in code, we use the prediction values for the detected people. From the x, y, width and height values of each bounding box we calculate the center of the box.

var x0 = x[i] + width[i] / 2;
var y0 = y[i] + height[i] / 2;
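Put together, extracting the centers from a detection response could look like the sketch below. It assumes the usual Custom Vision object-detection response shape, where each prediction carries a `boundingBox` with `left`, `top`, `width` and `height` normalised to 0..1, which we scale to pixel coordinates; the sample data is mocked.

```javascript
// Sketch: compute box centers (in pixels) from a detection response.
// Assumes boundingBox values normalised to 0..1, as Custom Vision returns.
function boxCenters(predictions, imageWidth, imageHeight, minProbability) {
  return predictions
    .filter(function (p) {
      return p.tagName === "person" && p.probability >= minProbability;
    })
    .map(function (p) {
      var b = p.boundingBox;
      return {
        x: (b.left + b.width / 2) * imageWidth,   // horizontal center
        y: (b.top + b.height / 2) * imageHeight   // vertical center
      };
    });
}

// Two mock detections on a 1000x1000 picture.
var centers = boxCenters([
  { tagName: "person", probability: 0.9,
    boundingBox: { left: 0.1, top: 0.1, width: 0.2, height: 0.2 } },
  { tagName: "person", probability: 0.8,
    boundingBox: { left: 0.5, top: 0.5, width: 0.2, height: 0.2 } }
], 1000, 1000, 0.5);
console.log(centers); // two centers: (200, 200) and (600, 600)
```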

Applying the Pythagorean theorem gives us the distance between two centers, in our case that gives us the distance between two persons.

var distanceInPixels = Math.sqrt((x0 - x1)**2 + (y0 - y1)**2);

The calculation is currently made in pixels and we would like to have it in meters. When taking the test pictures with the drone we made measurements and markings on the ground to be able to tell the actual area size. Before we tested the pictures, we cropped them to these markings. We also knew the flight height of the drone.
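With the ground measurements in hand, the conversion itself is a simple ratio. A sketch, with hypothetical numbers rather than our actual measurements:

```javascript
// Sketch: convert a pixel distance to meters, assuming the cropped picture
// covers a known ground width (measured with the markings on the ground).
function pixelsToMeters(distanceInPixels, imageWidthInPixels, groundWidthInMeters) {
  var pixelsPerMeter = imageWidthInPixels / groundWidthInMeters;
  return distanceInPixels / pixelsPerMeter;
}

// If a 4000 px wide picture covers 20 m of ground, 400 px corresponds to 2 m.
console.log(pixelsToMeters(400, 4000, 20)); // 2
```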

Measuring the area

The calculations were visualised on our web page by drawing a bounding box for each detected person and drawing lines between them. The line between two persons is green if the distance is 2 meters or more and red if it is less than 2 meters.

var canvas = document.getElementById('resultImage');
var context = canvas.getContext('2d');
context.beginPath();
context.moveTo(x0, y0); // center of the first person
context.lineTo(x1, y1); // center of the second person
// 'meterToPixel' holds the 2 m safety distance expressed in pixels
if (distanceInPixels < meterToPixel) {
    context.strokeStyle = 'Red';        // too close
} else {
    context.strokeStyle = 'LightGreen'; // safe distance
}
context.lineWidth = 4;
context.stroke();

In the following animated GIF you can see the test results from 16 pictures we used in the testing.

Results

Beyond the Covid-19 situation, this kind of autonomous distance calculation can be useful in other areas. Workplaces dealing with hazardous materials, for example, could make use of it. That would of course require improvements to this simple model: it would need to detect other kinds of objects, not only people, which would require additional data to train on.

Looking at this workflow and the result you can see that Microsoft Azure Cognitive Services provides an easy way to develop custom applications powered by Artificial Intelligence.

Thank you for reading, I hope this post gave you an idea how to build custom models and how to empower your applications with AI!

Discussion (4)

Diwakar Sharma

Amazing....👍

Shaiju T

Nice Idea, 😄, Is distance in pixels accurate, we need distance in meters right ? Also how did you create that video with bounding box, using Javascript ? What I am planning is capture objects from camera, convert the video to frames and detect the object , apply bounding boxes, then convert back from frames to video. Hope this is possible.

botelho

The only thing you are empowering here is an unethical system of surveillance, simply disgusting! How much naiveness is necessary for not understanding that these types of works are not beneficial for people but only for tyranny to be implemented?

Dian Fay

I don't know how helpful it is to go from zero to "simply disgusting" that fast but yeah, you're not wrong. This technology is meant to facilitate control over people. Of course it's ostensibly for our own good, but like other seemingly-innocent panopticisms it cannot be considered in a vacuum. These systems are always applied unequally and divide us into measured and measurers, to say nothing of the ramifications automated compliance detection might have combined with identification, social/fiscal credit, and other existing technologies.