As an AI Cloud Developer Advocate, I often find myself talking in front of various audiences. Now that I have some time to work on new demos while working from home, I’ve been thinking about how to better understand the reach of the audiences I talk to for when life returns to normal.
I usually estimate the reach of my talks based on registrations or by eyeballing the crowd, but what if there was a better way? As an AI engineer, I was inspired by the work done by the Microsoft CAT team on crowd counting. I couldn’t find any tutorials on how to train these kinds of models at scale in the cloud, so I decided to write my own and document the process for my readers.
Code for the post can be found below
If you are new to Azure, you can get started with a free subscription using the link below.
Before we get started we should try to understand the differences between the top two audience estimation approaches.
Detection-Based Approaches, such as person detection, semantic segmentation, and pose estimation, work best on small crowds but have trouble scaling to larger audiences of dozens or more people.
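To make the detection-based idea concrete, here is a minimal sketch of how a count falls out of a person detector's output. The `detections` list, its dictionary layout, and the `count_people` helper are all hypothetical and only stand in for whatever your detector returns. Because every person has to be found individually, small or occluded people in a dense crowd tend to fall below the confidence threshold, which is one reason these approaches undercount large audiences.

```python
def count_people(detections, min_score=0.5):
    """Count detections labeled 'person' whose confidence meets the threshold."""
    return sum(
        1
        for det in detections
        if det["label"] == "person" and det["score"] >= min_score
    )


# Hypothetical detector output: one dict per predicted bounding box
detections = [
    {"label": "person", "score": 0.92, "box": (34, 50, 80, 190)},
    {"label": "person", "score": 0.81, "box": (120, 48, 170, 200)},
    {"label": "chair", "score": 0.77, "box": (60, 150, 110, 220)},
    # A distant, partly occluded person the detector is unsure about
    {"label": "person", "score": 0.31, "box": (200, 60, 215, 95)},
]

audience_size = count_people(detections)
print(audience_size)
```

Lowering `min_score` recovers the uncertain detection but also lets in more false positives, so the threshold is a trade-off you would tune on your own data.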
Dense Approaches try to learn a feature mapping between the original image and a density map generated by placing a keypoint on each person in the image. These models work best for estimating the size of larger audiences.
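As a rough illustration of what such a density map looks like, the sketch below builds one from head keypoints using only the Python standard library. It assumes a fixed-width Gaussian per person, normalized over the pixels that fall inside the image so that each person contributes exactly 1 to the map. The function names, the `sigma` value, and the sample coordinates are assumptions for illustration, not the pre-processing used by any particular framework.

```python
import math


def make_density_map(height, width, head_points, sigma=2.0):
    """Build a density map by placing one normalized Gaussian per annotated head."""
    density = [[0.0] * width for _ in range(height)]
    radius = int(3 * sigma)  # truncate the kernel at three standard deviations
    for row, col in head_points:
        # Collect kernel weights only for pixels that fall inside the image
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                d2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        # Normalize so each person contributes exactly 1 to the map's sum
        for (r, c), w in weights.items():
            density[r][c] += w / total
    return density


def estimate_count(density):
    """The estimated crowd size is the sum over all pixels of the density map."""
    return sum(sum(row) for row in density)


# Hypothetical head annotations as (row, column) pixel coordinates
heads = [(10, 12), (11, 14), (30, 40), (5, 60)]
density_map = make_density_map(48, 64, heads)
print(round(estimate_count(density_map), 3))
```

Summing the map recovers the number of annotated heads, which is why a model trained to predict density maps can estimate a crowd's size without detecting each person separately.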
One of the most popular Dense models is the Multi Column CNN (MCNN) Model, due to its relatively quick evaluation time.
There is extensive documentation on how to train your own detection-based models and use pre-trained ones in the cloud, as well as some great out-of-the-box cloud services.
The C-3 Framework developed by Junyu Gao provides pre-processing code for six mainstream crowd counting datasets and multiple model configurations of MCNN implemented in PyTorch.
In his GitHub repo, Gao provides links to the processed datasets and pre-trained models. However, he doesn’t provide an end-to-end example in one document, so I took the liberty of writing one myself in Jupyter notebooks that shows how to install the C-3 Framework and run a pre-trained model.
There are 8 simple steps for training a model with the AzureML SDK
In the notebook below you will see an example of each of these 8 steps, end to end, so you can train your own Audience Estimation models at scale.
Once you run the notebook, be sure to check out my 9 tips for production machine learning and try playing around with the code. I’ve left a couple of challenges for you to try at the bottom of the notebook.
Additionally, I’ve included some other great resources on Crowd Counting below.
- It's a Record-Breaking Crowd! A Must-Read Tutorial to Build your First Crowd Counting Model using Deep Learning
- C^3 Framework Series, Part 1: An Open-Source Crowd Counting Framework Based on PyTorch
- Crowd Counting Made Easy
Here are some additional resources to help with your computer vision journey on Azure.
- Custom Vision | Microsoft Azure
- Computer Vision | Microsoft Azure
- Video Indexer - AI Video Insights | Microsoft Azure
- Computer Vision Explorer
Aaron (Ari) Bornstein is an AI researcher with a passion for history, engaging with new technologies, and computational medicine. As an Open Source Engineer on Microsoft’s Cloud Developer Advocacy team, he collaborates with the Israeli Hi-Tech Community to solve real-world problems with game-changing technologies that are then documented, open sourced, and shared with the rest of the world.