
Doug Sillars for api.video

Originally published at api.video

Video Moderation with Machine Learning

Video Content Moderation

User generated content (UGC) is taking over the internet, and one of the fastest-growing segments of this trend is video UGC. From sites offering video product reviews to vlogging to online education, video is being created at the fastest pace yet (and shows no sign of stopping).

Many sites are looking for ways to easily incorporate UGC video on their sites.

All of our customers who allow UGC on their sites worry about a few bad actors ruining their brand by posting inappropriate content. To protect their brand and the content on their site, evaluating any user generated content before it is allowed onto the website is an important step. Traditional content moderation requires human moderators, which takes time and is expensive.

In this post, I'll walk through an alternative: using machine learning to moderate videos. Before a video is published, it is scanned by a machine learning model, tested against rules, and then either accepted or rejected based on those rules. It's super fast, works 24 hours a day, and removes human error from the categorisation of the video. As a demonstration, we've built a site that uses api.video for video hosting and Hive AI to power the video moderation. You can try it out at moderate.a.video.

The Basics

api.video

api.video is a full-service API-based video hosting solution. Use APIs to upload, modify, and serve streaming video. Every video that is uploaded is transcoded into a video stream and can be delivered in a custom video player. In this solution, we'll use a delegated upload token (which works like a public key) for uploading the videos. We'll use the video tagging feature to label each video based on the moderation results, and we can search the tagged videos to deliver the videos (and custom player) to users based on their moderation state.
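As a rough sketch of that setup, here is how the delegated upload token might be created server-side; the endpoint follows api.video's delegated-upload flow, but the helper name and key handling are my own assumptions:

```javascript
// Minimal sketch: create a delegated upload token on the server, so the
// public upload page never needs the private API key. The helper name and
// env-var handling are hypothetical; check api.video's docs for specifics.
async function createUploadToken() {
  const response = await fetch('https://ws.api.video/upload-tokens', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.APIVIDEO_KEY}`, // assumes a bearer access token
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({}) // a TTL can be set here; omitted for the demo
  });
  const { token } = await response.json();
  return token; // safe to embed in the public upload form
}
```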

Hive AI

Once the videos are uploaded, we need to run moderation before allowing them to appear on the website. Hive AI has several moderation suites. We've set up our integration to use two configurations: for short videos (under 25s) we scan one frame every second, and for long videos, one frame every 5 seconds. (The demo above only accepts videos up to 25s long; a sketch of choosing between the two configurations follows the list below.) The API is trained to identify several subjects that, depending on the context of your application, might be important to moderate. In this post, we will use a small subset of the trained models:

  • Safe for Work/Not Safe for Work
  • Yes/No: Female nudity
  • Yes/No: Male nudity
  • Yes/No: Female swimwear
  • Yes/No: Shirtless male
  • Yes/No: Guns
  • Yes/No: Smoking
  • Yes/No: Nazis
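As promised above, a tiny sketch of choosing between the two sampling configurations, assuming each configuration has its own Hive API token (both token names are hypothetical placeholders):

```javascript
// Hypothetical sketch: pick the Hive configuration by video length.
// SHORT_VIDEO_TOKEN samples 1 frame/second (videos under 25s);
// LONG_VIDEO_TOKEN samples 1 frame every 5 seconds.
function hiveTokenFor(durationSeconds) {
  return durationSeconds < 25 ? SHORT_VIDEO_TOKEN : LONG_VIDEO_TOKEN;
}
```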

The full list of parameters that can be identified can be found here. Each sampled frame is analyzed, and a JSON response with data for every frame is returned. After analysis, the video is tagged with moderation values, ensuring that it only appears on pages appropriate for the video.

Code

This application is available on GitHub. The general flow is:

  1. On Upload, a video is tagged "needs moderation" (and will not appear on the site).
  2. hiveAI performs frame analysis.
  3. Based on the analysis, the video is tagged as "SFW" or "NSFW" (etc.).


The App


On entering the site, users are presented with an upload form. As we walk through this example, we'll follow the upload of the intro theme from the classic TV show Baywatch. With api.video, we created a delegated upload key, which allows us to place the upload code publicly on the webpage without exposing our private API key. The form takes the video and uploads it to api.video. The uploader uses the Blob API to break the video into 50MB segments for uploading. For the purposes of the demo, we show how many chunks are created and update the progress of each chunk, in addition to the total video upload progress.
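Here's a minimal browser-side sketch of that chunked upload, assuming a delegated upload token; the endpoint and field names follow api.video's delegated upload flow, but treat the details as an approximation and check the current docs:

```javascript
const CHUNK_SIZE = 50 * 1024 * 1024; // 50MB segments, as described above

// Hypothetical helper: upload one file in CHUNK_SIZE pieces.
async function uploadInChunks(file, uploadToken) {
  const chunkCount = Math.ceil(file.size / CHUNK_SIZE);
  let videoId;
  for (let i = 0; i < chunkCount; i++) {
    const start = i * CHUNK_SIZE;
    const end = Math.min(start + CHUNK_SIZE, file.size);
    const form = new FormData();
    form.append('file', file.slice(start, end), file.name); // Blob API: cut out one segment
    const response = await fetch(`https://ws.api.video/upload?token=${uploadToken}`, {
      method: 'POST',
      headers: { 'Content-Range': `bytes ${start}-${end - 1}/${file.size}` },
      body: form
    });
    ({ videoId } = await response.json()); // each chunk response carries the videoId
    console.log(`chunk ${i + 1} of ${chunkCount} uploaded`); // per-chunk progress
  }
  return videoId; // POSTed to the Node backend to start moderation
}
```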


The response from the upload provides the api.video videoId, which identifies the video at api.video. This is sent to the NodeJS backend as a POST (along with the video's name). On the Node server, we begin the process of moderating the video. First, we call the update video endpoint to add the video's name and to tag the video "needsScreening", indicating that it has entered the moderation queue.
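A sketch of that first server-side step; the PATCH shape follows api.video's update-video endpoint, but the helper name and auth handling are assumptions:

```javascript
// Server-side sketch: name the video and tag it "needsScreening" via
// api.video's update-video endpoint. Auth handling is simplified here;
// the SDK used later in this post can do the same thing.
async function markNeedsScreening(videoId, title) {
  const response = await fetch(`https://ws.api.video/videos/${videoId}`, {
    method: 'PATCH',
    headers: {
      Authorization: `Bearer ${process.env.APIVIDEO_KEY}`, // assumes a bearer access token
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ title, tags: ['needsScreening'] }) // enter the moderation queue
  });
  return response.json();
}
```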

Transcoding

When the video is uploaded, api.video's servers begin creating different size/bitrate renditions to provide adaptive bitrate streaming. We also create an mp4 version of the video. For content moderation, we need to submit the mp4 to HiveAI. The Node server pings api.video's video status endpoint every 2 seconds to determine when the mp4 is ready. Initially, the API will indicate that the video is not playable (transcoding has not yet started). Once transcoding has started, the API lists the encoding status of every format being created, so we can monitor the encoding status of the mp4.
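A rough sketch of that polling loop; the shape of the status response here is an assumption based on the description above, so inspect a real response before relying on it:

```javascript
// Poll the video status endpoint every 2 seconds until the mp4 is ready.
// The field names on `status` are assumptions; the helper name is my own.
async function waitForMp4(videoId) {
  for (;;) {
    const res = await fetch(`https://ws.api.video/videos/${videoId}/status`, {
      headers: { Authorization: `Bearer ${process.env.APIVIDEO_KEY}` }
    });
    const status = await res.json();
    const mp4 = (status.encoding?.qualities || []).find((q) => q.type === 'mp4');
    if (status.encoding?.playable && mp4 && mp4.status === 'encoded') {
      return; // the mp4 rendition is ready to send to Hive
    }
    await new Promise((resolve) => setTimeout(resolve, 2000)); // wait 2s, try again
  }
}
```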


Once the mp4 is created, we can create our connection to HiveAI and make the request for content moderation. The request looks like this:

```javascript
{
  method: 'POST',
  url: 'https://api.thehive.ai/api/v2/task/sync',
  headers: {
    accept: 'application/json',
    authorization: 'token {API TOKEN}'
  },
  form: {
    image_url: 'https://cdn.api.video/vod/vi1iHWIy6Doy0LBJl3ajaED0/mp4/1080/source.mp4'
  }
}
```

and a few seconds later, a huge JSON response comes back: 39 categories × x frames analysed. Let's look at a snippet from one frame to see the sort of information we get (for brevity, I've only included the first 5 categories that are returned):

{ "class": "general_not_nsfw_not_suggestive", "score": 0.00460230773187999 }, { "class": "general_nsfw", "score": 5.180850871024288e-06 }, { "class": "general_suggestive", "score": 0.995392511417249 }, { "class": "no_female_underwear", "score": 0.9998768576722025 }, { "class": "yes_female_underwear", "score": 0.00012314232779748514 } 

"Not NSFW" means "Not Not Safe for Work" which, removing the double negative is "safe for work." This score is combined with "not suggestive", and is scored at 0.004. Since 0 is a low score, and 1 is a high score, this means that the API has determined that this frame is not considered appropriate for work. Looking at the next 2 values, general NSFW is also very small, but the "general suggestive" is 99.9%. Since the yes:no scores will always add up to one, this means that the general suggestiveness is what makes this frame not safe for work.


Compiling the scores

Hive AI gives you scores for each category on each frame, but it is up to us to define the pass/fail criteria for our videos. To do this, I take all the scores for each category and place them in an array. With the data collected per category, I can calculate the min, max, average, and median of each score. I also count how many frames score over 0.9 (at least 90% certainty of, say, "yes_smoking") and how many frames score under 0.1 (at least 90% certainty of "no_smoking").
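Here's a minimal sketch of that per-category summary; the summarizeScores helper is my own, and it simply takes the array of per-frame scores for one category:

```javascript
// Summarize one category's per-frame scores: min, max, average, median,
// and the counts of "confident yes" (>0.9) and "confident no" (<0.1) frames.
function summarizeScores(scores) {
  const sorted = [...scores].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return {
    min: sorted[0],
    max: sorted[sorted.length - 1],
    average: scores.reduce((sum, s) => sum + s, 0) / scores.length,
    median: sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2,
    over90: scores.filter((s) => s > 0.9).length,  // frames >90% certain "yes"
    under10: scores.filter((s) => s < 0.1).length  // frames >90% certain "no"
  };
}
```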


For the "SFW" array (really the "not not safe for work" response, but that is a mouthful) of 22 frames, I find:

  • min score: 0 (not safe for work)
  • max score: 1 (safe)
  • average: 0.55 (right in the middle!)
  • median: 0.75
  • count of frames > 0.9: 9
  • count of frames < 0.1: 7

As you can see, 9 frames are over 0.9 (safe!), but 7 are below 0.1 (not safe!). If the median falls below 0.9, at least 50% of the frames are not "certain" to be safe for work (where I place certainty at the 90% threshold). Based on these numbers, my rudimentary pass/fail algorithm deems the Baywatch intro "NSFW." For the other categories, the rule is stricter: if you want to prevent videos with smoking from appearing to your audience, just one frame with a 90% certainty of smoking is enough to categorise the video as "yes_smoking." I used this same threshold for guns, Nazis, nudity, underwear and swimwear.
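Putting the rules together, here is a hedged sketch of the pass/fail step; the category names and the `summaries` shape are assumptions, building on the summary helper above:

```javascript
// Rudimentary pass/fail rules from this post. `summaries` maps a category
// name to the { median, over90, ... } summary computed by summarizeScores.
function moderationTags(summaries) {
  const tags = [];

  // SFW rule: at least half the frames must be >90% certain to be safe.
  tags.push(summaries.sfw.median >= 0.9 ? 'SFW' : 'NSFW');

  // One-frame rule: a single frame at >90% certainty flags the category.
  const strict = ['smoking', 'guns', 'nazi', 'female_nudity', 'male_nudity',
                  'female_swimwear', 'shirtless_male'];
  for (const category of strict) {
    tags.push(summaries[category].over90 > 0 ? `yes_${category}` : `no_${category}`);
  }
  return tags;
}
```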

Interestingly, that one-frame rule does push the Schindler's List trailer into the "yes smoking" category.

Based on these metrics, it is not terribly surprising that the Baywatch video is flagged for "yes female swimwear" and "yes shirtless male." Also unsurprisingly, the algorithm found an absence of smoking, guns and Nazis.


Once we have the categories measured for the video, we can remove the needsScreening tag and add the new tags from the moderation. This is done with the Video Update endpoint, and that completes the moderation process for the video.

Video Categories

Now that each video has been categorised, it is easy to display each video category. Each video's categories have been added to the video as tags, and the List videos endpoint allows us to search by tag, returning every video that does not have smoking, or is safe for work, etc. Using Node, we can generate the list of videos (sorted newest first) and send them to the client:

```javascript
app.get('/no_smoking', (req, res) => {
  // get the list of videos tagged "no_smoking"
  const client = new apiVideo.Client({ apiKey: apiVideoKey });
  const recordedList = client.videos.search({ tags: 'no_smoking', sortBy: 'publishedAt', sortOrder: 'desc' });
  recordedList.then(function (list) {
    console.log('list of tagged videos');
    console.log(list);
    return res.render('videos', { list });
  }).catch((error) => {
    console.log(error);
  });
});
```

The API returns the metadata for each video (including links to the video), so creating a page with an iframe for each video is a pretty simple task. On each page in the sample app, you can display (and watch) the videos assigned to that category. In the demo, I use Pug for the rendering:

```pug
each video in list
  p #{video.title}
  iframe(type="text/html", src=video.assets.player, width="960", height="540", frameborder="0", scrolling="no")
  p #{video.publishedAt}
```

For example, the "yes guns" page has 3 movie trailers (as I write this post): Indiana Jones and the Last Crusade, Die Hard and the latest James Bond.

Conclusion

There we have it: we upload a video to api.video, and before it is displayed on the site, it is moderated by HiveAI for several categories of inappropriate content. Based on the per-frame analysis, the video is categorised into buckets and displayed on the appropriate page of the website. Try it yourself; the code is up and running at https://moderate.a.video. If you have questions about using content moderation with your videos at api.video, feel free to reach out on our community or comment on the GitHub repo. We'd love to see how you are using moderation to sort and categorise your videos, and the rubrics you use to decide which categories a video might fall into.
