Background
Typically, servers process images and videos inline while they are being uploaded, which costs a lot of processing power, memory, and time. Media Master instead processes videos and images after they have been uploaded to a temporary folder, and then transfers the processed files to permanent storage.
Any website that accepts media content (photos, videos, etc.) performs some form of background processing on that media. If this processing runs in the main thread of the web server, it can lead to terrible response times and makes the server more susceptible to a Denial of Service attack. Operations like compression, validation, re-encoding, and watermarking are essential to many popular web apps. Just as important is the guarantee of being notified whenever something goes wrong in crucial operations like these.
I was motivated to create this program after learning about the video processing flow on YouTube. So, what exactly does it do? When a user uploads a video, the clip is kept temporarily for processing, and an acknowledgement is delivered to the user right away. The user is not required to stay online during the processing. The video is processed in stages and layers, and the relevant output is released publicly as each phase completes. For example, if a video has been processed and finalised at 144p, it is immediately accessible for streaming at that resolution without waiting for the 240p or 480p renditions.
Tools used
- Azure Functions: Azure Functions is the ideal tool for asynchronous background tasks like ours. It is a serverless solution that lets you write less code and maintain less infrastructure. Azure Functions provides "compute on-demand" in two ways:
  1. It lets you implement your system's logic as readily available blocks of code, called "functions". Different functions can run whenever required, in response to critical events.
  2. As requests increase, Azure Functions meets the demand with as many resources and function instances as necessary, but only while needed. As requests fall, any extra resources and application instances drop off automatically.
- Courier: Courier provides an amazing suite of notification integrations. It is an API and web studio for development teams to manage all product-triggered communications (email, chat, in-app, SMS, push, etc.) in one place. This is how Courier works:
  1. Application events are sent to Courier via its API or SDK.
  2. Courier receives and processes events that carry the notification content and the recipient's details.
  3. Courier renders the notification from a template and sends it through the appropriate provider (it supports over 60 providers across all channels).
Roadmap
- Storing the files
- Detecting uploads on the storage
- Converting media files
- Processing the files
- Sending feedback to users on successful or failed processing of their files
Instructions
Part 1: Storing Files
The first step was to store files for processing. I found Azure Blob Storage to be an ideal choice: it scales flexibly for high-performance computing and is well secured through Azure Active Directory authentication, role-based access control (RBAC), and encryption at rest.
I imported BlobServiceClient and ContentSettings from azure.storage.blob to work with Azure Storage resources and blob containers, and set the container name (CONTAINER_NAME) to "devmrfitz". Then I called get_container_client to get a reference to a ContainerClient object and upload the temp file.
import os

import azure.functions as func
from azure.storage.blob import BlobServiceClient, ContentSettings

AZURE_CONNECTION_STRING = os.getenv("AzureWebJobsStorage")
CONTAINER_NAME = "devmrfitz"
COURIER_API_KEY = os.getenv("COURIER_API_KEY")

def main(myblob: func.InputStream):
    # Connect to the storage account named in the AzureWebJobsStorage setting
    blob_service_client: BlobServiceClient = BlobServiceClient.from_connection_string(AZURE_CONNECTION_STRING)
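Continuing inside main, here is a minimal sketch of the upload step described above. processed_path (a local file produced by the processing stage) and content_type are hypothetical placeholders of my own; get_container_client, upload_blob, and ContentSettings are the standard azure.storage.blob calls:

    # Grab a client for our container and upload the processed temp file
    container_client = blob_service_client.get_container_client(CONTAINER_NAME)
    with open(processed_path, "rb") as data:
        container_client.upload_blob(
            name=myblob.name.split("/")[-1],  # keep the original file name
            data=data,
            overwrite=True,
            content_settings=ContentSettings(content_type=content_type),
        )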
Part 2: Detecting uploads on the storage
To detect uploads on Azure Blob Storage, we used Azure Functions. Basically, it is a serverless compute service that runs event-triggered code, where the event, in our case, is an upload. It runs a script or piece of code in response to a variety of events. I imported azure.functions and used azure.functions.InputStream to receive the uploaded file.
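For reference, the blob trigger itself is declared in the function's function.json binding. Below is a minimal sketch of what that binding could look like; the watched path (the "devmrfitz" container) and the connection setting are assumptions based on the configuration in Part 1:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "devmrfitz/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}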
Part 3: Converting Media files
Now, the tedious task was processing the video files, which FFmpeg made extremely easy. FFmpeg uses demuxers to read input files and extract packets of encoded data from them. For video files, ffmpeg tries to keep streams synchronized by tracking the lowest timestamp on any active input stream. Under the hood it uses the libavfilter library to enable filters for processing raw audio and video.
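To make this concrete (for example, producing the lower-resolution renditions mentioned in the Background), here is a minimal sketch that shells out to FFmpeg with its scale filter. The helper name transcode_to_height is hypothetical and not part of the project's code:

import subprocess

FFMPEG_PATH = "ffmpeg"  # assumed: path to the ffmpeg binary bundled with the app

def transcode_to_height(video_path: str, output_path: str, height: int = 144) -> None:
    # scale=-2:<height> keeps the aspect ratio and forces an even width;
    # the audio stream is copied through unchanged.
    subprocess.run(
        [FFMPEG_PATH, "-i", video_path, "-vf", f"scale=-2:{height}", "-c:a", "copy", output_path],
        capture_output=True, text=True, check=True,
    )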
Part 4: Processing the files
To support inversion, resizing, watermarking, trimming, and compression of videos and images, I used the Pillow (PIL) imaging library for images and FFmpeg (via subprocess) for videos. Pillow contains all the methods my use case needed; a short sketch combining a few of these calls follows the list below.
Image inversion: ImageOps.invert(image)
Image compression: image.save(destination_path, optimize=True, quality=quality)
Image resizing: image.resize((width, height))
Image watermarking on an Image: image.paste(watermark, (0, 0), watermark)
Text watermarking on an Image: image.paste(watermark, (px, py, px + wx, py + wy), watermark)
Trimming videos: subprocess.run([FFMPEG_PATH, '-accurate_seek', '-i', video_path, '-ss', start_time, '-to', end_time, '-c', 'copy', output_path], capture_output=True, text=True)
Compressing videos: subprocess.run([FFMPEG_PATH, '-i', video_path, '-c:v', 'libx265', '-crf', str(crf), '-c:a', 'copy', output_path], capture_output=True, text=True)
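As promised above, here is a short sketch tying a few of these Pillow calls into one helper. The function name and parameters are placeholders of my own, not code from the project:

from PIL import Image, ImageOps

def invert_resize_compress(source_path, destination_path, width, height, quality=75):
    # Open the upload, invert its colours, shrink it, and write a compressed copy
    image = Image.open(source_path).convert("RGB")  # ImageOps.invert requires RGB
    image = ImageOps.invert(image)
    image = image.resize((width, height))
    image.save(destination_path, optimize=True, quality=quality)  # quality applies to JPEG output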
Part 5: Sending feedback
There needs to be a feedback mechanism that informs the user, as well as the server, about a successful upload and the processing that follows it. To implement this, I used Courier, a multi-channel notification service that lets us notify users via email (my choice for this project), Discord, Slack, and more.
Courier passes messages to Integrations via the Send endpoint. We must send an Authorization header with each request. The Courier Send API also requires an event. The authorization token and event values are the "Auth Token" and "Notification ID" we see in the detail view of our “Test Appointment Reminder” event. Click the gear icon next to the Notification's name to reveal them.
These variables can finally be fed into Courier's Python SDK to facilitate simple notification sending.
Courier works by taking in an event as input via an API, which in our case is a successful upload or a processing failure. The event is accompanied by the data required for the feedback and the details of the recipient. Courier then generates a notification and sends it through the specified channel; here, I chose email as the channel for receiving notifications.
I imported the trycourier package from Courier:
from trycourier import Courier
Then I added all the metadata and sent the notifications from the processing error handlers:
try:
    ...
except UnidentifiedImageError as e:
    # Send a Courier notification if the uploader left an email address
    if "email" in metadata:
        courier_client = Courier(auth_token=COURIER_API_KEY)
        courier_client.send_message(
            message={
                # Receiver's email
                "to": {
                    "email": metadata["email"],
                },
                # Subject and body content
                "content": {
                    "title": "Media Master Warning! Unidentified Image detected",
                    "body": "Hello {{emailPrefix}},\n\nAn unidentified file ({{name}}) was uploaded as an image and was not processed. The error generated was: \n\n\n {{error}} \n\nThanks,\nMedia Master",
                },
                # Values substituted into the {{...}} placeholders above
                "data": {
                    "emailPrefix": metadata["email"].split("@")[0],
                    "name": myblob.name,
                    "error": str(e),
                },
                # Specific channel to route the notification through
                "routing": {
                    "method": "single",
                    "channels": ["email"],
                },
            }
        )
    logging.warning(f"Unidentified image: {e}")
except UnidentifiedVideoError as e:
    # Same flow for videos that could not be identified
    if "email" in metadata:
        courier_client = Courier(auth_token=COURIER_API_KEY)
        courier_client.send_message(
            message={
                "to": {
                    "email": metadata["email"],
                },
                "content": {
                    "title": "Media Master Warning! Unidentified Video detected",
                    "body": "Hello {{emailPrefix}},\n\nAn unidentified file ({{name}}) was uploaded as a video and was not processed. The error generated was: \n\n\n {{error}} \n\nThanks,\nMedia Master",
                },
                "data": {
                    "emailPrefix": metadata["email"].split("@")[0],
                    "name": myblob.name,
                    "error": str(e),
                },
                "routing": {
                    "method": "single",
                    "channels": ["email"],
                },
            }
        )
    logging.warning(f"Unidentified video: {e}")
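One detail worth noting: the {{emailPrefix}}, {{name}}, and {{error}} placeholders in the body are not Python string formatting. Courier substitutes them on its side from the values supplied in the "data" block, which keeps the template reusable across messages.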
Conclusion
Our fast, lightweight, and easy-to-use media processing service is ready to use and can help a lot of students and professionals in their day-to-day hustle.
What new features and improvements can you think of for Media Master? Pull requests and forks are always welcome at the GitHub repo.
About the Author
I'm Aditya Pratap Singh, a full stack developer and a junior at IIIT Delhi, India. I have worked with various languages and frameworks, all the way from JavaScript to C++. Hit me up @devmrfitz on any popular social platform (https://linktr.ee/devmrfitz).