DEV Community

Leveraging GPU Acceleration in BMF for High-Performance Video Processing

Introduction

In multimedia processing, GPU acceleration has become a cornerstone of high performance. Babit Multimedia Framework (BMF) harnesses this power to bring speed and efficiency to video processing tasks. In this blog post, we'll explore how BMF uses GPU acceleration and walk through practical examples to help you integrate this capability into your projects.

Understanding GPU Acceleration in BMF

BMF's architecture is designed to exploit the parallel processing capabilities of GPUs. This is crucial for tasks like video transcoding, real-time rendering, and applying complex filters or effects, where the computational intensity can be staggering.

GPU acceleration is like a turbocharged engine in a sports car: it's all about doing more in less time. Imagine you're editing a video for your YouTube channel or streaming a live esports tournament; every millisecond counts. This is where BMF's GPU prowess shines, slicing through processing times like a hot knife through butter.

BMF has optimized performance for heterogeneous CPU and GPU scenarios in ways that many existing FFmpeg-based solutions have not, and has enriched the pipeline along the way. Taking compression and super-resolution scenarios as examples, measurements show that BMF's total throughput increased by 15%.

BMF's GPU codec support is inherited from FFmpeg. It uses NVENC, NVDEC, and other dedicated GPU hardware to accelerate video encoding and decoding, and it uses FFmpeg's CUDA filters to accelerate image preprocessing, so there is no barrier for users already familiar with FFmpeg. At this stage, BMF supports GPU decoding, encoding, and one-to-many transcoding.

Key Benefits of GPU Acceleration:

  1. Speed: GPUs can process multiple operations simultaneously, drastically reducing processing time.
  2. Efficiency: Offloading intensive tasks to the GPU frees up the CPU for other operations, improving overall system performance.
  3. Scalability: As video resolutions and processing demands increase, GPUs can scale to meet these challenges.

Setting Up GPU Acceleration in BMF

Before diving into coding, ensure your environment is set up to leverage GPU capabilities. This typically involves installing the necessary GPU drivers and libraries, like CUDA for NVIDIA GPUs. The BMF documentation provides detailed setup instructions. You can use a hosted tool like Colab or your own hardware. BMF can run on Windows, macOS, and Linux.
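Before running the GPU examples, it can help to confirm that the driver tools and an NVENC-capable FFmpeg are visible. The following is a minimal sketch, assuming `nvidia-smi` and `ffmpeg` are the command names on your PATH; `check_gpu_environment` is a hypothetical helper and not part of BMF.

```python
import shutil
import subprocess


def check_gpu_environment(encoder="h264_nvenc"):
    """Report whether the NVIDIA driver tools and an NVENC-capable ffmpeg are visible.

    Hypothetical helper, not part of BMF. It only inspects the PATH and the
    encoder list that ffmpeg prints, so it does not prove a GPU is usable.
    """
    report = {
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "encoder": False,
    }
    if report["ffmpeg"]:
        try:
            # "ffmpeg -encoders" lists every encoder compiled into this build.
            listing = subprocess.run(
                ["ffmpeg", "-hide_banner", "-encoders"],
                capture_output=True,
                text=True,
                timeout=30,
                check=False,
            )
            report["encoder"] = encoder in listing.stdout
        except (OSError, subprocess.TimeoutExpired):
            report["encoder"] = False
    return report


if __name__ == "__main__":
    for name, ok in check_gpu_environment().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

If any entry is reported as missing, the GPU examples in this post will most likely fail, so revisit the driver or FFmpeg installation first.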

Code Example: Basic GPU-Accelerated Video Processing

Let's start with a simple example of GPU-accelerated video processing in BMF. This example assumes you have BMF and all necessary GPU libraries installed. If you haven't installed BMF yet, follow the installation instructions for your system, or use a hosted tool like Colab. If you're using a GPU, make sure you meet the hardware requirements.

Prerequisites:

  • Python 3.9
  • CMake
  • FFmpeg 4
  • Python, C++, or Go experience

Python code

===========

In this example, BMF calls the GPU codec to transcode a video. BMF largely follows FFmpeg's parameter names, and the lines to watch are the ones that set hwaccel, the CUDA filters, codec, and pix_fmt.

First, create a BMF graph and a decode module, and set the hardware accelerator parameter (hwaccel) to cuda so that decoding runs on the GPU.

import bmf

def test_gpu_transcode():
    print("Testing gpu transcoding......")
    input_video_path = "input.flv"
    output_video_path = "output.mp4"

    graph = bmf.graph()

    video = graph.decode({
        "input_path": input_video_path,
        "video_params": {
            "hwaccel": "cuda",
        }
    })

Next, apply CUDA filters to the decoded video stream. In BMF, multiple CUDA filters can be chained one after another. In this case, we use scale_cuda and yadif_cuda. Then we pass the audio and video streams to an encode module, specifying the codec as h264_nvenc and the pixel format as cuda. Once the entire pipeline is built, call run() to start execution.

(bmf.encode(
        video["video"].ff_filter("scale_cuda", w=1280, h=720).ff_filter("yadif_cuda"),
        video["audio"], {
            "output_path": output_video_path,
            "video_params": {
                "codec": "h264_nvenc",
                "pix_fmt": "cuda"
            },
        }
    ).run())
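When the list of filters comes from configuration rather than being hard-coded, the chaining in the snippet above can be expressed as a loop. This is a minimal sketch: `apply_cuda_filters` and `RecordingStream` are hypothetical names, and the only assumption about a BMF stream is that it exposes `ff_filter(name, **kwargs)` as used above. The stand-in class lets you try the helper without a GPU.

```python
def apply_cuda_filters(stream, filters):
    """Chain FFmpeg filters onto a stream, one ff_filter call per entry.

    `filters` is a list of (name, kwargs) pairs. Hypothetical helper: it relies
    only on the stream exposing ff_filter(name, **kwargs).
    """
    for name, kwargs in filters:
        stream = stream.ff_filter(name, **kwargs)
    return stream


class RecordingStream:
    """Stand-in for a BMF stream that records the filters applied to it."""

    def __init__(self, applied=()):
        self.applied = tuple(applied)

    def ff_filter(self, name, **kwargs):
        # Return a new stream, mirroring how each filter yields a new output.
        return RecordingStream(self.applied + ((name, kwargs),))


if __name__ == "__main__":
    chain = [("scale_cuda", {"w": 1280, "h": 720}), ("yadif_cuda", {})]
    result = apply_cuda_filters(RecordingStream(), chain)
    print([name for name, _ in result.applied])
```

With a real graph you would pass `video["video"]` instead of `RecordingStream()` and hand the returned stream to `bmf.encode`.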

Full Code

import bmf

def test_gpu_transcode():
    print("Testing gpu transcoding......")
    input_video_path = "input.flv"
    output_video_path = "output.mp4"

    graph = bmf.graph()

    video = graph.decode({
        "input_path": input_video_path,
        "video_params": {
            "hwaccel": "cuda",
        }
    })
    (bmf.encode(
        video["video"].ff_filter("scale_cuda", w=1280, h=720).ff_filter("yadif_cuda"),
        video["audio"], {
            "output_path": output_video_path,
            "video_params": {
                "codec": "h264_nvenc",
                "pix_fmt": "cuda"
            },
        }
    ).run())
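The one-to-many transcoding mentioned earlier follows the same pattern: decode once, then attach one encoder per output. The sketch below is an illustration under stated assumptions, not BMF's reference implementation. `nvenc_renditions` and `transcode_one_to_many` are hypothetical helpers, the output naming scheme is made up for the example, and it assumes that calling `graph.run()` after several `bmf.encode()` calls executes every output of the graph, which you should confirm against the BMF documentation for your version.

```python
def nvenc_renditions(stem, sizes, codec="h264_nvenc"):
    """Build one (scale kwargs, encode options) pair per output size.

    Hypothetical helper; the option keys mirror the transcode example above.
    """
    renditions = []
    for width, height in sizes:
        scale = {"w": width, "h": height}
        options = {
            "output_path": f"{stem}_{height}p.mp4",
            "video_params": {"codec": codec, "pix_fmt": "cuda"},
        }
        renditions.append((scale, options))
    return renditions


def transcode_one_to_many(input_path, renditions):
    """Decode once on the GPU and attach one NVENC encoder per rendition.

    Assumption: graph.run() executes all encoders attached to the graph.
    """
    import bmf  # imported lazily so nvenc_renditions works without BMF installed

    graph = bmf.graph()
    video = graph.decode({
        "input_path": input_path,
        "video_params": {"hwaccel": "cuda"},
    })
    for scale, options in renditions:
        bmf.encode(
            video["video"].ff_filter("scale_cuda", **scale),
            video["audio"],
            options,
        )
    graph.run()


if __name__ == "__main__":
    for scale, options in nvenc_renditions("output", [(1280, 720), (640, 360)]):
        print(scale, options["output_path"])
```

Keeping the option building separate from the graph construction makes the rendition list easy to inspect or unit test on a machine without a GPU.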

Advanced GPU-Accelerated Video Processing

For more complex scenarios, BMF allows fine-tuning of GPU settings and integration with other GPU-accelerated libraries.

Next, we introduce CV-CUDA-accelerated image preprocessing. To make full use of CUDA's computing power, BMF integrates CV-CUDA, an operator library developed by NVIDIA specifically for computer vision applications. At this stage, it provides about 45 common high-performance operators. It offers C, C++, and Python APIs, supports batched input of images of different sizes, enables zero-copy data conversion with other deep learning frameworks, and comes with a variety of application examples.

CUDA operators you can use:

  • Blur
  • Crop
  • Flip
  • Gamma
  • Rotate
  • Scale

The example below uses the GPU scale module to resize the video during transcoding:

import bmf

def test_gpu_transcode():  # Start of function named 'test_gpu_transcode'
    print("Testing GPU transcoding...")    # Print out a string "Testing GPU transcoding..." in the console

    # Variables containing the paths of the input video and the path to save the output video (transcoded one)
    input_video_path = "input.flv"    # Path to the video file we want to transcode
    output_video_path = "output.mp4"  # Path to save the output video

    # Create a BMF graph to represent a series of processing operations
    graph = bmf.graph()

    # Call the 'decode' function of the created BMF graph. Input is the video file pointed by 'input_video_path'.
    # Use hardware acceleration on the GPU to decode the video (hwaccel means hardware accelerator)
    video = graph.decode({
        "input_path": input_video_path,
        "video_params": {
            "hwaccel": "cuda",    # Use NVIDIA CUDA technology for hardware accelerated decoding
        }
    })

    # Call the 'encode' function to encode the video and audio streams.
    # The input video stream is first processed by a GPU scale module to resize it to 1280x720 pixels.
    # The encoded video will be saved to the path pointed by 'output_video_path'.
    # Use NVIDIA NVENC technology for GPU accelerated encoding,
    # and 'pix_fmt' is set to 'cuda' to let the GPU read the processed frames directly from its own memory.
    bmf.encode(
        video["video"].module("scale_gpu", {"size": "1280x720"}),    # Scaling the video to dimension of 1280x720 pixels using GPU
        video["audio"],    # Including the audio stream in the processed video
        {
            "output_path": output_video_path,    # Path to save the output video
            "video_params": {
                "codec": "h264_nvenc",    # Use H.264 codec for video encoding with NVENC technology
                "pix_fmt": "cuda"    # The input video frames are in GPU memory
            }
        }
    ).run()    # Execute the graph operations

# Now Call the above defined function
test_gpu_transcode()    # Call the 'test_gpu_transcode' function to start the whole process

Example: Integrating AI Models for Video Enhancement

BMF's flexibility enables the integration of AI models for tasks like super-resolution or frame interpolation. Here's an example of how you might integrate an AI model for super-resolution. Check out this example

Real-World Sorcery with BMF and GPU Acceleration

Let's list some real-world scenarios where GPU-accelerated BMF works its magic:

  1. The Live Sports Event: Picture a live sports broadcast. With BMF's GPU acceleration, you can stream high-definition, slow-motion replays almost instantaneously. It's like having the ability to freeze time and zoom in on that crucial game-winning goal.
  2. The Hollywood Film: In film editing, BMF with GPU acceleration is your special effects wizard. Render stunning visual effects in a fraction of the time, bringing dragons to life or creating epic space battles that look breathtakingly real.
  3. The Viral Video Sensation: For content creators, time is of the essence. GPU-accelerated BMF is like having a superpower to edit and render viral-worthy videos in record time, ensuring you hit the trends before they fade.
  4. The Gaming Livestream: In the gaming world, live streaming with real-time effects is key. With BMF's GPU acceleration, you can stream your gameplay with high-quality graphics and overlays, keeping your audience glued to their screens.
  5. The AI-Powered Masterpiece: Dive into the future with AI-enhanced video processing. From upscaling vintage film footage to crystal-clear quality to applying real-time face filters in a video chat, BMF's GPU acceleration makes it all possible, and at lightning speeds.

GPU acceleration in BMF opens up a world of possibilities for high-performance video processing. By leveraging the power of GPUs, developers can achieve remarkable speed and efficiency in multimedia applications. The examples provided are just a starting point -- the real potential lies in how you apply these capabilities to your unique projects.

Remember, the key to successful implementation is understanding your specific processing requirements and how best to utilize BMF's GPU acceleration features to meet those needs.
