A quick writeup on rendering video to an HTML5 canvas

#webdev #javascript

tl;dr Use pre-rendered JPGs that combine multiple videos at a low frame rate.

I've recently been working on a side project using web-based interactive video. Video is rendered onto an HTML5 canvas depending on various interactions with the screen. There are roughly 24 minutes of source material, any part of which can be triggered at more or less any time and has to play on demand without any buffering lag.

I've tried several different strategies to pull this off. I'll go through all of the bad ones first, describing why they didn't work for me, and I'll close with the "least worst" solution.

1. Use video

A logical first step when rendering video to a canvas is to use video as source material. In order to get a particular frame, one can set the currentTime like so:

video.currentTime = 3.1416;

There proved to be three issues with this approach:

The seek operation is sometimes slow, which makes the rendering loop miss its deadline if there are several seeks.
Because the seek operation is asynchronous, multiple parts of the canvas can't use the same video without coordinating asynchronous handlers (race condition).
Asynchronous video rendering becomes the only asynchronous operation in an otherwise synchronous canvas rendering pipeline. In my case, this would have required significantly refactoring otherwise-synchronous code (i.e. changing everything to promises, async/await, or Aff).
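
To make the asynchrony concrete, here is a minimal sketch of what drawing a single seeked frame looks like. `drawFrameAt` is a hypothetical helper I made up for illustration, not something from my project; it only relies on the standard `currentTime` property, the `seeked` event, and `drawImage`.

```javascript
// Hypothetical helper: seek `video` to `time` (in seconds) and draw the
// resulting frame at (dx, dy) once the browser reports the seek is done.
function drawFrameAt(video, ctx, time, dx, dy) {
  const onSeeked = () => {
    // One-shot listener: remove it so later seeks don't redraw this region.
    video.removeEventListener("seeked", onSeeked);
    ctx.drawImage(video, dx, dy);
  };
  video.addEventListener("seeked", onSeeked);
  // Assigning currentTime only *starts* the seek; the frame is not ready yet.
  video.currentTime = time;
}
```

Nothing is drawn when the function returns, only later when `seeked` fires. If two regions of the canvas call this on the same video before the first seek finishes, both handlers run on whichever `seeked` event arrives first, which is the kind of race described above.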

2. Pre-render the videos to a canvas

In my next experiment, I pre-rendered all of the videos to HTML canvases at a frame rate of 20 fps. With each video weighing in at 600x600, a single canvas holds about 600 KB of information. 600 KB × (24 × 60) seconds × 20 fps ≈ 17 GB of pre-rendered canvases. If you want to see Google Chrome explode and choke your computer, try pre-rendering 17 GB of HTML5 canvas! So this wasn't going to work...
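
The back-of-the-envelope math can be spelled out as a few lines of code. This is only a rough estimate: `estimateCanvasBytes` is a hypothetical helper, and the per-canvas size is the approximate figure quoted above rather than a measurement.

```javascript
// Hypothetical helper: total bytes needed if every frame gets its own canvas.
function estimateCanvasBytes(bytesPerFrame, minutes, fps) {
  const frameCount = minutes * 60 * fps; // seconds of footage times frames per second
  return bytesPerFrame * frameCount;
}

const perCanvas = 600 * 1000; // ~600 KB per canvas (rough estimate from the text)
const total = estimateCanvasBytes(perCanvas, 24, 20);
console.log((total / 1e9).toFixed(2) + " GB"); // 17.28 GB
```

If you count each canvas as raw RGBA pixels instead (600 × 600 × 4 bytes, assuming 4 bytes per pixel), the same formula lands at roughly 41 GB, so the conclusion only gets worse.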

3. Use ffmpeg to combine the videos and pre-render them as a series of JPEGs

Certain JPEG compression methods reduce image size based on similarities across regions of the image. That means that, if you are able to tile source material in a single image, you can take advantage of similarities between the color palettes and reduce the overall size. I made each source video a tile in a JPEG that holds 8 video frames side by side, from left to right. By doing this and dropping the frame rate to 10 fps, I was able to compress the videos to ~44 MB of source material spread over ~2000 JPEGs. Because the images can be downloaded asynchronously and in parallel, downloading all 2000 images of ~20 KB each takes around 10 seconds on a 4G connection. So this was the winner! I'd recommend this method to anyone doing web-based rendering of a canvas with a large amount of pre-loaded video content.
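
On the rendering side, a tiled JPEG can be treated like a sprite sheet. The sketch below is one way to do it, not my exact code. It assumes each JPEG is a horizontal strip of 8 tiles of 600x600 pixels and that sheet number N holds frame N at 10 fps; the function names and the injectable `createImage` parameter are made up for illustration.

```javascript
// Assumed layout: one horizontal strip of 8 tiles per JPEG, 600x600 each.
const TILE_SIZE = 600;
const TILES_PER_SHEET = 8;
const FPS = 10;

// Map a playback time in seconds to the index of the sheet holding that frame.
function sheetIndexAt(seconds) {
  return Math.floor(seconds * FPS);
}

// Source rectangle of one tile inside a sheet.
function tileRect(tileIndex) {
  if (tileIndex < 0 || tileIndex >= TILES_PER_SHEET) {
    throw new RangeError("tileIndex out of range: " + tileIndex);
  }
  return { sx: tileIndex * TILE_SIZE, sy: 0, sw: TILE_SIZE, sh: TILE_SIZE };
}

// Draw one tile at (dx, dy). This is synchronous as long as `sheet` is an
// image that has already finished loading.
function drawTile(ctx, sheet, tileIndex, dx, dy) {
  const { sx, sy, sw, sh } = tileRect(tileIndex);
  ctx.drawImage(sheet, sx, sy, sw, sh, dx, dy, sw, sh);
}

// Load a single sheet. `createImage` defaults to the browser's Image
// constructor and can be swapped out where no DOM is available.
function loadSheet(url, createImage = () => new Image()) {
  return new Promise((resolve, reject) => {
    const img = createImage();
    img.onload = () => resolve(img);
    img.onerror = () => reject(new Error("failed to load " + url));
    img.src = url;
  });
}

// Start all downloads at once and resolve when every sheet is ready.
function preloadSheets(urls, createImage) {
  return Promise.all(urls.map((url) => loadSheet(url, createImage)));
}
```

The point of this shape is that all of the asynchronous work happens once, up front, in `preloadSheets`. After that, the render loop only calls `drawTile`, so the canvas pipeline stays synchronous.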
