DEV Community

vizibop

Producing a Music Video in the Browser

This page and the associated CodePens show other developers a way to create a WebM video from a dynamically generated canvas driven by user input and an audio file. If this sounds interesting, keep reading.

Introduction

I'm Jason. I'm building Vizibop, a service currently in beta that lets musicians and podcasters quickly and easily create a unique MP4 video synced to an audio file. Yes, this is a thing.

At its core, Vizibop uses an HTML canvas, HTML5 Audio, MediaRecorder, and P5.js to generate a WebM file with front-end code only. From there, the WebM file is sent through AWS Elastic Transcoder to convert it to an MP4 file, which can then be shared on sites like Instagram, TikTok, and YouTube. Here is a basic video created from Vizibop:

Creating a basic animation

P5.js is a JavaScript library that makes drawing on a canvas relatively straightforward. Even for the mathematically challenged like myself, particles and flocking are easier to implement with P5. I am using P5 instance mode to keep things tidy. Here is a very basic scaffolding for a p5 animation:
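As a minimal sketch of what instance mode can look like (not the original CodePen): the canvas size, colors, and the `sketch-holder` container id are assumptions chosen for illustration.

```javascript
// Minimal p5 instance-mode sketch: every p5 call goes through `p`,
// so nothing leaks into the global namespace.
const sketch = (p) => {
  p.setup = () => {
    p.createCanvas(640, 360);
    p.frameRate(30);
    p.noStroke();
  };

  p.draw = () => {
    p.background(20, 20, 40);
    // Pulse a circle, using the frame counter as a clock.
    const diameter = 100 + 50 * Math.sin(p.frameCount * 0.05);
    p.fill(255, 120, 0);
    p.circle(p.width / 2, p.height / 2, diameter);
  };
};

// Only start the sketch when p5 has been loaded on the page.
// "sketch-holder" is a hypothetical container element id.
if (typeof p5 !== "undefined") {
  new p5(sketch, "sketch-holder");
}
```

Because the sketch is just a function that receives `p`, it can be handed a different container or swapped for another animation without touching the rest of the page.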

Playing Music

Audio in the browser is a bit of a rabbit hole. Howler and SoundJS will make your life a lot easier if you get into building jukeboxes and video games. To keep things simple, let's create a hidden audio element and let the user click a button to start playing a song. For Vizibop, we allow users to upload their own songs in WAV or MP3 format using FileReader and createObjectURL.
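One way this could be wired up, as a hedged sketch: it uses `URL.createObjectURL` only (no FileReader), and the function names, the list of MIME types, and the element ids in the usage comment are all assumptions for illustration.

```javascript
// MIME types browsers commonly report for MP3 and WAV uploads (assumed list).
const SUPPORTED_TYPES = ["audio/mpeg", "audio/mp3", "audio/wav", "audio/x-wav", "audio/wave"];

// Returns true when the uploaded file looks like an MP3 or WAV.
// Falls back to the file extension because `type` can be empty.
function isSupportedAudio(file) {
  if (SUPPORTED_TYPES.includes(file.type)) return true;
  return /\.(mp3|wav)$/i.test(file.name);
}

// Browser-only wiring: a file input chooses the song, a button starts it.
function wireAudio(doc, fileInput, playButton) {
  // Without the `controls` attribute the element renders nothing.
  const audio = doc.createElement("audio");
  doc.body.appendChild(audio);

  fileInput.addEventListener("change", () => {
    const file = fileInput.files[0];
    if (!file || !isSupportedAudio(file)) return;
    if (audio.src) URL.revokeObjectURL(audio.src); // release the previous song
    audio.src = URL.createObjectURL(file);
  });

  // Playback has to start from a user gesture, hence the click handler.
  playButton.addEventListener("click", () => {
    audio.play().catch((err) => console.error("Playback failed:", err));
  });

  return audio;
}

// Usage in a page (hypothetical element ids):
//   wireAudio(document,
//             document.getElementById("song-file"),
//             document.getElementById("play"));
```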

Wiring Audio to P5

The P5.sound module makes it super easy to have the variables within your sketch change based on the volume and frequency of a sound. Out of the box you can use Amplitude and FFT to do some pretty amazing things. The Coding Train has a whole series of videos on sound within P5 that I encourage you to watch.

In this example, we want to use beat detection to change the background of the canvas. Beat detection gets a little tricky depending on the song and the quality of the underlying audio. In Vizibop, we let the user choose which frequency bands to key off, which helps dial in more accurate beat detection.
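A simple energy-threshold detector illustrates the idea. This is a sketch under stated assumptions, not Vizibop's actual algorithm: `createBeatDetector`, the threshold, the hold time, and the 20 to 140 Hz band in the usage comment are all illustrative. p5.sound also ships its own `p5.PeakDetect` class, which may be a better starting point.

```javascript
// Creates a simple energy-threshold beat detector.
// threshold:  energy level (0-255, the range returned by p5.FFT's getEnergy)
//             at or above which a frame counts as a beat.
// holdFrames: frames to ignore after a beat so one drum hit fires only once.
function createBeatDetector({ threshold = 200, holdFrames = 10 } = {}) {
  let cooldown = 0;
  return function isBeat(energy) {
    if (cooldown > 0) {
      cooldown -= 1;
      return false;
    }
    if (energy >= threshold) {
      cooldown = holdFrames;
      return true;
    }
    return false;
  };
}

// Usage inside a p5 sketch with p5.sound loaded (assumed setup):
//   const fft = new p5.FFT();
//   const isBeat = createBeatDetector({ threshold: 200, holdFrames: 10 });
//   let bg = [0, 0, 0];
//   p.draw = () => {
//     fft.analyze();                        // must run before getEnergy
//     const bass = fft.getEnergy(20, 140);  // user-chosen band, in Hz
//     if (isBeat(bass)) bg = [p.random(255), p.random(255), p.random(255)];
//     p.background(bg[0], bg[1], bg[2]);
//   };
```

Exposing the band limits and the threshold as user settings is what lets the detection be tuned per song.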

Customize the animation

The animation above is pretty basic. A better experience allows the users to turn knobs and dials to customize their animation. In our case, I have tried to balance just enough knobs and dials to create a wide range of videos without overwhelming the user with too many options. The possibilities are really unlimited when you combine images, typography, color, motion, and math.

Record a video

We now have the key components to record a music video built entirely in the front end.

Now the fun begins, particularly across browsers and computers. MediaRecorder and captureStream are the magic that allows us to create a WebM video from the components above. As of this writing, Safari only supports MediaRecorder in experimental mode, and Firefox does not support the VP9 or H.264 codecs. Chrome is our best friend here and clearly performs best in this use case. With 70% market share between Chrome and Firefox, I was willing to move ahead with this approach, betting that Safari will eventually make MediaRecorder generally available.
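Putting the pieces together might look roughly like this. It is a sketch, not the Vizibop implementation: `pickMimeType`, `recordMusicVideo`, and the candidate list are hypothetical, and routing the audio element through a Web Audio `MediaStreamAudioDestinationNode` is one assumed way to get a recordable audio track.

```javascript
// Candidate container/codec strings, in order of preference (assumed order).
const CANDIDATES = [
  "video/webm;codecs=vp8,opus",
  "video/webm;codecs=vp9,opus",
  "video/webm",
];

// Returns the first type the browser reports it can record, or null.
function pickMimeType(isTypeSupported, candidates = CANDIDATES) {
  return candidates.find((type) => isTypeSupported(type)) || null;
}

// Browser-only. Records `canvas` plus the audio of `audioEl` and resolves
// with a WebM Blob when the song ends. Call it from a click handler so the
// browser allows audio playback to start.
function recordMusicVideo(canvas, audioEl, { fps = 30 } = {}) {
  const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
  if (!mimeType) return Promise.reject(new Error("No supported WebM type"));

  // Route the audio element through Web Audio to get a recordable track.
  const audioCtx = new AudioContext();
  const source = audioCtx.createMediaElementSource(audioEl);
  const audioDest = audioCtx.createMediaStreamDestination();
  source.connect(audioDest);            // into the recording
  source.connect(audioCtx.destination); // and still audible to the user

  // Merge the canvas video track with the audio track.
  const stream = new MediaStream([
    ...canvas.captureStream(fps).getVideoTracks(),
    ...audioDest.stream.getAudioTracks(),
  ]);

  const recorder = new MediaRecorder(stream, { mimeType });
  const chunks = [];
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };

  return new Promise((resolve, reject) => {
    recorder.onstop = () => resolve(new Blob(chunks, { type: mimeType }));
    recorder.onerror = (event) => reject(event.error);
    audioEl.addEventListener("ended", () => recorder.stop(), { once: true });
    recorder.start();
    audioEl.play().catch(reject);
  });
}
```

Checking `MediaRecorder.isTypeSupported` up front keeps the browser differences in one place: the same code falls through to whatever WebM variant the current browser can actually record.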

I am still tweaking the configuration here. It seems that both the VP9 and H.264 codecs underperform VP8 for this application. Firefox has its issues. Adjusting the frame rate of both the animation and the captureStream, combined with the optional videoBitsPerSecond parameter for the MediaRecorder, seems to reduce visual lag and compression artifacts. It's not perfect, but so far I am pleased with the results.
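To make those knobs concrete, here is a small sketch that keeps the frame rate and bitrate in one place. The helper names and the bits-per-pixel heuristic (including the 0.1 default) are assumptions for illustration, not measured values or Vizibop's settings.

```javascript
// Rough bitrate estimate: pixels per second times a "bits per pixel"
// quality factor. The 0.1 default is an illustrative starting point only.
function estimateVideoBitrate({ width, height, fps, bitsPerPixel = 0.1 }) {
  return Math.round(width * height * fps * bitsPerPixel);
}

// Builds one config object so the animation, captureStream, and
// MediaRecorder all agree on the same frame rate.
function buildRecordingConfig({ width, height, fps = 30, bitsPerPixel = 0.1 }) {
  return {
    fps, // pass to p.frameRate(fps) and canvas.captureStream(fps)
    recorderOptions: {
      mimeType: "video/webm;codecs=vp8",
      videoBitsPerSecond: estimateVideoBitrate({ width, height, fps, bitsPerPixel }),
    },
  };
}

// Usage in the browser:
//   const config = buildRecordingConfig({ width: 1280, height: 720 });
//   const stream = canvas.captureStream(config.fps);
//   const recorder = new MediaRecorder(stream, config.recorderOptions);
```

Raising `bitsPerPixel` trades file size for fewer compression artifacts, while lowering `fps` reduces the work the browser has to do per second of video.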

What's next?

With this technique, there really isn't a limit to the types of videos musicians and podcasters can create in the browser. That said, by design, Vizibop will not evolve into a full-blown video editor. A guiding principle is to keep the user experience simple while maximizing the number of unique videos that can be created. Awesome Factor = Number of Unique Videos / Number of Knobs and Dials. My intent is to create a tool that is Sesame Street Simple for people who I believe should be spending more time focused on their core craft.

Your thoughts, feedback, advice, and guidance are always welcome.
