MediaStream API in Javascript

#javascript #webdev #beginners

Hello fellow dev’s today we are gonna see how easy it is to record your voice or screen in the browser using Mediastream Recording API, with just a few lines we can have something working immediately, first let’s see how MDN defines Mediastream Recording API.

“The MediaStream Recording API is comprised of a single major interface, MediaRecorder, which does all the work of taking the data from a MediaStream and delivering it to you for processing. The data is delivered by a series of dataavailable events, already in the format you specify when creating the MediaRecorder”

There are a lot of technical words in that explanation but in a extremely simplified way mediaStream provides us the tools to control audio and videos using streams of data to deliver information with events like dataavailable or onstop, after that we manipulate this information however we see fit.

Initial setup

all the code you see in this article is available in the following REPOSITORY
and if you wanna test the code directly you can do it HERE

This project uses only javascript vanilla, we don’t need anything eccentric like react.js or vue.js, but of course if you want to try it using some framework go ahead because it’s basically the same.

HTML

The HTML file is a simple template, with links to our css and js files, other than that we some buttons and a gallery, this is where we gonna display all our audios/videos.

CSS

As for the styling I added some basic flex styles just for centering and a fancy button gradient just for presentation purpose.

Javascript

Now, here we have the main dish, let’s go through almost line by line.

We start by declaring all the HTML selectors we'll end up using for future events, mediaRecorder is gonna be the main object that dictates if we recording audio or our screen and the chunks variable is where we gonna store our recording data before converting it into an HTML element.



const buttons = document.querySelectorAll(".button");
const startAudioButton = document.querySelector("#startAudio");
const startScreenButton = document.querySelector("#startScreen");
const stopButton = document.querySelector("#stopAudio");
const audioList = document.querySelector("#audio-list");
const videoList = document.querySelector("#video-list");

let mediaRecorder = null;
let chunks = [];

Here we add click events to our three beautiful buttons so each one calls the function associate with the HTML element when we want to start or stop recording.



startAudioButton .addEventListener("click", recordAudio);
stopButton.addEventListener("click", stopRecording);
startScreenButton.addEventListener("click", recordSCreen);


function recordAudio() {
    // ...code
}

function  stopRecording() {
    // ...code
}

function  recordSCreen() {
    // ...code
}

The first “big” function we have is for recording audio, here we have a promise that calls the method .getUserMedia() with a json object to specify that we need only audio, this pops up a window asking for our permission to use the microphone inside the browser, after that we get a stream.

This stream can be obtained from audio or video, but in our case we want to capture our microphones stream, so we use it to initialize a new MediaRecorder object.

During the recording we will get a continues flow of data from the event ondataavailable, this data has the following structure:

Here's the definition of a Blob for those that dont know what it means.

“The Blob object represents a blob, which is a file-like object of immutable, raw data; they can be read as text or binary data, or converted into a ReadableStream “

we store all this information inside the array chunks as we are gonna need it later to create the audio element with it.

Then whenever we stop recording we call another function that creates the HTML audio element using the chunks array (Blobs).

Lastly we start the recording with...you guessed it mediaRecorder.start(x) by default it saves the entire file into a single Blob, but if we specify a duration then it creates a Blob every X milliseconds.



function recordAudio() {
  navigator.mediaDevices
    .getUserMedia({ audio: true})
    .then((stream) => {
      mediaRecorder = new MediaRecorder(stream);
      mediaRecorder.ondataavailable = (e) => {
        chunks.push(e.data);
      };
      mediaRecorder.onstop = (e) => {
        createMediaElement("audio", "audio/mp3", audioList);
      };
      mediaRecorder.onerror = (e) => {};
      mediaRecorder.start(1000);
    })
}

We stop the recording by simply calling mediaRecorder.stop()



function stopRecording() {
  mediaRecorder.stop();
}

When we stop a recording we automatically create a mediaRecorder.onstop event , this then calls the function createMediaElement(...) with the mediaType (audio or video), fileType and the placeToAdd (where to insert the element we just created).

Now we use all the stored information in the chunks array to create one Blob and make it into a url.

Then we create the HTML element passing the url as src and we reset the let variables.



function createMediaElement(mediaType, fileType, placeToAdd) {
  const blob = new Blob(chunks, {
    type: fileType,
  });
  const mediaURL = window.URL.createObjectURL(blob);
  const element = document.createElement(mediaType);
  element.setAttribute("controls", "");
  element.src = mediaURL;
  placeToAdd.insertBefore(element, placeToAdd.firstElementChild);
  mediaRecorder = null;
  chunks = [];
}

Screen recording is more or less the same thing, the only big differences is that we call getDisplayMedia instead of getUserMedia and when we create the media element we pass the chunks type as fileType.



function recordSCreen() {
  navigator.mediaDevices
    .getDisplayMedia({ mediaSource: "screen"})
    .then((stream) => {
      mediaRecorder = new MediaRecorder(stream);
      mediaRecorder.ondataavailable = (e) => {
        chunks.push(e.data);
      };
      mediaRecorder.onstop = (e) => {
        createMediaElement("video", chunks[0].type, videoList);
      };
      mediaRecorder.start();
    })
}

With this we have basically covered the entire thing, as you can see there is not much to it.