Hey folks!
In this article, I want to share our experience building a screen recorder inside the browser.
The product is Browser Recorder, a Chrome extension that records the screen, camera, and microphone, and lets you add zoom effects, trim, cut, and export the final video.
At first, the recording part looked pretty simple. Browser APIs already give us screen capture and camera capture. But when we started to put the webcam on top of the screen recording, the real problems started.
Overview
In this article, I will cover:
- how easy it is to record only the screen
- why webcam overlay changes the architecture
- why live canvas compositing can make long videos choppy
- how we record screen and camera separately
- how we compose the final video during export
- one small bug with canvas sizing that was annoying to debug
The easy part: recording only the screen
If you need only a screen recording, the browser gives you a nice API for it.
async function recordScreen() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: { frameRate: 30 },
    audio: true,
  });
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  const chunks = [];
  recorder.ondataavailable = (event) => chunks.push(event.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: "video/webm" });
    // download() is not a browser API; it stands for whatever helper saves the file.
    download(URL.createObjectURL(blob));
  };
  recorder.start();
  return recorder;
}
This is enough for a simple recorder.
You ask for the screen, pass the stream into MediaRecorder, collect chunks, and download a file.
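One detail the snippet above skips is that not every browser can record every container and codec. Here is a minimal sketch of picking a supported MIME type before creating the recorder. The helper name `pickRecorderMimeType` and the candidate list are assumptions for illustration; the only browser call used is `MediaRecorder.isTypeSupported`.

```javascript
// Hypothetical helper: returns the first MIME type the recorder can produce.
// `isSupported` is injectable so the function can be exercised outside a browser.
function pickRecorderMimeType(
  candidates,
  isSupported = (type) =>
    typeof MediaRecorder !== "undefined" && MediaRecorder.isTypeSupported(type),
) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return undefined; // nothing matched: let MediaRecorder use its own default
}

// The candidate list is an assumption; order it by your own preference.
const PREFERRED_TYPES = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/webm",
  "video/mp4",
];
```

In the recorder you could then write `const mimeType = pickRecorderMimeType(PREFERRED_TYPES)` and pass `mimeType ? { mimeType } : {}` as the options, so the constructor does not throw on an unsupported type.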
But in our case, this was not enough because users expect more than a raw screen recording. They want a camera overlay, microphone, zoom effects, trim, cuts, and export formats like MP4 or WebM.
The first hard feature was the camera overlay.
Problem
The browser can give you two streams:
- screen stream
- camera stream
But the browser does not create a picture-in-picture video for you.
MediaRecorder can record tracks, but it will not magically draw the camera on top of the screen. So the first idea is usually to use a canvas.
function startCompositing(screenVideo, cameraVideo, canvas) {
  const ctx = canvas.getContext("2d");
  function draw() {
    ctx.drawImage(screenVideo, 0, 0, canvas.width, canvas.height);
    const camW = canvas.width * 0.2;
    const camH = camW * (cameraVideo.videoHeight / cameraVideo.videoWidth);
    ctx.drawImage(
      cameraVideo,
      canvas.width - camW - 24,
      canvas.height - camH - 24,
      camW,
      camH,
    );
    requestAnimationFrame(draw);
  }
  draw();
  const composited = canvas.captureStream(30);
  return new MediaRecorder(composited, { mimeType: "video/webm" });
}
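The overlay geometry in the loop above can be pulled out into a pure function, which makes it easy to reuse for both preview and export. This is a sketch only: the name `cameraOverlayRect` and the `opts` parameter are assumptions, and the defaults simply mirror the 20% width and 24px margin from the snippet.

```javascript
// Computes where the camera overlay goes, mirroring the inline math above:
// 20% of the canvas width, camera aspect ratio preserved, bottom-right corner.
function cameraOverlayRect(canvasW, canvasH, camVideoW, camVideoH, opts = {}) {
  const { widthRatio = 0.2, margin = 24 } = opts;
  // videoWidth/videoHeight are 0 until metadata loads, so there is no
  // meaningful aspect ratio yet; the caller can skip the overlay for now.
  if (!camVideoW || !camVideoH) return null;
  const width = canvasW * widthRatio;
  const height = width * (camVideoH / camVideoW);
  return {
    x: canvasW - width - margin,
    y: canvasH - height - margin,
    width,
    height,
  };
}
```

Inside `draw()` the call would look like `const r = cameraOverlayRect(canvas.width, canvas.height, cameraVideo.videoWidth, cameraVideo.videoHeight)`, followed by `ctx.drawImage(cameraVideo, r.x, r.y, r.width, r.height)` when `r` is not null.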
This approach looks good in a demo.
You see the screen. You see the camera. You can record the canvas stream. Everything feels finished.
But after testing longer recordings, we saw that this approach is not stable enough.
Why live canvas recording was a problem
With live canvas compositing, the browser needs to do many things in real time:
- decode the screen stream
- decode the camera stream
- draw both streams into canvas
- encode the canvas stream again
- keep the UI responsive
For a 20-second video, it can be fine.
For a 5- or 10-minute video, the missing frames become visible.
The most annoying part is that there is no big error. Nothing crashes. The exported video just becomes less smooth.
There are a few clocks involved:
- screen capture can be 60fps
- webcam can be 30fps
- canvas.captureStream(30) asks for 30fps
- requestAnimationFrame depends on how busy the tab is
If the tab cannot keep the drawing loop stable, your output will not be stable either.
So we decided not to fight with the live canvas loop.
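Because nothing crashes, it helps to put a number on the problem. The sketch below is one possible way to do that, assuming you log the timestamp of every draw call (for example, the value passed to each requestAnimationFrame callback). The function name `countDroppedFrames` is hypothetical, and it measures the draw loop only, not the encoded output.

```javascript
// Counts frames the drawing loop missed, given draw timestamps in milliseconds
// and the frame rate the output is supposed to have.
function countDroppedFrames(timestampsMs, targetFps) {
  const frameMs = 1000 / targetFps;
  let dropped = 0;
  for (let i = 1; i < timestampsMs.length; i++) {
    const gap = timestampsMs[i] - timestampsMs[i - 1];
    // A gap of roughly N frame intervals means N - 1 frames were never drawn.
    const missed = Math.round(gap / frameMs) - 1;
    if (missed > 0) dropped += missed;
  }
  return dropped;
}
```

Running this over a 20-second log and a 10-minute log of the same setup is a quick way to see whether the loop degrades over time.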
Solution
The solution was to separate recording from compositing.
In Browser Recorder, we record screen and camera as independent recordings.
The canvas that the user sees during recording is only a preview. It shows where the camera is placed, but it is not the final video.
The architecture is:
- Record screen video separately.
- Record camera video separately.
- Mix screen audio and microphone audio into the screen recording.
- Keep camera recording video-only.
- Compose the final video later during export.
Implementation details
For recording we use mediabunny instead of MediaRecorder.
Before starting capture, we check which video codec the browser can encode:
const videoCodec =
  (await getFirstEncodableVideoCodec(["vp9", "vp8", "avc"])) ?? "vp8";
const outputFormat =
  videoCodec === "avc" ? new Mp4OutputFormat() : new WebMOutputFormat();
const audioCodec = videoCodec === "avc" ? "aac" : "opus";
After that, the screen video track goes into MediaStreamVideoTrackSource.
const screenVideoSource = new MediaStreamVideoTrackSource(screenVideoTrack, {
  codec: videoCodec,
  bitrate: RECORDING_BITRATE.screen,
  sizeChangeBehavior: "contain",
});
screenOutput.addVideoTrack(screenVideoSource);
For audio, we mix screen audio and microphone audio with Web Audio API and then pass the mixed track into MediaStreamAudioTrackSource.
const audioDestination = audioContext.createMediaStreamDestination();
screenAudioSource.connect(audioDestination);
micAudioSource.connect(audioDestination);
const mixedAudioTrack = audioDestination.stream.getAudioTracks()[0];
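The snippet above does not show where `screenAudioSource` and `micAudioSource` come from. Here is a minimal sketch of one way to build the whole graph, assuming both inputs arrive as MediaStream objects. The function name `mixAudioStreams` and the optional per-source gains are assumptions, not the extension's actual code; the Web Audio calls themselves are standard.

```javascript
// Builds the mix graph on a caller-supplied AudioContext.
// Streams without an audio track are skipped, because
// createMediaStreamSource() throws when the stream has no audio tracks
// (for example, when the user shares a screen without audio).
function mixAudioStreams(audioContext, streams, gains = []) {
  const destination = audioContext.createMediaStreamDestination();
  let connectedCount = 0;
  streams.forEach((stream, index) => {
    if (!stream || stream.getAudioTracks().length === 0) return;
    const source = audioContext.createMediaStreamSource(stream);
    const gainNode = audioContext.createGain();
    gainNode.gain.value = gains[index] ?? 1; // per-source volume, 1 = unchanged
    source.connect(gainNode);
    gainNode.connect(destination);
    connectedCount += 1;
  });
  return {
    mixedAudioTrack: destination.stream.getAudioTracks()[0],
    connectedCount, // 0 means the mixed track will be silent
  };
}
```

A call could look like `mixAudioStreams(new AudioContext(), [screenStream, micStream])`. The gain nodes are optional, but they give you one place to balance a loud tab against a quiet microphone.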
The camera is recorded into its own output:
const camVideoSource = new MediaStreamVideoTrackSource(camVideoTrack, {
  codec: videoCodec,
  bitrate: RECORDING_BITRATE.camera,
});
camOutput.addVideoTrack(camVideoSource);
This gives us two files in IndexedDB:
- screen recording with audio
- camera recording without audio
This looks more complicated than one canvas, but it gives much more control later.
Export process
The final merge happens during export.
In the export step, we read the screen recording frame by frame with VideoSampleSink.
For every screen frame, we ask the camera provider for the latest camera frame at that timestamp. Then we draw the screen, the zoom effect, and the camera overlay into an OffscreenCanvas.
After that we create a new VideoSample from the canvas and send it into VideoSampleSource.
Small version of the idea:
for await (const sample of videoSink.samples(segStart, segEnd)) {
  const camFrame = await cameraProvider.getFrameAt(sample.timestamp);
  renderFrameWithEffects(
    sample,
    outputCtx,
    outWidth,
    outHeight,
    zoomState,
    sourceWidth,
    sourceHeight,
    hasCameraOverlay,
    camFrame,
    userCamRecording,
  );
  const outputSample = new VideoSample(outputCanvas, {
    timestamp: outputTimestamp,
  });
  await videoSource.add(outputSample);
}
This loop is not trying to keep up with real time.
If export needs more time, it can take more time. The final video is still created frame by frame.
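The "latest camera frame at the same timestamp" lookup is the part that keeps the two recordings in sync. The article does not show how `cameraProvider` works internally, so the following is only a sketch of the lookup rule itself, assuming the camera frame timestamps are available as an ascending array in seconds. The function name `latestFrameIndexAt` is hypothetical.

```javascript
// Returns the index of the latest frame whose timestamp is <= t,
// or -1 when t is before the first frame.
// `timestamps` must be sorted in ascending order.
function latestFrameIndexAt(timestamps, t) {
  let lo = 0;
  let hi = timestamps.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (timestamps[mid] <= t) {
      found = mid; // candidate; keep looking for a later one
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
```

With a 60fps screen and a 30fps camera, two consecutive screen frames usually map to the same camera frame, which is exactly what you want: the camera image is held until a newer one exists.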
This is also the place where we apply:
- camera overlay
- zoom effects
- cuts
- trims
- export resolution
- export format
Right now Browser Recorder supports MP4, WebM, MOV, and MKV export.
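Cuts and trims mean that the output timestamp is not the same as the source timestamp, which is why the export loop writes `outputTimestamp` instead of `sample.timestamp`. Here is a minimal sketch of that mapping, assuming the kept parts of the recording are stored as sorted, non-overlapping `{ start, end }` ranges in seconds. Both the data shape and the name `toOutputTimestamp` are assumptions for illustration.

```javascript
// Maps a source timestamp to its position in the exported video.
// Returns null when the timestamp falls inside a cut or outside the trim range.
function toOutputTimestamp(segments, sourceTs) {
  let elapsed = 0; // output time contributed by earlier kept segments
  for (const { start, end } of segments) {
    if (sourceTs < start) return null; // inside a cut before this segment
    if (sourceTs < end) return elapsed + (sourceTs - start);
    elapsed += end - start;
  }
  return null; // after the last kept segment
}
```

For example, with kept ranges 1s to 3s and 5s to 8s, a source frame at 5s lands at 2s in the output, because only the first two seconds of kept video come before it.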
Small bug with canvas sizing
One small bug that took time to understand involved video dimensions.
On playback, the video looked like it changed width after the first second. There was no animation. No zoom. It just changed size.
The reason was simple.
video.videoWidth and video.videoHeight are 0 before metadata is loaded. If you size the canvas before that, it gets the wrong dimensions. Once metadata loads, the real size arrives and the canvas is resized.
The fix is to size the canvas only after loadedmetadata.
video.onloadedmetadata = () => {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  drawFrame();
};
It is a small thing, but in video tools small things become visible very fast.
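One edge case with the handler above: if the metadata is already loaded when the handler is attached, the event has already fired and the canvas never gets sized. Here is a sketch of a slightly more defensive version, assuming a hypothetical helper named `sizeCanvasToVideo` that takes the draw function as a callback.

```javascript
// Sizes the canvas from the video's intrinsic dimensions, waiting for
// loadedmetadata only when the dimensions are not known yet.
function sizeCanvasToVideo(video, canvas, onReady) {
  const apply = () => {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    onReady();
  };
  // readyState >= 1 (HAVE_METADATA) means the dimensions are available and
  // loadedmetadata has already fired for this source.
  if (video.readyState >= 1 && video.videoWidth > 0) {
    apply();
  } else {
    video.addEventListener("loadedmetadata", apply, { once: true });
  }
}
```

Usage would be `sizeCanvasToVideo(video, canvas, drawFrame)`, called once after the video source is set.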
Summary
The main lesson for us was simple: recording is not the same as composing.
Screen capture and camera capture are already quite good in the browser. The hard part is combining them into one final video with good frame accuracy.
For Browser Recorder, the better architecture was:
- Capture sources separately.
- Keep live canvas only for preview.
- Store screen and camera recordings independently.
- Compose final video during export.
- Test with long videos, not only with short demos.
If you are building a browser recorder, I really recommend testing 5- or 10-minute recordings early. A lot of problems are invisible in a short demo.
Browser Recorder is the Chrome extension where we are working through all of these problems: screen recording, camera overlay, zoom effects, browser-based editing, and export without watermark.
If you have experience with browser video recording, share in the comments how you solved camera overlay and export performance.