Mason K

Build scrub-bar thumbnail previews with FFmpeg and a WebVTT sprite

TL;DR

We're going to add hover-preview thumbnails (the little image that follows your cursor on a video scrub bar) to a player. Backend is one FFmpeg command that builds a sprite sheet, plus a tiny script that writes a WebVTT index. Frontend points the player at the VTT. Total: well under 100 lines.

What we're building

Two static artifacts and a few lines of player config:

  1. storyboard-0.jpg - a sprite sheet: many small frames tiled into one image (long videos add storyboard-1.jpg, storyboard-2.jpg, and so on).
  2. storyboard.vtt - a WebVTT file mapping each timeline range to a rectangle in the sprite via a #xywh fragment.
  3. Player wiring (Video.js shown, plus a sketch for a custom seek bar).

Versions used: ffmpeg 8.0.2 (anything 7.1+ works), node 20.x.

1. Generate the sprite with FFmpeg

The core command. One frame every 10 seconds, each scaled to 160px wide, tiled into a 10x10 grid:

# named storyboard-0.jpg so it matches the sheet names the VTT generator in step 3 writes
ffmpeg -i input.mp4 \
  -vf "fps=1/10,scale=160:-1,tile=10x10" \
  -frames:v 1 storyboard-0.jpg

What each filter does:

  • fps=1/10 - sample one frame per 10 seconds (not every frame).
  • scale=160:-1 - 160px wide, height auto from aspect ratio.
  • tile=10x10 - pack frames into a 10-column, 10-row grid (up to 100 tiles).

# terminal output you'll see
frame=    1 fps=0.0 q=24.0 Lsize=N/A time=N/A bitrate=N/A
video:118kB audio:0kB subtitle:0kB ...

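⚠️ A 10x10 grid holds 100 tiles, which covers 1000 seconds at a 10s interval. With -frames:v 1, FFmpeg writes only that first sheet, so anything past 1000 seconds never makes it into the sprite. Enlarging the grid only goes so far, because a very large image can hit the browser's maximum image/canvas dimensions (~16k–32k px per side). For anything long, generate multiple sprites, as covered in step 4.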
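
To make the capacity arithmetic concrete, here is a small sketch that works out how many tiles and sheets a given duration needs. planSheets is a hypothetical helper, not part of FFmpeg or the article's scripts; its defaults simply mirror the values used in this post.

```javascript
// Hypothetical helper: defaults mirror the article's interval, grid, and tile size.
function planSheets(durationSec, { interval = 10, cols = 10, rows = 10, tileW = 160, tileH = 90 } = {}) {
  const tiles = Math.ceil(durationSec / interval);        // one tile per interval, last one may be partial
  const perSheet = cols * rows;                           // 100 tiles in a 10x10 grid
  const sheets = Math.max(1, Math.ceil(tiles / perSheet));
  return {
    tiles,
    sheets,
    secondsPerSheet: perSheet * interval,                 // 1000s with the defaults
    sheetWidth: cols * tileW,                             // pixel size of a full sheet
    sheetHeight: rows * tileH,
  };
}

console.log(planSheets(642.4)); // the 642.4s example fits in one sheet
console.log(planSheets(7200));  // a two-hour video needs several
```

With the defaults, a full sheet is only 1600x900 px, so the per-sheet tile count, not the pixel size, is what forces the split here.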

2. Know your numbers with ffprobe

Before writing the VTT, get the real duration so the last (partial) row is handled correctly:

ffprobe -v error -show_entries format=duration \
  -of csv=p=0 input.mp4
# 642.400000
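
If you call ffprobe from a script, it's worth validating what comes back before doing math on it. A minimal sketch, assuming the csv=p=0 output shown above; parseDuration is a hypothetical name:

```javascript
// Hypothetical helper: turns ffprobe's duration line into a number, or throws.
function parseDuration(stdout) {
  const value = Number.parseFloat(String(stdout).trim());
  // Covers empty output, non-numeric output, and nonsensical durations.
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`unexpected ffprobe duration output: ${JSON.stringify(stdout)}`);
  }
  return value;
}

const duration = parseDuration("642.400000\n");
console.log(duration);                 // 642.4
console.log(Math.ceil(duration / 10)); // 65 cues at a 10s interval
```

In a real script the string would come from running the ffprobe command above, for example with execFileSync from node:child_process.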

3. Generate the WebVTT index

This is the part that has to be exact. Each cue's time range must match the FFmpeg interval, or previews drift the deeper you scrub. Generate it from the same interval value, never by hand.

// scripts/makeVtt.js - node 20+ (ES module: set "type": "module" in package.json, or name it makeVtt.mjs)
import { writeFileSync } from "node:fs";

const INTERVAL = 10;      // seconds, MUST match fps=1/INTERVAL
const TILE_W = 160;
const TILE_H = 90;        // know your source aspect; 16:9 -> 90
const COLS = 10;
const ROWS = 10;
const PER_SHEET = COLS * ROWS;

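function ts(sec) {
  const totalMs = Math.round(sec * 1000);   // keep fractional seconds, e.g. the final 642.4
  const h = String(Math.floor(totalMs / 3600000)).padStart(2, "0");
  const m = String(Math.floor((totalMs % 3600000) / 60000)).padStart(2, "0");
  const s = String(Math.floor((totalMs % 60000) / 1000)).padStart(2, "0");
  const ms = String(totalMs % 1000).padStart(3, "0");
  return `${h}:${m}:${s}.${ms}`;
}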

export function makeVtt(durationSec, sheetName = "storyboard") {
  const count = Math.ceil(durationSec / INTERVAL);
  let out = "WEBVTT\n\n";

  for (let i = 0; i < count; i++) {
    const start = i * INTERVAL;
    const end = Math.min(start + INTERVAL, durationSec);

    const indexInSheet = i % PER_SHEET;
    const sheet = Math.floor(i / PER_SHEET);     // 0, 1, 2...
    const col = indexInSheet % COLS;
    const row = Math.floor(indexInSheet / COLS);
    const x = col * TILE_W;
    const y = row * TILE_H;

    const img = `${sheetName}-${sheet}.jpg`;     // matches multi-sheet output
    out += `${ts(start)} --> ${ts(end)}\n`;
    out += `${img}#xywh=${x},${y},${TILE_W},${TILE_H}\n\n`;
  }
  return out;
}

const duration = Number(process.argv[2] || 642.4);
writeFileSync("storyboard.vtt", makeVtt(duration));
console.log("wrote storyboard.vtt");

A few cues from the output:

WEBVTT

00:00:00.000 --> 00:00:10.000
storyboard-0.jpg#xywh=0,0,160,90

00:00:10.000 --> 00:00:20.000
storyboard-0.jpg#xywh=160,0,160,90

00:00:20.000 --> 00:00:30.000
storyboard-0.jpg#xywh=320,0,160,90
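
The same grid math can be run in the other direction: given a timestamp, which sheet and rectangle should it land in? This sketch reuses the constants from makeVtt.js; tileForTime is a hypothetical helper for sanity-checking the generated cues.

```javascript
// Same values as scripts/makeVtt.js.
const INTERVAL = 10, TILE_W = 160, TILE_H = 90, COLS = 10, ROWS = 10;

// Hypothetical helper: maps a timestamp to the sheet and top-left corner of its tile.
function tileForTime(timeSec) {
  const i = Math.floor(timeSec / INTERVAL);   // cue index on the timeline
  const perSheet = COLS * ROWS;
  const indexInSheet = i % perSheet;
  return {
    sheet: Math.floor(i / perSheet),
    x: (indexInSheet % COLS) * TILE_W,
    y: Math.floor(indexInSheet / COLS) * TILE_H,
  };
}

console.log(tileForTime(25));   // { sheet: 0, x: 320, y: 0 }
console.log(tileForTime(125));  // { sheet: 0, x: 320, y: 90 }
console.log(tileForTime(1015)); // { sheet: 1, x: 160, y: 0 }
```

The first result matches the third cue listed above (20s to 30s), and the last one shows the index wrapping onto a second sheet.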

4. Generate multiple sheets for long video

To keep each sprite under the browser's image cap, split the extraction by segment. Loop over 1000-second windows and tile each into its own storyboard-N.jpg:

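#!/usr/bin/env bash
# scripts/make_sheets.sh
set -euo pipefail

DUR=$(ffprobe -v error -show_entries format=duration -of csv=p=0 input.mp4)
WINDOW=1000           # 100 tiles * 10s
i=0
start=0
# DUR is a decimal, and [ ] only compares integers, so let bc do the comparison
while [ "$(echo "$start < $DUR" | bc)" -eq 1 ]; do
  # -y overwrites sheets left over from a previous run instead of prompting
  ffmpeg -y -ss "$start" -t "$WINDOW" -i input.mp4 \
    -vf "fps=1/10,scale=160:-1,tile=10x10" \
    -frames:v 1 "storyboard-${i}.jpg"
  i=$((i+1))
  start=$((start+WINDOW))
done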

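💡 Put -ss before -i so FFmpeg seeks in the input: it jumps to the nearest keyframe before the target and, because we're re-encoding, decodes and discards frames up to the requested timestamp. That's far cheaper than decoding from the start of the file on every window.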
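
If your pipeline already lives in Node, the same loop can be expressed as a function that builds one argument list per window. This is a sketch under the article's settings (10s interval, 10x10 grid, 160px tiles); sheetCommands is a hypothetical helper, and the flags are the ones used in the shell script above.

```javascript
// Hypothetical helper: returns one ffmpeg argv array per 1000-second window.
function sheetCommands(durationSec, input = "input.mp4") {
  const interval = 10, cols = 10, rows = 10, tileW = 160;
  const windowSec = cols * rows * interval;   // 100 tiles * 10s
  const commands = [];
  for (let i = 0, start = 0; start < durationSec; i++, start += windowSec) {
    commands.push([
      "ffmpeg", "-ss", String(start), "-t", String(windowSec), "-i", input,
      "-vf", `fps=1/${interval},scale=${tileW}:-1,tile=${cols}x${rows}`,
      "-frames:v", "1", `storyboard-${i}.jpg`,
    ]);
  }
  return commands;
}

// A 2500s video yields three windows: 0, 1000, and 2000.
for (const argv of sheetCommands(2500)) console.log(argv.join(" "));
```

Keeping the arguments as an array means each command can be handed to execFileSync(argv[0], argv.slice(1)) from node:child_process without any shell quoting.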

5. Wire it into the player

Video.js with the videojs-vtt-thumbnails plugin reads the VTT directly:

<!-- index.html -->
<link href="https://vjs.zencdn.net/8.10.0/video-js.css" rel="stylesheet" />
<video id="player" class="video-js" controls preload="auto" width="800">
  <source src="https://cdn.example.com/video/master.m3u8" type="application/x-mpegURL" />
</video>

Then register the plugin and point it at the VTT:

// app/player.js
import videojs from "video.js";
import "videojs-vtt-thumbnails";

const player = videojs("player");

player.vttThumbnails({
  src: "https://cdn.example.com/video/storyboard.vtt",
});

That's the whole frontend. The plugin parses the #xywh fragments and crops the sprite as you hover.

Rolling your own on a custom seek bar is just as small: parse the VTT once, then on mousemove over the bar, find the cue whose range contains the hovered time and set the preview element's background-image + background-position from the fragment.

// app/customSeekPreview.js (sketch)
function showPreview(hoverTimeSec, cues, el) {
  const cue = cues.find(c => hoverTimeSec >= c.start && hoverTimeSec < c.end);
  if (!cue) return;
  const { img, x, y, w, h } = cue;            // parsed from #xywh
  el.style.width = `${w}px`;
  el.style.height = `${h}px`;
  el.style.backgroundImage = `url(${img})`;
  el.style.backgroundPosition = `-${x}px -${y}px`;
}
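
The "parse the VTT once" half could look roughly like this. It's a minimal sketch that only understands the cue shape this article generates (a timing line followed by an image#xywh line), not a general WebVTT parser; parseThumbnailVtt and toSeconds are hypothetical names, and the output matches the { start, end, img, x, y, w, h } shape showPreview expects.

```javascript
// Converts "hh:mm:ss.mmm" (or "mm:ss.mmm") to seconds.
function toSeconds(stamp) {
  return stamp.trim().split(":").map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Hypothetical parser for thumbnail VTT files only: timing line, then "image#xywh=x,y,w,h".
function parseThumbnailVtt(text) {
  const cues = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    if (!lines[i].includes("-->")) continue;
    const [startStamp, endStamp] = lines[i].split("-->");
    const target = (lines[i + 1] || "").trim();
    const match = target.match(/^(.*)#xywh=(\d+),(\d+),(\d+),(\d+)$/);
    if (!match) continue;                                  // skip cues without a sprite rectangle
    const [, img, x, y, w, h] = match;
    cues.push({
      start: toSeconds(startStamp),
      end: toSeconds(endStamp.trim().split(/\s+/)[0]),     // ignore any cue settings after the end time
      img, x: Number(x), y: Number(y), w: Number(w), h: Number(h),
    });
  }
  return cues;
}

const sample = [
  "WEBVTT", "",
  "00:00:00.000 --> 00:00:10.000", "storyboard-0.jpg#xywh=0,0,160,90", "",
  "00:00:10.000 --> 00:00:20.000", "storyboard-0.jpg#xywh=160,0,160,90", "",
].join("\n");

const cues = parseThumbnailVtt(sample);
console.log(cues.length, cues[1]);
```

The image paths in the VTT are relative, so in the browser you would likely resolve them against the VTT's own URL, for example with new URL(cue.img, vttUrl).href, before assigning the background image.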

6. Verify alignment before you ship

The single most common bug here is drift: the previews look right at the start and wander off near the end. It's almost always an interval mismatch between FFmpeg and the VTT. Catch it in ten seconds with a spot check instead of discovering it in QA.

Extract the frame your VTT claims is at a known timestamp, and eyeball it against the sprite tile:

# what does the real video look like at 5:00 (300s)?
ffmpeg -ss 300 -i input.mp4 -frames:v 1 -q:v 2 check-300.jpg

Then find the cue covering 300s in storyboard.vtt:

00:05:00.000 --> 00:05:10.000
storyboard-0.jpg#xywh=0,270,160,90

That's cue index 30, which sits in column 0, row 3 of the 10-column grid, so y = 3 × 90 = 270. Crop exactly that rectangle out of the sprite and compare:

ffmpeg -i storyboard-0.jpg -vf "crop=160:90:0:270" tile-check.jpg

check-300.jpg and tile-check.jpg should show the same shot. If they don't, your INTERVAL, TILE_H, or grid math is wrong. Fix it now, because a confidently wrong preview is worse than no preview at all.

💡 Wire this into CI: render the cropped tile and the real frame for three timestamps, and fail the build if they diverge past a similarity threshold. Cheap insurance against a regression in your generator.
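
One way the similarity check could work is a mean absolute difference over raw pixels. The sketch below assumes both images have already been decoded to same-size 8-bit buffers (for example by having FFmpeg scale both to 160x90 and write them with -f rawvideo -pix_fmt gray). The function names and the threshold of 20 are hypothetical, so tune the cut-off on your own footage.

```javascript
// Mean absolute difference between two same-size 8-bit pixel buffers.
// 0 means identical, 255 means maximally different.
function meanAbsDiff(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("buffers must be non-empty and the same size");
  }
  let total = 0;
  for (let i = 0; i < a.length; i++) total += Math.abs(a[i] - b[i]);
  return total / a.length;
}

const THRESHOLD = 20; // hypothetical cut-off, not a recommended value

function framesMatch(a, b, threshold = THRESHOLD) {
  return meanAbsDiff(a, b) <= threshold;
}

// Tiny stand-in buffers; real ones would hold 160 * 90 bytes each.
const frame = Uint8Array.from([10, 200, 30, 40, 50, 60]);
const nearlySame = Uint8Array.from([12, 198, 33, 41, 50, 58]);
const different = Uint8Array.from([250, 5, 200, 220, 0, 255]);

console.log(framesMatch(frame, nearlySame)); // true
console.log(framesMatch(frame, different));  // false
```

A plain pixel difference is crude, but JPEG recompression and scaling only nudge values slightly, while a wrong tile usually shows a different shot entirely, so the gap between "same" and "different" tends to be wide.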

Gotchas checklist

  • [ ] VTT interval must equal the FFmpeg fps=1/N. Mismatch = drift.
  • [ ] Watch the browser image-size cap; split into multiple sheets for long video.
  • [ ] Set TILE_H to your real aspect ratio, or the crop rectangles are wrong.
  • [ ] Cache sprite + VTT on your CDN; they're static and shared across all viewers.
  • [ ] Drop JPEG quality a notch (-q:v 5 or so) - nobody pixel-peeps a hover preview.
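
For the TILE_H item, the value can be derived from the source dimensions rather than guessed. This sketch assumes square pixels and that scale=160:-1 rounds the computed height to the nearest integer; tileHeight is a hypothetical helper, so confirm the result against the real sheet (its height divided by the row count) before relying on it.

```javascript
// Hypothetical helper: estimates the tile height scale=<tileW>:-1 produces.
// Assumes square pixels and nearest-integer rounding.
function tileHeight(srcW, srcH, tileW = 160) {
  return Math.round((tileW * srcH) / srcW);
}

console.log(tileHeight(1920, 1080)); // 90  (16:9)
console.log(tileHeight(1440, 1080)); // 120 (4:3)
console.log(tileHeight(1920, 800));  // 67  (2.40:1)
```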

What's next

  • Hang sprite generation off the same worker that runs your transcode, so every upload gets previews automatically.
  • Try a WebP sprite for smaller files at the same visual quality.
  • The same sprite + VTT pattern powers chapter markers and share-card previews, so the worker pays for itself more than once.
