Most teams don’t realize how expensive “just 30 seconds of video” can become — until it silently turns into terabytes of storage and a massive monthly bill.
That’s exactly what happened to us.
We were storing millions of short verification clips, each only ~30 seconds long — but together they were adding hundreds of GBs to our storage every month.
Instead of throwing money at storage, we decided to fix the root problem.
The result?
👉 20–30 GB saved per day
👉 60% average compression
👉 Zero impact on review quality
At my company, we run a video-based verification flow — users record short clips (typically 30 seconds) that get reviewed by our team. Simple enough. But over time, those clips add up fast.
We had millions of recorded videos in storage with a retention requirement of 2–4 years. The storage bill kept climbing. The turning point came when I actually measured what we were storing — and realized we were saving far more data than we needed to.
This is the story of how I built a pipeline that now compresses 2,000+ videos per day and saves 20–30 GB of storage daily — without touching a single live file.
Real Numbers from Production
Before going into the how, here's what one day's run looks like on our compression dashboard:
| Metric | Value |
| --- | --- |
| Videos processed | 2,203 |
| Total space saved | 15.66 GB |
| Average compression ratio | 62.1% |
| Best compression recorded | 99.92% |
An average reduction of 62% in file size. Some clips (usually silent or nearly-static ones) compress all the way to 99.9%. Even on a typical day, we're saving 15+ GB from a single cron run.
The Problem: Over-Engineered for the Wrong Use Case
Our original recording setup used browser defaults — high resolution, high framerate, stereo audio at 48kHz. Reasonable general-purpose settings, but completely overkill for a 30-second face verification clip.
When the backlog started growing into the millions of files, the cost became impossible to ignore. But before writing a single line of compression code, I needed to understand why the files were so large — and fix the source before compressing the backlog.
Step 1: Research, Experiment, Validate
I didn't just pick lower settings and ship them. I researched codec behavior, tested different bitrate and quality combinations, and — critically — validated with real recordings before touching user traffic.
The validation approach: Our verifiers (the people who review verification videos) also record their own video profiles. This was the perfect test population — small, controlled, and internal. I rolled out the new recording constraints to verifier accounts first and asked them to record clips longer than a typical verification video.
Why longer clips? Because problems with codec settings, device compatibility, or audio quality tend to hide in short clips but surface in longer ones. Buffer issues, audio sync drift, encoding artifacts under motion — these all become visible when you push the duration.
Once the verifier videos came back clean across a range of devices (Chrome on Android, Chrome on desktop, Safari on iOS), I had confidence the settings were correct. Only then did I roll them out to user-facing recording.
Step 2: Optimize Video Recording at the Source
Understanding Browser Codec Support (VP8 vs VP9)
Modern browsers support two primary video codecs for MediaRecorder:
VP8 — the older WebM codec. Widely supported across all browsers and platforms, including older Android devices. Less efficient compression than VP9 but extremely reliable.
VP9 — Google's newer codec. Significantly better compression at the same quality level, especially for low-motion content like talking heads. Supported in Chrome, Firefox, and most modern browsers, but not always available on older devices or in certain Safari contexts.
Why fallback matters: If you specify video/webm;codecs=vp9,opus and the browser or device doesn't support VP9, MediaRecorder will either throw an error or silently produce a broken file. A hardcoded codec choice that works on your test device will fail on some percentage of real users' devices. You need to probe support at runtime:
```javascript
const mimeTypes = [
  'video/webm;codecs=vp9,opus', // Best: VP9 video + Opus audio
  'video/webm;codecs=vp8,opus', // Good: VP8 video + Opus audio
  'video/webm'                  // Fallback: browser chooses
];

const mimeType = mimeTypes.find(type => MediaRecorder.isTypeSupported(type)) || '';
```
MediaRecorder.isTypeSupported() checks at runtime whether the current browser/device combination actually supports a given MIME type. The list is ordered by preference — VP9 first, VP8 second, browser default last. Whatever the device supports, it gets the best available option.
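The probe logic can be exercised outside a browser by injecting the support check — here `pickMimeType` and `isSupported` are illustrative names standing in for the inline `find` above and for `MediaRecorder.isTypeSupported`:

```javascript
// Pick the first supported MIME type from a preference-ordered list.
// `isSupported` stands in for MediaRecorder.isTypeSupported so the
// selection logic can run (and be tested) outside a browser.
function pickMimeType(candidates, isSupported) {
  return candidates.find(type => isSupported(type)) || '';
}

const candidates = [
  'video/webm;codecs=vp9,opus',
  'video/webm;codecs=vp8,opus',
  'video/webm',
];

// Simulate an older device that has VP8 but no VP9:
const vp8Only = type => !type.includes('vp9');
console.log(pickMimeType(candidates, vp8Only)); // 'video/webm;codecs=vp8,opus'
```

On a device with no WebM support at all, the function returns `''`, which tells `MediaRecorder` to use its own default rather than throwing on an unsupported type.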
The actual recording constraints
```javascript
const constraints = {
  video: {
    width: { ideal: 640 },
    height: { ideal: 480 },
    frameRate: { ideal: 24 }
  },
  audio: {
    sampleRate: 16000,
    channelCount: 1, // mono
    echoCancellation: true,
    noiseSuppression: true
  }
};

const mediaRecorder = new MediaRecorder(stream, {
  mimeType,
  videoBitsPerSecond: 1_000_000, // 1 Mbps cap
  audioBitsPerSecond: 64_000     // 64 Kbps cap
});
```
Why each decision:
640×480 at 24fps — Sufficient to clearly identify a face. 30fps and 1080p are for content you'll watch repeatedly; a verification clip gets reviewed once.
16kHz mono audio — Voice intelligibility tops out around 8kHz. 16kHz captures speech clearly with half the data of 44.1kHz stereo. The verifier needs to hear what someone says, not enjoy high-fidelity audio.
1 Mbps video / 64 Kbps audio — A cap at the recorder level. Without this, VP8 in particular can spike well above reasonable bitrates on some devices.
echoCancellation + noiseSuppression — These reduce audio variance, which compresses better downstream. Noisy or echoey audio encodes less efficiently.
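The caps also give you a worst-case size per clip, which is worth sanity-checking before shipping. Assuming the encoder sits at both caps for the full 30 seconds (real clips usually come in well under this):

```javascript
// Worst-case size of one 30-second clip at the capped bitrates.
const videoBps = 1_000_000; // videoBitsPerSecond cap
const audioBps = 64_000;    // audioBitsPerSecond cap
const seconds = 30;

const totalBytes = ((videoBps + audioBps) * seconds) / 8;
console.log((totalBytes / 1_000_000).toFixed(2) + ' MB'); // "3.99 MB"
```

So even the worst case is under 4 MB per clip — a hard ceiling the browser defaults never gave us.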
Step 3: Compress the Backlog
Fixing the source stopped the problem from growing. But millions of old files still existed at the original large sizes. I built a three-layer Laravel pipeline to work through them.
Layer 1 — Artisan Command
```shell
php artisan media:compress-videos --chunk=200 --dry-run

# or target a single video for testing:
php artisan media:compress-videos --id=202423
```
The command queries all records with a video/% MIME type, chunks them into batches of 200, and dispatches a queued job per video. The --dry-run flag lets you see what would be processed without doing anything — useful before running on production for the first time.
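The chunk-then-dispatch idea is simple enough to sketch outside Laravel — here in JavaScript, with `dispatchJob` as a hypothetical stand-in for dispatching one queued job per video:

```javascript
// Split a list of IDs into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Walk the batches; in dry-run mode, count without dispatching anything.
function run(videoIds, { chunkSize = 200, dryRun = false, dispatchJob }) {
  let wouldProcess = 0;
  for (const batch of chunk(videoIds, chunkSize)) {
    for (const id of batch) {
      if (!dryRun) dispatchJob(id); // one queued job per video
      wouldProcess++;
    }
  }
  return wouldProcess;
}

const ids = [201, 202, 203, 204, 205];
console.log(run(ids, { chunkSize: 2, dryRun: true, dispatchJob: () => {} })); // 5
```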
Layer 2 — Queued Job
```php
class CompressMediaVideo implements ShouldQueue
{
    public int $tries = 3;
    public int $timeout = 120;

    // The job carries only the media ID, not the model itself
    public function __construct(private int $mediaId) {}

    public function handle(VideoCompressorService $compressor): void
    {
        $media = Media::findOrFail($this->mediaId);

        if (!str_starts_with($media->mime_type, 'video/')) {
            return; // skip safely
        }

        $compressor->compress($media);
    }
}
```
One job per video. If a job fails, it retries independently without affecting the others. The compression run can take months on a backlog of millions — having individual, retryable jobs means a server restart or a bad file doesn't set you back to zero.
Layer 3 — Deep Dive into FFmpeg Compression Strategy
This is where the actual compression happens. Some context on how it works:
What FFmpeg is doing: It reads the original WebM file (VP8 or VP9 video + Opus audio, as recorded by the browser), re-encodes the video track with VP9 at more aggressive settings, re-encodes the audio with Opus at a lower bitrate, and writes a new WebM container.
Why VP9 for compression? VP9 uses much more sophisticated prediction and entropy coding than VP8 — it's better at identifying redundant information across frames and removing it. For a mostly-static scene like a person talking against a wall, VP9 can throw away enormous amounts of data while keeping the image perfectly clear to a human reviewer.
What CRF means: CRF (Constant Rate Factor) is quality-controlled encoding. Instead of targeting a specific bitrate, you specify a quality level and FFmpeg decides how many bits are needed to achieve it. Lower CRF = higher quality = larger file. Higher CRF = more compression = smaller file.
For VP9, CRF 33 is roughly "good quality." CRF 35 is "slightly aggressive." CRF 40+ starts to look visibly degraded. I chose CRF 35 because:
It's a face verification clip, not a film
The reviewer needs to see the face clearly, not count pores
Our production data confirms it works — the 62.1% average compression ratio came from CRF 35
```php
$command = [
    'ffmpeg', '-i', $inputPath,
    '-c:v', 'libvpx-vp9',
    '-crf', '35',
    '-b:v', '0',      // Required for CRF mode in VP9
    '-c:a', 'libopus',
    '-b:a', '64k',
    '-y', $tempPath
];
```
Note: -b:v 0 is required when using CRF mode with libvpx-vp9. Without it, libvpx-vp9 falls back to a constrained-quality mode with a low default target bitrate, and the CRF value no longer controls the output the way you'd expect.
Safe Processing with Atomic File Replacement:
```php
$exitCode = $process->run();

if ($exitCode !== 0) {
    @unlink($tempPath);
    // Log and store error output for debugging
    $media->update(['compressed_meta' => ['error' => $process->getErrorOutput()]]);
    return;
}

// Only replace the original after successful compression
rename($tempPath, $inputPath);
```
FFmpeg writes to a .tmp file first. If encoding fails for any reason — corrupt input, disk full, process killed — the original file is untouched. The rename() only happens after a confirmed successful exit code. In a pipeline processing millions of files, this matters.
The service also skips files that would grow: If the compressed output is larger than the original (which happens with already-optimized or very short files), saved_bytes is recorded as 0 and the original is kept. The dashboard's -130.33% low value represents exactly this case — a tiny file that got larger under re-encoding.
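One plausible reading of the dashboard math — the ratio as the percentage of bytes removed — reproduces both the ~62% typical case and a -130.33%-style outlier. The function names and the sample byte counts below are illustrative, not the service's actual code:

```javascript
// Percentage of bytes removed: positive = shrank, negative = grew.
function compressionRatio(originalBytes, compressedBytes) {
  return (1 - compressedBytes / originalBytes) * 100;
}

// The "skip files that would grow" rule: never record negative savings.
function savedBytes(originalBytes, compressedBytes) {
  return compressedBytes >= originalBytes ? 0 : originalBytes - compressedBytes;
}

console.log(compressionRatio(100, 38).toFixed(1));    // "62.0"  — typical clip
console.log(compressionRatio(3000, 6910).toFixed(2)); // "-130.33" — a clip that grew
console.log(savedBytes(3000, 6910));                  // 0 — original kept
```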
Step 4: Storage Lifecycle Management (Trash + Cleanup Strategy)
When a verification video is rejected and the user records a new one, we don't immediately delete the old clip. It gets moved to a trash/ directory. A weekly cron job permanently removes everything older than 7 days in trash.
This gives the ops team a recovery window without keeping rejected videos indefinitely. It also keeps the compression pipeline clean — trash videos are excluded from the compression queue since they're headed for deletion anyway.
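The weekly purge is a one-liner around `find`. This is a sketch, not our actual cron entry — the `trash` directory name here is an assumption:

```shell
# Hypothetical weekly cleanup: permanently remove anything in the
# trash directory older than 7 days. TRASH_DIR is an assumed path.
TRASH_DIR="${TRASH_DIR:-trash}"
mkdir -p "$TRASH_DIR"
find "$TRASH_DIR" -type f -mtime +7 -delete
```

`-mtime +7` matches files whose modification time is more than 7 days old, so a freshly trashed clip survives the full recovery window.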
Results
One day's run (March 11, 2026):
2,203 videos compressed
15.66 GB saved
62.1% average compression ratio
Projected at scale:
At ~20 GB/day average: 600 GB/month, 7+ TB/year
The backlog has millions of videos — this pipeline will run for months
New uploads are already smaller thanks to the recording constraint changes
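The projection is straightforward arithmetic from the daily average:

```javascript
// Projecting the ~20 GB/day average out to a month and a year.
const gbPerDay = 20;
const gbPerMonth = gbPerDay * 30;          // 600 GB/month
const tbPerYear = (gbPerDay * 365) / 1000; // 7.3 TB/year — the "7+ TB" above
console.log(gbPerMonth, tbPerYear);
```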
Key Takeaways
Validate on internal users first. Rolling out recording changes to verifier accounts before users meant any issues would be caught internally. Testing with longer clips specifically surfaces problems that short clips hide.
Probe codec support at runtime. Never hardcode a MIME type. MediaRecorder.isTypeSupported() is a one-liner that ensures every device gets the best encoding it can handle.
CRF over bitrate caps. A fixed bitrate wastes bits on simple scenes and starves complex ones. CRF adapts to content — simple talking-head frames get very small, complex frames get what they need.
-b:v 0 is not optional for VP9 CRF. This trips up a lot of people. VP9 in FFmpeg defaults to bitrate mode; CRF only activates when you explicitly set -b:v 0.
Atomic writes protect your data. In bulk processing, things fail. Write to temp, verify success, then rename. Never write directly to the live path.
Queue per file, not per batch. Individual queued jobs make the pipeline restartable, observable, and fault-tolerant. A bad file fails in isolation; everything else keeps running.
I write about backend engineering and fintech systems at shakiltech.com. If this was useful, the next post covers the Laravel queue architecture behind this pipeline.