How to Normalize Audio with FFmpeg (Volume, Loudnorm, and API)

Originally published at ffmpeg-micro.com

FFmpeg audio normalization trips up even experienced developers. The loudnorm filter alone has roughly a dozen options, a two-pass workflow that requires parsing JSON from stderr, and behavior that changes depending on whether you feed it MP4, WAV, or MKV. Most guides skip the hard parts.

This post covers three approaches to fixing audio levels with FFmpeg: simple volume scaling, broadcast-standard loudnorm normalization (EBU R128), and a cloud API that handles it without installing anything.

Quick answer: normalize audio to -16 LUFS with FFmpeg

ffmpeg -i input.mp4 -af "loudnorm=I=-16:TP=-1.5:LRA=11" -c:v copy output.mp4

This applies EBU R128 loudnorm normalization targeting -16 LUFS, a common target for podcasts and general streaming (YouTube and Spotify sit closer to -14). The -c:v copy flag passes video through untouched so only the audio gets re-encoded.

Approach 1: Simple volume adjustment with the volume filter

The volume filter is the simplest way to change audio levels. It multiplies every sample by a fixed factor.

Halve the volume:

ffmpeg -i input.mp4 -af "volume=0.5" -c:v copy output.mp4

Double the volume:

ffmpeg -i input.mp4 -af "volume=2.0" -c:v copy output.mp4

Adjust by decibels (+6 dB is roughly double the amplitude):

ffmpeg -i input.mp4 -af "volume=6dB" -c:v copy output.mp4

The volume filter is predictable but dumb. It doesn't analyze the source audio first, so you can easily clip loud sections or amplify noise in quiet ones. For consistent levels across multiple files, you need actual normalization.
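
The multiplier and decibel forms of the volume filter are related by the standard amplitude formula, factor = 10^(dB/20). A small sketch of that conversion, assuming a POSIX shell with awk; db_to_factor is a made-up helper name, not part of FFmpeg:

```shell
#!/bin/sh
# Convert a gain in decibels to the equivalent linear amplitude multiplier.
# db_to_factor is a hypothetical helper name used only for this illustration.
db_to_factor() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", 10 ^ (db / 20) }'
}

db_to_factor 6     # close to 2, i.e. about the same as volume=2.0
db_to_factor -6    # close to 0.5, i.e. about the same as volume=0.5
```

This is why volume=6dB and volume=2.0 sound nearly identical: both roughly double every sample, and both will clip any sample that is already above half of full scale.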

Approach 2: EBU R128 normalization with loudnorm

The loudnorm filter implements loudness normalization per the EBU R128 broadcast standard. Instead of blindly scaling amplitude, it measures perceived loudness (in LUFS) and adjusts gain to hit a target level.

Single-pass normalization (good enough for most cases):

ffmpeg -i input.mp4 -af "loudnorm=I=-16:TP=-1.5:LRA=11" -c:v copy output.mp4

What the parameters mean:

I=-16 sets the target integrated loudness to -16 LUFS. YouTube uses -14, Spotify uses -14, podcasts typically use -16 to -19.
TP=-1.5 sets the true peak ceiling to -1.5 dBTP. This leaves headroom for lossy codecs like AAC and MP3, which can overshoot 0 dBFS during decoding.
LRA=11 sets the target loudness range to 11 LU. This controls how much dynamic range the output keeps.

Two-pass normalization (more accurate):

Pass 1 - Analyze and capture stats:

ffmpeg -i input.mp4 -af "loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json" -f null - 2>&1 | tail -12

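Pass 2 - Apply the measured values. The JSON fields input_i, input_tp, input_lra, input_thresh, and target_offset from pass 1 map to measured_I, measured_TP, measured_LRA, measured_thresh, and offset: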

ffmpeg -i input.mp4 -af "loudnorm=I=-16:TP=-1.5:LRA=11:measured_I=-19.6:measured_TP=-5.9:measured_LRA=0.6:measured_thresh=-29.6:offset=0.1:linear=true" -c:v copy output.mp4
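
Copying five numbers by hand gets old quickly. A minimal sketch of wiring pass 1 into pass 2, assuming POSIX sh and sed; json_field and pass2_filter are hypothetical helper names, and the sample JSON is trimmed and illustrative, reusing the values from the command above:

```shell
#!/bin/sh
# Sketch: turn loudnorm's pass 1 JSON into the pass 2 filter string.
# json_field and pass2_filter are hypothetical helper names, not FFmpeg features.

# Print the value of one key from loudnorm's flat JSON (every value is a quoted string).
# $1 = text containing the JSON, $2 = key name
json_field() {
    printf '%s\n' "$1" |
        sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Build the pass 2 filter, keeping the same I/TP/LRA targets used in pass 1.
# $1 = text containing the pass 1 JSON
pass2_filter() {
    printf 'loudnorm=I=-16:TP=-1.5:LRA=11:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
        "$(json_field "$1" input_i)" \
        "$(json_field "$1" input_tp)" \
        "$(json_field "$1" input_lra)" \
        "$(json_field "$1" input_thresh)" \
        "$(json_field "$1" target_offset)"
}

# Illustrative sample, trimmed to the fields pass 2 needs.
sample='{
    "input_i" : "-19.60",
    "input_tp" : "-5.90",
    "input_lra" : "0.60",
    "input_thresh" : "-29.60",
    "target_offset" : "0.10"
}'
pass2_filter "$sample"

# Against a real file (left commented out so the sketch runs without FFmpeg):
# stats=$(ffmpeg -hide_banner -i input.mp4 -af "loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json" -f null - 2>&1)
# ffmpeg -i input.mp4 -af "$(pass2_filter "$stats")" -c:v copy output.mp4
```

Because the sed pattern matches on the quoted key names, the helper can take the whole stderr capture and does not depend on the JSON being exactly the last 12 lines.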

Which approach should you use?

Scenario	Best approach	Why
Quick fix on one file	volume filter	Simple, predictable, no analysis needed
Batch normalize for a platform	loudnorm single-pass	EBU R128 standard, good enough for streaming
Mastering or archival quality	loudnorm two-pass	Most accurate, preserves dynamics
Automated pipeline	Cloud FFmpeg API	No installation, scales with your workflow

Common pitfalls

Forgetting -c:v copy and re-encoding the entire video. Without it, FFmpeg re-encodes the video stream too. Depending on codec and resolution, this can take 10-50x longer, and it costs a generation of video quality for no benefit.

Using volume instead of loudnorm for batch processing. The volume filter applies the same gain to every file. loudnorm measures each file and targets a consistent output level.
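
For the batch case, a loop along these lines applies the same loudnorm target to every file in a folder. This is a sketch, assuming POSIX sh and MP4 inputs; normalize_dir and the FFMPEG override variable are made-up names, and the override exists only so the loop can be dry-run with echo:

```shell
#!/bin/sh
# Sketch: single-pass loudnorm over every MP4 in a directory.
# normalize_dir and the FFMPEG variable are hypothetical names for this example.
# $1 = source directory, $2 = destination directory
normalize_dir() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*.mp4; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # -nostdin keeps ffmpeg from reading the terminal while inside a loop
        ${FFMPEG:-ffmpeg} -nostdin -i "$f" \
            -af "loudnorm=I=-16:TP=-1.5:LRA=11" \
            -c:v copy "$dst/$(basename "$f")"
    done
}

# Dry run, printing the arguments instead of encoding:
#   FFMPEG=echo normalize_dir ./raw ./normalized
# Real run:
#   normalize_dir ./raw ./normalized
```

Writing to a separate output directory keeps the originals intact, which matters because normalization is not something you want to apply twice.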

Setting the true peak ceiling to 0 dBTP. Lossy codecs generate intersample peaks during decoding that can exceed the encoded peak by 1-3 dB, so a file that peaks at exactly 0 can clip on playback. Leave headroom, for example TP=-1.5.

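Running loudnorm on already-normalized audio. Double-normalizing can compress dynamics further with each pass. Check levels first: an analysis pass reports the file's integrated loudness as input_i, and a file already near the target can be skipped.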
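
One way to make that check scriptable is to compare the measured loudness against the target and skip files that are already close. A sketch, assuming POSIX sh with awk; needs_normalizing is a hypothetical helper name and the default 1 LU tolerance is an arbitrary choice, not a standard:

```shell
#!/bin/sh
# Succeeds (exit 0) when the measured loudness is more than the tolerance away
# from the target, fails (exit 1) when the file is already close enough.
# needs_normalizing is a hypothetical helper; the 1 LU default is an assumption.
# $1 = measured integrated loudness (LUFS), e.g. input_i from an analysis pass
# $2 = target loudness (LUFS)
# $3 = tolerance in LU (optional, default 1)
needs_normalizing() {
    awk -v m="$1" -v t="$2" -v tol="${3:-1}" 'BEGIN {
        d = m - t
        if (d < 0) d = -d          # absolute distance from the target
        exit ((d > tol) ? 0 : 1)
    }'
}

if needs_normalizing -19.6 -16; then
    echo "normalize"
else
    echo "skip"
fi
```

With the sample values, the file sits 3.6 LU below the target, so the check passes and the script prints normalize.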

FAQ

What LUFS level should I target for YouTube?
YouTube normalizes playback to around -14 LUFS, turning down anything louder; quieter uploads are generally left as they are rather than boosted. Targeting -14 means YouTube won't touch your audio. -16 is close enough that the difference is small.

Can I normalize audio without re-encoding the video?
Yes. Use -c:v copy to pass the video stream through unchanged. Only the audio gets re-encoded.

What is the difference between loudnorm and dynaudnorm?
loudnorm targets a specific LUFS level per the EBU R128 standard. dynaudnorm adjusts gain frame by frame toward a peak level rather than a loudness target, and can introduce pumping artifacts. For most use cases, loudnorm is the right choice.