snowlyg

Originally published at lodan.me

Server-Side WebRTC Noise Reduction with Pion, FFmpeg, and RNN Models

This is a sanitized engineering note about server-side audio noise reduction for WebRTC calls.

Source article:
https://www.lodan.me/posts/server-side-webrtc-noise-reduction-pion-ffmpeg-rnn/

What the prototype tests

The goal is not to replace WebRTC's built-in audio processing. The prototype tests a narrower path:

  • receive a WebRTC Opus track with Pion
  • read RTP packets in OnTrack
  • decode Opus payloads to PCM
  • pipe raw PCM into FFmpeg
  • apply the arnndn RNN noise reduction filter
  • validate the output as a file before considering real-time forwarding
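
The FFmpeg end of the steps above can be sketched as follows. This is a minimal illustration, not the prototype's actual code: the helper names ffmpegArgs and startFFmpeg, the 48 kHz mono settings, and the file names model.rnnn and denoised.wav are all assumptions made for the example. Receiving the track with Pion and decoding Opus are left out because they depend on third-party packages.

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
	"strconv"
)

// ffmpegArgs builds an argument list that reads headerless PCM from stdin,
// applies the arnndn filter, and writes a file. All values are supplied by
// the caller; nothing here is taken from the original prototype.
func ffmpegArgs(sampleRate, channels int, modelPath, outPath string) []string {
	return []string{
		"-hide_banner",
		"-f", "s16le", // raw signed 16-bit little-endian input
		"-ar", strconv.Itoa(sampleRate), // must match the decoder output
		"-ac", strconv.Itoa(channels),
		"-i", "pipe:0", // read PCM from stdin
		"-af", "arnndn=m=" + modelPath, // RNN noise reduction model file
		"-y", outPath,
	}
}

// startFFmpeg launches FFmpeg (assumed to be on PATH) and returns the pipe
// that decoded PCM should be written to. Closing the pipe ends the input;
// the caller should then call cmd.Wait.
func startFFmpeg(args []string) (*exec.Cmd, io.WriteCloser, error) {
	cmd := exec.Command("ffmpeg", args...)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return nil, nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, nil, err
	}
	return cmd, stdin, nil
}

func main() {
	// Only print the command here; startFFmpeg is what a real run would call.
	args := ffmpegArgs(48000, 1, "model.rnnn", "denoised.wav")
	fmt.Println("ffmpeg", args)
}
```

Keeping the argument list in one function makes the declared input format easy to review next to the code that writes the PCM. A model path containing characters that are special in FFmpeg filter syntax, such as a colon, would need escaping that this sketch does not attempt.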

Why this boundary matters

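RTP packets, Opus frames, decoded PCM, and FFmpeg's raw audio input are four different representations, and each hand-off between them is a boundary where a mismatch can go unnoticed. Raw PCM has no header, so FFmpeg cannot detect a wrong sample format, sample rate, or channel count. If the PCM format is wrong, FFmpeg may still produce a file, but the result should not be trusted.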

For example, if the Go side writes int16 PCM in little-endian byte order, the FFmpeg input format should be declared as s16le, not s32le.
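
One way to make that match explicit is to serialize the samples with a fixed byte order on the Go side. The sketch below assumes the decoder yields a []int16 slice; the helper name pcmToS16LE is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcmToS16LE serializes int16 samples as signed 16-bit little-endian bytes,
// which is the layout FFmpeg expects when the input is declared as s16le.
func pcmToS16LE(samples []int16) []byte {
	buf := make([]byte, 2*len(samples))
	for i, s := range samples {
		// uint16(s) keeps the two's-complement bit pattern of the sample.
		binary.LittleEndian.PutUint16(buf[2*i:], uint16(s))
	}
	return buf
}

func main() {
	fmt.Printf("% x\n", pcmToS16LE([]int16{1, -2, 256}))
	// prints: 01 00 fe ff 00 01
}
```

Each sample takes exactly two bytes. Declaring the same stream as s32le would make FFmpeg read two consecutive samples as one 32-bit value, halving the apparent duration and producing noise rather than an error.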

Production concerns

The prototype is useful because it isolates the audio path, but production use needs more work:

  • buffering and latency
  • CPU and memory isolation
  • FFmpeg process lifecycle
  • model choice
  • packet loss and jitter
  • RTP timestamps
  • audio/video sync
  • whether the processed audio is returned to WebRTC or only recorded
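
As one illustration of the packet loss and RTP timestamp items, the sketch below estimates how much audio is missing between two received packets. It relies on the Opus RTP clock running at 48000 Hz (RFC 7587). The function name missingSamples is hypothetical, and what to do about a gap (insert silence, use Opus loss concealment, or drop the segment) is left open.

```go
package main

import "fmt"

// opusClockRate is the RTP timestamp clock for Opus, fixed at 48000 Hz
// by RFC 7587 regardless of the actual encoding bandwidth.
const opusClockRate = 48000

// missingSamples estimates how many samples per channel are absent between
// two packets. prevSamples is the duration of the previous packet in
// samples (960 for a 20 ms frame). Late or duplicate packets yield 0.
func missingSamples(prevTS, curTS uint32, prevSamples int) int {
	// Unsigned subtraction wraps correctly at 2^32; the signed view makes
	// a reordered (older) packet show up as a negative delta.
	delta := int(int32(curTS - prevTS))
	if delta <= prevSamples {
		return 0
	}
	return delta - prevSamples
}

func main() {
	// Two 20 ms packets were lost between timestamps 0 and 2880.
	gap := missingSamples(0, 2880, 960)
	fmt.Printf("missing %d samples (%d ms)\n", gap, gap*1000/opusClockRate)
	// prints: missing 1920 samples (40 ms)
}
```

A gap that is not compensated for shortens the PCM stream fed to FFmpeg, which is one way processed audio can drift out of sync with video.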

The full article has diagrams and the longer explanation:

https://www.lodan.me/posts/server-side-webrtc-noise-reduction-pion-ffmpeg-rnn/
