DEV Community

monkeymore studio

Building a Browser-Based AI Noise Reduction Tool with RNNoise

Have you ever recorded audio only to find it's filled with background hum, keyboard clicks, or street noise? Professional noise reduction software can be expensive and often requires uploading your files to the cloud. In this guide, I'll show you how we built an AI-powered noise reduction tool that runs entirely in your browser using RNNoise, the same technology behind our free online noise reduction tool.

Why Browser-Based Noise Reduction?

Before diving into the code, let's talk about why you'd want to process audio locally.

Your Audio Never Leaves Your Device

When you upload audio to cloud services for noise reduction, you're trusting third parties with your recordings. This is problematic for journalists with sensitive interviews, podcasters with unreleased episodes, or businesses with confidential meetings. Browser-based processing keeps everything local.

No Subscription Fees

Professional audio tools like iZotope RX or Adobe Audition can cost hundreds of dollars in licenses or subscriptions. RNNoise is open-source and runs locally, making it completely free to use.

Instant Processing

No upload queues or server wait times. Once the model is loaded, processing happens immediately on your device.

Privacy by Design

Whether you're cleaning up a voice memo, preparing a podcast, or restoring old recordings, your audio stays private.

The Architecture Overview

Our noise reduction tool uses RNNoise, a noise suppression library based on a recurrent neural network (RNN). The pieces fit together in three layers: a React UI that handles file upload, progress, and playback; the Web Audio API, which decodes the file and handles sample-rate conversion; and the RNNoise WASM module, which does the frame-by-frame denoising.

Understanding RNNoise: The AI Behind Clean Audio

RNNoise is a noise suppression library developed by Xiph.Org (the creators of Ogg and Opus). It uses a recurrent neural network to distinguish between speech and noise, effectively filtering out unwanted sounds while preserving voice quality.

How RNNoise Works

The model processes audio in small chunks (480 samples = 10ms at 48kHz) and:

  1. Feature Extraction: Converts audio to frequency-domain features
  2. RNN Processing: Feeds features through a recurrent neural network
  3. Gain Estimation: Predicts how much to suppress each frequency band
  4. Signal Reconstruction: Applies gains and converts back to time-domain

The result is audio with significantly reduced background noise while maintaining natural-sounding speech.
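Conceptually, steps 3 and 4 boil down to multiplying each frequency band by a learned gain between 0 and 1 (RNNoise groups the spectrum into 22 Bark-scale bands). Here's a toy sketch of gain application — not RNNoise's actual DSP, and `applyBandGains`/`bandEdges` are illustrative names of my own:

```typescript
// Toy illustration of per-band gain application (not RNNoise's real DSP).
// `spectrum` holds bin magnitudes; `bandEdges` maps bands to bin ranges;
// `gains` holds one suppression factor per band, as the RNN would predict.
function applyBandGains(
  spectrum: Float32Array,
  bandEdges: number[], // bandEdges[b]..bandEdges[b+1] = bins of band b
  gains: number[]      // one gain in [0, 1] per band
): Float32Array {
  const out = new Float32Array(spectrum.length);
  for (let b = 0; b < gains.length; b++) {
    for (let bin = bandEdges[b]; bin < bandEdges[b + 1]; bin++) {
      out[bin] = spectrum[bin] * gains[b]; // suppress noisy bands, keep speech
    }
  }
  return out;
}

// A band the network judges as pure noise gets a gain near 0; speech stays near 1.
const cleaned = applyBandGains(
  new Float32Array([1, 1, 1, 1]),
  [0, 2, 4],   // two bands: bins 0-1 and bins 2-3
  [1.0, 0.25]  // keep band 0, suppress band 1 to 25%
);
```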

Core Data Structures

Let's examine the key data structures in our noise reduction implementation:

Frame Size Constant

// RNNoise processes audio in 480-sample frames (10ms at 48kHz)
const RNNOISE_FRAME_SIZE = 480;

RNNoise is designed to work with 48kHz audio, processing 480 samples at a time (10ms frames).
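A little arithmetic makes the framing concrete. The helper below (my naming, not part of the tool's code) computes how many frames a clip occupies and how much zero-padding the final partial frame needs:

```typescript
const RNNOISE_FRAME_SIZE = 480; // 10ms at 48kHz

// How many 480-sample frames a clip needs, and the zero-padding
// required to fill the last partial frame.
function frameLayout(sampleCount: number) {
  const frames = Math.ceil(sampleCount / RNNOISE_FRAME_SIZE);
  const padding = frames * RNNOISE_FRAME_SIZE - sampleCount;
  return { frames, padding };
}

// One second of 48kHz audio is exactly 100 frames with no padding.
const oneSecond = frameLayout(48000); // { frames: 100, padding: 0 }
```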

React State Management

const [audioFile, setAudioFile] = useState<File | null>(null);
const [audioUrl, setAudioUrl] = useState<string | null>(null);
const [processedAudioUrl, setProcessedAudioUrl] = useState<string | null>(null);
const [isProcessing, setIsProcessing] = useState(false);
const [isLoadingModel, setIsLoadingModel] = useState(false);
const [progress, setProgress] = useState(0);
const [error, setError] = useState<string | null>(null);
const [processingTime, setProcessingTime] = useState<number | null>(null);

const rnnoiseRef = useRef<any>(null);
const fileInputRef = useRef<HTMLInputElement>(null);

We track the audio file, processing state, progress (0-100%), and timing information.

The Complete Processing Flow

Here's the entire journey from noisy audio to clean audio:

Loading the RNNoise Model

First, we load the RNNoise WASM module:

useEffect(() => {
  setMounted(true);

  // Load RNNoise WASM module
  const loadRNNoise = async () => {
    try {
      setIsLoadingModel(true);

      // Dynamically import the RNNoise module
      const rnnoiseModule = await import('@shiguredo/rnnoise-wasm');

      // Load RNNoise using the static load() method
      const rnnoise = await rnnoiseModule.Rnnoise.load();
      rnnoiseRef.current = rnnoise;

      setIsLoadingModel(false);
    } catch (err) {
      console.error("Error loading RNNoise:", err);
      setError(t.noiseReductionError || "Failed to load noise reduction model");
      setIsLoadingModel(false);
    }
  };

  loadRNNoise();

  return () => {
    if (audioUrl) {
      URL.revokeObjectURL(audioUrl);
    }
    if (processedAudioUrl) {
      URL.revokeObjectURL(processedAudioUrl);
    }
  };
}, []);

We use dynamic imports to load the WASM module and initialize RNNoise when the component mounts. One caveat: because the dependency array is empty, the cleanup function captures the initial (null) `audioUrl` and `processedAudioUrl` values, so the latest URLs should also be revoked elsewhere — for example in the `clearAudio` handler shown later.

The Noise Reduction Process

Here's the complete audio processing pipeline:

const processAudio = async () => {
  if (!audioFile || !audioUrl || !rnnoiseRef.current) return;

  setIsProcessing(true);
  setError(null);
  setProgress(0);

  const startTime = Date.now();

  try {
    // Create denoise state
    const denoiseState = rnnoiseRef.current.createDenoiseState();

    // Load audio file
    const arrayBuffer = await audioFile.arrayBuffer();
    const AudioContextClass = window.AudioContext || (window as unknown as { webkitAudioContext: typeof AudioContext }).webkitAudioContext;
    const audioContext = new AudioContextClass();
    const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

    // Convert to mono if stereo
    const sampleRate = audioBuffer.sampleRate;
    const numberOfChannels = audioBuffer.numberOfChannels;
    const length = audioBuffer.length;

    // Get mono data
    const inputData = new Float32Array(length);
    for (let i = 0; i < length; i++) {
      let sum = 0;
      for (let ch = 0; ch < numberOfChannels; ch++) {
        sum += audioBuffer.getChannelData(ch)[i];
      }
      inputData[i] = sum / numberOfChannels;
    }

    // Resample to 48kHz if needed (RNNoise works best at 48kHz)
    let resampledData: Float32Array = inputData;
    if (sampleRate !== 48000) {
      resampledData = await resampleAudio(inputData, sampleRate, 48000) as Float32Array;
    }

    // Process with RNNoise
    const denoisedData = new Float32Array(resampledData.length);

    // Process in frames of 480 samples
    const numFrames = Math.ceil(resampledData.length / RNNOISE_FRAME_SIZE);

    for (let frameIndex = 0; frameIndex < numFrames; frameIndex++) {
      const startIdx = frameIndex * RNNOISE_FRAME_SIZE;

      // Extract frame
      const frame = new Float32Array(RNNOISE_FRAME_SIZE);
      for (let i = 0; i < RNNOISE_FRAME_SIZE; i++) {
        frame[i] = resampledData[startIdx + i] || 0;
      }

      // Convert to 16-bit PCM for RNNoise
      const pcmFrame = floatToPCM(frame);

      // Process frame - returns VAD value, modifies pcmFrame in place
      denoiseState.processFrame(pcmFrame);

      // Convert back to float and store
      const denoisedFloat = pcmToFloat(pcmFrame);
      for (let i = 0; i < RNNOISE_FRAME_SIZE && (startIdx + i) < denoisedData.length; i++) {
        denoisedData[startIdx + i] = denoisedFloat[i];
      }

      // Update progress
      setProgress(Math.round((frameIndex / numFrames) * 100));
    }

    // Clean up denoise state
    denoiseState.destroy();

    // Resample back to original sample rate if needed
    let finalData: Float32Array = denoisedData;
    if (sampleRate !== 48000) {
      finalData = await resampleAudio(denoisedData, 48000, sampleRate) as Float32Array;
    }

    // Trim to original length
    finalData = finalData.slice(0, length) as Float32Array;

    // Create output AudioBuffer
    const outputBuffer = audioContext.createBuffer(1, finalData.length, sampleRate);
    (outputBuffer.copyToChannel as any)(finalData, 0);

    // Convert to WAV
    const wavBlob = audioBufferToWav(outputBuffer);
    const processedUrl = URL.createObjectURL(wavBlob);

    setProcessedAudioUrl(processedUrl);
    setProcessingTime((Date.now() - startTime) / 1000);
    setIsProcessing(false);
    setProgress(100);

  } catch (err: unknown) {
    console.error("Error processing audio:", err);
    const errorMessage = err instanceof Error ? err.message : String(err);
    setError(errorMessage || t.noiseReductionError);
    setIsProcessing(false);
  }
};

Key steps:

  1. Create Denoise State: Initializes the RNN for processing
  2. Decode Audio: Converts uploaded file to raw PCM data
  3. Mix to Mono: RNNoise works on mono audio
  4. Resample to 48kHz: RNNoise's optimal sample rate
  5. Frame Processing: Process 480-sample chunks through RNNoise
  6. Resample Back: Return to original sample rate
  7. Export: Convert to WAV for download
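The framing, zero-padding, and reassembly in steps 5–7 can be exercised without RNNoise at all by plugging in an identity "denoiser". This is a sketch with a made-up name (`processInFrames`), useful as a unit-testable core:

```typescript
const FRAME_SIZE = 480;

// Split `input` into zero-padded frames, run each through `processFrame`,
// and stitch the results back together, trimmed to the original length.
function processInFrames(
  input: Float32Array,
  processFrame: (frame: Float32Array) => Float32Array
): Float32Array {
  const out = new Float32Array(input.length);
  const numFrames = Math.ceil(input.length / FRAME_SIZE);
  for (let f = 0; f < numFrames; f++) {
    const start = f * FRAME_SIZE;
    const frame = new Float32Array(FRAME_SIZE); // zero-padded tail frame
    frame.set(input.subarray(start, Math.min(start + FRAME_SIZE, input.length)));
    const processed = processFrame(frame);
    for (let i = 0; i < FRAME_SIZE && start + i < out.length; i++) {
      out[start + i] = processed[i];
    }
  }
  return out;
}

// With an identity processor, the pipeline must return the input unchanged.
const signal = Float32Array.from({ length: 1000 }, (_, i) => Math.sin(i / 10));
const roundTrip = processInFrames(signal, (frame) => frame);
```

Keeping this loop pure (no React state, no WASM handles) makes the surrounding component much easier to test.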

Audio Format Conversions

RNNoise expects 16-bit PCM data, so we need conversion functions:

Float32 to PCM (Int16)

const floatToPCM = (floatArray: Float32Array): Int16Array => {
  const pcmArray = new Int16Array(floatArray.length);
  for (let i = 0; i < floatArray.length; i++) {
    const sample = Math.max(-1, Math.min(1, floatArray[i]));
    pcmArray[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
  }
  return pcmArray;
};

This converts floating-point audio (-1.0 to 1.0) to 16-bit integers (-32768 to 32767).

PCM to Float32

const pcmToFloat = (pcmArray: Int16Array): Float32Array => {
  const floatArray = new Float32Array(pcmArray.length);
  for (let i = 0; i < pcmArray.length; i++) {
    floatArray[i] = pcmArray[i] / 32768;
  }
  return floatArray;
};

This converts back from PCM to floating-point for Web Audio API compatibility.
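A quick sanity check on the two converters (reproduced inline so the snippet stands alone): values should survive Float32 → Int16 → Float32 to within one 16-bit step, and out-of-range input should clip rather than wrap around:

```typescript
// Same converters as above, inlined for a self-contained check.
const floatToPCM = (f: Float32Array): Int16Array => {
  const pcm = new Int16Array(f.length);
  for (let i = 0; i < f.length; i++) {
    const s = Math.max(-1, Math.min(1, f[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
};
const pcmToFloat = (p: Int16Array): Float32Array => {
  const f = new Float32Array(p.length);
  for (let i = 0; i < p.length; i++) f[i] = p[i] / 32768;
  return f;
};

// 1.5 is out of range and must clip to 32767, not wrap to a negative value.
const pcm = floatToPCM(new Float32Array([0.5, -1, 1.5]));
// pcm = [16383, -32768, 32767]
const back = pcmToFloat(pcm);
```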

Audio Resampling

RNNoise works best at 48kHz, so we resample if needed:

const resampleAudio = async (input: Float32Array, fromRate: number, toRate: number): Promise<Float32Array> => {
  const ratio = toRate / fromRate;
  const outputLength = Math.floor(input.length * ratio);
  const output = new Float32Array(outputLength);

  for (let i = 0; i < outputLength; i++) {
    const inputIndex = i / ratio;
    const index = Math.floor(inputIndex);
    const frac = inputIndex - index;

    if (index >= input.length - 1) {
      output[i] = input[input.length - 1];
    } else {
      // Linear interpolation
      output[i] = input[index] * (1 - frac) + input[index + 1] * frac;
    }
  }

  return output;
};

We use simple linear interpolation for resampling. It's fast and adequate for speech, but it attenuates high frequencies and can alias when downsampling; for production use, consider a windowed-sinc resampler or the browser's own `OfflineAudioContext`, which resamples as part of rendering.
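To see the interpolation in action, here is a synchronous copy of the same algorithm (the article's version is `async`, which changes nothing about the math) upsampling a short ramp 2×:

```typescript
// Synchronous version of the linear-interpolation resampler above.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const ratio = toRate / fromRate;
  const outputLength = Math.floor(input.length * ratio);
  const output = new Float32Array(outputLength);
  for (let i = 0; i < outputLength; i++) {
    const inputIndex = i / ratio;
    const index = Math.floor(inputIndex);
    const frac = inputIndex - index;
    output[i] =
      index >= input.length - 1
        ? input[input.length - 1]                              // clamp past the end
        : input[index] * (1 - frac) + input[index + 1] * frac; // interpolate
  }
  return output;
}

// Doubling the rate of a two-sample ramp inserts the midpoint: [0, 2] → [0, 1, 2, 2].
const up = resampleLinear(new Float32Array([0, 2]), 24000, 48000);
```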

Converting to WAV Format

After processing, we export the clean audio as a WAV file:

const audioBufferToWav = (buffer: AudioBuffer): Blob => {
  const numberOfChannels = buffer.numberOfChannels;
  const sampleRate = buffer.sampleRate;
  const format = 1; // PCM
  const bitDepth = 16;

  const bytesPerSample = bitDepth / 8;
  const blockAlign = numberOfChannels * bytesPerSample;

  const dataLength = buffer.length * numberOfChannels * bytesPerSample;
  const bufferLength = 44 + dataLength;

  const arrayBuffer = new ArrayBuffer(bufferLength);
  const view = new DataView(arrayBuffer);

  // Write WAV header
  const writeString = (view: DataView, offset: number, string: string) => {
    for (let i = 0; i < string.length; i++) {
      view.setUint8(offset + i, string.charCodeAt(i));
    }
  };

  writeString(view, 0, "RIFF");
  view.setUint32(4, 36 + dataLength, true);
  writeString(view, 8, "WAVE");
  writeString(view, 12, "fmt ");
  view.setUint32(16, 16, true);
  view.setUint16(20, format, true);
  view.setUint16(22, numberOfChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitDepth, true);
  writeString(view, 36, "data");
  view.setUint32(40, dataLength, true);

  // Write audio data
  const offset = 44;
  const channels: Float32Array[] = [];
  for (let i = 0; i < numberOfChannels; i++) {
    channels.push(buffer.getChannelData(i));
  }

  let index = 0;
  for (let i = 0; i < buffer.length; i++) {
    for (let channel = 0; channel < numberOfChannels; channel++) {
      const sample = Math.max(-1, Math.min(1, channels[channel][i]));
      const intSample = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
      view.setInt16(offset + index, intSample, true);
      index += 2;
    }
  }

  return new Blob([arrayBuffer], { type: "audio/wav" });
};

This creates a standard WAV file with proper headers and 16-bit PCM data.
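The header fields are easy to get wrong, so it's worth parsing them back out of the buffer. The check below rebuilds just the 44-byte header with the same offsets as `audioBufferToWav` (`wavHeader` is my name for this extracted piece) and verifies the derived fields:

```typescript
// Build just the 44-byte WAV header for 16-bit PCM audio with
// `dataLength` bytes of samples, using the same offsets as above.
function wavHeader(sampleRate: number, channels: number, dataLength: number): DataView {
  const bitDepth = 16;
  const blockAlign = channels * (bitDepth / 8);
  const view = new DataView(new ArrayBuffer(44));
  const writeString = (offset: number, s: string) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };
  writeString(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // RIFF chunk size
  writeString(8, "WAVE");
  writeString(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // PCM format
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitDepth, true);
  writeString(36, "data");
  view.setUint32(40, dataLength, true);
  return view;
}

// Mono 48kHz 16-bit: block align is 2 bytes/frame, byte rate 96000 bytes/s.
const header = wavHeader(48000, 1, 960);
```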

Performance Considerations

Frame-by-Frame Processing

RNNoise processes audio in 480-sample frames (10ms at 48kHz). We call `setProgress` after each frame to give users feedback. Note that because the loop is synchronous, React batches these updates and the browser can only repaint once the loop yields, so for long files it's worth breaking the work into chunks:

for (let frameIndex = 0; frameIndex < numFrames; frameIndex++) {
  // ... process frame ...

  // Update progress
  setProgress(Math.round((frameIndex / numFrames) * 100));
}
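One way to make those progress updates actually render — an addition of mine, not part of the original code — is to yield to the event loop every batch of frames so React can flush state and the browser can repaint. Here `work` and `onProgress` stand in for the frame processing and `setProgress`:

```typescript
// Process frames in batches, handing control back to the event loop
// between batches so the UI can update between them.
async function processWithYield(
  numFrames: number,
  work: (frameIndex: number) => void,
  onProgress: (pct: number) => void,
  batchSize = 200 // ~2 seconds of audio per batch at 10ms frames
): Promise<void> {
  for (let f = 0; f < numFrames; f++) {
    work(f);
    if (f % batchSize === batchSize - 1) {
      onProgress(Math.round((f / numFrames) * 100));
      await new Promise((r) => setTimeout(r, 0)); // let the browser repaint
    }
  }
  onProgress(100);
}
```

Batching matters: yielding after every single 10ms frame would add a timer round-trip per frame, while a few hundred frames per yield keeps the overhead small.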

Memory Management

We clean up resources when done:

denoiseState.destroy();

And revoke object URLs to prevent memory leaks:

const clearAudio = () => {
  if (audioUrl) {
    URL.revokeObjectURL(audioUrl);
  }
  if (processedAudioUrl) {
    URL.revokeObjectURL(processedAudioUrl);
  }
  // ... reset state ...
};

Dynamic Imports

We load the RNNoise module only when needed:

const rnnoiseModule = await import('@shiguredo/rnnoise-wasm');

Browser Compatibility

Our noise reduction tool works in all modern browsers:

  • Chrome/Edge: Full support
  • Firefox: Full support
  • Safari: Full support

Required APIs:

  • AudioContext: Universal support
  • WebAssembly: Universal support
  • fetch: Universal support

What RNNoise Can and Can't Do

Works Well For:

  • Stationary noise: Air conditioning hum, computer fans, consistent background sounds
  • Keyboard typing: Mechanical keyboard clicks during voice recordings
  • Street noise: Distant traffic and urban ambience
  • Microphone hiss: Low-level analog noise

Limitations:

  • Non-stationary noise: Sudden loud sounds, music, speech in background
  • Heavy distortion: Already clipped or heavily compressed audio
  • Very noisy recordings: When SNR (signal-to-noise ratio) is extremely low

Try It Yourself

Ready to clean up your audio? Visit our free online noise reduction tool and try it out. All processing happens locally - your recordings never leave your device.

Conclusion

Building a browser-based noise reduction tool shows how powerful open-source AI can be:

  1. AI in the browser is practical: RNNoise + WASM delivers professional noise suppression without cloud dependencies.

  2. Privacy by default: Local processing means sensitive audio stays on your device.

  3. Frame-based processing: Understanding audio frame sizes and formats is crucial for DSP applications.

  4. Format conversions: Converting between Float32 and PCM is essential for working with audio APIs.

The complete source is available in our repository. Whether you're building a podcast editor, voice messaging app, or audio restoration tool, I hope this guide helps you add noise reduction to your projects.

Happy audio cleaning! 🎧✨
