
Michael Lip

Posted on • Originally published at zovo.one

Trim Audio Files Without Installing Software

I needed to extract a 30-second clip from a podcast episode to use as a sample in a presentation. My options: install Audacity (a 90 MB download with 200 features I don't need), use FFmpeg (powerful but I'd need to figure out the timestamp syntax), or open a web-based tool and drag the trim handles. I chose the third option because sometimes the simplest tool is the right one.

But the experience made me curious about how audio trimming actually works, both at the file format level and in the browser. Here's what I learned.

How audio data is stored

Digital audio is a series of numbers representing air pressure samples taken at regular intervals. CD-quality audio samples 44,100 times per second (44.1 kHz sample rate), and each sample is a 16-bit integer (values from -32,768 to 32,767). Stereo audio has two channels, so that's 88,200 samples per second, each 2 bytes, totaling 176,400 bytes per second -- about 10.6 MB per minute.
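
The arithmetic above is easy to check in a few lines of JavaScript. This is only a sketch; `pcmBytesPerSecond` is a name made up for illustration, not part of any browser API.

```javascript
// Bytes of uncompressed PCM produced per second of audio.
// pcmBytesPerSecond is a hypothetical helper name, not a browser API.
function pcmBytesPerSecond(sampleRate, channels, bitsPerSample) {
  return sampleRate * channels * (bitsPerSample / 8);
}

const cdRate = pcmBytesPerSecond(44100, 2, 16);
console.log(cdRate);                            // 176400
console.log(((cdRate * 60) / 1e6).toFixed(1));  // "10.6" (MB per minute)
```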

Trimming uncompressed audio (WAV/PCM) is conceptually simple: find the byte offset corresponding to your start time, find the offset for your end time, and extract everything in between. The math:

byte_offset = time_in_seconds * sample_rate * channels * bytes_per_sample

For a stereo 44.1 kHz 16-bit WAV file, the byte offset for the 30-second mark is:

30 * 44100 * 2 * 2 = 5,292,000 bytes

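That offset is measured from the start of the sample data, not the start of the file, so add the size of the header (44 bytes in the simplest WAV layout) before seeking. You'd update the WAV header to reflect the new data length, copy the bytes from start to end, and you have a trimmed file. No re-encoding needed. No quality loss.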
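
Here is a minimal sketch of that byte-level trim. It assumes the simplest possible file: a canonical 44-byte header with the `fmt ` and `data` chunks in the usual places and no extra chunks. Real-world WAV files often carry additional chunks and would need a proper chunk parser. The name `trimWav` is made up for illustration.

```javascript
// Sketch: trim a PCM WAV held in an ArrayBuffer without touching the samples.
// ASSUMPTION: canonical 44-byte header ("RIFF", "fmt ", "data" in that order,
// no extra chunks). trimWav is a hypothetical helper name.
const HEADER_BYTES = 44;

function trimWav(wavBuffer, startTime, endTime) {
  const view = new DataView(wavBuffer);
  const sampleRate = view.getUint32(24, true);
  const blockAlign = view.getUint16(32, true); // bytes per sample frame (all channels)
  const dataLength = view.getUint32(40, true);

  // Work in whole sample frames so a cut never lands in the middle of a sample.
  const totalFrames = Math.floor(dataLength / blockAlign);
  const startFrame = Math.min(Math.max(0, Math.floor(startTime * sampleRate)), totalFrames);
  const endFrame = Math.min(Math.max(startFrame, Math.floor(endTime * sampleRate)), totalFrames);
  const newDataLength = (endFrame - startFrame) * blockAlign;

  const out = new Uint8Array(HEADER_BYTES + newDataLength);
  // Copy the original header, then the selected slice of sample data.
  out.set(new Uint8Array(wavBuffer, 0, HEADER_BYTES));
  out.set(
    new Uint8Array(wavBuffer, HEADER_BYTES + startFrame * blockAlign, newDataLength),
    HEADER_BYTES
  );

  // Patch the two length fields so players see the new size.
  const outView = new DataView(out.buffer);
  outView.setUint32(4, 36 + newDataLength, true); // RIFF chunk size
  outView.setUint32(40, newDataLength, true);     // data chunk size
  return out.buffer;
}
```

Reading `blockAlign` from the header rather than hard-coding `channels * 2` keeps the same code working for mono, stereo, or 24-bit files, as long as the header assumption holds.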

Trimming compressed audio is harder

For compressed formats like MP3, AAC, or OGG, you can't just slice at arbitrary byte positions because the data is organized into frames, and each frame is a compressed chunk that decodes to a fixed number of samples.

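An MP3 frame contains 1,152 samples in the most common configuration (MPEG-1 Layer III). At 44.1 kHz, that's about 26 milliseconds of audio per frame. Each frame starts with its own header, which is why you can seek to the middle of an MP3 file and start playing almost immediately. Frames are not fully independent, though: Layer III's bit reservoir lets a frame store part of its data in the frames before it, so the first frame or two after a cut may not decode perfectly.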

To trim an MP3 precisely, you need to:

  1. Find the frame that contains your start time
  2. Find the frame that contains your end time
  3. Copy all frames between them
  4. Optionally re-encode the first and last frames for sample-accurate trimming

If you trim on frame boundaries, you can avoid re-encoding entirely. The trim points won't be perfectly precise (off by up to 26ms), but for most use cases, that's unnoticeable. This is what tools that promise "lossless MP3 cutting" do -- they cut on frame boundaries.
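
The frame arithmetic behind a boundary-aligned cut can be sketched like this. It assumes every frame holds 1,152 samples and ignores details such as encoder delay and any metadata frame at the start of the file; `frameAlignedTrim` is a hypothetical name, not a function from any MP3 library.

```javascript
// Sketch: map trim times to MP3 frame indices for a cut on frame boundaries.
// ASSUMPTION: every frame holds 1,152 samples (MPEG-1 Layer III).
// frameAlignedTrim is a hypothetical helper name.
const SAMPLES_PER_FRAME = 1152;

function frameAlignedTrim(startTime, endTime, sampleRate) {
  // Index of the frame that contains each requested time.
  const firstFrame = Math.floor((startTime * sampleRate) / SAMPLES_PER_FRAME);
  const lastFrame = Math.floor((endTime * sampleRate) / SAMPLES_PER_FRAME);
  return {
    firstFrame,
    lastFrame,
    // Where the cut really lands once it is snapped to whole frames.
    actualStart: (firstFrame * SAMPLES_PER_FRAME) / sampleRate,
    actualEnd: ((lastFrame + 1) * SAMPLES_PER_FRAME) / sampleRate,
  };
}

const cut = frameAlignedTrim(30, 60, 44100);
console.log(cut.firstFrame, cut.lastFrame); // 1148 2296
```

The gap between the requested and actual times is always less than one frame, which is where the "off by up to 26ms" figure comes from.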

For sample-accurate trimming, the first and last frames need to be decoded, trimmed at the sample level, and re-encoded. This introduces a single generation of lossy compression on those two frames only. The rest of the file passes through untouched.

The Web Audio API approach

Modern browsers can trim audio entirely client-side using the Web Audio API. Here's the core concept:

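async function trimAudio(file, startTime, endTime) {
  const audioContext = new AudioContext();
  try {
    const arrayBuffer = await file.arrayBuffer();
    // Decodes to float32 PCM, resampled to audioContext.sampleRate
    const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

    const sampleRate = audioBuffer.sampleRate;
    // Clamp both trim points so they stay inside the decoded audio
    const clamp = (n, lo) => Math.min(Math.max(lo, n), audioBuffer.length);
    const startSample = clamp(Math.floor(startTime * sampleRate), 0);
    const endSample = clamp(Math.floor(endTime * sampleRate), startSample);
    const duration = endSample - startSample;
    if (duration === 0) {
      throw new RangeError('Trim range is empty'); // createBuffer rejects a zero length
    }

    const trimmedBuffer = audioContext.createBuffer(
      audioBuffer.numberOfChannels,
      duration,
      sampleRate
    );

    for (let channel = 0; channel < audioBuffer.numberOfChannels; channel++) {
      const sourceData = audioBuffer.getChannelData(channel);
      // subarray() is a view, not a copy; copyToChannel() does the one bulk copy
      trimmedBuffer.copyToChannel(sourceData.subarray(startSample, endSample), channel);
    }

    return trimmedBuffer;
  } finally {
    // Release the context; browsers limit how many can be open at once
    await audioContext.close();
  }
}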

decodeAudioData handles all the format-specific decoding. It takes any audio format the browser supports and gives you raw PCM samples as float32 arrays, resampled to the AudioContext's sample rate (which may differ from the file's). You extract the samples you want, create a new buffer, and you have your trimmed audio.

The harder part is encoding the result back to a compressed format for export. The Web Audio API doesn't have a built-in encoder. You need either the MediaRecorder API (which gives you limited format control) or a JavaScript or WebAssembly encoder library, such as lamejs (a JavaScript port of LAME) for MP3 or opus-encoder for Opus.
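
If an uncompressed export is acceptable, no encoder library is needed, because a WAV file is just a short header followed by interleaved PCM samples. The following is a minimal sketch that assumes 16-bit output. `encodeWav` is a hypothetical helper name, and it takes plain Float32Array channels so it does not depend on any browser object.

```javascript
// Sketch: pack per-channel Float32Array samples into a 16-bit PCM WAV file.
// encodeWav is a hypothetical helper; in a browser you would pass
// audioBuffer.getChannelData(c) for each channel of an AudioBuffer.
function encodeWav(channelData, sampleRate) {
  const channels = channelData.length;
  const frames = channelData[0].length;
  const blockAlign = channels * 2; // bytes per sample frame at 16 bits
  const dataLength = frames * blockAlign;
  const view = new DataView(new ArrayBuffer(44 + dataLength));
  const writeTag = (offset, tag) => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);          // RIFF chunk size
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);                      // fmt chunk size
  view.setUint16(20, 1, true);                       // format 1 = integer PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);                      // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataLength, true);

  // Interleave channels and convert [-1, 1] floats to signed 16-bit integers.
  let offset = 44;
  for (let i = 0; i < frames; i++) {
    for (let c = 0; c < channels; c++) {
      const s = Math.max(-1, Math.min(1, channelData[c][i]));
      view.setInt16(offset, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
      offset += 2;
    }
  }
  return view.buffer;
}
```

In a browser, wrapping the result in `new Blob([wavArrayBuffer], { type: 'audio/wav' })` gives you something you can hand to a download link. The trade-off is file size, since the output is uncompressed.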

Practical tips for cleaner trims

Trim at zero crossings. A zero crossing is where the audio waveform passes through the zero amplitude line. Trimming at a zero crossing avoids audible clicks or pops at the cut point. If you trim at a point where the waveform is at its peak, the sudden jump from full amplitude to silence creates a click.

function findNearestZeroCrossing(samples, position) {
  // A crossing sits between index i and i + 1 when the sign changes
  const crosses = (i) =>
    i >= 0 && i < samples.length - 1 &&
    (samples[i] >= 0) !== (samples[i + 1] >= 0);

  // Search outward in both directions so the result really is the nearest
  for (let d = 0; d < samples.length; d++) {
    if (crosses(position + d)) return position + d;
    if (crosses(position - d)) return position - d;
  }
  return position; // fallback to original position
}

Apply a short fade. Even trimming at a zero crossing can sound abrupt. A 5-10 millisecond fade-in at the start and fade-out at the end makes the trim sound smooth:

function applyFade(samples, fadeSamples) {
  // Keep the two fades from overlapping in very short clips
  const n = Math.min(fadeSamples, Math.floor(samples.length / 2));
  // Fade in: gain ramps linearly from 0 up toward 1
  for (let i = 0; i < n; i++) {
    samples[i] *= i / n;
  }
  // Fade out: gain ramps linearly down to 0 at the final sample
  for (let i = 0; i < n; i++) {
    samples[samples.length - 1 - i] *= i / n;
  }
}

At 44.1 kHz, 10 milliseconds is 441 samples. It's imperceptible as a fade but eliminates any click artifacts.
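
Because the sample count depends on the sample rate, it is safer to compute it than to hard-code 441. A small sketch follows; `fadeLengthInSamples` is a made-up helper name, and capping the fade at half the clip is one reasonable choice rather than a rule.

```javascript
// Convert a fade duration in milliseconds to a sample count, capped at half
// the clip so a fade-in and fade-out can never overlap.
// fadeLengthInSamples is a hypothetical helper name.
function fadeLengthInSamples(fadeMs, sampleRate, totalSamples) {
  const wanted = Math.round((fadeMs / 1000) * sampleRate);
  return Math.min(wanted, Math.floor(totalSamples / 2));
}

console.log(fadeLengthInSamples(10, 44100, 44100)); // 441
console.log(fadeLengthInSamples(10, 48000, 48000)); // 480
console.log(fadeLengthInSamples(10, 44100, 500));   // 250 (clip shorter than two fades)
```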

Preview before exporting. The Web Audio API lets you play back a buffer without saving it:

const source = audioContext.createBufferSource();
source.buffer = trimmedBuffer;
source.connect(audioContext.destination);
source.start();

This lets you verify the trim points are right before committing to an export.
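
You can also audition a region before copying any samples, because `start()` accepts an offset and a duration. This is a sketch under assumptions: `previewRegion` is a hypothetical helper, and `audioContext` and `audioBuffer` stand for an AudioContext you already have open and the full decoded AudioBuffer.

```javascript
// Sketch: audition part of a decoded buffer without creating a trimmed copy.
// previewRegion is a hypothetical helper; audioContext and audioBuffer are
// assumed to be an existing AudioContext and a decoded AudioBuffer.
function previewRegion(audioContext, audioBuffer, startTime, endTime) {
  const source = audioContext.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioContext.destination);
  // start(when, offset, duration): begin now, skip `offset` seconds into
  // the buffer, and stop after `duration` seconds.
  source.start(0, startTime, endTime - startTime);
  return source; // call source.stop() to cancel playback early
}
```

Browsers generally keep an AudioContext suspended until the user interacts with the page, so a preview like this is best triggered from a click handler.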

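Watch your file sizes. Decoding a compressed audio file to PCM expands it significantly. A 5 MB MP3 at 128 kbps holds a little over five minutes of audio, which becomes roughly 110 MB of stereo float32 PCM in memory at 44.1 kHz. At higher bitrates the same 5 MB covers less time, so the expansion is smaller, but it is still around tenfold. For long recordings, this can strain browser memory limits. Consider processing in chunks for files longer than a few minutes.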
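
A back-of-the-envelope estimate shows how much the result depends on bitrate. This sketch assumes a constant-bitrate file, ignores metadata, and assumes the browser decodes to 32-bit floats at the given sample rate; `estimateDecodedBytes` is a name made up for illustration.

```javascript
// Rough estimate of decoded size for a constant-bitrate compressed file.
// estimateDecodedBytes is a hypothetical helper; it ignores metadata and
// assumes decoding to 32-bit floats at the given sample rate.
function estimateDecodedBytes(fileBytes, bitrateKbps, sampleRate, channels) {
  const durationSeconds = (fileBytes * 8) / (bitrateKbps * 1000);
  return durationSeconds * sampleRate * channels * 4; // 4 bytes per float32 sample
}

const at128 = estimateDecodedBytes(5_000_000, 128, 44100, 2);
const at256 = estimateDecodedBytes(5_000_000, 256, 44100, 2);
console.log((at128 / 1e6).toFixed(0)); // "110" (MB)
console.log((at256 / 1e6).toFixed(0)); // "55"  (MB)
```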

For trimming audio files without writing code or installing desktop software, I built a tool at zovo.one/free-tools/audio-trimmer that runs entirely in the browser. Drag in a file, set your start and end points visually on the waveform, and export the trimmed result. Your audio never leaves your machine.

Audio trimming seems trivial until you understand what's happening at the sample level. The difference between a clean trim and one that clicks, pops, or cuts mid-word is in the details: zero crossings, fade curves, and frame boundaries.


I'm Michael Lip. I build free developer tools at zovo.one. 350+ tools, all private, all free.
