DEV Community

Elle

3 Node.js Bugs That Silently Corrupt Your Binary Data

Binary data in Node.js looks easy until it isn't. I spent a combined 6 hours debugging three bugs that all had the same root cause: treating binary data like strings.

If you're calling external APIs that return audio, images, or video — read this before you ship.

Bug #1: String concatenation destroys audio files

This looks fine:

const chunks = []
res.on('data', chunk => { chunks.push(chunk.toString()) })
res.on('end', () => {
  const audio = chunks.join('')
  fs.writeFileSync('output.mp3', audio)
})

The file saves. Its size looks plausible. But it won't play.

Why: .toString() defaults to UTF-8 encoding. Binary bytes that aren't valid UTF-8 get replaced with (U+FFFD). Your audio file is now full of replacement characters where real data used to be.
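To see the replacement happen, here's a minimal sketch. The four bytes are made up for illustration (they resemble the start of an MP3 frame), and roundTripThroughString is a hypothetical helper name, not something from a library.

```javascript
// Hypothetical helper: simulates the bug by decoding bytes as UTF-8
// and then encoding the resulting string back into bytes.
function roundTripThroughString(buf) {
  return Buffer.from(buf.toString(), 'utf8')
}

// Illustrative bytes only. 0xff and 0xfb can never appear in valid UTF-8.
const original = Buffer.from([0xff, 0xfb, 0x90, 0x64])
const damaged = roundTripThroughString(original)

console.log(original.toString('hex')) // fffb9064
console.log(damaged.toString('hex'))  // efbfbdefbfbdefbfbd64
console.log(original.equals(damaged)) // false
```

Three of the four input bytes turned into the three-byte encoding of U+FFFD, so the output is both different and longer than the input.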

Fix:

const chunks = []
res.on('data', chunk => chunks.push(chunk))
res.on('end', () => {
  const audio = Buffer.concat(chunks)
  fs.writeFileSync('output.mp3', audio)
})

Keep everything as Buffers. Never .toString() binary data.
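If you want the fix as a reusable function, one option is to lean on the fact that Node readable streams are async iterable. This is a sketch, and readAllBytes is a name I made up for it. It refuses string chunks so a stray setEncoding() call can't sneak the bug back in.

```javascript
// Hypothetical helper: collects every chunk of a readable stream
// (or any async iterable of Buffers) into a single Buffer.
async function readAllBytes(stream) {
  const chunks = []
  for await (const chunk of stream) {
    if (!Buffer.isBuffer(chunk)) {
      // A string chunk means something already decoded the bytes as text.
      throw new TypeError(`Expected Buffer chunks, got ${typeof chunk}`)
    }
    chunks.push(chunk)
  }
  return Buffer.concat(chunks)
}
```

Inside an async function, usage would look like `const audio = await readAllBytes(res)` followed by the same fs.writeFileSync call as above.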

The frustrating part: the file size looks plausible (fs.statSync reports something reasonable, usually somewhat larger than the source, because each replacement character is written back as three bytes), and the first few bytes might even be valid. The corruption is scattered through the file wherever a byte sequence happens to be invalid UTF-8.
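One way to check a suspicious file is to count how often the UTF-8 encoding of U+FFFD (the bytes EF BF BD) shows up in it. This is only a heuristic sketch with a hypothetical helper name: real binary data can contain those three bytes by chance, but a large count is a strong hint that the file went through a string.

```javascript
// UTF-8 encoding of the replacement character U+FFFD.
const REPLACEMENT = Buffer.from([0xef, 0xbf, 0xbd])

// Hypothetical helper: counts non-overlapping occurrences of EF BF BD.
function countReplacementSequences(buf) {
  let count = 0
  let pos = buf.indexOf(REPLACEMENT)
  while (pos !== -1) {
    count++
    // Continue searching just past the match we found.
    pos = buf.indexOf(REPLACEMENT, pos + REPLACEMENT.length)
  }
  return count
}
```

Running it over the contents of a broken output.mp3 and getting hundreds or thousands of hits points straight at Bug #1.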

Bug #2: response.body isn't what you think in fetch()

If you migrated from https.request to fetch():

const res = await fetch(url)
const buffer = await res.buffer() // ❌ node-fetch only, deprecated in v3

res.buffer() was a node-fetch extension, not part of the Fetch standard. The fetch() built into Node 18+ doesn't have it. You might reach for:

const text = await res.text()
const buffer = Buffer.from(text) // ❌ Same UTF-8 corruption

Same bug as #1 — round-tripping through a string.

Fix:

const res = await fetch(url)
const buffer = Buffer.from(await res.arrayBuffer())

arrayBuffer() gives you the raw bytes. No encoding involved.
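You can compare the two readers without touching the network by building a Response by hand. This is a sketch that assumes the global Response class that ships with Node 18+; compareBodyReaders is a hypothetical name and the bytes are illustrative.

```javascript
// Hypothetical helper: reads the same bytes through both body readers.
// A fresh Response is created for each read because a body can only be consumed once.
async function compareBodyReaders(bytes) {
  const viaArrayBuffer = Buffer.from(await new Response(bytes).arrayBuffer())
  const viaText = Buffer.from(await new Response(bytes).text())
  return { viaArrayBuffer, viaText }
}

// Illustrative bytes that are not valid UTF-8.
const input = Buffer.from([0xff, 0xfb, 0x90, 0x64])

compareBodyReaders(input).then(({ viaArrayBuffer, viaText }) => {
  console.log(viaArrayBuffer.equals(input)) // true
  console.log(viaText.equals(input))        // false
})
```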

Bug #3: Checking binary responses with truthiness

After calling a TTS API, you might validate the response like this:

const audio = await getTTS(text)
if (!audio) throw new Error('TTS failed')
fs.writeFileSync('speech.mp3', audio)

The problem: an empty Buffer (Buffer.alloc(0)) is truthy in JavaScript. So is a Buffer containing an error message like {"error": "quota exceeded"}. Both pass your check.

Fix:

const audio = await getTTS(text)
if (!Buffer.isBuffer(audio) || audio.length < 1000) {
  // If it's tiny, it's probably an error message, not audio
  const maybeError = audio?.toString?.() || 'empty response'
  throw new Error(`TTS failed: ${maybeError}`)
}

Check the type AND the size. A valid audio file for even a short sentence will be several KB at minimum. Anything under 1KB is almost certainly an error response, not audio.
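If you make more than one of these calls, the check can live in a small function. This is a sketch: validateAudioBuffer is a hypothetical name, and the 1000-byte default is just the heuristic from above, so tune it for your API.

```javascript
// Hypothetical helper: returns the buffer if it plausibly holds audio,
// otherwise throws with as much detail as the payload offers.
function validateAudioBuffer(audio, minBytes = 1000) {
  if (!Buffer.isBuffer(audio)) {
    throw new TypeError(`Expected a Buffer, got ${typeof audio}`)
  }
  if (audio.length < minBytes) {
    // A tiny payload is probably error text, so decoding it here is fine.
    // Only the first 200 bytes go into the message.
    const preview = audio.length > 0 ? audio.toString('utf8', 0, 200) : 'empty response'
    throw new Error(`TTS failed (${audio.length} bytes): ${preview}`)
  }
  return audio
}
```

This is the one place where calling .toString() on the response is fine: the buffer has already been judged not to be audio, and the string only goes into an error message.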

The pattern

All three bugs share the same root cause: binary data passing through a text layer.

Binary → String → Binary = corrupted
Binary → Buffer → Binary = safe

This applies to:

  • Audio from TTS APIs
  • Images from generation APIs
  • Video clips from AI video services
  • Any file downloaded over HTTP

If your API integration works for text but produces corrupt files for binary data — check whether a string is sneaking into your data pipeline. It almost always is.
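One caveat on the first line of that diagram: the damage comes from decoding bytes as text. If a string is unavoidable (say, to embed audio in a JSON payload), a byte-preserving encoding such as base64 survives the round trip. A small sketch, with a made-up helper name and illustrative bytes:

```javascript
// Hypothetical helper: round-trips bytes through a base64 string.
function roundTripThroughBase64(buf) {
  const asString = buf.toString('base64')
  return Buffer.from(asString, 'base64')
}

// Illustrative bytes that are not valid UTF-8.
const bytes = Buffer.from([0xff, 0xfb, 0x90, 0x64])
console.log(roundTripThroughBase64(bytes).equals(bytes)) // true
```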

Quick checklist

Before you ship any binary API integration:

  • [ ] Are you using Buffer.concat(), not string concatenation?
  • [ ] Are you using res.arrayBuffer(), not res.text()?
  • [ ] Are you checking Buffer.isBuffer() AND .length, not just truthiness?
  • [ ] Are you avoiding .toString() on response data before writing to disk?

These are boring bugs. They don't throw errors. They produce files that look right but aren't. That's what makes them dangerous.


Found this useful? I write about the weird bugs I hit while building automated pipelines with Node.js. Follow for more war stories.
