Sharmin Sirajudeen

Posted on • Originally published at drengr.dev

How a Web Worker Fixed My Dying-Battery Audio (And What I Learned About PWAs the Hard Way)

I spent the last week modifying an open-source NES emulator to run in the browser as a PWA. I'm an Android developer by trade — Kotlin, Jetpack Compose, Flutter when the project calls for it. This was my first real dive into Web Workers, SharedArrayBuffer, and turning a browser tab into something that feels like a native app.

Here's what I learned. Some of it was obvious in hindsight. Most of it wasn't.


The Problem That Started Everything

I wanted to add real-time game modification sliders to a browser-based NES emulator. Speed multiplier, firepower boost, infinite lives — the kind of thing that's trivial if you have access to the game's memory. The emulator (JSNES, open-source) gives you direct access to the NES CPU's RAM via JavaScript. Writing a slider that tweaks cpu.mem[0x0487] every frame is maybe 10 lines of code.
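Those "10 lines" really are about that simple. Here's a sketch of the per-frame mod writes — nes.cpu.mem is the real JSNES field, but the addresses and what they control are purely illustrative stand-ins; every game maps its RAM differently, and you find the real locations by memory-diffing:

```javascript
// Per-frame mod writes against the NES work RAM (a sketch — the addresses
// here are hypothetical examples, not real locations for any game).
function applyMods(mem, mods) {
  if (mods.infiniteLives) mem[0x0487] = 9;              // hypothetical lives counter
  if (mods.firepower > 0) mem[0x04a0] = mods.firepower; // hypothetical weapon level
  return mem;
}
```

Call it once per emulated frame, right after nes.frame(), so the game logic never gets more than one frame to overwrite your values.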

I set up a GitHub Codespace, got the emulator running, and tested it in the browser. Everything worked beautifully. Then I opened the same URL on an older Android phone sitting on my desk.

The game visuals were smooth enough. But the audio — the iconic 8-bit music — sounded like a toy running out of battery. Slow, dragging, painful. Like someone was holding the NES's APU underwater.

Why Single-Threaded Was the Root Cause

Here's what was happening. The NES generates audio samples at 44,100 Hz, tied directly to CPU emulation. Each frame of emulation produces ~735 audio samples. The browser's Web Audio API expects those samples delivered at a consistent rate.

On a decent machine, the main thread easily ran the emulator at 60fps, rendered the canvas, and fed audio samples. No contention. On the slow Android phone, canvas rendering was choking the main thread. Frames dropped to 30fps, so only half the required audio samples were being generated each second. The Web Audio API played them at the expected rate but ran out halfway — producing that dying-battery sound.

I tried every hack I could think of:

  • Adaptive sample dropping — monitored FPS and dropped audio samples when the device struggled. Result: choppy audio instead of slow audio. Not better.
  • Dynamic Rate Control — stretched available samples via interpolation (the algorithm RetroArch uses). Result: alien communication sounds. The pitch was wrong because you can't stretch 22K samples to fill 44K slots without changing the fundamental frequency.
  • Multi-frame catch-up — ran 2 NES frames per requestAnimationFrame when the device fell behind. Result: even slower, because the device couldn't handle 2 frames if it was already struggling with 1.
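To make the Dynamic Rate Control failure concrete, a linear-interpolation stretch of the kind described above looks like this (a toy sketch, not RetroArch's actual implementation — function names are mine):

```javascript
// Toy linear-interpolation stretch: resample `src` across `dstLen` slots.
// Stretching N samples into 2N slots doubles every wavelength in the
// signal, dropping everything an octave — hence the alien sounds.
function stretch(src, dstLen) {
  const out = new Float32Array(dstLen);
  const step = (src.length - 1) / (dstLen - 1);
  for (let i = 0; i < dstLen; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, src.length - 1);
    const t = pos - i0;
    out[i] = src[i0] * (1 - t) + src[i1] * t; // blend the two neighbors
  }
  return out;
}
```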

None of it worked because I was treating the symptom, not the disease. The disease was: audio generation and canvas rendering were fighting for the same thread.

The Fix: Web Workers

The solution was architecturally simple. Move the NES emulation (CPU + audio generation) to a Web Worker. The main thread only handles canvas rendering, user input, and UI.

Worker Thread (setInterval @ 60fps)
├── JSNES emulation (CPU, PPU, APU)
├── Audio sample generation → SharedArrayBuffer
├── Game mod logic (speed, firepower, lives)
└── Frame pixel conversion → postMessage (Transferable)

Main Thread (requestAnimationFrame)
├── Canvas rendering (receives pixels from Worker)
├── Audio playback (reads from SharedArrayBuffer)
├── Keyboard/touch input → postMessage to Worker
└── UI (sliders, toggles, save/load, fullscreen)

The key insight: setInterval in a Web Worker is not throttled when the tab is backgrounded. requestAnimationFrame on the main thread is. This means the Worker keeps generating audio at a consistent rate regardless of what the renderer is doing. The audio buffer never starves.
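The Worker side of that diagram is small. A sketch, assuming nes is the JSNES instance constructed inside the worker (the guard just keeps the wiring from running outside a Worker context):

```javascript
// emu-worker.js — the steady tick. setInterval here keeps firing at full
// rate even when the tab is backgrounded, unlike main-thread rAF.
const FRAME_MS = 1000 / 60;
let running = true;

// One emulated frame: CPU + PPU + APU (~735 audio samples) — skipped
// entirely while paused. Returns whether a frame actually ran.
function tick(nes) {
  if (!running) return false;
  nes.frame();
  return true;
}

// Wire up the loop only inside an actual Worker (self but no window).
if (typeof self !== 'undefined' && typeof window === 'undefined') {
  self.onmessage = (e) => {
    if (e.data.type === 'pause') running = false;
    if (e.data.type === 'resume') running = true;
  };
  setInterval(() => tick(nes), FRAME_MS); // `nes` assumed created above
}
```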

SharedArrayBuffer: The Zero-Copy Audio Bridge

This was the part I found most interesting, coming from a mobile background where inter-thread communication usually means Handler.post() or Kotlin coroutine channels.

The Worker generates ~735 audio samples per frame. Those samples need to reach the main thread's ScriptProcessorNode with minimal latency. postMessage adds serialization overhead and scheduling jitter — fine for input events, not great for 44,100 samples per second.

SharedArrayBuffer gives both threads access to the same memory. The Worker writes audio samples into a ring buffer. The main thread's audio processor reads from the same buffer. Zero copy, zero serialization, microsecond access.

The layout is simple:

SharedArrayBuffer:
[0-3]   Int32: write index (Worker writes via Atomics.store)
[4-7]   Int32: read index (Main advances via Atomics.store)
[8+]    Float32[]: interleaved L/R audio samples

The Worker writes samples after each nes.frame() call. The ScriptProcessorNode on the main thread reads them in its onaudioprocess callback. The Atomics operations provide memory ordering guarantees — no locks needed for a single-producer, single-consumer ring buffer.

One gotcha that cost me an hour: interleaved audio samples must always be written in pairs (left + right). If the available buffer space is odd, you write one L sample without its R, and every subsequent read is shifted by one channel. The fix is one line: samplesToWrite = available & ~1 — force even.
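Putting the layout, the Atomics, and the even-pair fix together, the two ends of the ring buffer look roughly like this (a sketch — the capacity and function names are mine, and a production version would carry an odd tail sample over to the next write instead of dropping it):

```javascript
// SPSC ring buffer over the layout above: bytes 0-3 write index,
// bytes 4-7 read index, then interleaved L/R Float32 samples.
// (Sketched with a locally created SAB; in the real app the Worker
// receives it via postMessage at init.)
const CAPACITY = 8192; // sample slots, must be even
const sab = new SharedArrayBuffer(8 + CAPACITY * 4);
const indices = new Int32Array(sab, 0, 2);    // [0]=write, [1]=read
const samples = new Float32Array(sab, 8, CAPACITY);

// Worker side: append interleaved L/R samples, whole pairs only.
function writeSamples(src) {
  const w = Atomics.load(indices, 0);
  const r = Atomics.load(indices, 1);
  const free = CAPACITY - ((w - r + CAPACITY) % CAPACITY) - 1;
  const n = Math.min(src.length, free) & ~1;  // force even: no split pairs
  for (let i = 0; i < n; i++) samples[(w + i) % CAPACITY] = src[i];
  Atomics.store(indices, 0, (w + n) % CAPACITY);
  return n;
}

// Main side (onaudioprocess): drain up to `max` samples into `dst`.
function readSamples(dst, max) {
  const w = Atomics.load(indices, 0);
  const r = Atomics.load(indices, 1);
  const avail = (w - r + CAPACITY) % CAPACITY;
  const n = Math.min(max, avail) & ~1;
  for (let i = 0; i < n; i++) dst[i] = samples[(r + i) % CAPACITY];
  Atomics.store(indices, 1, (r + n) % CAPACITY);
  return n;
}
```

Each side only ever stores to its own index and loads the other's, which is exactly why the single-producer, single-consumer case needs no locks.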

SharedArrayBuffer requires specific HTTP headers:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp

Without these, typeof SharedArrayBuffer === 'undefined' in every browser. I built a fallback path using postMessage with Transferable Float32Array for environments where the headers can't be set.
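Belt and braces, the startup check can look at both the constructor and the crossOriginIsolated flag before committing to a bridge (a sketch of the selection logic; the names are mine):

```javascript
// Pick the audio bridge at startup. `crossOriginIsolated` is the
// browser's signal that the COOP/COEP headers were actually served
// and honored — checking the constructor alone isn't enough in every
// environment.
function pickAudioBridge() {
  const isolated =
    typeof crossOriginIsolated !== 'undefined' && crossOriginIsolated;
  if (typeof SharedArrayBuffer !== 'undefined' && isolated) {
    return 'sab';        // zero-copy ring buffer path
  }
  return 'postMessage';  // fallback: Transferable Float32Array per batch
}
```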

Frame Transfer: Transferable Objects

The NES outputs 256×240 pixels per frame. That's ~245KB of pixel data at 60fps. Copying it via postMessage would be expensive. Transferable objects solve this — the ArrayBuffer is moved between threads, not copied. The sending thread loses access to it (it gets "neutered"), but the transfer is essentially free.

// Worker: convert pixels and transfer
const pixels = new Uint32Array(61440);
// ... fill pixels from JSNES frameBuffer ...
postMessage({ type: 'frame', pixels }, [pixels.buffer]);
// pixels.buffer is now neutered — length 0 in Worker

I used double-buffering: two pixel arrays in the Worker, alternating which one gets filled and transferred. In practice, I found that just reallocating a new Uint32Array(61440) after each transfer was simpler and fast enough — 245KB allocation at 60fps is well within V8's comfort zone.
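On the main thread, the received Uint32Array needs no conversion pass either — a Uint8ClampedArray view over the same buffer gives ImageData the bytes it wants, still zero-copy (a sketch; blitFrame is my name, and the byte layout assumes JSNES's little-endian-friendly pixel packing):

```javascript
// Reinterpret the transferred 256×240 Uint32Array as RGBA bytes.
// This is a view, not a copy: both typed arrays share one ArrayBuffer.
const WIDTH = 256, HEIGHT = 240;

function frameToBytes(pixels) {
  return new Uint8ClampedArray(
    pixels.buffer, pixels.byteOffset, pixels.length * 4);
}

// Browser-only: wrap the bytes in ImageData and blit to the 2D context.
function blitFrame(ctx, pixels) {
  ctx.putImageData(new ImageData(frameToBytes(pixels), WIDTH, HEIGHT), 0, 0);
}
```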

The PWA Part

Turning this into a Progressive Web App was its own education. A few things I learned:

iOS Safari has no Fullscreen API. Not requestFullscreen, not webkitRequestFullscreen, not any variant. I discovered this when the fullscreen button simply did nothing on an iPhone. The only way to get "fullscreen" on iPhone is "display": "standalone" in your web app manifest plus adding to home screen. Even then, the status bar stays — Apple never lets you hide it.

I ended up building a CSS-simulated fullscreen: toggling a body class that hides everything except the game canvas and touch controls. But then the exit button didn't work. Turns out, on iOS, a position: fixed button placed outside the main touch-responsive container silently fails to receive touch events. The button renders, you can see it, but tapping does nothing. I had to move the exit control inside the same overlay that handles game input. That one cost me a few hours of confused debugging.

PWA icons on iOS must be PNG, not SVG, and RGB not RGBA. Safari ignores SVG apple-touch-icon links entirely. And if your PNG has an alpha channel, iOS sometimes renders a blank or uses its default icon. My custom pixel-art icon only appeared after I converted it from RGBA to RGB using Pillow.

Service Worker caching is aggressive and separate from Safari's cache. Deleting Safari data doesn't clear a PWA's cache. You have to delete the home screen app icon first, then clear Safari data, then re-add. Learned this the hard way when testers kept seeing old versions.

The viewport-fit: cover meta tag is what lets your app extend under the iPhone notch. Without it, you get black bars.

Bonus: Background Execution Control

One thing I didn't expect — the Worker architecture gives you easy control over background behavior. Since the emulation loop runs on setInterval inside a Web Worker (which browsers don't throttle in background tabs), the game keeps running even when the user switches apps or tabs. That's great for audio continuity, but terrible for battery life.

The fix is trivial: listen for visibilitychange on the main thread and send a pause/resume message to the Worker. The emulation stops completely when the app is backgrounded and picks up exactly where it left off when the user returns. No state loss, no audio glitch on resume. If you ever need background execution (say, for a music player or a long-running computation), just don't send the pause — the Worker keeps ticking regardless of what the main thread is doing. Having that as a conscious choice rather than a browser-imposed limitation is a nice side effect of the architecture.
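The whole mechanism fits in a few lines. A sketch — the reducer mirrors what my Worker does with these messages, and the DOM wiring assumes worker is the Worker created at startup:

```javascript
// Worker side: pure reducer for the run state, driven by messages.
function nextRunState(running, msg) {
  if (msg.type === 'pause') return false;
  if (msg.type === 'resume') return true;
  return running; // ignore unrelated messages (input, save/load, …)
}

// Main side: translate tab visibility into pause/resume messages.
// (Browser-only wiring; `worker` is the Worker created at startup.)
if (typeof document !== 'undefined' && typeof Worker !== 'undefined') {
  document.addEventListener('visibilitychange', () => {
    worker.postMessage({ type: document.hidden ? 'pause' : 'resume' });
  });
}
```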

The Result

On the same slow Android phone that produced dying-battery audio with the single-threaded architecture: smooth, consistent, correct-speed audio. The Web Worker generates samples at a steady 60fps via setInterval, completely independent of the main thread's rendering frame rate. The SharedArrayBuffer bridge adds effectively zero latency.

The visual frames might drop to 30fps on a slow device — the game looks a bit less smooth — but the audio is untouched. That's the right tradeoff. Humans tolerate choppy video far better than choppy audio.

Takeaways

Thread architecture is a day-one decision, not an optimization. I built the single-threaded version first because it was faster to prototype. Then I spent more time patching audio hacks than the Worker migration ultimately took. If your app does real-time audio/video processing, put the producer on a separate thread from the start.

SharedArrayBuffer is the right tool for high-frequency inter-thread data. For audio at 44,100 samples/second, postMessage adds too much jitter. For input events at 10-30/second, postMessage is perfectly fine. Match the tool to the frequency.

Transferable objects are free. If you're passing large ArrayBuffers between threads via postMessage, mark them as transferable. Zero copy, zero overhead. Just remember the sender loses access.

PWAs on iOS are a different platform entirely. Don't assume web APIs work the same. The Fullscreen API doesn't exist. Touch events behave differently for fixed-position elements. Icons need specific formats. Test on an actual iPhone, not just Chrome DevTools mobile emulation.

Test on the slowest device first. If I'd tested on the old Android phone on day one, I would have designed for Workers from the start. Testing only on fast hardware hides architectural problems that become very expensive to fix later.


I'm a mobile developer (Android/Kotlin, Flutter) exploring the browser as a platform for real-time applications. If you've dealt with Web Workers, SharedArrayBuffer, or PWA quirks on iOS, I'd love to hear about your experiences in the comments.
