I have been meaning to write this post for months. I kept putting it off because every time I sat down, the answer felt more nuanced than I wanted it to be — and nuanced answers don't get clicks. But here we are.
So: I spent about six weeks last fall doing a real evaluation of WebAssembly across several different projects. Not toy benchmarks. Actual work problems, on a five-person team building an AI-assisted code review product. My takeaway is not "Wasm is finally ready" or "Wasm is overhyped." It's more specific than either of those, and specificity is the whole point.
The Compute-Bound Cases Where Wasm Actually Delivers
The honest answer to "where does Wasm win?" is: wherever you're doing heavy computation that doesn't need to touch the DOM, and where startup time isn't your bottleneck.
Image processing is the canonical example and it holds up. We use a Wasm-compiled custom image pipeline for thumbnail generation in our CI preview system. Pure Rust, compiled with wasm-pack 0.13.1, targeting wasm32-unknown-unknown. Compared to the equivalent JS implementation, we see roughly 4–5x throughput improvement on the image manipulation steps. That matches what I've seen elsewhere, so no surprise there.
What's less talked about is compression. We switched from a JS implementation of zstd to a Wasm-compiled version late last year. Before: decompressing large artifact bundles took around 340ms in profiling. After: 80–90ms. Real numbers, real workload — your mileage may vary depending on hardware. The main artifact is a ~2.4MB compressed blob, if that gives context.
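For what it's worth, here's roughly how we took those numbers — a minimal timing sketch, not the zstd code itself. `fn` stands in for either decompression implementation; the harness just takes a median over repeated runs instead of trusting a single measurement.

```javascript
// Minimal timing harness (sketch): run a function N times and report the
// median wall-clock time. We used medians, not single runs, when comparing
// the JS and Wasm decompression paths.
function medianTimeMs(fn, runs = 20) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Illustrative usage — jsZstd/wasmZstd are placeholders for the two builds:
// const jsMs = medianTimeMs(() => jsZstd.decompress(blob));
// const wasmMs = medianTimeMs(() => wasmZstd.decompress(blob));
```

Medians matter here because JIT warmup and GC pauses skew the first few JS runs badly.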
Cryptographic operations are another clean win, if you need algorithms outside what the Web Crypto API covers. For everything in SubtleCrypto, just use that. But we had a specific need for a custom hash function for deduplication (legacy system, long story), and the Wasm version was night-and-day faster.
```rust
// wasm-exposed hash function, simplified
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn compute_dedup_hash(data: &[u8]) -> String {
    // custom parameterized blake3 variant
    // this was the bottleneck — ~20ms per call in the JS version
    let hash = my_custom_hasher::hash_with_params(data, DEDUP_PARAMS);
    hex::encode(hash)
}
```
The call overhead from JS into Wasm is real — I'll get to that — but when the computation per call is expensive enough, the boundary cost gets swamped. That threshold is somewhere around 5–10ms of actual computation in my experience. Below that, you start paying more in overhead than you save.
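That threshold falls out of simple arithmetic. A sketch, with illustrative numbers — the point is the shape of the trade-off, not the constants:

```javascript
// Back-of-envelope (sketch): per call, Wasm pays off only when the compute
// time it saves exceeds the fixed boundary-crossing overhead.
// speedup = how much faster the Wasm body is than the JS body.
function wasmWorthIt(jsComputeMs, speedup, boundaryOverheadMs) {
  const wasmComputeMs = jsComputeMs / speedup;
  const savedMs = jsComputeMs - wasmComputeMs;
  return savedMs > boundaryOverheadMs;
}

// A 1ms JS operation at 4x speedup saves 0.75ms — wiped out by ~1ms of
// boundary overhead. A 20ms operation saves 15ms — the overhead is noise.
```

This is why the wins cluster around chunky operations: the overhead is roughly fixed, so the bigger the per-call compute, the less it matters.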
Practical takeaway: if you're doing image processing, codec work, compression, or custom crypto in the browser or Node, Wasm is a straightforward choice. The tooling (wasm-pack, Emscripten 3.x) is mature enough that the setup cost is real but predictable.
The AI Inference Story Is Better Than I Expected — With One Catch
This is where I expected to get burned, and I sort of did, but not in the way I anticipated.
We run small classification models client-side for a few features in our product. Mostly for privacy reasons: code snippets contain customer IP and we'd rather not round-trip them. I tested ONNX Runtime Web's Wasm backend against the WebGPU backend for our specific models — a couple of fine-tuned DistilBERT-class things, nothing huge, largest is about 45MB.
The Wasm backend was more consistent than I expected. Not fast-fast, but consistent. WebGPU is theoretically faster, but in practice — at least when we evaluated, running ONNX Runtime Web v1.18 — the WebGPU backend had initialization variance that made it hard to use for features where latency predictability matters more than raw throughput. One model showed WebGPU at 3x faster steady state, but with a 1.2s cold start on Chrome and a 4s cold start on Firefox 134. That's a real problem when the feature is supposed to feel instant.
The Wasm backend had roughly 200–300ms cold start and ~110ms inference per call on our test machine (M2 MacBook Air — not exactly representative of user hardware, I know). Slower, but predictable.
One thing I noticed: ONNX Runtime Web's Wasm build as of v1.18 includes SIMD support by default, and that made a meaningful difference — roughly 30–40% on our workloads compared to the non-SIMD build. If you're on an older version, check whether you're getting ort-wasm-simd-threaded.wasm or the fallback. It matters.
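The cold-start variance eventually pushed us toward guarding backend choice with a time budget. This is a generic sketch of that idea, not the onnxruntime-web API — `loaders` maps a backend name to an async function returning a session, and the names and budget are stand-ins:

```javascript
// Sketch: try backends in preference order, but give up on any backend
// whose cold start blows a latency budget and fall through to the next.
async function createSessionWithBudget(loaders, order, coldStartBudgetMs) {
  for (const name of order) {
    const timeout = new Promise((resolve) =>
      setTimeout(() => resolve(null), coldStartBudgetMs)
    );
    const session = await Promise.race([loaders[name](), timeout]);
    if (session !== null) return { name, session };
    // Budget exceeded (think WebGPU's multi-second cold start): try the next one.
  }
  throw new Error('no backend initialized within budget');
}
```

With real ONNX Runtime Web you'd build each loader around `InferenceSession.create` with the appropriate execution provider; the budget logic is ours, not the library's.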
Here's where I got genuinely caught out, though. I thought startup time would be my main problem. It turned out that memory pressure — running multiple inference sessions simultaneously — was what actually bit us. Wasm linear memory doesn't interact with the browser's GC the way the JS heap does: it can grow but never shrinks, and the browser can't reclaim it while the module holds it. We had workers leaking memory across sessions, which took me an embarrassingly long time to track down. The fix was trivial: call session.release() after inference completes. Obvious in retrospect, completely invisible in the moment.
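The pattern that fixed it is just a `finally` block — a sketch, with `session` standing in for anything shaped like an onnxruntime-web `InferenceSession` (a `run()` and a `release()`); the wrapper itself is ours:

```javascript
// Always release the session in `finally`, so linear memory is freed
// even when inference throws. Skipping this was our leak.
async function runAndRelease(createSession, feeds) {
  const session = await createSession();
  try {
    return await session.run(feeds);
  } finally {
    session.release(); // without this, memory leaked across worker sessions
  }
}
```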
Bottom line for client-side ML: Wasm is the pragmatic choice right now if you need broad browser support and predictable latency. WebGPU will probably flip that calculus by late 2026, but it's not there yet across the range of hardware and browser versions our users actually run.
Where Wasm Keeps Disappointing Me
Right, so — the wins are real but narrowly scoped. Here's where I've repeatedly reached for Wasm and pulled my hand back.
Startup time is the killer for anything that needs to feel fast. Our first Wasm attempt was for syntax highlighting. I thought: the existing JS implementation is a hotspot in our profiles, Wasm should help. I pushed this on a Friday afternoon and had to revert before end of day. The .wasm binary was 480KB (after wasm-opt), initialization was adding 60–80ms to first meaningful paint (measured with PerformanceObserver, not a gut feeling), and users noticed. The JS library we were replacing was tree-shakeable to ~15KB with zero startup cost. That comparison was humiliating.
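The paint numbers above came from a PerformanceObserver on paint entries; for attributing the init cost itself, bracketing instantiation with marks is enough. A sketch — `instantiate` stands in for your `WebAssembly.instantiateStreaming(...)` call:

```javascript
// Bracket Wasm instantiation with performance marks so init cost shows
// up in profiles and can be compared against paint timings.
async function timedInit(instantiate) {
  performance.mark('wasm-init-start');
  const module = await instantiate();
  performance.mark('wasm-init-end');
  const m = performance.measure('wasm-init', 'wasm-init-start', 'wasm-init-end');
  return { module, initMs: m.duration };
}
```

The measure also shows up in devtools' Performance panel, which is how you catch init landing before first paint instead of behind a loading state.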
The JS-to-Wasm boundary tax is under-discussed. Every call from JS into Wasm has overhead. For primitives, it's small. But passing strings or arrays — anything requiring serialization into linear memory — adds up fast. We benchmarked passing 50KB of text data into a Wasm function 100 times: serialization overhead alone was about 8ms per call. For a function computing in 2ms, that's catastrophic.
```javascript
// naive: serialization overhead dominates
for (const chunk of chunks) {
  result.push(wasmModule.processChunk(chunk)); // serializes chunk on every call
}

// better: one boundary crossing, process everything in Wasm
const merged = mergeChunks(chunks); // one JS allocation
const out = wasmModule.processAll(merged); // one boundary crossing
// parse `out` back into JS objects
```
DOM manipulation — obvious, but I'll say it anyway. If your code touches the DOM at all, Wasm doesn't help: Wasm has no direct DOM access, so every DOM call goes back through JS glue — the DOM lives in JS land, regardless of what you've read about Wasm Components. If you're trying to speed up rendering or layout, look at virtualization, CSS containment, content-visibility. Wasm won't touch those problems.
AssemblyScript is worth mentioning because TypeScript-to-Wasm sounds genuinely appealing. I experimented with it for about two weeks. The experience was rough — tooling gaps, limited stdlib, and the mental model mismatch between AS and TypeScript is bigger than the syntax similarity suggests. I haven't gone back since.
The Developer Experience Tax That Doesn't Show Up in Benchmarks
Shipping Wasm adds real ongoing maintenance overhead, and I don't see this acknowledged often enough.
Debugging Wasm in browser devtools is passable if you have DWARF symbols embedded, but nowhere near JS debugging ergonomics. Wasm-pack's source map output is inconsistent across my team's setups — we have a mix of Intel and ARM Macs plus Linux in CI, and getting consistent debug symbols across all three took more configuration than I'd like to admit. I spent an afternoon tracking down why a panicking Rust function was producing an opaque error in the browser console. Turned out console_error_panic_hook wasn't enabled in the release build. Obvious once you know. Invisible before.
Binary size is a whole separate project. Our Rust→Wasm binary before optimization: 1.1MB. After wasm-opt -O3: 380KB. After brotli: 95KB. That's fine — but getting there required adding wasm-opt to the build pipeline and figuring out the right optimization flags. -O3 regressed one function's performance due to inlining decisions, and I'm still not entirely sure why. We ended up using --optimize-level 2 for that specific module.
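For reference, the rough shape of that pipeline — paths and module names are illustrative, and the size comments are just our numbers from above:

```shell
# Build, shrink, and pre-compress the Wasm artifact.
wasm-pack build --release --target web                        # produces pkg/*_bg.wasm (~1.1MB here)
wasm-opt -O3 pkg/module_bg.wasm -o pkg/module_bg.opt.wasm     # ~380KB
# The one module where -O3 regressed a hot function:
# wasm-opt --optimize-level 2 pkg/other_bg.wasm -o pkg/other_bg.opt.wasm
brotli -q 11 -o pkg/module_bg.opt.wasm.br pkg/module_bg.opt.wasm   # ~95KB over the wire
```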
Wasm GC landed in Chrome and Firefox in late 2023, with Safari following in late 2024, which opens up Kotlin and Dart as Wasm targets without the historical overhead of bundling a full GC. That matters if you're evaluating language options for a non-Rust team. For us it's irrelevant, but worth knowing.
What I'd Actually Tell My Team If We Were Starting From Scratch
Stop asking "should we use Wasm?" and start asking: what specific computation do I need to run, and what are the latency and startup constraints on it?
My heuristic, not a framework — just how I think about it now:
Reach for Wasm if the operation is compute-bound (not I/O-bound), takes more than ~5ms in pure JS, doesn't need frequent DOM access, and startup latency is either amortized over a long session or can be hidden behind a loading state. Image processing, compression, custom crypto, ML inference — yes.
Don't reach for Wasm if the code touches the DOM, the operation is under 2ms (boundary cost will dominate), startup time matters and can't be hidden, or the existing JS library is already optimized and tree-shakeable. Syntax highlighting, string formatting, most UI logic, simple JSON parsing — stay in JS.
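If it helps, the heuristic compresses to a predicate — a sketch, with the thresholds taken from this post, not universal constants, and the usage numbers purely illustrative:

```javascript
// The decision rule above as code. Returns true when Wasm is worth considering.
function reachForWasm({ computeBound, jsTimeMs, touchesDom, startupHideable }) {
  if (!computeBound || touchesDom) return false;
  if (jsTimeMs < 5) return false;   // boundary cost will dominate
  return startupHideable;           // cold start must be amortized or hidden
}

// Image pipeline behind a loading state (illustrative numbers):
// reachForWasm({ computeBound: true, jsTimeMs: 40, touchesDom: false, startupHideable: true })  // → true
// Syntax highlighting on first paint:
// reachForWasm({ computeBound: true, jsTimeMs: 3, touchesDom: true, startupHideable: false })   // → false
```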
The case I'm watching more closely than the browser story: WASI 0.2 and the Component Model are enabling server-side Wasm in ways that actually interest me. Cloudflare Workers runs Wasm natively; Fastly's Compute platform is built on it. Running Wasm as a compute unit at the edge removes most of the startup-time and binary-size concerns. I'm more bullish on that story than the in-browser replacement narrative at this point.
The honest answer to "is Wasm ready to replace JavaScript?" is no — not as a general replacement, and framing it that way was always the wrong question. Wasm fills specific, well-defined gaps. In 2026, those gaps are real, the tooling is workable, and the performance wins are genuine. But you have to know exactly what problem you're solving before you pay the complexity tax. Most web app code shouldn't touch Wasm. The parts that should — you'll know them by their profiler traces.