Wasm is fast, there’s no question.
But can I, with conscious optimizations, beat it with Node.js?
The results surprised even me.
I’ve been building a profiler for Node.js, currently sitting at sub-100-nanosecond latency. For perspective: native C++ Tracy profiling lives around 5-40 nanoseconds. Hitting ~100ns from Node is honestly insane.
And that process taught me one thing very clearly:
Aggressive optimizations, and real command of Node.js, change everything.
So I kept having this idea floating around:
Who can simulate 125,000 particles faster: Wasm or Node.js?
Starting with basic, unoptimized JavaScript. No tricks yet.
But first, we need to agree on something.
Wasm Is Insanely Fast
There’s no debate here.
Wasm is fast because it:
- exposes raw linear memory
- is fully typed
And anytime you hear typed, you should usually think fast, for one simple reason:
It’s easier to optimize for the devil you know.
Imagine living somewhere where the weather changes every hour. Worse: it can go from extremely hot to freezing cold, or hot to extremely hot, completely unpredictably.
How do you optimally dress for that?
That’s dynamic code to a compiler.
That’s JavaScript.
In JavaScript, a variable can go from a string, to a number, to an object, in the same context, in milliseconds:
```javascript
// rapid type churn - triggers polymorphism / deopts
let x = 0;
for (let i = 0; i < 1e6; i++) {
  x = i;                  // number
  x = String(x);          // string
  x = { i: x };           // object
  x = x.i.length ? 0 : i; // back to number (maybe)
}
```
How does a compiler optimize for that?
So yes, the basis for Wasm being fast is not in question.
The real question is this:
Can a few conscious optimization tricks in Node.js beat Wasm?
To test that, there’s nothing better than a brutal workload:
a field of 125,000 particles.
Setup
You can clone the repo and run everything yourself:
```shell
git clone https://github.com/sklyt/wasmvsnode.git
```
For visualization, I’m using my own C++ N-API renderer, tessera.js.
You can learn more about it here:
How I built a renderer for Node.js
125k JS Objects vs Wasm
This setup is already insane.
JS objects are:
- scattered all over memory
- cache-unfriendly
- expensive to access and update
- made worse by prototype chains
Objects in JS are the epitome of unpredictable weather.
So how do they fare against Wasm?
Results
wasm avg (60fps): 0.512ms
js objects avg: 1.684ms
A few notes:
These results are slightly affected by the recording setup (Windows being Windows), but the conclusion is clear: Wasm is ~3× faster.
Also note this line in obj.js:
```javascript
const wasmStart = performance.now();
wasmUpdate(0, PARTICLE_COUNT, WIDTH, HEIGHT, Math.floor(dt * 1000));
wasmTime = performance.now() - wasmStart;
```
wasmUpdate crosses the JS → Wasm boundary. That includes call overhead, yet Wasm is still blazing fast.
This result is expected.
But now comes one of the most beautiful optimization concepts I’ve ever learned.
Data-Oriented Design
Instead of representing entities like this (line 43 in obj.js):
```javascript
const jsParticles = [];
for (let i = 0; i < PARTICLE_COUNT; i++) {
  jsParticles.push({
    x: Math.random() * WIDTH,
    y: Math.random() * HEIGHT,
    vx: (Math.random() - 0.5) * 200,
    vy: Math.random() * -100
  });
}
```
What if we flatten the data?
Use the most cache-friendly structure possible, a linear array:
```
 particle 1      particle 2     ...
[x, y, vx, vy,   x, y, vx, vy,  ...]
```
Now memory access is linear.
No pointer chasing.
No jumping around memory.
This idea is everywhere in game engines:
- sparse sets
- ECS
- struct-of-arrays
This is how engines do ridiculous amounts of work inside a 16ms frame budget.
So I flattened the data, and went further.
I used typed arrays.
```javascript
const jsParticles = new Float32Array(PARTICLE_COUNT * 4);
```
Types. Predictability.
The weather just got calmer.
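To make the idea concrete, here’s a minimal sketch of what the flattened update loop looks like with a stride-4 layout. This is not the repo’s exact code: the canvas size, gravity constant, and the `update` function name are all assumptions for illustration.

```javascript
// Sketch of a data-oriented particle update (assumed constants, not the repo's code).
// Layout per particle: [x, y, vx, vy] - stride of 4 floats.
const PARTICLE_COUNT = 125_000;
const WIDTH = 1280, HEIGHT = 720; // assumed canvas size
const STRIDE = 4;

const jsParticles = new Float32Array(PARTICLE_COUNT * STRIDE);
for (let i = 0; i < PARTICLE_COUNT; i++) {
  const o = i * STRIDE;
  jsParticles[o]     = Math.random() * WIDTH;       // x
  jsParticles[o + 1] = Math.random() * HEIGHT;      // y
  jsParticles[o + 2] = (Math.random() - 0.5) * 200; // vx
  jsParticles[o + 3] = Math.random() * -100;        // vy
}

function update(dt) {
  // One tight loop over contiguous memory: monomorphic, cache-friendly.
  for (let o = 0; o < jsParticles.length; o += STRIDE) {
    jsParticles[o + 3] += 98 * dt;                // gravity on vy (assumed constant)
    jsParticles[o]     += jsParticles[o + 2] * dt; // x += vx * dt
    jsParticles[o + 1] += jsParticles[o + 3] * dt; // y += vy * dt
    if (jsParticles[o + 1] > HEIGHT) {             // bounce off the floor
      jsParticles[o + 1] = HEIGHT;
      jsParticles[o + 3] *= -0.8;
    }
  }
}
```

Every read the loop does is the next float over, so the CPU prefetcher stays happy, and the JIT sees one shape forever.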
(I have no idea why dev.to is de-optimizing the gif so badly, so here's a picture):
Results
wasm avg (60fps): 0.520ms
js typed avg: 0.684ms
An insane speed-up for Node: Wasm is now only ~1.3× faster.
We’re close.
But can we do more?
Throwing Threads at It
Node.js isn’t just “single-threaded”.
That one thread is:
- running timers
- handling callbacks
- scheduling work
- cleaning garbage
- synchronizing async tasks
Meanwhile, most machines have 4-16 cores, and we’re hammering one.
So let’s split the work.
Using SharedArrayBuffer, we get zero-copy sharing:
```javascript
const controlBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const control = new Int32Array(controlBuffer);
```
All I’m doing is dividing the particle rows between four workers, updating physics in parallel.
Worker code (line 61):
```javascript
const CHUNK = workerData.chunkSize;
```
That’s it.
The results are insane.
Results
wasm avg (60fps): 0.520ms
js typed + workers avg: 0.400ms
JavaScript is now 1.26× faster on average.
Final Thoughts
Does this mean Wasm is slow?
Absolutely not.
Just like Node.js, Wasm can be aggressively optimized too. And honestly, the Node optimizations here aren’t even extreme. I could still push:
- branchless programming
- tighter batching
- SIMD-friendly layouts
The point is simple:
Node.js can be insanely fast.
It’s C++ under the hood, you just need to know how to talk to it.
And once you do, the gap between “high-level” and “low-level” starts to look a lot smaller than people think.