
I Tried to Beat WebAssembly With Node.js

Wasm is fast, there’s no question.
But can I, with conscious optimizations, beat it with Node.js?

The results surprised even me.

I’ve been building a profiler for Node.js that currently sits at sub-100-nanosecond latency. For perspective: native C++ Tracy profiling lives around 5-40 nanoseconds. Hitting ~100ns from Node is honestly insane.

And that process taught me one thing very clearly:

Aggressive optimizations, and real command of Node.js, change everything.

So I kept having this idea floating around:

Who can simulate 125,000 particles faster: Wasm or Node.js?

Starting with basic, unoptimized JavaScript. No tricks yet.

But first, we need to agree on something.


Wasm Is Insanely Fast

There’s no debate here.

Wasm is fast because it:

  • exposes raw linear memory
  • is fully typed
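
“Raw linear memory” just means one flat block of bytes that JavaScript can view directly through a typed array, with no copying. Here’s a minimal sketch (wasmBytes and the exported memory are assumptions about a hypothetical module that exports its memory):

// sketch: Wasm linear memory is one flat buffer that JS can view directly
WebAssembly.instantiate(wasmBytes, {}).then(({ instance }) => {
  const f32 = new Float32Array(instance.exports.memory.buffer);
  f32[0] = 3.14; // both sides read and write the same bytes, zero copies
});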

And anytime you hear typed, you should usually think fast, for one simple reason:

It’s easier to optimize for the devil you know.

Imagine living somewhere where the weather changes every hour. Worse: it can go from extremely hot to freezing cold, or hot to extremely hot, completely unpredictably.

How do you optimally dress for that?

That’s dynamic code to a compiler.
That’s JavaScript.

In JavaScript, a variable can go from a string, to a number, to an object, in the same context, in milliseconds:

// rapid type churn - triggers polymorphism / deopts
let x = 0;
for (let i = 0; i < 1e6; i++) {
  x = i;               // number
  x = String(x);      // string
  x = { i: x };       // object
  x = x.i.length ? 0 : i; // back to number (maybe)
}

How does a compiler optimize for that?
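
For contrast, here is the shape of code a JIT loves: one variable, one type, every iteration.

// stable types - the JIT can specialize this loop for numbers only
let sum = 0;
for (let i = 0; i < 1e6; i++) {
  sum += i; // always a number: monomorphic, stays on the fast path
}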

So yes, the basis for Wasm being fast is not in question.

The real question is this:

Can a few conscious optimization tricks in Node.js beat Wasm?

To test that, there’s nothing better than a brutal workload:
a field of 125,000 particles.


Setup

You can clone the repo and run everything yourself:

git clone https://github.com/sklyt/wasmvsnode.git

For visualization, I’m using my own C++ N-API renderer, tessera.js.
You can learn more about it here:
How I built a renderer for Node.js


125k JS Objects vs Wasm

This setup is already insane.

JS objects are:

  • scattered all over memory
  • cache-unfriendly
  • expensive to access and update
  • made worse by prototype chains

Objects in JS are the epitome of unpredictable weather.

So how do they fare against Wasm?

wasm vs js objects

Results

wasm avg (60fps):     0.512ms
js objects avg:       1.684ms

A few notes:

These results are slightly affected by the recording setup (Windows being Windows), but the conclusion is clear: Wasm is ~3× faster.

Also note these lines in obj.js:

const wasmStart = performance.now();

wasmUpdate(0, PARTICLE_COUNT, WIDTH, HEIGHT, Math.floor(dt * 1000));

wasmTime = performance.now() - wasmStart;

wasmUpdate crosses the JS → Wasm boundary. That includes call overhead, yet Wasm is still blazing fast.
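
If you’re curious how much of that is pure boundary cost, one rough way to estimate it (assuming the second argument really is the particle count, as the call above suggests) is to cross the boundary with zero work:

// rough sketch: time the JS -> Wasm crossing itself with zero particles
const t0 = performance.now();
for (let i = 0; i < 100000; i++) {
  wasmUpdate(0, 0, WIDTH, HEIGHT, 0); // nothing to update, just the call
}
const perCall = (performance.now() - t0) / 100000;
console.log(`~${(perCall * 1e6).toFixed(0)} ns per crossing`);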

This result is expected.

But now comes one of the most beautiful optimization concepts I’ve ever learned.


Data-Oriented Design

Instead of representing entities like this (line 43 in obj.js):

const jsParticles = [];

for (let i = 0; i < PARTICLE_COUNT; i++) {
  jsParticles.push({
    x: Math.random() * WIDTH,
    y: Math.random() * HEIGHT,
    vx: (Math.random() - 0.5) * 200,
    vy: Math.random() * -100
  });
}

What if we flatten the data?

Use the most cache-friendly structure possible, a linear array:

particle 1      particle 2      ...
[x, y, vx, vy,  x, y, vx, vy]

Now memory access is linear.
No pointer chasing.
No jumping around memory.

This idea is everywhere in game engines:

  • sparse sets
  • ECS
  • struct-of-arrays

This is how engines do ridiculous amounts of work inside a sub-16 ms frame budget.

So I flattened the data, and went further.

I used typed arrays.

const jsParticles = new Float32Array(PARTICLE_COUNT * 4);

Types. Predictability.
The weather just got calmer.
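
With that layout, the update loop becomes one tight, stride-4 pass over a single buffer. A minimal sketch, not the repo’s exact code (the gravity constant and bounce factor here are made up):

// sketch: one linear pass, stride 4 (x, y, vx, vy per particle)
function update(dt) {
  for (let i = 0; i < PARTICLE_COUNT * 4; i += 4) {
    jsParticles[i + 3] += 98 * dt;                 // gravity on vy (assumed constant)
    jsParticles[i]     += jsParticles[i + 2] * dt; // x += vx * dt
    jsParticles[i + 1] += jsParticles[i + 3] * dt; // y += vy * dt
    if (jsParticles[i + 1] > HEIGHT) jsParticles[i + 3] *= -0.9; // bounce (assumed)
  }
}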

wasm vs typed arrays

(I have no idea why dev.to is compressing the gif so badly, so here’s a still image instead):

Results

wasm avg (60fps):     0.520ms
js typed avg:         0.684ms

Insane speed-up for Node; now Wasm is only ~1.3× faster.

We’re close.

But can we do more?


Throwing Threads at It

Node.js isn’t just "single-threaded".

That one thread is also busy:

  • running timers
  • handling callbacks
  • scheduling work
  • cleaning garbage
  • synchronizing async tasks

Meanwhile, most machines have 4-16 cores, and we’re hammering one.

So let’s split the work.

Using SharedArrayBuffer, we get zero-copy sharing:

const controlBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const control = new Int32Array(controlBuffer);

All I’m doing is dividing the particle range between four workers, each updating its slice of the physics in parallel.
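
Roughly, the main-thread side looks like this, a sketch rather than the repo’s exact code (the worker file name, workerData shape, and chunk math are assumptions):

// sketch: particle data lives in a SharedArrayBuffer all workers can see
const { Worker } = require('worker_threads');

const sab = new SharedArrayBuffer(PARTICLE_COUNT * 4 * Float32Array.BYTES_PER_ELEMENT);
const particles = new Float32Array(sab); // same memory as the workers, zero copies

const WORKERS = 4;
const chunkSize = Math.ceil(PARTICLE_COUNT / WORKERS);

for (let w = 0; w < WORKERS; w++) {
  new Worker('./worker.js', {
    workerData: { sab, controlBuffer, start: w * chunkSize, chunkSize }
  });
}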

Worker code (line 61):

const CHUNK = workerData.chunkSize;

That’s it.
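
For illustration, the worker side is just the same stride-4 update restricted to its own slice (again a sketch; the real loop in the repo may be structured differently):

// sketch: each worker touches only particles [START, START + CHUNK)
const { workerData } = require('worker_threads');

const particles = new Float32Array(workerData.sab);
const START = workerData.start;
const CHUNK = workerData.chunkSize;

function step(dt) {
  for (let p = START; p < START + CHUNK; p++) {
    const i = p * 4;
    particles[i]     += particles[i + 2] * dt; // x += vx * dt
    particles[i + 1] += particles[i + 3] * dt; // y += vy * dt
  }
}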

The results are insane.

workers vs wasm

Results

wasm avg (60fps):               0.520ms
js typed + workers avg:         0.400ms

JavaScript is now 1.26× faster on average.


Final Thoughts

Does this mean Wasm is slow?

Absolutely not.

Just like Node.js, Wasm can be aggressively optimized too. And honestly, the Node optimizations here aren’t even extreme; I could still push:

  • branchless programming (see the sketch after this list)
  • tighter batching
  • SIMD-friendly layouts
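
As a taste of the first item, a branchless wall bounce replaces the if with arithmetic, so the CPU never has to predict a branch (a sketch; the damping factor is made up):

// sketch: branchless bounce - keep vy, or flip and damp it, with no if
function bounceVy(y, vy, HEIGHT) {
  const hit = +(y > HEIGHT);                 // 1 if past the floor, else 0
  return vy * (1 - hit) + (vy * -0.9) * hit; // blend the two outcomes
}

console.log(bounceVy(10, 50, 600));  // 50  -> still falling, unchanged
console.log(bounceVy(610, 50, 600)); // -45 -> bounced and damped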

The point is simple:

Node.js can be insanely fast.
It’s C++ under the hood; you just need to know how to talk to it.

And once you do, the gap between “high-level” and “low-level” starts to look a lot smaller than people think.
