Rust + WebAssembly Performance: Pure JS vs. wasm-bindgen vs. Raw WASM with SIMD
When writing code for the web, JavaScript is the default choice: lightweight, interpreted, and heavily optimized by modern engines such as V8 (Chrome, Node.js) and SpiderMonkey (Firefox). But for compute-heavy tasks, JS may not always deliver the performance we need.
Rust is memory-safe, fast, and integrates well with WebAssembly (WASM) via wasm-bindgen and wasm-pack. WebAssembly itself is designed for high-performance, portable, and secure execution inside browsers.
Introduction
In this article, I’ll benchmark four approaches to solving the same problems:
- Pure JavaScript
- Rust compiled to WASM with wasm-bindgen
- Raw WASM exports (extern "C")
- Raw WASM with SIMD instructions
We’ll measure them on two tasks:
- Array modification: Add +2.0 to every element.
  > This highlights memory management and iteration speed.
- Fibonacci calculation: Iterative version.
  > Recursion was intentionally excluded because the naive recursive version takes exponential time and would dominate the measurement.
Explanation
wasm-bindgen
- Library + tool that lets Rust talk to JavaScript (and vice versa).
- It serves as the bridge between Rust and JS.
wasm-pack
- A build tool (a CLI) that uses wasm-bindgen under the hood.
- Automates the whole process: compiling Rust to WebAssembly, running wasm-bindgen, packaging everything as an npm package, and making it ready to publish or consume from a JS project.
- Ensures that the correct target (wasm32-unknown-unknown) is used.
- Creates package.json.
- Runs tests in headless browsers.
- Builds with release optimizations.
- Think of it as the workflow manager for Rust + WASM projects.
Float32Array
- The difference between a Float32Array and a basic JavaScript array mainly comes down to type, memory, and performance.
- A typed array that can only store 32-bit floating-point numbers.
- Stores numbers in contiguous memory, meaning all numbers are tightly packed as 32-bit floats. This makes it faster for numeric computations and more memory-efficient.
- Ideal for WebGL, audio processing, or numerical calculations.
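A Float32Array element has the same 32-bit IEEE 754 layout as a Rust f32, which is why the two sides can share the data. The sketch below illustrates the two consequences mentioned above, packed storage and reduced precision. The helper names packed_bytes and fits_f32_exactly are hypothetical and are not part of the benchmark code.

```rust
use std::mem::size_of;

/// Bytes needed to store `len` 32-bit floats back to back,
/// the layout shared by a Float32Array and a Rust `[f32]`.
pub fn packed_bytes(len: usize) -> usize {
    len * size_of::<f32>()
}

/// True when `x` survives a round trip through 32-bit storage unchanged.
/// Values that fail this check are rounded when written to a Float32Array.
pub fn fits_f32_exactly(x: f64) -> bool {
    (x as f32) as f64 == x
}
```

For example, one million elements occupy 4,000,000 bytes, and 2.0 is stored exactly while 0.1 is rounded to the nearest 32-bit float.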
Implementations
1. Pure JavaScript
function pure_js_modify_array(floatArray) {
  for (let i = 0; i < floatArray.length; i++) {
    floatArray[i] += 2.0;
  }
}

function pure_js_fibon(n) {
  if (n <= 1) return n;
  let a = 0, b = 1;
  for (let i = 2; i <= n; i++) {
    [a, b] = [b, a + b];
  }
  return b;
}
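One caveat for the Fibonacci task: JavaScript numbers are 64-bit floats, so pure_js_fibon is exact only while the result stays below 2^53 (up to n = 78), whereas a u64 result holds up to n = 93. The Rust sketch below illustrates both limits. The names fib_f64 and fib_checked are hypothetical and do not appear in the benchmark; fib_f64 only imitates the JavaScript arithmetic using Rust's f64.

```rust
/// Mirrors the JavaScript loop: every intermediate value is a 64-bit float.
pub fn fib_f64(n: u32) -> f64 {
    if n <= 1 {
        return f64::from(n);
    }
    let (mut a, mut b) = (0.0_f64, 1.0_f64);
    for _ in 2..=n {
        let next = a + b;
        a = b;
        b = next;
    }
    b
}

/// Integer version that reports overflow instead of wrapping.
/// Returns None once the result no longer fits in a u64.
pub fn fib_checked(n: u32) -> Option<u64> {
    if n == 0 {
        return Some(0);
    }
    let (mut a, mut b) = (0_u64, 1_u64);
    for _ in 2..=n {
        let next = a.checked_add(b)?;
        a = b;
        b = next;
    }
    Some(b)
}
```

With a plain + the integer version would wrap silently in a default release build, so returning an Option makes the limit explicit. Neither limit affects the timings if the benchmark stays at small n.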
2. Wasm-bindgen crate
We can create a Float32Array in JavaScript, which is highly efficient and can be passed to Rust:
const floatArray = new Float32Array(len);
In Rust, using wasm-bindgen, we can receive this array as a slice (&[f32]). However, this copies the data from JavaScript memory into Rust (WASM linear) memory, and the Vec<f32> returned by the function below is copied back out into a new Float32Array. This overhead is negligible for small arrays but can become significant for large arrays or performance-critical workloads.
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn wasm_modify_array(arr: &[f32]) -> Vec<f32> {
    arr.iter().map(|&x| x + 2.0f32).collect()
}

#[wasm_bindgen]
pub fn wasm_fibon(n: u32) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => {
            let mut a = 0;
            let mut b = 1;
            for _ in 2..=n {
                let tmp = a + b;
                a = b;
                b = tmp;
            }
            b
        }
    }
}
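The array function above allocates a fresh Vec for its result on every call. As a point of comparison, here is a minimal sketch of the allocating shape next to an in-place variant. The names modify_array_alloc and modify_array_in_place are hypothetical, and the #[wasm_bindgen] attribute is left off so the sketch compiles without the crate. To my understanding, wasm-bindgen still copies a &mut [f32] argument into WASM memory and back out afterwards, so the in-place form would remove the result allocation but not the copying. This variant was not part of the benchmark.

```rust
/// Same shape as `wasm_modify_array`: reads the input and allocates a new Vec.
pub fn modify_array_alloc(arr: &[f32]) -> Vec<f32> {
    arr.iter().map(|&x| x + 2.0).collect()
}

/// In-place variant: updates the caller's buffer and allocates nothing.
pub fn modify_array_in_place(arr: &mut [f32]) {
    for x in arr.iter_mut() {
        *x += 2.0;
    }
}
```

Both functions produce the same values; they differ only in who owns the output buffer.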
3. Raw Wasm
This section demonstrates a low-level, raw WebAssembly approach to modifying arrays in Rust, including both a scalar version and a SIMD-accelerated version. In Rust, the functions use raw pointers (*mut f32) and slice conversion (from_raw_parts_mut) to manipulate memory directly.
Note: To enable SIMD, RUSTFLAGS="-C target-feature=+simd128" is required at compile time.
Note: unsafe blocks are required because this approach bypasses Rust’s usual memory safety guarantees.
Note: Not all browsers support SIMD by default, though support is widespread in modern versions of major browsers like Chrome, Firefox, Edge, and Safari. Older browser versions or certain less common browsers may lack support.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn pure_wasm_modify_array(ptr: *mut f32, len: usize) {
    // The caller must pass a pointer to `len` valid f32 values in WASM linear memory.
    let slice: &mut [f32] = unsafe { std::slice::from_raw_parts_mut(ptr, len) };
    for item in slice.iter_mut() {
        *item += 2.0f32;
    }
}

#[cfg(target_arch = "wasm32")]
use std::arch::wasm32::*;

#[cfg(target_arch = "wasm32")]
#[target_feature(enable = "simd128")]
#[unsafe(no_mangle)]
pub unsafe extern "C" fn pure_wasm_modify_array_simd(ptr: *mut f32, len: usize) {
    let slice = unsafe { std::slice::from_raw_parts_mut(ptr, len) };
    // Build the constant vector once instead of on every iteration.
    let two = f32x4_splat(2.0);
    // Largest multiple of 4 not exceeding len: the part processed 4 lanes at a time.
    let chunks = len & !3;
    for i in (0..chunks).step_by(4) {
        // v128_load and v128_store do not require 16-byte alignment.
        let v = unsafe { v128_load(slice.as_ptr().add(i) as *const v128) };
        let res = f32x4_add(v, two);
        unsafe { v128_store(slice.as_mut_ptr().add(i) as *mut v128, res) };
    }
    // Scalar tail for the remaining 0 to 3 elements.
    for i in chunks..len {
        unsafe { *slice.get_unchecked_mut(i) += 2.0f32 };
    }
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn pure_wasm_fibon(n: u32) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => {
            let mut a = 0;
            let mut b = 1;
            for _ in 2..=n {
                let tmp = a + b;
                a = b;
                b = tmp;
            }
            b
        }
    }
}
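The raw functions above take a pointer, so the JavaScript side needs a buffer inside WASM linear memory to point at. The article does not show how that buffer is obtained. One common approach, sketched here under that assumption, is to export a pair of allocation helpers. The names alloc_f32 and dealloc_f32 are hypothetical, and the #[unsafe(no_mangle)] attribute used on the functions above would need to be added to actually export them.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Reserve zero-initialized room for `len` f32 values and return the pointer.
/// Returns null when `len` is 0, the size overflows, or allocation fails.
pub extern "C" fn alloc_f32(len: usize) -> *mut f32 {
    match Layout::array::<f32>(len) {
        Ok(layout) if layout.size() > 0 => unsafe { alloc_zeroed(layout) as *mut f32 },
        _ => std::ptr::null_mut(),
    }
}

/// Free a buffer previously returned by `alloc_f32(len)`.
///
/// # Safety
/// `ptr` must come from `alloc_f32` called with the same `len`,
/// and must not be used after this call.
pub unsafe extern "C" fn dealloc_f32(ptr: *mut f32, len: usize) {
    if ptr.is_null() {
        return;
    }
    if let Ok(layout) = Layout::array::<f32>(len) {
        unsafe { dealloc(ptr as *mut u8, layout) };
    }
}
```

On the JavaScript side, the returned pointer can serve as the byte offset of a Float32Array view over the module's exported memory buffer, so the data is written once and modified in place without the copies that the wasm-bindgen version makes. Such a view should be recreated if the WASM memory grows, because growth replaces the underlying buffer.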
Results
These results are plotted in the graph, illustrating how raw WASM, especially when combined with SIMD, consistently outperforms higher-level interfaces for computationally intensive tasks.
Figure 1: The average execution time, calculated from 1,000 iterations across 50 runs, shows that raw WASM, particularly with SIMD, achieved the shortest execution time.
Note: Numbers for Fibonacci are very small, so these differences are not meaningful.
| Approach | Task | Avg (ms) | Min (ms) | Max (ms) |
|---|---|---|---|---|
| Pure JS | Modify Array | 1.403 | 1.341 | 1.643 |
| wasm-bindgen | Modify Array | 1.623 | 1.550 | 1.850 |
| Raw WASM | Modify Array | 0.353 | 0.353 | 0.357 |
| Raw WASM (SIMD) | Modify Array | 0.231 | 0.230 | 0.233 |
| Pure JS | Fibonacci | 0.00120 | 0.00060 | 0.00365 |
| wasm-bindgen | Fibonacci | 0.00021 | 0.00015 | 0.00030 |
| Raw WASM | Fibonacci | 0.00019 | 0.00015 | 0.00025 |

Table 1: Low-level WASM implementations provide both high speed and stable execution times across repeated runs.
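The Avg, Min, and Max columns can be produced by timing batches of calls and reducing the per-run samples. The sketch below shows one plausible reading of "1,000 iterations across 50 runs": each run times a batch of calls and records the mean time per call. It is not the harness that produced Table 1. Stats, summarize, and bench are hypothetical names, and std::time::Instant is not available on wasm32-unknown-unknown, so this form only runs natively.

```rust
use std::time::Instant;

/// Summary of per-run timings, in milliseconds.
#[derive(Debug, Clone, Copy)]
pub struct Stats {
    pub avg_ms: f64,
    pub min_ms: f64,
    pub max_ms: f64,
}

/// Reduce per-run samples (already in milliseconds) to avg/min/max.
/// Returns None for an empty sample set.
pub fn summarize(samples_ms: &[f64]) -> Option<Stats> {
    if samples_ms.is_empty() {
        return None;
    }
    let sum: f64 = samples_ms.iter().sum();
    let min_ms = samples_ms.iter().copied().fold(f64::INFINITY, f64::min);
    let max_ms = samples_ms.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(Stats {
        avg_ms: sum / samples_ms.len() as f64,
        min_ms,
        max_ms,
    })
}

/// Time `runs` batches of `iters` calls to `f`.
/// Each sample is the mean time per call within one batch.
pub fn bench<F: FnMut()>(runs: usize, iters: u32, mut f: F) -> Option<Stats> {
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        for _ in 0..iters {
            f();
        }
        let elapsed_ms = start.elapsed().as_secs_f64() * 1e3;
        samples.push(elapsed_ms / f64::from(iters.max(1)));
    }
    summarize(&samples)
}
```

Averaging over a batch matters most for the Fibonacci rows, where a single call takes well under a microsecond and is hard to time on its own.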
Discussion
In array modification, Raw WASM (0.353 ms) is ~$4\times$ faster than Pure JS (1.403 ms). With SIMD (0.231 ms), performance improves to ~$6\times$ faster than Pure JS. wasm-bindgen (1.623 ms) is about 16% slower than Pure JS, which is consistent with the copying overhead described in the wasm-bindgen section. For the Fibonacci calculation, all approaches complete extremely quickly, with differences being practically negligible.
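For reference, the array-modification ratios follow directly from the Avg column of Table 1. The subscripted symbols are only shorthand introduced here for the four average times.

```latex
\begin{align*}
\frac{t_{\text{JS}}}{t_{\text{raw}}} &= \frac{1.403}{0.353} \approx 3.97
  && \text{(Raw WASM vs. Pure JS)} \\
\frac{t_{\text{JS}}}{t_{\text{SIMD}}} &= \frac{1.403}{0.231} \approx 6.07
  && \text{(Raw WASM with SIMD vs. Pure JS)} \\
\frac{t_{\text{raw}}}{t_{\text{SIMD}}} &= \frac{0.353}{0.231} \approx 1.53
  && \text{(gain from SIMD alone)} \\
\frac{t_{\text{bindgen}}}{t_{\text{JS}}} &= \frac{1.623}{1.403} \approx 1.16
  && \text{(wasm-bindgen vs. Pure JS)}
\end{align*}
```

So SIMD contributes roughly a 1.5x gain on top of raw WASM, well short of the 4x that four-lane processing could give in the ideal case. A plausible explanation, not measured here, is that the loop is bound by memory traffic more than by arithmetic.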
Conclusion
Note: These conclusions are based on a small-scale benchmark and should be interpreted cautiously.
- Use pure JavaScript for simplicity when performance is “good enough.”
- Use wasm-bindgen for convenience when integrating Rust logic into JS-heavy projects.
- Raw WASM exports offer higher performance, particularly for large or compute-intensive operations, but require more careful memory management and lower-level coding.
- SIMD instructions further improve performance for workloads that are highly parallelizable.