Nithin Bharadwaj

WebAssembly in Production: Memory, Threading, and Performance Techniques That Actually Work


I remember the first time I tried to run heavy image processing in a browser. The JavaScript loop stuttered, the page froze, and the user had already closed the tab. That’s when I discovered WebAssembly – a way to run near-native code right inside the browser. But loading a .wasm file is only the beginning. To really speed things up, you need to handle memory, threading, and error cases. Over the years I’ve built a handful of techniques that turn WebAssembly from a fancy demo into a reliable part of any web application. Let me walk you through them, one by one.


First, you have to manage the module’s entire life cycle. If your page loads five different WASM components, you don’t want to fetch and compile them every time the user navigates. I built a simple manager that caches compiled modules and tracks their memory usage. It checks if WebAssembly is supported and falls back to a pure JavaScript version if not. The code below also sets a timeout for slow networks – because nobody wants a spinning loader forever.

class WasmModuleManager {
  constructor() {
    this.cache = new Map();
    this.instances = new Map();
    this.timers = [];
    this.currentInstance = null; // last instantiated module, used by import callbacks
    this.ready = false;
    this.fallbackMode = false;
  }

  async loadModule(url, imports = {}) {
    if (this.cache.has(url)) {
      return this.cache.get(url).instance;
    }

    if (typeof WebAssembly === 'undefined' || !WebAssembly.instantiateStreaming) {
      console.warn('WebAssembly not supported, switching to the JavaScript fallback');
      this.fallbackMode = true;
      return null;
    }

    try {
      const fetchPromise = fetch(url);
      const instantiatePromise = WebAssembly.instantiateStreaming(
        fetchPromise,
        this.buildImports(imports)
      );

      // The race abandons a slow load; note that it does not cancel the fetch.
      const timeoutPromise = new Promise((_, reject) =>
        setTimeout(() => reject(new Error('WASM load timeout')), 30000)
      );

      const { instance, module } = await Promise.race([instantiatePromise, timeoutPromise]);

      this.cache.set(url, { instance, module });
      this.instances.set(url, instance);
      // The import callbacks built below close over this field, so they
      // resolve against the most recently loaded module's memory.
      this.currentInstance = instance;

      this.initMemoryMonitoring(instance);
      this.ready = true;

      console.log(`WASM module loaded: ${url} (${instance.exports.memory.buffer.byteLength} bytes heap)`);
      return instance;
    } catch (error) {
      console.error('WASM load failed:', error);
      return null;
    }
  }

  buildImports(userImports) {
    return {
      env: {
        abort: (msg, file, line, column) => {
          console.error(`WASM abort: ${msg} at ${file}:${line}:${column}`);
        },
        log: (ptr, len) => {
          // currentInstance is set once instantiation completes; the module
          // cannot call back into log before that point.
          const bytes = new Uint8Array(this.currentInstance.exports.memory.buffer, ptr, len);
          const msg = new TextDecoder().decode(bytes);
          console.log('[WASM]', msg);
        },
        ...userImports.env,
      },
      ...userImports,
    };
  }

  initMemoryMonitoring(instance) {
    const memory = instance.exports.memory;
    let lastSize = memory.buffer.byteLength;

    // Keep a handle so dispose() can stop the polling loop.
    this.timers = this.timers || [];
    this.timers.push(setInterval(() => {
      const currentSize = memory.buffer.byteLength;
      if (currentSize > lastSize) {
        console.log(`WASM memory grew: ${lastSize} -> ${currentSize} bytes`);
        lastSize = currentSize;
      }
    }, 2000));
  }

  getInstance(url) {
    return this.instances.get(url);
  }

  dispose(url) {
    if (this.instances.has(url)) {
      // exports is read-only on a WebAssembly.Instance, so we can't null it;
      // dropping our references lets the instance and its memory be collected.
      this.instances.delete(url);
      this.cache.delete(url);
    }
    (this.timers || []).forEach(clearInterval);
    this.timers = [];
  }
}

const wasmManager = new WasmModuleManager();
const instance = await wasmManager.loadModule('/compute.wasm');

Now, the biggest mistake people make is copying data back and forth between JavaScript and WASM memory. I used to create new arrays every time I wanted to pass a large buffer. That killed all the speed gains. Instead, you can create a typed array that points directly into the WASM linear memory and write into it in place – no intermediate buffers, no garbage collector thrashing. The bridge below handles strings and arrays of numbers, and frees everything it allocated when you call dispose.

class WasmDataBridge {
  constructor(instance) {
    this.instance = instance;
    this.memory = instance.exports.memory;
    this.encoder = new TextEncoder();
    this.decoder = new TextDecoder();
    this.allocated = new Set();
  }

  malloc(size) {
    const ptr = this.instance.exports.malloc(size);
    this.allocated.add(ptr);
    return ptr;
  }

  free(ptr) {
    if (this.allocated.has(ptr)) {
      this.instance.exports.free(ptr);
      this.allocated.delete(ptr);
    }
  }

  // Note: views into memory.buffer are invalidated if the WASM memory grows
  // (for example inside a later malloc), so write through them immediately.
  writeFloat64Array(arr) {
    const byteLength = arr.length * 8;
    const ptr = this.malloc(byteLength);
    const view = new Float64Array(this.memory.buffer, ptr, arr.length);
    view.set(arr);
    return { ptr, length: arr.length };
  }

  readFloat64Array(ptr, length) {
    // Array.from copies the data out, so the result survives memory growth.
    const view = new Float64Array(this.memory.buffer, ptr, length);
    return Array.from(view);
  }

  writeString(str) {
    const bytes = this.encoder.encode(str);
    const ptr = this.malloc(bytes.length + 1);
    const view = new Uint8Array(this.memory.buffer, ptr, bytes.length + 1);
    view.set(bytes);
    view[bytes.length] = 0;
    return ptr;
  }

  readString(ptr) {
    const view = new Uint8Array(this.memory.buffer);
    let end = ptr;
    // Scan for the NUL terminator, stopping at the end of the heap
    // in case the string is unterminated.
    while (end < view.length && view[end] !== 0) end++;
    return this.decoder.decode(view.subarray(ptr, end));
  }

  dispose() {
    for (const ptr of this.allocated) {
      this.instance.exports.free(ptr);
    }
    this.allocated.clear();
  }
}

Some WASM functions run for seconds – matrix multiplication, physics simulations, video encoding. If you call them on the main thread, your whole page freezes. I learned to push those computations into a Web Worker. The worker loads the same WASM module and communicates with the main thread using postMessage. With SharedArrayBuffer and cross-origin isolation, you can even share the same memory buffer without copying. Here is a worker pool that reuses workers and handles multiple requests.

// wasm-worker.js
// Assumes WasmDataBridge is made available to the worker; the file name
// here is illustrative.
importScripts('wasm-data-bridge.js');

// Cache one instance per module URL so repeated tasks skip fetch + compile.
const instances = new Map();

async function getInstance(url) {
  if (!instances.has(url)) {
    const { instance } = await WebAssembly.instantiateStreaming(fetch(url));
    instances.set(url, instance);
  }
  return instances.get(url);
}

self.addEventListener('message', async (event) => {
  const { id, url, method, args } = event.data;

  try {
    const instance = await getInstance(url);
    const bridge = new WasmDataBridge(instance);

    // Marshal arguments: strings and arrays go into WASM memory as raw pointers.
    const wasmArgs = args.map(arg => {
      if (typeof arg === 'string') return bridge.writeString(arg);
      if (Array.isArray(arg)) return bridge.writeFloat64Array(arg).ptr;
      return arg;
    });

    const result = instance.exports[method](...wasmArgs);

    let response;
    if (typeof result === 'number' && result !== 0) {
      // Convention in this example: the module returns a pointer to a
      // single f64 result, allocated with its own malloc.
      response = bridge.readFloat64Array(result, 1);
      instance.exports.free(result);
    } else {
      response = result;
    }

    bridge.dispose(); // frees everything allocated while marshalling

    self.postMessage({ id, result: response });
  } catch (error) {
    self.postMessage({ id, error: error.message });
  }
});

// main.js
class WasmWorkerPool {
  constructor(workerUrl, moduleUrl, maxWorkers = navigator.hardwareConcurrency || 4) {
    this.moduleUrl = moduleUrl;
    this.workers = [];
    this.queue = [];
    this.pending = new Map();
    this.idCounter = 0;

    for (let i = 0; i < maxWorkers; i++) {
      const entry = { worker: new Worker(workerUrl), busy: false };
      // Bind the handler to this entry so we know which worker finished.
      entry.worker.onmessage = (e) => this.handleResult(entry, e.data);
      this.workers.push(entry);
    }
  }

  call(method, args = []) {
    return new Promise((resolve, reject) => {
      const id = this.idCounter++;
      this.pending.set(id, { resolve, reject });
      this.queue.push({ id, method, args, url: this.moduleUrl });
      this.processQueue();
    });
  }

  processQueue() {
    while (this.queue.length > 0) {
      const workerEntry = this.workers.find(w => !w.busy);
      if (!workerEntry) break;

      const task = this.queue.shift();
      workerEntry.busy = true;
      workerEntry.worker.postMessage(task);
    }
  }

  handleResult(workerEntry, data) {
    const { id, result, error } = data;
    const pending = this.pending.get(id);
    if (pending) {
      this.pending.delete(id);
      if (error) pending.reject(new Error(error));
      else pending.resolve(result);
    }

    workerEntry.busy = false;
    this.processQueue();
  }

  terminate() {
    this.workers.forEach(entry => entry.worker.terminate());
    this.workers = [];
    this.queue = [];
    this.pending.clear();
  }
}

WASM doesn’t have garbage collection. Every byte you allocate has to be freed. For many small allocations, like building a mesh or processing a frame, the standard malloc can fragment your heap. I switched to an arena allocator inside the WASM module. You reserve a big block of memory and hand out pointers sequentially. When the frame ends, you reset the offset – instant cleanup.

// arena.c – compile with:
//   emcc -O3 -s STANDALONE_WASM --no-entry \
//        -s EXPORTED_FUNCTIONS=_arena_alloc,_arena_reset -o arena.wasm
#include <stdint.h>

static uint8_t arena[1024 * 1024]; // 1 MB backing block
static uint32_t arena_offset = 0;

void* arena_alloc(uint32_t size) {
    uint32_t aligned = (size + 7) & ~7u;  // round up to 8-byte alignment
    if (arena_offset + aligned > sizeof(arena)) return 0;  // out of space
    void* ptr = &arena[arena_offset];
    arena_offset += aligned;
    return ptr;
}

void arena_reset(void) {
    arena_offset = 0;
}

From JavaScript you call arena_reset() after each frame or batch of work. No memory leaks, no fragmentation.
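To show the per-frame calling pattern without compiling arena.wasm, here is a JavaScript stub that mimics the C module's exports. In production the `arena_alloc` and `arena_reset` functions come from `instance.exports`; everything below is illustrative only.

```javascript
// Illustrative stub of the arena exports; in real code these come from
// instance.exports of the compiled arena.wasm.
function makeArenaStub(size = 1024, base = 8) {
  let offset = 0;
  return {
    arena_alloc(n) {
      const aligned = (n + 7) & ~7;          // 8-byte alignment, as in arena.c
      if (offset + aligned > size) return 0; // 0 signals "out of arena space"
      const ptr = base + offset;             // base keeps 0 free as the error value
      offset += aligned;
      return ptr;
    },
    arena_reset() { offset = 0; },
  };
}

const exports_ = makeArenaStub();

// Per-frame pattern: allocate freely during the frame...
const a = exports_.arena_alloc(100);
const b = exports_.arena_alloc(50);

// ...then reset once when the frame ends: instant cleanup, no per-pointer frees.
exports_.arena_reset();
const c = exports_.arena_alloc(100); // reuses the same region as `a`
```

The pattern is the whole point: callers never free individual pointers, they just must not hold them across a reset.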


When you pass a pointer from WASM back to JavaScript, one wrong number can corrupt your whole page. I always validate pointers before reading. The function below checks if the export exists, if arguments are sensible, and if the returned pointer falls inside the memory buffer. It saved me countless hours of debugging.

function safeCall(instance, methodName, ...args) {
  const fn = instance.exports[methodName];
  if (typeof fn !== 'function') {
    throw new Error(`WASM export "${methodName}" not found`);
  }

  args.forEach((arg, i) => {
    if (typeof arg === 'number' && (arg < 0 || arg > 1e9)) {
      throw new Error(`Argument ${i} out of valid range: ${arg}`);
    }
  });

  const result = fn(...args);

  // Heuristic check: when the convention is that this export returns a
  // pointer, it should at least fall inside the linear memory.
  if (typeof result === 'number' && result > 0) {
    const memory = instance.exports.memory;
    if (result >= memory.buffer.byteLength) {
      throw new Error(`WASM returned invalid pointer: ${result}`);
    }
  }

  return result;
}

Not every browser supports WebAssembly. And even if it does, the module might fail to load because of a Content Security Policy or a network error. I keep a pure JavaScript version of every WASM function. A simple feature check at startup decides which one to use. The FastMath class below shows a matrix multiply that falls back when WASM isn’t available.

class FastMath {
  constructor() {
    this.useWasm = false;
    this.instance = null;
  }

  async init() {
    // loadModule returns null (rather than throwing) on failure,
    // so check the result instead of relying on try/catch.
    this.instance = await wasmManager.loadModule('/math.wasm');
    if (this.instance) {
      this.useWasm = true;
    } else {
      console.log('Using JavaScript fallback for math operations');
    }
  }

  // A and B are flat row-major arrays; the dimensions are passed explicitly
  // so the WASM and JavaScript paths agree on the data layout.
  matrixMultiply(A, B, rowsA, colsA, colsB) {
    if (this.useWasm && this.instance) {
      const bridge = new WasmDataBridge(this.instance);
      const aPtr = bridge.writeFloat64Array(A);
      const bPtr = bridge.writeFloat64Array(B);
      const resultPtr = this.instance.exports.matrixMultiply(
        aPtr.ptr, bPtr.ptr, rowsA, colsA, colsB
      );
      const result = bridge.readFloat64Array(resultPtr, rowsA * colsB);
      bridge.free(aPtr.ptr);
      bridge.free(bPtr.ptr);
      // The result buffer was allocated inside the module, so release it
      // through the module's own allocator.
      this.instance.exports.free(resultPtr);
      return result;
    }

    // JavaScript fallback on the same flat layout
    const result = new Array(rowsA * colsB).fill(0);
    for (let i = 0; i < rowsA; i++) {
      for (let j = 0; j < colsB; j++) {
        for (let k = 0; k < colsA; k++) {
          result[i * colsB + j] += A[i * colsA + k] * B[k * colsB + j];
        }
      }
    }
    return result;
  }
}

Large WASM modules can take seconds to compile. Normally WebAssembly.instantiateStreaming compiles while the file is still downloading, but it hides download progress. When I want a progress bar, I trade streaming compilation away: read the response body in chunks, report progress based on the Content-Length header, and compile the assembled buffer afterwards. The code below does exactly that.

async function loadWasmWithProgress(url, onProgress, imports = {}) {
  const response = await fetch(url);
  // Content-Length can be missing (e.g. chunked transfer encoding);
  // in that case we simply skip progress reporting.
  const total = parseInt(response.headers.get('Content-Length') || '0', 10);

  const reader = response.body.getReader();
  const chunks = [];
  let received = 0;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.length;
    if (onProgress && total > 0) onProgress(received / total);
  }

  const arrayBuffer = await new Blob(chunks).arrayBuffer();

  const module = await WebAssembly.compile(arrayBuffer);
  const instance = await WebAssembly.instantiate(module, imports);
  return instance;
}

How do you know if the WASM version is actually faster? I profile every call with performance.now(). If a single call takes more than a millisecond, I log it. Over many calls I compute averages. The profiler below tracks totals and counts, so you can compare different implementations.

class WasmProfiler {
  constructor() {
    this.totals = new Map();
    this.counts = new Map();
  }

  call(name, fn, ...args) {
    const start = performance.now();
    const result = fn(...args);
    const elapsed = performance.now() - start;

    // Flag individual slow calls immediately; averages come from report().
    if (elapsed > 1) {
      console.warn(`Slow call: ${name} took ${elapsed.toFixed(3)}ms`);
    }

    this.totals.set(name, (this.totals.get(name) || 0) + elapsed);
    this.counts.set(name, (this.counts.get(name) || 0) + 1);

    return result;
  }

  report() {
    for (const [name, total] of this.totals) {
      const avg = total / this.counts.get(name);
      console.log(`${name}: avg ${avg.toFixed(3)}ms, calls ${this.counts.get(name)}`);
    }
  }
}
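To make the comparison concrete, here is a self-contained usage sketch of the same pattern: `profiled()` is a function-level variant of `WasmProfiler.call`, and the summing functions stand in for a WASM export and its JavaScript fallback (all names here are illustrative).

```javascript
// Minimal stand-alone version of the profiling pattern above.
const stats = new Map();

function profiled(name, fn, ...args) {
  const start = performance.now();
  const result = fn(...args);
  const elapsed = performance.now() - start;

  const entry = stats.get(name) || { total: 0, count: 0 };
  entry.total += elapsed;
  entry.count += 1;
  stats.set(name, entry);
  return result;
}

// Compare two implementations of the same operation over many calls.
const data = Array.from({ length: 10000 }, (_, i) => i);
const sumLoop = arr => { let s = 0; for (const x of arr) s += x; return s; };
const sumReduce = arr => arr.reduce((s, x) => s + x, 0);

for (let i = 0; i < 50; i++) {
  profiled('sum-loop', sumLoop, data);
  profiled('sum-reduce', sumReduce, data);
}

for (const [name, { total, count }] of stats) {
  console.log(`${name}: avg ${(total / count).toFixed(3)}ms over ${count} calls`);
}
```

Averaging over many calls matters: a single timing is dominated by JIT warm-up and cache effects, so one fast or slow call tells you very little.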

With SharedArrayBuffer you can avoid copies between workers entirely. Both the main thread and the worker see the same memory. This is perfect for real-time audio processing or physics. But you need cross-origin isolation headers (COOP and COEP). Once they’re set, you pass the buffer to the worker and both sides can write and read directly. A simple pattern:

// main.js
const sab = new SharedArrayBuffer(1024 * 1024); // 1 MB shared
const worker = new Worker('worker.js');
worker.postMessage(sab);

// worker.js
self.onmessage = (e) => {
  const sab = e.data;
  const view = new Float64Array(sab);
  // Both sides can now read and write this view directly; when ordering
  // matters, coordinate with Atomics on an Int32Array view of the buffer.
};

Finally, I keep the WASM layer clean. JavaScript handles DOM, events, and user interaction. WASM only does number crunching. I create a facade object that the rest of my application talks to. It loads the WASM module once, translates high-level calls into low-level pointer operations, and returns results as plain JavaScript objects.

class WasmFacade {
  constructor(moduleUrl) {
    this.moduleUrl = moduleUrl;
    this.instance = null;
    this.ready = false;
  }

  async init() {
    // Reuse a single in-flight load so concurrent callers don't race.
    if (!this.initPromise) {
      this.initPromise = wasmManager.loadModule(this.moduleUrl).then(instance => {
        this.instance = instance;
        this.ready = instance !== null;
      });
    }
    return this.initPromise;
  }

  async processImage(pixels, width, height) {
    if (!this.ready) await this.init();

    const bridge = new WasmDataBridge(this.instance);
    const pixelPtr = bridge.malloc(pixels.length);
    new Uint8Array(this.instance.exports.memory.buffer, pixelPtr, pixels.length).set(pixels);

    const resultPtr = this.instance.exports.applyFilter(pixelPtr, width, height);
    const resultLength = width * height * 4; // RGBA output
    // Copy the result out before freeing anything; the view would be
    // invalidated if the memory grew afterwards.
    const resultView = new Uint8Array(this.instance.exports.memory.buffer, resultPtr, resultLength);
    const result = new Uint8Array(resultView);

    bridge.free(pixelPtr);
    // applyFilter allocated the result inside the module, so free it there too.
    this.instance.exports.free(resultPtr);
    return result;
  }
}

I use this pattern in all my side projects. It keeps the code readable and the performance predictable. Each technique here solved a real problem I faced – slow loads, crashes, frozen UI, memory leaks. Combine them, and you get a WebAssembly integration that feels like part of the platform, not a fragile add-on.
