Yevhen Kozachenko 🇺🇦

Posted on • Originally published at ekwoster.dev

🚀 Building AI Agents in the Browser with WebAssembly (WASM) + Web Workers + LLM APIs: A Game-Changer for Web Apps

Introduction: A Brave New Web

Imagine this: an AI agent that analyzes user behavior, chats with visitors, and triggers workflows, right from your browser. No backend. No serverless functions. Just a snappy, privacy-friendly AI that lives entirely on your user's device. Sci-fi? Nope. Welcome to the power of WebAssembly (WASM), Web Workers, and LLM APIs.

In this post, we'll explore a forward-thinking architecture for building intelligent in-browser agents that use AI models (via APIs like OpenAI, Mistral, or llama.cpp) to drastically enhance UX and create autonomous workflows, all without tanking client-side performance.

👉 Spoiler alert: we'll build a prototype AI agent in Rust compiled to WASM, run it inside a Web Worker, and connect it to an LLM API to create a contextual assistant for any SPA.


Why This Matters: The Problem with Typical Chatbot Setups 🤖

Traditional chatbots, even the AI-powered ones, are weighed down by backend dependencies:

  • Server-based logic means slower response times
  • They often need constant internet access
  • They don't scale well for edge use or offline support

If you're building modern web apps, these limitations hurt UX and performance big time.

But with WASM, we can compile a fast language like Rust right into the browser, giving us control over local execution, memory, and async processing, and making the AI feel native.

Let's dive deep into this.


Architecture Overview 🏗️

Here's what we're building:

+---------------------------+
|       User Browser        |
|  +---------------------+  |
|  |      React App      |  |
|  +----------+----------+  |
|             |             |
|  +----------v----------+  |
|  |     Web Worker      |  |
|  |  +---------------+  |  |
|  |  |  WASM module  |  |  |  <-- compiled Rust module
|  |  +---------------+  |  |
|  |    LLM API call     |  |
|  +---------------------+  |
+---------------------------+
  • 🧠 AI logic runs inside a WASM-compiled module (Rust-based)
  • 🧵 Multithreading & offloading via Web Workers
  • 🌐 An LLM API (like OpenAI's GPT-4) provides the language intelligence
  • 🕸️ A React app as the user interface

Step 1: Building an AI Agent Core in Rust + WASM 🚀

First, we'll write the agent in Rust. This example uses wasm-bindgen for the JS bindings and reqwest (which compiles to WASM and rides on the browser's fetch) for the API call.
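
Before the code, here's a minimal Cargo.toml sketch for the crate. The crate names are real; the exact versions and feature flags are assumptions to adjust for your setup:

📁 agent/Cargo.toml

[package]
name = "agent"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"  # lets #[wasm_bindgen] export async fns
serde_json = "1"
# reqwest's wasm32 backend delegates to the browser's fetch()
reqwest = { version = "0.11", features = ["json"] }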

๐Ÿ“ agent/src/lib.rs

use wasm_bindgen::prelude::*;
use reqwest::Client;

#[wasm_bindgen]
pub async fn query_llm(prompt: String) -> Result<JsValue, JsValue> {
    // On wasm32, reqwest delegates to the browser's fetch() under the hood
    let client = Client::new();

    let api_url = "https://api.openai.com/v1/chat/completions";
    let body = serde_json::json!({
        "model": "gpt-4",
        "messages": [{ "role": "user", "content": prompt }],
        "max_tokens": 150
    });

    let resp = client.post(api_url)
        .bearer_auth("your-api-key") // Never ship a real key in client code; proxy it (see Gotchas)
        .json(&body)
        .send()
        .await
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    let resp_text = resp.text().await.map_err(|e| JsValue::from_str(&e.to_string()))?;

    Ok(JsValue::from_str(&resp_text))
}

๐Ÿ› ๏ธ Compile to WASM:

wasm-pack build --target web
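
This emits a pkg/ directory containing the JS glue (agent.js) and the compiled agent_bg.wasm, which is exactly what the worker imports in the next step.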

Step 2: Integrating the Agent in a Web Worker 💼

๐Ÿ“ ai.worker.js

import init, { query_llm } from './pkg/agent.js';

// Initialize the WASM module once, not on every message
const ready = init();

self.onmessage = async (event) => {
  await ready;
  const prompt = event.data;
  const result = await query_llm(prompt);
  self.postMessage(result);
};

Use a bundler such as Vite to load this worker into a React app; module workers created with new Worker(new URL(...), { type: 'module' }) work out of the box.


Step 3: Connecting to React - Front-End Integration ⚛️

๐Ÿ“ App.jsx

import { useEffect, useState, useRef } from 'react';

function App() {
  const [response, setResponse] = useState('');
  const workerRef = useRef(null);

  useEffect(() => {
    // Spin up the module worker; the bundler resolves the URL at build time
    workerRef.current = new Worker(new URL('./ai.worker.js', import.meta.url), { type: 'module' });
    workerRef.current.onmessage = (e) => setResponse(e.data);
    // Terminate the worker when the component unmounts
    return () => workerRef.current.terminate();
  }, []);

  const sendPrompt = () => {
    const prompt = "What is the capital of France?";
    workerRef.current.postMessage(prompt);
  };

  return (
    <div className="p-8 font-sans">
      <h1 className="text-2xl font-bold">Browser AI Agent 🧠</h1>
      <button className="mt-4 bg-green-600 px-4 py-2 text-white" onClick={sendPrompt}>Ask AI</button>
      <div className="mt-4">Response: <pre>{response}</pre></div>
    </div>
  );
}

export default App;
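
One design note: the cleanup function returned from useEffect terminates the worker on unmount. That matters more than it looks, because under React 18 StrictMode effects run twice in development, and without terminate() you'd silently leak a second worker.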

Why This Rocks 🔥

✅ It runs in-browser (privacy-safe!)

✅ WASM gives you near-native speed

✅ Heavy lifting is offloaded from the main JS thread

✅ Works offline (with a local model!) or connects securely to LLM APIs

✅ Great for chatbots, autonomous UI agents, and real-time assistants 🚀


Going Further: Local Models with llama.cpp + WASM

You can run Llama 2 in the browser using WASM ports of llama.cpp, which makes your agent fully functional offline. Browser memory limits are real, though, so we recommend:

  • Using quantized 4-bit models
  • Limiting the context window

You can even build Chrome extensions with in-browser LLMs ⚡ (a sketch of the local-model approach follows below).
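
As one concrete starting point, here's a minimal sketch using the web-llm runtime, a well-documented in-browser option with an OpenAI-compatible API. Hedges up front: web-llm runs on WebGPU rather than a pure llama.cpp WASM port, and the model ID below is an assumption; check the project's current model list.

📁 local.worker.js (illustrative)

import { CreateMLCEngine } from '@mlc-ai/web-llm';

// Download, cache, and compile the model once per worker.
// The model ID is an assumption — use one from web-llm's model list.
const enginePromise = CreateMLCEngine('Llama-3.1-8B-Instruct-q4f32_1-MLC');

self.onmessage = async (event) => {
  const engine = await enginePromise;
  // OpenAI-style chat API, but inference happens entirely on-device
  const reply = await engine.chat.completions.create({
    messages: [{ role: 'user', content: event.data }],
  });
  self.postMessage(reply.choices[0].message.content);
};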



Final Thoughts: The Future is Local, Smart, and Fast 🧠

You've just built a futuristic AI agent that runs in-browser using WASM, Rust, Web Workers, and LLM APIs, all inside a React app. This pattern is enormously useful for building fast assistants, privacy-first automation, and hybrid online/offline tools.

The convergence of WASM, edge computing, and AI is redefining what web applications can do. Start building your agents now, because the web is no longer dumb. It's autonomous.

🙌 Happy Coding!


Gotchas & Tips 🧩

  • Some LLM APIs don't support CORS; route those calls through a small proxy (sketched below), which also keeps your API key off the client
  • Running WASM inside Workers means no direct DOM access
  • Bundle size can grow; optimize and tree-shake!
  • Use Rust's wee_alloc and opt-level = "z" for smaller builds (also sketched below)
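
A minimal proxy sketch for the CORS point (Node 18+, no dependencies; the port and env var name are illustrative, not from this article):

📁 proxy.js (illustrative)

import http from 'node:http';

http.createServer(async (req, res) => {
  // Collect the JSON body forwarded by the browser
  let body = '';
  for await (const chunk of req) body += chunk;

  // Forward to the LLM API; the key stays server-side
  const upstream = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body,
  });

  res.writeHead(upstream.status, {
    'Content-Type': 'application/json',
    'Access-Control-Allow-Origin': '*', // relax CORS for the browser client
  });
  res.end(await upstream.text());
}).listen(8787);

And the size-focused Rust settings from the last bullet, sketched out (lto and the wee_alloc version are my additions, commonly paired with opt-level = "z"):

📁 agent/Cargo.toml (additions)

[profile.release]
opt-level = "z"   # optimize for size rather than speed
lto = true        # link-time optimization trims the binary further

[dependencies]
wee_alloc = "0.4"  # version is an assumption

📁 agent/src/lib.rs (additions)

// Swap in wee_alloc, a smaller (if slower) allocator for WASM builds
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;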

Like this post? 🌟

  • Follow for more creative AI + web dev content
  • Share with your dev community 💌
  • Try building your own on Replit or Vercel!

🧠 If you're building next-gen browser agents or chatbot interfaces using WebAssembly, Rust, and LLM APIs, we offer professional help. Check out our AI Chatbot Development services.
