Ozgun

Posted on Mar 20

How I Built a Browser-Native AI Agent Platform with Pyodide (No Backend Required)

#python #webassembly #ai #javascript

I built AgentOp — a platform where you can
create AI agents and export them as single standalone HTML files that
run entirely in the browser. No server. No Docker. Open the file,
and your Python-powered AI agent is live.

Here's the technical story of how that actually works.

The Core Idea

Most AI agent platforms are server-heavy. You need a backend, a database,
a deployment pipeline. I wanted something different: an agent you could
email to someone as an .html attachment and it would just work.

The key insight: Pyodide 0.29.0 (CPython 3.12 compiled to WebAssembly) lets
you run real Python in the browser. Pair that with LangChain's Python
package and you have a fully capable AI agent runtime with zero backend.

The Architecture

Each generated agent is a self-contained HTML file with three layers:

1. Python runtime (Pyodide + LangChain)
The agent's tool functions are real Python — loaded into the browser
via Pyodide at runtime. LangChain handles the agent loop, tool calling,
and memory.

2. LLM provider layer (switchable at download time)
The user picks their provider when they download the agent:

OpenAI / Anthropic → direct browser fetch() calls to the API
Local WebLLM → @mlc-ai/web-llm runs a quantized model entirely in the browser using WebGPU

3. A universal callLLM() bridge
A single JavaScript function handles all three providers, reading
window.PROVIDER and window.API_KEY at call time:

async function callLLM(prompt, systemPrompt) {
  if (window.PROVIDER === 'local') {
    const response = await window.agentManager.engine
      .chat.completions.create({ messages: [...], temperature: 0.2 });
    return response.choices[0].message.content;
  }
  if (window.PROVIDER === 'openai') {
    const res = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: { 'Authorization': 'Bearer ' + window.API_KEY, ... },
      body: JSON.stringify({ model: 'gpt-4o-mini', messages: [...] })
    });
    return (await res.json()).choices[0].message.content;
  }
  // Anthropic works too — supports direct browser access since Aug 2024
}

The HTML Generator

The backend (Django 5.2, Python 3.12) has an AgentHTMLGenerator class that
assembles the final HTML by injecting components into a Mustache template:

class AgentHTMLGenerator:
    def generate_html(self) -> str:
        template_content = self._get_template_content()
        context = self._build_template_context()
        rendered_html = Template(template_content).render(Context(context))

        if self.provider != "local":
            rendered_html = self._inject_encryption(rendered_html)
            rendered_html = self._inject_pyodide_auto_init(rendered_html)
        else:
            # WebLLM path: LangChain.js + Pyodide bridge
            rendered_html = self._inject_webllm(rendered_html)
            rendered_html = self._inject_langchain_webllm_infrastructure(rendered_html)

        rendered_html = self._inject_runtime_provider_switcher(rendered_html)
        return rendered_html

Important privacy detail: agent HTML is generated dynamically at download
time and never written to disk. No agent code is ever stored on the server.

For the local WebLLM path, it's a hybrid architecture:

LangChain.js handles inference (runs JS natively)
Pyodide handles Python tool execution
A PyodideToolBridge passes tool calls back and forth between the two runtimes via window globals

WebLLM Dual-Mode Function Calling

This was one of the more interesting engineering problems. Not all models
support the same function calling interface, so the platform handles two modes
automatically based on the model selected:

Mode 1 — OpenAI API style (Hermes models)
Hermes models (e.g. Hermes-2-Pro-Mistral-7B) support native tool calling
via LangChain.js. The flow is clean:

User Input → WebLLMAgentManager → LangChain.js Agent → ChatWebLLM
                                          ↓
                                    Tool Call (Native)
                                          ↓
                                  PyodideToolBridge (JS↔Python)
                                          ↓
                                    Python Tools (Pyodide)

Mode 2 — Manual parsing (Llama 3.1 and others)
Llama 3.1 doesn't speak the OpenAI tool format, so the platform injects a
custom system prompt and parses <function> or <tool_call> XML tags from
the raw model output. It's single-shot — the model must emit a valid tool
call on its first response:

User Input → ManualFunctionCallingAgent → Raw WebLLM Engine
                                          ↓
                             Custom System Prompt + Raw Response
                                          ↓
                             Parse <function> / <tool_call> Tags
                                          ↓
                                  PyodideToolBridge (JS↔Python)

Mode is detected automatically from function_calling_method in the model
registry (webllm_models.py). Agent authors don't have to think about it.

Package Management Inside the Browser

This was the trickiest part. Pyodide has its own package ecosystem.
For cloud providers (OpenAI/Anthropic), the agent needs langchain_openai
or langchain_anthropic — but these have to be installed at runtime via
micropip inside the browser.

The generator handles this automatically:

def _filter_packages_by_provider(self, merged_packages):
    if self.provider == "local":
        # No Python LangChain needed — using LangChain.js instead
        return (builtins, filtered_pypi, [])
    elif self.provider == "openai":
        required = {"langchain_openai": "1.0.0.a3"}
        custom_wheels = ["https://www.agentop.com/static/packages/uuid_utils-...whl"]
        return (builtins, {**user_pypi, **required}, custom_wheels)

Yes — some packages needed a custom Pyodide-compatible .whl wheel
(compiled for wasm32). uuid_utils is one example. Building these
took a fair amount of Emscripten time.

API Key Security

A standalone HTML file can't have a server to protect secrets. The
solution: client-side AES-256-GCM encryption via the Web Crypto API.
The user sets a master password once. Key derivation uses
PBKDF2 (100,000 iterations) — the ciphertext is what gets embedded
in the downloaded file, not the raw key.

User enters master password → PBKDF2 key derivation → AES-256-GCM decryption → window.API_KEY populated

The encryption/decryption logic is inlined into the HTML on generation,
so the file works completely offline. Local WebLLM agents skip this entirely —
no API key needed.

What Didn't Work

gRPC in the browser: I tried adding Gemini as a provider via LangChain's Google integration. It uses gRPC under the hood — which doesn't work in browser environments at all. Had to skip it for now.
Not all Python packages compile to WASM: If your agent needs something like numpy for heavy computation, you're usually fine (it's in Pyodide's standard set). But niche packages with C extensions are often missing. You have to compile them with Emscripten yourself.
WebGPU availability: Local WebLLM requires WebGPU. It works great in Chrome/Chromium (tested on desktop and even Steam Deck), but Firefox support is still limited.

The Result

A full marketplace (Django 5.2 + PostgreSQL) where you can browse, rate,
fork, and collect agents — then download any of them as a single .html file
that runs entirely client-side:

Runs Python in the browser via Pyodide 0.29.0 (Python 3.12)
Calls OpenAI/Anthropic APIs directly from JS
Or runs a local LLM entirely offline with WebLLM + WebGPU
Encrypts your API key with AES-256-GCM + PBKDF2 — never touches the server
Works from file:// — no server needed

(https://github.com/ozgunay/agentop)
Try it live: agentop.com

Built with: Django 5.2, Python 3.12, PostgreSQL 15, Pyodide 0.29.0,
LangChain, WebLLM (@mlc-ai/web-llm), LangChain.js, Alpine.js, HTMX,
Mustache (chevron)

Top comments (1)

klement Gunndu • Mar 21

Running Python tools via Pyodide while the LLM inference stays in JS is a neat split — keeps the heavy compute where each runtime is strongest. The client-side key encryption with AES-256-GCM is a smart call for standalone files.