AI assistants are exploding in capability. The modern browser, a sandboxed VM already installed on every device on earth, is an underappreciated piece of the puzzle.
I wrote my first ISAPI DLL in the late 90s. If you know what that means, you know how far the browser has come. If you don't, that's kind of the point.
The browser started as a dumb document viewer. It's now a sandboxed virtual machine with near-native compute, GPU access, cameras, sensors, its own filesystem, and its own threading model. Installed on every device on earth.
The AI explosion is already happening. The browser is how it reaches everyone. And nobody's talking about it.
AI assistants aren't coming, they're already here. Claude writes production code. ChatGPT drafts legal briefs. Copilot autocompletes entire functions. Everyone is focused on the fuel: bigger models, more compute, better training data.
Few are talking about delivery. How do AI assistants actually reach the billions of devices where people need them? And once they get there, how do you give them real autonomy to act on your behalf while keeping them within limits they physically cannot break?
This isn't hypothetical. Projects like OpenClaw are putting AI assistants on people's personal machines right now, with access to files, messages, credentials, and devices. The promise is compelling: your data stays local, your assistant runs under your control. But the reality has been messier. In early 2026, researchers found 135,000 OpenClaw instances running with no authentication, exposing plaintext API keys, conversation histories, and full system access. Malicious skills flooded the marketplace. A critical RCE vulnerability gave attackers full admin access even on localhost-bound setups. CrowdStrike, Cisco, and Trend Micro all published security advisories.
And OpenClaw isn't alone. Microsoft Copilot has leaked emails via zero-click exfiltration. Google's Gemini was tricked into leaking Calendar data through a malicious invite. Replit's AI agent deleted a production database despite explicit instructions not to. 77% of employees are leaking corporate data through AI tools on personal accounts.
The core problem: today's AI assistants respect their limits because they're trained to, not because they're unable to cross them. An AI on a VM can technically access anything on that machine. The constraints are behavioral, not structural. That's a trust problem, not a security model.
The answer has been sitting in front of us the entire time.
Every time you open a browser tab, you provision a fully functioning virtual machine. It has its own memory space, execution environment, storage system, network stack, and threading model. It's sandboxed, secure, and isolated from every other tab.
Thanks to WebAssembly (a W3C standard since 2019, supported by every major browser), that virtual machine runs code at near-native speeds. Not just "fast for a browser" but competitive with compiled C++ running directly on your hardware.
AI assistants are already powerful. Recognizing the browser as a VM can dramatically change (and accelerate) how they reach the world.
The Accidental Operating System
The browser wasn't originally designed to be a virtual machine. It was designed to render documents. But through decades of competitive pressure — browser wars, the rise of web apps, the mobile revolution — it evolved into something far more powerful than its original creators intended. And through decades of adversarial pressure (every hacker, scammer, and malware author trying to break out of the sandbox to steal grandma's banking credentials) it became one of the most hardened security boundaries in computing. The browser sandbox wasn't designed in a lab. It was forged in battle.
A modern browser tab provides:
- Isolated execution: each tab runs in its own sandboxed process, can't access other tabs' memory
- A compute engine: JavaScript for general-purpose logic, WebAssembly for high-performance workloads
- A filesystem: the Origin Private File System (OPFS) gives each origin fast, persistent, private storage
- A threading model: Web Workers and SharedArrayBuffer enable real parallel computation
- A network stack: fetch, WebSocket, WebRTC for HTTP, real-time, and peer-to-peer communication
- A GPU: WebGPU provides direct access to graphics hardware for rendering and compute
- Cameras and microphones: getUserMedia gives direct access to video and audio input
- Sensors: geolocation, accelerometer, gyroscope, ambient light, all via standard APIs
- Bluetooth and USB: Web Bluetooth and WebUSB provide direct peripheral access
- A security model: same-origin policy, Content Security Policy, and process isolation baked in
That's not a document renderer. That's an operating system. And unlike traditional operating systems, the VM is essentially identical across browsers. Chrome, Firefox, Safari, Edge: they all implement the same standards, run the same JavaScript, execute the same Wasm bytecode, expose the same Web APIs. Write for one, it runs on all of them. No porting, no compatibility layer, no "works on Chrome but not Safari" (for the most part). This is the most interchangeable computing environment ever created.
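You can verify this checklist from any script. Here's a minimal probe; the function name and the specific checks are my own illustration, not a standard API, and outside a browser most checks simply report false rather than throwing:

```javascript
// Rough capability probe for the "browser as VM" checklist.
// Safe to run in any JavaScript environment.
function probeCapabilities(g = globalThis) {
  const nav = typeof g.navigator === "object" && g.navigator !== null ? g.navigator : {};
  return {
    wasm: typeof g.WebAssembly === "object",                       // compute engine
    workers: typeof g.Worker === "function",                       // threading model
    fetchApi: typeof g.fetch === "function",                       // network stack
    webgpu: "gpu" in nav,                                          // GPU access
    opfs: typeof nav.storage?.getDirectory === "function",         // private filesystem
    webusb: "usb" in nav,                                          // peripheral access
    camera: typeof nav.mediaDevices?.getUserMedia === "function",  // media input
  };
}

console.log(probeCapabilities());
```

Run it in a browser console and in Node and compare the output: the Wasm engine is everywhere, but the hardware-facing APIs are what make the browser tab distinctive.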
And it's already installed on every device connected to the internet. Every phone, laptop, tablet, smart TV, Chromebook in every school. Billions of devices, already provisioned, already capable.
No app store approval. No installation step. No update mechanism to manage. Just a URL.
WebAssembly: The Universal Instruction Set
The missing piece for years was performance. JavaScript is fast for an interpreted language, but not fast enough for serious computational workloads. You couldn't run a physics simulation or a machine learning model in a browser tab and expect it to compete with native code.
WebAssembly (Wasm) changed that.
Wasm is a binary instruction format: machine code for a virtual processor that every browser implements. Write your application in C, C++, Rust, Go, Python, or nearly any other language, compile it to Wasm, and it runs in the browser only 10-20% slower than native execution. In a sandboxed browser tab. On any device.
Real-world applications already prove it:
- Figma runs its entire design engine in Wasm, used by millions of designers, entirely in a browser tab
- AutoCAD ported its decades-old C++ codebase to the web via Wasm
- Google Earth runs complex 3D rendering through Wasm and WebGL
- StackBlitz WebContainers run full Node.js environments (npm install, dev servers, the works) entirely in-browser
The browser-as-VM is production infrastructure.
The Missing Piece: The Localhost Proxy
The browser VM already comes with GPU, cameras, microphones, sensors, Bluetooth, and USB. But it is, by design, sandboxed. There are things it deliberately can't do: hold API keys securely, access the local filesystem, talk to arbitrary network services, or manage persistent identity across origins.
A lightweight localhost proxy can fill these gaps (here's one I generated using Claude Code and a couple drops of elbow grease). Run a small service on your machine (not a remote server, a local process under your control or under the control of your corporate IT department) that bridges anything the sandbox intentionally excludes:
- API key management: keys never enter the browser's JavaScript context, never appear in source code, never transit a remote server. The proxy makes authenticated calls on behalf of the browser app.
- Local system access: filesystem, IoT devices, home automation, proprietary hardware.
- Persistent identity: maintained across sessions, apps, and tabs.
- External service orchestration: LLM APIs, databases, email, calendar, all routed through the proxy with proper authentication.
The browser provides the sandboxed VM. The proxy fills the deliberate gaps. Together, they form a complete computing platform. And because the proxy runs locally or on your corporate intranet, your data and credentials never leave your direct control unless you choose.
Why This Matters for AI Assistants
The reason tools like OpenClaw and Claude Code have exploded in popularity isn't just that they're smart. It's that they have full access to the machine they're sitting on.
That access is what makes them so powerful. An AI assistant with system access can install packages, run scripts, read and write files, chain together tools, pipe data between programs, and orchestrate workflows that the creators of those tools never imagined. The entire universe of open source software becomes available as a toolkit. Need to process a video? The assistant can install ffmpeg. Need to analyze data? It can pull down pandas and numpy. Need to deploy something? It has git, ssh, docker, whatever the task requires.
This is a fundamentally different category of capability than a chatbot behind an API. It's why developers are so enthusiastic about these tools. The flexibility is extraordinary. The extensibility is nearly infinite.
It's also terrifying.
Giving an AI assistant full control of your machine means exactly that: full control. It can read your SSH keys. It can access your browser cookies. It can see your environment variables, your credentials files, your private repos. Even well-intentioned assistants can make mistakes. And as we've seen with the security incidents above, the attack surface is enormous when the AI has the run of the house.
This is the tension at the heart of the AI assistant revolution: the thing that makes them powerful (system access) is the same thing that makes them dangerous.
Most people resolve this tension by just... hoping for the best. Trusting the AI to stay in its lane. Running it on their main machine because that's where the tools are.
The browser-as-VM resolves it structurally.
Put the AI assistant inside a browser VM and it still has a full computing environment: filesystem (OPFS), compute (Wasm), network access (fetch, WebSocket), GPU (WebGPU), threading (Web Workers). It can still run code, build things, chain tools together. The environment is real and capable.
And it goes further than you might expect. Projects like WebVM already run unmodified Debian Linux inside a browser tab, complete with apt-get, gcc, Python, Node.js, and the full standard Linux toolchain. Container2wasm converts Docker images to run in the browser. v86 emulates a full x86 PC. There's even an early project compiling the Linux kernel directly to WebAssembly without emulation.
The entire open source software ecosystem, millions of packages, is becoming accessible from a browser tab. An AI assistant running in this environment could apt-get install whatever tools it needs, without any of it touching your actual machine.
But everything outside the sandbox is physically unreachable. Your SSH keys, your cookies, your credentials, your personal files: the AI can't touch them. Not because it's been asked nicely. Not because of a system prompt. Because the browser sandbox makes it architecturally impossible.
You get the power of full system access within the VM and the safety of structural isolation from everything else. That's not a tradeoff. That's the whole point.
The Enterprise Problem
This scenario is probably playing out in enterprises everywhere right now.
Imagine a management consultant — not a developer, hasn't written production code in years — who needs to build a working demo for a client. Not a mockup. Not a slide deck. A functioning prototype. With modern AI tools like Claude, this person knows they could build it. AI has collapsed the gap between "understands what needs to be built" and "can actually build it."
But there's nowhere to do it.
The "right" way would require a few things from corporate IT:
A cloud instance. Request a provisioned machine, get it set up, install your IDE, configure git, connect to the corporate network. Timeline? Weeks to months. Security review. Architecture review. Budget approval. Manager sign-off. More security review.
AI tool access approval. Data classification reviews. Vendor security assessments. Legal sign-off on terms of service. Policy committees that meet monthly. Even in organizations that want to adopt AI, the approval pipeline can take quarters.
So what actually happens? The consultant doesn't wait. They run the AI assistant on their work laptop. Their actual machine, with their email, their credentials, their SSH keys, their browser sessions, everything.
And that's where the blast radius gets enormous. This isn't a disposable cloud instance provisioned for one task. It's the machine they use for everything. Every file, every environment variable, every stored credential, every network connection is within reach. The AI might be well-behaved. It probably is. But the attack surface is the consultant's entire digital life.
"But containers!" Sure. Docker, Kubernetes, sandboxed environments: these solve isolation elegantly. Put the AI in a container, limit what it can access, constrain the blast radius. Great in theory.
Except containers also need corporate IT. Someone has to provision the infrastructure, configure network policies, approve base images. You're back in the approval queue, maybe shorter, but still waiting.
Meanwhile, the client is waiting.
What if this consultant looked at the problem differently?
Their browser was already a VM. Right there: sandboxed, secure, isolated, running on their corporate laptop without anyone's permission because it's a browser. Same execution environment, storage, and network stack. Same security model that lets anyone safely visit untrusted websites.
What if they built a browser-based development environment (file system, code editor, terminal emulation, git integration) and loaded an AI assistant into it? They could build the client demo. They could write code and commit to Azure DevOps repositories. All from a browser tab. No cloud instance request. No IT ticket.
The security insight: it would actually be safer than running the AI on their laptop directly. The AI would operate inside the browser sandbox. It couldn't access the local filesystem, read environment variables, or exfiltrate credentials. It could only do what the browser environment explicitly allowed. The sandbox isn't a limitation to work around. It's a feature to leverage.
Every enterprise has people waiting for IT to provision AI development environments. Every IT department has valid concerns about giving AI agents access to real machines. The browser resolves both: instant availability, structural security, zero provisioning.
Nobody needs permission to open a browser tab. And that browser tab might be all they need.
Where This Is Heading
The browser-as-VM isn't the final destination. It's a waypoint, likely a critical one, on a longer journey.
Elon Musk has offered one vision of where AI ends up: photons in, photons out. The AI perceives reality and acts on it directly, no intermediary languages. In February 2026, he suggested an intermediate step: "Code itself will go away in favor of just making the binary directly."
That's one opinion about how the timeline might unfold. But the underlying observation is worth considering.
Right now, AI writes code for humans. Python, JavaScript, TypeScript: languages designed for human programmers to read, debug, and maintain. The AI translates its understanding into our abstraction layer, then we run that code on machines.
AI doesn't actually need human-readable code. Variable names, comments, design patterns: those are concessions to human cognition. As AI gets more capable, it may increasingly skip the human-readable step.
If that happens, WebAssembly is an interesting candidate for the output format. It's a universal bytecode that runs in every major browser on every operating system. It's not the only possibility, but it's the most widely deployed option today.
Human developers won't disappear in this transition. But their role may shift from writing code to defining intent, reviewing behavior, and verifying outcomes. The browser is ready now to support such a scenario.
The Distribution Revolution
Software distribution has been a solved problem masquerading as an unsolved one. We've built elaborate systems (app stores, package managers, container registries, enterprise deployment pipelines) to solve what the browser solved twenty years ago: open a URL, the software runs.
The browser-as-VM model makes this explicit:
- Instantly available: share a link, the application is running
- Always up to date: no version management, no update prompts
- Cross-platform by default: same code on Windows, Mac, Linux, iOS, Android, ChromeOS
- Zero-trust by design: the sandbox means you don't need to trust the application with your entire system
For enterprises: no more desktop application deployments, compatibility matrices, or "works on my machine." The browser is the machine, and every employee already has one.
For developers: write once, deploy to a URL, reach every device on earth. Wasm for compute, HTML/CSS for UI, HTTP for distribution. All open standards. No vendor lock-in.
What Needs to Happen
The technology mostly exists today. But let's be honest about the gaps:
1. Browser security needs continued hardening. I've argued the browser sandbox is strong, and it is, but it's not bulletproof. In 2024, Google's threat intelligence team tracked 11 browser zero-days actively exploited in the wild. Chrome alone patched 10 zero-days that year. In 2025, Chrome had 8 more, including sandbox escapes linked to commercial spyware. Major browsers each patch hundreds of CVEs per year. The sandbox is battle-tested, but the battle is ongoing. For the browser-as-VM thesis to hold, browser vendors need to keep treating sandbox integrity as an existential priority.
2. The proxy layer needs standardization. Every project builds its own localhost bridge. An open standard for browser-to-local-system communication would accelerate adoption.
3. Developers need to stop thinking in "native vs. web." A Wasm application in a browser tab with WebGPU access is not meaningfully different from a native application — except that it's more secure, more portable, and more distributable.
4. Wasm and the browser platform need to keep maturing. Wasm 3.0 became a W3C standard in September 2025, and garbage collection now ships in all major browsers. But there are still real limitations around threading, memory management, and Web API access that need work. The trajectory is promising, the foundation is solid, but it's not finished.
The Bigger Picture
AI assistants are getting more capable. The browser has quietly become a real computing platform. Those two things are on a collision course, and the result might look something like what I've described here.
I could be wrong about parts of this. Maybe Wasm isn't the bytecode that matters. Maybe the proxy layer evolves into something I haven't imagined. Maybe browser vendors make decisions that slow this down. But the basic insight feels durable: there's a universal, sandboxed, capable VM already installed on every device on earth, and we're not using it for what it could be.
In 2017, a team at Google published "Attention Is All You Need." The transformer architecture was the right idea at the right moment. It unlocked GPT, Claude, Gemini, everything that followed. But attention wasn't the end of the story. Today's best models have moved well beyond it: mixture of experts, RLHF, chain-of-thought, retrieval augmentation, state space models. Attention was a waypoint. A critical one, but not the destination.
The browser-as-VM might be the same kind of thing. The right insight for right now. And like attention, probably not the final answer.
But right now? The browser is all you need.