WebMCP in Chrome 149: web pages get a tool API for AI agents

#ai #agents #webdev #explained

On May 18, 2026, Chrome shipped a developer trial for WebMCP. A day later, at Google I/O, the team announced the public origin trial in Chrome 149. WebMCP is a proposed web standard that lets a page declare what an AI agent can do on it, instead of leaving the agent to figure that out by reading the DOM. If you build for agents, or you build websites agents might one day visit, this is the most consequential standards news of the year so far.

TL;DR

WebMCP adds a navigator.modelContext.registerTool API to the browser. A page can register named, typed functions an agent can call directly.
Two APIs: an imperative JavaScript one (call registerTool with a JSON Schema and an execute callback) and a declarative one (annotate HTML forms).
The Chrome 149 origin trial is open. Only Gemini in Chrome consumes the tools today. The proposal is developed in the W3C Web Machine Learning group on GitHub.

Background — why agents struggle on the open web

Every "browser agent" you have seen demonstrated in the last two years works the same way underneath. The agent opens a real browser. It reads the page through the accessibility tree, or a screenshot, or a serialized DOM. It plans a sequence of actions. It then simulates a human: move the cursor here, click that button, type into that input, wait for layout, read the new page, repeat.

This is called actuation, and it has three problems.

It is slow. Every step is a round trip through layout and paint.

It is brittle. A class name change, a re-ordered modal, an A/B-tested CTA — any of these can break the script the agent built five seconds ago.

It is expensive in tokens. The agent has to read and reason about the page on every step, because the page does not advertise what it can do.

The workaround for years has been "give the agent an API." But most websites do not have a public API. When they do, the API is usually a different surface from the UI — different fields, different validation, different auth.

What WebMCP actually is

WebMCP collapses that gap. The page exposes tools through a browser API. The agent calls them.

Here is the smallest possible example, lifted from the Chrome docs. The page is a to-do app. It registers one tool.

navigator.modelContext.registerTool({
  name: "addTodo",
  description: "Add a new item to the todo list",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  execute: ({ text }) => {
    // The page's own todo-add logic, whatever that is.
    appState.todos.push({ text, done: false });
    return `Added todo: ${text}`;
  },
});

That is the core of the imperative API. A tool is a name, a description, a JSON Schema for inputs, and a function the page runs. The agent sees a typed call. The page runs its own code. The UI updates the way it already updates. There is no DOM scraping in the middle of any of this.
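Because WebMCP is only in an origin trial, most browsers will not have navigator.modelContext at all, so a page probably wants to feature-detect before registering anything. Here is a minimal sketch of that. The registerTodoTool wrapper and its nav and appState parameters are names made up for this example, not part of WebMCP; only the registerTool call shape comes from the example above.

```javascript
// Sketch: register the addTodo tool only where WebMCP is available.
// `registerTodoTool`, `nav`, and `appState` are illustrative names, not WebMCP APIs.
function registerTodoTool(nav, appState) {
  // Feature-detect: browsers without WebMCP have no navigator.modelContext.
  const ctx = nav && nav.modelContext;
  if (!ctx || typeof ctx.registerTool !== "function") {
    return false; // Nothing registered; the page keeps working as a normal UI.
  }

  ctx.registerTool({
    name: "addTodo",
    description: "Add a new item to the todo list",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
    execute: ({ text }) => {
      appState.todos.push({ text, done: false });
      return `Added todo: ${text}`;
    },
  });
  return true;
}

// In a page you would call: registerTodoTool(navigator, appState);
```

Passing the navigator object in as a parameter is a design choice for this sketch: it lets you exercise the registration logic with a fake object in a unit test, outside a WebMCP-enabled browser.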

The declarative form

The same idea, but for a site that does not want to write JavaScript: put annotations on an HTML form, and Chrome wires up a tool from it. For an application form, for example, the agent sees a submit_application tool that takes the form's fields as inputs. This is the WebMCP variant aimed at progressive enhancement of existing sites.
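One way to picture what "wires up a tool" means is that each form field becomes a typed input. The sketch below builds a JSON Schema from a list of field descriptors. It is not Chrome's algorithm: schemaFromFields, the field names, and the type mapping are all assumptions invented for illustration, and the real annotations and mapping are whatever the proposal defines.

```javascript
// Hypothetical helper: build a tool-style input schema from plain field descriptors.
// This only illustrates the form-to-tool idea; it is NOT how Chrome derives tools.
function schemaFromFields(fields) {
  // Assumed mapping from HTML input types to JSON Schema types.
  const jsonTypes = new Map([
    ["number", "number"],
    ["checkbox", "boolean"],
  ]);
  const properties = {};
  const required = [];
  for (const field of fields) {
    // Any input type not listed above is treated as a string.
    properties[field.name] = { type: jsonTypes.get(field.type) ?? "string" };
    if (field.required) required.push(field.name);
  }
  return { type: "object", properties, required };
}

// Made-up fields for an application form.
const applicationSchema = schemaFromFields([
  { name: "fullName", type: "text", required: true },
  { name: "email", type: "email", required: true },
  { name: "yearsExperience", type: "number", required: false },
]);
```

With those made-up fields, the resulting schema has three properties and lists fullName and email as required, which is the shape an agent would see as the inputs of a submit_application tool.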

Discovery, schemas, state

The proposal covers three pieces beyond the call itself. Discovery is how an agent knows a page has tools; currently it has to visit, then introspect navigator.modelContext. JSON Schemas on the inputs and outputs reduce the agent's room to hallucinate field names. State is a shared context the page can update so the agent knows what is on screen right now. Together, these turn the page from "an opaque rectangle the agent has to read" into "a typed surface the agent can call into."
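To see why a schema narrows the room for hallucinated field names, here is a deliberately tiny input check a page could run at the top of an execute callback. It is a sketch under assumptions: checkInput is a made-up helper, it only understands flat objects with string, number, and boolean fields, and it rejects unknown fields, which is stricter than JSON Schema's default. Whether the browser validates inputs before calling execute is not something this post covers, and a real site would more likely use a full JSON Schema validator.

```javascript
// Minimal sketch: checks only `required` and primitive `type` on a flat object schema.
// Handles "string", "number", and "boolean"; anything else needs a real validator.
function checkInput(schema, input) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const properties = schema.properties || {};
  const errors = [];

  for (const key of schema.required || []) {
    if (!has(input, key)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(input)) {
    if (!has(properties, key)) {
      // Stricter than JSON Schema's default, which allows extra properties.
      errors.push(`unknown field: ${key}`);
    } else if (typeof value !== properties[key].type) {
      errors.push(`field ${key} should be ${properties[key].type}`);
    }
  }
  return errors;
}

const todoSchema = {
  type: "object",
  properties: { text: { type: "string" } },
  required: ["text"],
};

console.log(checkInput(todoSchema, { text: "Buy milk" }));
// []
console.log(checkInput(todoSchema, { title: "Buy milk" }));
// [ 'missing required field: text', 'unknown field: title' ]
```

The second call is the interesting one: an agent that guessed title instead of text gets a precise error it can correct from, rather than a silent no-op.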

What this changes for builders

If you ship a SaaS app, WebMCP gives you a way to make your product useful to an agent without standing up a public REST API. The same JavaScript that runs your features can be exposed as tools. Sensitive actions can pop a confirmation dialog before they fire.
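The confirmation step can be as simple as a wrapper around the tool's execute callback. This is a sketch, not a WebMCP feature: withConfirmation and askUser are names invented here, and the proposal may well offer its own mechanism for user confirmation that this post does not describe.

```javascript
// Sketch: wrap a tool's execute callback so it only runs after the user agrees.
// `withConfirmation` and `askUser` are illustrative names, not WebMCP APIs.
function withConfirmation(message, execute, askUser) {
  return (input) => {
    // askUser returns true if the user approved the action.
    if (!askUser(message)) {
      return "Cancelled by user";
    }
    return execute(input);
  };
}

// In a page, askUser could be (msg) => window.confirm(msg), which blocks until
// the user answers and returns a boolean.
```

You would pass the wrapped function as the execute property when registering a sensitive tool, so the agent still sees an ordinary tool and gets a plain "Cancelled by user" result when the user declines.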

If you build agents, the math changes in the other direction. You can stop paying tokens to make the agent read the page. On a WebMCP-aware site, the agent reads the tool list once and then calls functions. The behavior is more reliable because each step has a fixed contract instead of a free-form click.

Both sides have to wait for the same thing: distribution. Today only Gemini in Chrome consumes WebMCP tools. The hope is that Safari and Firefox engage; the proposal lives in the W3C Web Machine Learning group on GitHub.

Caveats and open questions

Discovery is unsolved. An agent only learns a site has tools once it visits. There is no equivalent of robots.txt for tools yet.
It needs a visible browser context. Tools are registered by JavaScript running in a page the user has open, so agents that operate without a visible browser session cannot use this. That is a deliberate design choice that keeps the user in the loop, with the page UI visible, but it limits the agent shapes you can build.
It is not the only attempt. JSON-LD Action schema, Adept's "ACT-1" tools, and LangChain's per-site adapters all tried to give pages a tool surface. WebMCP's case for being different is standardization, not novelty.
Permissions and abuse. Tools run with the page's own privileges, gated by a tools Permissions Policy. The thread to watch is what happens when a malicious page registers delete_my_account on a third-party iframe — the cross-origin defaults handle the simplest version, but the long tail is still being argued.

Where to go next

The WebMCP get-started page has the full API. The imperative API reference has more tool examples, including how to unregister with AbortSignal. The proposal itself lives on GitHub. To try it locally, enable chrome://flags/#enable-webmcp-testing in Chrome 149 and install the Model Context Tool Inspector extension. To ship it on a real site, you need an origin trial token.