This is a submission for the Google I/O Writing Challenge
Every Google I/O follows the exact same script. The keynote runs for two hours, your timeline catches fire over whichever shiny model just dropped, and three days later you're back to your day job pretending it didn't all happen.
This year, the headline grabbers were Gemini 3.5 Flash (which is wild—it's faster than 3.1 Pro and roughly 4x faster than other frontier models), Antigravity 2.0 dropping as a real desktop app, and Project Aura.
I watched all of it. And then I rewound the dev keynote to a slide that got maybe 90 seconds of airtime: WebMCP is now in an origin trial in Chrome 149.
I spent the last two weekends actually building with it. Here is what stuck.
What WebMCP actually is
If you missed it, Anthropic shipped the Model Context Protocol (MCP) in late 2024 to give AI agents a standard way to call tools. WebMCP basically takes that exact concept and jams it natively into the browser.
If you run a website, you can now register a list of tools—JavaScript functions with JSON schemas, or literally just annotated HTML forms. An in-browser agent (like Gemini in Chrome) can call those tools directly.
No DOM scraping. No brittle screenshot loops. No "click at pixel 482, 916 and hope the button hasn't moved." If you've ever watched an agent try to book a flight by looking at screenshots, you know exactly why this matters.
In practice, an agent using the new API bypasses the UI layer entirely and invokes the tool directly.
The declarative API is stupidly simple
The easiest way to use this doesn't even require JavaScript. You take an HTML form you already have, and you just slap some attributes on it.
<form action="/todos" method="post"
      tool-name="add-todo"
      tool-description="Add a new item to the user's todo list">
  <input type="text" name="description" required
         tool-param-description="The text of the todo item">
  <button type="submit">Add Todo</button>
</form>
That's it. Chrome parses the tool-* attributes, builds a schema in the background, and Gemini can now call add-todo({ description: "buy milk" }). The browser literally submits the form on the agent's behalf and hands the response back.
What threw me at first was that it really does derive the tool from the exact same form your users see. You aren't maintaining a separate API surface just for bots. The progressive-enhancement-bro in me did a little happy dance. If a visitor doesn't have Chrome 149? It's just a form.
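To make the "builds a schema in the background" step more concrete, here is a rough sketch of how a form's fields could map onto a JSON schema. This is not Chrome's actual algorithm. `schemaFromFields` and the field-descriptor shape are hypothetical, made up purely for illustration, and the type mapping is an assumption.

```javascript
// Hypothetical sketch: derive a JSON-schema-like object from form field descriptors.
// This is NOT Chrome's real implementation, just one plausible mapping.
function schemaFromFields(fields) {
  const properties = {};
  const required = [];
  for (const field of fields) {
    // Assumption: number inputs map to "number", checkboxes to "boolean",
    // and everything else to "string".
    const type =
      field.type === "number" ? "number" :
      field.type === "checkbox" ? "boolean" : "string";
    properties[field.name] = { type };
    if (field.description) {
      properties[field.name].description = field.description;
    }
    if (field.required) {
      required.push(field.name);
    }
  }
  return { type: "object", properties, required };
}

// The add-todo form from earlier, described as plain data.
const addTodoSchema = schemaFromFields([
  {
    name: "description",
    type: "text",
    required: true,
    description: "The text of the todo item",
  },
]);

console.log(JSON.stringify(addTodoSchema, null, 2));
```

The point of the sketch is that everything the agent needs (parameter names, types, which fields are required, and a human-written description) is already sitting in the markup.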
The imperative API (for everything else)
For anything stateful—live search, React/Vue SPAs, multi-step flows—you reach for navigator.modelContext:
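// Feature-detect first: older browsers don't expose navigator.modelContext.
if ("modelContext" in navigator) {
  navigator.modelContext.registerTool({
    name: "addTodo",
    description: "Add a new item to the todo list",
    // JSON Schema describing the arguments the agent must supply.
    inputSchema: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
    // Runs in the page when the agent calls the tool.
    execute: ({ text }) => {
      addItemToList(text); // your app's existing function for adding a todo
      return { content: [{ type: "text", text: `Added todo: ${text}` }] };
    },
  });
}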
(Feature-detect, always. Don't blow up your app for users on older browsers.)
The bit I underestimated here is context switching. Let's say a user logs in and a bunch of authenticated actions suddenly unlock. You don't have to manually unregister and re-register tools. You just call navigator.modelContext.provideContext() and it swaps out the entire toolset in one go. It's basically React state for AI tools.
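A minimal sketch of that login swap, assuming `provideContext()` accepts an object with a `tools` array (the origin-trial API may not match this exactly). `toolsFor`, `syncTools`, and the `searchTodos` and `deleteTodo` tools are hypothetical names for illustration. The model context is passed in as an argument so the function is easy to stub outside the browser.

```javascript
// Tools available to every visitor (hypothetical example tool).
const publicTools = [
  {
    name: "searchTodos",
    description: "Search the todo list by keyword",
    inputSchema: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
    execute: ({ query }) => ({
      content: [{ type: "text", text: `Searched for: ${query}` }],
    }),
  },
];

// Tools that should only exist once the user is signed in (hypothetical).
const authedTools = [
  {
    name: "deleteTodo",
    description: "Delete a todo item by its id",
    inputSchema: {
      type: "object",
      properties: { id: { type: "string" } },
      required: ["id"],
    },
    execute: ({ id }) => ({
      content: [{ type: "text", text: `Deleted todo: ${id}` }],
    }),
  },
];

// Build the full toolset for the current auth state.
function toolsFor(isLoggedIn) {
  return isLoggedIn ? [...publicTools, ...authedTools] : publicTools;
}

// Replace the whole registered toolset in one call.
// Assumption: provideContext takes { tools: [...] }.
// Returns false when the API is unavailable, so callers can ignore it safely.
function syncTools(modelContext, isLoggedIn) {
  if (!modelContext || typeof modelContext.provideContext !== "function") {
    return false;
  }
  modelContext.provideContext({ tools: toolsFor(isLoggedIn) });
  return true;
}

// In a page, right after login or logout:
//   syncTools(navigator.modelContext, userIsLoggedIn);
```

Deriving the toolset from a single piece of state means there is one place to look when an agent can see a tool it shouldn't.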
A few things that surprised me
After poking at this for a few days, three things stood out:
The agent doesn't fight the form.
If you leave tool-auto-submit turned off, Gemini just fills out the fields and waits for the human to actually click "Submit." That tiny default sets a really humane tone for the whole API. The user stays in the loop for destructive actions, and I didn't have to wire that up myself.
It composes perfectly with server-side MCP.
WebMCP tools live right alongside regular local MCP servers. Your in-browser add-todo form can sit next to a local Postgres MCP, and the model seamlessly decides which one to invoke. I fully expected friction here, but there isn't any.
Schemas keep the agent honest.
I gave one tool a deliberately lazy description ("does the thing") and Gemini flat-out refused to call it with any confidence. Good descriptions are now a core part of your UX. The same care you put into aria-labels now needs to go into your tool descriptions too.
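One way to hold yourself to that is a tiny lint pass over your tool definitions before you register them. The sketch below is a hypothetical helper: `lintTool` and its 20-character threshold are invented for illustration and are not part of WebMCP. It only flags very short descriptions and parameters with no description at all.

```javascript
// Hypothetical helper: flag tool definitions an agent is likely to struggle with.
// The default 20-character threshold is an arbitrary assumption, not a WebMCP rule.
function lintTool(tool, minLength = 20) {
  const problems = [];
  const description = (tool.description || "").trim();
  if (description.length < minLength) {
    problems.push(`tool "${tool.name}": description is too short`);
  }
  const properties = (tool.inputSchema && tool.inputSchema.properties) || {};
  for (const [param, schema] of Object.entries(properties)) {
    if (!schema.description) {
      problems.push(`tool "${tool.name}": parameter "${param}" has no description`);
    }
  }
  return problems;
}

// A deliberately lazy tool definition.
const lazyTool = {
  name: "doThing",
  description: "does the thing",
  inputSchema: { type: "object", properties: { x: { type: "string" } } },
};

// A tool with a real description and a documented parameter.
const carefulTool = {
  name: "addTodo",
  description: "Add a new item to the user's todo list",
  inputSchema: {
    type: "object",
    properties: {
      text: { type: "string", description: "The text of the todo item" },
    },
    required: ["text"],
  },
};

console.log(lintTool(lazyTool));
console.log(lintTool(carefulTool));
```

A length check is a crude proxy for quality, but it catches the "does the thing" class of description before an agent ever sees it.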
So... should you actually ship this?
Look, origin trials always feel like theoretical science fiction right up until the moment they become table stakes. Web Push, Service Workers, View Transitions—they all had this exact "yeah I'll get to it eventually" energy six months before everyone had to implement them.
If your product has any kind of "do a thing" surface—add to cart, schedule a meeting, file a ticket—I'd start sketching out a tool list now.
The agent-readable web is going to be an SEO-shaped event. The first wave of sites with sane, well-described tools is going to look magical to users running browser agents. If your competitor's checkout takes one WebMCP call and yours takes "scroll until you find the button," it's pretty obvious whose conversion flow the agent is going to prioritize.
It's not all green fields yet. The spec discussions on GitHub are still aggressively chewing on the hard questions: How do you authenticate a tool call from an agent? How do you rate-limit a bot that just discovered your search endpoint?
But this is the part of an I/O announcement I love the most. The hype is small, the surface area is real, and you can actually build with it today.
Grab an origin trial token, install Chrome 149, and give yourself an afternoon to tag one form. It's worth it.
Anyway, that was my I/O. Now back to my day job.
