Razvan for Playful Programming
WebMCP: A new contract between AI agents and websites

You've probably already heard of MCP from Anthropic, the protocol bridging the gap between AI and our data. Now Google is taking that idea one step further, bringing it directly into the browser. Meet WebMCP.

If you've watched an AI agent try to interact with a website, you've probably seen it scrape the DOM, download the full HTML page, take screenshots, and only then figure out what it can actually do. That takes time, and it's a waste of precious tokens.

WebMCP proposes a way for developers to define a public interface for their website, one that AI agents can discover and use to interact with it. This is both quicker and cheaper.

I built a small shopping cart demo to explore the proposed APIs. You can find it here.

What is WebMCP, exactly?

The core idea is that a website can publish a structured contract (a set of named tools with explicit schemas) that any AI agent can discover and use reliably. Instead of the agent reverse-engineering your UI, your site simply says: "Here are the things you can do, and here's exactly how to do them."

The spec defines three pillars:

  • Discovery: a standard way for agents to query which tools a page exposes
  • Schemas: explicit input/output definitions
  • State: a shared understanding of what's currently available on the page

Think of it as an MCP server, but baked into the browser and tied directly to the page's live DOM.

⚠️ Availability

WebMCP is an experimental proposed standard behind a Chrome flag. If you want to play with it, you need:

  • Chrome: Version 146.0.7672.0 or higher
  • Flags: The "WebMCP for testing" flag must be enabled (chrome://flags/#enable-webmcp-testing)

ℹ️ To inspect registered functions, execute them manually or test them with an agent, install the Model Context Tool Inspector Chrome extension.
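Because the API currently only exists behind a flag, it's worth feature-detecting before registering anything. Here's a minimal sketch — the helper name registerWebMCPTools is mine, not part of the proposal:

```javascript
// Hypothetical helper: only register tools when the experimental
// WebMCP API is actually present, so the page degrades gracefully.
function registerWebMCPTools(tools) {
  const ctx = typeof navigator !== "undefined" && navigator.modelContext;
  if (!ctx || typeof ctx.registerTool !== "function") {
    return false; // WebMCP unavailable: agents fall back to the DOM
  }
  for (const tool of tools) {
    ctx.registerTool(tool);
  }
  return true;
}
```

In a browser without the flag (or any non-browser environment), the helper simply returns false and your page behaves exactly as before.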

Two APIs, two philosophies

WebMCP offers two complementary ways to expose tools.

The Imperative API

The Imperative API is a JavaScript-first approach. You register tools programmatically via window.navigator.modelContext, giving each one a name, a description, a JSON Schema for its inputs, and an execute function that runs when an agent calls it.

window.navigator.modelContext.registerTool({
  name: "addToCart",
  description: "Add a product to the shopping cart",
  inputSchema: {
    type: "object",
    properties: {
      product_id: { type: "string" },
      quantity:   { type: "number" }
    },
    required: ["product_id"]
  },
  execute: ({ product_id, quantity }) => {
    // your logic here
    return { content: [{ type: "text", text: `Added ${quantity}x ${product_id}` }] };
  }
});

registerTool registers a single tool without touching the others. When you need to register multiple tools at once, provideContext is the right call:

window.navigator.modelContext.provideContext({
  tools: [
    {
      name: "getCart",
      // ...
    },
    // ...
  ],
});

‼️ Watch out: provideContext replaces ALL registered tools — including declarative ones.

This is a subtle but important gotcha. When you call provideContext, it doesn't just add tools, it wipes out the entire existing set and replaces it. This includes any tools the browser has already registered from your declarative HTML forms.

There's already an open discussion about this function, and developers don't seem very happy with the API as it stands.
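Until that discussion settles, one way to live with the replace-everything semantics is to keep a single source of truth on your side and always pass the complete set. A sketch — the registry helpers below are my own naming, not part of the API, and note that this still wipes declaratively registered tools unless you mirror them in the registry too:

```javascript
// Keep one authoritative map of tools; every change re-sends the
// complete list, since provideContext replaces ALL registered tools.
const toolRegistry = new Map();

function syncTools() {
  // Guard for environments without the experimental API.
  if (typeof navigator !== "undefined" && navigator.modelContext) {
    navigator.modelContext.provideContext({
      tools: [...toolRegistry.values()],
    });
  }
}

function upsertTool(tool) {
  toolRegistry.set(tool.name, tool);
  syncTools();
}

function removeTool(name) {
  toolRegistry.delete(name);
  syncTools();
}
```

With this pattern, adding or removing a tool always goes through the registry, so no call to provideContext can accidentally drop tools you still need.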

If you need to remove a specific tool at any point, you can use unregisterTool:

window.navigator.modelContext.unregisterTool("addToCart");

The Declarative API

The Declarative API takes a different approach: rather than writing JavaScript, you annotate your existing HTML forms with a small set of custom attributes and let the browser do the heavy lifting.

<form
  toolname="applyCoupon"
  tooldescription="Apply a discount coupon code to the cart"
  id="coupon-form"
>
  <label for="coupon_code">Coupon Code</label>
  <input
    type="text"
    id="coupon_code"
    name="coupon_code"
    placeholder="e.g. SAVE10"
    toolparamtitle="Coupon Code"
    toolparamdescription="A valid discount coupon code"
  />
  <button type="submit">Apply Coupon</button>
</form>

The browser reads these annotations at parse time and internally constructs the same JSON Schema that the Imperative API produces manually. When an agent invokes the tool, the browser focuses the form, populates the fields and, if toolautosubmit is set, submits it automatically without requiring user interaction.

The attributes available are:

  • toolname: the registered name of the tool that the AI agent can call
  • tooldescription: human readable description of what the tool does
  • toolautosubmit: lets the AI agent submit the form automatically
  • toolparamtitle: maps to the JSON Schema property key. If omitted, the browser defaults to the input element's name
  • toolparamdescription: maps to the property description within the JSON Schema. If omitted, the browser uses the text content of the associated <label> element, or the aria-description if no label exists

Knowing when an agent is in charge

SubmitEvent gains a new agentInvoked boolean attribute. This lets you tell who triggered the action, the agent or the user, and adapt your app's behaviour accordingly.

Additionally, SubmitEvent now includes a respondWith(Promise<any>) method which lets you pass a promise to the browser that resolves with the form's result data. That value is then serialized and returned to the model as the tool's output. You must first call preventDefault() to stop the browser's standard form submission.

form.addEventListener("submit", (e) => {
  e.preventDefault(); // required before respondWith
  const result = doTheWork();
  if (e.agentInvoked) {
    // Hand the result back to the model as the tool's output
    e.respondWith(Promise.resolve(result));
  }
});

Visual feedback

There are also CSS pseudo-classes for visual feedback when an agent is interacting with a form: :tool-form-active is applied to the form itself, and :tool-submit-active to its submit button. These are useful for showing the user that an agent is in control. Pair them with the toolactivated and toolcancel window events to hook into the full tool lifecycle:

window.addEventListener('toolactivated', ({ toolName }) => {
  console.log(`${toolName} was activated by an agent`);
});

window.addEventListener('toolcancel', ({ toolName }) => {
  console.log(`${toolName} was cancelled`);
});

ℹ️ Practical note: you must serve over HTTP

Opening an HTML file directly via file:// will silently fail — window.navigator.modelContext simply won't initialize. This is a standard browser security boundary, not a WebMCP bug. Any local server works: npx http-server, python3 -m http.server 3000, or npx vite.

Conclusion

AI agents are increasingly the ones interacting with the software we build, but our websites were never designed for them. Web scraping and visual UI parsing are fragile, slow, and expensive in tokens. WebMCP could solve this by letting us define a clear, explicit contract between our website and any agent that wants to interact with it, without rewriting everything from scratch. The Declarative API in particular is compelling precisely because it progressively enhances forms that already exist.

It's still an experimental proposed standard, but the core idea is solid. I hope it sees the light soon.
