A few years ago, we started adapting our websites for mobile devices. Then we adapted them for accessibility. And now we may be about to adapt them...
As dystopian as the example of the LLM-driven CEO sounds, I can see this being quite useful for LLM-based assistive technologies that use these hints to help users interact with content.
You know, I hadn't even looked at it from that angle 😄
That's actually a really interesting point. webMCP could potentially do more than just help agents navigate websites. Combined with LLMs, it might also make the web even more accessible for people.
Now that's a future I can get behind: technology helping humans, not just automation for the sake of automation 😊
I'm more curious whether the benefits outweigh the technical maintenance cost (or the other way around). AI bots scrape faster, but what value will this add? I can only think of one: better searchability. Google is gearing toward AI-first responses when a user searches the web, so this would make more sense in that context.
The more "AI-friendly" a site is, the more likely AI would suggest the site to the user.
That's a great question, and honestly, I can imagine quite a few use cases already 😄
In fact, after reading some of the comments here, I can think of even more than when I started writing the article.
For example, imagine websites exposing webMCP actions that make navigation easier for people with disabilities, because an AI assistant can directly invoke meaningful actions instead of trying to figure out the UI.
Or large, complicated forms. Imagine asking your AI assistant to fill out an invoice or tax form "roughly like the previous one" and having it handle most of the tedious work for you.
Or bulk operations. Maybe you need to send 100 invoices, and the website doesn't expose a public API. Today you'd probably end up scripting the UI anyway. With something like webMCP, that process could become much more reliable.
So I see potential benefits not just for AI bots, but for ordinary users as well.
That said, we'll have to wait and see whether the benefits outweigh the maintenance cost and whether webMCP gets adopted at all 🙂
Wow you made quite a lot of practical points. These definitely are better use-cases for webMCP. Thanks for the additional insights.
Thank you! Next time I'll make sure to include practical use cases in the article right from the start 🙂
Your articles are short and to the point which I like even without the practical use cases. Thank you for sharing your knowledge @sylwia-lask 💙
Thank you so much! 💙
I'm really glad you enjoyed it. And of course, feedback is always welcome 😄
the accessibility analogy nails why this gets adopted — progressive enhancement beats the big rewrite every time.
but the CEO sim accidentally shows the catch: webMCP fixes perception (no more scraping and guessing which button does what), not action. fire_employee as a clean tool just means the agent misfires with perfect confidence instead of fumbling the UI first. responsive design never had a button that could fire your whole team — so the real question isn't "can the agent find it," it's which tools need a confirmation gate before an agent can pull them.
rewriteInRust() as a first-class tool is too real though 😄
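A confirmation gate like the one described above could be sketched roughly as follows. Everything here is hypothetical: the `Tool` shape, the `destructive` flag, and `withConfirmation` are illustration-only names, not part of webMCP, and the always-declining callback stands in for a real confirmation dialog.

```typescript
// Hypothetical shapes for illustration; not part of any webMCP API.
type ConfirmFn = (toolName: string, args: unknown) => boolean;

interface Tool<A, R> {
  name: string;
  destructive: boolean; // assumption: the app labels its own risky tools
  run: (args: A) => R;
}

type GatedResult<R> = { ok: true; value: R } | { ok: false; reason: string };

// Wrap a tool so destructive calls only run after an explicit confirmation.
function withConfirmation<A, R>(
  tool: Tool<A, R>,
  askHuman: ConfirmFn
): (args: A) => GatedResult<R> {
  return (args: A) => {
    if (tool.destructive && !askHuman(tool.name, args)) {
      return { ok: false, reason: `"${tool.name}" was not confirmed by a human` };
    }
    return { ok: true, value: tool.run(args) };
  };
}

const fired: string[] = [];
const fireEmployee: Tool<{ employee: string }, string> = {
  name: "fire_employee",
  destructive: true,
  run: ({ employee }) => {
    fired.push(employee);
    return `${employee} was let go`;
  },
};

// A callback that always declines stands in for a real confirmation dialog.
const gatedFire = withConfirmation(fireEmployee, () => false);
const declined = gatedFire({ employee: "Alice" });
```

With the declining callback, `declined.ok` is `false` and nobody is added to `fired`. The agent can still find and call the tool, but the side effect waits for a human.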
Thanks for this comment! That's actually a really good point, and probably something future web engineers will have to think about as well.
Although, knowing how these things usually go... sooner or later someone will definitely give an agent access to fireEmployee(), the agent will lay off the entire company, and we'll all spend the next week blaming AI instead of the person who exposed the tool in the first place 😂
The missing layer isn't confirmation gates, it's authorization. Nothing in the MCP spec defines which agent can touch which DOM elements on a given page. We're building highways before traffic lights.
Authorization is definitely important, but I'm not convinced it has to be solved by webMCP itself 🙂
After all, webMCP tools aren't living in some mysterious YAML file just for agents. They're regular JavaScript functions that are part of the application.
So I'd expect authorization to be handled the same way it's handled everywhere else: in the application logic itself.
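As a rough sketch of that idea, here is an agent-facing tool that is only a thin wrapper around an ordinary application function, so it inherits the application's permission check. All names (`Session`, `deleteInvoice`, `makeDeleteInvoiceTool`) and the in-memory data are assumptions made up for this example; this is not the webMCP registration API.

```typescript
// All names here are hypothetical; this is not the webMCP API itself.
type Role = "admin" | "member";

interface Session {
  userId: string;
  role: Role;
}

const invoices: Record<string, { owner: string; total: number }> = {
  "inv-1": { owner: "u1", total: 120 },
};

// Ordinary application logic: the same check a UI button would go through.
function deleteInvoice(session: Session, id: string): string {
  const invoice = invoices[id];
  if (!invoice) {
    throw new Error(`Unknown invoice: ${id}`);
  }
  if (session.role !== "admin" && invoice.owner !== session.userId) {
    throw new Error("Forbidden: not the owner of this invoice");
  }
  delete invoices[id];
  return `Deleted ${id}`;
}

// The agent-facing tool delegates to the same function,
// so it cannot do more than the logged-in user could do by hand.
function makeDeleteInvoiceTool(session: Session) {
  return {
    name: "delete_invoice",
    description: "Delete one invoice owned by the current user",
    execute: (args: { id: string }) => deleteInvoice(session, args.id),
  };
}

// An agent acting for user "u2" tries to delete an invoice owned by "u1".
const agentTool = makeDeleteInvoiceTool({ userId: "u2", role: "member" });
let outcome: string;
try {
  outcome = agentTool.execute({ id: "inv-1" });
} catch (error) {
  outcome = (error as Error).message;
}
```

Here `outcome` ends up holding the "Forbidden" message and the invoice is untouched, because the tool never bypasses the check in `deleteInvoice`.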
Fair point for single-app scenarios. But when an agent orchestrates across multiple webMCP-enabled apps, each handling auth independently, nobody has the full picture of what the agent is allowed to do end-to-end. App-level auth solves the vertical. The horizontal is still open.
That's a fair point! I think this is where my mental model of webMCP might differ a bit.
I don't really see it as the foundation of some agent-first future. To me, it feels more like a compatibility layer that helps existing websites expose their capabilities in a more structured way.
The cross-application authorization problem you're describing is very real, but it also feels much bigger than webMCP itself.
That said, I'm definitely not a Google developer advocate and I have no particular interest in defending the technology 😅 I'm mostly exploring what's there today and finding the ideas interesting. It'll be fascinating to see how (or if) people solve these larger orchestration problems.
This is fascinating because it shifts the agent problem from “can the AI understand the page?” to “what authority has the page actually exposed to the AI?”
Once a site starts publishing structured actions for agents, those actions become more than convenience hooks — they become a new trust surface. The agent may no longer be guessing at the UI, but it may still be acting through a tool map that can drift from the application’s real ownership, permissions, constraints, or state.
So the next layer, to me, is not just agent access or authorization. It is verification that the exposed action contract still truthfully represents the system underneath it.
I completely get what you mean!
An agent sees deleteUser(), but the JavaScript underneath turns out to be deleteUserAndEverythingAroundItAndDestroyTheWorld() 😂
Jokes aside, I think that's an important problem. I'm just not sure it's specifically a webMCP problem. It feels more like a contracts, documentation, and trust problem.
After all, we can already have a misleading button in a UI or a misleading API endpoint today. The challenge is making sure that the exposed action actually does what it claims to do.
Exactly — I agree. I don’t think webMCP creates that problem from nothing.
The contract/trust problem already exists with buttons, APIs, docs, SDKs, and internal service boundaries. A UI can lie. An endpoint can be poorly named. Docs can drift from implementation.
What feels different with webMCP is that the exposed contract becomes directly machine-actionable.
A misleading button is still mediated by a human interpreting the page. A misleading agent action can become part of an automated decision path. So the same old contract problem gets a new level of consequence because the agent may treat the exposed action map as the truth of the system.
That’s the layer I’m interested in: not just “can the agent call this?” but “is the callable contract still an honest representation of the underlying system state, permissions, and behavior?”
So yes — broader than webMCP. But webMCP makes the trust boundary much more explicit.
I think that's going to be a very important question.
And I'm also curious about what happens when something inevitably goes wrong. Because sooner or later it will.
Take the recent case where an agent deleted an entire production database along with the backups. Whose fault is that? The model's? Or the developers' for exposing such a critical path without sufficient safeguards?
That's where this discussion gets really interesting to me. The technical side is one thing, but figuring out responsibility becomes much harder once agents start acting on these structured actions.
Exactly — and this is very close to the diagnostic surface I’m building Scarab around.
The question is not only “did the model choose badly?” It is also: what did the system expose as an available truth/action surface, and was that surface honest about authority, blast radius, state, reversibility, and constraints?
That is the part I’m trying to formalize with Scarab Diagnostic Suite.
If a repo exposes an agent-callable action, Scarab’s concern would not be “can the agent call this?” alone. It would be whether the exposed action contract still matches the underlying system behavior and whether the repo has evidence that the action is bounded the way it claims to be.
So for the database example, the diagnostic question becomes less “who do we blame after deletion?” and more:
Why did the callable contract allow production data and backups to sit inside the same destructive authority path?
That kind of thing should be visible before an agent ever gets to act on it.
That’s where I think diagnostic tooling becomes necessary. Once websites/apps start publishing action maps for agents, those maps become part of the repo’s trust boundary. They need to be validated against the real system, not just documented.
That's actually a really interesting idea! Sounds a bit like "ESLint for agent actions and MCP tools".
While most people are asking, "How do we give agents more capabilities?", you're asking, "How do we verify those capabilities are safe before an agent ever gets access to them?"
That's a very different perspective, and a very interesting one. I'll definitely be curious to see where Scarab goes. Good luck with it!
Oh Sylwia, that's absolutely fantastic! I love your simulation, it's really promising… and it should inspire some CEOs too — in the end, AI is smarter than quite a few of them!😁
Haha, thanks, Pascal! 😄
As someone who has worked at a few startups, I can confirm that the simulation is perhaps a little closer to reality than I'd like to admit 😂
Folks, here's a really interesting article offering some counterarguments and a different perspective on the ideas discussed in my post, written by @narnaiezzsshaa. Highly recommended:
webMCP Isn't the New Accessibility Layer—It's a New Attack Surface: A governance-grade reframing of a playful demo
One of my favorite things about publishing technical content is seeing people challenge the original idea and expand the discussion. This is definitely one of those cases 🙂
That looks really interesting! I didn’t know something like this already existed. The idea of building websites for both human and AI users side by side is actually a fascinating topic.
One billion comments??? 😮😮
I want to be the first comment on the billionth one! 😅
I know, right? 😄
I honestly had a lot of fun building this. Whether webMCP becomes a real thing or not, it was surprisingly pleasant to work with. And if it helps agents navigate websites without burning through half the world's token supply, that's already a win 😅
As for the billion comments... well, let's just say I may have a slight tendency to talk a lot 😂
Hmm, things are starting to connect in an interesting way:
Now I’m wondering: will we eventually be able to wire these two things together?
Imagine a local browser AI discovering and calling MCP tools directly from websites without any cloud model involved... and finally filling out my tax forms automatically! 😂 That would be huge! 😄
Now you've got me dreaming!
We recently got a centralized invoicing system in Poland, and let's just say that using it can feel like an advanced survival challenge 😅
If Gemini Nano + webMCP could eventually work together and handle things like filling out invoices, tax forms, or navigating government portals locally in the browser, I'd be completely on board with that future.
That's exactly the kind of AI I want: less hype, fewer buzzwords, and fewer forms for humans to suffer through 😂
Interesting experiment! I like the comparison with accessibility: webMCP doesn't replace the existing web, it makes it easier for a different type of user (AI agents) to interact with it. The AI CEO examples were both hilarious and a surprisingly good way to demonstrate the concept.
Thank you for the kind words! 😊
Thank you, Sylwia, for this amazing and practical demo!
WebMCP is a really interesting concept, and I love how you made it tangible and fun with the AI CEO simulator 😄
Your way of breaking down a complex idea with humor and real-world examples is truly great.
Wishing you more success and innovation in your future projects!
Just... maybe don’t let the AI fire anyone before they’ve had their morning coffee ☕😄
Keep up the great work!
🧊🐟🌊
Thank you so much for the kind words! 😄
That's actually one of my main goals when writing: taking complex technical topics and explaining them in a way that feels approachable and fun.
And yes, I'll do my best to stop the AI CEO from laying people off 😂
Thanks for reading and for taking the time to leave such a thoughtful comment!
Perfect timing for the storytelling I'm working on about accessibility challenges!
Haha, perfect timing then! 😄 At this rate, in a year you'll be working on accessibility too, just accessibility for agents instead of humans 😂
In fact, that's the core conclusion I came to! By experimentation.
Here is the poster:
The AI CEO Simulator is a brilliant (and hilarious) way to make this concept tangible.
Haha, thank you! 😄 My philosophy is that if I'm going to build a demo, it might as well be one that makes people laugh. There are already enough boring examples in tech articles 😅
Very interesting. While WebMCP is a standard waiting to happen, there is a middle ground. I built MCP-Lite, which gives you all the bells and whistles without any setup on the developer's side. It also saves you a ton on token usage.
That sounds really interesting! 😄 I'll have to read more about it. The idea of getting some of the benefits without requiring additional setup from developers is definitely appealing 🙂
Do give it a try and let me know what you think. Lite is the open version, with a lot of the WAF bypassing disabled, but it's still very useful. Your tokens go a lot further during development, you can script UI tests, and using the graph DB you can map the site and practically build a 'webMCP' from that. It supports headless and headful modes, so you can watch if you need to. You can also interrupt it, which standard browser tooling doesn't allow. It uses a clean Chrome profile, but not a locked one, so you can enter your credentials yourself if you like, then have it continue on your profile to do the things that are normally blocked off. I mostly built it to make UI testing for the foundry simpler, and I use it to generate site MCPs. Then, with the scripting, I decided to throw it into Doccit (my AI-driven accounting suite) for workflow automation, e.g. tax filing. So it's quite useful, given that it's the only tool that allows your AI to actually use the ENTIRE web, not the little 10% not blocked by Cloudflare or DataDome. For example, I spun up a swarm of Podman pods, using a local Gemma 4 E2B with 4 parallel agents, to search for the cheapest prices across the top 20 sites. Even an E2B model was smart enough to use the AOM to navigate airline sites and Amazon to find the prices (the real value is that you can actually use it to book/buy too).
Do give it a try and let me know your thoughts on it.
the W3C spec authors themselves listed tool discoverability as an unsolved limitation — that agents have no way to know which sites have callable tools without visiting them directly. That felt like an obvious gap to fill.
So I built a registry for it: webmcp-registry.dev. Developers submit their domains and tool contracts, verify ownership via DNS, and their tools become searchable via a public API. Went live this weekend.
Would love to see the AI CEO Simulator listed on it 😊
As I mentioned on LinkedIn, I think it's a great idea 😄 I'll try to take a proper look and get it listed tomorrow or on Monday.
This is a brilliant breakdown. The analogy to accessibility features is spot on. Treating webMCP as a semantic enhancement rather than a replacement means we aren't breaking the web for humans, just opening a cleaner 'API layer' for agents. Love the React demo—'adoptAI()' and 'rewriteInRust()' are peak startup operations! 🚀
Thank you so much! 😄
That's exactly how I see it as well. I don't think the human web is going anywhere anytime soon. People actually like looking at websites, clicking around, and interacting with interfaces 🙂
What if everything becomes an MCP, shopping, forms etc.
Real-time UI generation by the model, sub agents can scan multiple services faster, compare prices, fill the delivery details.
Agents can present multiple items for me to pick from.
This is really what I hope for, the point of everything having a damn app/website that can't be easily navigated is annoying.
"Gemini - play my Spotify playlist"....
Music player UI generated immediately...
Am I dreaming... Lol
Oh yes, I know exactly what you mean! 😄
And honestly, who knows? Maybe we'll get there one day.
For now, though, I see webMCP as solving a slightly different problem. Rather than replacing websites and apps, it helps existing websites become more accessible to agents.
So it's probably not the fully futuristic future you're describing just yet, but it does feel like a small step in that direction 🙂
Interesting! So is it kind of like a way to let AI agent in chrome know what functions there are in the web by writing descriptions and names for tags and let the AI run those functions?
And about .well-known/mcp, what I understand is that this is for AI agents outside of web browsers, to let them know what APIs are there so the agents can use the proper APIs to do their tasks (for example: booking a flight). Am I correct?
Almost! /.well-known/mcp is a standard HTTP endpoint. Agents hit GET /.well-known/mcp to discover available tools, then POST to call them (JSON-RPC 2.0). The flight booking example is spot on.
Two small corrections: it's not limited to agents outside the browser. CLI agents, desktop apps, and browser extensions all use it the same way. And "letting them know what APIs exist" is exactly right, that's the discovery part.
In this project there's a catch though: the endpoint only works during local dev (Vite handles it server-side). On GitHub Pages there's no server, so in production the app falls back to window.postMessage and window globals for browser extension support instead.
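To make the "POST to call them (JSON-RPC 2.0)" part a bit more concrete, here is a minimal sketch of what such a request body might look like. The `tools/call` method name and the `name`/`arguments` params follow the general MCP convention; whether this demo's endpoint accepts exactly this shape is an assumption, and `toolCallRequest` is a hypothetical helper. No network call is made, the sketch only builds the payload.

```typescript
// Minimal JSON-RPC 2.0 request shape.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Hypothetical helper: build the body an agent might POST to call a tool.
function toolCallRequest(
  toolName: string,
  args: Record<string, unknown>
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++, // lets the caller match a response to this request
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Assumption: adopt_ai takes no arguments in this example.
const rpcRequest = toolCallRequest("adopt_ai", {});
const rpcBody = JSON.stringify(rpcRequest);
```

An agent would send `rpcBody` as the POST payload and read the tool result from the JSON-RPC response that carries the same `id`.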
Yes, that's a great way to think about it! If HTML + CSS is the web for humans (browsers render it visually), MCP tools are kind of the web for AI agents: structured descriptions of what actions are available and how to call them, instead of pixels to render.
The difference from tags: the agent doesn't just read the description, it actually executes the function. So adopt_ai isn't a label, it's a callable action that changes real state in the app. The description is just there to help the model decide when and why to call it.
I completely got it! Thank you for the kind description 😊
Pretty much. It's existed for years already as a web standard, called the AOM (accessibility layer), yet for some convoluted reason they didn't use it... the AOM is what blind people use to navigate sites, so whatever they can do, it's clearly listed in the AOM (which is a fraction of the size of the DOM). If you want to give it a try, I built MCP-Lite, which is a browser MCP that uses the AOM instead of DOM scraping, which gives much better accuracy, token efficiency and speed.
This is a genuinely interesting direction and the AI CEO demo is a great way to make the concept concrete.
The part that caught my attention is your point about agents needing structured information to understand what actions are available, instead of endlessly scraping and guessing. That is the exact problem from the other side of the equation too.
I built HTML Deployer, a Chrome extension that deploys AI-generated HTML directly from the ChatGPT or Claude tab without touching a terminal. One thing I keep running into is that the handoff between AI output and a live URL is still completely unstructured. The AI generates something, the user copies it, opens three tabs, gets confused, and gives up.
WebMCP makes websites more readable for agents. But I wonder if the next layer is making the agent output more actionable for the non-technical user sitting in between. The agent produces the HTML. The user still has to figure out what to do with it. That gap is still manual and messy.
If webMCP eventually lets an agent not just generate the page but also trigger the deploy action directly from the browser, that would close the loop entirely. Curious whether that is something you see as a natural extension of this approach.
As far as I understand, instead of the agent parsing the HTML directly and wasting tokens, there is an MCP server that sits in between and optimizes the process.
I have a Selenium MCP server that opens Chrome, clicks, fills in fields, etc., and a debugger MCP server that places breakpoints and analyzes JSON.
I am really interested to see how they can interact with webMCP. Instead of Selenium loading the whole DOM and the AI parsing it to see what's inside, why not ask webMCP to give back more structured and token-efficient results, so that the AI can tell Selenium where to click?
It opens up so many options. :)
Exactly! That's what I find most interesting about it. Instead of scraping the entire page, analyzing the DOM, and essentially guessing what can be done, the agent gets explicit information about the available actions. Fewer tokens wasted, fewer opportunities for confusion, and hopefully fewer "creative" interpretations of the UI 😅
And you're right, it opens up a lot of possibilities. Someone in another comment just mentioned accessibility as well. Combined with LLM-powered assistive tools, webMCP could potentially make websites even easier to use for people, not just for agents.
I'm really curious to see where this goes. From what I've read, Google is aiming for a first stable Chrome release later this year, so we'll probably learn pretty quickly whether developers find it useful in practice.
The LinkedIn CEO scenario, immediately firing all developers and pivoting to AI, is the most accurate simulation of reality I've ever seen, but the actually interesting bit is the progressive enhancement framing. WebMCP as the new accessibility layer, invisible to regular users, meaningful to a different kind of visitor. That makes it feel a lot less like "we're rebuilding the web for robots" and a lot more like... just good engineering. One question I keep turning over: if agents get a clean, structured interface, do they get better at the task or just lazier about understanding context? Does the easy path make them worse at navigating messy real-world stuff?
That's a really interesting question 😄 My guess is: both.
Giving agents a structured interface probably makes them much better at executing known tasks, because they spend less time figuring out what every button does and more time actually doing the work.
On the other hand, if an agent only ever sees clean, structured interfaces, it might become worse at dealing with messy real-world situations where no such structure exists.
Then again, we could say the same thing about humans. Most of us would rather use a well-designed API than reverse-engineer HTML and click random buttons all day 😅
The accessibility analogy clicked for me too. We implemented an MCP server for QRflows (QR code platform) with tools like create_qr, update_qr_url, get_qr_stats — and the experience was exactly what you describe: once you expose structured actions, the agent stops guessing and starts executing. The question you raise about "does easy path make agents lazier" — we've seen the opposite. Structured interfaces free up cognitive budget for decisions rather than DOM interpretation.
Thanks for sharing this! Honestly, comments like yours are another reason why I think webMCP might have a real chance of catching on.
And just to clarify: the "do structured interfaces make agents lazier?" question actually came from another commenter, not from me 😅
Personally, I'm perfectly fine with agents being a little lazy. I'm lazy too. If there's an easier and more reliable way to get something done, I'll happily take it 😂
Well that's certainly interesting ... a question that just occurred to me, and I don't know whether it makes any sense or not:
Could webMCP also be used with REST APIs? Because - when you come to think of it, websites and REST APIs are somewhat similar, in that they are "HTTP endpoints" ...
That's actually a really interesting question, and I don't think the answer is completely straightforward.
REST APIs already solve part of the problem when an agent can interact with a well-documented API directly.
webMCP is tackling a slightly different problem. It describes the UI and the actions available on a website, so the agent doesn't have to scrape the page and repeatedly guess what can be done.
For example, imagine a "Send Form" button. You could expose a sendForm() tool through webMCP. Behind the scenes, clicking that button might trigger a REST endpoint, and the same endpoint could be called by the webMCP tool.
The interesting part is that an agent using webMCP may not even know that REST endpoint exists. It only sees that a sendForm() action is available and knows how to use it.
So I see them more as complementary than competing technologies 🙂
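The sendForm() idea could be sketched like this: the button handler and the agent-facing tool both go through the same function, and only that function knows about the REST endpoint. The `/api/contact` URL, the `ContactForm` shape, and the injected `post` function are all assumptions made up for the example, and the tool object is a plain illustration rather than the real webMCP registration API.

```typescript
// Hypothetical names throughout; the transport is injected so the
// sketch runs without a network or a browser.
interface ContactForm {
  email: string;
  message: string;
}

type SendResult = { sent: true } | { sent: false; error: string };
type Post = (url: string, payload: ContactForm) => { status: number };

function createSendForm(post: Post): (form: ContactForm) => SendResult {
  return (form) => {
    if (form.email.indexOf("@") === -1) {
      return { sent: false, error: "invalid email" };
    }
    // Assumed endpoint; the agent never sees this URL.
    const response = post("/api/contact", form);
    return response.status === 200
      ? { sent: true }
      : { sent: false, error: `HTTP ${response.status}` };
  };
}

// Fake transport that records which URLs were hit.
const requestedUrls: string[] = [];
const fakePost: Post = (url) => {
  requestedUrls.push(url);
  return { status: 200 };
};

const sendForm = createSendForm(fakePost);

// Path 1: what the "Send Form" button would call.
const onSendButtonClick = (form: ContactForm) => sendForm(form);

// Path 2: what an agent would call; it only knows the name and description.
const sendFormTool = {
  name: "sendForm",
  description: "Submit the contact form with the given email and message",
  execute: (form: ContactForm) => sendForm(form),
};

const toolResult = sendFormTool.execute({ email: "a@example.com", message: "Hi" });
```

Both paths share validation and transport, so the tool cannot drift from what the button does, and the endpoint stays an implementation detail.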
Yeah now that I think properly about it I guess it totally makes no sense for REST APIs, haha ... so, you mean that for REST APIs the AI agent would be reading some sort of structured REST API docs? OpenAPI Specification (formerly Swagger), I guess that would be the answer ...
Yes, exactly 😄
If you have an OpenAPI specification, it's much easier for an agent because it already has a structured description of the API.
What I find interesting about webMCP is that it describes the UI layer instead. For example, my demo doesn't use any REST endpoints at all 🙂
So I'd say they're solving related but different problems: OpenAPI helps agents understand APIs, while webMCP helps them understand interfaces.
The accessibility comparison stuck with me, but I think it cuts deeper than the post lets it.
When we adapted for accessibility, the exposing-the-thing part (alt text, ARIA roles) was honestly the easy half. The half that mattered was firing up a screen reader and walking the whole checkout to see if a blind user could actually finish it. The annotation lying to you was the real risk.
webMCP feels like it’s at the alt-text stage. We’re exposing adoptAI() and trusting the description. But the screen-reader equivalent — something that drives the agent through a real multi-step flow and confirms it landed where it should — I don’t see it yet.
Is anyone you’ve seen building that second half, or is it all exposure so far?
Thanks for the thoughtful comment!
And honestly, I suspect your intuition is exactly right if this standard ends up gaining traction.
At some point, it won't be enough to simply expose actions and trust the agent to use them correctly. We can't really give agents access to critical workflows and let them make arbitrary decisions without some form of validation, authorization, or human approval.
As for whether anyone is building that second half yet, I honestly don't know. Right now we're talking about an experimental feature hidden behind a browser flag and mostly used in demos like mine 🙂
But if webMCP gets adopted more widely, then yes, I completely agree: we'll have to implement it thoughtfully, not just expose tools and hope for the best.
I think every technological revolution has two phases, broadly speaking. In phase 1, we fit the new technology to the pre-existing solution space as closely as possible. In phase 2, the new technology's true potential reshapes the solution space.
Take music, for example. When MP3 and the Internet freed music from a physical medium, we started to sell MP3s in (online) stores just like we had sold records and CDs in (brick and mortar) stores before. But really, that was conventional thinking. The technology soon reshaped the entire music business, giving us streaming. Records, CDs, and MP3s continue to exist, but as niche products.
webMCP looks like a typical phase 1 approach to me. We live in a world of human-centric, graphical webpages, so the new technology must fit in there somehow. It might well be adopted and prove useful, but I do not think it will last.
Take flight & hotel bookings, for example. Agents will eventually take over this task completely, and they will definitely not use an interface bolted onto a graphical webpage for humans. The agent-centric protocols and interfaces that will be fashionable at the time will take the lead, and then a website (the new niche product) will be automatically generated on this basis, not the other way round.
I truly enjoy the experimental nature of the early days. Let's see where they will take us!
Fantastic comment!
And I think you're absolutely right. In fact, I suspect even the creators of webMCP would agree with this framing. It feels very much like a bridge between today's web and whatever comes next.
The travel booking example is especially interesting. I'm not saying you're wrong, but I do wonder how much people will trust agents in practice. Will users really stop checking prices themselves? Or will they worry that the agent is booking something more expensive than necessary? (And let's be honest, it probably will from time to time 😅.) Maybe we'll still keep some version of the old-school web around for reassurance.
More than anything, I'm just fascinated to see where all of this goes. I was recently invited to participate in a conference discussion about agents, and the atmosphere reminded me of the early days of the JavaScript revolution. Nobody really knows where things are heading, everyone is experimenting, and people are constantly sharing ideas and discoveries. It's a very exciting time to watch.
The accessibility comparison is the part that stuck with me. That's exactly the right frame: it's progressive enhancement for a new kind of visitor, and the page still works for everyone else if the agent never shows up. That's a much easier sell internally than "rewrite your app for the bots."
Hi Sylwia, first of all thanks for the mention 🙏
I understood nothing about the project, but if my CEO approves we'll test it in the next few days.
Ready for an AI CEO partnership? 🤣🤣
🤣🤣🤣
Now I'm genuinely curious whether our CEOs would get along.
Please report back if they decide to pivot to AI together 😄
I asked my CEO, he's excited to talk to yours, maybe he can figure out how to improve his performance since we're currently at -18% 😅
What I find most interesting about webMCP isn't the agent part; it's the shift from forcing agents to infer intent from the UI to explicitly exposing capabilities.
We've seen a similar pattern at IT Path Solutions while building agent workflows. The more structured the interface between systems, the less time agents spend "figuring things out" and the more reliably they can execute tasks.
The accessibility comparison also feels spot on. Most users won't notice whether a site supports webMCP, but if agent-driven workflows become common, exposing actions in a machine-readable way could end up feeling as normal as adding semantic HTML or accessibility metadata.
Exactly! That's one of the things I find most appealing about it as well.
At this point AI and agents seem to be showing up everywhere, so it's not surprising that the web is doing what it has always done: adapting to a new kind of user.
The more I think about the potential applications, not just for businesses but for actual users, the more I like the idea of webMCP.
WebMCP is fascinating. The idea of agents building websites directly through a protocol instead of generating code for humans to run feels like a genuine shift.
The question I keep coming back to: will this make development faster or will it just move the complexity somewhere else? With traditional AI coding, the tax shows up in debugging (empty list assumptions, missing edge cases). With WebMCP, what's the new tax? Agent coordination? State management? Understanding what the agent actually did? Curious how the live demo held up when you pushed it past the happy path. Did it break in interesting ways?
Thanks for sharing this. MCP is one of those things that feels inevitable, but nobody's quite sure how it'll actually work yet. 🙌
That's a really good question!
My guess is that every abstraction moves complexity somewhere else. In this case, I suspect part of the "tax" will be designing proper safeguards for critical actions.
As @p0rt already pointed out in another comment, we probably don't want agents to have unlimited access to things like fireEmployee() and accidentally lay off half the company because somebody wrote a creative prompt 😂
As for what was difficult in the demo, exposing the tools themselves was actually quite straightforward. The trickier part was digging through some boilerplate and getting everything into a shape that the inspector extension could discover and understand correctly.
That feels more like the usual growing pains of an early technology than a fundamental problem, though, so I'm not too worried about that part 🙂
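To make the safeguard idea a bit more concrete, here is a minimal sketch of one way a site could wrap a critical action in a human-approval step before an agent is allowed to run it. Everything here is hypothetical: ToolDefinition, ConfirmFn, guardTool, and the critical flag are names invented for illustration and are not part of webMCP or the demo.

```typescript
// Hypothetical sketch: none of these names come from webMCP itself.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => string;

interface ToolDefinition {
  name: string;
  description: string;
  critical: boolean; // assumption: the site marks destructive actions itself
  handler: ToolHandler;
}

// Asks a human to approve a critical call; returns true when approved.
type ConfirmFn = (toolName: string, args: ToolArgs) => boolean;

// Wraps a tool so that critical actions only run after approval.
function guardTool(tool: ToolDefinition, confirm: ConfirmFn): ToolHandler {
  return (args) => {
    if (tool.critical && !confirm(tool.name, args)) {
      return `Refused: ${tool.name} requires human approval`;
    }
    return tool.handler(args);
  };
}

const fireEmployee: ToolDefinition = {
  name: "fireEmployee",
  description: "Removes an employee from the company",
  critical: true,
  handler: (args) => `Fired ${String(args.employeeId)}`,
};

// Two stand-in approval policies; a real one would prompt the user.
const denyAll: ConfirmFn = () => false;
const approveAll: ConfirmFn = () => true;

console.log(guardTool(fireEmployee, denyAll)({ employeeId: "e-42" }));
console.log(guardTool(fireEmployee, approveAll)({ employeeId: "e-42" }));
```

The point of the wrapper is that the approval decision lives outside the tool, so a creative prompt alone cannot trigger the action.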
Interesting concept. My impression is that the real issue for many platforms isn't scraping HTML, it's automated access at scale. Whether an agent discovers functionality by parsing the UI or through MCP, companies will still need authentication, rate limits, and anti-abuse controls. webMCP feels more like an accessibility layer for AI agents than a solution to the bot problem itself.
Oh, absolutely!
To me, that's more of an agent problem than a webMCP problem. Authentication, permissions, and abuse prevention still need to exist regardless of whether the agent uses webMCP, APIs, or clicks buttons on a page.
webMCP just helps websites expose their capabilities in a cleaner, more structured way 🙂
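As a rough illustration of that separation, the sketch below runs the same permission and rate-limit check before any action, no matter how the caller reached it. The names Caller, FixedWindowLimiter, and authorize are assumptions made up for this example, and the fixed-window counter is just one simple limiting strategy, not something webMCP prescribes.

```typescript
// Hypothetical sketch: access control that is independent of how the agent arrived.
interface Caller {
  id: string;
  permissions: Set<string>;
}

interface WindowEntry {
  windowStart: number;
  used: number;
}

// Allows at most `limit` calls per caller in each window of `windowMs` milliseconds.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, WindowEntry>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(callerId: string, nowMs: number): boolean {
    let entry = this.counts.get(callerId);
    // Start a fresh window for new callers or once the old window has expired.
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      entry = { windowStart: nowMs, used: 0 };
      this.counts.set(callerId, entry);
    }
    if (entry.used >= this.limit) return false;
    entry.used += 1;
    return true;
  }
}

type Decision = "ok" | "forbidden" | "rate_limited";

// Permission check first, then the rate limit, so forbidden calls use no quota.
function authorize(
  caller: Caller,
  requiredPermission: string,
  limiter: FixedWindowLimiter,
  nowMs: number,
): Decision {
  if (!caller.permissions.has(requiredPermission)) return "forbidden";
  if (!limiter.allow(caller.id, nowMs)) return "rate_limited";
  return "ok";
}
```

The current time is passed in explicitly only to keep the sketch easy to test; a real service would read its own clock.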
Comparing webMCP to web accessibility (A11y) is a brilliant analogy. Instead of forcing LLMs to endlessly scrape brittle DOM structures and guess what a button does, giving them a structured, semantic interface via declarative HTML or explicit JS tools makes complete sense. If agentic workflows are the future, webMCP might just become as standard as responsive design.
Exactly! I'm really curious to see what this will look like in a few years.
Interesting perspective. We've spent years optimizing websites for humans, search engines, and assistive technologies. Optimizing them for AI agents feels like a natural next step. The WebMCP approach looks promising, especially for complex workflows where traditional browser automation can be fragile.
Exactly! That's one of the things I love about the web. It keeps adapting to new users and new ways of interacting with technology.
Really interesting framing. The part that clicked for me is treating webMCP more like accessibility metadata than brittle browser automation. If sites can expose structured actions while still working normally for humans, that could remove a lot of latency and guessing from agent workflows. Curious how you think about auth and permission boundaries once this pattern becomes more common.
That's a really good question.
Honestly, the more I discuss this with people in the comments, the more complicated this part feels to me.
For simple use cases, like an assistant helping a user fill out a form inside an application, I don't think it's a huge problem. Auth and permissions can mostly work the way they already work today.
But autonomous agents moving across many websites? That's where it gets much more interesting. How much should we allow them to do? Where should the boundaries be? Who is responsible when something goes wrong?
Of course, this problem would exist even without webMCP, but webMCP probably makes it much more visible.
I have to admit I don't have a simple answer here yet.
This is really amazing, and I learnt about a new topic.
Thank you! 😄 I'm really glad you enjoyed it 😊
I just built something in the same vein over the last couple of weeks. It uses my CSS engine and JavaScript to serve Markdown instead of HTML when the user agent is an LLM!
That's really cool! 🩷 I guess these are the times we live in, agents are slowly becoming first-class users of the web.
And I think your approach is actually very complementary to webMCP. Serving Markdown makes it much easier for an LLM to read and understand content, while webMCP is more focused on helping agents take actions.
So one helps the agent understand what's on the page, and the other helps it interact with the page. They fit together quite nicely 🙂
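For readers wondering what "Markdown for agents, HTML for everyone else" could look like, here is a minimal sketch of the general idea. It is not the commenter's implementation: AGENT_HINTS, looksLikeAgent, and renderPage are hypothetical names, and the user-agent substrings are only illustrative examples, not a complete or authoritative list.

```typescript
// Hypothetical sketch: pick a response format based on the User-Agent header.
// Assumption: these substrings are illustrative examples only.
const AGENT_HINTS = ["gptbot", "claudebot", "perplexitybot"];

interface PageResponse {
  contentType: string;
  body: string;
}

// Case-insensitive substring match against the known hints.
function looksLikeAgent(userAgent: string): boolean {
  const ua = userAgent.toLowerCase();
  return AGENT_HINTS.some((hint) => ua.includes(hint));
}

// Returns Markdown for agents and HTML for everyone else.
// Note: a real page would escape `title` and `text` before embedding them in HTML.
function renderPage(userAgent: string, title: string, text: string): PageResponse {
  if (looksLikeAgent(userAgent)) {
    return { contentType: "text/markdown", body: `# ${title}\n\n${text}\n` };
  }
  return { contentType: "text/html", body: `<h1>${title}</h1><p>${text}</p>` };
}

console.log(renderPage("Mozilla/5.0 (compatible; GPTBot/1.0)", "Pricing", "Plans start at $5."));
console.log(renderPage("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0", "Pricing", "Plans start at $5."));
```

User-agent sniffing is easy to spoof, so this only decides presentation; it should not be treated as a security boundary.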
Someone needs to build an MCP Tamagotchi and an agent responsible for keeping it alive.
Hahaha, I absolutely love this idea 😂 Are you challenging me?
It will have a good number of use cases, like doing long, tedious tasks.
That's one of the use cases that immediately comes to mind. Imagine having to send 100 invoices through a website that doesn't expose a public API. Today you'd probably end up scripting the UI anyway.
Something like webMCP could make those kinds of tedious, repetitive tasks much more reliable for agents to handle 🙂
I heard about WebMCP, and your explanation was excellent. I haven’t tried the technology yet.
Thanks, Ben!
At the moment it's still very much an experimental technology, but who knows, maybe in a few years we'll see it everywhere. It'll be interesting to watch how it evolves and whether developers find it useful in practice 🙂
I can't wait:)
Treating webMCP as a clean semantic layer for AI agents instead of forcing them to endlessly scrape brittle HTML is honestly the most logical next step for modern web development.
I agree! At the very least, it feels like a logical evolution. If agents are becoming regular visitors to websites, giving them a cleaner and more structured way to interact with the web just makes sense 🙂
Fascinating. Still not sure why we keep appreciating a shot at NP rather than focusing on the P on our plate.
Haha, my guess is that this is exactly what progress looks like 😄
Besides, I don't think the two things are mutually exclusive. We can improve existing systems while also experimenting with new ideas.
And to be fair, webMCP is still an experimental feature being explored by a handful of enthusiastic engineers at Google, not something that's taking resources away from every other problem on the web 🙂
But with all the AI-looking UIs, it seems that people are starting to get tired of the look?
Do you mean webMCP itself or the app's design? Because if it's the design, then I'm afraid the AI CEO simply decided to save money on a designer 😂
Treating webMCP like an accessibility enhancement for AI agents is a brilliant mental model. Getting explicit, structured actions instead of forcing an LLM to blindly scrape the DOM and burn through half the world's token supply is a massive win. Love the AI CEO simulator demo!
This is the MCP direction that feels practical: not another chat window, but an interface where the agent can inspect the actual app state and make changes in context. The hard part is still verification, but the loop is much tighter.
A lot of sites go to extremes to make their content inaccessible to AI agents. Just look at Facebook and Reddit.
That's completely understandable, but how does that relate to webMCP?
So webMCP is "an approach designed to make it easier for AI agents to interact with websites".
Meanwhile many website owners consider it in their best interest to make it difficult, if not impossible, for AI agents to interact with their sites. Ironic if you ask me.
I'm still not sure I understand where the irony is.
Facebook and Reddit have reasons to make life harder for bots and agents. Other websites have the exact opposite goal (e-commerce for example). In fact, some people in these comments are already experimenting with ways to make their websites easier for agents to understand.
To me, that's perfectly normal. It's a bit like a store wanting customers to come through the front door while keeping the warehouse closed. Different websites simply want different things. webMCP doesn't really change that. And, of course, nobody is required to use it.
Sure, you have a point there. It's essentially the difference between a site that gets revenue from serving its visitors, and one that sells its visitors' attention to advertisers.
But the fact that ad-infested sites that disallow any independent access (like third-party apps and aggregators) are crowding out other sources of information and forums of conversation makes me feel uneasy, rather than "perfectly normal". Old age, I guess.
That's true. And in a way, the whole agent trend (not even webMCP specifically, just the broader shift) might end up changing that balance in interesting ways.
I'm not saying it will necessarily be better; we simply don't know that yet (and I'm rather pessimistic). But it could definitely be different. And that's one of the reasons I find this whole space so fascinating to watch.
Comparing webMCP to web accessibility for AI agents is an incredible paradigm shift. Instead of forcing LLMs to endlessly scrape DOM trees, exposing structured annotations like mcp-name is the most logical step for the agentic web. Plus, testing it with a 'LinkedIn CEO' persona is top-tier humor. Definitely checking out the repo!