Almost 20 years of experience, mostly in, but not limited to, Java. I started with JDK 1.5 ;)
Certified Golang and Scala developer, even though I don't remember much. Working as a contractor, mostly.
As far as I understand, instead of the agent parsing the HTML directly and wasting tokens, there is an MCP server that sits in between and optimizes the process.
I have a Selenium MCP server that opens Chrome, clicks, fills in fields, etc., and a debugger MCP server that places breakpoints and analyzes JSON.
I am really interested to see how they can interact with webMCP. Instead of Selenium loading the whole DOM and the AI parsing it to see what's inside, why not ask webMCP to give back more structured and token-efficient results, so that the AI can tell Selenium where to click?
It opens up so many options. :)
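To make the idea a bit more concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the field names in `PAGE_ACTIONS`, the CSS selectors, and the `plan_clicks` helper are assumptions, not the webMCP API and not the interface of any real Selenium MCP server. It only shows the shape of the flow: the page hands over a short list of actions, and the agent turns the chosen one into steps a browser driver could run.

```python
# Hypothetical structured description of a page: one entry per available action.
# The field names are invented for this sketch, not taken from webMCP.
PAGE_ACTIONS = [
    {"name": "search", "description": "Search the catalog",
     "input": "#q", "submit": "#go"},
    {"name": "open_cart", "description": "Open the shopping cart",
     "submit": "a[href='/cart']"},
]


def plan_clicks(actions, wanted, text=None):
    """Turn a chosen action into simple steps a browser driver could execute."""
    # Pick the action by name; raises StopIteration if the name is unknown.
    action = next(a for a in actions if a["name"] == wanted)
    steps = []
    # Only emit a typing step when the action has an input and text was given.
    if text is not None and "input" in action:
        steps.append(("type", action["input"], text))
    steps.append(("click", action["submit"]))
    return steps


steps = plan_clicks(PAGE_ACTIONS, "search", "usb cable")
print(steps)
```

The agent only ever sees the two small entries in `PAGE_ACTIONS`, never the full DOM, and the resulting steps are something a Selenium-style driver could act on.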
Exactly! That's what I find most interesting about it. Instead of scraping the entire page, analyzing the DOM, and essentially guessing what can be done, the agent gets explicit information about the available actions. Fewer tokens wasted, fewer opportunities for confusion, and hopefully fewer "creative" interpretations of the UI 😅
And you're right, it opens up a lot of possibilities. Someone in another comment just mentioned accessibility as well. Combined with LLM-powered assistive tools, webMCP could potentially make websites even easier to use for people, not just for agents.
I'm really curious to see where this goes. From what I've read, Google is aiming for a first stable Chrome release later this year, so we'll probably learn pretty quickly whether developers find it useful in practice.