<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Papa</title>
    <description>The latest articles on DEV Community by Papa (@papajams).</description>
    <link>https://dev.to/papajams</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3756591%2F7668ad38-c403-4b29-812f-52c067981580.png</url>
      <title>DEV Community: Papa</title>
      <link>https://dev.to/papajams</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/papajams"/>
    <language>en</language>
    <item>
      <title>Cognivern - Spend OS For Agent Teams</title>
      <dc:creator>Papa</dc:creator>
      <pubDate>Mon, 08 Jun 2026 06:47:42 +0000</pubDate>
      <link>https://dev.to/papajams/cognivern-spend-os-for-agent-teams-1852</link>
      <guid>https://dev.to/papajams/cognivern-spend-os-for-agent-teams-1852</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://cognivern.vercel.app" rel="noopener noreferrer"&gt;Cognivern&lt;/a&gt;&lt;/strong&gt; is a control plane for agent operations — a SpendOS for agent teams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ljj5r4u03j4wote2d8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ljj5r4u03j4wote2d8m.png" alt=" " width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As AI agents proliferate across development workflows, a quiet crisis is brewing: no one really controls what agents spend, on what, on behalf of whom, or why. Every agent gets what amounts to a blank check — against model APIs, against wallets, against third-party services. Cognivern exists to fix that.&lt;/p&gt;

&lt;p&gt;The platform unifies governed wallet spend and AI spend governance across IDE, CLI, and agent workflows into a single auditable control layer. The core promise is simple: &lt;strong&gt;move fast without blank checks&lt;/strong&gt;. Every spend decision can be policy-checked, privacy-preserving, efficiency-aware, and audit-ready — before it executes.&lt;/p&gt;

&lt;p&gt;This matters especially in emerging markets and for teams building on-chain infrastructure, where cost overruns from runaway agents aren't just annoying — they're existential. You don't burn budget you don't have chasing a misconfigured prompt loop.&lt;/p&gt;

&lt;p&gt;At its core, Cognivern provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Policy evaluation&lt;/strong&gt; — enforce who/what/when rules before any spend executes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy-native operations&lt;/strong&gt; — evaluate sensitive policy context via confidential paths using Fhenix FHE (Fully Homomorphic Encryption)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI spend governance&lt;/strong&gt; — model/runtime usage visibility and optimization alongside financial controls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trails&lt;/strong&gt; — persist decision evidence (&lt;code&gt;decisionId&lt;/code&gt;, attestation, run context) for continuous accountability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-provider AI routing&lt;/strong&gt; — ChainGPT as the primary Web3-native LLM, with Fireworks, OpenAI, Gemini, Anthropic, and others as fallbacks
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The stack is TypeScript + Solidity, deployed across X Layer Testnet (execution and policy), Filecoin Calibration (audit storage), and Fhenix (confidential policy state). The frontend lives at &lt;a href="https://cognivern.vercel.app" rel="noopener noreferrer"&gt;cognivern.vercel.app&lt;/a&gt; and includes a PromptOS terminal for natural-language governance interaction.&lt;/p&gt;
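
&lt;p&gt;To make the policy-evaluation and audit-trail ideas above concrete, here is a minimal sketch of a pre-spend check that returns a decision record. Every name in it (&lt;code&gt;SpendRequest&lt;/code&gt;, &lt;code&gt;SpendPolicy&lt;/code&gt;, &lt;code&gt;evaluateSpend&lt;/code&gt;) is hypothetical and chosen for illustration; only the &lt;code&gt;decisionId&lt;/code&gt; field name comes from the description above, and none of this is taken from the Cognivern codebase.&lt;/p&gt;

```typescript
// Hypothetical types: illustrative only, not Cognivern's real data model.
type SpendRequest = {
  agentId: string;
  amountUsd: number;
  recipient: string;
};

type SpendPolicy = {
  maxAmountUsd: number;        // per-request ceiling
  allowedRecipients: string[]; // who the agent is allowed to pay
};

type Decision = {
  decisionId: string; // identifier an audit log could persist
  approved: boolean;
  reasons: string[];  // why the request was denied, if it was
};

let decisionCounter = 0;

// Evaluate a request against the policy before anything executes.
function evaluateSpend(request: SpendRequest, policy: SpendPolicy): Decision {
  const reasons: string[] = [];
  if (request.amountUsd > policy.maxAmountUsd) {
    reasons.push("amount exceeds per-request limit");
  }
  if (!policy.allowedRecipients.includes(request.recipient)) {
    reasons.push("recipient not on allowlist");
  }
  decisionCounter += 1;
  return {
    decisionId: "dec-" + decisionCounter,
    approved: reasons.length === 0,
    reasons,
  };
}

const policy: SpendPolicy = { maxAmountUsd: 50, allowedRecipients: ["model-api"] };
const ok = evaluateSpend({ agentId: "a1", amountUsd: 20, recipient: "model-api" }, policy);
const denied = evaluateSpend({ agentId: "a1", amountUsd: 80, recipient: "unknown" }, policy);
console.log(ok.approved, denied.approved, denied.reasons);
```

&lt;p&gt;The point of returning a record rather than a bare boolean is that the same object can be written to an audit log whether the spend was approved or denied.&lt;/p&gt;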




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;Live app:&lt;/strong&gt; &lt;a href="https://cognivern.vercel.app" rel="noopener noreferrer"&gt;cognivern.vercel.app&lt;/a&gt;&lt;br&gt;
🔗 &lt;strong&gt;API:&lt;/strong&gt; &lt;a href="https://cognivern.thisyearnofear.com" rel="noopener noreferrer"&gt;cognivern.thisyearnofear.com&lt;/a&gt;&lt;br&gt;
🔗 &lt;strong&gt;PromptOS Terminal:&lt;/strong&gt; &lt;a href="https://cognivern.vercel.app/os" rel="noopener noreferrer"&gt;cognivern.vercel.app/os&lt;/a&gt;&lt;br&gt;
🔗 &lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/thisyearnofear/cognivern" rel="noopener noreferrer"&gt;github.com/thisyearnofear/cognivern&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key flows you can explore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Submit a spend request through the dashboard and watch policy evaluation fire in real time&lt;/li&gt;
&lt;li&gt;Use the PromptOS terminal to interact with governance rules in natural language&lt;/li&gt;
&lt;li&gt;Inspect the audit log — every decision has a &lt;code&gt;decisionId&lt;/code&gt; and attestation&lt;/li&gt;
&lt;li&gt;Try the encrypted spend path (Fhenix), where policy is evaluated over encrypted inputs, so the server never sees the raw values&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;p&gt;Cognivern started as a hackathon project with a clear thesis but rough edges everywhere. The core governance loop worked, but it was held together with duct tape: no proper workspace isolation, no rate limiting, brittle contract interactions, and a frontend that was functional but not something you'd confidently hand to an operator.&lt;/p&gt;

&lt;p&gt;Here's what changed during the finish-up:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure hardening&lt;/strong&gt; — Added per-workspace and per-API-key rate limiters with sliding windows, deep health checks, and circuit-breaker patterns. Moved to TypeScript strict mode throughout. Built out a unified CI pipeline.&lt;/p&gt;
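
&lt;p&gt;For readers unfamiliar with sliding-window limiting, the idea can be sketched in a few lines. This is an assumption-laden illustration, not the limiter that ships in Cognivern: the class name, the in-memory storage, and the explicit &lt;code&gt;now&lt;/code&gt; parameter (passed in to keep the example deterministic) are all hypothetical.&lt;/p&gt;

```typescript
// Sliding-window limiter keyed by workspace id or API key (hypothetical sketch).
class SlidingWindowLimiter {
  private hits: { [key: string]: number[] } = {};
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a call for `key` is allowed at time `now` (milliseconds).
  allow(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits[key] || []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits[key] = recent;
      return false; // rejected calls are not recorded
    }
    recent.push(now);
    this.hits[key] = recent;
    return true;
  }
}

const limiter = new SlidingWindowLimiter(2, 1000);
const results = [
  limiter.allow("ws-1", 0),
  limiter.allow("ws-1", 100),
  limiter.allow("ws-1", 200),  // third call inside the window
  limiter.allow("ws-1", 1150), // the hits at 0 and 100 have aged out
  limiter.allow("ws-2", 200),  // separate key, separate budget
];
console.log(results);
```

&lt;p&gt;Keying the map by workspace or API key is what gives each tenant an independent budget; a production version would likely keep the timestamps in a shared store rather than in process memory.&lt;/p&gt;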

&lt;p&gt;&lt;strong&gt;172 tests&lt;/strong&gt; — Unit, integration, and E2E via Playwright. The project went from "it works on my machine" to something with real coverage guarantees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-workspace and policy versioning&lt;/strong&gt; — Each workspace now has independent API keys, rate limits, and a full policy version history. This was the feature that turned a demo into something a real team could adopt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fhenix Wave 5–7&lt;/strong&gt; — The FHE integration went from a proof-of-concept to a full institutional demo: encrypted policies, MEV-protected execution, selective auditor disclosure, two-phase FHE resolution with &lt;code&gt;resolveDecision&lt;/code&gt;, sealed-bid vendor selection, and a Privara confidential payroll flow. Also migrated from Helium testnet to Arbitrum Sepolia.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ChainGPT integration&lt;/strong&gt; — Brought in ChainGPT as the primary AI provider for Web3-native governance queries, with the Smart Contract Auditor running as runtime pre-spend defense. This felt like the missing piece — governance AI that actually understands on-chain context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operator UX&lt;/strong&gt; — PromptOS terminal integrated into the sidebar, voice input via ElevenLabs STT, self-service onboarding flow, animated workspace mode toggles, full mobile responsiveness.&lt;/p&gt;

&lt;p&gt;The project went from ~60% production-ready to ~93%. The remaining 7% is mostly production key management and a few contract audit items before mainnet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx30cvuw5p8sgsh1j8c5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx30cvuw5p8sgsh1j8c5v.png" alt=" " width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  My Experience with GitHub Copilot
&lt;/h2&gt;

&lt;p&gt;Cognivern is a project with a lot of moving parts — Solidity contracts, TypeScript APIs, multi-chain deployment scripts, FHE integration, and a React frontend — often all in motion at the same time. Copilot was the connective tissue that kept things moving without constant context-switching tax.&lt;/p&gt;

&lt;p&gt;A few specific ways it earned its keep:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boilerplate elimination for the governance endpoints.&lt;/strong&gt; The API has 12+ endpoints with consistent patterns — request validation, policy lookup, decision logging, response shaping. Writing the first one from scratch was fine; Copilot handled the rest, often getting the full shape right on the first suggestion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solidity contract work.&lt;/strong&gt; The &lt;code&gt;ConfidentialSpendPolicy&lt;/code&gt; contract for Fhenix was genuinely novel — FHE operations aren't something most developers have pattern-matched on. Copilot's suggestions weren't always right, but they were useful scaffolding that surfaced the right questions. The back-and-forth of accepting, rejecting, and editing suggestions was faster than writing from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test generation.&lt;/strong&gt; Getting to 172 tests would have taken much longer without Copilot helping generate test cases from the function signatures and existing test patterns. It's particularly good at the "write 10 edge case tests for this validator" kind of ask.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;README and documentation.&lt;/strong&gt; The architecture docs, developer guide, and deployment docs are detailed. Copilot helped maintain consistent voice and structure across them, and was surprisingly good at inferring the right level of technical detail for each audience.&lt;/p&gt;

&lt;p&gt;The honest take: Copilot didn't make hard architectural decisions easier. The FHE integration design, the multi-chain deployment strategy, the policy versioning data model — those required real thinking. But it absorbed a huge amount of the mechanical work and kept me in flow during the push to get this finished.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Find me on &lt;a href="https://farcaster.xyz/papa" rel="noopener noreferrer"&gt;Farcaster&lt;/a&gt; and &lt;a href="https://palus.app/u/papajams" rel="noopener noreferrer"&gt;Lens&lt;/a&gt; — always building at the intersection of AI, emerging markets, and on-chain infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>githubchallenge</category>
      <category>githubcopilot</category>
    </item>
    <item>
      <title>DiversiFi — Finishing What Inflation Started</title>
      <dc:creator>Papa</dc:creator>
      <pubDate>Mon, 08 Jun 2026 06:34:44 +0000</pubDate>
      <link>https://dev.to/papajams/diversifi-finishing-what-inflation-started-9mb</link>
      <guid>https://dev.to/papajams/diversifi-finishing-what-inflation-started-9mb</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://diversifiapp.vercel.app" rel="noopener noreferrer"&gt;DiversiFi&lt;/a&gt; is an AI-powered stablecoin diversification app built on Celo and Arbitrum. The premise is simple but personal: your stablecoins shouldn't all be pegged to the dollar.&lt;/p&gt;

&lt;p&gt;If you live in Kenya — as I do — inflation isn't an abstract macroeconomic concept. It's the gap between what you earned last year and what that money buys today. It's the reason holding savings in a local currency account quietly destroys purchasing power, and why stablecoins feel like a genuine unlock: your savings can actually compound instead of erode.&lt;/p&gt;

&lt;p&gt;But even dollar-pegged stables have their own exposure. And if you care about your continent — about African economies developing their own financial infrastructure, about emerging markets building on-chain alternatives to broken legacy rails — then a portfolio that's 100% cUSD is both financially incomplete and ideologically inconsistent.&lt;/p&gt;

&lt;p&gt;DiversiFi tries to fix both problems at once. Connect a wallet, pick a financial philosophy, deposit stablecoins into a non-custodial Safe smart account, and let an AI agent rebalance your holdings across regional stablecoins — cUSD (US), cEUR (EU), KESm (Kenya), COPm (Colombia), PHPm (Philippines), cREAL (Brazil) — based on live inflation and economic data.&lt;/p&gt;

&lt;p&gt;The agent doesn't just chase yield. It reads governance forums, World Bank inflation feeds, and economic signals to make allocation decisions that reflect both the numbers and the philosophy you've chosen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Africapitalism&lt;/strong&gt; — keep wealth circulating in African economies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Islamic Finance&lt;/strong&gt; — Sharia-compliant, no interest-bearing assets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buen Vivir&lt;/strong&gt; — LatAm philosophy balancing material wealth with community wellbeing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Global Diversification&lt;/strong&gt; — maximum geographic spread&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom&lt;/strong&gt; — define your own allocation targets
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't cosmetic. Each philosophy filters which assets the agent can touch, how it weights rebalancing recommendations, and what it rules out entirely. The goal is a tool that reflects how real people in real places actually think about money, not just a generic robo-advisor with a world-map splash screen.&lt;/p&gt;

&lt;p&gt;Built by &lt;a href="https://farcaster.xyz/papa" rel="noopener noreferrer"&gt;@papajams&lt;/a&gt; · &lt;a href="https://palus.app/u/papajams" rel="noopener noreferrer"&gt;Lens&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;Live app:&lt;/strong&gt; &lt;a href="https://diversifiapp.vercel.app" rel="noopener noreferrer"&gt;diversifiapp.vercel.app&lt;/a&gt;&lt;br&gt;
📦 &lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/thisyearnofear/diversify" rel="noopener noreferrer"&gt;github.com/thisyearnofear/diversify&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvtgoc56byz3ubkqsmwq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvtgoc56byz3ubkqsmwq.png" alt=" " width="800" height="839"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbl7ddx9ibvcol97dzp4k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbl7ddx9ibvcol97dzp4k.png" alt=" " width="800" height="1307"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;p&gt;DiversiFi started as a hackathon prototype — the kind that works well enough for a 3-minute pitch but quietly falls apart the moment you try to actually use it.&lt;/p&gt;

&lt;p&gt;The core flows were broken. The agent could recommend rebalances but couldn't reliably execute them. The permission system — the piece that makes this non-custodial and therefore trustworthy — was wired up but unenforced, which defeated the whole point. The UI showed allocation targets but gave no real-time feedback on what the agent was actually doing. And the financial strategy layer was mostly decorative; it influenced the copy, not the code.&lt;/p&gt;

&lt;p&gt;The push to actually finish it came from submitting to the &lt;a href="https://luma.com/ETHMX2026" rel="noopener noreferrer"&gt;Ethereum México x Bitso Hackathon&lt;/a&gt; — a 5-week global build sprint at the intersection of AI, stablecoins, and payments, with Bitso as a key integration partner and 20% of judging weighted on LATAM real-world impact. Having real mentors and a live demo day in front of regulators and fund managers has a way of clarifying what "done" actually means.&lt;/p&gt;

&lt;p&gt;Here's what changed:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Execution layer fixed.&lt;/strong&gt; &lt;code&gt;_executor.ts&lt;/code&gt; now correctly bridges the vault service to the chain via Privy smart accounts, with a local dev fallback that doesn't require a full smart account setup to test against.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permission model enforced.&lt;/strong&gt; Session signer policies now actually gate what the agent can spend, on which contracts, within what time bounds. The agent cannot exceed user-defined limits. This is the difference between "non-custodial" as a marketing claim and non-custodial as an architectural guarantee.&lt;/p&gt;
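
&lt;p&gt;As a rough sketch of what "gating" means here, the check below rejects any agent transaction that exceeds a spend limit, targets a contract outside an allowlist, or falls outside a validity window. The &lt;code&gt;SessionPolicy&lt;/code&gt; shape, the field names, and the placeholder addresses are assumptions made for illustration; this is not Privy's actual policy format or DiversiFi's code.&lt;/p&gt;

```typescript
// Hypothetical policy shape: NOT Privy's real session-signer policy schema.
type SessionPolicy = {
  maxSpendPerTx: number;      // in stablecoin units
  allowedContracts: string[]; // lowercase contract addresses the agent may call
  validFrom: number;          // unix seconds
  validUntil: number;         // unix seconds
};

type AgentTx = { to: string; amount: number; timestamp: number };

// Returns null when the transaction is permitted, otherwise the reason it is blocked.
function checkTx(tx: AgentTx, policy: SessionPolicy): string | null {
  if (tx.amount > policy.maxSpendPerTx) return "over spend limit";
  if (!policy.allowedContracts.includes(tx.to.toLowerCase())) return "contract not allowed";
  if (policy.validFrom > tx.timestamp) return "policy not yet active";
  if (tx.timestamp > policy.validUntil) return "policy expired";
  return null;
}

// Placeholder addresses, not real contracts.
const sessionPolicy: SessionPolicy = {
  maxSpendPerTx: 100,
  allowedContracts: ["0xswaprouter"],
  validFrom: 1000,
  validUntil: 2000,
};

const inBounds = checkTx({ to: "0xSwapRouter", amount: 50, timestamp: 1500 }, sessionPolicy);
const tooLarge = checkTx({ to: "0xSwapRouter", amount: 500, timestamp: 1500 }, sessionPolicy);
const wrongTarget = checkTx({ to: "0xOther", amount: 50, timestamp: 1500 }, sessionPolicy);
const expired = checkTx({ to: "0xSwapRouter", amount: 50, timestamp: 2500 }, sessionPolicy);
console.log(inBounds, tooLarge, wrongTarget, expired);
```

&lt;p&gt;The important property is that the check runs where the agent cannot bypass it. A check like this living only in the agent's own code would be a convention, not a guarantee.&lt;/p&gt;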

&lt;p&gt;&lt;strong&gt;Strategy wired into agent behaviour.&lt;/strong&gt; Each financial philosophy now filters and weights rebalance recommendations at the &lt;code&gt;vault.service.ts&lt;/code&gt; level. Africapitalism doesn't just change the UI label — it changes which assets the agent will and won't touch.&lt;/p&gt;
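
&lt;p&gt;One way to picture "filters and weights" is a two-step function: drop the assets a philosophy rules out, then normalise the remaining weights. The sketch below assumes a made-up strategy table; the asset list and the 3:1 weighting are illustrative values, not DiversiFi's real configuration, and &lt;code&gt;targetAllocation&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```typescript
// Hypothetical strategy table: the allowed assets and weights are illustrative only.
type Strategy = {
  allowed: string[];
  weights: { [symbol: string]: number };
};

const africapitalism: Strategy = {
  allowed: ["KESm", "cUSD"],
  weights: { KESm: 3, cUSD: 1 },
};

// Drop assets the strategy rules out, then normalise weights into fractions of the portfolio.
function targetAllocation(candidates: string[], strategy: Strategy): { [symbol: string]: number } {
  const kept = candidates.filter((s) => strategy.allowed.includes(s));
  const total = kept.reduce((sum, s) => sum + (strategy.weights[s] || 0), 0);
  const out: { [symbol: string]: number } = {};
  for (const s of kept) {
    out[s] = total > 0 ? (strategy.weights[s] || 0) / total : 0;
  }
  return out;
}

const allocation = targetAllocation(["cUSD", "cEUR", "KESm", "cREAL"], africapitalism);
console.log(allocation);
```

&lt;p&gt;With these made-up weights, cEUR and cREAL never appear in the output at all, which is the difference between a strategy that changes a label and one that changes what the agent may touch.&lt;/p&gt;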

&lt;p&gt;&lt;strong&gt;Real transaction receipts.&lt;/strong&gt; Transactions now log through OpenClaw with human-readable summaries. Users can see exactly what the agent did, why, and when — not just a tx hash.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bitso integration.&lt;/strong&gt; Added Bitso as a payment rail, bridging fiat on-ramps to on-chain stablecoin positions. For LATAM users this matters enormously: getting funds into the protocol shouldn't require already being crypto-native.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Expanded to Arbitrum.&lt;/strong&gt; Extended beyond Celo to support Arbitrum, broadening the asset universe and giving users access to deeper liquidity pools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fee model stabilised.&lt;/strong&gt; 1% annual management + 10% performance above high-water mark + 0.10% swap spread, now calculated and settled correctly at withdrawal rather than estimated and forgotten.&lt;/p&gt;
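
&lt;p&gt;The three rates above combine roughly as follows. The rates are the ones stated in this post; everything else is an assumption made for the sake of a worked example, including the daily proration of the management fee, charging it on the withdrawal balance, and the &lt;code&gt;FeeInput&lt;/code&gt; and &lt;code&gt;feesAtWithdrawal&lt;/code&gt; names.&lt;/p&gt;

```typescript
// Rates come from the post; how they are prorated and combined is an assumption.
const MANAGEMENT_RATE = 0.01;  // 1% per year
const PERFORMANCE_RATE = 0.10; // 10% of gains above the high-water mark
const SWAP_SPREAD = 0.001;     // 0.10% of swapped volume

type FeeInput = {
  balance: number;       // portfolio value at withdrawal
  highWaterMark: number; // highest value already used for a performance fee
  daysHeld: number;
  swapVolume: number;    // total volume swapped during rebalances
};

function feesAtWithdrawal(input: FeeInput) {
  // Management fee prorated by holding period (assumed: simple daily proration).
  const management = input.balance * MANAGEMENT_RATE * (input.daysHeld / 365);
  // Performance fee only applies to value above the high-water mark.
  const gain = Math.max(0, input.balance - input.highWaterMark);
  const performance = gain * PERFORMANCE_RATE;
  const swap = input.swapVolume * SWAP_SPREAD;
  return { management, performance, swap, total: management + performance + swap };
}

const fees = feesAtWithdrawal({
  balance: 1100,
  highWaterMark: 1000,
  daysHeld: 365,
  swapVolume: 2000,
});
console.log(fees);
```

&lt;p&gt;For a 1,100 balance held one year against a 1,000 high-water mark with 2,000 of swap volume, that works out to about 11 in management, 10 in performance, and 2 in swap spread. If the balance were below the high-water mark, the performance component would be zero.&lt;/p&gt;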

&lt;p&gt;The project went from a prototype that made a good pitch to something I'd actually trust with a real deposit.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Experience with GitHub Copilot
&lt;/h2&gt;

&lt;p&gt;I used Copilot Chat throughout the finishing process — primarily for architecture and debugging, less as a code generator and more as a thinking partner when things got tangled.&lt;/p&gt;

&lt;p&gt;The most valuable moments were in the permission and execution layers, which are genuinely non-trivial. ERC-4337 smart accounts, session signer policies, Privy's secure enclave model — these interact in ways that aren't obvious, and when something breaks the error messages are often unhelpfully cryptic. Being able to paste a stack trace or policy config into Copilot Chat and get a focused hypothesis about what was failing saved real time that would otherwise have gone into reading SDK internals line by line.&lt;/p&gt;

&lt;p&gt;I also used it to pressure-test the security model. Walking through the architecture — user controls Safe, agent signs within policy, no private key on server — and asking Copilot to look for holes surfaced a few edge cases around policy expiry and fallback signing I hadn't thought through carefully enough. Having something push back on your assumptions is underrated.&lt;/p&gt;

&lt;p&gt;It's not magic. It didn't know Mento Protocol's quirks or Celo's specific bundler constraints out of the box. But as a tool for reasoning through complex, interlocking systems — rather than just autocompleting boilerplate — Copilot Chat earned its place in this build.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Find me on &lt;a href="https://farcaster.xyz/papa" rel="noopener noreferrer"&gt;Farcaster&lt;/a&gt; and &lt;a href="https://palus.app/u/papajams" rel="noopener noreferrer"&gt;Lens&lt;/a&gt; — always building at the intersection of AI, emerging markets, and on-chain infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
    </item>
    <item>
      <title>WebMCP Might Be the Most Important Announcement at Google I/O 2026</title>
      <dc:creator>Papa</dc:creator>
      <pubDate>Mon, 25 May 2026 00:49:35 +0000</pubDate>
      <link>https://dev.to/papajams/webmcp-might-be-the-most-important-announcement-at-google-io-2026-1gfh</link>
      <guid>https://dev.to/papajams/webmcp-might-be-the-most-important-announcement-at-google-io-2026-1gfh</guid>
      <description>&lt;p&gt;Every few years a technology shows up that looks like a product but is actually a protocol. When that happens, the product gets forgotten and the protocol becomes infrastructure. Google I/O 2026 had one of those moments. It just didn't get treated like one.&lt;/p&gt;

&lt;p&gt;The models were impressive. Gemini 3.5 Flash is four times faster than its predecessors. Antigravity 2.0 makes agent orchestration feel like something you'd actually ship. AI Studio now deploys to Cloud Run in one click. None of it was architecturally surprising. But buried in the developer sessions was something different: WebMCP, a proposed open standard for exposing structured tools to browser-based AI agents.&lt;/p&gt;

&lt;p&gt;That one is worth sitting with.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Failure Mode Everyone Already Knows
&lt;/h2&gt;

&lt;p&gt;If you have ever maintained Selenium automation for more than six months, you already understand the problem WebMCP is trying to solve.&lt;/p&gt;

&lt;p&gt;The automation works until the product team redesigns the checkout page. Then the selector breaks. You fix it. Three weeks later the login flow changes. You fix it again. You are not engineering anything — you are running a permanent rearguard action against a UI that was never designed to stay still. The automation is fragile because it is built on inference: your code is guessing at intent by reading presentation.&lt;/p&gt;

&lt;p&gt;The first generation of browser AI agents has exactly this problem, at larger scale and higher stakes. They can see buttons and forms and navigation menus, and they can click on things, but they are always one redesign away from failing. They are imitating human behavior because the web has never offered them an alternative.&lt;/p&gt;

&lt;p&gt;Imagine booking a flight through an agent today. The agent visually searches for departure fields, date pickers, seat selectors, and payment buttons. Every redesign risks breaking the workflow. Under WebMCP, the airline could expose booking itself as a structured capability: destination, dates, passenger count, seat preferences, payment authorization. The agent stops navigating the interface and starts interacting with the system underneath it.&lt;/p&gt;

&lt;p&gt;WebMCP is the alternative.&lt;/p&gt;

&lt;p&gt;The standard lets web developers expose structured tools — JavaScript functions, typed parameters, form interactions — as machine-readable capabilities. Instead of an agent inferring "this is probably a search box" by parsing the DOM, the site simply declares: here is a search function, here are its inputs, here is what it returns. Declarative for standard interactions, imperative for anything requiring runtime JavaScript. Chrome's experimental origin trial starts in Chrome 149.&lt;/p&gt;
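
&lt;p&gt;To give a feel for "declare, don't infer", here is a deliberately generic sketch of a declared tool and an agent-side argument check. It is not the WebMCP API: the &lt;code&gt;ToolDeclaration&lt;/code&gt; type, the &lt;code&gt;searchFlights&lt;/code&gt; tool, and &lt;code&gt;validateArgs&lt;/code&gt; are all hypothetical names invented for this example, and the real proposal should be consulted for the actual registration surface.&lt;/p&gt;

```typescript
// Illustrative only: this is NOT the WebMCP API, just the general idea of a declared tool.
type ParamSpec = { type: "string" | "number"; required: boolean };

type ToolDeclaration = {
  name: string;
  description: string;
  params: { [name: string]: ParamSpec };
  run: (args: { [name: string]: unknown }) => unknown;
};

// A site describing a capability directly instead of leaving it to be guessed from the DOM.
const searchFlights: ToolDeclaration = {
  name: "searchFlights",
  description: "Search flights by destination and date",
  params: {
    destination: { type: "string", required: true },
    date: { type: "string", required: true },
    passengers: { type: "number", required: false },
  },
  run: (args) => ({ query: args, results: [] }),
};

// Agent side: validate arguments against the declaration before calling the tool.
function validateArgs(tool: ToolDeclaration, args: { [name: string]: unknown }): string[] {
  const errors: string[] = [];
  for (const name of Object.keys(tool.params)) {
    const spec = tool.params[name];
    const value = args[name];
    if (value === undefined) {
      if (spec.required) errors.push("missing " + name);
    } else if (typeof value !== spec.type) {
      errors.push(name + " should be " + spec.type);
    }
  }
  return errors;
}

const good = validateArgs(searchFlights, { destination: "NBO", date: "2026-07-01" });
const bad = validateArgs(searchFlights, { destination: "NBO", passengers: "two" });
console.log(good, bad);
```

&lt;p&gt;Notice that nothing in this sketch depends on what the page looks like. A redesign that moves every button leaves the declaration, and therefore the agent's workflow, untouched.&lt;/p&gt;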

&lt;p&gt;The immediate gain is reliability. But that is not the interesting part.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changes Under the Surface
&lt;/h2&gt;

&lt;p&gt;Websites have always been designed around visibility. If a human could see and operate something, the web had succeeded. That assumption ran so deep it was invisible — interfaces were presentation layers, and making them look right was the whole job.&lt;/p&gt;

&lt;p&gt;WebMCP introduces a different assumption: systems may not need to be visually navigable to be operationally useful. The interface stops being primarily a presentation layer and starts being a capability surface.&lt;/p&gt;

&lt;p&gt;That is a significant mutation.&lt;/p&gt;

&lt;p&gt;An airline site exposing a structured booking capability is no longer just a place you visit. It becomes a service an agent can call directly. The distinction between website and API starts to blur at the protocol level, not just for developers, but for the web itself.&lt;/p&gt;

&lt;p&gt;There is historical precedent for this shift.&lt;/p&gt;

&lt;p&gt;RSS made web content machine-readable. A feed reader did not have to scrape a blog and guess where the article title ended and the sidebar began. The site simply exposed structure directly. RSS eventually collapsed as a consumer technology, but the idea it proved — that structured syndication beats scraping — became foundational to modern content APIs.&lt;/p&gt;

&lt;p&gt;WebMCP does for actions what RSS did for content.&lt;/p&gt;

&lt;p&gt;That distinction matters enormously.&lt;/p&gt;

&lt;p&gt;Content syndication is passive. The machine reads what a human wrote. Action exposure is active — the machine performs operations on a user's behalf, with real-world consequences. The jump from "readable" to "actionable" changes the ontology of the web itself.&lt;/p&gt;

&lt;p&gt;This is what Google is quietly building toward.&lt;/p&gt;

&lt;p&gt;Antigravity 2.0 orchestrates agents. Gemini Spark acts across Gmail, Calendar, and eventually third-party tools via MCP. But agent workflows are only as reliable as the surfaces they operate on. The whole agentic stack presupposes that websites will eventually expose structured interfaces for machine consumption.&lt;/p&gt;

&lt;p&gt;WebMCP is the specification for what that looks like on the open web.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Critique You Have to Make
&lt;/h2&gt;

&lt;p&gt;Here is where most conference coverage goes soft.&lt;/p&gt;

&lt;p&gt;WebMCP only matters if adoption follows. An open standard with one browser behind it and no ecosystem buy-in is just a Chrome experiment. The history of proposed web standards is mostly a graveyard of promising ideas that died waiting for critical mass, or got implemented inconsistently enough that developers ended up writing workarounds anyway — which is to say, they ended up back at the Selenium problem.&lt;/p&gt;

&lt;p&gt;Google has enough platform leverage to push Chrome 149 to most of the world's browsers in six months. It does not have the same leverage over every site that agents will need to use. The gap between "here is a standard" and "here is a standard that Stripe and Shopify and healthcare portals have implemented correctly" is years of developer effort and business negotiation. Nothing about announcing a standard compresses that timeline.&lt;/p&gt;

&lt;p&gt;There is also a safety question the I/O coverage largely sidesteps.&lt;/p&gt;

&lt;p&gt;Structured tool exposure cuts both ways. Right now browser agents are safe partly because they are limited: they cannot do that much. A web where every site exposes clean, machine-actionable capabilities is a web where the blast radius of a compromised or misbehaving agent gets significantly larger.&lt;/p&gt;

&lt;p&gt;The permissions model, the consent model, the audit trail — none of that is solved by declaring "here are the actions this site supports." If anything, it sharpens the accountability question.&lt;/p&gt;

&lt;p&gt;The infrastructure is arriving faster than the trust guarantees.&lt;/p&gt;

&lt;p&gt;That is the honest summary of where agentic development actually sits right now. Not just for WebMCP — for all of it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Is Still the Story
&lt;/h2&gt;

&lt;p&gt;None of those concerns make WebMCP less important. They make it more important to track carefully.&lt;/p&gt;

&lt;p&gt;The DEV community's instinct after I/O was telling. The submissions that resonated were not about model benchmarks. They were about infrastructure, about privacy, about frameworks designed for machines as much as humans. That pattern is not accidental.&lt;/p&gt;

&lt;p&gt;Developers who ship things for a living have a reliable nose for where the actual work is going to land, and right now that nose is pointing at integration — not intelligence.&lt;/p&gt;

&lt;p&gt;The capability problem is closer to solved than most people want to admit. Models reason well. Models act. What remains unsolved is making those actions reliable, auditable, and safe at scale.&lt;/p&gt;

&lt;p&gt;That is an infrastructure problem.&lt;/p&gt;

&lt;p&gt;And infrastructure problems get solved by protocols, not products.&lt;/p&gt;

&lt;p&gt;WebMCP is an early answer to the question of what reliable agent-web interaction should look like. It will probably not be the final answer. RSS wasn't either. But RSS proved the idea was viable, and everything that followed built on that proof.&lt;/p&gt;

&lt;p&gt;The original web connected documents.&lt;/p&gt;

&lt;p&gt;The next version may connect capabilities — not just for humans navigating pages, but for agents executing intent.&lt;/p&gt;

&lt;p&gt;The web was built for humans to navigate.&lt;/p&gt;

&lt;p&gt;The next version may be built for agents to operate.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Submitted for the Google I/O 2026 Writing Challenge on DEV.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>google</category>
      <category>ai</category>
      <category>web</category>
      <category>techtalks</category>
    </item>
  </channel>
</rss>
