DEV Community: Om Shree

The Protocol Consolidates: Five Core Industries Just Adopted the Model Context Protocol (MCP)

Om Shree — Tue, 02 Jun 2026 14:44:09 +0000

The battle for AI dominance is no longer waged purely on model weights or parameter counts. Instead, it is being decided at the integration layer. For platform architects and developers, the friction of writing bespoke, fragile API glue for every new LLM or enterprise tool has been a persistent bottleneck.

The Model Context Protocol (MCP) has emerged as the universal integration standard designed to solve this. In a sign of how quickly the ecosystem is maturing, five major engineering and enterprise platforms spanning Advertising, Web3/DeFi, DevSecOps, Community, and Cloud Observability have simultaneously shipped native MCP server integrations.

By exposing their core platforms as protocol-compliant context layers, these companies are shifting the industry from static dashboards to active, agentic engineering swarms. Here is a deep dive into what was just released.

1. Marketing Automation: AdRoll Brings "Draft-First" Controls to AI

Moving from analytical data to campaign execution inside advertising platforms typically involves heavy CSV exporting and manual dashboard navigation. AdRoll has closed this gap by launching its AdRoll MCP Server in open beta.

The Capability: Marketers can connect their AdRoll accounts directly to MCP-native environments like Claude, ChatGPT, or Cursor. Using natural language, agents can fetch real-time multi-channel metrics, run week-over-week conversion trends, and surface Account-Based Marketing (ABM) intent signals.
The Safety Rail: Crucially, the server supports draft-first campaign creation. If an agent identifies an optimization opportunity based on performance logs, it builds and stages a campaign draft inside AdRoll for human review rather than altering live budgets autonomously.
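
The draft-first rule described above can be sketched in a few lines. This is a minimal illustration, not AdRoll's implementation: the `DraftFirstStore` class and its method names are hypothetical, and the only point is that an agent-side call can never produce a live campaign.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    LIVE = "live"


@dataclass
class Campaign:
    name: str
    daily_budget: float
    status: Status = Status.DRAFT


@dataclass
class DraftFirstStore:
    """Hypothetical store: agents may only stage drafts; humans publish."""
    campaigns: dict = field(default_factory=dict)

    def agent_stage(self, name: str, daily_budget: float) -> Campaign:
        # Agent-originated changes always land as drafts.
        campaign = Campaign(name, daily_budget)
        self.campaigns[name] = campaign
        return campaign

    def human_publish(self, name: str, reviewer: str) -> Campaign:
        # Only an explicit, named human review promotes a draft to live.
        if not reviewer:
            raise PermissionError("a named human reviewer is required")
        campaign = self.campaigns[name]
        campaign.status = Status.LIVE
        return campaign


store = DraftFirstStore()
draft = store.agent_stage("retargeting-q3", daily_budget=150.0)
```

The agent path has no way to set `Status.LIVE`; the budget only moves once `human_publish` is called with a reviewer.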

2. Web3 & Decentralized Finance: Base Launches "Base MCP" Onchain Gateway

Coinbase’s Layer 2 ecosystem, Base, has launched Base MCP, an onchain gateway that turns conversational interfaces into fully functional, secure web3 wallets.

The Capability: Rather than forcing users to manually interact with fractured dApp UIs, Base MCP exposes native wallet capabilities—such as portfolio tracking, token swaps, and fund transfers—directly to language models. From day one, it embeds pre-built skill plugins for major DeFi protocols including Uniswap, Aerodrome, Morpho, and Moonwell.
The Safety Rail: Base MCP introduces a stored requests primitive built on OAuth 2.1. The MCP server never touches or stores private keys. When an agent initiates a swap or transfer, it structures the unsigned payload locally and passes back a secure link, requiring the user to manually review, simulate asset impact, and sign the transaction via their wallet.

3. Application Security: Detectify Embeds the "Find & Fix" Security Loop

As autonomous coding agents generate and push code at unprecedented volumes, traditional security review cycles are falling behind. Detectify has addressed this by launching the Detectify MCP Server to embed real-time vulnerability validation directly into the autonomous software development lifecycle (SDLC).

The Capability: Coding agents working inside an IDE or CI environment can query Detectify's scanning engines dynamically to check for exploitable vulnerabilities.
The Deterministic Moat: LLMs are inherently probabilistic, which makes them notoriously poor at verifying security exploits definitively. The Detectify MCP server acts as a deterministic oracle. Through its Find & Fix automation, a coding agent can receive a vulnerability report from Detectify, draft an inline code patch, trigger a targeted Detectify validation scan, and present a verified, compile-clean fix for human sign-off.
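
The loop described above (report, patch, validation scan, sign-off) reduces to a retry loop with a deterministic gate. The sketch below assumes nothing about Detectify's real API: `propose_patch` and `validate` are hypothetical callables standing in for the coding agent and the scanner.

```python
from typing import Callable, Optional


def find_and_fix(
    finding: str,
    propose_patch: Callable[[str, int], str],
    validate: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first patch the scanner confirms, or None if none passes."""
    for attempt in range(1, max_attempts + 1):
        patch = propose_patch(finding, attempt)  # probabilistic step (the LLM)
        if validate(patch):                      # deterministic step (the scanner)
            return patch                         # still handed to a human for sign-off
    return None


# Stand-ins for illustration only; a real agent and a real scanner go here.
def fake_agent(finding: str, attempt: int) -> str:
    return "escape(user_input)" if attempt >= 2 else "user_input"


def fake_scanner(patch: str) -> bool:
    return "escape(" in patch


verified = find_and_fix("reflected XSS in /search", fake_agent, fake_scanner)
```

Here the first attempt fails validation and the second passes, so `verified` holds the second patch. With `max_attempts=1` the function returns `None` and nothing reaches the reviewer.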

4. Enterprise Observability & Service Mesh: Red Hat Kiali Brings AI to OpenShift

Managing microservice topologies, tracing request latencies, and debugging mutual TLS (mTLS) configurations across thousands of Kubernetes pods is an SRE's heaviest cognitive load. Red Hat has entered Tech Preview with its MCP Server for Red Hat OpenShift, shipping a deep integration with the Kiali service mesh toolset.

The Capability: By upgrading Kiali to v2.25+, platform teams can connect their cluster context directly to AI assistants via tools like OpenShift Lightspeed. The integration exposes specialized tools like traffic_graph and mesh_status.
The SRE Use Case: An operator can ask, "Why is the checkout service degrading in the production namespace?" The agent utilizes the Kiali tools to visualize service-to-service dependencies, isolates a specific network hop causing latency, pulls distributed traces via ossm_list_traces, and generates the precise Istio traffic-routing patches needed to remediate the failure in real time. All of this runs inside standard Kubernetes RBAC constraints with strict audit log tracking.

5. Community & Digital Experience: Higher Logic Vanilla Connects the Feedback Loop

Customer community platforms are often isolated from the rest of the engineering and product lifecycle. Higher Logic Vanilla has closed this loop by shipping its native MCP server integration, exposing community knowledge bases, forum threads, and user sentiment analytics to the broader enterprise AI context.

The Capability: Support, product, and engineering agents can query user forums directly from their native operational workspaces. By allowing an LLM to index community feedback side-by-side with internal task tracking (like Jira or GitHub Issues), product teams can autonomously categorize bug reports, track common friction points, and surface localized feature requests without running manual scraping scripts.

The Architectural Trend: The API Is for the Agent

This cross-industry rollout points to a major architectural shift: the standard JSON/REST API is increasingly hidden behind the protocol, with the agent rather than the developer as its direct consumer.

When an advertising platform, a layer-2 blockchain, an application security engine, a Kubernetes service mesh, and an enterprise forum provider all adopt the exact same interface standard, the engineering landscape changes fundamentally. Developers are no longer writing custom integration wrappers. Instead, they are deploying autonomous swarms that can jump from optimizing an ad campaign, to verifying a security patch, to debugging a distributed container mesh—all through a single, unified protocol context layer.
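
To make "the exact same interface" concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and an arguments object. The sketch below builds two such requests. The tool names come from this article; the argument names (`namespace`, `query`) are hypothetical, since each server publishes its own input schema.

```python
import itertools
import json

_ids = itertools.count(1)


def tools_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Same envelope, two unrelated platforms. Argument names are assumptions.
mesh_request = tools_call("mesh_status", {"namespace": "production"})
docs_request = tools_call("search_documentation", {"query": "React button layout"})
```

Only `params.name` and `params.arguments` differ between the two requests, which is why one client can talk to every server in the list above.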

Gated Frontiers: Inside OpenAI’s Rosalind Biodefense Initiative and the Shift Toward Controlled AI Distribution

Om Shree — Tue, 02 Jun 2026 14:39:44 +0000

When deploying frontier AI, the standard tech playbook typically favors raw scale and rapid, democratic distribution. However, when a model’s core competency shifts from writing copy to reasoning deeply about proteins, genomes, and cellular mechanisms, the traditional open-access model breaks down entirely. Dual-use biology—where the exact same insights can either synthesize a vaccine or optimize a pathogen—requires a completely different structural approach.

Addressing this reality, OpenAI has launched the Rosalind Biodefense Program. Built as an institutional access layer around GPT-Rosalind (OpenAI’s highly specialized, domain-frontier reasoning model for the life sciences), this initiative bypasses the public API entirely. Instead, it establishes a subsidized, heavily audited framework that embeds advanced AI directly into global public health and national security infrastructure.

For software engineers, biosecurity developers, and research architects, this launch marks the arrival of a new paradigm: Defensive Acceleration via Closed-Loop Infrastructure.

1. The Core Architecture: GPT-Rosalind’s Specialized Capabilities

Unlike standard large language models, GPT-Rosalind is built for long-horizon scientific reasoning. Rather than treating molecular biology as a raw text tokenization problem, its underlying weights are deeply optimized to reason about sequences, structure predictive biochemical hypotheses, and coordinate complex wet-lab experimental workflows.

                 ┌────────────────────────────────┐
                 │       OpenAI GPT-Rosalind      │
                 └──────────────┬─────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         ▼                      ▼                      ▼
┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│  Epidemiological │   │     Sequence     │   │     Codex Lab    │
│    Surveillance  │   │  Threat Screening│   │   Plugin Layer   │
└──────────────────┘   └──────────────────┘   └──────────────────┘

The system integrates directly with scientific tooling through a dedicated Codex plugin layer, enabling it to function as a software companion for automated assay designs, data harmonization, and real-time threat identification.

2. The Institutional Grid: LLNL, Johns Hopkins APL, and CEPI

To validate the model's utility without expanding the biological threat surface, OpenAI is deploying the framework through a carefully curated network of elite federal, academic, and global health partners.

🔬 Lawrence Livermore National Laboratory (LLNL)

At LLNL—one of the U.S. Department of Energy’s primary national security laboratories—researchers are integrating GPT-Rosalind with advanced physics and molecular simulation engines. The objective is to dramatically accelerate countermeasure discovery: compressing the months-long workflow of interpreting complex experimental data, isolating viable therapeutic candidates, and simulating interaction dynamics down to a matter of days.

🧬 Johns Hopkins Applied Physics Laboratory (APL)

Johns Hopkins APL is deploying the model within its high-throughput protein-engineering platforms. By leveraging the model’s unique reasoning loops, the lab aims to rapidly screen mutant enzymes. This allows defense teams to preemptively characterize emerging biothreats and design targeted therapeutic countermeasures before an anomaly ever manifests in a live population.

💉 Coalition for Epidemic Preparedness Innovations (CEPI)

On the global defense plane, CEPI is utilizing GPT-Rosalind to support its flagship 100 Days Mission—a coordinated global initiative to develop and scale viable vaccine candidates within 100 days of a novel pathogen's identification. The model acts as a core accelerant for literature synthesis, protocol design, and structural evaluation.

3. The Deployment Playbook: Gated Access Control as a Core Product Feature

For platform developers, the operational mechanics of the Rosalind Biodefense Program provide a clear blueprint for how frontier AI will likely be deployed in high-consequence, heavily regulated spaces like defense, finance, and critical infrastructure.

OpenAI is implementing a multi-layered security and access architecture:

Sponsored Onboarding, Rigid Vetting: Access is entirely subsidized by OpenAI for trusted developers (including specialized biosecurity startups like Fourth Eon, SecureDNA, and SecureBio) but requires strict, non-public vetting standards and alignment with clear public-benefit goals.
Pre-Deployment Red Teaming: Independent, domain-expert red teams constantly stress-test prompt injection vectors and evaluate model responses for dual-use risk before any operational deployments go live.
Function-Based Sandbox Isolation: Approved applications run in specialized, isolated sandboxes. For instance, when developers use the tool for automated DNA synthesis screening, the model analyzes sequences and generates threat assessments within a perimeter that strictly limits direct, unmonitored molecule or pathogen generation.
Continuous Revocation Capabilities: OpenAI maintains a centralized kill-switch. If an endpoint exhibits anomalous telemetry or behavior indicative of an adversarial data-extraction attempt, access can be revoked globally and instantly.
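
Taken together, the vetting, function-scoped sandboxing, and revocation layers above behave like an allowlist with a kill switch. The following is a minimal sketch of that idea only. The `AccessGate` class, the organization names, and the function labels are hypothetical and say nothing about how OpenAI actually implements the program.

```python
from dataclasses import dataclass, field


@dataclass
class AccessGate:
    """Hypothetical gate: vetted allowlist, per-function scope, global revocation."""
    vetted: dict = field(default_factory=dict)   # org -> set of permitted functions
    revoked: set = field(default_factory=set)

    def approve(self, org: str, functions: set) -> None:
        self.vetted[org] = set(functions)

    def revoke(self, org: str) -> None:
        # Kill switch: takes effect on the very next authorization check.
        self.revoked.add(org)

    def authorize(self, org: str, function: str) -> bool:
        if org in self.revoked:
            return False
        return function in self.vetted.get(org, set())


gate = AccessGate()
gate.approve("example-screening-lab", {"sequence_screening"})
before = gate.authorize("example-screening-lab", "sequence_screening")
out_of_scope = gate.authorize("example-screening-lab", "sequence_generation")
gate.revoke("example-screening-lab")
after = gate.authorize("example-screening-lab", "sequence_screening")
```

Every request passes through `authorize`, so scope limits and revocation are enforced at the same choke point rather than inside each application.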

The Big Picture: The Bifurcation of Frontier AI

The Rosalind Biodefense initiative confirms that we are moving away from a world where a single, omnibus public API handles every workload from writing marketing emails to designing vaccines.

By separating its consumer-facing models from domain-specific national security engines like GPT-Rosalind, OpenAI is creating a two-tier ecosystem. For builders, this underscores a critical architectural truth: in high-stakes fields, the robustness of your security boundaries, the auditability of your event logs, and your data-vetting workflows are just as vital to your product's success as the underlying raw capabilities of your model.

The IDE is the New Cloud Console: Inside the Azure SRE MCP Server

Om Shree — Tue, 02 Jun 2026 14:33:32 +0000

Microsoft is bridging the gap between cloud governance and local development environments by launching a dedicated Azure SRE Model Context Protocol (MCP) Server.

By bringing Azure’s control plane directly into the IDE and desktop chat interface, developers and site reliability engineers (SREs) can orchestrate complex infrastructure tasks, triage active outages, and audit live environments from tools like VS Code and Claude Desktop without switching to the Azure Portal.

Here is an architectural teardown of how the Azure SRE MCP Server transforms operations into a safe, agentic workflow.

1. Unified Cloud Operations via the IDE Context

Managing modern cloud infrastructure typically forces engineers to juggle multiple windows: an IDE for infrastructure-as-code (IaC), Azure Portal for log monitoring, and communication channels like PagerDuty or Slack for incident handling.

The Azure SRE MCP Server (@azure/mcp-server-sre) eliminates this fragmentation by wrapping the Azure Resource Manager (ARM) API and Azure Monitor into a suite of standard protocol tools.

┌────────────────────────────────────────────────────────┐
│               Azure SRE MCP Server Layer               │
└──────────────────────────┬─────────────────────────────┘
                           │
      ┌────────────────────┼────────────────────┐
      ▼                    ▼                    ▼
[Incident Triage]    [Safe Provisioning]   [Architecture Audit]
 Log Analytics &      Incremental Bicep     Live Topologies &
 Metric Tracking       Dry-runs & Apply     Compliance Scans

2. Deep Dive: Core Operational Capabilities

The server exposes specialized tools designed to handle telemetry ingestion, infrastructure mutations, and systemic architecture analysis safely.

🚨 Autonomous Incident Triage

When a critical alert triggers, an AI assistant connected to the Azure SRE server can instantly ingest the context and execute localized diagnosis:

Log Ingestion: It pulls from Azure Log Analytics tables using native Kusto Query Language (KQL) parsing to isolate specific exception stack traces.
Telemetry Analysis: The agent can query Azure Monitor Metrics to correlate the timing of the spike with recent deployment events.
Example Query: "Analyze the last 15 minutes of logs for the prod-auth-app App Service, find the source of the 5xx errors, and check if any traffic routing weights were changed recently."
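
For a sense of what an agent might generate from that prompt, here is a small helper that assembles a KQL query for recent 5xx responses. It is a sketch under assumptions: the table name and the column names (`TimeGenerated`, `ScStatus`, `CsUriStem`, `_ResourceId`) follow the App Service HTTP log schema as I understand it and should be checked against your own workspace, and the helper itself is hypothetical rather than part of the server.

```python
import re


def build_5xx_query(app_name: str, minutes: int = 15,
                    table: str = "AppServiceHTTPLogs") -> str:
    """Build a KQL query counting recent 5xx responses per URL path.

    Table and column names are assumptions; adjust them to the schema
    your Log Analytics workspace actually exposes.
    """
    if not re.fullmatch(r"[A-Za-z0-9-]+", app_name):
        raise ValueError("unexpected characters in app name")
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return "\n".join([
        table,
        f"| where TimeGenerated > ago({minutes}m)",
        f'| where _ResourceId endswith "{app_name}"',
        "| where ScStatus >= 500",
        "| summarize errors = count() by CsUriStem, ScStatus",
        "| order by errors desc",
    ])


query = build_5xx_query("prod-auth-app")
```

Validating the app name before interpolating it keeps a prompt-supplied string from altering the structure of the query.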

🛠️ Safe Infrastructure Provisioning

Instead of blindly writing and pushing untested infrastructure changes to a CI/CD pipeline, the MCP server allows for safe, inline workspace testing.

Bicep/ARM Pre-flight Validations: An agent can draft an infrastructure modification (e.g., adding a geo-replicated read replica to an Azure Cosmos DB instance), generate the required Bicep files, and execute an Azure What-If operation to preview exactly which resources would be created, modified, or deleted before anything is applied.
Controlled Execution: Under human-in-the-loop authorization, the tool can deploy micro-resources directly to sandbox or staging environments for instant feedback.
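
The human-in-the-loop step can be pictured as a gate over the what-if result. The sketch below uses the change-type names that Azure What-If reports (`Create`, `Modify`, `Delete`, `NoChange`), but the input shape, a list of dicts with `resource` and `changeType` keys, is a simplified assumption and not the server's actual output format.

```python
from collections import Counter

# Change types that should never be applied without explicit sign-off.
RISKY = {"Delete", "Modify"}


def review_what_if(changes: list) -> dict:
    """Summarize a what-if result and decide whether approval is required."""
    counts = Counter(change["changeType"] for change in changes)
    risky = sorted(c["resource"] for c in changes if c["changeType"] in RISKY)
    return {"counts": dict(counts), "needs_approval": bool(risky), "risky": risky}


# Hypothetical preview for the Cosmos DB read-replica example.
preview = [
    {"resource": "cosmos/orders-db", "changeType": "Modify"},
    {"resource": "cosmos/orders-db/replica-westeurope", "changeType": "Create"},
    {"resource": "vnet/core", "changeType": "NoChange"},
]
summary = review_what_if(preview)
```

In this example the `Modify` on the existing database flags the plan for approval, while the new replica and the untouched network would pass on their own.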

📐 Structural Architecture Auditing

For onboarding developers or cloud architects, understanding a massive legacy deployment is incredibly difficult. The server allows agents to map the infrastructure out programmatically:

Topology Discoverability: It can query Azure Resource Graph to list resource groups, trace internal network security group (NSG) rules, and flag orphaned disks.
Security & Cost Optimizations: The server taps into Azure Advisor recommendations, allowing an engineer to ask: "Scan our active Kubernetes clusters (AKS) for public IP exposures and list any compute nodes running under 5% utilization."

3. Production Hardening: Security & Governance

Giving an AI assistant access to a cloud platform requires strict architectural guardrails. Microsoft has built the Azure SRE MCP Server to inherit enterprise-grade security models implicitly:

Strict Identity Pass-through: The MCP server does not rely on static connection strings or universal administrative master keys. It inherits the credentials of the local machine's active Azure CLI (az) session. If a developer does not have write permissions to a production subscription, their AI assistant cannot mutate it.
Granular RBAC Mapping: SRE teams can enforce precise Role-Based Access Control (RBAC). For example, a developer's local agent can be restricted to the Monitoring Reader and Reader roles, completely stripping its capability to perform destructive actions while preserving diagnostic access.
Audit Trail Integration: Because every protocol call translates into authenticated ARM API requests underneath, every single tool execution, query, or configuration shift is comprehensively logged in Azure Activity Logs for compliance auditing.
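
The effect of restricting an agent to read-only roles can be shown with a toy permission check. This is a deliberately simplified model: Azure role definitions do use wildcard action patterns such as `*/read`, but real built-in roles carry more actions and exclusions than the two entries below, and the `is_allowed` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Simplified role definitions for illustration only.
ROLE_ACTIONS = {
    "Reader": ["*/read"],
    "Contributor": ["*"],
}


def is_allowed(roles: list, operation: str) -> bool:
    """True if any assigned role has an action pattern matching the operation."""
    return any(
        fnmatchcase(operation, pattern)
        for role in roles
        for pattern in ROLE_ACTIONS.get(role, [])
    )


can_read = is_allowed(["Reader"], "Microsoft.Compute/virtualMachines/read")
can_delete = is_allowed(["Reader"], "Microsoft.Compute/virtualMachines/delete")
```

Because the check runs against the identity's roles and not against the agent's intent, a read-only assignment blocks destructive calls no matter how the prompt is phrased.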

Getting Started: Integrating into Claude Desktop

To run the server locally, you can initialize it using the Node package runner (npx). Ensure you are authenticated via the Azure CLI (az login) first.

Add the snippet below to your local claude_desktop_config.json file:

{
  "mcpServers": {
    "azure-sre-ops": {
      "command": "npx",
      "args": [
        "-y",
        "@azure/mcp-server-sre"
      ],
      "env": {
        "AZURE_TENANT_ID": "your-tenant-id-here",
        "AZURE_DEFAULT_SUBSCRIPTION_ID": "your-subscription-id-here"
      }
    }
  }
}
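
If the config file already lists other servers, pasting the snippet by hand risks overwriting them. A small helper like the one below merges a single entry instead. It is a sketch: `add_mcp_server` is a hypothetical name, and the config path is left as a parameter because its location differs by operating system.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one entry under `mcpServers`, preserving everything else."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# The same entry as the snippet above, expressed as a Python dict.
AZURE_SRE_ENTRY = {
    "command": "npx",
    "args": ["-y", "@azure/mcp-server-sre"],
    "env": {
        "AZURE_TENANT_ID": "your-tenant-id-here",
        "AZURE_DEFAULT_SUBSCRIPTION_ID": "your-subscription-id-here",
    },
}
```

Calling `add_mcp_server(path_to_config, "azure-sre-ops", AZURE_SRE_ENTRY)` leaves any existing servers and unrelated settings untouched.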

By turning the cloud console into a conversational, programmable context layer, Microsoft is making cloud infrastructure easier to manage. Complex debugging tasks that used to require clicking through multiple portal dashboards can now be performed instantly with a simple, direct prompt in your workspace.

The Kubernetes Native Layer for AI: Google Open-Sources Agent eXecutor (AX)

Om Shree — Tue, 02 Jun 2026 14:28:17 +0000

The AI ecosystem is rapidly shifting from ephemeral, single-turn chatbots to autonomous, distributed software agents that execute complex operations over hours, days, or weeks. For site reliability engineers (SREs) and platform architects, this shift introduces massive challenges: state drift, network dropouts, untrusted code execution, and unmanageable infrastructure costs.

To bridge this production readiness gap, Google has open-sourced Agent eXecutor (AX) under the Apache 2.0 license. Written in Go, AX is a Kubernetes-native, distributed runtime standard built specifically to schedule, isolate, persist, and scale long-running agentic workloads across enterprise data planes.

Here is a deep dive into the architecture of AX and why it represents the infrastructure blueprint for production-grade AI.

1. The Core Architecture: Durable Execution and Resumption

Existing orchestration frameworks excel at prototyping agent logic but often fail under real-world infrastructure failures. If a container restarts or a network timeout occurs mid-task, the agent state is lost.

AX treats agents as stateful, resilient microservices. It provides out-of-the-box durability through two architectural pillars:

                  ┌──────────────────────────────┐
                  │          AX Router           │
                  └──────────────┬───────────────┘
                                 │ (Resumable Streams)
                                 ▼
                  ┌──────────────────────────────┐
                  │        AX Controller         │
                  │  (Single-Writer, Event Log)  │
                  └──────────────┬───────────────┘
         ┌───────────────────────┼───────────────────────┐
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Isolated Worker │     │ Isolated Worker │     │   Native MCP    │
│     (Agent)     │     │     (Skill)     │     │     Server      │
└─────────────────┘     └─────────────────┘     └─────────────────┘

The Event Log & Snapshotting

AX intercepts all context modifications, tool calls, and LLM completions, committing them to a high-throughput durable event log managed by a Single-Writer architecture. If an agent crashes or is descheduled by Kubernetes, a new worker spins up, replays the event log, and resumes execution seamlessly without repeating expensive LLM calls or duplicating external API mutations.
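
The replay behavior is easier to see in miniature. The sketch below is not AX's API. It is a generic, in-memory illustration of durable execution, and `EventLog`, `run_step`, and the toy workflow are all hypothetical names. Each completed step is recorded, and a restarted worker reads recorded results back without re-running the step.

```python
from typing import Any, Callable


class EventLog:
    """Append-only log of completed steps, keyed by step id (in-memory stand-in)."""

    def __init__(self) -> None:
        self.events: list = []      # ordered (step_id, result) pairs
        self._results: dict = {}

    def run_step(self, step_id: str, action: Callable[[], Any]) -> Any:
        # On replay, a recorded step returns its stored result instead of
        # calling the model or mutating an external API a second time.
        if step_id in self._results:
            return self._results[step_id]
        result = action()
        self.events.append((step_id, result))
        self._results[step_id] = result
        return result


calls = []


def expensive(name: str) -> str:
    calls.append(name)              # records every real execution
    return f"{name}-done"


def workflow(log: EventLog, crash_after_plan: bool = False) -> str:
    plan = log.run_step("plan", lambda: expensive("plan"))
    if crash_after_plan:
        raise RuntimeError("worker descheduled")
    return log.run_step("apply", lambda: expensive(plan))


log = EventLog()
try:
    workflow(log, crash_after_plan=True)    # first worker dies mid-task
except RuntimeError:
    pass
result = workflow(log)                       # replacement worker replays the log
```

After both runs, `calls` contains one entry for `plan` and one for the apply step: the second worker reused the logged plan and paid only for the work that had not yet completed.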

Connection Recovery & Resumable Streams

When building long-running workflows, client-to-agent disconnects are guaranteed to happen. AX routes client communications via resumable streams. If a network boundary drops, the client simply reconnects to the AX Controller, which automatically backfills all events missed during the outage window.
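
Backfill after a reconnect usually comes down to a client-side cursor. The following sketch shows the general pattern with a hypothetical in-memory `ResumableStream`. It makes no claim about AX's wire format, only that each event carries an offset and a reconnecting client asks for everything after the last offset it saw.

```python
class ResumableStream:
    """In-memory stand-in: every event gets a monotonically increasing offset."""

    def __init__(self) -> None:
        self._events: list = []

    def publish(self, payload: str) -> int:
        self._events.append(payload)
        return len(self._events)    # offset of the event just written (1-based)

    def read_from(self, last_seen: int) -> list:
        # Return (offset, payload) pairs strictly after the client's cursor.
        return list(enumerate(self._events[last_seen:], start=last_seen + 1))


stream = ResumableStream()
cursor = stream.publish("tool_call:search")   # client sees offset 1, then drops
stream.publish("tool_result:3 hits")
stream.publish("completion:summary")
missed = stream.read_from(cursor)             # client reconnects with its cursor
```

`missed` contains offsets 2 and 3 only, so the client catches up without receiving the first event twice.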

2. Native Model Context Protocol (MCP) Support

Instead of forcing developers into a proprietary ecosystem, Google has built AX with native support for the Model Context Protocol (MCP).

AX treats MCP servers as dynamically discoverable, sandboxed actors. The central AX Controller abstracts the operational complexities of managing multi-tenant tool lifecycles. When an agent requests a tool call, the AX Controller looks the tool up in the registry, dispatches the protocol-compliant call over secure channels, and records the interaction within the central audit log.

This decoupling ensures absolute portability: any standard enterprise database, file system, or internal API exposed via an MCP server can instantly serve as an operational tool inside an AX runtime environment.

3. Kubernetes Native Scaling via Agent Substrate

Standard Kubernetes deployments are highly optimized for thousands of static, long-running REST APIs or gRPC services. However, an enterprise agent workflow can generate millions of short-lived, bursty, sub-second tool calls that can quickly overwhelm a standard k8s control plane.

To handle this architectural strain, Google paired AX with Agent Substrate, a complementary open-source control plane layer for Kubernetes designed for ultra-scale agent infrastructure density.

| Feature | Standard Kubernetes (K8s) | Kubernetes with AX & Agent Substrate |
| --- | --- | --- |
| Control Plane Target | Thousands of long-running services | Millions of highly active agent sessions |
| Idle Capacity Management | Pods remain warm, drawing continuous compute resources | Pod Snapshots suspend idle workloads to cold state |
| Scaling Architecture | Standard HPA (Minutes/Seconds) | Fast allocation (300 sandboxes/sec at <200ms latency) |
| Workload Isolation | Shared node kernel boundaries | Strict sandboxing via gVisor / Kata Containers |

By leveraging Pod Snapshots, Agent Substrate allows AX to completely freeze an agent's memory state and CPU context when it pauses for human feedback or goes idle. The resource footprint drops to near zero, freeing up cluster compute. As soon as a callback or event triggers the agent, it resumes from standby with sub-second initialization times.

4. Advanced Debugging: Trajectory Branching

Debugging a failed state deep within a non-deterministic agentic loop is notoriously difficult. To address this, AX exposes a debugging primitive called Trajectory Branching.

Because AX explicitly tracks and registers every execution step in its event log, developers can branch an agentic execution path from any historical checkpoint. If an agent hits a logic exception at step 45 of an operation, you can spin up an alternative trajectory branch from step 44, hot-patch the agent's prompts or underlying code, and re-run the transaction from that exact snapshot without re-executing steps 1 through 43.
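
Here is a toy version of that idea, scaled down to three steps. It is a generic sketch and not AX's interface: `run_from` and `branch` are hypothetical helpers, and a checkpoint here means the state captured just before a step runs, so branching at a step re-executes that step and everything after it.

```python
from typing import Callable


def run_from(steps: list, start: int, state: int, checkpoints: dict) -> int:
    """Execute steps[start:], saving the input state of each step as a checkpoint."""
    for index in range(start, len(steps)):
        checkpoints[index] = state          # snapshot taken before the step runs
        state = steps[index](state)
    return state


def branch(steps: list, checkpoints: dict, at: int, patched_step: Callable) -> tuple:
    """Fork from checkpoint `at`, swap in a patched step, and re-run from there."""
    forked_steps = steps[:at] + [patched_step] + steps[at + 1:]
    forked_checkpoints = {i: s for i, s in checkpoints.items() if i < at}
    result = run_from(forked_steps, at, checkpoints[at], forked_checkpoints)
    return result, forked_checkpoints


executed = []


def step(name: str, fn: Callable[[int], int]) -> Callable[[int], int]:
    def wrapped(state: int) -> int:
        executed.append(name)               # records which steps actually ran
        return fn(state)
    return wrapped


steps = [
    step("load", lambda s: s + 10),
    step("scale", lambda s: s * 0),         # the bug: wipes out the state
    step("report", lambda s: s + 1),
]
checkpoints = {}
original = run_from(steps, 0, 0, checkpoints)
executed.clear()
fixed, _ = branch(steps, checkpoints, at=1,
                  patched_step=step("scale-v2", lambda s: s * 2))
```

The original run ends at 1 because of the faulty `scale` step. The branch starts from the saved input of step 1, runs only `scale-v2` and `report`, and ends at 21 without touching `load` again. The original checkpoints stay intact, so several fixes can be tried from the same point.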

Getting Started

Because AX is runtime-agnostic, you can build your agents using your preferred framework (LangGraph, AutoGen, or custom Go/Python codebases) and hand execution management off to the AX runtime.

The AX CLI is written in Go and can be installed directly from the public GitHub repository:

go install github.com/google/ax/cmd/ax@latest
ax --help

For platform engineers looking to transition from brittle prototype scripts to highly stable, multi-tenant AI operations, AX delivers the necessary orchestration, security boundaries, and enterprise governance directly to your own Kubernetes data plane.

The API is the Agent: How the New Google Pay MCP Server and Android Express Checkout Automate the Transaction Layer

Om Shree — Tue, 02 Jun 2026 14:24:09 +0000

For software engineers and platform architects, the "transaction bottleneck" has long been a source of significant friction. Building payments infrastructure requires balancing rigid security protocols, dynamic cart calculations, and real-time validation across siloed environments.

Google is addressing this complexity directly from two distinct angles: the Google Pay & Wallet Developer MCP Server for development environments, and native Express Checkout with Dynamic Callbacks for Android applications.

This combination marks a significant step forward: it brings payment infrastructure closer to the AI context and transitions mobile checkouts toward highly dynamic, zero-friction workflows.

1. The Google Pay & Wallet Developer MCP Server: Inside the IDE Context

Historically, troubleshooting a failing payment token or updating a merchant config meant constantly context-switching between your IDE, the Google Pay Console, and open browser tabs of dense API documentation.

By deploying a dedicated Model Context Protocol (MCP) server (https://paydeveloper.googleapis.com/mcp), Google has turned its payment platform into an AI-readable layer. When connected to an MCP-compatible environment (such as Cursor, VS Code, or Claude Code), an AI assistant gains secure, real-time access to the integration environment.

The platform exposes several specialized tools to streamline these workflows:

┌────────────────────────────────────────────────────────┐
│             Google Pay & Wallet MCP Server             │
└──────────────────────────┬─────────────────────────────┘
                           │
      ┌────────────────────┼────────────────────┐
      ▼                    ▼                    ▼
[search_documentation] [manage_integrations] [Performance Metrics]
  RAG-powered live       Live account status    Real-time error
   docs & examples        and configuration      tracking & trends

Key Tool Capabilities:

search_documentation: Rather than relying on static model training data, this tool uses Retrieval-Augmented Generation (RAG) to fetch up-to-date documentation, localized error-handling strategies, and direct code samples (e.g., configuring a React button layout).
manage_integrations: AI agents can directly query integration status, retrieve merchant identifiers, list Google Wallet pass classes, or register entirely new merchant integrations without requiring manual navigation through the developer console.
Performance Monitoring: The server allows agents to pull down live integration health metrics, aggregate common error codes, and surface recent failure trends directly into your terminal or chat panel.

Security Guardrail: The server uses OAuth 2.0 via Google Cloud IAM rather than static API keys. Furthermore, it does not process live transactions or access raw credit card numbers; it serves exclusively as a development, configuration, and diagnostics inspector.

2. Android Gets a True One-Click "Express Checkout"

On the consumer-facing side, mobile apps often face high cart abandonment rates due to clunky, multi-step checkout sequences. To solve this, Google has expanded its Express Checkout framework with native Dynamic Callbacks for Android, bringing the mobile platform to functional parity with web capabilities.

Previously, changing a shipping address required the user to exit the Google Pay sheet, wait for the app to recalculate shipping and taxes, and reopen the payment flow. Now, the entire interaction happens asynchronously inside the sheet itself.

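import com.google.android.gms.wallet.PaymentData
import com.google.android.gms.wallet.callback.BasePaymentDataCallbacks
import com.google.android.gms.wallet.callback.IntermediatePaymentData
import com.google.android.gms.wallet.callback.OnCompleteListener
import com.google.android.gms.wallet.callback.PaymentAuthorizationResult
import com.google.android.gms.wallet.callback.PaymentDataRequestUpdate
import org.json.JSONObject

class MerchantPaymentDataCallbacks : BasePaymentDataCallbacks() {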

    override fun onPaymentDataChanged(
        request: IntermediatePaymentData,
        onCompleteListener: OnCompleteListener<PaymentDataRequestUpdate>
    ) {
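        // The address the user just selected in the sheet; send it to your
        // backend to calculate shipping options and taxes.
        val shippingAddress = request.shippingAddress

        // Placeholder response: in production, build this from the backend's
        // result for shippingAddress instead of a hardcoded total.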
        val responseJson = JSONObject().apply {
            put("newTransactionInfo", JSONObject().apply {
                put("totalPriceStatus", "FINAL")
                put("totalPrice", "12.34") // Dynamically adjusted price
                put("currencyCode", "USD")
            })
        }

        val response = PaymentDataRequestUpdate.fromJson(responseJson.toString())
        onCompleteListener.complete(response)
    }

    override fun onPaymentAuthorized(
        request: PaymentData,
        onCompleteListener: OnCompleteListener<PaymentAuthorizationResult>
    ) {
        // Securely pass payment token to processing backend
        val responseJson = JSONObject().apply {
            put("transactionState", "SUCCESS")
        }
        val response = PaymentAuthorizationResult.fromJson(responseJson.toString())
        onCompleteListener.complete(response)
    }
}

The Architectural Benefits of Dynamic Callbacks:

Moving Checkout Upstream: By utilizing BasePaymentDataCallbacks, you can safely position the Google Pay button directly on Product Detail Pages (PDPs) or quick-view carts.
In-Sheet Recalculations: When a user selects or switches a saved shipping address within the sheet, onPaymentDataChanged triggers immediately. Your backend can update taxes, validate shipping regions, and push new final pricing back to the UI in real time.
Graceful Authorization Handling: onPaymentAuthorized manages token submission directly. If a card fails or a fraud check triggers, error state handling occurs natively inside the sheet, allowing the user to select an alternative payment method without closing the checkout funnel.
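
On the server side, the in-sheet recalculation amounts to turning an address into a new total. The sketch below shows one way a backend could produce the same `newTransactionInfo` payload the Kotlin callback returns. The rate tables, the `payment_data_update` helper, and the flat per-country pricing are all hypothetical; a real backend would use its own shipping and tax services.

```python
import json
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rate tables, with all prices kept in the store currency (USD).
SHIPPING = {"US": Decimal("4.99"), "GB": Decimal("9.99")}
TAX_RATE = {"US": Decimal("0.08"), "GB": Decimal("0.20")}


def payment_data_update(subtotal: str, country_code: str) -> str:
    """Build the JSON body for a price update after an address change."""
    if country_code not in SHIPPING:
        raise ValueError(f"cannot ship to {country_code}")
    base = Decimal(subtotal)
    tax = (base * TAX_RATE[country_code]).quantize(Decimal("0.01"), ROUND_HALF_UP)
    total = base + SHIPPING[country_code] + tax
    return json.dumps({
        "newTransactionInfo": {
            "totalPriceStatus": "FINAL",
            "totalPrice": f"{total:.2f}",
            "currencyCode": "USD",
        }
    })


update = json.loads(payment_data_update("20.00", "US"))
```

For a 20.00 subtotal shipped to the US, the sketch adds 4.99 shipping and 1.60 tax for a total of "26.59". `Decimal` is used so that currency arithmetic never picks up binary floating-point rounding errors.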

The Big Picture: Programmable Commerce

These updates point to a broader architectural trend: the automation of the checkout layer.

By standardizing payments through open interface patterns like the Model Context Protocol, Google is laying the groundwork for a transition from human-driven UIs to agentic workflows. Developers can use AI agents to securely deploy and monitor infrastructure, while those same systems rely on standardized browser and OS hooks (like Express Checkout) to safely execute consumer actions with minimal friction.

Beyond the Hype: Claude Opus 4.8, Parallel Subagents, and the Reality of 750K-Line Codebase Migrations

Om Shree — Tue, 02 Jun 2026 13:21:57 +0000

When a model update drops, the tech community usually braces for another round of synthetic benchmark optimizations. But the launch of Claude Opus 4.8 represents a fundamental architectural pivot. Anthropic isn't just shipping smarter weights; they are changing how those weights interact with complex, distributed systems over long horizons.

For engineering teams managing heavy technical debt or scaling agentic pipelines, three updates in this release demand close attention: the debut of native Dynamic Workflows, an aggressive focus on code honesty, and a massive real-world validation—the migration of a 750,000-line Zig repository to Rust in just 11 days.

Here is a technical teardown of what is happening under the hood.

1. Dynamic Workflows: Orchestrating the Subagent Swarm

Until now, using AI for large-scale code refactoring meant dealing with context window degradation or manually stitching together complex LangGraph/CrewAI loops.

With Opus 4.8, Anthropic introduced Dynamic Workflows within Claude Code. Instead of treating a massive task as a single, sequential prompt, Opus 4.8 operates as a centralized orchestrator.

                [Opus 4.8 Orchestrator]
            (Plans, Assigns, & Verifies)
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
   [Subagent 1]    [Subagent 2]    [Subagent N]
   (Module A)      (Module B)      (Module N)
         │               │               │
         └───────────────┼───────────────┘
                         ▼
             [Automated Test Verification]
                         │
                         ▼
             [Final Codebase Merge]

Parallel Subagent Swarms: When given a codebase-scale objective, the orchestrator maps out the dependency tree and spins up hundreds of parallel subagents within a single session. Each subagent isolates a specific module, microservice, or file.
Autonomous Verification Loops: Subagents do not simply dump raw code into git. They iteratively edit, run local compilers, parse error logs, and rewrite code until their specific module passes the existing test suite before checking back in with the orchestrator.
Long-Horizon Stamina: Backed by an adaptive thinking architecture and an enhanced 1M-token context window, these parallel loops can run completely unattended for hours, executing multi-stage projects without losing track of overarching architecture patterns.
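The fan-out and verification loop described above can be sketched in a few lines. This is a minimal illustration of the pattern, not Anthropic's implementation: `run_subagent`, `orchestrate`, `fake_fix`, and `fake_tests` are hypothetical names, and the model call and test runner are replaced with toy stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ModuleResult:
    module: str
    attempts: int
    passed: bool


def run_subagent(module, attempt_fix, run_tests, max_attempts=5):
    """Edit-test loop for one module: revise until tests pass or the budget runs out."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = attempt_fix(module, feedback)  # stand-in for a model call
        passed, feedback = run_tests(module, candidate)  # stand-in for compile + test
        if passed:
            return ModuleResult(module, attempt, True)
    return ModuleResult(module, max_attempts, False)


def orchestrate(modules, attempt_fix, run_tests, max_workers=8):
    """Fan modules out to parallel workers; allow a merge only if every module verified."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda m: run_subagent(m, attempt_fix, run_tests), modules))
    return results, all(r.passed for r in results)


# Toy stand-ins: the first attempt fails, any attempt that saw feedback succeeds.
def fake_fix(module, feedback):
    return "fixed" if feedback else "broken"


def fake_tests(module, candidate):
    if candidate == "fixed":
        return True, None
    return False, f"{module}: 1 test failed"


results, ready_to_merge = orchestrate(["module_a", "module_b", "module_c"], fake_fix, fake_tests)
print(results, ready_to_merge)
```

The design point is that each worker owns its own retry budget and reports a pass/fail result, so the orchestrator only has to decide whether the whole set is safe to merge.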

2. Structural Calibration: 4x Better at Catching Code Flaws

The most dangerous trait of an LLM isn't ignorance; it is confident hallucination. In software engineering, an agent that silently pushes a subtle memory leak or race condition to production is a liability.

Anthropic targeted this head-on with an emphasis on self-calibration and code honesty.

According to internal system card evaluations, Claude Opus 4.8 is 4x less likely than Opus 4.7 to let a flaw in its own code pass unremarked.

If the model is uncertain about a complex typing constraint, a multi-service interaction, or a breaking change, it pushes back. Instead of dressing up incomplete or broken logic as finished work, Opus 4.8 flags its uncertainty, requests clarification, or spins up an alternative subagent to test a different hypothesis. For senior developers tasked with reviewing AI-generated PRs, this drastically reduces cognitive load and narrows the code review bottleneck.

3. Case Study: 750K Lines of Zig to Rust in 11 Days

To prove the production readiness of this framework, Anthropic put the Opus 4.8 dynamic workflow to the ultimate stress test: migrating a high-performance 750,000-line Zig codebase over to idiomatic Rust.

Migrating between these two languages is notoriously difficult. While both are systems languages targeting bare-metal performance without a garbage collector, their mental models diverge sharply:

Zig relies on explicit memory allocator passing, compile-time code execution (comptime), and manual safety patterns.
Rust enforces memory safety through compile-time borrow checking and lifetime annotations, and models data with algebraic data types.

Translating comptime logic into equivalent Rust generics, traits, or procedural macros requires a deep semantic understanding of the system's intent—not just token-to-token translation.

The Execution Metrics:

Scale: ~750,000 lines of code.
Time to Completion: 11 days of asynchronous, autonomous compute.
The Bar: 99.8% of the comprehensive integration and unit test suites passed on the first unified merge.

The subagent swarm divided the repository by service boundaries. When the Rust compiler predictably rejected code due to lifetime mismatches or borrow checker violations, the subagents didn't halt. They analyzed the compiler diagnostics, re-traced the ownership graph, adjusted the code, and re-compiled until the modules compiled cleanly.

The Architectural Shift

For technical leaders, the combination of Opus 4.8 and Dynamic Workflows signals a shift in software maintenance.

Large-scale refactoring, legacy migrations (e.g., COBOL to Java, or upgrades off deprecated internal SDKs), and security patch deployments across hundreds of microservices are transitioning from multi-month engineering grinds to orchestrated, high-autonomy pipeline tasks.

We are moving past the era of the AI autocomplete widget. The new baseline is an autonomous engineering swarm that knows its limits, verifies its logic, and successfully handles the heavy lifting.

The Ten-Gigawatt Moat: Unpacking Anthropic’s $65B Series H, Its $965B Valuation, and the New AI Infrastructure Reality

Om Shree — Tue, 02 Jun 2026 13:14:00 +0000

The frontier AI landscape just witnessed an unprecedented consolidation of capital and power.

Anthropic has closed a $65 billion Series H funding round at a $965 billion post-money valuation. Led by Altimeter, Dragoneer, Greenoaks, and Sequoia, the round pushes Anthropic ahead of OpenAI in private market valuation. Fueling that valuation is a commercial surge: Anthropic’s annualized revenue run-rate has crossed $47 billion, driven largely by enterprise adoption and developer reliance on tools like Claude Code.

But for developers, solutions architects, and engineering leaders, the eye-popping financial figures are secondary. The real story lies in the compute architecture and cloud distribution network embedded within this deal.

Anthropic isn’t just building models anymore; they are securing a multi-cloud, multi-gigawatt infrastructure monopoly.

1. The Multi-Cloud Reality: Claude Everywhere

For enterprise teams assessing dependency risk and data residency, Anthropic’s distribution strategy is a massive win. Claude is now natively live across the "Big Three" hyperscalers: AWS, Google Cloud, and Microsoft Azure.

Rather than locking developers into a single ecosystem, Anthropic has turned Claude into a universal layer. This provides distinct architectural advantages:

Zero-Egress Multi-Cloud Pipelines: You can spin up Claude instances directly inside your existing AWS VPCs, Google Cloud projects, or Azure tenants, drastically reducing latency and security overhead.
Global Compliance & Data Residency: By leveraging the regional footprints of all three hyperscalers, Anthropic is deploying localized inference clusters across Asia and Europe. This is a critical prerequisite for engineering teams building in highly regulated spaces like fintech, healthcare, and government.

2. Breaking Down the 10+ Gigawatt Compute Strategy

Training and running next-generation models requires an astronomical amount of power. Anthropic’s Series H functions as a massive infrastructure cap-ex vehicle, securing unprecedented terrestrial—and orbital—compute capacity.

[Anthropic Compute Footprint]
 ├── AWS (Trainium / Custom Chips) ──> Up to 5 GW Capacity (1 GW by end of 2026)
 ├── Google + Broadcom (TPUs) ───────> 5 GW Capacity (Starting 2027)
 ├── SpaceXAI (NVIDIA GPUs) ─────────> 300 MW (220k+ GPUs at Colossus 1)
 └── Future Horizon ─────────────────> Co-developing Orbital Space Compute

⚡ The Hyperscaler Pacts: 10 GW Committed

Anthropic has locked in a compute agreement with Amazon Web Services for up to 5 GW on AWS Trainium hardware, with nearly 1 GW expected to be active by the end of 2026. Concurrently, a 5 GW agreement with Google and Broadcom is set to bring next-generation TPU capacity online starting in 2027.

🚀 The SpaceX Colossus Deal: Immediate Scale

To meet immediate developer demand and lift strict API rate limits on current models like Claude Opus, Anthropic signed a major agreement with SpaceXAI. This grants Anthropic immediate access to 300 megawatts of capacity at the famous Colossus 1 supercomputer cluster.

The Hardware: Over 220,000 NVIDIA H100, H200, and next-gen GB200 accelerators.
The Developer Impact: If you've noticed your Claude API and Claude Code rate limits doubling, or peak-hour throttles disappearing, this infusion of GPU capacity is why.
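For a quick sanity check on the "10+ gigawatt" framing, the capacity figures quoted in this article can be added up directly. The sketch below uses only the numbers stated above. The per-accelerator figure is a rough upper bound, since 220,000 is given as a floor and the 300 MW presumably covers more than the chips themselves.

```python
# Capacity figures as stated in the article, in megawatts.
commitments_mw = {
    "AWS (Trainium)": 5_000,
    "Google + Broadcom (TPUs)": 5_000,
    "SpaceXAI Colossus 1 (NVIDIA GPUs)": 300,
}

hyperscaler_gw = (
    commitments_mw["AWS (Trainium)"] + commitments_mw["Google + Broadcom (TPUs)"]
) / 1_000
total_gw = sum(commitments_mw.values()) / 1_000

# 300 MW spread across at least 220,000 accelerators, expressed in kilowatts each.
kw_per_accelerator = commitments_mw["SpaceXAI Colossus 1 (NVIDIA GPUs)"] * 1_000 / 220_000

print(f"Hyperscaler pacts: {hyperscaler_gw} GW")
print(f"All stated capacity: {total_gw} GW")
print(f"Colossus 1 budget per accelerator: at most {kw_per_accelerator:.2f} kW")
```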

🌌 The Next Frontier: Orbital Compute

Terrestrial data centers are hitting hard limits on power grids and cooling efficiency. A fascinating addendum to the SpaceX partnership reveals that Anthropic has expressed formal interest in co-developing multi-gigawatt orbital AI compute capacity. By taking advantage of SpaceX's mass-to-orbit economics and continuous solar energy, future iterations of Claude might literally be trained and served from space.

What This Means for Developers and Technical Leaders

By designing an ecosystem that simultaneously thrives on AWS Trainium, Google TPUs, and NVIDIA GPUs, Anthropic has mitigated the severe hardware supply chain bottlenecks that plague other labs.

For engineers building agentic workflows, multi-agent frameworks, or deeply integrated coding pipelines, this news provides structural validation. The massive influx of capital and power ensures that the API endpoints you rely on will remain stable, highly performant, globally compliant, and capable of scaling alongside your enterprise infrastructure.

The AI race is no longer just about who has the best weights—it’s about who commands the gigawatts to run them. Right now, Anthropic is building an unshakeable lead.

I Built an MCP Agent Framework for My B.Tech Major Project. It Got 750+ npm Downloads in Week One. Here's the Comeback Story.

Om Shree — Thu, 28 May 2026 13:40:00 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

Last semester, under the pressure of B.Tech finals and a looming presentation deadline, I shipped a full-stack AI agent system called the Unified MCP Framework. The idea was straightforward: build a single orchestration layer where an AI could interpret natural language commands and route them to the right tool - a filesystem, a browser, a GitHub API - without the developer having to wire each one manually.

The core architecture had three pieces:

A React + Vite frontend for the chat interface and tool trace visualization
A FastAPI backend acting as the AI Orchestrator, powered by Google Gemini
Three specialized tool servers: Filesystem (sandboxed), Browser (Playwright), and GitHub (PyGithub)

It worked. The demo went well. I submitted the PDF, the PPTX, the poster - and then I closed the repo and moved on.

The problem was that "it worked" and "it was usable by anyone else" were two very different things.

Demo

📦 npm package: unified-mcp on npm

💻 GitHub repo: Om-Shree-0709/Major-Project

The system takes a natural language query like:

"Summarize the latest commits in my repo and write a summary file."

...and routes it across the GitHub tool (fetch commits), Filesystem tool (write file), and Gemini (generate summary) - with each tool call visible in the frontend trace panel.
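A stripped-down sketch of that routing-plus-trace idea is below. It is not the framework's actual code: the tool names, the `call_tool` helper, and the hard-coded plan are all hypothetical, and in the real system the model decides which tools to call rather than a fixed function.

```python
trace = []  # every tool call is recorded here, mirroring the frontend trace panel


def fetch_commits(args):
    # Stand-in for the GitHub tool server.
    return ["fix: sandbox path check", "docs: rewrite README"]


def summarize(args):
    # Stand-in for the Gemini call.
    commits = args["commits"]
    return f"{len(commits)} recent commits: " + "; ".join(commits)


def write_file(args):
    # Stand-in for the sandboxed Filesystem tool server.
    return f"wrote {len(args['content'])} chars to {args['path']}"


TOOLS = {
    "github.fetch_commits": fetch_commits,
    "gemini.summarize": summarize,
    "filesystem.write_file": write_file,
}


def call_tool(name, args):
    """Dispatch to a registered tool and append the call to the trace."""
    result = TOOLS[name](args)
    trace.append({"tool": name, "args": args, "result": result})
    return result


def handle_query(query):
    # Fixed plan for illustration only; the query text is not parsed here.
    commits = call_tool("github.fetch_commits", {"repo": "my-repo"})
    summary = call_tool("gemini.summarize", {"commits": commits})
    return call_tool("filesystem.write_file", {"path": "summary.md", "content": summary})


outcome = handle_query("Summarize the latest commits in my repo and write a summary file.")
print(outcome)
print([step["tool"] for step in trace])
```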

The Comeback Story

When I came back to this project after graduation exams, here's what I found:

Before:

No installation path that didn't require reading the source code
README that assumed you already understood what MCP was
No .env.example, so first-time setup always failed
Playwright setup instructions buried halfway through a wall of text
The npm package (unified-mcp) existed but had no usage examples - just a package.json and good intentions
Zero error messages that actually told you what went wrong; just raw Python tracebacks

After:

Rewrote the README top-to-bottom: quick-start first, architecture second
Added a proper .env.example with inline comments
Separated Windows and Unix setup paths - Playwright's async event loop fix is Windows-only; nobody should need to figure that out mid-debug
Added a QUICK_TEST_QUERIES.md and COMPLEX_TEST_QUERIES.md so any developer could validate the system end-to-end in under 5 minutes
Fixed the sandboxed filesystem error handling - instead of a traceback, you now get a clear message: "Access denied: path is outside sandbox directory"
Polished the npm package with real usage documentation, proper exports, and a working install flow
Added a troubleshooting section covering the three most common failure modes: backend offline, Playwright binary missing, env vars not loaded

The week the npm package was properly documented and re-announced: 750+ downloads.

That number matters to me not because it's large - it isn't - but because the package's original launch week, with no usable docs, had yielded single digits. The code hadn't changed. The docs had.

My Experience with GitHub Copilot

I'll be specific about where it actually helped, because "Copilot helped me" is a useless sentence.

Rewriting the README: I had rough notes about what the system did. Copilot autocompleted the setup steps once I established the structure - it understood that after pip install -r requirements.txt comes playwright install chromium, and it kept that sequencing consistent when I reorganized sections. Saved probably 30 minutes of manual tab-matching.

The .env.example file: I typed the first variable with a comment. Copilot generated the remaining four in the same format - correct variable names, sensible placeholder values. That's the kind of tedious-but-error-prone work where it genuinely earns its keep.

The error handling refactor: The original filesystem_server.py had bare except Exception as e: raise e blocks everywhere. I asked Copilot to help me add user-facing error messages. It suggested wrapping each block with specific messages tied to the exception type - FileNotFoundError, PermissionError, IsADirectoryError - rather than a single generic catch. That was the right call and I wouldn't have done it that cleanly by hand at 11pm.
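To make that concrete, here is a minimal sketch of the pattern, not the actual contents of filesystem_server.py. `SandboxError` and `read_sandboxed` are hypothetical names; only the "Access denied" message is taken from the project. It assumes Python 3.9+ for `Path.is_relative_to`.

```python
import tempfile
from pathlib import Path


class SandboxError(Exception):
    """Error whose message is safe to show directly to the user."""


def read_sandboxed(sandbox_dir, relative_path):
    root = Path(sandbox_dir).resolve()
    target = (root / relative_path).resolve()
    # Resolve first, then compare, so "../" segments cannot escape the sandbox.
    if not target.is_relative_to(root):
        raise SandboxError("Access denied: path is outside sandbox directory")
    try:
        return target.read_text(encoding="utf-8")
    except FileNotFoundError:
        raise SandboxError(f"File not found: {relative_path}") from None
    except IsADirectoryError:
        raise SandboxError(f"Expected a file but found a directory: {relative_path}") from None
    except PermissionError:
        raise SandboxError(f"Permission denied: {relative_path}") from None


denied = missing = None
with tempfile.TemporaryDirectory() as sandbox:
    Path(sandbox, "notes.txt").write_text("hello", encoding="utf-8")
    content = read_sandboxed(sandbox, "notes.txt")
    try:
        read_sandboxed(sandbox, "../outside.txt")
    except SandboxError as err:
        denied = str(err)
    try:
        read_sandboxed(sandbox, "missing.txt")
    except SandboxError as err:
        missing = str(err)

print(content, denied, missing, sep="\n")
```

Each exception type maps to one sentence a user can act on, and `from None` keeps the raw traceback chain out of whatever the frontend displays.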

The COMPLEX_TEST_QUERIES.md: I started writing test cases by hand. Copilot kept generating the next logical one based on the pattern I established. "List all Python files in the sandbox" → "Read the contents of a specific file" → "Write a new file with generated content." The progression made sense and I kept most of it.

Where it didn't help: anything requiring knowledge of my specific project structure. It would hallucinate import paths, suggest tools I hadn't built, and occasionally propose FastAPI route patterns that conflicted with what I already had. The rule I settled on: use it for boilerplate and structure, verify everything that touches the actual logic.

The original project was built in a sprint, for a grade. This version was rebuilt for the people who might actually use it. That's a different problem, and it turned out to be a harder one. GitHub Copilot made the second pass faster - not by doing the thinking, but by handling the parts that didn't require any.

If you're building MCP tooling and want a working reference implementation with a real setup path, the repo is linked above. The npm package is live. Feedback welcome.

Follow for more coverage on MCP, agentic AI, and developer infrastructure.

Google I/O 2026: MCP Is Now Infrastructure (Spark, Managed Agents, WebMCP & More)

Om Shree — Thu, 28 May 2026 12:37:25 +0000

Google I/O used to be about new models. This year it was about what those models do - and how they connect to everything else. MCP was everywhere.

Not as a novelty. Not as an experiment. As the assumed plumbing.

Here's what actually shipped.

Gemini Spark Will Run on MCP for Third-Party Tools

The headline agent at I/O 2026 was Gemini Spark - a 24/7 AI agent that runs on cloud VMs, works while your devices are off, and handles long-running tasks across Gmail, Docs, and Calendar. Spark integrates with Google Workspace apps first, then expands to third-party tools via MCP over the summer.

That's the part worth sitting with. Google built its flagship consumer agent and then said: for everything outside our walls, we'll use the open protocol. A year ago, MCP was a specification from Anthropic. Today, it is the planned route into third-party tools for Google's headline agent. Cursor, Copilot, Windsurf, Mistral, Grok - they all support it too.

When the company that runs Search, Gmail, Android, and Chrome commits to MCP as the integration layer for its flagship product, the protocol debate is effectively over.

Managed Agents Get MCP Servers by Default

Google also launched Managed Agents through the Gemini API - a setup where a single API call provisions a remote Linux environment. Each agent gets its own ephemeral, isolated sandbox provisioned with skills, Model Context Protocol (MCP) servers, and server-side tools. Full integration with A2A and with Agent Platform governance and security is coming soon.

Managed Agents are powered by the Antigravity agent and built on Gemini 3.5 Flash. Developers can define custom agents through versionable markdown files such as AGENTS.md and SKILL.md, rather than building complex orchestration layers from scratch.

This is Google offering hosted execution, sandboxing, state handling, and MCP tool access as a bundled service. The enterprise pitch is operational abstraction - you define the agent, Google runs the runtime.

WebMCP: MCP Gets a Browser Layer

The most underreported announcement at I/O 2026 was WebMCP. WebMCP is a proposed open web standard that allows developers to expose structured tools, like JavaScript functions and HTML forms, so browser-based AI agents can execute complex tasks with greater speed, reliability, and precision. The experimental WebMCP origin trial starts in Chrome 149, with support for Gemini in Chrome coming soon.

The problem it solves is real. Browser agents today navigate by reading rendered HTML and guessing where to click. A dynamically injected form field, a JavaScript-rendered dropdown, a modal that loads on interaction - these are routine failures. With WebMCP, the site declares those functions and forms as structured tools, so the agent calls them directly with the reliability you'd expect from a typed API.

The protocol composes cleanly with the rest of the stack: MCP handles agent-to-infrastructure connections (databases, APIs, file systems), A2A handles agent-to-agent coordination across vendors, and WebMCP handles agent-to-website interaction in the browser. Three protocols, three layers.

WebMCP currently lives in the W3C Web Machine Learning Community Group - an incubation space, not the full standards process. The path from origin trial to official standard is long. But six major consumer platforms publicly committed to implement it before it's finalized. That's a credible signal.

Google Security Operations Ships a Remote MCP Server

On the enterprise security side, Google shipped a remote MCP server for Google Security Operations and made it generally available, so you can build your own security agents against it. You can also access the MCP server client directly from the Google Security Operations chat interface, which is available in preview.

The Google Security Operations remote MCP server is enabled when you enable the Google Security Operations API. It connects with AI applications including Gemini CLI, ChatGPT, Claude, and custom applications you're developing.

This matters because security operations is one of the domains where agent reliability directly affects risk. Shipping a managed, remote MCP server here - rather than asking security teams to run their own - is a meaningful architectural choice.

Genkit 2.0 Adds Native MCP Server Integration

For developers building agent applications in TypeScript, Genkit 2.0 GA ships as a TypeScript AI framework with native MCP server integration, streaming, Cloud Trace observability, and one-click Cloud Run deployment.

Native MCP integration in a GA framework means developers no longer need to wire MCP separately - it's in the baseline toolchain. Combined with Cloud Run deployment, the path from "I have an MCP server" to "it's running in production" is now shorter than it's ever been.

A2A Hits 150 Organizations in Production - and It Complements MCP, Not Replaces It

Google's Agent2Agent protocol also had a significant update at I/O 2026. A2A has reached 150 organizations in production - not pilot - routing real tasks between agents built on different platforms. The protocol is now governed by the Linux Foundation's Agentic AI Foundation and has reached version 1.2, with signed agent cards using cryptographic signatures for domain verification. Microsoft, AWS, Salesforce, SAP, and ServiceNow are running A2A in production environments.

The distinction from MCP is worth being clear about. MCP handles how an agent connects to tools and data sources. A2A handles how agents communicate with each other across organizational and platform boundaries. They're complementary. The full interoperability stack for multi-agent systems uses both.

What This Actually Means

Google I/O 2026 didn't introduce MCP to the world. It normalized it.

Managed Agents provision MCP servers by default. Gemini Spark uses MCP for third-party tools. Security Operations ships a remote MCP server. WebMCP extends the protocol's logic into the browser. Genkit 2.0 bundles native MCP integration in a GA framework.

None of these are experiments. They're production decisions made by a company that controls a significant portion of the developer toolchain.

If you're building agents, or building tools that agents should be able to call, MCP is the interface layer. That was already true six months ago. Google just made it harder to ignore.

For a fuller breakdown of Google I/O 2026's agent announcements, see the Gentoro analysis.

AWS Just Made Its MCP Server Generally Available. Here's What It Actually Gives AI Agents.

Om Shree — Mon, 25 May 2026 12:16:36 +0000

The dirty secret of AI coding agents working on AWS has always been the credential problem: give the agent too much access and you've handed over the keys; give it too little and it's useless.
AWS just shipped its answer.

The Problem It's Solving

AI coding agents working with AWS have two compounding failure modes. First, their training data goes stale fast. Without access to current AWS documentation, agents rely on training data that may be months out of date and may not know about services like Amazon S3 Vectors, Amazon Aurora DSQL, or Amazon Bedrock AgentCore. Second, when they do reach for AWS tooling, their instincts are wrong: they tend to reach for the AWS CLI rather than AWS CDK or CloudFormation, and they produce IAM policies that are far broader than necessary. The result is infrastructure that clears a demo and breaks in production.

The deeper issue is structural. Before this release, connecting an AI agent to AWS meant either injecting broad credentials into a prompt context — a governance nightmare — or building custom middleware that quickly becomes a maintenance burden. Neither solution scales in an enterprise setting where audit trails and least-privilege access aren't optional.

How the AWS MCP Server Actually Works

The AWS MCP Server is now part of the recently announced Agent Toolkit for AWS, a set of tools, plugins, and workflows that help AI coding agents work with AWS services. The toolkit's open-source codebase is available on GitHub.

The server exposes a compact, fixed tool set rather than dumping the entire AWS surface area into the agent's context window. The call_aws tool covers all 15,000+ AWS API operations using existing IAM credentials. The search_documentation and read_documentation tools pull current AWS docs at query time, bypassing the model's knowledge cutoff entirely. And the newest addition, run_script, lets the agent execute short Python scripts server-side in a sandboxed environment — no local filesystem access, no shell, IAM permissions inherited but network-isolated.

When an agent needs to call multiple APIs and combine the results, making them one at a time is slow and burns context. With run_script, the agent chains API calls, filters responses, and computes results in a single round-trip, which is both faster and more context-efficient.
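As a rough illustration of what such a chained script buys you, the sketch below lists queues, fetches each one's backlog, and returns only a compact summary. The article does not show what the real run_script environment exposes, so `call_api` here is a hypothetical stand-in that returns canned responses shaped like the SQS ListQueues and GetQueueAttributes APIs, and the queue URLs are made up.

```python
# Canned data standing in for live AWS state (queue URL -> message count as a string).
FAKE_BACKLOG = {
    "https://sqs.example/orders": "120",
    "https://sqs.example/emails": "0",
    "https://sqs.example/audit": "7",
}


def call_api(operation, **params):
    """Hypothetical stand-in for an AWS API call; returns SQS-shaped responses."""
    if operation == "ListQueues":
        return {"QueueUrls": list(FAKE_BACKLOG)}
    if operation == "GetQueueAttributes":
        depth = FAKE_BACKLOG[params["QueueUrl"]]
        return {"Attributes": {"ApproximateNumberOfMessages": depth}}
    raise ValueError(f"unsupported operation: {operation}")


def backlogged_queues(threshold):
    """Chain list + describe calls and keep only queues above the threshold."""
    report = []
    for url in call_api("ListQueues")["QueueUrls"]:
        attrs = call_api(
            "GetQueueAttributes",
            QueueUrl=url,
            AttributeNames=["ApproximateNumberOfMessages"],
        )
        depth = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
        if depth > threshold:
            report.append({"queue": url.rsplit("/", 1)[-1], "depth": depth})
    return sorted(report, key=lambda row: row["depth"], reverse=True)


summary = backlogged_queues(threshold=5)
print(summary)
```

Four API responses go in, two short rows come out. Done as separate tool calls, every intermediate response would have landed in the agent's context window.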

On authentication: the AWS MCP Server authenticates requests with IAM and SigV4. Because MCP's own authorization model currently supports only OAuth 2.1, MCP clients can't sign requests with SigV4 themselves; instead, the open-source MCP Proxy for AWS runs locally and signs their requests with your local AWS credentials. It's a thin bridge, not a workaround: the IAM trust model stays intact end-to-end.

The configuration for Claude Code is a single command:

claude mcp add-json aws-mcp --scope user \
   '{"command":"uvx","args":["mcp-proxy-for-aws@latest","https://aws-mcp.us-east-1.api.aws/mcp","--metadata","AWS_REGION=us-west-2"]}'

What Teams Are Actually Using It For

The stale-documentation problem is the most immediately visible win. AWS's own demo illustrates it cleanly: ask Claude Code (backed by Opus 4.6, knowledge cutoff May 2025) how to store embeddings on S3 without the MCP server, and you get five technically correct answers — none of which mention Amazon S3 Vectors, which launched in preview July 2025. Connect the MCP server, ask the same question, and the agent searches live AWS documentation and surfaces S3 Vectors directly.

Beyond documentation freshness, the enterprise governance story is significant. You can use IAM policies or Service Control Policies to specify that a given user can perform mutating operations while the MCP server is restricted to read-only actions. Amazon CloudWatch metrics published under the AWS-MCP namespace let you observe MCP server calls separately from direct human calls, giving you the audit trail that compliance teams require. AWS CloudTrail captures all API calls for a complete record.

Documentation search and skill discovery can now be used without requiring AWS credentials — a deliberate decision to lower the barrier for read-only exploration without relaxing the security posture on mutating operations.

Why This Is a Bigger Deal Than It Looks

The Skills system — curated guidance and best practices for the tasks where agents most commonly make mistakes — is the part of this release that deserves more attention than it's getting. Skills are contributed and maintained by AWS service teams, which means the agent's guidance on IAM policy scoping, CDK patterns, or CloudFormation structure comes from the people who own those services, updated when those services change. That's a fundamentally different posture than baking best practices into a system prompt and hoping the model generalizes correctly.

The governance architecture also matters at an industry level. Darryl Ruggles, principal cloud solutions architect at Ciena, notes that the AWS MCP Server takes "a measured approach" to the long-standing tension between usefulness and safety in giving AI agents access to AWS. That measured approach — IAM context keys, CloudWatch namespacing, CloudTrail integration — is the kind of governance scaffolding that turns an interesting prototype into something a CISO will actually approve.

The open question practitioners raise is fair: whether there will be gateways that restrict specific actions or operations. Fine-grained operation-level blocking beyond standard IAM is still on the community's wishlist.

Availability and Access

The AWS MCP Server is currently available only in two regions, Northern Virginia and Frankfurt. It is free to use, although charges apply to the resources consumed by agents.

The MCP Server can be integrated with any AI agent that supports MCP, including Claude Code, Kiro, Cursor, and Codex. The full setup guide lives in the AWS MCP Server User Guide. You'll need uv installed (curl -LsSf https://astral.sh/uv/install.sh | sh) before wiring up the proxy.

The Agent Toolkit for AWS is the broader container this fits into — worth watching as AWS continues adding skills and plugins from individual service teams.

Every major cloud provider is now racing to become the default infrastructure layer for agentic AI. AWS isn't winning that race on documentation freshness alone — they're winning it by being the only provider that's built IAM governance directly into the MCP layer from day one.

Follow for more coverage on MCP, agentic AI, and AI infrastructure.

An npm Package for AI Agent Orchestration Just Shipped With Its Front Door Unlocked. Here's What the CVE Actually Reveals.

Om Shree — Mon, 25 May 2026 12:14:55 +0000

The MCP ecosystem is growing fast enough that security researchers are now hunting it like any other production attack surface. CVE-2026-46701, published May 21, 2026, is the first notable proof that the hunt is paying off.

The Problem It's Solving (Or Was Supposed To)

Network-AI is a TypeScript/Node.js multi-agent orchestration layer. It handles the coordination problem that every team building with multiple agents eventually hits: parallel agents writing to the same shared state, overwriting each other, corrupting context with no error thrown. Network-AI addresses this with a shared blackboard that uses atomic propose-validate-commit locking, HMAC/Ed25519 audit trails, per-agent token budgets, and FSM governance. It plugs into 17 AI frameworks — LangChain, AutoGen, CrewAI, OpenAI Assistants, LlamaIndex, and more — through a local MCP server running on port 3001.

The MCP server is the attack surface.

How the Vulnerability Actually Works

The advisory describes three lines of code that interact badly enough to hand full orchestrator access to any web page a user visits.

The first is in bin/mcp-server.ts. The server's secret defaults to an empty string:

secret: process.env['NETWORK_AI_MCP_SECRET'] ?? '',

The second is in the auth guard in lib/mcp-transport-sse.ts. When the secret is falsy — which an empty string is — _isAuthorized returns true unconditionally, no Authorization header required:

private _isAuthorized(req: http.IncomingMessage): boolean {
  if (!this._opts.secret) return true;
  // ...
}

The third is the CORS header, set before any auth check runs:

res.setHeader('Access-Control-Allow-Origin', '*');

Put these together: any cross-origin browser request reaches the MCP server's JSON-RPC handler with no credentials, and the browser is explicitly allowed to read the response back. An attacker who can get a user to visit a malicious web page while Network-AI is running locally can invoke all 22 exposed MCP tools silently. The proof-of-concept in the advisory demonstrates this cleanly — an unauthenticated POST to /mcp from http://evil.example.com returns HTTP 200 with isError: false, config_set executed without a token.

The CWE here is CWE-346: Origin Validation Error. CVSS score is 7.6 High, with attack complexity rated Low and privileges required rated None. That combination matters: no special setup, no brute force, no existing session. One page visit.

What an Attacker Can Actually Do With It

The 22 MCP tools exposed through this vector are not read-only status endpoints. The advisory specifically calls out config_set (mutate orchestrator configuration arbitrarily), agent_spawn (launch new agents), blackboard_write and blackboard_delete (corrupt the shared state that every agent in the system is reading), and token_create / token_revoke (tamper with the permission token system).

The integrity impact is rated High. An attacker who can write to the blackboard can feed poisoned state to every downstream agent. An attacker who can spawn agents can redirect the orchestrator's work. An attacker who can revoke tokens can deny legitimate agents access. All of this from a browser tab, assuming the user has a default Network-AI install running and hasn't set NETWORK_AI_MCP_SECRET.

The confidentiality impact is rated Low — blackboard contents and audit log queries are readable, but model weights and credentials are not directly exposed through the MCP API. Availability impact is also Low. The service keeps running, just with attacker-controlled configuration.
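Those ratings are enough to reproduce the 7.6 score by hand. The sketch below applies the CVSS v3.1 base-score equations for an unchanged scope. The article states attack complexity, privileges required, and the three impact ratings. Attack Vector: Network, User Interaction: Required, and Scope: Unchanged are my assumptions, chosen because the attack needs a page visit and because they are consistent with the published score.

```python
import math

# CVSS v3.1 metric weights (scope unchanged), per the FIRST specification.
AV_NETWORK = 0.85   # assumption: not stated in the article
AC_LOW = 0.77
PR_NONE = 0.85
UI_REQUIRED = 0.62  # assumption: the victim has to visit the attacker's page
HIGH, LOW = 0.56, 0.22


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a):
    """Base score for the scope-unchanged case."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score(AV_NETWORK, AC_LOW, PR_NONE, UI_REQUIRED, c=LOW, i=HIGH, a=LOW)
print(score)
```

With these inputs the impact term is about 4.70 and the exploitability term about 2.84, which rounds up to 7.6. Dropping the user-interaction requirement would push the same bug to 8.6.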

Why This Is a Bigger Deal Than It Looks

This vulnerability is a preview of a class of issues the MCP ecosystem is about to encounter at scale.

The pattern (a local server on a fixed port that implicitly trusts anything able to reach localhost, combined with permissive CORS) is not unique to Network-AI. It's a natural consequence of how MCP servers are typically architected: they're designed to be easy to connect to from a client (Claude, Cursor, VS Code) on the same machine, and keeping "easy to connect to" from undermining "secure against cross-origin requests" takes explicit attention.

The MCP specification itself doesn't mandate auth. Individual implementations are expected to handle it. When a library ships with an empty default secret and a ?? '' fallback, the developer who installs it and never sets NETWORK_AI_MCP_SECRET gets an open server — and probably doesn't know it.

The remediation in the advisory is correct: require a non-empty secret at startup, fail fast if none is set in SSE mode, and restrict CORS to localhost and 127.0.0.1 origins rather than wildcarding everything. Moving CORS headers after the auth check would also prevent rejected requests from advertising cross-origin access in the first place.
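The three remediation steps can be sketched as small JavaScript helpers. The function names and request shape below are hypothetical and are not taken from the 5.4.5 patch; the sketch only illustrates failing fast on an empty secret, limiting CORS to localhost and 127.0.0.1 origins, and emitting CORS headers only after authentication succeeds.

```javascript
// Hypothetical hardening helpers; names are illustrative, not Network-AI's API.

// Fail fast: refuse to start when the secret is missing or blank.
function requireSecret(env) {
  const secret = (env.NETWORK_AI_MCP_SECRET ?? '').trim();
  if (secret === '') {
    throw new Error('NETWORK_AI_MCP_SECRET must be set to a non-empty value');
  }
  return secret;
}

// Allow only browser origins served from this machine.
function isLocalOrigin(origin) {
  try {
    const { protocol, hostname } = new URL(origin);
    return (protocol === 'http:' || protocol === 'https:') &&
      (hostname === 'localhost' || hostname === '127.0.0.1');
  } catch {
    return false; // missing or malformed Origin header
  }
}

// Authenticate first, then decide whether to emit a CORS header.
function authorize(req, secret) {
  if (req.token !== secret) {
    return { status: 401, headers: {} }; // rejected: no CORS header at all
  }
  const headers = {};
  if (isLocalOrigin(req.origin)) {
    headers['Access-Control-Allow-Origin'] = req.origin; // echo, never '*'
    headers['Vary'] = 'Origin';
  }
  return { status: 200, headers };
}
```

Parsing the origin with URL rather than matching on a string prefix means a host such as localhost.evil.example.com is rejected. A production version would also need to handle preflight requests and would likely compare tokens in constant time; both are left out here to keep the sketch short.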

Affected versions are <= 5.4.4. The fix is in 5.4.5.

Availability and Access

The patched version is on npm now. If you're running Network-AI as part of an agentic workflow (connected to Claude, Cursor, or VS Code via the MCP server), update to 5.4.5 and set NETWORK_AI_MCP_SECRET explicitly to a non-empty value. Don't rely on the default.

The full advisory is at GHSA-j3vx-cx2r-pvg8. Credit to reporters 232-323 and min8282 for responsible disclosure.

The MCP ecosystem now has enough production installs that it's worth treating like any other networked attack surface. Default-open auth and wildcard CORS on a local server handling agent orchestration is the kind of configuration issue that looks benign in a demo and looks serious in a post-mortem. This one got caught before the post-mortem.

Follow for more coverage on MCP, agentic AI, and AI infrastructure.

Microsoft Foundry Just Added CI/CD for AI Agents. Here's What That Actually Changes.

Om Shree — Mon, 25 May 2026 12:13:41 +0000

Most teams can build an AI agent in a weekend. Getting it to production — with version control, quality gates, multi-environment promotion, and audit trails — is where everything breaks down. Microsoft just shipped a reference architecture that treats that problem seriously.

The Problem It's Solving

AI agents have been stuck in a productionization gap. You can prototype fast. Shipping responsibly is another matter entirely. The gap isn't model quality — it's infrastructure. Who owns the deployment pipeline? How do you gate a release on evaluation scores, not just unit tests? How do you promote an agent from dev to test to prod without manual intervention and prayer?

Standard software teams have solved this with CI/CD rigour. The friction is applying that same rigour to AI agents, where the "code" is a combination of prompts, tool schemas, model versions, and evaluation thresholds. That combination doesn't fit neatly into a GitHub Actions workflow designed for stateless services.

Microsoft Foundry is Microsoft's answer to that gap. It's a fully managed platform for building, deploying, and governing AI agents at scale, with a first-class agent runtime and built-in lifecycle management — applicable whether you're building containerised hosted agents or declarative prompt-based agents.

How It Actually Works

The architecture has two deployment targets and one shared pipeline model. Hosted Agents use an agent.yaml declarative manifest — aligned with the AgentSchema spec — that defines an agent's portable configuration: name, description, target model, system instructions, tool declarations, and runtime settings like environment variables and protocol choices. This lets you version the agent definition as infrastructure-as-config stored directly in your repo.
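To make the manifest idea concrete, here is a rough JavaScript sketch of checking a parsed agent definition before it enters the pipeline. The field names are hypothetical stand-ins for the categories listed above; the actual agent.yaml and AgentSchema keys are not shown in this article and may differ.

```javascript
// Hypothetical field names; the real agent.yaml / AgentSchema keys may differ.
const REQUIRED_FIELDS = ['name', 'description', 'model', 'instructions', 'tools'];

// Returns a list of problems; an empty list means the definition looks complete.
function validateAgentDefinition(agent) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (agent[field] === undefined || agent[field] === '') {
      problems.push(`missing field: ${field}`);
    }
  }
  if (agent.tools !== undefined && !Array.isArray(agent.tools)) {
    problems.push('tools must be a list');
  }
  return problems;
}

// What a parsed manifest might look like once loaded from the repo.
// Every value here is a placeholder.
const exampleAgent = {
  name: 'support-triage',
  description: 'Routes incoming support tickets',
  model: 'example-model',
  instructions: 'Classify each ticket and suggest an owner.',
  tools: [{ name: 'example-tool' }],
  environment: { LOG_LEVEL: 'info' },
};

console.log(validateAgentDefinition(exampleAgent)); // empty list when complete
```

Because the definition is plain data in the repo, a check like this can run on every pull request, which is the practical payoff of versioning the agent as configuration.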

The reference pipeline handles promotion across three environments: Dev, Test, and Production. It uses parallel implementations in both GitHub Actions and Azure DevOps, with credentials referenced through secret stores and variable groups — no hardcoded secrets in tracked pipeline files.

The quality gate is the key structural difference from standard software CI/CD. Agents don't fail linting — they fail evaluations. Azure AI Foundry provides offline evaluation tooling within CI/CD pipelines, so agents are assessed against quality standards before any release reaches production. That evaluation step is what makes the pipeline an actual gate rather than a deployment script with extra steps.
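An evaluation gate of this kind reduces to comparing scores against thresholds and blocking promotion on any shortfall. The JavaScript below is a generic sketch, not Foundry's evaluation API: evaluateGate is a hypothetical helper, the metric names are borrowed from the built-in evaluators named in this article, and the score scale and threshold values are invented for illustration.

```javascript
// Hypothetical gate; the score scale and thresholds are made up for illustration.
function evaluateGate(scores, thresholds) {
  const failures = [];
  for (const [metric, minimum] of Object.entries(thresholds)) {
    const score = scores[metric];
    // A missing score counts as a failure rather than a silent pass.
    if (typeof score !== 'number' || score < minimum) {
      failures.push({ metric, score, minimum });
    }
  }
  return { passed: failures.length === 0, failures };
}

const thresholds = { coherence: 4.0, relevance: 4.0, groundedness: 4.5 };
const result = evaluateGate(
  { coherence: 4.6, relevance: 4.2, groundedness: 4.1 },
  thresholds
);

// In a CI job, this is where the step would exit non-zero to block promotion.
console.log(result.passed ? 'gate passed' : 'gate failed', result.failures);
```

Treating a missing metric as a failure is a deliberate choice in this sketch: an evaluation that did not run should stop the release just as a low score would.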

On the observability side, the core capabilities of Foundry Control Plane are now generally available, including end-to-end tracing built on OpenTelemetry, built-in evaluators covering coherence, relevance, groundedness, and safety, and continuous production traffic monitoring through Azure Monitor. Custom evaluators, both code-based and LLM-as-a-judge, are available in preview for teams with domain-specific quality requirements.

The hosted agent runtime itself has been rebuilt around isolation. Each agent session runs in its own dedicated secure sandbox — no shared state between sessions, no cross-tenant data leakage, sub-100ms startup time with zero idle cost since agents are suspended between conversation turns.

What Teams Are Actually Using It For

The most direct use case is enterprise agent deployment with governance requirements. Foundry Agent Service is a flexible, pro-code solution with extensive developer tooling and CI/CD integration designed for complex enterprise scenarios — including multi-agent orchestration, advanced security, compliance features, flexible model support, and connectivity options suited to large-scale, regulated environments. That's the positioning Microsoft is going after: teams where "it works on my machine" is not a ship criterion.

The AI Red Teaming Agent is now generally available alongside the CI/CD stack, giving teams automated adversarial testing capabilities with CI/CD integration so red teaming runs can be gated into the deployment pipeline itself. Findings are logged and tracked over time in Foundry, so risk posture improves alongside the agent as it evolves.

For teams already using Microsoft Agent Framework, the v1.0 release is now stable across Python and .NET, unifying the enterprise-grade foundations of Semantic Kernel with the multi-agent orchestration from AutoGen. It ships with native MCP, A2A, and OpenAPI support out of the box.

Why This Is a Bigger Deal Than It Looks

The framing here matters. Microsoft isn't shipping a deployment tool — it's shipping an opinion about how agentic software should be developed. The opinion is that agents should be managed exactly like application software: versioned, evaluated, promoted through environments, and governed at the tenant level.

Every agent created in Foundry Agent Service is automatically visible in Microsoft Agent 365, giving IT admins a single unified control plane to observe, secure, and govern all agents across the organization, regardless of where they were built. That's not a developer feature. That's an enterprise procurement argument.

The second implication is framework-level. The Toolbox in Foundry — which exposes web search, file search, code interpreter, and Azure AI Search through a single unified endpoint — works regardless of which agent framework you're using: Microsoft Agent Framework, LangGraph, or others, without custom glue code. That interoperability is deliberate. Microsoft is betting on Foundry as the deployment and governance layer even if teams pick their own orchestration stack.

Availability and Access

The reference architecture includes the GitHub Actions workflow, the Azure DevOps pipeline YAML, and the architecture diagram. The foundry-cicd repository on GitHub has the full implementation. Foundry Toolkit for VS Code is generally available. Hosted agents, memory, and Toolbox are in public preview. Memory billing begins June 1, 2026, with hosted agent compute priced at $0.0994 per vCPU-hour and memory at $0.0118 per GiB-hour during preview — you pay only for active execution.
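For a rough sense of scale, the quoted preview rates can be turned into a back-of-the-envelope estimate. The JavaScript sketch below assumes both rates apply only to active execution hours and uses a made-up workload; the sizing and the estimateHostedAgentCost helper are hypothetical, and actual billing rules may differ.

```javascript
// Preview rates quoted in the article (USD).
const VCPU_HOUR_RATE = 0.0994;
const GIB_HOUR_RATE = 0.0118;

// Estimated cost for the time an agent is actively executing.
function estimateHostedAgentCost({ vcpus, memoryGiB, activeHours }) {
  const compute = vcpus * activeHours * VCPU_HOUR_RATE;
  const memory = memoryGiB * activeHours * GIB_HOUR_RATE;
  return { compute, memory, total: compute + memory };
}

// Hypothetical workload: 1 vCPU and 2 GiB, active for 10 hours in total.
const estimate = estimateHostedAgentCost({ vcpus: 1, memoryGiB: 2, activeHours: 10 });
console.log(estimate.total.toFixed(2)); // prints: 1.23
```

That is 0.994 for compute plus 0.236 for memory, so under these assumptions ten active hours of a small agent cost a little over a dollar.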

The bet Microsoft is making is that the hard part of agentic AI isn't building agents; it's shipping them with the same operational rigour that existing software demands. Whether that framing lands depends on whether enterprise teams are actually blocked on deployment infrastructure, or on something harder to automate.
