DEV Community

Alex Merced

AI Tools Race Heats Up: Week of April 3–9, 2026

Microsoft shipped Agent Framework 1.0 this week with full MCP and A2A support, AMD posted record MLPerf Inference 6.0 results, and a JetBrains survey put hard numbers on how fast Claude Code is climbing the professional adoption charts. The agentic stack is snapping together fast.

AI Coding Tools: Microsoft Agent Framework 1.0 Ships

On April 7, Microsoft released Agent Framework 1.0, the production-ready unification of Semantic Kernel and AutoGen into a single open-source SDK. The release delivers stable APIs, a long-term support commitment, and enterprise-grade multi-agent orchestration out of the box. The headline capability is cross-runtime interoperability: Agent Framework 1.0 ships with full MCP support for tool discovery and invocation, with A2A 1.0 support for cross-framework agent collaboration arriving imminently. A browser-based DevUI debugger lets teams visualize agent execution, message flows, and tool calls in real time.

The release lands the same week JetBrains published research from its January 2026 AI Pulse survey of more than 10,000 developers. The numbers tell a clear story: 90% of professional developers now use at least one AI tool at work regularly. GitHub Copilot leads work adoption, but Claude Code has climbed into second place, now used by 18% of developers in professional settings. That is a significant jump from its position just two surveys ago. The JetBrains data also shows Claude Code scoring 80.8% on the SWE-bench Verified benchmark, which measures real bug fixes across actual GitHub repositories — the highest published score for complex debugging and large-codebase work.

Google also drove meaningful activity this week on the open-weights coding side. Gemma 4 launched April 2 under an Apache 2.0 license, built from the same research as Gemini 3. The 31B Dense variant ranks third on Arena AI's open model leaderboard. AMD confirmed day-zero support for all Gemma 4 models across its Instinct GPUs, Radeon GPUs, and Ryzen AI processors — covering everything from cloud data centers to AI PCs — through vLLM, SGLang, llama.cpp, Ollama, and LM Studio.

AI Processing: AMD Posts Record Inference Results

AMD published its MLPerf Inference 6.0 results this week, anchored by the Instinct MI355X GPU. Built on CDNA 4 architecture with a 3nm process, the MI355X carries 185 billion transistors, supports FP4 and FP6 data types, and pairs all of that with up to 288GB of HBM3E memory. The submission covered a range of generative AI workloads from single GPU to multi-node scale, and AMD's ecosystem of partners reproduced the results across four different Instinct GPU types — a first for an MLPerf submission, and one that gives customers real confidence in the numbers.

The broader hardware shift is toward heterogeneous inference architectures. Intel and SambaNova announced a collaboration this week that combines GPUs for the prefill phase of inference, SambaNova Reconfigurable Dataflow Units for the decode phase, and Xeon CPUs for orchestration. The design challenge they are addressing is real: GPU resources are expensive and poorly suited to the decode phase of token generation, which has different memory-bandwidth and compute characteristics than prefill. By mapping each phase of inference to the hardware dataflow it is best suited for, the collaboration targets a meaningful reduction in cost-per-token at production scale. Intel frames this as an ecosystem-first strategy rather than a single-chip bet, and the modular architecture maps naturally to the way cloud providers assemble rack-scale AI systems.
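The phase-disaggregated design described above can be sketched as a simple phase router. Everything here — the `PhaseRouter` class, the pool names — is an illustrative toy under the assumption that prefill, decode, and orchestration are scheduled onto separate device pools; it is not Intel or SambaNova code.

```python
from dataclasses import dataclass, field

@dataclass
class DevicePool:
    device: str
    assigned: list = field(default_factory=list)

class PhaseRouter:
    """Toy scheduler: send each inference phase to the hardware pool
    it is best suited for."""
    def __init__(self):
        self.pools = {
            "prefill": DevicePool("gpu"),        # compute-bound: whole prompt at once
            "decode": DevicePool("rdu"),         # bandwidth-bound: one token per step
            "orchestration": DevicePool("cpu"),  # scheduling and KV-cache handoff
        }

    def route(self, request_id: str, phase: str) -> str:
        pool = self.pools[phase]
        pool.assigned.append(request_id)
        return pool.device
```

In this sketch, a request's prompt would be processed on the prefill pool, with its KV cache then handed off to the decode pool for token generation — the handoff itself being the hard engineering problem the real systems have to solve.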

AMD also released PACE (Platform Aware Compute Engine) on April 8, an optimization framework for LLM inference on 5th Generation EPYC CPUs. PACE targets throughput improvement and latency reduction by adapting inference execution to the specific NUMA topology and cache hierarchy of the CPU it is running on. For organizations running inference on CPU-only infrastructure — a common pattern for privacy-sensitive workloads and edge deployments — this is a practical tool for squeezing more tokens per second out of existing hardware.
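PACE's internals are not published in detail here, but the general idea of NUMA-aware partitioning can be sketched as follows. The function name and approach are assumptions for illustration, not AMD's implementation: divide the logical CPUs into one worker group per NUMA node so each worker's threads and working set stay in a single memory domain.

```python
def plan_numa_pinning(cpu_ids: list[int], numa_nodes: int) -> list[list[int]]:
    """Partition logical CPUs into one worker group per NUMA node so each
    inference worker's threads and cache-resident data stay local to one
    memory domain. On Linux, a plan like this would be applied per worker
    with os.sched_setaffinity."""
    if numa_nodes <= 0 or len(cpu_ids) % numa_nodes != 0:
        raise ValueError("CPU count must divide evenly across NUMA nodes")
    per_node = len(cpu_ids) // numa_nodes
    return [cpu_ids[i * per_node:(i + 1) * per_node] for i in range(numa_nodes)]
```

On a two-node, 16-core machine this yields one worker on CPUs 0–7 and another on CPUs 8–15, avoiding the cross-node memory traffic that throttles token throughput.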

Looking further ahead, NVIDIA's Vera Rubin platform is in full production and scheduled to reach cloud providers in the second half of 2026. AWS, Google Cloud, Microsoft, and OCI are confirmed as among the first to deploy Vera Rubin NVL72 rack-scale systems. The platform targets a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train MoE models, compared to the Blackwell generation. Microsoft's Fairwater AI superfactories are the flagship deployment, scaling to hundreds of thousands of Vera Rubin Superchips.

Standards & Protocols: MCP v2.1 as the Agent Stack Solidifies

The agentic protocol stack continued maturing this week. MCP has crossed 97 million monthly SDK downloads in Python and TypeScript combined, and has been adopted by every major AI provider — Anthropic, OpenAI, Google, Microsoft, and Amazon. The MCP v2.1 specification adds Server Cards, a standard for exposing structured server metadata via a .well-known URL, enabling registries and crawlers to discover server capabilities without connecting to them. Major host applications including Claude Desktop and Cursor have shipped full MCP v2.1 support. The Linux Foundation's Agentic AI Foundation (AAIF) — co-founded by OpenAI, Anthropic, Google, Microsoft, AWS, and Block in December 2025 — now serves as the permanent governance home for both MCP and A2A.
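The Server Card idea is easy to see in miniature: a registry fetches a small metadata document and learns a server's capabilities without ever opening an MCP connection. The field names and payload below are illustrative assumptions, not the actual v2.1 schema.

```python
import json

# Illustrative Server Card payload; field names and structure are
# assumptions for this sketch, not the MCP v2.1 specification.
SAMPLE_CARD = """\
{
  "name": "iceberg-catalog",
  "protocolVersion": "2.1",
  "capabilities": {"tools": ["list_tables", "query_table"]}
}"""

def discover_tools(card_json: str) -> list[str]:
    """Read a server's advertised tools from its Server Card. A registry
    or crawler would fetch this document from the server's .well-known
    URL, then index capabilities without connecting over MCP."""
    card = json.loads(card_json)
    return card.get("capabilities", {}).get("tools", [])
```

The win is the same one sitemaps gave web crawlers: discovery becomes a cheap, cacheable HTTP fetch instead of a stateful protocol handshake.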

Microsoft's Agent Framework 1.0 is the most concrete evidence yet that the MCP-plus-A2A architecture is becoming the production-ready default for enterprise agentic systems. The framework treats MCP as the resource layer — connecting agents to tools, APIs, and data sources through standardized servers — and A2A as the networking layer, enabling agents built on different frameworks to delegate tasks and coordinate workflows. The Elastic team published a two-part implementation guide this week walking through how MCP and A2A complement each other in a practical newsroom multi-agent example, with Elasticsearch providing the data substrate. The pattern of MCP for tool access and A2A for agent coordination is becoming the standard vocabulary for describing agentic architecture, and the practical implementation resources are finally catching up to the conceptual work.
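The two-layer split can be made concrete with a minimal sketch. Plain Python callables stand in for MCP tool invocations (the resource layer) and a task registry stands in for A2A delegation (the networking layer); all class and task names are hypothetical, not Agent Framework APIs.

```python
class SpecialistAgent:
    """An agent whose tools come from the resource layer; here plain
    callables stand in for MCP tool invocations."""
    def __init__(self, name: str, tools: dict):
        self.name = name
        self.tools = tools

    def handle(self, task: str, payload: str) -> str:
        return self.tools[task](payload)

class Orchestrator:
    """Coordinates agents via the networking layer; here a simple task
    registry stands in for A2A cross-framework delegation."""
    def __init__(self):
        self.registry: dict[str, SpecialistAgent] = {}

    def register(self, agent: SpecialistAgent, tasks: list[str]) -> None:
        for task in tasks:
            self.registry[task] = agent

    def delegate(self, task: str, payload: str) -> str:
        return self.registry[task].handle(task, payload)

# A specialist registers the tasks it can handle; the orchestrator
# routes work to it without knowing which framework built it.
researcher = SpecialistAgent("researcher", {"summarize": lambda text: text[:20]})
orchestrator = Orchestrator()
orchestrator.register(researcher, ["summarize"])
```

The point of the separation is that either layer can change independently: swap an MCP server behind a specialist, or swap the specialist itself, without touching the other side.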

For data engineering teams, the intersection of these protocols with the lakehouse stack is the most interesting frontier. MCP servers for Dremio and Apache Iceberg catalogs let agents query and reason over live data without custom integration code. As A2A matures, the pattern of orchestrator agents delegating to specialist data agents — each with MCP-backed access to specific catalog namespaces or table subsets — becomes a plausible production architecture for agentic analytics workflows.
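The namespace-scoping pattern for specialist data agents can be sketched as a small authorization check layered over an agent's catalog access. The `CatalogScope` class and the table names are hypothetical illustrations, not Dremio or Iceberg APIs.

```python
class CatalogScope:
    """Limits a specialist agent's MCP-backed catalog access to one
    namespace and an explicit allowlist of tables within it."""
    def __init__(self, namespace: str, tables: list[str]):
        self.namespace = namespace
        self.tables = set(tables)

    def authorize(self, table_ref: str) -> bool:
        # Table references use dotted names, e.g. "sales.orders";
        # anything outside the scoped namespace is rejected.
        namespace, _, table = table_ref.rpartition(".")
        return namespace == self.namespace and table in self.tables

# A sales specialist agent sees only its own slice of the catalog.
sales_scope = CatalogScope("sales", ["orders", "customers"])
```

Scoping at the MCP-server boundary, rather than inside agent prompts, keeps the access-control decision out of the model's hands entirely.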


Resources to Go Further

The AI landscape changes fast. Here are tools and resources to help you keep pace.

Try Dremio Free — Experience agentic analytics and an Apache Iceberg-powered lakehouse. Start your free trial

Learn Agentic AI with Data — Dremio's agentic analytics features let your AI agents query and act on live data. Explore Dremio Agentic AI

Join the Community — Connect with data engineers and AI practitioners building on open standards. Join the Dremio Developer Community

Book: The 2026 Guide to AI-Assisted Development — Covers prompt engineering, agent workflows, MCP, evaluation, security, and career paths. Get it on Amazon

Book: Using AI Agents for Data Engineering and Data Analysis — A practical guide to Claude Code, Google Antigravity, OpenAI Codex, and more. Get it on Amazon

Top comments (1)

Archit Mittal

The MCP v2.1 Server Cards feature is a game-changer for discoverability. Right now one of the biggest friction points in building multi-agent systems is knowing what tools are available across your stack — having a .well-known endpoint for structured server metadata means you can build agent registries that auto-discover capabilities instead of hardcoding tool configs. I've been building automation workflows with MCP and the pattern of separating tool access (MCP) from agent coordination (A2A) maps really well to real-world use cases where you want specialist agents handling different domains. Curious to see how the PACE framework performs for CPU-only inference — that's a big deal for edge deployments where GPU access isn't feasible.