<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Om Shree</title>
    <description>The latest articles on DEV Community by Om Shree (@om_shree_0709).</description>
    <link>https://dev.to/om_shree_0709</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2900392%2F78ad1723-16ab-4e46-b39c-7f3feb416d23.jpg</url>
      <title>DEV Community: Om Shree</title>
      <link>https://dev.to/om_shree_0709</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/om_shree_0709"/>
    <language>en</language>
    <item>
      <title>DeepSeek Just Dropped V4. Here's What the Benchmarks Actually Tell You.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Fri, 24 Apr 2026 09:01:03 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/deepseek-just-dropped-v4-heres-what-the-benchmarks-actually-tell-you-1oae</link>
      <guid>https://dev.to/om_shree_0709/deepseek-just-dropped-v4-heres-what-the-benchmarks-actually-tell-you-1oae</guid>
      <description>&lt;p&gt;Open-source AI has spent two years being "almost there." With DeepSeek-V4-Pro, the gap with frontier closed-source models isn't almost closed — in some benchmarks, it's gone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;The standard narrative has been simple: closed-source models from OpenAI, Google, and Anthropic sit at the frontier. Open-source models follow, months behind, at a fraction of the cost but with a meaningful capability tax. You pay in quality for what you save in dollars.&lt;/p&gt;

&lt;p&gt;DeepSeek-V4-Pro-Max — the maximum reasoning effort mode of DeepSeek-V4-Pro — is being positioned as the best open-source model available today, significantly advancing knowledge capabilities and bridging the gap with leading closed-source models on reasoning and agentic tasks. &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; That's a bold claim. The benchmark data makes it harder to dismiss than the usual open-source PR.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Actually Works
&lt;/h2&gt;

&lt;p&gt;DeepSeek-V4-Pro ships as a 1.6 trillion parameter Mixture-of-Experts model with 49 billion parameters activated per token, while DeepSeek-V4-Flash runs at 284 billion total with 13 billion activated. Both support a one million token context window. &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/p&gt;
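
&lt;p&gt;To put those parameter counts in proportion, here is a quick back-of-the-envelope sketch of how much of each model is active for a single token. It uses only the figures quoted above; the variable and function names are illustrative.&lt;/p&gt;

```python
# Share of parameters activated per token, using the figures quoted above.
MODELS = {
    "DeepSeek-V4-Pro": {"total_b": 1600, "active_b": 49},   # 1.6T total, 49B active
    "DeepSeek-V4-Flash": {"total_b": 284, "active_b": 13},  # 284B total, 13B active
}

def active_share(total_b, active_b):
    """Return the percentage of parameters used for a single token."""
    return 100 * active_b / total_b

SHARES = {name: round(active_share(**cfg), 1) for name, cfg in MODELS.items()}

for name, pct in SHARES.items():
    print(f"{name}: about {pct}% of parameters active per token")
```

&lt;p&gt;Both models touch only a small slice of their weights on each token, which is the point of the Mixture-of-Experts design.&lt;/p&gt;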

&lt;p&gt;The architecture is doing real work here, not just scaling. A hybrid attention mechanism combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) dramatically improves long-context efficiency — in the 1M-token context setting, DeepSeek-V4-Pro requires only 27% of single-token inference FLOPs and 10% of KV cache compared with DeepSeek-V3.2. &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; That's not a marginal improvement. That's a fundamentally different inference cost profile at scale.&lt;/p&gt;

&lt;p&gt;Manifold-Constrained Hyper-Connections (mHC) strengthen residual connections across layers while preserving model expressivity (&lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;), and the Muon optimizer handles training stability. This isn't DeepSeek iterating on V3; it's a ground-up architectural rethink.&lt;/p&gt;

&lt;p&gt;The reasoning modes matter for how you deploy. Both Pro and Flash support three effort levels: standard, high, and max. For Think Max reasoning mode, DeepSeek recommends setting the context window to at least 384K tokens. &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; The Flash-Max mode is particularly interesting: given a larger thinking budget, it achieves reasoning performance comparable to the Pro version, though its smaller parameter scale places it slightly behind on pure knowledge tasks and the most complex agentic workflows. &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Developers Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;The benchmark table that &lt;a href="https://glama.ai" rel="noopener noreferrer"&gt;Frank Fiegel at Glama&lt;/a&gt; flagged this morning tells the real story — specifically, the agentic and coding numbers.&lt;/p&gt;

&lt;p&gt;On LiveCodeBench, V4-Pro leads the pack at 93.5, ahead of Gemini (91.7) and Claude (88.8). Codeforces rating — a real-world competitive programming measure — puts V4-Pro at 3206, ahead of GPT-5.4 (3168) and Gemini (3052). &lt;a href="https://officechai.com/ai/deepseek-v4-pro-deepseek-v4-flash-benchmarks-pricing/" rel="noopener noreferrer"&gt;OfficeChai&lt;/a&gt; Competitive programming benchmarks are notoriously hard to game; this is the kind of number that makes engineers pay attention.&lt;/p&gt;

&lt;p&gt;On SWE-Verified (real software engineering tasks), V4-Pro sits at 80.6 — within a fraction of Claude (80.8) and matching Gemini (80.6). On Terminal Bench 2.0, V4-Pro (67.9) beats Claude (65.4) and is competitive with Gemini (68.5), though GPT-5.4 leads at 75.1. &lt;a href="https://officechai.com/ai/deepseek-v4-pro-deepseek-v4-flash-benchmarks-pricing/" rel="noopener noreferrer"&gt;OfficeChai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For math reasoning: on IMOAnswerBench, V4-Pro scores 89.8, well ahead of Claude (75.3) and Gemini (81.0), though GPT-5.4 edges ahead at 91.4. &lt;a href="https://officechai.com/ai/deepseek-v4-pro-deepseek-v4-flash-benchmarks-pricing/" rel="noopener noreferrer"&gt;OfficeChai&lt;/a&gt; The one clear gap is Humanity's Last Exam, where V4-Pro scores 37.7, behind GPT-5.4 (39.8), Claude (40.0), and Gemini (44.4). &lt;a href="https://officechai.com/ai/deepseek-v4-pro-deepseek-v4-flash-benchmarks-pricing/" rel="noopener noreferrer"&gt;OfficeChai&lt;/a&gt; Factual world knowledge retrieval is still where closed-source models hold a real edge.&lt;/p&gt;

&lt;p&gt;DeepSeek says V4 has been optimized for use with popular agent tools including Claude Code and OpenClaw (&lt;a href="https://www.cnbc.com/2026/04/24/deepseek-v4-llm-preview-open-source-ai-competition-china.html" rel="noopener noreferrer"&gt;CNBC&lt;/a&gt;), which signals the team is building for production agentic deployment, not just benchmark positioning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The capability story is interesting. The cost story is the one that matters for anyone running production workloads.&lt;/p&gt;

&lt;p&gt;For reference, OpenAI's GPT-5.4 costs $2.50 per 1M input tokens and $15.00 per 1M output tokens, while Claude Opus 4.6 costs $5 per 1M input tokens and $25 per 1M output tokens. DeepSeek, at least on benchmarks, delivers similar performance to these models at a 50-80% cost reduction. &lt;a href="https://officechai.com/ai/deepseek-v4-pro-deepseek-v4-flash-benchmarks-pricing/" rel="noopener noreferrer"&gt;OfficeChai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The timing is not accidental. OpenAI shipped GPT-5.5 the same day. DeepSeek needed a launch window where an open-source 1M-context MoE at a fraction of the cost would not be buried under a closed-source announcement. &lt;a href="https://ofox.ai/blog/deepseek-v4-release-guide-2026/" rel="noopener noreferrer"&gt;Ofox&lt;/a&gt; Shipping on the same day as your biggest competitor's release is a calculated move.&lt;/p&gt;

&lt;p&gt;The V3.2 to V4-Pro jump on Arena AI's live code leaderboard is 88 Elo, roughly the same delta as between the third and thirteenth ranked models on the current board. It is a genuine generational step, not a refresh. &lt;a href="https://ofox.ai/blog/deepseek-v4-release-guide-2026/" rel="noopener noreferrer"&gt;Ofox&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The MCPAtlas Public benchmark in that same table, where V4-Pro-Max scores 73.6 against Opus 4.6's 73.8, is the number that stands out most for anyone building MCP-integrated agent pipelines. Open-source is now essentially at parity on structured tool use. That's the gap that just closed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;The weights are hosted on Hugging Face and ModelScope in FP8 and FP4+FP8 mixed precision formats, released under the MIT License for research and commercial use. &lt;a href="https://www.androidsage.com/2026/04/24/deepseek-v4-released-a-better-ai-alternative-to-chatgpt/" rel="noopener noreferrer"&gt;Android Sage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DeepSeek's pricing sits at $0.14/million tokens input and $0.28/million tokens output for Flash, and $1.74/million input and $3.48/million output for Pro. &lt;a href="https://simonwillison.net/2026/Apr/24/deepseek-v4/" rel="noopener noreferrer"&gt;Simon Willison&lt;/a&gt; The API is live today via OpenRouter and DeepSeek's own endpoint, supporting both OpenAI ChatCompletions and Anthropic protocols.&lt;/p&gt;
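
&lt;p&gt;To see what those list prices mean for a concrete bill, here is a rough cost sketch. The prices are the ones quoted in this article; the workload of 100M input and 20M output tokens per month is a made-up example, so swap in your own numbers.&lt;/p&gt;

```python
# USD per 1M tokens (input, output), as quoted in this article.
PRICES = {
    "DeepSeek-V4-Flash": (0.14, 0.28),
    "DeepSeek-V4-Pro": (1.74, 3.48),
    "GPT-5.4": (2.50, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
}

def monthly_cost(model, input_millions, output_millions):
    """Cost in USD for a workload measured in millions of tokens."""
    price_in, price_out = PRICES[model]
    return round(input_millions * price_in + output_millions * price_out, 2)

# Hypothetical workload: 100M input tokens and 20M output tokens per month.
COSTS = {model: monthly_cost(model, 100, 20) for model in PRICES}

for model, cost in COSTS.items():
    print(f"{model}: ${cost:,.2f}")
```

&lt;p&gt;For this particular mix, V4-Pro comes to $243.60 a month against $550.00 for GPT-5.4 and $1,000.00 for Claude Opus 4.6, a reduction of roughly 56% and 76%. The ratio shifts with your input-to-output balance, because the output price gap is much wider than the input gap.&lt;/p&gt;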

&lt;p&gt;Running a 1.6T parameter model locally requires significant GPU infrastructure — even in FP4+FP8 mixed precision, the memory requirements are substantial. &lt;a href="https://www.androidsage.com/2026/04/24/deepseek-v4-released-a-better-ai-alternative-to-chatgpt/" rel="noopener noreferrer"&gt;Android Sage&lt;/a&gt; For most teams, the API is the practical path. Flash-Max gives you near-Pro reasoning at Flash pricing, which is the configuration worth benchmarking against your specific workloads first.&lt;/p&gt;




&lt;p&gt;The gap between open-source and frontier AI just got measurably smaller — and for the first time, in some categories that actually matter for production agentic systems, it's not a gap at all. The question for teams running closed-source models at frontier prices is no longer "when will open-source catch up?" It's "what are we still paying for?"&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow for more coverage on MCP, agentic AI, and AI infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deepseek</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Microsoft Fabric Just Exposed Its MCP Architecture. Here's What It Actually Changes for Data Teams.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Fri, 24 Apr 2026 01:17:59 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/microsoft-fabric-just-exposed-its-mcp-architecture-heres-what-it-actually-changes-for-data-teams-1i4e</link>
      <guid>https://dev.to/om_shree_0709/microsoft-fabric-just-exposed-its-mcp-architecture-heres-what-it-actually-changes-for-data-teams-1i4e</guid>
      <description>&lt;p&gt;Enterprise data platforms have spent decades building walls around their data. Microsoft just shipped the protocol that lets AI agents walk through those walls — natively, securely, and without a single custom integration.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;Every time an engineering team wants to connect an AI agent to a data platform, they rebuild the same plumbing from scratch: OAuth2 flows, token management, rate-limiting logic, API versioning, error handling. That's before the agent even does anything useful. Multiply that across a company running &lt;a href="https://github.com/features/copilot" rel="noopener noreferrer"&gt;GitHub Copilot&lt;/a&gt;, &lt;a href="https://claude.ai" rel="noopener noreferrer"&gt;Claude&lt;/a&gt;, &lt;a href="https://cursor.sh" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, and &lt;a href="https://www.microsoft.com/en-us/microsoft-copilot/microsoft-copilot-studio" rel="noopener noreferrer"&gt;Copilot Studio&lt;/a&gt; simultaneously, and the integration surface becomes unmanageable.&lt;/p&gt;

&lt;p&gt;The deeper issue is that AI tools have no shared language for talking to enterprise systems. Each integration is bespoke, brittle, and built by someone who had better things to do. The agent either gets too little context or too much — and neither produces reliable outputs against production infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Fabric MCP Architecture Actually Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.microsoft.com/en-us/microsoft-fabric" rel="noopener noreferrer"&gt;Microsoft Fabric&lt;/a&gt; is now shipping two distinct MCP entry points, each targeting a different level of autonomy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Fabric Local MCP&lt;/a&gt;&lt;/strong&gt; is now Generally Available. It's an open-source server that runs on the developer's machine, giving AI assistants deep knowledge of Fabric's APIs. It also enables local-to-cloud data operations — upload data to OneLake, create items, inspect table schemas — all within a single conversation. The Local MCP can wrap the Fabric CLI as tools, meaning CI/CD pipelines can use it to deploy changes with no human in the loop. Authentication is integrated, so there's no manual token management. The recommended install path is a VS Code extension that configures everything automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Fabric Remote MCP&lt;/a&gt;&lt;/strong&gt; is in Preview. This is the cloud-hosted server — no local setup required. It lets AI agents perform authenticated operations directly in a Fabric environment: managing workspaces, handling permissions, executing tasks on behalf of teams. This is the entry point for autonomous agents running in Copilot Studio, not developers pair-programming at a terminal.&lt;/p&gt;

&lt;p&gt;Both run inside the security model, audit trail, and RBAC boundaries Fabric already enforces. The agents can only access what the authenticated user can access. There are no additional roles to provision, no shadow permissions, no new attack surface to manage.&lt;/p&gt;

&lt;p&gt;The underlying protocol making this possible is &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP&lt;/a&gt; — originally created by Anthropic and now adopted by GitHub, Cloudflare, Stripe, and a growing list of enterprise platforms. Rather than creating unique integrations for each AI tool, exposing the platform as an MCP server means any MCP-compatible client can connect instantly. &lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Microsoft Fabric&lt;/a&gt;&lt;/p&gt;
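
&lt;p&gt;For a sense of what "any MCP-compatible client can connect" looks like on the wire, here is a minimal sketch of the two JSON-RPC 2.0 requests a client typically sends: one to discover tools and one to call a tool. The method names &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt; come from the MCP specification; the tool name and arguments are hypothetical and not taken from the Fabric servers.&lt;/p&gt;

```python
import json

def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request, the message format MCP is built on."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message

# Step 1: ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Step 2: call one of them. The tool name and arguments here are hypothetical,
# not taken from the Fabric MCP server.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {"name": "list_workspace_items", "arguments": {"workspace": "sales-analytics"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

&lt;p&gt;Because every server speaks this same shape, the client code does not change when the platform behind it does.&lt;/p&gt;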

&lt;h2&gt;
  
  
  What Teams Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;The use cases split cleanly by role.&lt;/p&gt;

&lt;p&gt;A developer building a data pipeline uses the Local MCP to let GitHub Copilot or Claude look up the correct Fabric API spec, generate code against it, upload data to &lt;a href="https://learn.microsoft.com/en-us/fabric/onelake/onelake-overview" rel="noopener noreferrer"&gt;OneLake&lt;/a&gt;, and validate the result — all within one conversation thread. The agent isn't guessing at APIs or hallucinating parameter names. It's reading the live spec through the MCP server.&lt;/p&gt;

&lt;p&gt;A data team running autonomous workflows points &lt;a href="https://www.microsoft.com/en-us/microsoft-copilot/microsoft-copilot-studio" rel="noopener noreferrer"&gt;Copilot Studio&lt;/a&gt; at the Remote MCP. The agent provisions workspaces, adjusts permissions, and manages resources on behalf of the team without anyone opening the Fabric portal.&lt;/p&gt;

&lt;p&gt;A CI/CD pipeline uses the Fabric CLI wrapped as MCP tools to deploy changes on a schedule, no human in the loop, no interactive auth required.&lt;/p&gt;

&lt;p&gt;And separately, &lt;a href="https://blog.fabric.microsoft.com/en-us/blog/give-your-ai-agent-the-keys-to-onelake-onelake-mcp-generally-available/" rel="noopener noreferrer"&gt;OneLake MCP is now Generally Available&lt;/a&gt; as part of the same extension, letting agents traverse the full OneLake hierarchy — from workspace to item to table schema to physical Delta Lake files — through natural language. An admin could ask an agent to inventory every item in a workspace, a data engineer could check table optimization across lakehouses, and an analyst could explore an unfamiliar dataset without writing a query. &lt;a href="https://blog.fabric.microsoft.com/en-us/blog/give-your-ai-agent-the-keys-to-onelake-onelake-mcp-generally-available/" rel="noopener noreferrer"&gt;Microsoft Fabric&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;When Microsoft previewed the Fabric Local MCP in October, the announcement became one of their most-read posts, approaching 100K views. &lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Microsoft Fabric&lt;/a&gt; That's not a vanity metric — it's a signal that data engineers are actively looking for exactly this kind of native agent integration, not another middleware layer to manage.&lt;/p&gt;

&lt;p&gt;The more consequential signal is architectural. Microsoft didn't build a Fabric-specific agent framework. They implemented MCP — the same protocol Anthropic, GitHub, Cloudflare, and Stripe are converging on — and exposed Fabric through it. That's a deliberate bet that the agentic ecosystem will standardize on one protocol, and that being MCP-native is table stakes for enterprise platforms going forward.&lt;/p&gt;

&lt;p&gt;The analogy Microsoft uses in their own post is precise: MCP is to AI what USB was to hardware — a universal connector that replaces a tangle of proprietary cables with a single standard. &lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Microsoft Fabric&lt;/a&gt; USB didn't make hardware more capable. It made capability composable. That's exactly what MCP does for data infrastructure.&lt;/p&gt;

&lt;p&gt;For teams evaluating where to build agentic data workflows, this changes the calculus. Fabric is no longer just a lakehouse or a BI platform. It's now a surface that any MCP-compatible agent can operate against, with enterprise-grade governance baked in, not bolted on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Fabric Local MCP&lt;/a&gt; is Generally Available now. Install via the VS Code Marketplace extension — it configures automatically and works with GitHub Copilot, Cursor, Claude Desktop, and any MCP-compatible client. &lt;a href="https://blog.fabric.microsoft.com/en-US/blog/agentic-fabric-how-mcp-is-turning-your-data-platform-into-an-ai-native-operating-system/" rel="noopener noreferrer"&gt;Fabric Remote MCP&lt;/a&gt; is in Preview. &lt;a href="https://blog.fabric.microsoft.com/en-us/blog/give-your-ai-agent-the-keys-to-onelake-onelake-mcp-generally-available/" rel="noopener noreferrer"&gt;OneLake MCP tools&lt;/a&gt; ship automatically as part of the Fabric MCP extension if you already have it installed — no additional configuration required.&lt;/p&gt;




&lt;p&gt;The question enterprise data teams should be asking isn't whether to adopt MCP-native tooling. It's how quickly they can deprecate the custom integration layers they've already built. That migration just got a lot easier.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>mcp</category>
      <category>microsoft</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Google Just Launched an Official Agent Skills Repository. Here's What It Actually Solves.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Fri, 24 Apr 2026 01:15:08 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/google-just-launched-an-official-agent-skills-repository-heres-what-it-actually-solves-2k5c</link>
      <guid>https://dev.to/om_shree_0709/google-just-launched-an-official-agent-skills-repository-heres-what-it-actually-solves-2k5c</guid>
      <description>&lt;p&gt;Google just shipped an official repository of Agent Skills at &lt;a href="https://www.googlecloudevents.com/next-vegas/" rel="noopener noreferrer"&gt;Google Cloud Next 2026&lt;/a&gt;. It's a quiet announcement, but it points at one of the most persistent unsolved problems in production agentic AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;MCP servers were supposed to fix context. Give your agent a live, grounded connection to documentation, and it wouldn't hallucinate outdated APIs or confuse one SDK version with another. And that largely works — Google already runs an &lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;MCP server for its developer docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But there's a compounding cost. When agents lean heavily on MCP servers, they pull massive amounts of context into their window on every request. The model drowns in raw documentation, token costs spike, and coherence drops. The community calls this "context bloat," and it gets worse the more products an agent is expected to know.&lt;/p&gt;

&lt;p&gt;The real gap isn't access to information. It's the absence of &lt;em&gt;condensed&lt;/em&gt;, agent-optimized expertise that loads on demand rather than all at once.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Agent Skills Actually Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://agentskills.io/home" rel="noopener noreferrer"&gt;Agent Skills&lt;/a&gt; is an open format — originally developed by Anthropic and released as a community standard — for giving agents packaged, structured expertise. At its core, a skill is a folder containing a &lt;code&gt;SKILL.md&lt;/code&gt; file with metadata and task-specific instructions. It can also bundle scripts, reference docs, templates, and other assets. Think of it as agent-first documentation: compact, purposeful, and written for a machine that needs to act, not just read.&lt;/p&gt;

&lt;p&gt;The mechanism that makes it practical is progressive disclosure. At startup, an agent loads only the name and description of each available skill — just enough to know whether a skill is relevant to the current task. When there's a match, the full instructions are pulled into context. The agent then executes, optionally running bundled scripts or referencing additional files.&lt;/p&gt;

&lt;p&gt;Full context only loads when it's actually needed. That's the design decision that separates this from dumping a documentation site into a system prompt.&lt;/p&gt;
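
&lt;p&gt;The progressive disclosure flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular agent implements it: the two skills are made-up in-memory strings, the frontmatter parser handles only simple &lt;code&gt;key: value&lt;/code&gt; lines, and relevance is decided by naive keyword overlap where a real agent would let the model choose.&lt;/p&gt;

```python
# Two in-memory stand-ins for SKILL.md files. The skill names and text are
# made up for illustration; real skills live in folders on disk.
SKILL_FILES = {
    "bigquery": "---\nname: bigquery\ndescription: Query and manage BigQuery datasets\n---\nFull BigQuery instructions go here.",
    "cloud-run": "---\nname: cloud-run\ndescription: Deploy containers to Cloud Run\n---\nFull Cloud Run instructions go here.",
}

def split_skill(text):
    """Split a SKILL.md string into (metadata dict, instruction body)."""
    _, frontmatter, body = text.split("---", 2)
    metadata = {}
    for line in frontmatter.strip().splitlines():
        key, value = line.split(":", 1)
        metadata[key.strip()] = value.strip()
    return metadata, body.strip()

def load_index(skill_files):
    """Startup: keep only name and description for each skill."""
    return [split_skill(text)[0] for text in skill_files.values()]

def activate(task, index, skill_files):
    """On a match, pull the full instructions for the relevant skills."""
    words = set(task.lower().split())
    loaded = {}
    for meta in index:
        if words.intersection(meta["description"].lower().split()):
            loaded[meta["name"]] = split_skill(skill_files[meta["name"]])[1]
    return loaded

INDEX = load_index(SKILL_FILES)
ACTIVE = activate("optimize this bigquery table", INDEX, SKILL_FILES)
print(list(ACTIVE))
```

&lt;p&gt;Only the BigQuery instructions end up in context for that task; the Cloud Run skill costs nothing beyond its one-line description.&lt;/p&gt;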

&lt;h2&gt;
  
  
  What Developers Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;Google's official repository launches at &lt;a href="https://github.com/google/skills" rel="noopener noreferrer"&gt;github.com/google/skills&lt;/a&gt; with thirteen skills out of the gate. Seven are product-specific: AlloyDB, BigQuery, Cloud Run, Cloud SQL, Firebase, the Gemini API, and GKE. Three map to the &lt;a href="https://docs.cloud.google.com/architecture/framework" rel="noopener noreferrer"&gt;Well-Architected Framework&lt;/a&gt; pillars — Security, Reliability, and Cost Optimization. And three are "recipe" skills covering onboarding, authentication, and network observability.&lt;/p&gt;

&lt;p&gt;Installing them is a single command: &lt;code&gt;npx skills install github.com/google/skills&lt;/code&gt;. They work across &lt;a href="https://antigravity.google/" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt;, &lt;a href="https://geminicli.com/" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt;, and any third-party agent that implements the Skills spec.&lt;/p&gt;

&lt;p&gt;The product-specific skills are the immediately practical ones. An agent working against BigQuery or GKE no longer needs to maintain a live MCP connection to documentation just to get accurate syntax, service limits, or recommended patterns. The skill carries that knowledge, loads it when relevant, and stays out of the way otherwise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The Agent Skills format wasn't built by Google — it was built by Anthropic and open-sourced. Google adopting it as the vehicle for their official documentation layer is a meaningful signal: this is becoming infrastructure, not a framework-specific feature.&lt;/p&gt;

&lt;p&gt;For teams building agents on Google Cloud, the practical implication is real. You can now equip an agent with accurate, maintained, Google-authored knowledge about Cloud Run or Firebase without inflating every prompt with raw documentation. The skills are versioned, auditable, and composable — which matters when you're running multi-step workflows across multiple GCP products.&lt;/p&gt;

&lt;p&gt;The deeper shift here is architectural. MCP solved &lt;em&gt;access&lt;/em&gt;. Agent Skills solves &lt;em&gt;delivery&lt;/em&gt;. They're complementary, and the combination starts to look like a serious answer to the context problem that's been quietly breaking production agents for the past year.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;The repository is live now at &lt;a href="https://github.com/google/skills" rel="noopener noreferrer"&gt;github.com/google/skills&lt;/a&gt;. Google has confirmed additional skills will ship in the coming weeks and months. The &lt;a href="https://agentskills.io/specification" rel="noopener noreferrer"&gt;Agent Skills format spec&lt;/a&gt; is open, meaning any agent platform can implement support, and any team can build and distribute their own skills using the same structure.&lt;/p&gt;




&lt;p&gt;Context bloat has been treated like an engineering nuisance. Google just made the case that it's an infrastructure problem — and shipped a solution for it. The question now is how quickly the rest of the ecosystem follows with their own official skills repositories.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>discuss</category>
      <category>agentskills</category>
    </item>
    <item>
      <title>The Bitwarden CLI Just Got Backdoored. Here's What the Supply Chain Attack Actually Did.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Fri, 24 Apr 2026 01:11:43 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/the-bitwarden-cli-just-got-backdoored-heres-what-the-supply-chain-attack-actually-did-4aoi</link>
      <guid>https://dev.to/om_shree_0709/the-bitwarden-cli-just-got-backdoored-heres-what-the-supply-chain-attack-actually-did-4aoi</guid>
      <description>&lt;p&gt;Bitwarden serves over 10 million users and 50,000 businesses. On April 22, 2026, for exactly 93 minutes, its CLI was shipping malware.&lt;/p&gt;

&lt;p&gt;This was not a phishing campaign. Nobody tricked a Bitwarden employee into clicking a link. The attackers walked straight through the CI/CD pipeline, injected malicious code into an official release, and let Bitwarden's own npm publishing mechanism do the distribution for them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving (For the Attackers)
&lt;/h2&gt;

&lt;p&gt;Supply chain attacks are effective precisely because they subvert trust. You don't need to compromise a developer's machine directly if you can compromise the tool they're already running with elevated permissions in their build pipeline.&lt;/p&gt;

&lt;p&gt;The affected package version was &lt;code&gt;@bitwarden/cli@2026.4.0&lt;/code&gt;, and the malicious code was published in &lt;code&gt;bw1.js&lt;/code&gt;, a file included in the package contents. &lt;a href="https://thehackernews.com/2026/04/bitwarden-cli-compromised-in-ongoing.html" rel="noopener noreferrer"&gt;The Hacker News&lt;/a&gt; The Bitwarden CLI sits in a privileged position in developer environments — it's commonly used for secrets injection and automated deployments in CI/CD pipelines. That makes it a high-value target.&lt;/p&gt;
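
&lt;p&gt;If you want a quick first check of a project, a sketch like the following scans a &lt;code&gt;package-lock.json&lt;/code&gt; for the affected version. It assumes the npm v2 or v3 lockfile layout, where installed packages are listed under &lt;code&gt;packages&lt;/code&gt; by their &lt;code&gt;node_modules/&lt;/code&gt; path; the sample lockfile is made up. A global install would not appear in a project lockfile, so a clean result here is not proof you were unaffected.&lt;/p&gt;

```python
import json

AFFECTED_NAME = "@bitwarden/cli"
AFFECTED_VERSION = "2026.4.0"

def is_affected(lockfile_text):
    """Return True if a package-lock.json (v2 or v3) pins the affected version."""
    lock = json.loads(lockfile_text)
    suffix = "node_modules/" + AFFECTED_NAME
    return any(
        path.endswith(suffix) and entry.get("version") == AFFECTED_VERSION
        for path, entry in lock.get("packages", {}).items()
    )

# Hypothetical lockfile contents, trimmed to the relevant entry.
SAMPLE_LOCK = json.dumps({
    "lockfileVersion": 3,
    "packages": {"node_modules/@bitwarden/cli": {"version": "2026.4.0"}},
})

# Same shape, but pinned to a different (made-up) version.
OTHER_LOCK = json.dumps({
    "lockfileVersion": 3,
    "packages": {"node_modules/@bitwarden/cli": {"version": "2026.3.0"}},
})

print(is_affected(SAMPLE_LOCK))
print(is_affected(OTHER_LOCK))
```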

&lt;p&gt;The compromise was connected to the ongoing Checkmarx supply chain campaign, with a threat group hijacking the npm package and injecting malicious code designed to steal sensitive data from developer workstations and CLI environments. &lt;a href="https://securityboulevard.com/2026/04/bitwarden-cli-compromise-linked-to-ongoing-checkmarx-supply-chain-campaign/" rel="noopener noreferrer"&gt;Security Boulevard&lt;/a&gt; The researchers who caught it — Socket, JFrog, Ox Security, and StepSecurity — identified it as part of a broader pattern that has been running since at least March 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Attack Actually Worked
&lt;/h2&gt;

&lt;p&gt;The attackers gained access by exploiting a GitHub Actions workflow in Bitwarden's CI/CD pipeline, mirroring previously documented techniques in the Checkmarx campaign, where threat actors leveraged stolen credentials to inject malicious workflows, exfiltrate secrets, and tamper with build outputs before distribution. &lt;a href="https://cyberinsider.com/bitwarden-cli-backdoored-in-checkmarx-supply-chain-attack/" rel="noopener noreferrer"&gt;CyberInsider&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once inside, the attackers embedded the payload quietly into a legitimate build. The malicious code in &lt;code&gt;bw1.js&lt;/code&gt; ran during package installation and harvested GitHub and npm tokens, SSH keys, environment variables, shell history, and cloud credentials. &lt;a href="https://tech.yahoo.com/cybersecurity/articles/bitwarden-cli-supply-chain-attack-142710104.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The exfiltration destination is worth noting. The stolen data was encrypted with AES-256-GCM and exfiltrated to &lt;code&gt;audit.checkmarx[.]cx&lt;/code&gt;, a domain impersonating Checkmarx. &lt;a href="https://thehackernews.com/2026/04/bitwarden-cli-compromised-in-ongoing.html" rel="noopener noreferrer"&gt;The Hacker News&lt;/a&gt; Dressing up malware traffic as a security company's own domain is a particularly cynical touch.&lt;/p&gt;

&lt;p&gt;The blast radius doesn't stop at the developer's machine either. If GitHub tokens are found, the malware weaponizes them to inject malicious Actions workflows into repositories and extract CI/CD secrets — meaning a single developer with the affected version installed can become the entry point for a broader supply chain compromise, with the attacker gaining persistent workflow injection access to every CI/CD pipeline the developer's token can reach. &lt;a href="https://thehackernews.com/2026/04/bitwarden-cli-compromised-in-ongoing.html" rel="noopener noreferrer"&gt;The Hacker News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There's also an attribution wrinkle that security researchers are still untangling. While the shared tooling strongly suggests a connection to the same malware ecosystem as the Checkmarx campaign, the operational signatures differ: the ideological branding is embedded directly in the malware, from the Shai-Hulud repository names to the "Butlerian Jihad" manifesto payload to commit messages proclaiming resistance against machines. &lt;a href="https://socket.dev/blog/bitwarden-cli-compromised" rel="noopener noreferrer"&gt;Socket&lt;/a&gt; That points to either a splinter group or a campaign evolution — not a clean attribution to TeamPCP, who claimed the original Checkmarx attack.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Developers Are Actually Exposed To
&lt;/h2&gt;

&lt;p&gt;Only the npm CLI package was affected. Bitwarden's Chrome extension, MCP server, and other official distribution channels remain uncompromised. &lt;a href="https://cybersecuritynews.com/bitwarden-cli-compromised/" rel="noopener noreferrer"&gt;Cyber Security News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The malicious package was active between 5:57 PM and 7:30 PM ET on April 22, 2026. &lt;a href="https://community.bitwarden.com/t/bitwarden-statement-on-checkmarx-supply-chain-incident/96127" rel="noopener noreferrer"&gt;Bitwarden&lt;/a&gt; That's a 93-minute exposure window. Narrow, but enough.&lt;/p&gt;
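
&lt;p&gt;For triage, the window arithmetic is easy to script. A minimal sketch, assuming "ET" here means Eastern Daylight Time (UTC-4, which applies in late April) and that you can recover a timezone-aware install timestamp from your own logs. The function names are illustrative and do not come from any Bitwarden or npm tooling:&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

# Assumption: "ET" on April 22 is Eastern Daylight Time, i.e. UTC-4.
EDT = timezone(timedelta(hours=-4))
WINDOW_START = datetime(2026, 4, 22, 17, 57, tzinfo=EDT)  # 5:57 PM ET
WINDOW_END = datetime(2026, 4, 22, 19, 30, tzinfo=EDT)    # 7:30 PM ET


def window_minutes() -> int:
    """Length of the reported exposure window in whole minutes."""
    return int((WINDOW_END - WINDOW_START).total_seconds() // 60)


def installed_during_window(installed_at: datetime) -> bool:
    """True if a timezone-aware install timestamp falls inside the window."""
    return installed_at >= WINDOW_START and WINDOW_END >= installed_at


if __name__ == "__main__":
    sample = datetime(2026, 4, 22, 22, 30, tzinfo=timezone.utc)
    print(window_minutes(), installed_during_window(sample))
```

&lt;p&gt;An install logged at 22:30 UTC on April 22, for example, maps to 6:30 PM ET and falls inside the window.&lt;/p&gt;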

&lt;p&gt;Security researcher Adnan Khan noted this is the first known compromise of a package using npm's trusted publishing mechanism, which was designed to eliminate long-lived tokens. &lt;a href="https://beincrypto.com/bitwarden-cli-supply-chain-attack-crypto/" rel="noopener noreferrer"&gt;BeInCrypto&lt;/a&gt; That's significant. Trusted publishing was supposed to be the hardened path. The attackers didn't bypass it — they compromised the GitHub Actions workflow upstream of it, then let the trusted mechanism publish for them.&lt;/p&gt;

&lt;p&gt;TeamPCP's broader campaign separately targets crypto wallet data, including MetaMask, Phantom, and Solana wallet files, and has chained similar attacks against Trivy, Checkmarx, and LiteLLM since March 2026, targeting developer tools that sit deep in build pipelines. &lt;a href="https://tech.yahoo.com/cybersecurity/articles/bitwarden-cli-supply-chain-attack-142710104.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;Bitwarden confirmed the incident and framed it carefully: no end-user vault data was accessed and no production systems were compromised. That's true, and it matters. But the more important story here isn't about Bitwarden specifically.&lt;/p&gt;

&lt;p&gt;The attack is another example of the increasing cybersecurity risks to CI/CD architectures as they become more foundational in the software development pipeline, with threat actors expanding their targeting of them in supply chain campaigns. &lt;a href="https://securityboulevard.com/2026/04/bitwarden-cli-compromise-linked-to-ongoing-checkmarx-supply-chain-campaign/" rel="noopener noreferrer"&gt;Security Boulevard&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The vector — GitHub Actions compromise leading to poisoned npm releases — is repeatable. It has been used against Trivy, Checkmarx's own tooling, and LiteLLM. The Bitwarden compromise isn't an isolated incident; it's the latest iteration of a campaign that is actively refining its technique against high-trust developer tooling.&lt;/p&gt;

&lt;p&gt;And the MCP angle is worth flagging: the malicious &lt;code&gt;bw1.js&lt;/code&gt; payload shares core infrastructure with the previously analyzed &lt;code&gt;mcpAddon.js&lt;/code&gt;, including an identical C2 endpoint. &lt;a href="https://cybersecuritynews.com/bitwarden-cli-compromised/" rel="noopener noreferrer"&gt;Cyber Security News&lt;/a&gt; As MCP servers proliferate across developer toolchains, they're becoming targets in the same supply chain vector. The attack surface is expanding in lockstep with adoption.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access (What To Do Right Now)
&lt;/h2&gt;

&lt;p&gt;If you installed &lt;code&gt;@bitwarden/cli@2026.4.0&lt;/code&gt; during the window: treat your environment as fully compromised. Rotate every secret the machine had access to — GitHub tokens, npm credentials, cloud provider keys, SSH keys, everything.&lt;/p&gt;

&lt;p&gt;Socket recommends downgrading to version 2026.3.0 or switching to official signed binaries from Bitwarden's website. &lt;a href="https://beincrypto.com/bitwarden-cli-supply-chain-attack-crypto/" rel="noopener noreferrer"&gt;BeInCrypto&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On endpoints and runners, hunt for outbound connections to &lt;code&gt;audit[.]checkmarx[.]cx&lt;/code&gt;, execution of Bun where it is not normally used, and access to files such as &lt;code&gt;.npmrc&lt;/code&gt;, &lt;code&gt;.git-credentials&lt;/code&gt;, &lt;code&gt;.env&lt;/code&gt;, and cloud credential stores. For GitHub Actions, review whether any unapproved workflows were created on transient branches. &lt;a href="https://socket.dev/blog/bitwarden-cli-compromised" rel="noopener noreferrer"&gt;Socket&lt;/a&gt;&lt;/p&gt;
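
&lt;p&gt;A rough first pass at that hunt can be scripted. This sketch assumes an npm lockfile in the v2/v3 layout (a top-level &lt;code&gt;packages&lt;/code&gt; map keyed by install path) and plain-text proxy or DNS logs. The helper names are made up for illustration, and it is no substitute for proper endpoint or runner telemetry:&lt;/p&gt;

```python
import json

AFFECTED_NAME = "@bitwarden/cli"
AFFECTED_VERSION = "2026.4.0"
# Kept defanged in source text; re-fanged at runtime for matching.
C2_DOMAIN = "audit[.]checkmarx[.]cx".replace("[.]", ".")


def lockfile_hits(lock_text: str) -> list[str]:
    """Return install paths in a package-lock.json that pin the affected version."""
    packages = json.loads(lock_text).get("packages", {})
    return [
        path
        for path, meta in packages.items()
        if path.endswith("node_modules/" + AFFECTED_NAME)
        and meta.get("version") == AFFECTED_VERSION
    ]


def log_hits(lines: list[str]) -> list[str]:
    """Return log lines that mention the exfiltration domain."""
    return [line for line in lines if C2_DOMAIN in line.lower()]
```

&lt;p&gt;Feed &lt;code&gt;lockfile_hits&lt;/code&gt; the contents of each &lt;code&gt;package-lock.json&lt;/code&gt; you can find on developer machines and runners. Any non-empty result, or any matching log line, is a reason to start rotating secrets.&lt;/p&gt;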

&lt;p&gt;A CVE for Bitwarden CLI version 2026.4.0 is being issued in connection with this incident. &lt;a href="https://cyberinsider.com/bitwarden-cli-backdoored-in-checkmarx-supply-chain-attack/" rel="noopener noreferrer"&gt;CyberInsider&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;The Bitwarden CLI attack is a clean demonstration of where the real risk in developer infrastructure lives right now: not in the applications themselves, but in the build systems that ship them. One poisoned GitHub Actions workflow, one 93-minute publish window, and a trusted tool becomes a credential harvester running inside your pipeline with your own permissions.&lt;/p&gt;

&lt;p&gt;The question isn't whether your password manager's vault is safe. It's whether you can verify the integrity of every tool in your build chain. Most teams can't.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow for more coverage on MCP, agentic AI, and AI infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>discuss</category>
      <category>devops</category>
    </item>
    <item>
      <title>Google Just Split Its TPU Into Two Chips. Here's What That Actually Signals About the Agentic Era.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:47:11 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/google-just-split-its-tpu-into-two-chips-heres-what-that-actually-signals-about-the-agentic-era-2485</link>
      <guid>https://dev.to/om_shree_0709/google-just-split-its-tpu-into-two-chips-heres-what-that-actually-signals-about-the-agentic-era-2485</guid>
      <description>&lt;p&gt;Training and inference have always had different physics. Google just decided to stop pretending one chip could handle both.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://cloud.google.com/blog/products/compute/ai-infrastructure-at-next26" rel="noopener noreferrer"&gt;Google Cloud Next '26&lt;/a&gt; on April 22, Google announced the eighth generation of its Tensor Processing Units — but for the first time in TPU history, that generation isn't a single chip. It's two: the &lt;strong&gt;TPU 8t&lt;/strong&gt; for training, and the &lt;strong&gt;TPU 8i&lt;/strong&gt; for inference and agentic workloads. That architectural split is the most meaningful signal in this announcement, and most coverage has buried it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;Standard RAG retrieves. Agents reason, plan, execute, and loop back. That distinction matters enormously at the infrastructure level.&lt;/p&gt;

&lt;p&gt;Chat-based AI inference has a relatively forgiving latency budget. A user submits a prompt, waits a second or two, reads the response. Agentic workflows don't work that way. A primary agent decomposes a goal into subtasks, dispatches specialized agents, collects results, evaluates them, and decides what to do next — all in real time, potentially across thousands of concurrent sessions. The per-step latency compounds. If your inference chip is optimized for throughput over latency (which it was, because that's what training needs), you end up with agent loops that are sluggish, expensive, and hard to scale.&lt;/p&gt;
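
&lt;p&gt;To make the latency point concrete, here is a toy model of one such loop. Everything in it is an assumption for illustration: the numbers are invented, subtasks are treated as perfectly parallel, and nothing here reflects measured TPU behavior:&lt;/p&gt;

```python
def round_latency_s(plan_s: float, subtask_s: list[float], evaluate_s: float) -> float:
    """One agent round: plan, fan out subtasks in parallel, then evaluate."""
    # Parallel subtasks finish when the slowest one does.
    return plan_s + max(subtask_s) + evaluate_s


def loop_latency_s(rounds: int, plan_s: float, subtask_s: list[float], evaluate_s: float) -> float:
    """Rounds run back to back, so per-round latency adds up linearly."""
    return rounds * round_latency_s(plan_s, subtask_s, evaluate_s)


# Hypothetical latencies in seconds, chosen only for illustration.
baseline = loop_latency_s(10, 0.5, [1.0, 2.0, 1.5], 0.5)
faster = loop_latency_s(10, 0.25, [0.5, 1.0, 0.75], 0.25)
print(baseline, faster)
```

&lt;p&gt;With these made-up inputs, halving every step takes a ten-round loop from 30 seconds to 15. Multiply that by thousands of concurrent sessions and per-step latency becomes the number that matters.&lt;/p&gt;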

&lt;p&gt;Previous TPU generations, including last year's &lt;a href="https://cloud.google.com/tpu" rel="noopener noreferrer"&gt;Ironwood&lt;/a&gt;, were pitched as unified flagship chips. Google's internal experience running Gemini, its consumer AI products, and increasingly complex agent workloads apparently showed that a single architecture forces uncomfortable trade-offs. So they split the roadmap.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the TPU 8t and TPU 8i Actually Work
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;TPU 8t&lt;/strong&gt; is the training powerhouse. A single superpod packs 9,600 chips to provide 121 exaflops of compute and two petabytes of shared memory connected through high-speed inter-chip interconnects (ICI). That's roughly 3x higher compute performance than the previous generation, with doubled ICI bandwidth so that massive models hit near-linear scaling. At the cluster level, Google can now connect more than one million TPUs across multiple data center sites into a training cluster, essentially transforming globally distributed infrastructure into one seamless supercomputer.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;TPU 8i&lt;/strong&gt; is the more architecturally interesting chip. With 3x more on-chip SRAM over the previous generation, TPU 8i can host a larger KV Cache entirely on silicon, significantly reducing the idle time of the cores during long-context decoding. The key innovation is a component called the &lt;strong&gt;Collectives Acceleration Engine (CAE)&lt;/strong&gt; — a dedicated unit that aggregates results across cores with near-zero latency, specifically accelerating the reduction and synchronization steps required during autoregressive decoding and chain-of-thought processing. The result: on-chip latency of collectives drops by 5x.&lt;/p&gt;

&lt;p&gt;Google also redesigned the inter-chip network topology specifically for 8i. The previous 3D torus topology prioritized bandwidth. For 8i, Google changed how chips connect together using fully connected boards aggregated into groups — a high-radix design called Boardfly that connects up to 1,152 chips together, reducing the network diameter and the number of hops a data packet must take to cross the system, achieving up to a 50% improvement in latency for communication-intensive workloads.&lt;/p&gt;

&lt;p&gt;In raw spec terms, the 8i delivers 9.8x the FP8 EFlops per pod, 6.8x the HBM capacity per pod, and a pod size that grows 4.5x from 256 to 1,152 chips compared to the prior generation.&lt;/p&gt;
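
&lt;p&gt;The pod-level figures quoted above are easier to compare once you divide them out. A quick back-of-the-envelope sketch, assuming decimal units and an even split across chips (Google has not published per-chip numbers in this form, so treat the results as rough):&lt;/p&gt;

```python
# Vendor-reported figures quoted in this article.
SUPERPOD_CHIPS_8T = 9_600
SUPERPOD_EXAFLOPS_8T = 121
SUPERPOD_MEMORY_PB_8T = 2

POD_CHIPS_PREV = 256
POD_CHIPS_8I = 1_152

# Assumption: 1 exaflop = 1,000 petaflops, 1 PB = 1,000,000 GB,
# and compute and memory are divided evenly across chips.
pflops_per_chip = SUPERPOD_EXAFLOPS_8T * 1_000 / SUPERPOD_CHIPS_8T
memory_gb_per_chip = SUPERPOD_MEMORY_PB_8T * 1_000_000 / SUPERPOD_CHIPS_8T
pod_growth = POD_CHIPS_8I / POD_CHIPS_PREV

print(f"{pflops_per_chip:.1f} PFLOPS per 8t chip")
print(f"{memory_gb_per_chip:.0f} GB of shared memory per 8t chip")
print(f"{pod_growth}x pod size for 8i")
```

&lt;p&gt;That works out to roughly 12.6 petaflops and about 208 GB of shared memory per 8t chip, and it confirms the 4.5x pod growth for 8i.&lt;/p&gt;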

&lt;p&gt;The economic headline: TPU 8i delivers 80% better performance per dollar for inference than the prior generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Teams Are Actually Using This For
&lt;/h2&gt;

&lt;p&gt;The split architecture is most directly useful for three categories of workload.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontier model training&lt;/strong&gt; at labs and large enterprises. TPU 8t was designed in partnership with Google DeepMind and is built to efficiently train world models like DeepMind's Genie 3, enabling millions of agents to practice and refine their reasoning in diverse simulated environments. If you're training large proprietary models, the 8t's near-linear scaling at million-chip clusters changes the economics of when you can afford to retrain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High-concurrency agentic inference&lt;/strong&gt; is where the 8i shines. Multi-agent pipelines, MoE model serving, chain-of-thought reasoning loops — all of these hammer the all-to-all communication patterns that the Boardfly topology specifically addresses. The implication is lower latency per agent step at scale, which compounds significantly when you're running thousands of parallel agent sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reinforcement learning post-training&lt;/strong&gt; sits between the two. Google's new Axion-powered N4A CPU instances handle the complex logic, tool-calls, and feedback loops surrounding the core AI model — offering up to 30% better price-performance than comparable agent workloads on other hyperscalers. The intended stack is TPU 8t for pre-training, TPU 8i for RL and inference, and Axion for orchestration logic.&lt;/p&gt;

&lt;p&gt;Google is also wrapping all of this in upgraded networking. The Virgo Network's collapsed fabric architecture offers 4x the bandwidth of previous generations and can connect 134,000 TPUs into a single fabric in a single data center. Storage got overhauled too: Google Cloud Managed Lustre now delivers 10 TB/s of bandwidth — a 10x improvement over last year — with sub-millisecond latency via TPUDirect and RDMA, allowing data to bypass the host and move directly to the accelerators.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The obvious read on this announcement is "Google vs. Nvidia." That framing is mostly wrong, and Google itself isn't pretending otherwise. Google promises its cloud will have Nvidia's latest chip, Vera Rubin, available later this year, and the two companies are co-engineering the open-source Falcon networking protocol via the Open Compute Project. This is not a replacement strategy — it's a portfolio strategy.&lt;/p&gt;

&lt;p&gt;The more important signal is what the architectural split says about where the AI workload is going. Seven generations of TPUs were built on the assumption that training and inference are different phases of the same pipeline — you train, then you serve. The 8t/8i split encodes a different belief: that agentic inference is so architecturally distinct from training that they require fundamentally different silicon. That's a bet on the permanence of agentic workflows, not just a current optimization.&lt;/p&gt;

&lt;p&gt;For enterprise buyers, the eighth-generation TPUs reframe the 2026–2027 cloud evaluation in concrete ways: teams training large proprietary models should look at 8t availability windows and Virgo networking access. Teams serving agents or reasoning workloads should evaluate 8i on Vertex AI and whether HBM-per-pod sizing fits their context windows.&lt;/p&gt;

&lt;p&gt;There's also a vertical integration argument here that's easy to underestimate. Google co-designs its chips with DeepMind, runs them on its own networking fabric, manages its own storage layer, and orchestrates everything through GKE. Native PyTorch support for TPU — TorchTPU — is now in preview with select customers, allowing models to run on TPUs as-is with full support for native PyTorch Eager Mode. That removes one of the biggest friction points developers have historically had with TPUs: you no longer need to rewrite your training code to access Google's silicon. Combined with vLLM support on TPU, the migration path from an Nvidia-based setup is shorter than it's ever been.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;TPU 8t and TPU 8i will be available to Cloud customers later in 2026. You can request more information now to prepare for their general availability. The chips are integrated into Google's &lt;a href="https://cloud.google.com/solutions/ai-hypercomputer" rel="noopener noreferrer"&gt;AI Hypercomputer&lt;/a&gt; stack, supporting JAX, PyTorch, vLLM, and XLA. Deployment options range from &lt;a href="https://cloud.google.com/vertex-ai" rel="noopener noreferrer"&gt;Vertex AI&lt;/a&gt; managed services to &lt;a href="https://cloud.google.com/kubernetes-engine" rel="noopener noreferrer"&gt;GKE&lt;/a&gt; for teams that want infrastructure-level control.&lt;/p&gt;

&lt;p&gt;The honest caveat: these are self-reported benchmarks against Google's own prior generation. Independent third-party numbers from cloud customers and evaluators will emerge over the next two quarters, and those will be the numbers that actually matter for procurement decisions.&lt;/p&gt;

&lt;p&gt;The split TPU roadmap isn't just a chip announcement — it's Google encoding its architectural thesis about what AI infrastructure looks like in an agentic world directly into silicon. Every other hyperscaler is going to have to answer the same question: do you build one chip to do everything, or do you specialize?&lt;/p&gt;


</description>
      <category>ai</category>
      <category>discuss</category>
      <category>cloud</category>
      <category>google</category>
    </item>
    <item>
      <title>NeoCognition Just Raised $40M to Fix the One Thing Every AI Agent Gets Wrong</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:34:27 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/neocognition-just-raised-40m-to-fix-the-one-thing-every-ai-agent-gets-wrong-i1n</link>
      <guid>https://dev.to/om_shree_0709/neocognition-just-raised-40m-to-fix-the-one-thing-every-ai-agent-gets-wrong-i1n</guid>
      <description>&lt;p&gt;Every AI agent demo looks impressive until you actually depend on one. That 50% task completion rate you've quietly accepted as "normal"? &lt;a href="https://techcrunch.com/2026/04/21/ai-research-lab-neocognition-lands-40m-seed-to-build-agents-that-learn-like-humans/" rel="noopener noreferrer"&gt;NeoCognition&lt;/a&gt; just called it out directly, and raised $40 million to do something about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;The foundational critique that NeoCognition is building on is blunt: current agents — whether from Claude Code, OpenClaw, or Perplexity's computer tools — successfully complete tasks as intended only about 50% of the time. That is not a UX problem or a prompt engineering problem. It's a structural one. Today's agents are stateless generalists. They bring no accumulated knowledge of your environment, your workflows, or your domain's specific constraints to each task. Every time you invoke one, it's starting from scratch.&lt;/p&gt;

&lt;p&gt;The standard industry response to this has been fine-tuning — custom-engineering an agent for a specific vertical and hoping it holds. That works until the domain shifts, the tooling changes, or you need to deploy the same agent somewhere new. Then you're back to zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  How NeoCognition Actually Works
&lt;/h2&gt;

&lt;p&gt;NeoCognition was started by Yu Su, Xiang Deng, and Yu Gu, who all worked together in Su's AI agent lab at Ohio State University. Su's team began developing LLM-based agents before the ChatGPT moment, and their research — including Mind2Web and MMMU — is now used by OpenAI, Anthropic, and Google. This is not a product team that pivoted into agents. It's the research behind the agents you're already using, now building something opinionated about what those agents got wrong.&lt;/p&gt;

&lt;p&gt;The core thesis is drawn from how humans actually acquire expertise. NeoCognition's agents continuously learn the structure, workflows, and constraints of the environments they operate in, and specialize into domain experts by learning a world model of work. The phrase "world model" is doing significant work here. Rather than applying general reasoning to every task, these agents are designed to build an internal map of a specific micro-environment — its rules, its dependencies, its edge cases — and continuously refine that map through experience.&lt;/p&gt;

&lt;p&gt;The Palo Alto startup argues that its agents learn on the job as specialists rather than relying on fixed general training, which is the architectural distinction that matters. Fixed training is a snapshot. A world model grows.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Enterprises Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;NeoCognition's primary target is the enterprise market, and specifically the SaaS layer. NeoCognition intends to sell its agent systems primarily to enterprises, including established SaaS companies, which can use them to build agent workers or to enhance existing product offerings. The framing here is interesting: they're not just selling agents to enterprises, they're selling the infrastructure for SaaS companies to make their own products agentic.&lt;/p&gt;

&lt;p&gt;The Vista Equity Partners participation is strategic, not just financial. As one of the largest private equity firms in the software space, Vista can provide NeoCognition with direct access to a vast portfolio of companies looking to modernize their products with AI. That's a go-to-market lever, not just a check. You don't close Vista for the cap table optics — you close them because they own the distribution you need.&lt;/p&gt;

&lt;p&gt;The deeper implication for enterprises is the safety argument. Deeper understanding of their environments enables NeoCognition's agents to be more responsible and safer actors in high-stake settings. An agent that understands why a workflow exists — not just what the workflow is — is less likely to take a technically correct action that's contextually wrong. That's the difference between a tool and a trusted system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The investor list deserves more attention than most coverage is giving it. Angel investors and founding advisors include Lip-Bu Tan, CEO of Intel, Ion Stoica, co-founder and executive chairman of Databricks, and leading AI researchers like Dawn Song, Ruslan Salakhutdinov, and Luke Zettlemoyer. That last trio — Song, Salakhutdinov, Zettlemoyer — are foundational researchers in modern deep learning and NLP. When researchers of that caliber put their names on a company, they're endorsing the technical thesis, not just the team.&lt;/p&gt;

&lt;p&gt;The timing reflects a broader pattern in AI investment in 2026: capital is increasingly flowing not towards frontier model development — dominated by a small number of well-capitalized labs — but towards the infrastructure and agent layer above it. The model wars are effectively over for now. The next real competition is in what those models can reliably &lt;em&gt;do&lt;/em&gt;, and that's an infrastructure and learning problem, not a parameter-count problem.&lt;/p&gt;

&lt;p&gt;What NeoCognition is proposing — agents that build structured world models of their operating environments — is also the missing architectural primitive for MCP-based agent pipelines. Right now, most agentic systems using MCP are still stateless: each tool call happens in context, but the agent isn't &lt;em&gt;learning&lt;/em&gt; the tool ecosystem it operates in. An agent layer that builds persistent, structured knowledge of its environment and the tools available to it would meaningfully change what's achievable in production agentic workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;NeoCognition has just emerged from stealth, so there's no public product available yet. The company currently has about 15 employees, the majority of whom hold PhDs. This is explicitly still a research-to-product transition — the $40M is funding that transition. Enterprise access will likely come through direct partnership channels, given the Vista relationship and the SaaS-first go-to-market. Developers wanting to follow the research can track Su's prior work through his &lt;a href="https://ysu1989.github.io/" rel="noopener noreferrer"&gt;Ohio State lab page&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;The 50% reliability ceiling on current agents isn't a model problem — it's a memory and specialization problem. NeoCognition is making a structural bet that the next unlock in agent reliability isn't more parameters; it's agents that actually learn where they're deployed. If they're right, the companies building on today's stateless agent architectures are building on borrowed time.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>agents</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Google's Project Jitro Just Redefined What a Coding Agent Is. Here's What It Actually Changes.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Wed, 22 Apr 2026 03:35:56 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/googles-project-jitro-just-redefined-what-a-coding-agent-is-heres-what-it-actually-changes-4oc3</link>
      <guid>https://dev.to/om_shree_0709/googles-project-jitro-just-redefined-what-a-coding-agent-is-heres-what-it-actually-changes-4oc3</guid>
      <description>&lt;p&gt;With Jules, you tell the AI what to do. With Jitro, you tell it what you want. That gap, between task execution and outcome ownership, is the entire bet Google is making with its next-generation coding agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With Every Coding Agent Right Now
&lt;/h2&gt;

&lt;p&gt;Every major AI coding tool today (&lt;a href="https://github.com/features/copilot" rel="noopener noreferrer"&gt;GitHub Copilot&lt;/a&gt;, &lt;a href="https://cursor.sh/" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, &lt;a href="https://codeium.com/windsurf" rel="noopener noreferrer"&gt;Windsurf&lt;/a&gt;, &lt;a href="https://openai.com/codex" rel="noopener noreferrer"&gt;OpenAI's Codex&lt;/a&gt;) operates on the same underlying model: you define the work, the agent does it. You write the prompt, you review the output, you write the next prompt. The developer is still the scheduler, the project manager, and the QA team. The AI is a very fast, very capable executor.&lt;/p&gt;

&lt;p&gt;That's genuinely useful. But it hits a ceiling. When your goal is "reduce memory leaks in the backend by 20%" or "get our accessibility score to 100%," you don't want to translate that into ten sequential prompts across a week. You want to hand it off. No current tool actually lets you do that.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Project Jitro Actually Works
&lt;/h2&gt;

&lt;p&gt;Google is internally developing Project Jitro as an autonomous AI system that moves beyond prompt-based coding to independently execute high-level development goals. It's built on &lt;a href="https://jules.google/" rel="noopener noreferrer"&gt;Jules&lt;/a&gt;, Google's existing asynchronous coding agent — but the architecture is meaningfully different.&lt;/p&gt;

&lt;p&gt;Rather than asking developers to manually instruct an agent on what to build or fix, Jitro (effectively Jules V2) appears designed around high-level goal-setting: KPI-driven development, where the agent autonomously identifies what needs to change in a codebase to move a metric in the right direction.&lt;/p&gt;

&lt;p&gt;The workspace model is the critical piece. A dedicated workspace for the agent suggests Google envisions Jitro as a persistent collaborator rather than a one-shot tool. Early signals point to a workspace where developers can list goals, track insights, and configure tool integrations — a layer of continuity that current coding agents don't offer.&lt;/p&gt;

&lt;p&gt;From leaked tooling definitions, the Jitro workspace API exposes operations like: list goals, create a goal after helping articulate it clearly, list insights, get update history for an insight, and list configured tool integrations including MCP remote servers and API connections. That last item is significant — Jitro integrates through Model Context Protocol (MCP) remote servers and various API connections to ensure it has the context it needs.&lt;/p&gt;
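
&lt;p&gt;To picture what that surface might look like, here is an in-memory sketch of the five operations. It is entirely hypothetical: every class, method, and field name below is invented for illustration, Google has published no Jitro API, and &lt;code&gt;record_insight&lt;/code&gt; is added only so the sketch has data to return:&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    goal_id: str
    description: str


@dataclass
class Insight:
    insight_id: str
    summary: str
    updates: list[str] = field(default_factory=list)


class InMemoryWorkspace:
    """Hypothetical stand-in for the workspace operations described above."""

    def __init__(self, integrations: list[str]) -> None:
        self._goals: list[Goal] = []
        self._insights: dict[str, Insight] = {}
        self._integrations = list(integrations)

    def list_goals(self) -> list[Goal]:
        return list(self._goals)

    def create_goal(self, description: str) -> Goal:
        # The real agent reportedly helps articulate the goal; this only trims it.
        goal = Goal(f"goal-{len(self._goals) + 1}", description.strip())
        self._goals.append(goal)
        return goal

    def record_insight(self, insight_id: str, summary: str) -> Insight:
        # Not one of the leaked operations; a helper to populate the sketch.
        insight = self._insights.setdefault(insight_id, Insight(insight_id, summary))
        insight.summary = summary
        insight.updates.append(summary)
        return insight

    def list_insights(self) -> list[Insight]:
        return list(self._insights.values())

    def get_insight_history(self, insight_id: str) -> list[str]:
        return list(self._insights[insight_id].updates)

    def list_integrations(self) -> list[str]:
        # Per the leak, entries would include MCP remote servers and API connections.
        return list(self._integrations)
```

&lt;p&gt;The point of the sketch is the shape, not the names: goals and insights persist between calls, which is exactly the continuity a one-shot prompt interface lacks.&lt;/p&gt;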

&lt;p&gt;Transparency is baked in by design. When you set a goal in the Jitro workspace, the AI doesn't just operate silently — it surfaces its reasoning process, explaining why it chose a specific library or restructured a database table. You stay in control by approving the general direction, while the AI handles the execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Engineering Teams Are Actually Going to Use This For
&lt;/h2&gt;

&lt;p&gt;The use cases where this model genuinely wins are the ones that are currently painful in proportion to their importance: reducing error rates becomes the objective instead of debugging individual functions; improving test coverage becomes the target instead of writing test cases manually across multiple files; increasing conversions becomes the priority instead of adjusting isolated page elements without strategy alignment.&lt;/p&gt;

&lt;p&gt;The primary beneficiaries would be engineering teams managing large codebases where incremental improvements compound — performance optimization, test coverage, accessibility compliance.&lt;/p&gt;

&lt;p&gt;Jules V1 already demonstrated that the asynchronous model works. During the beta, thousands of developers tackled tens of thousands of tasks, resulting in over 140,000 code improvements shared publicly. Jules is now out of beta and available across free and paid tiers, integrated into Google AI Pro and Ultra subscriptions. Jitro inherits that async foundation and extends it to goals that span sessions, not just tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The shift from prompt-driven to goal-driven AI isn't a UX improvement — it's a change in the unit of work. Right now, developer productivity is measured by how good your prompts are. Jitro changes that to how clearly you can define outcomes.&lt;/p&gt;

&lt;p&gt;Routine tasks like debugging, writing boilerplate code, or running tests may increasingly be handled by AI systems. As a result, developers may shift toward higher-level responsibilities — guiding AI systems, reviewing outputs, and aligning technical work with business goals.&lt;/p&gt;

&lt;p&gt;This marks a departure from the task-level paradigm seen across competitors like GitHub Copilot, Cursor, and even OpenAI's Codex agent, all of which still rely on developers defining specific work items. If Jitro ships as described, it resets what the category baseline looks like. Every competitor will be asked why their tool still needs a prompt for every action.&lt;/p&gt;

&lt;p&gt;The MCP integration angle is also worth watching closely. A goal-oriented coding agent that natively connects to MCP remote servers can reach across your entire toolchain — CI/CD, monitoring, issue trackers — rather than reasoning only over local files. That's a different class of tool.&lt;/p&gt;

&lt;p&gt;The honest caveat: the risk is that autonomous goal-pursuing agents introduce unpredictable changes, and trust will be the key barrier to adoption. None of the UI is visible yet, so the full scope remains unclear. There's a real question about what "approve the direction" actually looks like in practice when the agent is making dozens of decisions across a large codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;Project Jitro is still pre-launch. The upcoming experience is expected to launch under a waitlist, with Google I/O 2026 on May 19 as the likely announcement moment alongside broader Gemini ecosystem updates. The Jules team has published a waitlist page with messaging that reads: "Manually prompting your agents is so… 2025."&lt;/p&gt;

&lt;p&gt;Current &lt;a href="https://jules.google/" rel="noopener noreferrer"&gt;Jules&lt;/a&gt; users on Google AI Pro and Ultra are the most likely early access recipients. No public timeline beyond "2026" has been confirmed.&lt;/p&gt;




&lt;p&gt;The line between "AI that helps you code" and "AI that owns a development objective" is the line Jitro is trying to cross. Whether it lands or not at I/O, the framing alone forces every other coding tool to answer the same question: how long until your users stop writing prompts?&lt;/p&gt;


</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Anthropic's Most Dangerous Model Just Got Accessed by People Who Weren't Supposed to Have It</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Wed, 22 Apr 2026 01:28:17 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/anthropics-most-dangerous-model-just-got-accessed-by-people-who-werent-supposed-to-have-it-14dn</link>
      <guid>https://dev.to/om_shree_0709/anthropics-most-dangerous-model-just-got-accessed-by-people-who-werent-supposed-to-have-it-14dn</guid>
      <description>&lt;p&gt;Anthropic built a model so dangerous they refused to release it publicly. Then a Discord group got in anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Model They Wouldn't Ship
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Claude Mythos Preview&lt;/a&gt; is Anthropic's most capable model to date for coding and agentic tasks. &lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt; But it was never meant to reach the public. During testing, Mythos improved to the point where it mostly saturated existing cybersecurity benchmarks, prompting Anthropic to shift focus to novel real-world security tasks — specifically zero-day vulnerabilities, bugs that were not previously known to exist. &lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What they found was stark. Mythos Preview had already identified thousands of zero-day vulnerabilities across critical infrastructure — many of them critical — in every major operating system and every major web browser. &lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt; In one documented case, Mythos fully autonomously identified and exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS. No human was involved in either the discovery or exploitation of this vulnerability after the initial request to find the bug. &lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is why the model never went public.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project Glasswing: The Controlled Release
&lt;/h2&gt;

&lt;p&gt;Announced on April 7, Mythos was deployed as part of Anthropic's "Project Glasswing," a controlled initiative under which select organizations are permitted to use the unreleased Claude Mythos Preview model for defensive cybersecurity. &lt;a href="https://www.yahoo.com/news/articles/anthropics-mythos-model-accessed-unauthorized-214920132.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Launch partners included Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Access was also extended to over 40 additional organizations that build or maintain critical software infrastructure. &lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt; The logic was clear: get defenders ahead of the curve before the capabilities proliferate to actors who won't use them carefully.&lt;/p&gt;

&lt;p&gt;Claude Mythos Preview is available to Project Glasswing participants at $25/$125 per million input/output tokens, accessible via the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. Anthropic committed $100M in model usage credits to cover Project Glasswing throughout the research preview. &lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The perimeter was tight by design. The news today is that it didn't hold.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Discord Group Got In
&lt;/h2&gt;

&lt;p&gt;A "private online forum," the members of which have not been publicly identified, managed to gain access to the tool through a third-party vendor. The unauthorized group tried a number of different strategies to gain access to the model, including using "access" enjoyed by a person currently employed at a third-party contractor that works for Anthropic. &lt;a href="https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims/" rel="noopener noreferrer"&gt;TechCrunch&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Members of the group are part of a Discord channel that seeks out information about unreleased AI models. The group has been using Mythos regularly since gaining access to it, and provided evidence to Bloomberg in the form of screenshots and a live demonstration of the software. &lt;a href="https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims/" rel="noopener noreferrer"&gt;TechCrunch&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The method they used to find the endpoint is particularly revealing. The group, which gained access on the very same day Mythos was publicly announced, "made an educated guess about the model's online location based on knowledge about the format Anthropic has used for other models." &lt;a href="https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims/" rel="noopener noreferrer"&gt;TechCrunch&lt;/a&gt; This wasn't a sophisticated breach — it was pattern recognition applied to a known naming convention. The group reportedly described themselves as being interested in exploring new models, not causing harm.&lt;/p&gt;

&lt;p&gt;Anthropic said it is investigating the claims and, so far, has seen no sign that its own systems were affected — the allegation points to possible misuse of access outside Anthropic's core network, not a confirmed breach of the company's internal defenses. &lt;a href="https://www.prismnews.com/news/anthropic-probes-claims-of-unauthorized-access-to" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The immediate reassurance — no core systems compromised, the group wasn't malicious — is accurate but beside the point. The problem isn't what this specific group did. It's what this incident reveals about the entire premise of Project Glasswing.&lt;/p&gt;

&lt;p&gt;Anthropic's controlled release strategy rests on the assumption that access can be meaningfully gated through vendor relationships. A small group of unauthorized users reportedly accessed Mythos on the same day Anthropic announced limited testing &lt;a href="https://www.prismnews.com/news/anthropic-probes-claims-of-unauthorized-access-to" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt; — meaning the access controls failed within hours of the first public announcement, before most Glasswing partners had even begun their work. If the group could guess the model's endpoint from Anthropic's known URL patterns, so can threat actors with more resources and worse intentions.&lt;/p&gt;
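&lt;p&gt;The point about guessable endpoints can be made concrete with a minimal sketch. This is not a description of how Anthropic or its vendors actually gate access; the function name, the caller IDs, and the model ID are all hypothetical. It only illustrates the general principle that knowing where a model lives should not matter if every call is checked against an explicit, deny-by-default grant list.&lt;/p&gt;

```python
def is_authorized(caller_id, model_id, grants):
    """Allow a call only if this caller was explicitly granted this model.

    `grants` maps caller IDs to the set of model IDs they may use.
    Unknown callers and unlisted models are denied by default.
    """
    return model_id in grants.get(caller_id, set())


# Hypothetical identifiers, invented for illustration only.
grants = {"partner-a": {"restricted-model"}}

# A caller on the grant list is allowed.
print(is_authorized("partner-a", "restricted-model", grants))
# A caller who merely guessed the model ID is not.
print(is_authorized("curious-guesser", "restricted-model", grants))
```

&lt;p&gt;In a setup like this, guessing the identifier gets an outsider nothing on its own. The weak point shifts to who holds a granted identity, which is exactly where the contractor angle in this incident comes in.&lt;/p&gt;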

&lt;p&gt;There's also a pattern here worth naming. This is the third significant information control failure at Anthropic in recent weeks. The Claude Code source leak in March exposed 512,000 lines of unobfuscated TypeScript via a missing .npmignore entry. Before that, a draft blog post describing Mythos as "by far the most powerful AI model" ever built at Anthropic was left in a publicly accessible data store. That March 26 leak of draft materials — which Anthropic said resulted from human error in its content-management configuration — was actually Mythos's first public exposure. &lt;a href="https://www.prismnews.com/news/anthropic-probes-claims-of-unauthorized-access-to" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then there's the government subplot. The National Security Agency is using Mythos Preview despite top officials at the Department of Defense — which oversees the NSA — insisting Anthropic is a "supply chain risk." The department moved in February to cut off Anthropic and force its vendors to follow suit. The military is now broadening its use of Anthropic's tools while simultaneously arguing in court that using those tools threatens U.S. national security. &lt;a href="https://www.axios.com/2026/04/19/nsa-anthropic-mythos-pentagon" rel="noopener noreferrer"&gt;Axios&lt;/a&gt; Meanwhile, CISA — the agency whose entire mandate is critical infrastructure protection — reportedly does not have access to the model. &lt;a href="https://www.axios.com/2026/04/21/cisa-anthropic-mythos-ai-security" rel="noopener noreferrer"&gt;Axios&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The entity designed to defend critical systems can't get in. A Discord group can.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Anthropic Actually Said
&lt;/h2&gt;

&lt;p&gt;"We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," an Anthropic spokesperson said. The company found no evidence that the supposedly unauthorized activity impacted Anthropic's systems at all. &lt;a href="https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims/" rel="noopener noreferrer"&gt;TechCrunch&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's a factually careful statement. It's also a familiar shape: acknowledge the narrow, deny the broader implication. Anthropic has been here before.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vendor Problem Nobody Wants to Solve
&lt;/h2&gt;

&lt;p&gt;The deeper structural issue is that enterprise AI deployments at frontier capability levels require trust chains that extend across dozens of organizations. Anthropic's Glasswing rollout, which extends beyond the launch partners to more than 40 additional organizations, means more than 40 distinct security postures, more than 40 sets of contractors, and more than 40 potential lateral entry points for anyone who knows what they're looking for.&lt;/p&gt;

&lt;p&gt;Anthropic said it does not plan to make Mythos Preview generally available, but its eventual goal is to enable users to safely deploy Mythos-class models at scale — for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. &lt;a href="https://simonwillison.net/2026/Apr/7/project-glasswing/" rel="noopener noreferrer"&gt;Simon Willison&lt;/a&gt; That goal is legitimate. But reaching it requires solving vendor access governance at a level the industry hasn't had to reckon with before. This incident is an early indication of what the stakes look like when the effort falls short.&lt;/p&gt;

&lt;p&gt;A model capable of finding zero-days in every major operating system and browser has now been accessed by people outside the intended perimeter. The question isn't whether the Discord group caused harm. It's whether the perimeter can hold when the people on the other side are actually trying.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The line between "interested in playing around" and "interested in breaking things" isn't enforced by intent. It's enforced by access controls. Anthropic's have now failed twice in the same month.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow for more coverage on MCP, agentic AI, and AI infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>discuss</category>
      <category>claude</category>
    </item>
    <item>
      <title>Anthropic Just Passed OpenAI in Revenue. Here's What Actually Built That Lead.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Mon, 20 Apr 2026 06:46:06 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/anthropic-just-passed-openai-in-revenue-heres-what-actually-built-that-lead-2kmo</link>
      <guid>https://dev.to/om_shree_0709/anthropic-just-passed-openai-in-revenue-heres-what-actually-built-that-lead-2kmo</guid>
      <description>&lt;p&gt;A year ago, the consensus was that OpenAI had an insurmountable lead. The brand. The user base. ChatGPT with hundreds of millions of users. The head start. In April 2026, Anthropic crossed $30 billion in annualized revenue and left OpenAI's $25 billion behind — the first time any rival has led this race since ChatGPT launched in November 2022.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Number That Shocked Even the Analysts
&lt;/h2&gt;

&lt;p&gt;Anthropic's annualized revenue run-rate hit $30 billion in April 2026, officially overtaking OpenAI's $25 billion — the first time any rival has surpassed OpenAI since ChatGPT launched in 2022. &lt;a href="https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/" rel="noopener noreferrer"&gt;Vucense&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Epoch AI had modeled it. Analysts debated the timing. Even the most optimistic assessments put the crossover in August 2026. It happened in April. &lt;a href="https://www.saastr.com/anthropic-just-passed-openai-in-revenue-while-spending-4x-less-to-train-their-models/" rel="noopener noreferrer"&gt;SaaStr&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The trajectory itself is the story. Anthropic went from $87 million run-rate in January 2024, to $1 billion by December 2024, to $9 billion by end of 2025, to $14 billion in February 2026, to $19 billion in March, to $30 billion in April. That last sequence — $14B to $30B in roughly 8 weeks — is hard to make sense of in traditional software terms. &lt;a href="https://www.saastr.com/anthropic-just-passed-openai-in-revenue-while-spending-4x-less-to-train-their-models/" rel="noopener noreferrer"&gt;SaaStr&lt;/a&gt;&lt;/p&gt;
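&lt;p&gt;To put that last jump in perspective, here is a quick back-of-the-envelope calculation using only the run-rate figures quoted above. It assumes the "roughly 8 weeks" is exactly 8 and that growth compounded at a constant weekly rate, which is a simplification rather than anything the sources claim.&lt;/p&gt;

```python
# Run-rate figures quoted above, in billions of dollars.
feb_run_rate = 14.0
apr_run_rate = 30.0
weeks_elapsed = 8  # "roughly 8 weeks" per the text; treated as exact here

# How many times larger the April figure is than the February one.
multiple = apr_run_rate / feb_run_rate
# Constant weekly growth rate that compounds to the same multiple.
weekly_growth = multiple ** (1 / weeks_elapsed) - 1

print(f"multiple: {multiple:.2f}x")
print(f"implied weekly growth: {weekly_growth:.1%}")
```

&lt;p&gt;Under those assumptions the run rate a little more than doubled, which works out to roughly 10% compounded growth every week.&lt;/p&gt;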

&lt;p&gt;For context: Salesforce took about 20 years to reach $30 billion in annual revenue. Anthropic did it in under 3 years from a standing start. &lt;a href="https://www.saastr.com/anthropic-just-passed-openai-in-revenue-while-spending-4x-less-to-train-their-models/" rel="noopener noreferrer"&gt;SaaStr&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Enterprise Bet That Everyone Underestimated
&lt;/h2&gt;

&lt;p&gt;OpenAI's revenue composition is more consumer-heavy, with ChatGPT Plus and Pro subscriptions making up a large share. Anthropic's composition runs roughly 80% enterprise — higher retention, lower churn, and contracts that expand over time rather than cancelling when novelty fades. &lt;a href="https://www.roborhythms.com/anthropic-revenue-30-billion-2026/" rel="noopener noreferrer"&gt;Robo Rhythms&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The customer numbers make this concrete. Enterprise customers spending over $1 million annually doubled to 1,000+ in under two months. Eight of the Fortune 10 are Anthropic customers. &lt;a href="https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/" rel="noopener noreferrer"&gt;Vucense&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the enterprise LLM API market, Anthropic accounts for 32% compared to OpenAI's 25%. Seven out of every ten new enterprise customers choose Anthropic. &lt;a href="https://www.tradingkey.com/analysis/stocks/us-stocks/261756528-anthropic-openai-ipo-tradingkey" rel="noopener noreferrer"&gt;Tradingkey&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enterprise buyers treat a large funding round as a signal of platform stability. Companies that had been hesitant to commit multi-year API contracts moved forward after Anthropic's February 2026 Series G because Anthropic looked like it was in the race to stay. The doubling of $1M+ clients in under two months right after the Series G confirms that signal-driven buying happened at scale. &lt;a href="https://www.roborhythms.com/anthropic-revenue-30-billion-2026/" rel="noopener noreferrer"&gt;Robo Rhythms&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code: The Single Product That Changed Everything
&lt;/h2&gt;

&lt;p&gt;None of this happens without &lt;a href="https://claude.ai/code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;. Launched in May 2025, Claude Code reached an annualized revenue of $1 billion by November, and surpassed $2.5 billion by February 2026 — a product growing from zero to $2.5 billion in nine months. Reviewing SaaS industry history, no faster case has been found. &lt;a href="https://www.kucoin.com/news/flash/anthropic-surpasses-openai-in-revenue-and-market-share" rel="noopener noreferrer"&gt;KuCoin&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Business subscriptions to Claude Code have quadrupled since the start of 2026, and enterprise use has grown to represent over half of all Claude Code revenue. &lt;a href="https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Claude Code holds a 54% market share in the AI programming tool segment — far exceeding GitHub Copilot and Cursor. &lt;a href="https://www.tradingkey.com/analysis/stocks/us-stocks/261756528-anthropic-openai-ipo-tradingkey" rel="noopener noreferrer"&gt;Tradingkey&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The reason enterprises pay for it is structural, not incremental. GitHub Copilot helps you complete the next line as you write code — you're still the one doing the work. Claude Code doesn't just autocomplete; it handles entire workflows. &lt;a href="https://www.kucoin.com/news/flash/anthropic-surpasses-openai-in-revenue-and-market-share" rel="noopener noreferrer"&gt;KuCoin&lt;/a&gt; That's the difference between a feature and a budget line replacement.&lt;/p&gt;

&lt;p&gt;And Claude Code is available on every major surface. Claude is the only frontier AI model available on all three of the world's largest cloud platforms: AWS Bedrock, Google Cloud Vertex AI, and Microsoft Azure Foundry. &lt;a href="https://www.the-ai-corner.com/p/anthropic-30b-arr-passed-openai-revenue-2026" rel="noopener noreferrer"&gt;The-ai-corner&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Training Cost Gap Nobody Is Talking About Enough
&lt;/h2&gt;

&lt;p&gt;Revenue is the headline. The cost structure is the real story.&lt;/p&gt;

&lt;p&gt;OpenAI is projected to spend $125 billion per year on training by 2030. Anthropic's projection for the same period: around $30 billion. Same race. 4x difference in cost. &lt;a href="https://www.the-ai-corner.com/p/anthropic-30b-arr-passed-openai-revenue-2026" rel="noopener noreferrer"&gt;The-ai-corner&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenAI is burning approximately $17 billion in cash this year. Internal documents project a $14 billion loss for 2026. The company does not project positive free cash flow until 2029. Anthropic projects positive free cash flow by 2027, two years ahead of its main competitor, while generating more revenue. &lt;a href="https://www.saastr.com/anthropic-just-passed-openai-in-revenue-while-spending-4x-less-to-train-their-models/" rel="noopener noreferrer"&gt;SaaStr&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A new agreement with Google and Broadcom will deliver approximately 3.5 gigawatts of next-generation TPU capacity starting in 2027. Rather than relying solely on Nvidia GPUs, Anthropic is diversifying across Google TPUs, AWS Trainium chips, and Nvidia hardware — matching workloads to the chips best suited for them. &lt;a href="https://medium.com/@david.j.sea/anthropic-just-passed-openai-in-revenue-here-is-why-it-matters-e3dd9bb04069" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anthropic is investing its revenue advantage into infrastructure before it needs it. That's a different kind of discipline than raising $120 billion and spending it on training runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than a Revenue Chart
&lt;/h2&gt;

&lt;p&gt;The revenue story is inseparable from Anthropic's deliberate choice to prioritise enterprise over consumers. The $30B ARR is earned by being useful to businesses, not by harvesting user attention. &lt;a href="https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/" rel="noopener noreferrer"&gt;Vucense&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Pentagon labelled Anthropic a supply chain risk for refusing to arm autonomous weapons with Claude. Revenue accelerated anyway — from $19B to $30B ARR in the weeks after that clash became public. The enterprise customer base that drives Anthropic's revenue appears to have either ignored or positively responded to Anthropic's refusal to compromise. &lt;a href="https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/" rel="noopener noreferrer"&gt;Vucense&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One caveat worth stating plainly: OpenAI has argued that Anthropic is using a gross revenue accounting treatment with its deals with Amazon and Google that inflates top-line figures. The real net figure, by OpenAI's accounting, would be lower. &lt;a href="https://gardenzhome.com/anthropic-revenue-breakdown-openai/" rel="noopener noreferrer"&gt;Gardenzhome&lt;/a&gt; That dispute isn't settled. But even accounting for it, the trajectory is real, the enterprise customer count is real, and the Claude Code numbers are real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and What It Means for Developers
&lt;/h2&gt;

&lt;p&gt;Anthropic operates its models on a diversified range of AI hardware — AWS Trainium, Google TPUs, and NVIDIA GPUs — which means it can match workloads to the chips best suited for them. &lt;a href="https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The IPO question is now live. Anthropic is targeting October 2026, aiming to raise $60B+ at a $380B valuation. No S-1 has been filed. The timeline is subject to market conditions and the SEC's review of accounting methodology questions. &lt;a href="https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/" rel="noopener noreferrer"&gt;Vucense&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anthropic's $30 billion run rate exceeds the trailing twelve-month revenues of all but approximately 130 S&amp;amp;P 500 companies. A company that was essentially pre-revenue in early 2024 now out-earns most of the S&amp;amp;P 500. &lt;a href="https://medium.com/@david.j.sea/anthropic-just-passed-openai-in-revenue-here-is-why-it-matters-e3dd9bb04069" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The company that left OpenAI to build AI more carefully just built a bigger business doing it. That's not an accident — it's a thesis proving out in real time. The question now isn't whether Anthropic belongs in the same conversation as OpenAI. It's whether the enterprise-first, developer-first model it validated is the one the rest of the industry will be chasing for the next decade.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow for more coverage on MCP, agentic AI, and AI infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>security</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Vercel Just Confirmed a Security Breach. Here's What Actually Got Exposed — and Why It's Bigger Than One Company.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Mon, 20 Apr 2026 00:42:45 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/vercel-just-confirmed-a-security-breach-heres-what-actually-got-exposed-and-why-its-bigger-pon</link>
      <guid>https://dev.to/om_shree_0709/vercel-just-confirmed-a-security-breach-heres-what-actually-got-exposed-and-why-its-bigger-pon</guid>
      <description>&lt;p&gt;Vercel is the deployment layer for a meaningful percentage of the modern web. That's exactly what makes yesterday's confirmed breach something every developer should understand, not just Vercel customers.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving for Attackers
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://vercel.com" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt; is a cloud platform that provides hosting and deployment infrastructure for developers, with a strong focus on JavaScript frameworks. The company is known for developing &lt;a href="https://nextjs.org" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;, a widely used React framework, and for offering services such as serverless functions, edge computing, and CI/CD pipelines that enable developers to build, preview, and deploy applications. &lt;a href="https://www.bleepingcomputer.com/news/security/vercel-confirms-breach-as-hackers-claim-to-be-selling-stolen-data/" rel="noopener noreferrer"&gt;Bleeping Computer&lt;/a&gt; In short: Vercel sits at the center of how thousands of startups and enterprises ship code. That's not an incidental detail. That's exactly why it became a target.&lt;/p&gt;

&lt;p&gt;On April 19, 2026, Vercel published a security bulletin confirming that the company detected unauthorized access and has since engaged external incident response experts to investigate and contain the breach. Law enforcement has also been notified, and the company says it is continuing its forensic analysis while maintaining service availability. &lt;a href="https://cyberinsider.com/vercel-confirms-security-incident-as-hackers-claim-to-sell-internal-access/" rel="noopener noreferrer"&gt;CyberInsider&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Attack Actually Happened
&lt;/h2&gt;

&lt;p&gt;This wasn't a brute-force attack on Vercel's perimeter. The entry point was far more insidious — and a warning for every engineering team running a modern SaaS stack.&lt;/p&gt;

&lt;p&gt;Vercel's investigation revealed that the incident originated from a small, third-party AI tool whose Google Workspace OAuth app was the subject of a broader compromise, potentially affecting its hundreds of users across many organizations. &lt;a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt; Vercel has not publicly named the tool, and The Verge likewise reported that the vendor behind it remains undisclosed. &lt;a href="https://startupfortune.com/vercel-breach-exposes-ai-tool-supply-chain-risk-ahead-of-ipo/" rel="noopener noreferrer"&gt;Startup Fortune&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The architecture of the attack matters here. Attackers do not always need to smash through a front door when they can slip in through a trusted integration. Some reporting said the intrusion may have started through a compromised third-party AI tool linked to Google Workspace, rather than a direct attack on Vercel itself. &lt;a href="https://www.prismnews.com/news/hackers-claim-vercel-breach-leak-employee-data-and-seek" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once inside, the blast radius expanded fast. Developer Theo Browne shared additional details, noting that Vercel's Linear and GitHub integrations bore the brunt of the attack. &lt;a href="https://tech.yahoo.com/cybersecurity/articles/vercel-security-breach-raises-concerns-164955320.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Teams Are Actually Dealing With Right Now
&lt;/h2&gt;

&lt;p&gt;Here's what got exposed, based on what's been confirmed and what threat actors are claiming — those are two different things worth keeping separate.&lt;/p&gt;

&lt;p&gt;A person claiming to be a member of ShinyHunters posted a file containing 580 employee records, including names, Vercel email addresses, account status, and activity timestamps. The same actor claimed access to internal deployments, API keys, NPM tokens, GitHub tokens, source code, and database data. Vercel has not independently verified those assertions. &lt;a href="https://www.prismnews.com/news/hackers-claim-vercel-breach-leak-employee-data-and-seek" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It should be noted that while the hacker claims to be part of the ShinyHunters group, threat actors linked to recent attacks attributed to the ShinyHunters extortion gang have denied to BleepingComputer that they are involved in this incident. &lt;a href="https://www.bleepingcomputer.com/news/security/vercel-confirms-breach-as-hackers-claim-to-be-selling-stolen-data/" rel="noopener noreferrer"&gt;Bleeping Computer&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ransom demand adds another layer. In messages shared on Telegram, the threat actor claimed they were in contact with Vercel regarding the incident and that they discussed an alleged ransom demand of $2 million. &lt;a href="https://www.bleepingcomputer.com/news/security/vercel-confirms-breach-as-hackers-claim-to-be-selling-stolen-data/" rel="noopener noreferrer"&gt;Bleeping Computer&lt;/a&gt; The group is offering what they describe as access keys, source code, and database contents from Vercel, asking $2 million, with an initial payment of $500,000 in Bitcoin. &lt;a href="https://techweez.com/2026/04/19/vercel-data-breach-third-party-ai-tool/" rel="noopener noreferrer"&gt;Techweez&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the customer side, the immediate concern is environment variables: configuration values your app uses at runtime, including API keys, database credentials, and signing tokens. The problem is anything that wasn't marked sensitive. Those values should be treated as compromised and rotated immediately. &lt;a href="https://techweez.com/2026/04/19/vercel-data-breach-third-party-ai-tool/" rel="noopener noreferrer"&gt;Techweez&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, environment variables marked as "sensitive" within the platform remained protected. &lt;a href="https://tech.yahoo.com/cybersecurity/articles/vercel-security-breach-raises-concerns-164955320.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than One Breach
&lt;/h2&gt;

&lt;p&gt;The reason this incident matters beyond Vercel's own customer list comes down to two words: supply chain.&lt;/p&gt;

&lt;p&gt;What makes the claim worth paying attention to is the scale the threat actor is alluding to. Vercel develops Next.js, which reportedly sees around 6 million weekly downloads. The actor suggests that access to Vercel's internals could enable a supply chain attack: essentially, tampering with packages that millions of developers download and run in their own software. &lt;a href="https://techweez.com/2026/04/19/vercel-data-breach-third-party-ai-tool/" rel="noopener noreferrer"&gt;Techweez&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If even part of that access turns out to be real, the fallout could extend well beyond employee privacy. Secrets and tokens can be reused to reach build systems, package registries, and source repositories, which is why researchers warned that the incident could become a supply-chain problem for startups, enterprises, and ordinary users relying on apps hosted or deployed through Vercel, including Next.js projects. &lt;a href="https://www.prismnews.com/news/hackers-claim-vercel-breach-leak-employee-data-and-seek" rel="noopener noreferrer"&gt;Prism News&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For crypto and Web3 developers specifically, the situation is acute. Many crypto and Web3 frontends deploy on Vercel, from wallet connectors to decentralized application interfaces. Projects storing API keys, private RPC endpoints, or wallet-related secrets in non-sensitive environment variables face potential exposure risk. The breach does not threaten blockchains or smart contracts directly, as those operate independently of frontend hosting. However, compromised deployment pipelines could theoretically allow build tampering for affected accounts. &lt;a href="https://tech.yahoo.com/cybersecurity/articles/vercel-security-breach-raises-concerns-164955320.html" rel="noopener noreferrer"&gt;Yahoo!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And then there's the IPO angle. This breach lands at a brutal moment for Vercel's business trajectory. Reports from just days earlier highlighted a planned IPO following a reported 240% revenue surge, driven largely by enterprise adoption of AI-powered deployment workflows. Security incidents are notoriously damaging during a quiet period, when companies are legally restricted in how they can communicate with investors and the public. &lt;a href="https://startupfortune.com/vercel-breach-exposes-ai-tool-supply-chain-risk-ahead-of-ipo/" rel="noopener noreferrer"&gt;Startup Fortune&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access: What You Should Do Right Now
&lt;/h2&gt;

&lt;p&gt;Vercel's guidance to customers covers several concrete steps: review account and environment activity logs for suspicious behavior, rotate environment variables and API keys, and leverage built-in features for managing sensitive variables. &lt;a href="https://trilogyai.substack.com/p/vercel-has-a-confirmed-breach" rel="noopener noreferrer"&gt;Substack&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vercel has also rolled out updates to its dashboard, including an overview page of environment variables and an improved interface for managing sensitive environment variables. &lt;a href="https://www.bleepingcomputer.com/news/security/vercel-confirms-breach-as-hackers-claim-to-be-selling-stolen-data/" rel="noopener noreferrer"&gt;Bleeping Computer&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vercel is publishing an IOC (indicator of compromise) to support the wider community in investigating and vetting potential malicious activity. They recommend that Google Workspace administrators and Google account owners check for usage of the compromised app immediately. &lt;a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you use Vercel: rotate every secret that wasn't explicitly marked sensitive. If your project built or deployed during the breach window, audit it regardless of whether you're in the "limited subset" Vercel is directly contacting. The investigation is still ongoing — the scope could expand.&lt;/p&gt;
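
&lt;p&gt;A minimal sketch of that triage, assuming you have already exported your variables into a simple inventory. The &lt;code&gt;EnvVar&lt;/code&gt; record, its fields, and the &lt;code&gt;rotation_queue&lt;/code&gt; helper are hypothetical names for illustration, not part of any Vercel API. The code only orders the work; it does not rotate anything.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvVar:
    """Hypothetical inventory record; not a Vercel API type."""
    name: str
    target: str       # illustrative label, e.g. "production" or "preview"
    sensitive: bool   # True if the variable was explicitly marked sensitive


def rotation_queue(env_vars):
    """Return the variables not marked sensitive, production targets first."""
    exposed = [v for v in env_vars if not v.sensitive]
    # False sorts before True, so production entries come first, then by name.
    return sorted(exposed, key=lambda v: (v.target != "production", v.name))


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    inventory = [
        EnvVar("STRIPE_SECRET_KEY", "production", sensitive=True),
        EnvVar("RPC_URL", "preview", sensitive=False),
        EnvVar("DATABASE_URL", "production", sensitive=False),
        EnvVar("ANALYTICS_TOKEN", "production", sensitive=False),
    ]
    for var in rotation_queue(inventory):
        print(f"rotate {var.name} ({var.target})")
```

&lt;p&gt;With the made-up inventory above, the queue lists the two non-sensitive production variables before the preview one and skips the variable that was marked sensitive. Treat the ordering as a starting point, since the investigation is ongoing and the scope could change.&lt;/p&gt;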

&lt;p&gt;The deeper lesson here isn't about Vercel specifically. It's about what happens when a small, trusted AI tool with OAuth access to your workspace becomes the softest point in your entire deployment chain — and you had no way to know it was compromised until someone started selling your tokens on BreachForums.&lt;/p&gt;

&lt;p&gt;Credential hygiene and OAuth scope reviews aren't optional maintenance tasks anymore. They're the front line.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow for more coverage on MCP, agentic AI, and AI infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>security</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Anthropic Just Launched Claude Design. Here's What It Actually Changes for Non-Designers.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Sun, 19 Apr 2026 05:31:03 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/anthropic-just-launched-claude-design-heres-what-it-actually-changes-for-non-designers-5e3e</link>
      <guid>https://dev.to/om_shree_0709/anthropic-just-launched-claude-design-heres-what-it-actually-changes-for-non-designers-5e3e</guid>
      <description>&lt;p&gt;Figma has been the unchallenged design layer for product teams for years. On April 17, 2026, Anthropic quietly placed a bet that the next design tool doesn't look like Figma at all — it looks like a conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;Design has always had a bottleneck that nobody talks about openly: the distance between the person with the idea and the person who can execute it. A founder has a vision for a landing page. A PM sketches a feature flow on a whiteboard. A marketer needs a campaign asset by end of day. In every case, they're either waiting on a designer, wrestling with a tool that wasn't built for them, or shipping something that looks like it was made in a hurry — because it was.&lt;/p&gt;

&lt;p&gt;Even experienced designers face a version of this. Exploration is rationed. There's rarely time to prototype ten directions when you have two days before a stakeholder review. So teams commit early, iterate less, and ship with more uncertainty than they'd like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://claude.ai/design" rel="noopener noreferrer"&gt;Claude Design&lt;/a&gt; is Anthropic's answer to both problems simultaneously.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Actually Works
&lt;/h2&gt;

&lt;p&gt;The product is powered by &lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7&lt;/a&gt;, Anthropic's latest and most capable vision model. The core loop is simple: describe what you need, Claude builds a first version, and you refine it through conversation. But the details of how that refinement works are what separate this from a glorified prompt-to-image tool.&lt;/p&gt;

&lt;p&gt;You can comment inline on specific elements — not the whole design, a specific button or heading. You can edit text directly in the canvas. And in a genuinely interesting touch, Claude can generate custom adjustment sliders for spacing, color, and layout that let you tune parameters live without writing another prompt.&lt;/p&gt;

&lt;p&gt;The brand system integration is the piece that makes this credible for actual teams rather than solo experiments. During onboarding, Claude reads your codebase and design files and assembles a design system — your colors, typography, components. Every project after that uses it automatically. Teams can maintain multiple systems and switch between them per project.&lt;/p&gt;

&lt;p&gt;Input is flexible: start from a text prompt, upload images, DOCX, PPTX, or XLSX files, or point Claude at a codebase. There's also a web capture tool that grabs elements directly from your live site, so prototypes match the real product rather than approximating it.&lt;/p&gt;

&lt;p&gt;Collaboration is organization-scoped. Designs can be kept private, shared view-only with anyone in the org via link, or opened for group editing where multiple teammates can chat with Claude together in the same canvas. Output formats include internal URLs, standalone HTML files, PDF, PPTX, and direct export to &lt;a href="https://www.canva.com/" rel="noopener noreferrer"&gt;Canva&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The handoff to &lt;a href="https://claude.com/product/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; is the closing piece of the loop. When a design is ready to build, Claude packages it into a handoff bundle that Claude Code can consume directly. The intent is to eliminate the translation layer between design and implementation entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Teams Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;Anthropic lists six use cases, and they span a wider range of roles than you'd expect from a "design tool." Designers are using it for rapid prototyping and broad exploration. PMs are using it to sketch feature flows before handing off to engineering. Founders are turning rough outlines into pitch decks. Marketers are drafting landing pages and campaign visuals before looping in a designer to finish.&lt;/p&gt;

&lt;p&gt;The early testimonials from teams are specific enough to be useful. Brilliant's senior product designer noted that their most complex pages, which previously required 20+ prompts in other tools, needed only 2 prompts in Claude Design. Datadog's PM described going from a rough idea to a working prototype before anyone leaves the room, with the output already matching their brand guidelines. Those aren't marketing abstractions; they describe a workflow compression that most product teams would recognize as real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;The obvious read is that this is Anthropic entering the design tool market. The less obvious read is that Anthropic is extending the Claude Code workflow upward into the creative layer.&lt;/p&gt;

&lt;p&gt;Claude Code already handles the bottom of the product development stack — reading codebases, writing and editing files, managing git workflows. Claude Design handles the top — ideation, visual prototyping, stakeholder-ready output. The handoff bundle between the two is not a nice-to-have; it's the architectural seam Anthropic is betting on. If that seam works reliably, the design-to-deployment loop stops requiring multiple tools, multiple handoffs, and multiple rounds of translation.&lt;/p&gt;

&lt;p&gt;The Canva integration is also worth noting. Canva's CEO described the partnership as making it seamless to bring ideas from Claude Design into Canva for final polish and publishing. That positions the two products as complementary rather than as direct competitors: Claude Design is the ideation and prototyping layer, and Canva is the finishing and distribution layer. It's a smart separation that gives Claude Design a clear lane without requiring it to replace every workflow Canva owns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;Claude Design launched April 17, 2026, in research preview. It's available for &lt;a href="https://claude.com/pricing" rel="noopener noreferrer"&gt;Claude Pro, Max, Team, and Enterprise&lt;/a&gt; subscribers, included with your existing plan and counted against subscription limits. Extra usage can be enabled if you hit those limits.&lt;/p&gt;

&lt;p&gt;For Enterprise organizations, it is off by default; admins enable it through Organization settings. Access is at &lt;a href="https://claude.ai/design" rel="noopener noreferrer"&gt;claude.ai/design&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The research preview label matters. This is not a finished product. Anthropic says integrations with other tools are coming in the weeks ahead.&lt;/p&gt;




&lt;p&gt;The gap between "person with an idea" and "polished thing that exists" has always been where time, money, and momentum go to die. Claude Design is a direct attempt to close it — and the Claude Code handoff suggests Anthropic is thinking about the full stack, not just the canvas.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>uidesign</category>
      <category>discuss</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Anthropic Just Gave Claude a Design Studio. Here's What Claude Design Actually Does.</title>
      <dc:creator>Om Shree</dc:creator>
      <pubDate>Sat, 18 Apr 2026 04:44:41 +0000</pubDate>
      <link>https://dev.to/om_shree_0709/anthropic-just-gave-claude-a-design-studio-heres-what-claude-design-actually-does-5h1f</link>
      <guid>https://dev.to/om_shree_0709/anthropic-just-gave-claude-a-design-studio-heres-what-claude-design-actually-does-5h1f</guid>
      <description>&lt;p&gt;Figma has been the unchallenged center of digital design for years. Yesterday, Anthropic quietly placed a bet that AI can change that.&lt;/p&gt;

&lt;p&gt;On April 17, Anthropic launched &lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-design-anthropic-labs" rel="noopener noreferrer"&gt;Claude Design&lt;/a&gt;&lt;/strong&gt; - a new product under its &lt;a href="https://www.anthropic.com/news/introducing-anthropic-labs" rel="noopener noreferrer"&gt;Anthropic Labs&lt;/a&gt; umbrella that lets you collaborate with Claude to build visual work: prototypes, slides, wireframes, landing pages, one-pagers, and more. It's powered by &lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7&lt;/a&gt;&lt;/strong&gt;, its latest vision model, and it's rolling out in research preview for Pro, Max, Team, and Enterprise subscribers right now.&lt;/p&gt;

&lt;p&gt;This isn't Claude generating pretty mockups you paste into Figma. This is a full design loop - ideation, iteration, export, and handoff - without ever leaving the chat.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem It's Solving
&lt;/h2&gt;

&lt;p&gt;Anthropic frames the core issue well: even experienced designers ration exploration. There's never enough time to prototype ten directions, so you pick two or three and commit. And for founders, PMs, and marketers who have a strong vision but no design background, turning ideas into shareable visuals has always required either hiring someone or learning tools that take months to master.&lt;/p&gt;

&lt;p&gt;Claude Design is trying to solve both problems at once. Give designers room to explore widely. Give everyone else a way to produce visual work that doesn't look like a Canva template from 2019.&lt;/p&gt;




&lt;h2&gt;
  
  
  How the Workflow Actually Works
&lt;/h2&gt;

&lt;p&gt;The flow is more structured than you'd expect from a chat-based tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your brand gets built in first.&lt;/strong&gt; During onboarding, Claude reads your codebase and design files to build a design system - your colors, typography, components. Every project after that inherits it automatically. No more pasting hex codes into every prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You can start from anything.&lt;/strong&gt; A text prompt, uploaded images, a DOCX, PPTX, or XLSX file, your codebase, or a live website via the web capture tool. If you want the prototype to look like your actual product, you point it at your site and Claude pulls the elements directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Iteration happens inline.&lt;/strong&gt; You can comment on specific elements, edit text directly, or use custom adjustment knobs - built by Claude - to tweak spacing, color, and layout live. Then ask Claude to apply changes across the entire design at once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collaboration is organization-scoped.&lt;/strong&gt; Keep designs private, share a view-only link inside your org, or grant edit access so teammates can jump into the same conversation with Claude together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Export goes everywhere.&lt;/strong&gt; Standalone HTML, PDF, PPTX, a shareable internal URL, or directly to &lt;strong&gt;&lt;a href="https://www.canva.com" rel="noopener noreferrer"&gt;Canva&lt;/a&gt;&lt;/strong&gt;. The Canva integration is a first-class feature - designs land as fully editable Canva files, ready to refine and publish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Handoff goes to Claude Code.&lt;/strong&gt; When a design is ready to build, Claude bundles everything into a handoff package you pass to &lt;strong&gt;&lt;a href="https://claude.com/product/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;&lt;/strong&gt; with a single instruction. Design to implementation in one pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Teams Are Actually Using It For
&lt;/h2&gt;

&lt;p&gt;Anthropic lists six core use cases, and they're more specific than the usual "boost your productivity" marketing copy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Realistic prototypes&lt;/strong&gt; - Designers turn static mockups into interactive, shareable prototypes without touching code or going through PR review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product wireframes&lt;/strong&gt; - PMs sketch feature flows and hand off directly to Claude Code for implementation, or to designers for refinement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design explorations&lt;/strong&gt; - Quick generation of a wide range of visual directions to explore before committing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pitch decks and presentations&lt;/strong&gt; - From rough outline to on-brand deck in minutes, exported as PPTX or sent to Canva.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Marketing collateral&lt;/strong&gt; - Landing pages, social media assets, campaign visuals, ready for designer polish.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontier design&lt;/strong&gt; - Code-powered prototypes with voice, video, shaders, 3D, and built-in AI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one is the most interesting. "Frontier design" positions this beyond Figma's territory entirely - into interactive, AI-native artifacts that traditional design tools can't produce at all.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Early Users Are Saying
&lt;/h2&gt;

&lt;p&gt;Three companies shared early reactions, and the numbers are specific enough to be credible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://brilliant.org" rel="noopener noreferrer"&gt;Brilliant&lt;/a&gt;&lt;/strong&gt;, the interactive learning platform, noted that their most complex pages - which previously took 20+ prompts to recreate in other tools - required only 2 prompts in Claude Design. Their Senior Product Designer called the prototype-to-production handoff with Claude Code "seamless."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datadoghq.com" rel="noopener noreferrer"&gt;Datadog&lt;/a&gt;&lt;/strong&gt;'s product team reported going from rough idea to working prototype before anyone leaves the room. Work that previously took a week of back-and-forth between briefs, mockups, and review rounds now happens in a single conversation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.canva.com" rel="noopener noreferrer"&gt;Canva&lt;/a&gt;&lt;/strong&gt; co-founder and CEO Melanie Perkins framed the integration as a natural extension of their mission - bringing Canva to wherever ideas begin. When a design exits Claude Design into Canva, it becomes fully editable and collaborative immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Is a Bigger Deal Than It Looks
&lt;/h2&gt;

&lt;p&gt;Most AI design tools have been wrappers - you describe something, get an image, manually replicate it in your actual design tool. Claude Design is different in structure. The brand system, the inline editing, the Claude Code handoff, the Canva export - these aren't convenience features. They're the infrastructure of a complete design workflow.&lt;/p&gt;

&lt;p&gt;What Anthropic is building here is a &lt;strong&gt;design agent&lt;/strong&gt;, not a design assistant. One that holds context about your brand, your product, your team's work, and the full history of a project. That's the same pattern we've seen with &lt;a href="https://claude.com/product/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; in engineering - an AI that doesn't just answer questions but participates in the actual production pipeline.&lt;/p&gt;

&lt;p&gt;The implications for teams without dedicated design resources are significant. A founder with a clear vision and access to Claude Pro can now go from napkin sketch to investor-ready prototype without a single design hire. A PM can produce a feature wireframe precise enough to hand off to engineering directly. A marketer can generate a campaign landing page in a conversation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Availability and Access
&lt;/h2&gt;

&lt;p&gt;Claude Design is available now in research preview for &lt;strong&gt;Pro, Max, Team, and Enterprise&lt;/strong&gt; subscribers at &lt;strong&gt;&lt;a href="https://claude.ai/design" rel="noopener noreferrer"&gt;claude.ai/design&lt;/a&gt;&lt;/strong&gt;. Access is included in your existing plan and uses your subscription limits, with the option to enable &lt;a href="https://support.claude.com/en/articles/12429409-manage-extra-usage-for-paid-claude-plans" rel="noopener noreferrer"&gt;extra usage&lt;/a&gt; if you go beyond them.&lt;/p&gt;

&lt;p&gt;For Enterprise orgs, it's off by default - admins can enable it via &lt;a href="https://support.claude.com/en/articles/14604406-claude-design-admin-guide-for-team-and-enterprise-plans" rel="noopener noreferrer"&gt;Organization settings&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Anthropic says integrations with more tools are coming in the next few weeks.&lt;/p&gt;




&lt;p&gt;Design just became part of the agentic stack. The question now is how fast the design community actually adopts it - and what Figma does next.&lt;/p&gt;





</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>discuss</category>
    </item>
  </channel>
</rss>
