<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: gentic news</title>
    <description>The latest articles on DEV Community by gentic news (@gentic_news).</description>
    <link>https://dev.to/gentic_news</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838995%2F269c20bb-f64f-483a-862d-49c6481df897.png</url>
      <title>DEV Community: gentic news</title>
      <link>https://dev.to/gentic_news</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gentic_news"/>
    <language>en</language>
    <item>
      <title>KKR Launches Helix Digital Infrastructure with $10B for AI Buildout</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 22:18:07 +0000</pubDate>
      <link>https://dev.to/gentic_news/kkr-launches-helix-digital-infrastructure-with-10b-for-ai-buildout-3ga7</link>
      <guid>https://dev.to/gentic_news/kkr-launches-helix-digital-infrastructure-with-10b-for-ai-buildout-3ga7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;KKR launched Helix Digital Infrastructure with over $10B in commitments from KKR, KIA, Nvidia, and Vistra to bundle AI data centers, power, and connectivity for hyperscalers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;KKR, together with the Kuwait Investment Authority (KIA), Nvidia, and Vistra, launched Helix Digital Infrastructure on June 12, 2026, with over $10 billion in committed capital. The new company aims to bundle data centers, power, and connectivity into a single offering for hyperscalers struggling with AI infrastructure complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Over $10B in long-duration capital commitments.&lt;/li&gt;
&lt;li&gt;Anchor investors: KKR, KIA, Nvidia, Vistra.&lt;/li&gt;
&lt;li&gt;Led by former AWS CEO Adam Selipsky.&lt;/li&gt;
&lt;li&gt;Nvidia DSX AI factory-aligned infrastructure partner.&lt;/li&gt;
&lt;li&gt;Vistra is preferred power provider.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;KKR, together with the Kuwait Investment Authority (KIA), NVIDIA and Vistra, has announced the launch of Helix Digital Infrastructure, a new company designed to deliver integrated infrastructure at the speed and scale required for hyperscalers to meet accelerating artificial intelligence (AI) demand. As building AI infrastructure becomes increasingly complex, Helix will serve as a single coordination point for hyperscalers' data centers, power, connectivity and related needs, &lt;a href="https://www.hpcwire.com/off-the-wire/kkr-launches-helix-digital-infrastructure-with-10b-to-accelerate-ai-infrastructure-buildout/" rel="noopener noreferrer"&gt;according to the source&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Founded with anchor investments from investors including KKR, KIA, NVIDIA and Vistra, the Helix strategy has more than $10 billion in total long-duration capital commitments to date. NVIDIA will also serve as a strategic partner to support the deployment of NVIDIA DSX AI factory-aligned infrastructure with a view to maximizing tokens per watt, achieving lowest total cost of ownership and accelerating time to first token for investments pursued by Helix. Vistra, a leading integrated power generation and electricity company with operations across 18 states and Washington, D.C., will be the preferred power provider for Helix investments. Following the closing of the founding commitments, Helix is open to additional eligible institutional investors.&lt;/p&gt;

&lt;p&gt;AI is driving the largest infrastructure buildout in modern history, requiring trillions of dollars in investment across data centers, power generation and transmission, connectivity and related infrastructure over the coming decade. The scale and complexity of financing and coordinating this buildout represent a key industry bottleneck, ultimately slowing hyperscalers from delivering the models, services and applications their customers demand. Delivering AI infrastructure requires credible, long-term financial underwriters capable of committing capital consistently. Hyperscalers are also seeking more integrated and repeatable infrastructure solutions that meaningfully reduce the complexity they face in building at unprecedented scale.&lt;/p&gt;

&lt;p&gt;KKR launched Helix in response to these challenges. Helix will be positioned as a single, trusted strategic partner to hyperscalers, armed with a long-duration, multi-billion-dollar capital base, and with integrated development capabilities and coordinated execution across AI infrastructure. The company is led by Adam Selipsky, former CEO of Amazon Web Services, who brings first-hand experience scaling the world's largest cloud business, and deep insight into hyperscaler infrastructure priorities. He is joined by a dedicated management team and Board. Waldemar Szlezak, KKR's Global Head of Digital Infrastructure, will serve as Helix's Chief Investment Officer. Helix will seek to invest in and manage assets critical to enabling AI, including hyperscale data center development and operations; baseload and flexible power generation; transmission and distribution infrastructure; and fiber and connectivity infrastructure, among other assets.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fld3bumlkak62ysk41gp4.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fld3bumlkak62ysk41gp4.webp" alt="Helix Digital Infrastructure Launches With $10 Billion Back…" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for Helix's first project announcements in H2 2026, likely a multi-gigawatt data center campus co-located with a Vistra power plant. Also track whether additional sovereign wealth funds (e.g., GIC, ADIA) join as limited partners, and if Helix signs a hyperscaler anchor tenant before year-end.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://www.hpcwire.com/off-the-wire/kkr-launches-helix-digital-infrastructure-with-10b-to-accelerate-ai-infrastructure-buildout/" rel="noopener noreferrer"&gt;hpcwire.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/kkr-launches-helix-digital" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>startup</category>
      <category>business</category>
      <category>funding</category>
    </item>
    <item>
      <title>Claudectl: The Windows Workspace Manager That Makes Claude Code</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 22:18:04 +0000</pubDate>
      <link>https://dev.to/gentic_news/claudectl-the-windows-workspace-manager-that-makes-claude-code-54fg</link>
      <guid>https://dev.to/gentic_news/claudectl-the-windows-workspace-manager-that-makes-claude-code-54fg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Claudectl solves Claude Code's biggest pain point on Windows: losing context when switching projects. Install via &lt;code&gt;pipx install claudectl&lt;/code&gt; for session browsing, CLAUDE.md scaffolding, and per-project MCP and model configs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Claudectl solves Claude Code's biggest pain point on Windows: losing context when switching projects.&lt;/li&gt;
&lt;li&gt;Install via &lt;code&gt;pipx install claudectl&lt;/code&gt; for session browsing, CLAUDE.md scaffolding, and per-project MCP and model configs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Changed — A Workspace Manager for Claude Code on Windows
&lt;/h2&gt;

&lt;p&gt;Claude Code treats your work as a collection of chats. That's fine for a single project. But the moment you switch between three repos, onboard a new teammate, or revisit a project after a week, you're fighting context loss. Sessions pile up. CLAUDE.md files get stale. MCP servers drift out of sync.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;claudectl&lt;/strong&gt; — an open-source workspace manager by developer Babar Muhammad that's purpose-built for Windows Claude Code users. It wraps Claude Code's CLI with a terminal UI that treats every project as a persistent, searchable, configurable workspace.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Does — The Four Pillars
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Session Management That Doesn't Suck
&lt;/h3&gt;

&lt;p&gt;Claudectl's main screen is a session browser: every Claude Code project and session, sorted by recency. Press &lt;code&gt;R&lt;/code&gt; to rename, &lt;code&gt;D&lt;/code&gt; to delete, &lt;code&gt;F&lt;/code&gt; to fork a session. The killer feature? &lt;strong&gt;Quick-resume with ★/☆ shortcuts&lt;/strong&gt; — mark your top sessions and jump straight back in across all projects. Search filters sessions live by name or preview.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Project Memory That Actually Persists
&lt;/h3&gt;

&lt;p&gt;This is where claudectl solves the CLAUDE.md problem. Two approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scaffold (C key)&lt;/strong&gt;: Build project context mechanically from git repos, recent commits, READMEs, and prior session topics. No AI cost, fast iteration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI generation (A key)&lt;/strong&gt;: Claude itself deep-analyzes the codebase and writes or updates a comprehensive CLAUDE.md. You review before anything is written.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Plus a &lt;strong&gt;system prompt (S key)&lt;/strong&gt; per project — AI-generate or hand-edit a prompt injected on every launch. This means your "be careful with this monorepo's test suite" instructions persist across sessions.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. MCP Awareness at a Glance
&lt;/h3&gt;

&lt;p&gt;Claudectl shows connected MCP servers in the footer on startup. More importantly, it can analyze any MCP server's tools and write the docs into the global &lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt; so Claude knows them in every session. No more forgetting which MCP server does what.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Per-Project Launch Control
&lt;/h3&gt;

&lt;p&gt;Before launching Claude, you pick:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reasoning effort (low/medium/high)&lt;/li&gt;
&lt;li&gt;Model override (Opus 4.6, Sonnet 4.6, etc.)&lt;/li&gt;
&lt;li&gt;Extra PATH entries injected into Claude's environment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Last choices are remembered per project. This is huge for teams that want Opus for architecture work but Sonnet for quick refactors.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup — Two Commands
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;claudectl
claudectl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requirements: Python 3.10+, Windows 10 or 11, Claude Code CLI installed (auto-detected at &lt;code&gt;%USERPROFILE%/.local/bin/claude.exe&lt;/code&gt; or on PATH).&lt;/p&gt;
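
&lt;p&gt;The detection described above can be sketched roughly as follows. This is a minimal illustration, not claudectl's actual code: the helper name &lt;code&gt;find_claude_cli&lt;/code&gt; is hypothetical, and checking the profile path before PATH is an assumed order.&lt;/p&gt;

```python
import os
import shutil
from pathlib import Path


def find_claude_cli(env=None):
    """Return a path to the Claude Code CLI, or None if it is not found.

    Hypothetical helper: the lookup order (profile path first, then PATH)
    is an assumption, not confirmed by the claudectl source.
    """
    env = os.environ if env is None else env

    # 1. Default install location under the user's profile directory.
    profile = env.get("USERPROFILE")
    if profile:
        candidate = Path(profile) / ".local" / "bin" / "claude.exe"
        if candidate.is_file():
            return str(candidate)

    # 2. Fall back to searching the directories listed in PATH.
    return shutil.which("claude", path=env.get("PATH"))


if __name__ == "__main__":
    print(find_claude_cli() or "Claude Code CLI not found")
```

&lt;p&gt;Passing a custom mapping as &lt;code&gt;env&lt;/code&gt; keeps the lookup easy to exercise without touching the real environment.&lt;/p&gt;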

&lt;h2&gt;
  
  
  When To Use It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-project developers&lt;/strong&gt;: You switch between 3+ repos daily and hate losing session context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teams onboarding new members&lt;/strong&gt;: Scaffolded CLAUDE.md files give new devs instant project context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP-heavy workflows&lt;/strong&gt;: You run multiple MCP servers and want visibility into which are connected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model experimentation&lt;/strong&gt;: You want to quickly switch between Opus 4.6 and Sonnet 4.6 per project without editing config files.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Catch
&lt;/h2&gt;

&lt;p&gt;It's Windows-only (for now). Linux and macOS users will need to wait or contribute. Also, it's a community tool, not an Anthropic product — expect rough edges and no official support.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;Claudectl fixes the biggest pain point of Claude Code on Windows: context loss across projects. If you're managing more than one project, install it today. The session browser alone saves 30 seconds every time you switch contexts — and that adds up fast.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/babarmuhammad/claudectl" rel="noopener noreferrer"&gt;github.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/claudectl-the-windows-workspace" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>MCP Server Report: 54% of 39,762 Servers Have Zero Community Adoption —</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 16:18:05 +0000</pubDate>
      <link>https://dev.to/gentic_news/mcp-server-report-54-of-39762-servers-have-zero-community-adoption--1149</link>
      <guid>https://dev.to/gentic_news/mcp-server-report-54-of-39762-servers-have-zero-community-adoption--1149</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;54% of 39,762 MCP servers are invisible to AI agents due to zero community adoption. Use Agent Tool Intelligence's new grading model to boost your server's discoverability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;54% of 39,762 MCP servers are invisible to AI agents due to zero community adoption.&lt;/li&gt;
&lt;li&gt;Use Agent Tool Intelligence's new grading model to boost your server's discoverability.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Changed — The New Grading Model for MCP Servers
&lt;/h2&gt;

&lt;p&gt;The MCP ecosystem just crossed 39,762 indexed servers, and Agent Tool Intelligence released its June 2026 report with a completely rebuilt scoring engine. The old system scored 85.7% of tools as Grade B — useless for differentiation. Now they use a three-dimensional additive model: Quality Score + Community Bonus + Trust Bonus.&lt;/p&gt;

&lt;p&gt;Here's the breakdown that matters to you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;B+ (1,123 servers, 2.8%)&lt;/strong&gt;: Very good — close to elite&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;B (7,187 servers, 18.1%)&lt;/strong&gt;: Good — solid quality + some community&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C+ (5,477 servers, 13.8%)&lt;/strong&gt;: OK — decent quality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C (21,464 servers, 54.0%)&lt;/strong&gt;: Average — good foundation, no community signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D (4,351 servers, 10.9%)&lt;/strong&gt;: Needs work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;F (160 servers, 0.4%)&lt;/strong&gt;: Critical&lt;/li&gt;
&lt;/ul&gt;
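
&lt;p&gt;As a quick sanity check, the counts listed above can be summed and converted back to percentages. The sketch below uses only the figures quoted in the list; the variable and function names are illustrative.&lt;/p&gt;

```python
# Grade counts exactly as quoted in the list above.
GRADE_COUNTS = {
    "B+": 1123,
    "B": 7187,
    "C+": 5477,
    "C": 21464,
    "D": 4351,
    "F": 160,
}


def grade_shares(counts):
    """Return each grade's share of the total in percent, rounded to 1 decimal."""
    total = sum(counts.values())
    return {grade: round(100 * n / total, 1) for grade, n in counts.items()}


total_servers = sum(GRADE_COUNTS.values())
shares = grade_shares(GRADE_COUNTS)

print(f"total servers: {total_servers}")
for grade, pct in shares.items():
    print(f"{grade}: {pct}%")
```

&lt;p&gt;The six counts add up to the 39,762 indexed servers, and each rounded share matches the percentage quoted next to it.&lt;/p&gt;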

&lt;h2&gt;
  
  
  What It Means For You — The Discoverability Problem
&lt;/h2&gt;

&lt;p&gt;If you've built an MCP server and nobody's using it, you're in the 54%. Your code could be perfect, but AI agents don't know it exists. The report's key insight: "54% of MCP tools have solid code quality but zero community adoption. They're invisible to AI agents."&lt;/p&gt;

&lt;p&gt;This is a huge problem because Claude Code, which uses the Model Context Protocol extensively, relies on discoverable servers to extend its capabilities. If your server isn't on the radar, it's like having a library with no catalog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Now — How to Boost Your Server's Grade
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3a245bujjvbzm4lec9a.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3a245bujjvbzm4lec9a.jpeg" alt="EP163: 12 MCP Servers You Can Use in 2025" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The path from C to B is clear: &lt;strong&gt;get 10+ GitHub stars + stay active&lt;/strong&gt;. Here's your action plan:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check your current grade&lt;/strong&gt;: Go to &lt;a href="https://agent-tool-intel-production.up.railway.app" rel="noopener noreferrer"&gt;agent-tool-intelligence.com&lt;/a&gt; and search for your server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get 10+ GitHub stars&lt;/strong&gt;: Share your server on r/ClaudeCode, Hacker News, or Dev.to. A single post can get you there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push within 30 days&lt;/strong&gt;: The report shows 100% of indexed servers are active. If yours isn't, it drops off.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add documentation and examples&lt;/strong&gt;: The Quality Score component rewards clear READMEs and usage examples.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For your Claude Code workflow, you can also use the &lt;code&gt;claude mcp add&lt;/code&gt; command to manually connect to any server, but discoverability matters when you're searching for new tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters for Claude Code Users
&lt;/h2&gt;

&lt;p&gt;The MCP ecosystem is the backbone of Claude Code's extensibility. As we've reported, Claude Code uses MCP extensively, and GitHub itself now uses the protocol. The recent launch of Spec-Kit (June 7, 2026) shows how MCP is becoming standard infrastructure.&lt;/p&gt;

&lt;p&gt;If you're building MCP servers for your team or company, this grading system is your SEO. A B+ rating means your server gets surfaced when AI agents search for tools. A C rating means you're invisible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Tips
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For server builders&lt;/strong&gt;: Focus on community signals. Stars and activity matter more than perfect code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For users&lt;/strong&gt;: Use the grading tool to find high-quality servers before adding them to Claude Code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For both&lt;/strong&gt;: The methodology is open source at &lt;a href="https://github.com/agent-tool-intel/agent-tool-intel" rel="noopener noreferrer"&gt;github.com/agent-tool-intel/agent-tool-intel&lt;/a&gt; — you can even contribute to the scoring model.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://dev.to/hm_cheng_208d77b57f7f15c3/mcp-ecosystem-report-june-2026-40le"&gt;dev.to&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/mcp-server-report-54-of-39762" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Adam Selipsky Leaves AWS to Lead $10B AI Data Center Venture</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 16:18:04 +0000</pubDate>
      <link>https://dev.to/gentic_news/adam-selipsky-leaves-aws-to-lead-10b-ai-data-center-venture-49fn</link>
      <guid>https://dev.to/gentic_news/adam-selipsky-leaves-aws-to-lead-10b-ai-data-center-venture-49fn</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Former AWS CEO Adam Selipsky to lead a $10B AI data center venture, highlighting the capital race for AI compute infrastructure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Former AWS CEO Adam Selipsky will lead a new $10 billion AI data center venture, GeekWire reported. The move signals the escalating capital race for AI compute infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$10 billion capital commitment for AI data center venture.&lt;/li&gt;
&lt;li&gt;Adam Selipsky, former AWS CEO, to lead the company.&lt;/li&gt;
&lt;li&gt;Selipsky left AWS in May 2024.&lt;/li&gt;
&lt;li&gt;Venture targets AI compute infrastructure gap.&lt;/li&gt;
&lt;li&gt;No name, locations, or investors disclosed yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adam Selipsky, who stepped down as CEO of Amazon Web Services in May 2024 after a tenure marked by aggressive expansion, is returning to lead a new venture focused on building AI data centers. &lt;a href="https://news.google.com/rss/articles/CBMimgFBVV95cUxOYUdXSjlNX3ZnUDViWWsyNEZUSVFBUl9hNUd1U25pV3B5TjBqZFdLUm93T0szRFF6eXZRbkN6YVRLdDNtSWZ3Y0x2YVZQaVY3LW43QlNmVW1WNXJNMzBXREZRaEtCN1RXVEZZOTRzX2Y1WW0xSFZORWhaTi1XMHlzUWQ2TzczNWVRcVZIZHZ0TGd0N004WV9mdlpn?oc=5" rel="noopener noreferrer"&gt;According to GeekWire&lt;/a&gt;, the venture is capitalized at $10 billion, a figure that underscores the immense capital required to build out AI compute infrastructure.&lt;/p&gt;

&lt;p&gt;The company has not yet disclosed its name, specific locations, or investor backing. The $10 billion figure, however, places it among the largest dedicated AI data center plays globally, rivaling projects from Crusoe (which claims a 5 GW pipeline) and Google's $11B/year commitment to SpaceX for compute. Selipsky's leadership lends immediate credibility and operational expertise, given his track record scaling AWS's infrastructure to serve millions of customers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters More Than the Press Release Suggests
&lt;/h3&gt;

&lt;p&gt;Selipsky's move is less about one startup and more about the structural shift in AI infrastructure. The hyperscalers—Google Cloud, AWS, Microsoft Azure—are all building their own capacity, but the demand from AI startups and enterprises is outstripping supply. New ventures like this one, backed by veteran operators, aim to fill the gap. The $10 billion figure also signals that capital markets are willing to fund independent infrastructure plays, not just the Big Tech incumbents.&lt;/p&gt;

&lt;p&gt;This is the second high-profile departure from AWS's executive ranks for an AI infrastructure bet. The trend suggests that the bottleneck for AI progress is no longer models or algorithms—it's the physical infrastructure to run them. Selipsky's venture will compete directly with offerings from his former employer, as well as Google Cloud and Microsoft, for the same pool of enterprise AI customers.&lt;/p&gt;

&lt;p&gt;The venture's success will hinge on its ability to secure power, land, and chips—the three scarce resources for AI data centers. With Google booking Intel for 3M+ TPUs in 2028 and Nvidia's Blackwell GPUs in high demand, supply chain access will be critical.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu3vt3j8egmyvajyncdfk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu3vt3j8egmyvajyncdfk.jpeg" alt="Ahead of re:Invent, Adam Selipsky hints at the AWS next-gen cloud" width="799" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for the venture's official name, first site location, and anchor investor reveal, likely within the next 90 days. Also watch whether Selipsky's venture secures a multi-year GPU reservation deal with Nvidia or another chipmaker—that would signal supply chain access.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://news.google.com/rss/articles/CBMimgFBVV95cUxOYUdXSjlNX3ZnUDViWWsyNEZUSVFBUl9hNUd1U25pV3B5TjBqZFdLUm93T0szRFF6eXZRbkN6YVRLdDNtSWZ3Y0x2YVZQaVY3LW43QlNmVW1WNXJNMzBXREZRaEtCN1RXVEZZOTRzX2Y1WW0xSFZORWhaTi1XMHlzUWQ2TzczNWVRcVZIZHZ0TGd0N004WV9mdlpn?oc=5" rel="noopener noreferrer"&gt;news.google.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[Updated 12 Jun via gn_dc_power]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Information reports that OpenAI is in talks to lease a 10-gigawatt data center campus in Ohio, with potential backing from Nvidia. The project could cost up to $500 billion and would be located on federal land. This development is separate from Selipsky's venture but underscores the same AI infrastructure race. Nvidia's involvement as a backer signals a deepening of its role beyond chip supplier to project financier [per The Information].&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/adam-selipsky-leaves-aws-to-lead" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>OpenAI Acquires Cloud Startup Ona to Power Agent Infrastructure</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 10:18:06 +0000</pubDate>
      <link>https://dev.to/gentic_news/openai-acquires-cloud-startup-ona-to-power-agent-infrastructure-2feh</link>
      <guid>https://dev.to/gentic_news/openai-acquires-cloud-startup-ona-to-power-agent-infrastructure-2feh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;OpenAI acquired cloud startup Ona to support AI agent infrastructure, two days after a $6.6B raise. The deal targets enterprise reliability gaps as OpenAI pivots to B2B.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;OpenAI agreed to acquire Ona, a cloud startup providing infrastructure for AI agents, Bloomberg reported Thursday. The acquisition targets enterprise reliability gaps as OpenAI pushes agents for business workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI acquired Ona, a cloud startup for AI agents&lt;/li&gt;
&lt;li&gt;Deal comes 2 days after $6.6B round at $157B valuation&lt;/li&gt;
&lt;li&gt;Ona provides infrastructure for reliable agent scaling&lt;/li&gt;
&lt;li&gt;OpenAI is exploring $10B enterprise AI joint venture&lt;/li&gt;
&lt;li&gt;Competitors like Anthropic offer agent products (Claude Code)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenAI has agreed to acquire Ona, a startup that offers cloud services to support artificial intelligence agents, part of a bid by the AI developer to make its technology more useful for businesses, &lt;a href="https://www.bloomberg.com/news/articles/2026-06-11/openai-to-acquire-cloud-platform-ona-to-support-ai-agents" rel="noopener noreferrer"&gt;according to Bloomberg&lt;/a&gt;. Ona provides infrastructure that enables AI agents to run reliably at scale, addressing a key bottleneck OpenAI's own platform faced in production deployments.&lt;/p&gt;

&lt;p&gt;The deal comes two days after OpenAI closed a $6.6B funding round at a $157B valuation, per the June 9 report. The company has shifted to full focus on the B2B sector, cutting sidequests, as noted in a May 20 report. Ona's cloud services are designed to handle the orchestration, monitoring and scaling of autonomous agents — software that uses large language models to perceive environments, make decisions and take actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Ona matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The acquisition signals OpenAI views agent infrastructure as a competitive moat, not just a feature. Competitors like Anthropic have already released agent-focused products such as Claude Code, which uses AI agents for coding tasks. By owning the cloud layer, OpenAI can control latency, cost and reliability for its agent stack — a contrast to relying on third-party cloud providers like Microsoft Azure, which has been a strategic partner. The deal value was not disclosed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enterprise push accelerates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OpenAI is in advanced talks with TPG, Bain, Brookfield and Advent for a $10B enterprise AI joint venture, per a May 20 report. The Ona acquisition directly supports that push: enterprise customers need agents that don't fail in production. A June 8 study from Anthropic found AI agents failed biology retrieval tasks, missing 261 Ebola sequences — underscoring the reliability challenge Ona's infrastructure aims to solve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI acquired cloud startup Ona to support AI agent infrastructure, two days after a $6.6B raise.&lt;/li&gt;
&lt;li&gt;The deal targets enterprise reliability gaps as OpenAI pivots to B2B.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrrqhh55lmpkxmftxvvr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrrqhh55lmpkxmftxvvr.jpg" alt="Exclusive: OpenAI taps Google in unprecedented cloud deal despite AI ..." width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for OpenAI's Q3 enterprise agent adoption metrics and whether the $10B joint venture with TPG, Bain, Brookfield and Advent closes. Ona's integration timeline and any agent benchmark improvements (e.g., SWE-Bench scores) will signal if the infrastructure bet pays off.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://www.bloomberg.com/news/articles/2026-06-11/openai-to-acquire-cloud-platform-ona-to-support-ai-agents" rel="noopener noreferrer"&gt;bloomberg.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[Updated 12 Jun via openai_blog]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OpenAI confirmed the acquisition in a blog post, stating Ona will expand Codex with secure, persistent cloud environments, enabling long-running AI agents across enterprise workflows [per OpenAI]. This marks the first official confirmation of the deal and clarifies Ona's role in enhancing Codex, OpenAI's coding AI, rather than just general agent infrastructure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/openai-acquires-cloud-startup-ona" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>Bezos' Prometheus Closes $12B Round at $41B Valuation</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 10:18:04 +0000</pubDate>
      <link>https://dev.to/gentic_news/bezos-prometheus-closes-12b-round-at-41b-valuation-4kn0</link>
      <guid>https://dev.to/gentic_news/bezos-prometheus-closes-12b-round-at-41b-valuation-4kn0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Jeff Bezos' Prometheus raised $12B at $41B valuation, totaling $18.2B with no product. The compute-heavy startup targets physical-world AI but faces skepticism.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Jeff Bezos' AI startup Prometheus closed a $12 billion round at a $41 billion valuation. The 7-month-old company, which has no product, has now raised $18.2 billion total — more than OpenAI's entire pre-2025 funding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$12B round at $41B valuation&lt;/li&gt;
&lt;li&gt;$6.2B seed in November 2025&lt;/li&gt;
&lt;li&gt;Total raised: $18.2B&lt;/li&gt;
&lt;li&gt;No product released yet&lt;/li&gt;
&lt;li&gt;Building AI for engineering, manufacturing, drug design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prometheus, founded by Jeff Bezos and Stanford professor Vik Bajaj (a Verily co-founder), has raised $12 billion in new funding at a $41 billion valuation, &lt;a href="https://the-decoder.com/jeff-bezos-ai-startup-prometheus-closes-12-billion-round-at-a-41-billion-valuation/" rel="noopener noreferrer"&gt;according to CNBC via The Decoder&lt;/a&gt;. The round follows a $6.2 billion seed in November 2025, making Prometheus one of the most capitalized AI startups ever without a single product release.&lt;/p&gt;

&lt;p&gt;Bezos says the capital is largely destined for compute infrastructure. He told CNBC that a big chunk of the money is going toward compute, since the work is "very compute intensive," especially data generation. The startup has hired employees away from OpenAI, Google DeepMind, and Nvidia, but has not demonstrated any output. Bezos calls sharing details "premature," though he adds it's "easy to imagine Amazon or any hyperscaler" using Prometheus' tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No product, $18.2B in the bank&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus is building AI models for physical tasks: engineering, manufacturing, and drug design, targeting the tech, automotive, and aerospace industries. The strategy echoes Bezos' own "regret minimization framework": bet big on compute now, figure out product later. But the market is growing skeptical of capital-intensive AI moonshots. For comparison, OpenAI raised $6.6B at a $157B valuation in October 2024, and Anthropic raised $3.5B at a $61.5B valuation in March 2025, both with products already on the market. Prometheus' $41B valuation for a pre-product company implies investors are pricing Bezos' track record, not the technology.&lt;/p&gt;

&lt;p&gt;The key question: can Prometheus deliver a physical-world AI model that justifies $18.2B before competitors like Nvidia (which makes the chips Prometheus buys) or Google DeepMind (which has AlphaFold for drug design) release comparable products? Bezos says the startup is building "for the long-term," but at this burn rate, the runway is shorter than it appears.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Jeff Bezos' Prometheus raised $12B at $41B valuation, totaling $18.2B with no product.&lt;/li&gt;
&lt;li&gt;The compute-heavy startup targets physical-world AI but faces skepticism.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmy05wbif5q0r4doxqwz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmy05wbif5q0r4doxqwz.jpg" alt="Jeff Bezos reportedly returns to the trenches as co-CEO of new AI ..." width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch for Prometheus' first product announcement — likely a physical-world AI model for engineering or drug design. If none by Q1 2027, investors may question the $41B valuation. Also watch for Nvidia or Google DeepMind releasing competing models.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://the-decoder.com/jeff-bezos-ai-startup-prometheus-closes-12-billion-round-at-a-41-billion-valuation/" rel="noopener noreferrer"&gt;the-decoder.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/bezos-prometheus-closes-12b-round" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>startup</category>
      <category>business</category>
      <category>funding</category>
    </item>
    <item>
      <title>SVoT Boosts MLLM Spatial Reasoning by 65% via RL-Verified Visual Chains</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Fri, 12 Jun 2026 04:18:16 +0000</pubDate>
      <link>https://dev.to/gentic_news/svot-boosts-mllm-spatial-reasoning-by-65-via-rl-verified-visual-chains-pao</link>
      <guid>https://dev.to/gentic_news/svot-boosts-mllm-spatial-reasoning-by-65-via-rl-verified-visual-chains-pao</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;SVoT uses RL to verify MLLM spatial reasoning states, achieving up to 65% accuracy gains on OOD tests across five domains including Pacman and Gather.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;SVoT, a new RL framework, verifies intermediate spatial reasoning states in MLLMs via GRPO training. On out-of-distribution tests, it achieves up to 65% absolute accuracy gains across five domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SVoT achieves up to 65% absolute accuracy gain on OOD tests.&lt;/li&gt;
&lt;li&gt;Trained via GRPO, same algorithm as DeepSeek-R1.&lt;/li&gt;
&lt;li&gt;Introduces Pacman and Gather domains for multi-object reasoning.&lt;/li&gt;
&lt;li&gt;Five domains total, extending classical environments.&lt;/li&gt;
&lt;li&gt;Published on arXiv on 10 Jun 2026.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multimodal large language models (MLLMs) stumble on multi-hop spatial reasoning because they treat state transitions as implicit processes and leave intermediate states unverified. A new paper, &lt;a href="https://arxiv.org/abs/2606.11770" rel="noopener noreferrer"&gt;SVoT: State-aware Visualization-of-Thought for Spatial Reasoning via Reinforcement Learning&lt;/a&gt;, from Chao Lei, Yanbei Jiang, Markus Hiller, and colleagues, addresses this with a reinforcement learning framework that generates interleaved, verifiable intermediate states and visualizations.&lt;/p&gt;

&lt;h2&gt;
  
  
  How SVoT Works
&lt;/h2&gt;

&lt;p&gt;SVoT integrates transition reasoning chains — explicit textual and visual descriptions of each action's preconditions and effects — into the generation process. It trains via Group Relative Policy Optimization (GRPO), the same algorithm behind DeepSeek-R1, but here instantiated with fine-grained reward design for state verification. The model learns to check its own intermediate reasoning steps before moving to the next, rather than hallucinating a path.&lt;/p&gt;
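
&lt;p&gt;As a rough illustration of the group-relative step in GRPO, the sketch below normalizes each sampled response's reward against the mean and standard deviation of its own group. The reward values, the use of the population standard deviation, and the function name &lt;code&gt;group_relative_advantages&lt;/code&gt; are assumptions for illustration; SVoT's fine-grained reward design is not reproduced here.&lt;/p&gt;

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four reasoning chains sampled for one prompt,
# e.g. the fraction of intermediate states that a checker confirmed.
rewards = [1.0, 0.5, 0.5, 0.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

&lt;p&gt;Chains that score above their group's average get a positive advantage and chains below it get a negative one, so no separate value model is needed to produce the training signal.&lt;/p&gt;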

&lt;h2&gt;
  
  
  The Benchmark Gap
&lt;/h2&gt;

&lt;p&gt;Existing spatial reasoning benchmarks reduce state transitions to single-variable updates, substantially simplifying the problems. The authors counter this by extending classical environments and introducing two novel domains — Pacman and Gather — that require multi-object interactions and numerical reasoning. These domains support quantitative verification of generated intermediate states, something prior benchmarks cannot do.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3mbcr0cq932ob08qfx3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3mbcr0cq932ob08qfx3e.png" alt="Figure 2: Examples of the CoT (transition reasoning chain) in SVoT used to guide the generation of intermediate state an" width="500" height="196"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;SVoT with transition-aware supervision achieves state-of-the-art performance across all five domains. On out-of-distribution test sets, the absolute accuracy gain reaches up to 65%. The framework's use of RL rather than supervised fine-tuning alone appears to help it generalize beyond the training distribution, a critical property for real-world deployment where environments vary.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34nv1aq7er93di8he8pi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34nv1aq7er93di8he8pi.png" alt="Figure 1: Illustration of the five domains used in SVoT. Coordinates are (row, column), starting from (0,0) at the top-l" width="799" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It Matters
&lt;/h2&gt;

&lt;p&gt;The core insight is that verification must be interleaved, not post-hoc. Chain-of-thought reasoning often fails spatial tasks because the model cannot detect its own errors mid-chain. SVoT's RL-based verification loop mirrors how humans re-check a map after each move. The gain of up to 65% on out-of-distribution tests suggests that a major bottleneck in MLLM spatial reasoning is state tracking rather than perception, and that RL offers a scalable way to address it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrar8uim498pqfofsiud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrar8uim498pqfofsiud.png" alt="Figure 3: The architectures of MVoT and SVoT." width="428" height="195"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for open-source implementations of SVoT's reward design on GitHub and whether the approach transfers to 3D spatial reasoning benchmarks like Habitat or Matterport3D. Also track if commercial MLLM providers (OpenAI, Google) adopt interleaved verification in their next model releases.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2606.11770" rel="noopener noreferrer"&gt;arxiv.org&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/svot-boosts-mllm-spatial-reasoning" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>How to Cut Agent Token Waste: CLI Over GraphQL + Server-Pushed Hints</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Thu, 11 Jun 2026 22:18:07 +0000</pubDate>
      <link>https://dev.to/gentic_news/how-to-cut-agent-token-waste-cli-over-graphql-server-pushed-hints-2ogd</link>
      <guid>https://dev.to/gentic_news/how-to-cut-agent-token-waste-cli-over-graphql-server-pushed-hints-2ogd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Replace raw GraphQL with typed CLI commands to eliminate JSON assembly errors, then add server-pushed hints via MCP to prevent judgment failures. Your agent burns 1,500+ tokens per operation otherwise.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Replace raw GraphQL with typed CLI commands to eliminate JSON assembly errors, then add server-pushed hints via MCP to prevent judgment failures.&lt;/li&gt;
&lt;li&gt;Your agent burns 1,500+ tokens per operation otherwise.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Problem: Your Agent Is Bleeding Tokens on JSON Assembly
&lt;/h2&gt;

&lt;p&gt;You designed the perfect architecture — direct API calls, no MCP overhead, a clean SKILL.md behavior spec. The agent calls your GraphQL endpoint with curl, reads your docs, and executes. Elegant.&lt;/p&gt;

&lt;p&gt;Then you watch the token counter. A single upload operation that should cost ~200 tokens burns 1,500+. Why? The agent is guessing JSON field formats wrong, getting GraphQL errors, fetching docs across multiple pages to figure out the correct format, and retrying. Every. Single. Time.&lt;/p&gt;

&lt;p&gt;This isn't a documentation problem. It's a structural problem: &lt;strong&gt;LLMs are fundamentally bad at assembling nested JSON payloads from scratch.&lt;/strong&gt; You can fix your docs a hundred times and the agent will find a new field to misformat.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: Typed CLI Arguments
&lt;/h2&gt;

&lt;p&gt;Instead of making the agent assemble raw JSON in curl commands, wrap your API in a CLI with typed arguments:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before: agent assembles raw JSON in curl&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST /graphql &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"query":"mutation { uploadAsset(input: { shotId: \"...\", type: \"start_frame\", provenance: { method: \"ai_generated\", model: \"gpt-image-2\", prompt: \"...\" } }) { id } }"}'&lt;/span&gt;

&lt;span class="c"&gt;# After: typed CLI arguments, zero JSON assembly&lt;/span&gt;
python3 nl.py upload &amp;lt;shotId&amp;gt; start_frame frame.png &lt;span class="nt"&gt;--method&lt;/span&gt; ai_generated &lt;span class="nt"&gt;--model&lt;/span&gt; &lt;span class="s2"&gt;"gpt-image-2"&lt;/span&gt; &lt;span class="nt"&gt;--prompt&lt;/span&gt; &lt;span class="s2"&gt;"Winter city street"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This removes the JSON assembly step that drives the error-recovery loop. The agent passes flags, not JSON, and the CLI dispatcher handles type conversion and builds the request.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bonus: One CLI, Two Audiences
&lt;/h3&gt;

&lt;p&gt;Add a &lt;code&gt;--json&lt;/code&gt; flag so the same CLI serves both the agent (structured data) and you (human-readable output):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For the agent: structured JSON for parsing&lt;/span&gt;
python3 nl.py overview &amp;lt;noteId&amp;gt; &lt;span class="nt"&gt;--json&lt;/span&gt;

&lt;span class="c"&gt;# For you watching: readable progress&lt;/span&gt;
python3 nl.py overview &amp;lt;noteId&amp;gt;
&lt;span class="c"&gt;# Episode 01: The Algorithm Hunter&lt;/span&gt;
&lt;span class="c"&gt;#   [===done===|--review--|......not_started.......] 3/12&lt;/span&gt;
&lt;span class="c"&gt;#   Shot   Status       Rolls    Best   PF&lt;/span&gt;
&lt;span class="c"&gt;#   01A    done         3        48     Y&lt;/span&gt;
&lt;span class="c"&gt;#   01B    review       2        41     Y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Next Level: Server-Pushed Hints
&lt;/h2&gt;

&lt;p&gt;CLI fixed execution errors. But your agent still makes bad decisions — re-rolling without changing prompts, forgetting to use uploaded assets, skipping status updates. These are judgment failures, not execution failures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8bwqey9aoe7og6nzajf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8bwqey9aoe7og6nzajf.png" alt="Cover image for My server pushes hints to agents — and the 3 iterations that led there" width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The solution: let your server push hints to the agent proactively. When the server detects an impending mistake (e.g., a prompt written without referencing available assets), it injects a hint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pendingHints&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;available_refs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Available refs for prompting: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;refs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`@&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;assetType&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;targetId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;shot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;refs&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches failures before they happen. The agent doesn't have to remember everything — the server nudges it at the critical moment.&lt;/p&gt;
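
&lt;p&gt;The detection step that precedes such a push might look roughly like the Python sketch below. It assumes a hint is warranted when the prompt mentions none of the available assets by &lt;code&gt;@filename&lt;/code&gt;; the function name &lt;code&gt;build_ref_hint&lt;/code&gt;, the &lt;code&gt;pending_hints&lt;/code&gt; list, and the sample data are hypothetical, while the hint fields mirror the snippet above.&lt;/p&gt;

```python
def build_ref_hint(prompt, refs, target_id):
    """Return a hint dict when the prompt mentions none of the available refs."""
    if not refs:
        return None
    if any(f"@{r['filename']}" in prompt for r in refs):
        return None  # the agent already referenced an asset, so no nudge is needed
    listing = ", ".join(f"@{r['filename']} ({r['assetType']})" for r in refs)
    return {
        "type": "available_refs",
        "priority": "high",
        "message": f"Available refs for prompting: {listing}",
        "metadata": {"targetId": target_id, "refs": refs},
    }


# Hypothetical data: one uploaded asset and a prompt that ignores it.
refs = [{"filename": "frame.png", "assetType": "start_frame"}]
pending_hints = []
hint = build_ref_hint("Winter city street", refs, target_id="shot-01A")
if hint is not None:
    pending_hints.append(hint)
print(pending_hints[0]["message"])
```

&lt;p&gt;Keeping the check as a pure function makes it easy to run after every operation and to unit-test separately from whatever transport delivers the hint.&lt;/p&gt;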

&lt;h2&gt;
  
  
  How to Apply This to Your Claude Code Workflow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Audit your agent's token waste&lt;/strong&gt;: Watch for error→doc→retry loops. If you see them, the fix isn't better docs — it's eliminating the assembly step.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build a CLI wrapper&lt;/strong&gt;: Create a typed CLI for your API. Even a simple Python script with argparse is enough. Route every command through it (the source author's CLI has 34).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add server-pushed hints&lt;/strong&gt;: After each operation, check for common judgment failures and inject hints before the agent's next action. Ask your agent: "Would a nudge here have prevented this?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterate with reflection&lt;/strong&gt;: Pause production periodically and ask your agent what gaps in your behavior spec caused inefficient actions. Fix those gaps. Repeat.&lt;/li&gt;
&lt;/ol&gt;
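
&lt;p&gt;Step 2 can be sketched with &lt;code&gt;argparse&lt;/code&gt; from the Python standard library. This is a minimal sketch, not the source author's &lt;code&gt;nl.py&lt;/code&gt;: the subcommand layout follows the &lt;code&gt;upload&lt;/code&gt; example earlier, the &lt;code&gt;end_frame&lt;/code&gt; choice and the helper names are assumptions, and the network call and file transfer are left out.&lt;/p&gt;

```python
import argparse
import json


def build_parser():
    parser = argparse.ArgumentParser(prog="nl.py")
    sub = parser.add_subparsers(dest="command", required=True)
    upload = sub.add_parser("upload", help="upload an asset for a shot")
    upload.add_argument("shot_id")
    # "end_frame" is a hypothetical second value, included to show validation.
    upload.add_argument("asset_type", choices=["start_frame", "end_frame"])
    upload.add_argument("path")
    upload.add_argument("--method", default="ai_generated")
    upload.add_argument("--model")
    upload.add_argument("--prompt")
    return parser


def build_payload(args):
    """Assemble the nested input in code so the agent never writes JSON by hand."""
    return {
        "shotId": args.shot_id,
        "type": args.asset_type,
        "provenance": {
            "method": args.method,
            "model": args.model,
            "prompt": args.prompt,
        },
    }


# Parse an explicit argument list here; a real script would read sys.argv.
args = build_parser().parse_args(
    ["upload", "shot-01A", "start_frame", "frame.png",
     "--model", "gpt-image-2", "--prompt", "Winter city street"]
)
print(json.dumps(build_payload(args), indent=2))
```

&lt;p&gt;Because &lt;code&gt;choices&lt;/code&gt; rejects unknown asset types before any request is built, a mistyped value fails with a short usage message instead of a GraphQL error.&lt;/p&gt;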

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CLI arguments are inherently type-safe for LLMs&lt;/strong&gt; — no JSON assembly, no field guessing, no error recovery loops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-pushed hints are cheaper than error recovery&lt;/strong&gt; — injecting a hint costs ~50 tokens; recovering from a wrong decision costs 1,500+.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your agent is your best auditor&lt;/strong&gt; — it knows exactly where your spec failed it. Just ask.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't about avoiding MCP. It's about recognizing that the real work starts after the architecture is in place. The agent needs guardrails, not just documentation.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://dev.to/raymondnl/my-server-pushes-hints-to-agents-and-the-3-iterations-that-led-there-51a9"&gt;dev.to&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/how-to-cut-agent-token-waste-cli" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Claude Desktop Spawns 1.8 GB Hyper-V VM on Every Windows Launch</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Thu, 11 Jun 2026 22:18:04 +0000</pubDate>
      <link>https://dev.to/gentic_news/claude-desktop-spawns-18-gb-hyper-v-vm-on-every-windows-launch-4f0c</link>
      <guid>https://dev.to/gentic_news/claude-desktop-spawns-18-gb-hyper-v-vm-on-every-windows-launch-4f0c</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Claude Desktop spawns a 1.8 GB Hyper-V VM on every Windows launch due to 2,689 stale session files, consuming 11% of RAM.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude Desktop on Windows spawns a 1.8 GB Hyper-V VM on every launch, even for chat-only use. A GitHub bug report reveals 2,689 stale session files trigger the infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1.8 GB VM per launch on Windows&lt;/li&gt;
&lt;li&gt;11% of 16 GB RAM consumed&lt;/li&gt;
&lt;li&gt;2,689 stale session files found&lt;/li&gt;
&lt;li&gt;Errors since February 19, 2026&lt;/li&gt;
&lt;li&gt;Only VirtualMachinePlatform enabled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A &lt;a href="https://github.com/anthropics/claude-code/issues/29045" rel="noopener noreferrer"&gt;GitHub issue&lt;/a&gt; filed on February 26, 2026, documents that Anthropic's Claude Desktop app for Windows launches a Hyper-V virtual machine consuming approximately 1.8 GB of RAM every time it starts — even when the user only needs chat functionality. On a 16 GB laptop, this represents over 11% of total memory consumed by infrastructure that isn't being used.&lt;/p&gt;

&lt;p&gt;The bug report, which has garnered 329 points and 232 comments on Hacker News, details that the app triggers the Hyper-V Host Compute Service (vmcompute) via an RPC interface event on every launch. This spawns a vmwp.exe process hosting a full virtual machine, appearing as "Vmmem" in Task Manager at approximately 1,796–1,846 MB. The Hyper-V Compute Admin event log shows repeated errors: "The specified property query is invalid: The virtual machine or container JSON document is invalid. (0xC037010D, 'Invalid JSON document '$'')." These errors have been occurring since at least February 19, 2026, triggered on every boot and app launch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Root Cause: Stale Session Files
&lt;/h3&gt;

&lt;p&gt;Through extensive PowerShell diagnostics, the reporter confirmed that WSL, Hyper-V management tools, Docker, and Windows Sandbox are all disabled. The only enabled virtualization feature is VirtualMachinePlatform. The investigation found 2,689 stale session files in &lt;code&gt;%APPDATA%\Claude\local-agent-m&lt;/code&gt;, likely created by prior use of Cowork or agent mode. These files trigger the VM spawn on every launch, even when the user has no intention of using agentic features.&lt;/p&gt;

&lt;p&gt;The vmcompute service is set to Manual start but is triggered at boot by an RPC interface event (GUID: bc90d167-9470-4139-a9ba-be0bbbf5b74d). The parent process is services.exe (PID 1400), confirming it's a service trigger, not a user-initiated launch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Broader Implications
&lt;/h3&gt;

&lt;p&gt;Commenters on Hacker News read the bug as a symptom of rushed engineering at Anthropic as it scales. One pointed comment reads: "I just found a really pointed example of Anthropics lack of craft / rush to build. If you open Claude on Windows, and click Dispatch (under cowork) to start that up, it will tell you that you need permissions windows doesn't have. When you click the buttons for those permissions, it has broken links to macOS system preferences." This comes as &lt;a href="https://gentic.news/anthropic-ipo-considered-as-early-as-late-2026" rel="noopener noreferrer"&gt;Anthropic reportedly considers an IPO&lt;/a&gt; as early as late 2026, and the company is &lt;a href="https://gentic.news/anthropic-projected-to-surpass-openai-in-arr-by-mid-2026" rel="noopener noreferrer"&gt;projected to surpass OpenAI in ARR&lt;/a&gt; by mid-2026. The VM bloat is a concrete example of the infrastructure debt that can accumulate when product velocity outpaces engineering discipline.&lt;/p&gt;

&lt;h3&gt;
  
  
  Workaround
&lt;/h3&gt;

&lt;p&gt;Users on the thread report that deleting the stale session files in &lt;code&gt;%APPDATA%\Claude\local-agent-m&lt;/code&gt; resolves the issue temporarily, but the files are recreated on agent mode use. A permanent fix would require Anthropic to either clean up session files on exit or defer VM creation until agent mode is actually requested.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for whether Anthropic ships a patch in upcoming releases. If the issue persists into Q3 2026, it could signal deeper infrastructure debt ahead of a potential IPO. Also track whether other Windows users report similar stale session file buildup as agent mode adoption grows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fprivate-user-images.githubusercontent.com%2F255574547%2F555498971-3d345f14-abce-442e-9ef2-538fcd749200.png%3Fjwt%3DeyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3ODExMzg3NDYsIm5iZiI6MTc4MTEzODQ0NiwicGF0aCI6Ii8yNTU1NzQ1NDcvNTU1NDk4OTcxLTNkMzQ1ZjE0LWFiY2UtNDQyZS05ZWYyLTUzOGZjZDc0OTIwMC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjYwNjExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI2MDYxMVQwMDQwNDZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03ZWExNTdiOTM0NTQ2YzdlMGM1OTNmZjM4Zjc0N2E1ZDAxNTk3ZGFmZTZhMTUwODRmMzVjZDFiZmE0MThiMzk0JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZyZXNwb25zZS1jb250ZW50LXR5cGU9aW1hZ2UlMkZwbmcifQ.-ZbH5mUZkwbo2segwG5v069GPt4ViVkLzMDwZPudjMw" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fprivate-user-images.githubusercontent.com%2F255574547%2F555498971-3d345f14-abce-442e-9ef2-538fcd749200.png%3Fjwt%3DeyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3ODExMzg3NDYsIm5iZiI6MTc4MTEzODQ0NiwicGF0aCI6Ii8yNTU1NzQ1NDcvNTU1NDk4OTcxLTNkMzQ1ZjE0LWFiY2UtNDQyZS05ZWYyLTUzOGZjZDc0OTIwMC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjYwNjExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI2MDYxMVQwMDQwNDZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03ZWExNTdiOTM0NTQ2YzdlMGM1OTNmZjM4Zjc0N2E1ZDAxNTk3ZGFmZTZhMTUwODRmMzVjZDFiZmE0MThiMzk0JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZyZXNwb25zZS1jb250ZW50LXR5cGU9aW1hZ2UlMkZwbmcifQ.-ZbH5mUZkwbo2segwG5v069GPt4ViVkLzMDwZPudjMw" alt="Image" width="800" 
height="400"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/anthropics/claude-code/issues/29045" rel="noopener noreferrer"&gt;github.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[Updated 11 Jun via fortune_tech]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Separately, Anthropic faced backlash this week over a policy in Claude Fable 5's 319-page system card that silently limited responses for AI development work. After researchers accused the company of 'secret sabotage,' Anthropic apologized and announced changes to make such safeguards visible [per Fortune]. Flagged requests will now visibly fall back to Opus 4.8, and API refusals will include a reason. 'We made the wrong tradeoff,' Anthropic said in a statement to WIRED.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/claude-desktop-spawns-1-8-gb-hyper" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>Lung-R1-14B Tops EMR Diagnosis with Knowledge Graph-Guided RL</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Thu, 11 Jun 2026 16:18:04 +0000</pubDate>
      <link>https://dev.to/gentic_news/lung-r1-14b-tops-emr-diagnosis-with-knowledge-graph-guided-rl-2mgh</link>
      <guid>https://dev.to/gentic_news/lung-r1-14b-tops-emr-diagnosis-with-knowledge-graph-guided-rl-2mgh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Lung-R1-14B scored 4.3583 on EMR diagnosis, beating 20 systems using a 59K-node knowledge graph and RL-constrained reasoning.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A new 14B-parameter LLM, Lung-R1, scored 4.3583 on an EMR diagnosis benchmark, beating all 20 rival systems. The model, described in a June 2026 arXiv paper, uses a 59,038-node knowledge graph called LungKG to constrain its reasoning chains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LungKG: 59,038 nodes, 164,308 edges.&lt;/li&gt;
&lt;li&gt;15 entity types, 112 relation types.&lt;/li&gt;
&lt;li&gt;Lung-R1-14B EMR Diagnosis score: 4.3583.&lt;/li&gt;
&lt;li&gt;Beats strongest baseline by 0.1476 points.&lt;/li&gt;
&lt;li&gt;Evaluated across 20 systems on 3 tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pulmonary diagnosis remains a hard problem for LLMs because it requires integrating heterogeneous evidence from electronic medical records (EMRs), not just recalling textbook knowledge. The authors of &lt;a href="https://arxiv.org/abs/2606.11675" rel="noopener noreferrer"&gt;Lung-R1: A Knowledge Graph-Guided LLM for Pulmonary Diagnostic Reasoning&lt;/a&gt; formalize this as the "Pulmonary Knowledge-to-Diagnosis Gap."&lt;/p&gt;

&lt;p&gt;To bridge it, they built LungKG, the first structured pulmonary knowledge graph for diagnostic knowledge organization. LungKG contains 59,038 nodes and 164,308 edges across 15 entity types and 112 relation types, serving as both a reusable resource and the foundation for model adaptation.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Lung-R1 works
&lt;/h2&gt;

&lt;p&gt;The training pipeline has three stages: KG-constrained reasoning-chain construction, supervised fine-tuning (SFT), and KG-guided reinforcement learning (RL). The RL stage rewards reasoning paths that stay within the graph's relational structure, penalizing jumps that lack edge support.&lt;/p&gt;
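
&lt;p&gt;One way to picture an edge-support reward is the toy sketch below, which adds a bonus for each hop in a reasoning path that matches a graph edge and subtracts a penalty for each hop that does not. The function name &lt;code&gt;path_reward&lt;/code&gt;, the scoring weights, and the example nodes are assumptions for illustration; they are not taken from the paper or from LungKG.&lt;/p&gt;

```python
def path_reward(path, edges, step_bonus=1.0, jump_penalty=1.0):
    """Score a reasoning path: reward hops backed by a graph edge, penalize the rest."""
    score = 0.0
    for head, tail in zip(path, path[1:]):
        if (head, tail) in edges:
            score += step_bonus
        else:
            score -= jump_penalty
    return score


# Toy directed edges with placeholder node names (not real LungKG entries).
edges = {
    ("symptom:chronic_cough", "disease:copd"),
    ("disease:copd", "test:spirometry"),
}
supported = ["symptom:chronic_cough", "disease:copd", "test:spirometry"]
unsupported = ["symptom:chronic_cough", "test:spirometry"]

print(path_reward(supported, edges))    # 2.0: both hops have edge support
print(path_reward(unsupported, edges))  # -1.0: the single hop skips the graph
```

&lt;p&gt;A real reward would presumably also account for relation types and answer correctness; this sketch isolates only the edge-support idea described above.&lt;/p&gt;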

&lt;p&gt;In a 20-system evaluation, Lung-R1-14B achieved state-of-the-art performance across all three tasks: Choice (multiple-choice knowledge), Pulmonary-QA (open-ended questions), and EMR Diagnosis (patient-specific record reasoning). The EMR Diagnosis score of 4.3583 surpassed the strongest non-Lung-R1 baseline by 0.1476 points. The authors did not disclose the exact baseline model, and a 20-system comparison alone does not establish whether the margin is statistically significant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the graph matters
&lt;/h2&gt;

&lt;p&gt;The improvement is modest — 0.1476 points on a 5-point scale — but the approach signals a shift away from pure retrieval-augmented generation (RAG) for clinical reasoning. RAG retrieves text chunks; LungKG retrieves structured relations. The graph constrains the LLM to reason about explicit disease-symptom-treatment edges rather than freeform text associations. This could reduce hallucination in high-stakes diagnostic settings, though the paper does not report hallucination rates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Farxiv.org%2Fhtml%2F2606.11675v1%2Ffigures%2Flunghub_pipeline_overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Farxiv.org%2Fhtml%2F2606.11675v1%2Ffigures%2Flunghub_pipeline_overview.png" alt="Figure 2: Overview of the LungKG-guided Lung-R1 pipeline: (a) LungKG construction from validated pulmonary sources; (b)" width="800" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What to watch
&lt;/h2&gt;

&lt;p&gt;Watch for whether the authors release LungKG as a reusable resource and whether follow-up work reports hallucination rates or ablation of the KG-constrained RL stage. A clinical deployment study at a partner hospital would be the strongest signal of real-world viability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqgleu9bvu4yy9nyxinz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqgleu9bvu4yy9nyxinz.png" alt="Figure 1: EMR diagnosis performance on the EMR Diagnosis task. Lung-R1 achieves state-of-the-art performance at 7B/14B s" width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2606.11675" rel="noopener noreferrer"&gt;arxiv.org&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/lung-r1-14b-tops-emr-diagnosis" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Nvidia Buys Kumo AI for $400M to Predict from Business Data</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Thu, 11 Jun 2026 08:15:01 +0000</pubDate>
      <link>https://dev.to/gentic_news/nvidia-buys-kumo-ai-for-400m-to-predict-from-business-data-1n38</link>
      <guid>https://dev.to/gentic_news/nvidia-buys-kumo-ai-for-400m-to-predict-from-business-data-1n38</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Nvidia acquired Kumo AI for $400M+ to bring foundation model predictions to enterprise relational data, filling a gap left by LLMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nvidia acquired Kumo AI for over $400 million, per The Information. The deal targets a gap LLMs have left: predictions from relational database data, not just documents and code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deal valued at more than $400 million.&lt;/li&gt;
&lt;li&gt;Kumo raised $37 million from Sequoia Capital and others.&lt;/li&gt;
&lt;li&gt;KumoRFM outperforms gradient-boosted trees on RelBench benchmark.&lt;/li&gt;
&lt;li&gt;Fine-tuning lifts results by 10% to 30%.&lt;/li&gt;
&lt;li&gt;Customers include DoorDash, Reddit, and Snowflake.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nvidia has acquired Kumo AI, a four-year-old startup that builds foundation models for making predictions from business data, &lt;a href="https://www.forbes.com/sites/janakirammsv/2026/06/10/nvidia-kumo-ai-enterprise-data/" rel="noopener noreferrer"&gt;Forbes reported on June 10&lt;/a&gt;. The Information pegged the deal at more than $400 million. Kumo's three co-founders (CEO Vanja Josifovski, engineering head Hema Raghavan and Stanford professor Jure Leskovec) moved to Nvidia in May, though neither company has formally announced the transaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Kumo's Technology Matters
&lt;/h3&gt;

&lt;p&gt;Kumo's core model, KumoRFM, is a pre-trained relational graph transformer. It represents a database as a graph, where every record becomes a node and primary-foreign key links become edges. Because the model was pre-trained on thousands of real and synthetic relational datasets, it can make predictions on a database it has never seen, without task-specific training. Users define the prediction — such as which customers will churn in the next 30 days — through a lightweight query language.&lt;/p&gt;
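
&lt;p&gt;The database-as-graph representation can be sketched in a few lines. This is an illustration of the general idea only, not Kumo's API or data model: the two-table schema, the column names, and the &lt;code&gt;build_graph&lt;/code&gt; helper are assumptions made up for the example. Each row becomes a node keyed by table name and primary key, and each foreign-key reference becomes an edge.&lt;/p&gt;

```python
# Hypothetical two-table schema; not Kumo's API or data model.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Lin"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
    {"order_id": 12, "customer_id": 2, "total": 15.0},
]


def build_graph(customers, orders):
    """Turn rows into nodes and foreign-key references into edges."""
    nodes = {}
    edges = []
    for row in customers:
        # Node id is (table name, primary key); the row holds the features.
        nodes[("customers", row["customer_id"])] = row
    for row in orders:
        node = ("orders", row["order_id"])
        nodes[node] = row
        # orders.customer_id is a foreign key into customers.
        edges.append((node, ("customers", row["customer_id"])))
    return nodes, edges


nodes, edges = build_graph(customers, orders)
print(len(nodes), len(edges))  # 5 3
```

&lt;p&gt;A churn-style prediction would then be posed on the customer nodes, with the linked order nodes supplying the context a graph model aggregates over.&lt;/p&gt;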

&lt;p&gt;On the RelBench benchmark, which spans 30 predictive tasks across seven domains, Kumo reports that the zero-shot model outperforms gradient-boosted trees built with hand-crafted features, and that fine-tuning lifts results by a further 10% to 30%. The startup, backed by $37 million from investors including Sequoia Capital, shipped a second-generation model in April and counts DoorDash, Reddit and Snowflake among its users.&lt;/p&gt;

&lt;h3&gt;
  
  
  Strategic Implications
&lt;/h3&gt;

&lt;p&gt;The acquisition follows a familiar pattern. Nvidia bought Run:ai for roughly $700 million to own GPU orchestration, picked up the data semantics startup Illumex, and signed the Groq agreement for low-latency inference. Each deal moves Nvidia further from selling chips and closer to owning the software enterprises run on those chips. Kumo extends that motion into predictive analytics, a market served today by gradient-boosted tooling, AutoML vendors and the machine learning services of AWS, Google Cloud and Microsoft.&lt;/p&gt;

&lt;p&gt;The deal also creates an awkward dynamic for Snowflake and Databricks, which position their platforms as the natural home for machine learning on enterprise data and now find a prominent predictive AI vendor inside the company they depend on for accelerated computing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Challenges Ahead
&lt;/h3&gt;

&lt;p&gt;The evidence for KumoRFM's accuracy comes almost entirely from Kumo's own benchmarks. Independent validation will be critical. Additionally, enterprise adoption of predictive AI on relational data has historically been slow, with most companies still relying on SQL-based analytics rather than ML pipelines. Nvidia will need to integrate Kumo into its existing software stack — possibly through NeMo or CUDA — to drive adoption at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  What to watch
&lt;/h3&gt;

&lt;p&gt;Watch for Nvidia's integration of KumoRFM into NeMo or CUDA by Q4 2026, and whether Snowflake or Databricks respond with their own predictive AI acquisitions or partnerships.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry88vc0cpg8nia64xfjc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry88vc0cpg8nia64xfjc.jpg" alt="Nvidia HQ" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://www.forbes.com/sites/janakirammsv/2026/06/10/nvidia-kumo-ai-enterprise-data/" rel="noopener noreferrer"&gt;forbes.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/nvidia-buys-kumo-ai-for-400m-to" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tech</category>
      <category>product</category>
    </item>
    <item>
      <title>Google Open-Sources DiffusionGemma, 26B Model Hits 1K Tokens/Sec on H100</title>
      <dc:creator>gentic news</dc:creator>
      <pubDate>Thu, 11 Jun 2026 08:14:59 +0000</pubDate>
      <link>https://dev.to/gentic_news/google-open-sources-diffusiongemma-26b-model-hits-1k-tokenssec-on-h100-2nak</link>
      <guid>https://dev.to/gentic_news/google-open-sources-diffusiongemma-26b-model-hits-1k-tokenssec-on-h100-2nak</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Google open-sourced DiffusionGemma, a 26B-parameter diffusion text model hitting 1,000 tokens/sec on H100 — 4x faster than autoregressive models, but with lower quality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On June 10, Google released DiffusionGemma, a 26B-parameter open-weight model that generates text via diffusion. Nvidia claims 1,000 tokens per second on a single H100 GPU, roughly 4x faster than autoregressive models like Gemma 4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;26 billion total parameters, ~4 billion active per token (MoE).&lt;/li&gt;
&lt;li&gt;1,000 tokens per second claimed on a single H100 GPU.&lt;/li&gt;
&lt;li&gt;Apache 2.0 license — fully open-weight.&lt;/li&gt;
&lt;li&gt;Available on Hugging Face: google/diffusiongemma-26B-A4B-it.&lt;/li&gt;
&lt;li&gt;Nvidia hosts free inference on NIM cloud API.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Google released DiffusionGemma, a 26-billion-parameter model that generates text not token by token but through diffusion, similar to how image AI turns noise into a picture. &lt;a href="https://the-decoder.com/googles-new-open-model-diffusiongemma-generates-text-from-noise-instead-of-word-by-word/" rel="noopener noreferrer"&gt;According to The Decoder&lt;/a&gt; and &lt;a href="https://simonwillison.net/2026/Jun/10/diffusiongemma/" rel="noopener noreferrer"&gt;Simon Willison's blog&lt;/a&gt;, the model is available on Hugging Face as &lt;code&gt;google/diffusiongemma-26B-A4B-it&lt;/code&gt; under an Apache 2.0 license, a significant departure from Google's typically more restricted model releases.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it works and why speed matters
&lt;/h3&gt;

&lt;p&gt;DiffusionGemma eschews the standard autoregressive approach (predicting one token at a time) for a continuous diffusion process that iteratively denoises a latent representation of the entire output sequence. This parallel generation is what enables the speedup: Nvidia claims it hits about 1,000 tokens per second on a single H100 GPU, roughly four times faster than comparable autoregressive models. Simon Willison tested the model via Nvidia's NIM cloud API, reporting 2,409 tokens generated in 4.4 seconds — at least 500 tokens/second, with overhead from Python tooling, so raw inference is likely faster.&lt;/p&gt;
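
&lt;p&gt;The throughput figures above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in this article; the variable names are illustrative, and the measured rate includes client and network overhead, so it is a lower bound rather than a benchmark.&lt;/p&gt;

```python
# Figures quoted in the article; variable names are illustrative.
tokens_generated = 2409       # tokens in Willison's test run
elapsed_seconds = 4.4         # wall-clock time, including tooling overhead
claimed_tokens_per_sec = 1000 # Nvidia's single-H100 claim

measured = tokens_generated / elapsed_seconds
share_of_claim = measured / claimed_tokens_per_sec

print(round(measured, 1))        # 547.5
print(round(share_of_claim, 2))  # 0.55
```

&lt;p&gt;The end-to-end measurement works out to roughly 55% of the claimed rate, which is consistent with the "at least 500 tokens/second" reading.&lt;/p&gt;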

&lt;p&gt;This isn't Google's first diffusion-for-text experiment. In May 2025, Google briefly released an experimental Gemini Diffusion model; Willison recorded it running at 857 tokens/second at the time. That research has now returned as a fully open-weight Gemma model, suggesting Google is serious about making diffusion-based text generation a production-ready alternative.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quality trade-off and positioning
&lt;/h3&gt;

&lt;p&gt;Output quality is lower than that of autoregressive models, so Google is positioning DiffusionGemma as an experimental tool for developers for now. The model is a 26B-parameter Mixture of Experts (26B-A4B), meaning only ~4B parameters are active per token, a design choice that keeps inference cheap. Nvidia is currently hosting the model for free on its NIM cloud API, lowering the barrier for developers to experiment.&lt;/p&gt;
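
&lt;p&gt;A rough sense of what "26B-A4B" means in practice comes from two back-of-the-envelope numbers. The parameter counts are the ones quoted above; the 2 bytes per parameter (16-bit weights) is an assumption for illustration, not a published deployment detail. Per-token compute scales with the active parameters, while a Mixture of Experts model typically still has to store all of its weights.&lt;/p&gt;

```python
# Parameter counts from the article; bytes_per_param is an assumption.
total_params = 26e9    # 26B total parameters
active_params = 4e9    # ~4B active per token
bytes_per_param = 2    # assumed 16-bit weights

active_fraction = active_params / total_params
weight_memory_gb = total_params * bytes_per_param / 1e9

print(round(active_fraction, 3))  # 0.154
print(weight_memory_gb)           # 52.0
```

&lt;p&gt;Under these assumptions about 15% of the weights do work for any given token, while roughly 52 GB is needed just to hold the full set of weights.&lt;/p&gt;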

&lt;h3&gt;
  
  
  Community reaction and context
&lt;/h3&gt;

&lt;p&gt;One Hacker News commenter pointed to the strategic significance: "Google keeps flexin'. It's surprising that Gemini isn't more competitive against Claude or OpenAI models for code and agentic use, because it's clear Google still has some of the best AI people in the business." The model's speed makes it particularly relevant for on-device and near-realtime use cases, a domain where Google has invested heavily, from Gemini Nano to TPU v6e deployments.&lt;/p&gt;

&lt;h3&gt;
  
  
  What to watch
&lt;/h3&gt;

&lt;p&gt;Watch for benchmark results on standard NLP tasks (MMLU, HellaSwag, HumanEval) as the community stress-tests DiffusionGemma against Gemma 4 and Llama 4. The key question is whether the quality gap narrows with fine-tuning or with more diffusion steps. Also watch Nvidia's NIM usage metrics: a spike in developer adoption would signal real demand for non-autoregressive architectures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6ggfanbabd9lzzxth44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6ggfanbabd9lzzxth44.png" alt="Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://simonwillison.net/2026/Jun/10/diffusiongemma/" rel="noopener noreferrer"&gt;simonwillison.net&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://gentic.news/article/google-open-sources-diffusiongemma" rel="noopener noreferrer"&gt;gentic.news&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>research</category>
      <category>deeplearning</category>
    </item>
  </channel>
</rss>
