<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Siddhesh Surve</title>
    <description>The latest articles on DEV Community by Siddhesh Surve (@siddhesh_surve).</description>
    <link>https://dev.to/siddhesh_surve</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3674466%2F4395d561-d8af-4cbb-be2a-2fd3696ad2b2.png</url>
      <title>DEV Community: Siddhesh Surve</title>
      <link>https://dev.to/siddhesh_surve</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/siddhesh_surve"/>
    <language>en</language>
    <item>
      <title>🚀 Meta Just Killed Open Source Llama: Welcome to the 'Muse Spark' Era (And What It Means for Developers)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Thu, 14 May 2026 02:52:29 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/meta-just-killed-open-source-llama-welcome-to-the-muse-spark-era-and-what-it-means-for-22fi</link>
      <guid>https://dev.to/siddhesh_surve/meta-just-killed-open-source-llama-welcome-to-the-muse-spark-era-and-what-it-means-for-22fi</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffty9vt46jd0o6fdrseyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffty9vt46jd0o6fdrseyw.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the last two years, the developer ecosystem has heavily relied on Meta as the champion of open-weight models. We built our local pipelines around Llama 2 and Llama 3, assuming the open-source train would keep rolling. &lt;/p&gt;

&lt;p&gt;That era has officially ended. &lt;/p&gt;

&lt;p&gt;Meta has pivoted away from its open-source Llama strategy, introducing a closed, proprietary AI model called &lt;strong&gt;Muse Spark&lt;/strong&gt;. This isn't just a backend update; it is a fundamental architectural shift that ties natively into the new Meta Glasses and fundamentally changes how we build agentic workflows.&lt;/p&gt;

&lt;p&gt;Having spent over 12 years in the industry—navigating the shifts from legacy Microsoft server architectures to modern distributed systems—I can tell you that platform pivots of this magnitude dictate the next five years of engineering. When you manage large-scale data infrastructure and ML optimization systems, you look for the underlying architectural changes, not just the marketing buzz. &lt;/p&gt;

&lt;p&gt;Here is a deep dive into Muse Spark, the new "Contemplating Mode," and how you can migrate your TypeScript apps to the new proprietary API. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🛑 1. The End of Open Weights
&lt;/h2&gt;

&lt;p&gt;Let's address the elephant in the room. For all practical purposes, Meta has abandoned developing frontier Llama models in favor of the cloud-only Muse Spark. &lt;/p&gt;

&lt;p&gt;Muse Spark was built from scratch by Meta's Superintelligence Labs with entirely new infrastructure and data pipelines. There are no downloadable weights, no self-hosting capabilities, and no clear migration path from your existing local Llama setups. &lt;/p&gt;

&lt;p&gt;If you are building enterprise applications, you now face a choice: stick with older open-source models, migrate to competitors like Mistral or Qwen, or rewrite your vendor-specific APIs to adopt Meta's new proprietary endpoints.&lt;/p&gt;
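
&lt;p&gt;If you want to keep that choice open, one defensive pattern is to hide the vendor behind a thin interface so a later migration touches one file instead of every call site. Here is a minimal sketch; the &lt;code&gt;MuseSparkProvider&lt;/code&gt; wiring mirrors the snippet later in this post and is illustrative, not an official API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { LlamaAPIClient } from 'llama-api-typescript';

// The app codes against this interface, never against a vendor SDK directly.
interface ChatProvider {
  complete(prompt: string): Promise&amp;lt;string&amp;gt;;
}

// Today's implementation targets Meta's proprietary endpoint (illustrative).
class MuseSparkProvider implements ChatProvider {
  constructor(private client = new LlamaAPIClient()) {}

  async complete(prompt: string): Promise&amp;lt;string&amp;gt; {
    const response = await this.client.chat.completions.create({
      model: 'muse-spark-preview',
      messages: [{ role: 'user', content: prompt }],
    });
    return response.choices[0].message.content;
  }
}

// Migrating later (Mistral, Qwen, etc.) means writing one new class,
// not rewriting every feature that calls the model.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;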

&lt;h2&gt;
  
  
  🧠 2. "Contemplating Mode": A Masterclass in ML Optimization
&lt;/h2&gt;

&lt;p&gt;While the loss of open weights hurts, the engineering behind Muse Spark is undeniably impressive. &lt;/p&gt;

&lt;p&gt;In optimizing large-scale ML systems, we constantly battle inference costs and latency. Meta tackled this not just by scaling parameters, but by changing &lt;em&gt;how&lt;/em&gt; the model reasons. Muse Spark introduces a feature called &lt;strong&gt;Contemplating Mode&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Instead of relying on a single, linear chain of thought, Contemplating Mode launches multiple agents that propose solutions, refine them, and aggregate the results in parallel. Furthermore, Meta utilized reinforcement learning to penalize the model for using excessive reasoning tokens—a process they call "thought compression". &lt;/p&gt;

&lt;p&gt;This parallel agent orchestration allows Muse Spark to achieve better performance on complex tasks while incurring latency comparable to much simpler models. &lt;/p&gt;
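
&lt;p&gt;You can approximate the shape of this pattern in your own application code. The toy sketch below is &lt;em&gt;not&lt;/em&gt; Meta's implementation, just the propose-and-aggregate idea expressed with &lt;code&gt;Promise.all&lt;/code&gt;; &lt;code&gt;askModel&lt;/code&gt; stands in for any LLM call:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Toy "contemplating" loop: N agents propose in parallel, one pass aggregates.
async function contemplate(
  askModel: (prompt: string) =&amp;gt; Promise&amp;lt;string&amp;gt;,
  task: string,
  nAgents = 3,
): Promise&amp;lt;string&amp;gt; {
  // 1. Several "agents" draft independent solutions concurrently.
  const proposals = await Promise.all(
    Array.from({ length: nAgents }, (_, i) =&amp;gt;
      askModel(`Agent ${i + 1}: propose one solution.\n\nTask: ${task}`),
    ),
  );

  // 2. A final pass refines and compresses the proposals into one answer.
  return askModel(
    `Merge these proposals into one concise answer:\n\n${proposals.join("\n---\n")}`,
  );
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;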

&lt;h2&gt;
  
  
  🕶️ 3. Meta Glasses &amp;amp; The Voice Mode Integration
&lt;/h2&gt;

&lt;p&gt;The true power of Muse Spark isn't in a browser tab; it is integrated directly into hardware. &lt;/p&gt;

&lt;p&gt;Meta AI, built with Muse Spark, is the core engine powering the voice and multimodal interfaces of the Meta Ray-Ban smart glasses. These glasses are equipped with a 12 MP camera, a six-microphone array, and a Qualcomm Snapdragon AR1 Gen 1 processor. &lt;/p&gt;

&lt;p&gt;Because Muse Spark is natively multimodal (handling text, image, and speech inputs up to 262,000 tokens), it allows the glasses to perform real-time computer vision and voice reasoning. You aren't just dictating text; the AI is actively processing your visual environment and responding contextually through the open-ear speakers. &lt;/p&gt;

&lt;h2&gt;
  
  
  💻 4. The Code: Implementing the New API
&lt;/h2&gt;

&lt;p&gt;If you are ready to make the jump, Meta maintains official client SDKs for the new API, including a dedicated &lt;code&gt;llama-api-typescript&lt;/code&gt; package available on npm. &lt;/p&gt;

&lt;p&gt;Here is a quick look at how you might orchestrate a multi-modal request using the new proprietary TypeScript SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LlamaAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;llama-api-typescript&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Official Meta SDK&lt;/span&gt;

&lt;span class="c1"&gt;// Initialize the client (ensure LLAMA_API_KEY is set in your environment)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LlamaAPIClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;analyzeVisualEnvironment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;base64Image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🚀 Initiating Muse Spark Multimodal Analysis...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;muse-spark-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
      &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; 
          &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are an autonomous visual assistant. Analyze the provided image and outline a step-by-step physical action plan.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; 
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; 
          &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is the fastest way to disassemble the hardware shown in this image?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image_url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`data:image/jpeg;base64,&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;base64Image&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="c1"&gt;// Leveraging the new parallel reasoning architecture&lt;/span&gt;
      &lt;span class="na"&gt;extra_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;enable_contemplating_mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error communicating with Muse Spark API:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: While the API retains the "Llama" naming convention for the SDKs, the backend is routing to the new proprietary architecture.&lt;/em&gt;&lt;/p&gt;
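
&lt;p&gt;For completeness, invoking that helper from a Node.js script might look like this (the image path is just a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { readFileSync } from "node:fs";

// Hypothetical smoke test: encode a local photo and hand it to the helper above.
const base64Image = readFileSync("./device-photo.jpg").toString("base64");

analyzeVisualEnvironment(base64Image).then(console.log).catch(console.error);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;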

&lt;h2&gt;
  
  
  🔮 The Takeaway
&lt;/h2&gt;

&lt;p&gt;The barrier to entry for building AI wrappers just got higher. With models like Muse Spark natively handling complex, multi-agent orchestration, developers need to focus on deep systems integration rather than just prompt engineering.&lt;/p&gt;

&lt;p&gt;We are moving away from the era of hacking together local LLMs and entering a phase where proprietary, cloud-hosted models dictate the hardware ecosystems we wear on our faces.&lt;/p&gt;

&lt;p&gt;Are you planning to migrate your applications to the new Muse Spark API, or are you sticking with the remaining open-source alternatives? &lt;strong&gt;Let me know in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this technical breakdown helpful, drop a ❤️ and bookmark this post! I'll be doing a complete, hands-on teardown of the new SDK and agent orchestration patterns over on the &lt;strong&gt;AI Tooling Academy&lt;/strong&gt; channel soon, so stay tuned.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>meta</category>
    </item>
    <item>
      <title>🕵️‍♂️ Google's "Gemini Omni" Just Leaked: The Secret Multimodal Weapon for Google I/O</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 13 May 2026 03:01:45 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/googles-gemini-omni-just-leaked-the-secret-multimodal-weapon-for-google-io-2bfl</link>
      <guid>https://dev.to/siddhesh_surve/googles-gemini-omni-just-leaked-the-secret-multimodal-weapon-for-google-io-2bfl</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqr02owro7fq6lz6jf5e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqr02owro7fq6lz6jf5e.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been following the AI arms race this year, you know the vibe is currently "Multimodal or Bust." OpenAI has been teasing its massive visual updates, but Google isn't about to let its home turf at &lt;strong&gt;Google I/O&lt;/strong&gt; go uncontested.&lt;/p&gt;

&lt;p&gt;According to a massive new leak reported by &lt;em&gt;TestingCatalog&lt;/em&gt;, Google is internally testing a next-generation model dubbed &lt;strong&gt;"Gemini Omni."&lt;/strong&gt; This isn't just another incremental update to the Gemini 2.0 or 3.0 lines; this is a native, high-fidelity video-to-audio model designed for real-time interaction.&lt;/p&gt;

&lt;p&gt;If you’re a developer building the next generation of "eyes and ears" for AI agents, this leak just changed your roadmap. Here is what we know about Omni, how it competes with Nano Banana 2, and what the code might look like. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🎥 What is "Gemini Omni"?
&lt;/h2&gt;

&lt;p&gt;The "Omni" designation suggests a unified architecture. While earlier models often relied on separate "vision" and "language" encoders that passed tokens back and forth, Omni is rumored to be a &lt;strong&gt;native multimodal&lt;/strong&gt; model. &lt;/p&gt;

&lt;p&gt;This means it doesn't just "describe" a video frame by frame; it understands the temporal flow of video and audio simultaneously. The leaks point toward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero-Latency Video Reasoning:&lt;/strong&gt; Analyzing live camera feeds with under 200ms of lag.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Audio-Visual Sync:&lt;/strong&gt; Generating realistic audio cues based on visual events (and vice versa).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic Video Control:&lt;/strong&gt; The ability for an AI to "watch" a screen and execute mouse/keyboard actions natively.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ⚔️ The Battle for the "Omni" Title
&lt;/h2&gt;

&lt;p&gt;The timing is spicy. Google is clearly positioning this to counter OpenAI's visual capabilities, but they are also competing with their own internal heavy hitters like &lt;strong&gt;Nano Banana 2&lt;/strong&gt; (the current state-of-the-art for image generation). &lt;/p&gt;

&lt;p&gt;While Nano Banana 2 focuses on high-fidelity image composition, Gemini Omni is built for the &lt;strong&gt;stream&lt;/strong&gt;. For those of us building in the Ads or E-commerce space—where real-time product recognition and visual search are the "Holy Grail"—Omni could be the infrastructure that finally makes "Visual Commerce" viable for the masses.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 Speculative Implementation: Real-Time Video Analysis
&lt;/h2&gt;

&lt;p&gt;Based on the current Gemini 2.0 Pro API structures, we can anticipate how Omni will handle live video streams. Instead of uploading a static &lt;code&gt;.mp4&lt;/code&gt;, we'll likely be dealing with &lt;strong&gt;MediaStream&lt;/strong&gt; chunks.&lt;/p&gt;

&lt;p&gt;Here is how you might soon implement a "Visual Support Agent" using the Gemini Omni SDK in TypeScript:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@google/generative-ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 🚀 Speculative: Using the new 'omni-video' model&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini-omni-preview&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;startVisualSupport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;videoStream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MediaStream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🎥 Omni is now 'watching' the support session...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startChat&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Help the customer troubleshoot the hardware setup they are showing on camera.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Streaming frames directly to the model for real-time reasoning&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendMessageStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;video_stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;videoStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;audio_sync&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 New Omni-specific flag for audio-visual alignment&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunkText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// The agent can 'see' the user plugging in the wrong cable in real-time&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
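
&lt;p&gt;Until a native stream input actually ships, you can approximate the idea with today's &lt;code&gt;@google/generative-ai&lt;/code&gt; SDK by sampling frames from the &lt;code&gt;MediaStream&lt;/code&gt; and sending them as inline images. This fallback uses a current model name, not the rumored Omni endpoint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
// Today's workaround: a shipping multimodal model plus sampled frames.
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

// `frames` are base64 JPEG snapshots captured from the MediaStream,
// e.g. drawn to a canvas at roughly 1 fps.
async function describeFrames(frames: string[]) {
  const result = await model.generateContent([
    { text: "These frames are one second apart. What is the user doing?" },
    ...frames.map((data) =&amp;gt; ({
      inlineData: { mimeType: "image/jpeg", data },
    })),
  ]);
  return result.response.text();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;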



&lt;h2&gt;
  
  
  🧠 Why This Matters for Engineering Managers
&lt;/h2&gt;

&lt;p&gt;For an Engineering Manager leading AI initiatives, the arrival of Omni shifts the "Build vs. Buy" calculation for visual AI. &lt;/p&gt;

&lt;p&gt;We are moving away from needing a massive team of CV (Computer Vision) experts to train custom models for object detection. Instead, we can now leverage &lt;strong&gt;foundation video models&lt;/strong&gt; like Omni to handle the heavy lifting, allowing us to focus on the &lt;strong&gt;agentic orchestration&lt;/strong&gt; and the business logic.&lt;/p&gt;

&lt;p&gt;If Omni delivers on the leaked promise of low-latency video reasoning, it will be the final piece of the puzzle for "Workspace Agents" that can actually sit "next" to you, watch your workflow, and offer real-time peer review on your code or designs.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 The Verdict
&lt;/h2&gt;

&lt;p&gt;Google I/O is usually full of "coming soon" promises, but the presence of Omni on the LM Arena and in internal testing suggests a public developer preview is imminent. &lt;/p&gt;

&lt;p&gt;I’ll be doing a deep dive into the specific API limits and throughput benchmarks over on the &lt;strong&gt;AI Tooling Academy&lt;/strong&gt; channel the moment the docs go live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you ready to give your apps a set of eyes, or are the privacy implications of a "live-watching" model still too high for your users?&lt;/strong&gt; Let's discuss in the comments! 👇&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>google</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>🚀 The "Vibe Coding" Era is Over: What AI Founders Are Building Instead</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 05 May 2026 02:56:48 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-vibe-coding-era-is-over-what-ai-founders-are-building-instead-493m</link>
      <guid>https://dev.to/siddhesh_surve/the-vibe-coding-era-is-over-what-ai-founders-are-building-instead-493m</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci46myg7s466wpk8eojd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci46myg7s466wpk8eojd.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been paying attention to the venture capital space, you likely caught Ann Miura-Ko’s latest insights making the rounds on X. The message from top-tier Silicon Valley investors is becoming incredibly clear: the days of hacking together a thin UI over an OpenAI API key and calling it a disruptive startup are coming to a hard stop in 2026.&lt;/p&gt;

&lt;p&gt;Founders are being pushed to build &lt;em&gt;Minimum Viable Companies&lt;/em&gt;, not just Minimum Viable Products. The market is completely saturated with basic AI wrappers. What is actually getting funded and gaining real traction right now? Deep, infrastructural utility.&lt;/p&gt;

&lt;p&gt;Here is exactly how the engineering meta is shifting, and what you should be focusing on if you want to build something that lasts.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. 🛑 Stop Building Wrappers, Start Building Workflows
&lt;/h3&gt;

&lt;p&gt;The first wave of generative AI was all about &lt;em&gt;generation&lt;/em&gt;. The next wave is all about &lt;em&gt;orchestration&lt;/em&gt;. Users don't want another chatbot sitting in a browser tab; they want autonomous systems that remove entire categories of work from their plates. &lt;/p&gt;

&lt;p&gt;If your application just takes user text, sends it to an LLM, and prints the result, you don't have a technical moat. You have a feature that will inevitably be sherlocked by the platform providers themselves. &lt;/p&gt;

&lt;h3&gt;
  
  
  2. 🏗️ The Move to Agentic Infrastructure
&lt;/h3&gt;

&lt;p&gt;Instead of simple request-response cycles, successful products are moving toward agentic infrastructure. This means your code needs to handle state, memory, error recovery, and tool execution in the background.&lt;/p&gt;
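
&lt;p&gt;Error recovery alone is easy to underestimate. As a minimal sketch, here is the kind of bounded-retry wrapper with exponential backoff that most tool-executing agents eventually grow:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Generic retry helper for flaky tool calls (LLM APIs, webhooks, etc.).
async function withRetry&amp;lt;T&amp;gt;(
  fn: () =&amp;gt; Promise&amp;lt;T&amp;gt;,
  maxAttempts = 3,
): Promise&amp;lt;T&amp;gt; {
  let lastError: unknown;
  for (let attempt = 1; attempt &amp;lt;= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Exponential backoff: 500ms, 1s, 2s, ...
      const delay = 500 * 2 ** (attempt - 1);
      await new Promise((resolve) =&amp;gt; setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: const report = await withRetry(() =&amp;gt; analyzeCodeSecurity(diff));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;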

&lt;p&gt;Developing the &lt;code&gt;secure-pr-reviewer&lt;/code&gt; GitHub App and deploying it to production on Railway back in January 2026 demanded exactly this kind of architectural shift. It wasn't enough to just send raw code snippets to an API. The app needed a robust TypeScript and Node.js backend to listen for webhooks, parse the repository's abstract syntax tree, run the AI security audit, and comment back on the exact lines of code inside the pull request.&lt;/p&gt;

&lt;p&gt;Here is a simplified look at how that kind of event-driven, agentic infrastructure is structured in Node.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;probot&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;analyzeCodeSecurity&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../services/ai-auditor&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.opened&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.synchronize&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pullRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Fetch the actual diff to provide context, not just a raw prompt&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pulls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;mediaType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;diff&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Initiating security audit for PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// The AI service handles the deep reasoning and logic assessment&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;securityReport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;analyzeCodeSecurity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vulnerabilitiesFound&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewComment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`### 🛡️ Automated Security Audit\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;markdownSummary&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="c1"&gt;// Agent autonomously injects its findings into the human workflow&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewComment&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the massive value lies: taking a complex, multi-step human workflow (like reviewing a PR for security vulnerabilities) and automating it entirely in the background so the engineering team doesn't even have to think about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. 📉 The Rise of the "Micro-Team"
&lt;/h3&gt;

&lt;p&gt;Because AI is handling so much of the boilerplate scaffolding and testing, we are seeing the rise of hyper-efficient micro-teams. You don't need a massive engineering pod to ship a scalable MVP anymore. You need one or two deeply technical founders who understand systems architecture and can leverage AI to write the functional components.&lt;/p&gt;

&lt;p&gt;But this requires a solid understanding of fundamental computer science. If you let the AI write the code, &lt;em&gt;you&lt;/em&gt; still have to design the system. &lt;/p&gt;

&lt;h3&gt;
  
  
  💡 The Takeaway
&lt;/h3&gt;

&lt;p&gt;The barrier to building software has dropped to zero, which means the baseline expectations for a startup have skyrocketed. As investors point out, the market is looking for true substance and organic product-market fit. &lt;/p&gt;

&lt;p&gt;To win in 2026, stop optimizing your prompts and start optimizing your architectures. Build systems, build workflows, and build real companies. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;What are you building right now? Are you seeing this same shift away from simple AI wrappers in your own circles? Let's discuss in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>startup</category>
      <category>webdev</category>
      <category>typescript</category>
    </item>
    <item>
      <title>🚨 The "Context Window" is Dead: Anthropic Just Gave Claude Agents Permanent Memory</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 28 Apr 2026 02:32:24 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-context-window-is-dead-anthropic-just-gave-claude-agents-permanent-memory-52hd</link>
      <guid>https://dev.to/siddhesh_surve/the-context-window-is-dead-anthropic-just-gave-claude-agents-permanent-memory-52hd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp29h7qnpnruox7s06v8b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp29h7qnpnruox7s06v8b.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been building with AI over the last year, you know the absolute biggest bottleneck in agentic engineering: &lt;strong&gt;The Goldfish Problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You spend hours crafting the perfect system prompt. You deploy your AI agent to handle a complex task. It does a great job. But the second that session ends? &lt;em&gt;Poof.&lt;/em&gt; The agent forgets everything. &lt;/p&gt;

&lt;p&gt;To fix this, developers have been duct-taping together complex Vector DBs, RAG pipelines, and rolling context windows just to give their agents a basic sense of object permanence. It is exhausting, expensive, and fragile. &lt;/p&gt;

&lt;p&gt;But as of this week, the game has completely changed. &lt;strong&gt;Anthropic just launched Memory for Claude Managed Agents in public beta&lt;/strong&gt;, and it fundamentally shifts how we will build autonomous systems. &lt;/p&gt;

&lt;p&gt;Here is everything you need to know about the update, why it's better than standard RAG, and how to implement it in your code today. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 What is Claude Agent Memory?
&lt;/h2&gt;

&lt;p&gt;Unlike standard chatbot interactions where context is lost when the window closes, Anthropic’s new Memory feature allows Claude Managed Agents to accumulate knowledge &lt;em&gt;across different sessions&lt;/em&gt; over time. &lt;/p&gt;

&lt;p&gt;But here is the truly brilliant part: &lt;strong&gt;It is a filesystem-based layer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data isn't just floating in a black-box vector space. Claude stores its memories as actual files. This means your agents can read, write, and reference a continuous state, while you (the developer) maintain absolute programmatic control over what is being stored. Early enterprise adopters like Netflix and Rakuten are already using it to automate complex, long-running workflows without constantly having to update manual prompts.&lt;/p&gt;
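
&lt;p&gt;To build intuition for the filesystem framing, here is a deliberately tiny local sketch of the concept. This is &lt;em&gt;not&lt;/em&gt; Anthropic's storage format, just the idea: memories as plain files you can list, inspect, and delete.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { promises as fs } from "node:fs";
import * as path from "node:path";

const MEMORY_DIR = "./agent-memory"; // hypothetical local store

// Each memory is just a file: trivially easy to audit, diff, or redact.
async function remember(key: string, fact: string) {
  await fs.mkdir(MEMORY_DIR, { recursive: true });
  const record = { fact, savedAt: new Date().toISOString() };
  await fs.writeFile(
    path.join(MEMORY_DIR, `${key}.json`),
    JSON.stringify(record, null, 2),
  );
}

async function recall(key: string): Promise&amp;lt;string | null&amp;gt; {
  try {
    const raw = await fs.readFile(path.join(MEMORY_DIR, `${key}.json`), "utf8");
    return JSON.parse(raw).fact;
  } catch {
    return null; // no memory recorded yet
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;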

&lt;h2&gt;
  
  
  🛡️ The "Audit Trail" Superpower
&lt;/h2&gt;

&lt;p&gt;If you are building tools for enterprise, standard RAG pipelines are a compliance nightmare. If an AI hallucinates or leaks data, figuring out &lt;em&gt;why&lt;/em&gt; it retrieved that specific piece of information is incredibly difficult. &lt;/p&gt;

&lt;p&gt;Anthropic designed this new memory system with enterprise governance built-in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full Auditability:&lt;/strong&gt; Every single memory change is logged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Granular Control:&lt;/strong&gt; You have an audit trail for each session and agent. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rollbacks:&lt;/strong&gt; You can programmatically roll back, redact, or delete specific memories if the agent learns something incorrect or sensitive (a toy sketch of this follows the list).&lt;/li&gt;
&lt;/ul&gt;
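
&lt;p&gt;Extending that toy filesystem sketch, an audit trail is essentially an append-only log of every mutation, which is exactly what makes redaction and rollback tractable (again, an illustration of the concept, not Anthropic's internals):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { appendFile } from "node:fs/promises";

type MemoryEvent = {
  at: string; // ISO timestamp
  agentId: string;
  sessionId: string;
  op: "write" | "redact" | "delete";
  key: string;
};

// Append-only log: every memory mutation is recorded and never rewritten.
async function logMemoryEvent(event: MemoryEvent) {
  await appendFile("./agent-memory/audit.log", JSON.stringify(event) + "\n");
}

// Rollback = replaying the log up to a timestamp and rebuilding state, or
// issuing compensating "redact"/"delete" events for the bad entries.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;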

&lt;h2&gt;
  
  
  💻 Building a "Smart" PR Reviewer in TypeScript
&lt;/h2&gt;

&lt;p&gt;To understand how powerful this is, let's look at a real-world scenario. &lt;/p&gt;

&lt;p&gt;Imagine you are building a production-ready GitHub App—let's call it &lt;code&gt;secure-pr-reviewer&lt;/code&gt;—using TypeScript and Node.js. &lt;/p&gt;

&lt;p&gt;Without memory, your AI reviewer treats every single Pull Request in a vacuum. It might flag the same internal, safe utility function as a "security risk" 100 times, infuriating your senior engineers who have to manually dismiss the warning every time.&lt;/p&gt;

&lt;p&gt;With Claude's new Memory API, the agent &lt;em&gt;learns&lt;/em&gt; from the team. If a senior dev tells the agent, "This auth pattern is expected in the legacy module," the agent remembers it for the next PR. &lt;/p&gt;

&lt;p&gt;Here is what the implementation logic looks like using the new Managed Agents API paradigm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Assume this webhook fires when a new PR is opened&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handlePullRequestEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[secure-pr-reviewer] Auditing PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;...`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 1. Initialize or resume a Managed Agent Session with Memory enabled&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLAUDE_SECURITY_AGENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Your pre-configured agent&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`repo-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Scope memory to this specific repo&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Send the PR diff to the agent&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Audit the following diff for security flaws. 
                  Remember our past conversations about approved legacy patterns.
                  \n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; 
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// The agent uses its filesystem memory to check past developer feedback&lt;/span&gt;
  &lt;span class="c1"&gt;// before generating the final report.&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;VULNERABILITY_FOUND&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;postGitHubComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a developer replies to the bot's comment on GitHub saying, &lt;em&gt;"Ignore this specific file path in the future, it's a mock database for testing,"&lt;/em&gt; you simply pass that message back into the session. Claude writes that rule to its memory layer, and it will &lt;em&gt;never&lt;/em&gt; flag that file again. &lt;/p&gt;

&lt;p&gt;No database schemas to update. No RAG pipeline to re-index. The agent just gets smarter. &lt;/p&gt;
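
&lt;p&gt;In code, "passing that message back" is just another turn on the same session. Continuing the speculative API shape from the snippet above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Speculative, mirroring the session API sketched above: feed the human's
// correction back in so the agent persists it to its memory layer.
async function teachAgent(sessionId: string, feedback: string) {
  await anthropic.beta.agents.messages.create({
    session_id: sessionId,
    messages: [
      {
        role: "user",
        content: `Reviewer feedback, remember this for future audits: ${feedback}`,
      },
    ],
  });
}

// e.g. teachAgent(session.id, "Ignore ./mocks/db.ts; it is a test-only mock.");
// (hypothetical file path)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;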

&lt;h2&gt;
  
  
  🚀 The Era of Stateful AI
&lt;/h2&gt;

&lt;p&gt;We are officially moving from stateless functions to stateful, autonomous teammates. By providing a transparent, auditable, filesystem-based memory layer, Anthropic is removing the biggest friction point for enterprise AI adoption.&lt;/p&gt;

&lt;p&gt;The feature is available in public beta right now via the Claude Console and APIs. &lt;/p&gt;

&lt;p&gt;Are you going to rip out your custom Vector DBs and switch to native Agent Memory? Let me know what you think of the update in the comments below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and bookmark the code snippet for your next agentic side project!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>node</category>
    </item>
    <item>
      <title>🚀 The "Custom GPT" is Dead: OpenAI Just Dropped Workspace Agents (And They Run in the Background)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Fri, 24 Apr 2026 02:17:19 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-custom-gpt-is-dead-openai-just-dropped-workspace-agents-and-they-run-in-the-background-gb1</link>
      <guid>https://dev.to/siddhesh_surve/the-custom-gpt-is-dead-openai-just-dropped-workspace-agents-and-they-run-in-the-background-gb1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wb1uenkilwch0wq1pvd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wb1uenkilwch0wq1pvd.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve spent any time tinkering with AI over the last year, you’ve probably built a Custom GPT. You give it a system prompt, maybe upload a PDF or two, and use it as a highly specific, personalized chatbot. &lt;/p&gt;

&lt;p&gt;But there was always one fatal flaw with this workflow: &lt;strong&gt;Custom GPTs are entirely reactive.&lt;/strong&gt; They only work when you are actively sitting at your keyboard, typing prompts, and waiting for a response. &lt;/p&gt;

&lt;p&gt;That era officially ended today. &lt;/p&gt;

&lt;p&gt;OpenAI just announced &lt;strong&gt;Workspace Agents&lt;/strong&gt; in ChatGPT. Powered by their underlying Codex engine, these are not chatbots. They are autonomous, cloud-hosted agents that run in the background, execute multi-step workflows, and operate across your team's tools even after you close your laptop. &lt;/p&gt;

&lt;p&gt;Here is why this completely changes how we build enterprise automation, and what you need to know to start using it today. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 From Chatbots to Background Daemons
&lt;/h2&gt;

&lt;p&gt;The biggest shift with Workspace Agents is the decoupling of the AI from the traditional chat interface. &lt;/p&gt;

&lt;p&gt;Because these agents run in the cloud, they have continuous memory and persistent execution. You don't have to manually prompt them to start working. You can configure an agent to run on a set schedule (e.g., "Pull Jira metrics every Friday at 4 PM and draft a report"), or deploy them directly into communication tools like Slack. &lt;/p&gt;

&lt;p&gt;For instance, an agent deployed in a Slack workspace can proactively monitor incoming messages, route product feedback, answer documentation questions, and autonomously file IT tickets while your engineering team focuses on deep work. &lt;/p&gt;

&lt;h2&gt;
  
  
  💻 The Code: Automating the Automators
&lt;/h2&gt;

&lt;p&gt;To understand how massive this is for developers, think about how we traditionally build workflow automation. &lt;/p&gt;

&lt;p&gt;When I was architecting the &lt;code&gt;secure-pr-reviewer&lt;/code&gt; GitHub App, the infrastructure overhead required just to get an AI to act autonomously was significant. To automatically review code, you have to spin up a Node.js server, use a framework like Probot to listen for webhooks, manually orchestrate the API calls to the LLM, and handle the asynchronous callbacks. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Traditional Automation Stack (TypeScript):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;probot&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;runSecurityAudit&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./ai-service&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Listen for specific platform events&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.opened&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Extract the context manually&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pulls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Orchestrate the LLM call and wait for completion&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;securityReport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runSecurityAudit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 4. Push the formatted result back to the platform&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;comment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`🛡️ Security Audit Complete: \n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;comment&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Workspace Agents, this entire middleware layer evaporates. &lt;/p&gt;

&lt;p&gt;Instead of writing and hosting webhook listeners, you create a shared agent, grant it access to your integrations, and define the workflow in plain English: &lt;em&gt;"Monitor new PRs in this repository. When opened, read the diff, check against our security guidelines, and post a comment with your findings."&lt;/em&gt; The Codex-powered agent handles the event listening, the context window management, and the API execution natively in the cloud.&lt;/p&gt;
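&lt;p&gt;If OpenAI ever exposes this programmatically, the entire Probot app above could plausibly collapse into a single declarative object. A minimal sketch, assuming purely hypothetical field names (this is not a real SDK):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// ⚠️ Hypothetical sketch: not a real Workspace Agents SDK. It only
// illustrates how the imperative Probot app above becomes declarative.

const prReviewAgent = {
  name: "secure-pr-reviewer",
  trigger: "github:pull_request.opened", // replaces the webhook listener
  integrations: ["github"],
  instructions:
    "Monitor new PRs in this repository. When opened, read the diff, " +
    "check against our security guidelines, and post a comment with your findings.",
};

// The platform, not your Node.js server, handles events, context, and posting.
console.log(JSON.stringify(prReviewAgent, null, 2));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
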

&lt;h2&gt;
  
  
  🛑 The "Human-in-the-Loop" Safeguards
&lt;/h2&gt;

&lt;p&gt;Of course, giving an autonomous agent unrestricted access to your CRM, codebase, or email inbox is terrifying for any enterprise. &lt;/p&gt;

&lt;p&gt;OpenAI clearly anticipated this security anxiety. Workspace Agents come with strict, granular governance. For sensitive actions—like executing a database script, sending an outbound email to a client, or modifying a financial spreadsheet—the agent will automatically pause its execution and ping you for permission. &lt;/p&gt;
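
&lt;p&gt;You can already approximate this approval-gate pattern in your own middleware today. Here is a minimal, vendor-neutral sketch (no Workspace Agents API involved) of pausing a sensitive action until a human signs off:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// A minimal, vendor-neutral approval gate: prepare a sensitive action,
// then block until a human approves or rejects it.

type Approval = "approved" | "rejected";

interface PendingAction {
  description: string;
  execute: () =&amp;gt; Promise&amp;lt;void&amp;gt;;
}

// Stand-in for a real notification channel (Slack ping, email, inbox item).
async function requestHumanApproval(action: PendingAction): Promise&amp;lt;Approval&amp;gt; {
  console.log(`🛑 Awaiting approval: ${action.description}`);
  // A real system would wait on a webhook or a database flag; we auto-approve
  // here so the sketch stays runnable.
  return "approved";
}

async function runWithApproval(action: PendingAction): Promise&amp;lt;void&amp;gt; {
  const verdict = await requestHumanApproval(action);
  if (verdict === "approved") {
    await action.execute();
    console.log("✅ Action executed.");
  } else {
    console.log("❌ Action discarded.");
  }
}

runWithApproval({
  description: "Send drafted outbound email to client@example.com",
  execute: async () =&amp;gt; {
    /* call the real email API here */
  },
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
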

&lt;p&gt;It does 99% of the heavy lifting, formats the data, and then essentially asks: &lt;em&gt;"Does this look right before I hit send?"&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  💸 Availability &amp;amp; The Road Ahead
&lt;/h2&gt;

&lt;p&gt;Right now, Workspace Agents are rolling out in research preview for ChatGPT Business, Enterprise, Edu, and Teachers plans. &lt;/p&gt;

&lt;p&gt;Here is the kicker: &lt;strong&gt;They are completely free to use until May 6, 2026.&lt;/strong&gt; After that date, OpenAI is shifting them to a credit-based pricing model, a logical move given that running persistent background daemons requires significantly more compute than standard, isolated chat completions. &lt;/p&gt;

&lt;p&gt;We are rapidly moving away from "AI as an autocomplete tool" and entirely into the era of "AI as an asynchronous teammate." &lt;/p&gt;

&lt;p&gt;I will be doing a complete, hands-on teardown of how to build and deploy these specific agents over on the &lt;em&gt;AI Tooling Academy&lt;/em&gt; channel soon, so stay tuned. &lt;/p&gt;

&lt;p&gt;Are you ready to let a cloud-hosted agent manage your Slack channel and codebase, or are the security risks still too high? &lt;strong&gt;Let me know your thoughts in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, smash the ❤️ button and bookmark this post so you remember the May 6th pricing deadline!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>javascript</category>
    </item>
    <item>
      <title>🚀 Qwen 3.6 Max Preview is Here: Why Your AI Coding Agents Are About to Get a Massive Upgrade</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 22 Apr 2026 02:20:11 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/qwen-36-max-preview-is-here-why-your-ai-coding-agents-are-about-to-get-a-massive-upgrade-3i3o</link>
      <guid>https://dev.to/siddhesh_surve/qwen-36-max-preview-is-here-why-your-ai-coding-agents-are-about-to-get-a-massive-upgrade-3i3o</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvs5clvnvjrhdcih9pm26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvs5clvnvjrhdcih9pm26.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've been building AI-driven workflows lately, you know the struggle. You set up a sophisticated agent to review a pull request or refactor a legacy module, and halfway through the task, it "forgets" its own logic and hallucinates a broken solution. &lt;/p&gt;

&lt;p&gt;Just weeks after dropping the impressive 3.6-Plus model, the team at Alibaba Cloud has quietly unleashed an early look at their true heavyweight: &lt;strong&gt;Qwen 3.6-Max-Preview&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;For those of us building autonomous coding agents and complex backend systems, this isn't just an incremental update. This model is specifically engineered to fix the memory and logic bottlenecks in autonomous development. &lt;/p&gt;

&lt;p&gt;Here is exactly why this release is a massive deal for the developer ecosystem—and how you can integrate its best new feature into your TypeScript apps today. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 The Benchmarks: Dominating Agentic Coding
&lt;/h2&gt;

&lt;p&gt;Most models can write a Python script to reverse a string. Very few models can clone a massive repository, navigate the terminal, read the documentation, and successfully patch a bug without human intervention. &lt;/p&gt;

&lt;p&gt;Qwen 3.6-Max-Preview was built for the latter. According to the release notes, it has taken the &lt;strong&gt;absolute top score on six major coding benchmarks&lt;/strong&gt;, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SWE-bench Pro&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terminal-Bench 2.0&lt;/strong&gt; (+3.8 over the already excellent 3.6-Plus)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SkillsBench&lt;/strong&gt; (A massive +9.9 jump)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NL2Repo&lt;/strong&gt; (+5.0)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What this translates to in the real world is an AI that has a vastly superior grasp of &lt;em&gt;world knowledge&lt;/em&gt; and &lt;em&gt;instruction following&lt;/em&gt;. It doesn't just guess what your codebase does; it logically traces the execution paths.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 The Secret Weapon: &lt;code&gt;preserve_thinking&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;When I'm building automated tools (like a Probot app for CI/CD or a webhook-driven PR reviewer), the biggest issue with LLMs is "context amnesia" during multi-step reasoning. &lt;/p&gt;

&lt;p&gt;Qwen 3.6-Max-Preview supports an incredibly powerful API parameter: &lt;code&gt;preserve_thinking&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;When you enable this, the model retains the internal "thinking" content from &lt;em&gt;all preceding turns&lt;/em&gt; in a conversation. It doesn't just remember what it said; it remembers &lt;em&gt;how it arrived at that conclusion&lt;/em&gt;. For agentic tasks where the AI needs to iteratively debug a problem, this feature is the difference between an endless hallucination loop and a merged pull request.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 How to Use It in TypeScript
&lt;/h2&gt;

&lt;p&gt;Because Alibaba's Model Studio provides a fully OpenAI-compatible endpoint, migrating your existing Node.js/TypeScript agents to Qwen 3.6-Max-Preview is as simple as changing the Base URL and passing the custom parameters.&lt;/p&gt;

&lt;p&gt;Here is a quick example of how you can wire up an autonomous agent that utilizes persistent reasoning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// 1. Point your client to the DashScope compatible endpoint&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DASHSCOPE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[https://dashscope-intl.aliyuncs.com/compatible-mode/v1](https://dashscope-intl.aliyuncs.com/compatible-mode/v1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runAutonomousAudit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;codeDiff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🚀 Booting up Qwen 3.6-Max Agent...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen3.6-max-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are an elite senior engineer performing a complex code audit.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; 
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Analyze this diff and propose architectural improvements:\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;codeDiff&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; 
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;// 2. Inject the Qwen-specific agentic parameters&lt;/span&gt;
    &lt;span class="c1"&gt;// @ts-ignore&lt;/span&gt;
    &lt;span class="na"&gt;extra_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;enable_thinking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;preserve_thinking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 The holy grail for multi-step reasoning&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Process the stream to separate the "Thinking" from the final "Answer"&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;thinking&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;reasoning_content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;thinking&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Print the model's internal logic in gray&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`\x1b[90m&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;thinking&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\x1b[0m`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Print the final output normally&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔮 What’s Next?
&lt;/h2&gt;

&lt;p&gt;It's important to note that this is still a &lt;em&gt;preview&lt;/em&gt; release. The model is under active development, and the Qwen team explicitly noted they are iterating to squeeze even more performance out of it before the official GA launch. &lt;/p&gt;

&lt;p&gt;But if this is just the preview, the ceiling for open-weight and proprietary agentic models in 2026 is looking incredibly high. If you want to start building reliable, autonomous teammates instead of just simple autocomplete scripts, Qwen 3.6-Max is demanding a spot in your tech stack.&lt;/p&gt;

&lt;p&gt;You can test it interactively right now on Qwen Studio, or plug it directly into your apps via the API. &lt;/p&gt;

&lt;p&gt;Are you making the shift toward autonomous coding agents this year? Let me know what you are building in the comments below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and bookmark the code snippet for your next weekend project! I'll be breaking down more of these enterprise AI tools over on the AI Tooling Academy channel soon.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>productivity</category>
    </item>
    <item>
      <title>🚀 Anthropic Just Dropped "Claude Design" (And It Changes Frontend Development Forever)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 21 Apr 2026 03:25:34 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/anthropic-just-dropped-claude-design-and-it-changes-frontend-development-forever-2kl2</link>
      <guid>https://dev.to/siddhesh_surve/anthropic-just-dropped-claude-design-and-it-changes-frontend-development-forever-2kl2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dayjm1jjbcndom56ws3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dayjm1jjbcndom56ws3.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s be real for a second. If you build software, you know the absolute most painful part of the development lifecycle isn't writing the business logic—it's the "mockup phase." &lt;/p&gt;

&lt;p&gt;You have a great idea. You sketch it out. You wait for a designer to build it in Figma. You get a static JPEG. You realize the interactivity doesn't make sense. You send it back. Weeks pass before you even write your first &lt;code&gt;npm install&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We are deep into 2026, and the speed of AI tooling is finally fixing this broken pipeline. Today, Anthropic Labs just released &lt;strong&gt;Claude Design&lt;/strong&gt;, powered by their brand-new &lt;strong&gt;Opus 4.7&lt;/strong&gt; vision model. &lt;/p&gt;

&lt;p&gt;I review a lot of workflow automation tools over on the &lt;em&gt;AI Tooling Academy&lt;/em&gt; channel, but this one is genuinely a step-function improvement for engineering teams. It effectively bridges the massive gap between a product manager's rough idea and a developer's local codebase. &lt;/p&gt;

&lt;p&gt;Here is why Claude Design is about to become a mandatory tool in your stack, and how it directly integrates with your coding environment. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 What is Claude Design?
&lt;/h2&gt;

&lt;p&gt;At its core, Claude Design is an interactive, multi-modal canvas. You don't just prompt it for an image; you collaborate with it to create polished, fully interactive UI prototypes, slide decks, and marketing assets. &lt;/p&gt;

&lt;p&gt;But it’s not just a generic UI generator. Anthropic built this explicitly to integrate into real-world enterprise engineering workflows. &lt;/p&gt;

&lt;h3&gt;
  
  
  🎨 1. It Auto-Ingests Your Codebase's Design System
&lt;/h3&gt;

&lt;p&gt;The biggest problem with AI-generated UI is that it always looks like... well, AI-generated UI. It never matches your company's actual brand. &lt;/p&gt;

&lt;p&gt;During onboarding, Claude Design actually reads your existing codebase and design files. It extracts your typography, CSS variables, color hexes, and React components to build a custom design system. Every prototype it generates from that point forward automatically uses your company's exact styling. &lt;/p&gt;
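
&lt;p&gt;Anthropic hasn't documented the extracted bundle format, but conceptually the output is a design-token map. The shape below is my own guess, purely for illustration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical shape of an auto-extracted design system. The real bundle
// format is undocumented; this only illustrates the kind of data involved.

const extractedDesignSystem = {
  typography: {
    fontFamily: "'Inter', sans-serif",
    baseSize: "16px",
    headingWeight: 600,
  },
  colors: {
    primary: "#2563eb", // pulled from your CSS variables
    surface: "#f8fafc",
    danger: "#dc2626",
  },
  components: ["Button", "Card", "Modal"], // React components it recognized
} as const;

console.log(`Brand primary: ${extractedDesignSystem.colors.primary}`);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
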

&lt;h3&gt;
  
  
  ⚙️ 2. "Handoff to Claude Code" (The Killer Feature)
&lt;/h3&gt;

&lt;p&gt;This is the feature that made my jaw drop. &lt;/p&gt;

&lt;p&gt;Traditionally, translating a design into code means staring at a screen, measuring pixel padding, and writing tedious CSS. Claude Design introduces a &lt;strong&gt;Single-Instruction Handoff&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Once your team is happy with the interactive prototype in the canvas, Claude packages the entire project into a "handoff bundle." You can then instantly pass this bundle to &lt;strong&gt;Claude Code&lt;/strong&gt; (Anthropic's CLI agent) to implement the actual logic in your local repository.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine this workflow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your PM creates a functional wireframe in Claude Design.&lt;/li&gt;
&lt;li&gt;They export the bundle.&lt;/li&gt;
&lt;li&gt;You open your terminal and run a command to let your local AI agent scaffold the exact React components:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Speculative workflow based on the new Claude Code Handoff integration&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;claude &lt;span class="nt"&gt;--task&lt;/span&gt; &lt;span class="s2"&gt;"Implement the new billing dashboard using the handoff bundle"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
         &lt;span class="nt"&gt;--bundle&lt;/span&gt; ./claude-design-billing-bundle.zip

&lt;span class="o"&gt;[&lt;/span&gt;Claude Code]: Reading handoff intent...
&lt;span class="o"&gt;[&lt;/span&gt;Claude Code]: Extracting Opus 4.7 design specifications...
&lt;span class="o"&gt;[&lt;/span&gt;Claude Code]: Generating src/components/BillingDashboard.tsx...
&lt;span class="o"&gt;[&lt;/span&gt;Claude Code]: Applying &lt;span class="nb"&gt;local &lt;/span&gt;Tailwind configuration...
✅ Done! Your interactive prototype is now live &lt;span class="k"&gt;in &lt;/span&gt;your codebase.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎛️ 3. Dynamic Sliders and Inline Editing
&lt;/h3&gt;

&lt;p&gt;Instead of writing endless follow-up prompts like &lt;em&gt;"make the gap between the cards a little wider"&lt;/em&gt;, Claude generates custom UI sliders and knobs on the fly. You can literally drag a slider to adjust layout density, color saturation, or typography scaling in real-time, and Claude handles the underlying CSS updates instantly. &lt;/p&gt;

&lt;h3&gt;
  
  
  🌐 4. The Web Capture Tool
&lt;/h3&gt;

&lt;p&gt;If you want to build a new feature on top of your existing production site, you don't need to rebuild the layout from scratch. You can use Claude Design's web capture tool to grab elements directly from your live website, pulling them into the canvas so your new prototype sits perfectly within your real product's UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤝 The Canva Integration
&lt;/h2&gt;

&lt;p&gt;For the founders and full-stack devs who also have to play marketer, Anthropic announced a massive integration with Canva. &lt;/p&gt;

&lt;p&gt;If you use Claude Design to generate a pitch deck, a one-pager, or social media assets, you can export them directly into Canva with a single click. The assets remain fully editable and collaborative, meaning you can do the heavy conceptual lifting with Claude's reasoning, and the final polish in a tool your marketing team already knows how to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 The End of the Static Mockup
&lt;/h2&gt;

&lt;p&gt;The days of handing a static image to a developer and saying "make it work" are officially over. &lt;/p&gt;

&lt;p&gt;With models like Opus 4.7 driving the vision and reasoning, we are moving to a world where prototypes are inherently interactive, code-aware, and tied directly to your CLI agents. Tools like this allow us to stop acting as human CSS translators and get back to focusing on high-level architecture and complex systems logic.&lt;/p&gt;

&lt;p&gt;Claude Design is rolling out today in research preview for Claude Pro, Max, Team, and Enterprise subscribers. &lt;/p&gt;

&lt;p&gt;Are you going to test out the Claude Code handoff pipeline? How do you think this impacts the traditional UX/UI design role? &lt;strong&gt;Let me know your thoughts in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and a 🦄! Bookmark this post to keep the workflow handy for your next sprint.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>frontend</category>
      <category>productivity</category>
    </item>
    <item>
      <title>🔥 Google Just Leaked Its "Desktop Agent" (And It Changes How We Build Software)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 15 Apr 2026 02:50:32 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/google-just-leaked-its-desktop-agent-and-it-changes-how-we-build-software-3ga</link>
      <guid>https://dev.to/siddhesh_surve/google-just-leaked-its-desktop-agent-and-it-changes-how-we-build-software-3ga</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5xrienblwmj4usuwjxt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5xrienblwmj4usuwjxt.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the last two years, the tech industry has been stuck in a loop. We open a browser tab, paste a block of code into a chatbot, copy the fixed code, and paste it back into our IDE. It's incredibly helpful, but let's be honest: &lt;strong&gt;it is still highly manual.&lt;/strong&gt; The era of the "reactive chatbot" is officially dying. We are entering the era of the &lt;strong&gt;autonomous workspace&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;According to massive new leaks reported by &lt;em&gt;TestingCatalog&lt;/em&gt;, Google is quietly testing a brand-new &lt;strong&gt;"Agent" tab inside Gemini Enterprise&lt;/strong&gt;, and it looks like a direct, aggressive strike against Anthropic's Claude Cowork and OpenAI's upcoming Codex Superapp. &lt;/p&gt;

&lt;p&gt;If you lead an engineering team or build automated workflows, this is the paradigm shift you need to prepare for before Google I/O. Here is a breakdown of the leak, the new features, and what it means for your daily dev routine. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 The Shift: From Chat to "Task Execution Workspace"
&lt;/h2&gt;

&lt;p&gt;The leak reveals that Gemini is moving away from a simple text input box. The new Agent area features an "Inbox" and a "New Task" UI that fundamentally restructures how the AI operates. &lt;/p&gt;

&lt;p&gt;When you configure a new agentic task, the right-hand panel gives you granular control over:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Goal:&lt;/strong&gt; The overarching objective (e.g., "Audit all incoming pull requests for security flaws").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agents:&lt;/strong&gt; Which specific sub-models or personas to deploy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connected Apps:&lt;/strong&gt; Direct integrations into your enterprise stack (GitHub, Jira, Google Workspace).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Files:&lt;/strong&gt; Contextual data access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Require Human Review:&lt;/strong&gt; The absolute killer feature (more on this below).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't an assistant you chat with. This is a background daemon that executes multi-step workflows. &lt;/p&gt;

&lt;h2&gt;
  
  
  💻 The Code: How Agents Replace Middleware
&lt;/h2&gt;

&lt;p&gt;To understand why this is a massive deal, let's look at how we currently build automation. &lt;/p&gt;

&lt;p&gt;Let's say you built a GitHub App using TypeScript and Probot (something like &lt;code&gt;secure-pr-reviewer&lt;/code&gt;) to automatically scan incoming PRs. Currently, your Node.js server has to manually catch the webhook, parse the diff, send it to an LLM, wait for a response, and post the comment back to GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "Old" Way (Manual Orchestration):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;probot&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;analyzeDiff&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./llm-service&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.opened&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Fetch the code diff manually&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prDiff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pulls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Wait for the LLM to process it&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;securityReport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;analyzeDiff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prDiff&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Post the comment back to the repo&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;issueComment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`🛡️ Security Audit: \n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issueComment&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Google Agent Way:&lt;/strong&gt;&lt;br&gt;
With the new Gemini Desktop Agent infrastructure, you wouldn't write this middleware at all. &lt;/p&gt;

&lt;p&gt;You would simply connect the Gemini Agent to your GitHub repository via "Connected Apps," set the &lt;strong&gt;Goal&lt;/strong&gt; to "Monitor new PRs and post a security audit," and let the autonomous agent handle the webhook listening, parsing, and posting entirely in the background. It reduces thousands of lines of boilerplate infrastructure into a single visual workflow.&lt;/p&gt;
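
&lt;p&gt;To make that concrete, here is a speculative sketch of that task definition, using the exact fields visible in the leaked UI. The object shape is my own guess, not a real Gemini API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// ⚠️ Speculative sketch: mirrors the fields from the leaked "New Task" UI
// (Goal, Agents, Connected Apps, Files, Require Human Review). The shape
// is guessed; Google has published no such API.

const geminiAgentTask = {
  goal: "Monitor new PRs and post a security audit",
  agents: ["code-reviewer"],         // sub-models/personas to deploy
  connectedApps: ["github"],         // replaces the webhook middleware above
  files: ["SECURITY_GUIDELINES.md"], // contextual data access
  requireHumanReview: true,          // halt before any write action
};

console.log(`Task configured: ${geminiAgentTask.goal}`);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
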

&lt;h2&gt;
  
  
  🛑 The "Require Human Review" Toggle
&lt;/h2&gt;

&lt;p&gt;When you are managing a team of engineers working on high-stakes, big data infrastructure, you cannot simply let an AI merge code or execute database migrations autonomously. Hallucinations happen.&lt;/p&gt;

&lt;p&gt;This is why the &lt;strong&gt;"Require Human Review"&lt;/strong&gt; toggle spotted in the leak is the most critical feature for enterprise adoption. &lt;/p&gt;

&lt;p&gt;It proves Google is building for serious engineering environments. The agent can do 99% of the heavy lifting—running the tests, drafting the code, preparing the deployment—but it halts at the final execution step, pinging your "Inbox" for a manager or tech lead to click "Approve." &lt;/p&gt;

&lt;h2&gt;
  
  
  🖥️ The Desktop App Invasion
&lt;/h2&gt;

&lt;p&gt;The leak strongly points toward Google rolling this out as a native &lt;strong&gt;Desktop App&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Why a desktop app? Because web browsers are sandboxed. If an AI agent is going to truly assist you, it needs native file system access, terminal control, and the ability to run local scripts. By bringing Gemini natively to the desktop, Google is preparing to fight OpenAI and Anthropic for the ultimate prize: owning your entire local development environment. &lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 What's Next?
&lt;/h2&gt;

&lt;p&gt;With Google I/O just around the corner, the timing of this leak is no coincidence. The big tech giants are no longer competing on who has the smartest conversational model; they are competing on who can build the most reliable, autonomous robotic employee. &lt;/p&gt;

&lt;p&gt;Will this replace your IDE, or just sit alongside it? We'll find out soon. I'll be doing a complete, hands-on deep dive into setting up these exact automated workflows over on the &lt;em&gt;AI Tooling Academy&lt;/em&gt; channel the second this drops, so stay tuned.&lt;/p&gt;

&lt;p&gt;Are you ready to let an autonomous Google agent take over your background tasks, or are you keeping your automated scripts tightly controlled in-house? &lt;strong&gt;Let me know in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and a 🦄! Bookmark this post to keep the Probot reference handy for your next side project.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>google</category>
      <category>productivity</category>
    </item>
    <item>
      <title>❄️ OpenAI’s Secret “Codex Superapp” Just Leaked: The End of Standalone ChatGPT?</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 14 Apr 2026 02:19:29 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/openais-secret-codex-superapp-just-leaked-the-end-of-standalone-chatgpt-44ma</link>
      <guid>https://dev.to/siddhesh_surve/openais-secret-codex-superapp-just-leaked-the-end-of-standalone-chatgpt-44ma</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jykojak9j72xlr8ssa5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jykojak9j72xlr8ssa5.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are a developer, your current workflow probably looks a bit like this: You have a tab open for ChatGPT, a dedicated AI code editor, a browser window for documentation, and a terminal for executing scripts. Context switching isn't just killing your productivity; it’s fragmenting your AI’s "memory."&lt;/p&gt;

&lt;p&gt;But according to new leaks discovered in the latest Codex client, OpenAI is preparing to nuke this fragmented workflow entirely. &lt;/p&gt;

&lt;p&gt;They are quietly building a &lt;strong&gt;unified "Codex Superapp"&lt;/strong&gt; designed to swallow ChatGPT, the Atlas browser, and your coding tools into a single, omnipotent desktop platform. And more importantly, they are introducing features that turn the AI from a simple chatbot into an autonomous, background-running teammate.&lt;/p&gt;

&lt;p&gt;Here is a breakdown of the massive leaks, the highly anticipated "Scratchpad" feature, and why this fundamentally shifts how we will build software. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  📝 1. The "Scratchpad": True Parallel Execution
&lt;/h2&gt;

&lt;p&gt;Until now, conversing with an AI has been strictly linear. You ask a question, you wait for the stream to finish, you ask the next question.&lt;/p&gt;

&lt;p&gt;The leak reveals a new experimental UI called &lt;strong&gt;Scratchpad&lt;/strong&gt;. Instead of a single chat thread, Scratchpad functions like an interactive TODO list where you can spin up &lt;em&gt;multiple Codex tasks simultaneously&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Think about the implications here. Instead of sequentially prompting your AI to scaffold a project, you can drop a master prompt into the Scratchpad, which then spawns parallel agentic threads. One thread writes the database schema, another drafts the API routes, and a third writes the unit tests—all executing at the exact same time.&lt;/p&gt;
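
&lt;p&gt;To get a feel for what Scratchpad parallelizes, here is the same idea expressed with today's primitives: independent generation calls fanned out with Promise.all. The &lt;code&gt;generate&lt;/code&gt; function is a placeholder for whatever LLM client you already use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// The Scratchpad concept with today's primitives: fan out independent
// scaffolding tasks concurrently instead of prompting serially.

// Placeholder for a real chat-completion call from any LLM client.
async function generate(task: string): Promise&amp;lt;string&amp;gt; {
  return `// result for: ${task}`;
}

async function scaffoldProject(): Promise&amp;lt;void&amp;gt; {
  const [schema, routes, tests] = await Promise.all([
    generate("Write the database schema"),
    generate("Draft the API routes"),
    generate("Write the unit tests"),
  ]);
  console.log(schema, routes, tests);
}

scaffoldProject();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
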

&lt;h2&gt;
  
  
  🫀 2. The "Heartbeat" System &amp;amp; Managed Agents
&lt;/h2&gt;

&lt;p&gt;This is where things get wild. Code references within the Codex client reveal a new &lt;strong&gt;"Heartbeat"&lt;/strong&gt; infrastructure. &lt;/p&gt;

&lt;p&gt;In distributed systems, a heartbeat is used to maintain persistent connections with long-running, autonomous tasks. OpenAI is building native support for &lt;strong&gt;Managed Agents&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Instead of waiting for you to hit "Enter," these background agents can operate autonomously, execute multi-step workflows, and periodically "check in" (the heartbeat) to report progress or ask for human intervention. &lt;/p&gt;

&lt;p&gt;To put this in perspective, imagine you are building a tool like a &lt;code&gt;secure-pr-reviewer&lt;/code&gt; GitHub App in TypeScript. Currently, your Node.js backend has to manually orchestrate sequential API calls to analyze diffs. In a Managed Agent future, your code simply delegates the entire job to a background autonomous process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 🚀 Speculative API: Delegating to a Managed Agent Background Process&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;CodexAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@openai/codex-sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handlePullRequestEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;WebhookEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opened&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[secure-pr-reviewer] Delegating PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; to Codex Superapp...`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Instead of waiting for a synchronous chat completion, &lt;/span&gt;
  &lt;span class="c1"&gt;// we spin up a background agent with a 'heartbeat' connection&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;auditTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;CodexAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createManagedTask&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`PR_Security_Audit_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;diff_url&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      1. Analyze the PR diff for security vulnerabilities (e.g., SQLi, XSS).
      2. If vulnerabilities are found, write a patch.
      3. Commit the patch to a new branch and draft a review comment.
    `&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parallel_execution&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 Utilizing the new Scratchpad logic&lt;/span&gt;
    &lt;span class="na"&gt;onHeartbeat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// The agent checks in autonomously without us polling&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Agent Status: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current_action&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;percent_complete&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;%`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;onComplete&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`✅ Audit complete. Found &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues_found&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; issues.`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Audit delegated successfully.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With OpenClaw's founder recently joining OpenAI, and competitors like Anthropic developing their own desktop agent system (codenamed "Conway"), the race for true autonomous orchestration is escalating rapidly.&lt;/p&gt;

&lt;h2&gt;
  
  
  ❄️ 3. Project "Glacier" (GPT-5.5?)
&lt;/h2&gt;

&lt;p&gt;If an entirely new, unified desktop OS for AI wasn't enough, there is an intense rumor brewing alongside this leak. &lt;/p&gt;

&lt;p&gt;Over the past few days, top OpenAI researchers have been cryptically posting snowflake emojis (❄️) across social media. Insiders speculate this is the codename for &lt;strong&gt;Glacier&lt;/strong&gt;, widely believed to be the GPT-5.5 frontier model. &lt;/p&gt;

&lt;p&gt;OpenAI has a history of coupling massive platform upgrades with new model releases to maximize the shockwave. Releasing a unified desktop Superapp powered by a model capable of orchestrating complex, parallel background tasks would be an absolute paradigm shift. &lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 The Takeaway
&lt;/h2&gt;

&lt;p&gt;We are rapidly moving from an era of "prompt engineering" to "agent orchestration." The developers who win the next decade won't be the ones writing boilerplate code; they will be the ones acting as tech leads for fleets of managed AI agents. &lt;/p&gt;

&lt;p&gt;Given OpenAI's tendency for surprise drops, we could see the Codex Superapp launch in a matter of days. &lt;/p&gt;

&lt;p&gt;Are you ready to give an AI persistent background access to your machine, or are we giving away too much control too fast? &lt;strong&gt;Drop your thoughts in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and bookmark this post! For more deep dives into building automated agentic workflows, make sure to check out my latest videos over at &lt;strong&gt;AI Tooling Academy&lt;/strong&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>🚀 OpenAI's Secret "Image V2" Just Leaked on LM Arena: The End of Mangled AI Text?</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 08 Apr 2026 02:43:07 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/openais-secret-image-v2-just-leaked-on-lm-arena-the-end-of-mangled-ai-text-270f</link>
      <guid>https://dev.to/siddhesh_surve/openais-secret-image-v2-just-leaked-on-lm-arena-the-end-of-mangled-ai-text-270f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtas884emtbvb4jwzzcm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtas884emtbvb4jwzzcm.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've been using ChatGPT over the weekend and suddenly found yourself being asked to choose between two surprisingly high-quality image generations, congratulations—you might be an unwitting beta tester for OpenAI’s next major release.&lt;/p&gt;

&lt;p&gt;According to a new report from TestingCatalog, OpenAI is quietly running a massive stealth test for its next-generation image generation model, internally dubbed &lt;strong&gt;"Image V2."&lt;/strong&gt; If you build apps, design UIs, or generate commercial assets, this is a massive deal. Here is everything we know about the leaked model, the "code red" pressure from Google, and why this update might finally fix AI's biggest, most annoying flaw. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🕵️‍♂️ The Arena Leak: What is "Image V2"?
&lt;/h2&gt;

&lt;p&gt;Over the past few days, eagle-eyed users on the LM Arena (the premier blind-testing leaderboard for AI models) noticed three mysterious new image generation variants pop up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;packingtape-alpha&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;maskingtape-alpha&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gaffertape-alpha&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end of the weekend, the models were pulled from the Arena, but they are still heavily circulating inside ChatGPT under a strict A/B testing framework. &lt;/p&gt;

&lt;p&gt;This is classic OpenAI. They used this exact same blind-testing playbook back in December 2025 with the "Chestnut" and "Hazelnut" models, which ended up shipping just weeks later as GPT Image 1.5. &lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 The Holy Grail: AI That Can Actually Spell
&lt;/h2&gt;

&lt;p&gt;So, why should developers and designers care? Because early impressions indicate that Image V2 has finally conquered the final boss of AI image generation: &lt;strong&gt;Realistic UI rendering and correctly spelled text.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Historically, asking an AI to generate a UI mockup or a marketing banner resulted in beautiful designs covered in alien hieroglyphics. Image V2 is reportedly delivering pixel-perfect button text, accurate typography, and an incredibly strong compositional understanding. &lt;/p&gt;

&lt;p&gt;If you are a frontend developer, this means you can soon prompt ChatGPT to generate a complete, text-accurate landing page mockup, slice it up, and start coding—without having to mentally translate mangled letters.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚨 The "Code Red" Counter-Attack
&lt;/h2&gt;

&lt;p&gt;It's no secret that OpenAI has been feeling the heat. According to the report, OpenAI has been operating under a CEO-mandated "code red" since late 2025. &lt;/p&gt;

&lt;p&gt;Why? Because Google's &lt;strong&gt;Nano Banana Pro&lt;/strong&gt; and Gemini 3 models have been absolutely eating their lunch, dominating the top spots on the LM Arena leaderboard for months. Image V2 is OpenAI’s direct, aggressive answer to Google's visual dominance.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 What the API Might Look Like
&lt;/h2&gt;

&lt;p&gt;While pricing and official release dates are still unannounced, history shows OpenAI usually drops the new models into their existing SDK within weeks of these Arena tests. GPT Image 1.5 already slashed API costs by 20%, so we are hoping for competitive pricing here.&lt;/p&gt;

&lt;p&gt;When it does drop, integrating the new model into your Node.js apps will likely be a seamless drop-in replacement. Here is how you'll probably trigger the new high-fidelity UI generations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateUIMockup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;promptText&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🎨 Generating UI Mockup with Image V2...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 The anticipated new model name&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`A modern, clean SaaS dashboard UI. 
             Sidebar on the left with navigation. 
             Main content shows a revenue chart. 
             A bright blue button in the top right that explicitly says "Export Data". 
             High fidelity, web design, vector style.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1024x1024&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;quality&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requesting maximum text clarity&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imageUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`✅ Success! View your mockup here: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;imageUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;imageUrl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;generateUIMockup&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
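
&lt;p&gt;Once you have the URL back, pulling the mockup down to disk so you can slice it up in your design tool is straightforward. A minimal sketch, assuming the endpoint keeps returning hosted URLs (some image models return base64 via &lt;code&gt;b64_json&lt;/code&gt; instead, so check the response shape when this actually ships):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;import { writeFile } from "node:fs/promises";

// Hypothetical helper: download the generated mockup to disk.
// Node 18+ ships a global fetch, so no extra dependencies are needed.
async function saveMockup(imageUrl, outPath = "mockup.png") {
  const res = await fetch(imageUrl);
  if (!res.ok) throw new Error(`Download failed: ${res.status}`);
  const buffer = Buffer.from(await res.arrayBuffer());
  await writeFile(outPath, buffer);
  console.log(`💾 Saved to ${outPath}`);
}

// Reuses generateUIMockup() from the snippet above
await saveMockup(await generateUIMockup());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;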



&lt;h2&gt;
  
  
  🔮 The Verdict
&lt;/h2&gt;

&lt;p&gt;The biggest question now is whether OpenAI will maintain the incredible raw quality seen in the Arena, or if they will dial it back with heavy safety filters and cost-optimizations before the public API launch. &lt;/p&gt;

&lt;p&gt;Either way, the era of AI failing to spell basic words on a button is coming to an end. &lt;/p&gt;

&lt;p&gt;Have you encountered any of the "tape-alpha" models in your ChatGPT sessions this week? Did the text actually make sense? &lt;strong&gt;Let me know what you generated in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and bookmark this post so you're ready to update your API calls the minute the model officially drops!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>openai</category>
      <category>design</category>
    </item>
    <item>
      <title>🚀 The "Legacy Code" Nightmare is Over: How AI Agents are Automating App Modernization</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 07 Apr 2026 02:59:30 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-legacy-code-nightmare-is-over-how-ai-agents-are-automating-app-modernization-33j8</link>
      <guid>https://dev.to/siddhesh_surve/the-legacy-code-nightmare-is-over-how-ai-agents-are-automating-app-modernization-33j8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tyqcgzhzeyuijy30mro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tyqcgzhzeyuijy30mro.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s be honest for a second. If you’ve been a software engineer for more than a few years, you’ve probably inherited a &lt;strong&gt;"legacy monolith"&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;You know the one I'm talking about. The massive, 15-year-old codebase where business logic is hopelessly tangled with presentation layers, the original developers left a decade ago, and touching a single file breaks production. &lt;/p&gt;

&lt;p&gt;Historically, when upper management says, &lt;em&gt;"We need to move this to the cloud,"&lt;/em&gt; developers groan. The process of migrating and modernizing apps—deciding whether to &lt;strong&gt;Rehost, Refactor, or Rebuild&lt;/strong&gt;—is notoriously painful, expensive, and slow.&lt;/p&gt;

&lt;p&gt;But the meta is shifting. Microsoft just released their highly anticipated &lt;strong&gt;App Modernization Playbook&lt;/strong&gt;, and tucked inside the strategy guide is the absolute game-changer for 2026: &lt;strong&gt;Intelligent Agents&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We are no longer just using AI to write &lt;em&gt;new&lt;/em&gt; code. We are deploying autonomous AI agents to audit, decouple, and refactor our &lt;em&gt;old&lt;/em&gt; code. Here is how this is completely changing the modernization landscape. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🛑 The Old Way: Manual Portfolio Audits
&lt;/h2&gt;

&lt;p&gt;In the past, app modernization started with weeks of painful meetings. Engineers would have to manually audit dozens of applications, mapping out dependencies and guessing at the technical debt. &lt;/p&gt;

&lt;p&gt;According to Microsoft's playbook, the hardest part isn't actually moving code to Azure or AWS—it’s &lt;strong&gt;deciding which apps matter most and what to do with them.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🤖 The New Way: Agentic Discovery &amp;amp; Execution
&lt;/h2&gt;

&lt;p&gt;Instead of humans combing through thousands of lines of spaghetti code, organizations are now pointing &lt;strong&gt;AI Discovery Agents&lt;/strong&gt; at their repositories. &lt;/p&gt;

&lt;p&gt;These agents don't just read the code; they map out execution paths, identify unused endpoints, flag hardcoded credentials, and recommend the exact target architecture (e.g., Serverless, Containerization, or full Microservices). &lt;/p&gt;
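
&lt;p&gt;To make that concrete, here is a tiny, deliberately naive sketch of the kind of static signal a discovery pass collects. This is my own illustration of the concept, not Microsoft's actual agent tooling, and the regex is far cruder than anything a real agent would use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Naive "discovery" pass: walk a repo and flag hardcoded credentials.
const CREDENTIAL_PATTERN = /(password|api[_-]?key|secret)\s*[:=]\s*['"][^'"]+['"]/i;

function scanRepo(dir, findings = []) {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === "node_modules") continue;
    const fullPath = join(dir, entry.name);
    if (entry.isDirectory()) {
      scanRepo(fullPath, findings);
    } else if (entry.name.endsWith(".js")) {
      readFileSync(fullPath, "utf8").split("\n").forEach((line, i) =&amp;gt; {
        if (CREDENTIAL_PATTERN.test(line)) {
          findings.push({ file: fullPath, line: i + 1, issue: "hardcoded credential" });
        }
      });
    }
  }
  return findings;
}

console.table(scanRepo("./src"));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;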

&lt;h3&gt;
  
  
  💻 See It In Action: Breaking the Monolith
&lt;/h3&gt;

&lt;p&gt;Let's look at a conceptual example. Imagine you have a massive, tightly coupled Express.js application. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Before: The 10-Year-Old Monolith (&lt;code&gt;server.js&lt;/code&gt;)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 🍝 5,000 lines of tightly coupled spaghetti&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/orders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="c1"&gt;// Direct DB queries mixed with routing&lt;/span&gt;
       &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM Orders O JOIN Users U ON O.userId = U.id WHERE U.id = ?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

       &lt;span class="c1"&gt;// Legacy data mutation happening right in the controller&lt;/span&gt;
       &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formattedData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;qty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;

       &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formattedData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Database error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A human developer would spend hours decoupling the database logic, writing new unit tests, and moving this to a scalable microservice. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An AI Refactoring Agent&lt;/strong&gt; can autonomously parse the Abstract Syntax Tree (AST), isolate the business logic, and generate the scaffolding for a modern, decoupled cloud function (like an Azure Function):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The After: Agent-Generated Microservice&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 🚀 1. The Agent extracts the logic into a Service Layer (orderService.js)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getFormattedOrders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM Orders O JOIN Users U ON O.userId = U.id WHERE U.id = ?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;qty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// 🚀 2. The Agent generates the Serverless Handler (index.js)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;getFormattedOrders&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../services/orderService.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;⚡ Processing order request via Serverless function.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getFormattedOrders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Failed to fetch orders: &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error fetching orders&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent doesn't just rewrite the code; it &lt;em&gt;re-architects&lt;/em&gt; it for the cloud, ensuring proper separation of concerns without over-engineering the solution.&lt;/p&gt;
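
&lt;p&gt;If you want a taste of what that AST step looks like under the hood, here is a minimal sketch using the open-source &lt;code&gt;acorn&lt;/code&gt; parser (my tool choice for illustration; the playbook doesn't prescribe one). It flags the exact smell from the monolith above: handlers issuing direct &lt;code&gt;db.query&lt;/code&gt; calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;import { readFileSync } from "node:fs";
import { parse } from "acorn";
import { simple } from "acorn-walk";

const source = readFileSync("./server.js", "utf8");
const ast = parse(source, { ecmaVersion: "latest", sourceType: "module" });

// Visit every call expression and report direct db.query(...) usages.
simple(ast, {
  CallExpression(node) {
    const callee = node.callee;
    if (callee.type !== "MemberExpression") return;
    if (callee.object.type !== "Identifier") return;
    if (callee.object.name !== "db") return;
    if (callee.property.name !== "query") return;
    console.log("Direct DB access at offset", node.start, "- extract to a service layer");
  },
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;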

&lt;h2&gt;
  
  
  📘 The Microsoft Playbook Strategy
&lt;/h2&gt;

&lt;p&gt;The Microsoft App Modernization Playbook lays out a brilliant, structured approach for utilizing these agents:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Assess Value vs. Complexity:&lt;/strong&gt; Use AI agents to scan your portfolio. High business value + low complexity? That's your easy win for refactoring. Low value + high complexity? Leave it as-is or retire it. Let the data drive the decision, not developer intuition (see the toy scoring sketch right after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Right-Size the Architecture:&lt;/strong&gt; Don't default to Kubernetes for everything. Agents can analyze your traffic patterns and recommend Serverless (Azure Functions) for bursty traffic or Container Apps for consistent workloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate Execution:&lt;/strong&gt; Once the plan is set, deploy execution agents to handle the tedious scaffolding, CI/CD pipeline generation, and initial refactoring passes.&lt;/li&gt;
&lt;/ol&gt;
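
&lt;p&gt;To illustrate point 1, here is a toy triage heuristic that mirrors the value-vs-complexity quadrant. The scores, thresholds, and field names are invented for the example; a real assessment agent would derive them from telemetry and repo analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Toy triage heuristic for the value-vs-complexity quadrant.
// Scores run 1-10; the thresholds are made up for illustration.
function triage(app) {
  if (app.businessValue &amp;gt;= 7) {
    return app.complexity &amp;lt;= 4 ? "Refactor now (easy win)" : "Rebuild (strategic bet)";
  }
  return app.complexity &amp;lt;= 4 ? "Rehost as-is" : "Retire or leave it alone";
}

const portfolio = [
  { name: "billing-monolith", businessValue: 9, complexity: 3 },
  { name: "legacy-intranet", businessValue: 2, complexity: 8 },
];

for (const app of portfolio) {
  console.log(app.name, triage(app));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;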

&lt;h2&gt;
  
  
  🎯 The Takeaway
&lt;/h2&gt;

&lt;p&gt;We are entering a golden age for developers where we no longer have to be digital archaeologists digging through terrible code from 2012. By leveraging intelligent agents for discovery, assessment, and execution, we can finally focus on building new features instead of constantly putting out legacy fires.&lt;/p&gt;

&lt;p&gt;If you are currently staring down a massive migration project, I highly recommend checking out the full strategy guide. You can grab the free e-book and read the full breakdown directly from Microsoft right here: &lt;a href="https://info.microsoft.com/ww-landing-app-modernization-playbook.html" rel="noopener noreferrer"&gt;The App Modernization Playbook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What do you think?&lt;/strong&gt; Are you ready to let an AI agent loose on your company's oldest monolithic codebase, or is that a recipe for disaster? Let me know in the comments below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, smash the ❤️ and 🦄 buttons, and bookmark this post for the next time your boss asks about moving to the cloud!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>🚀 Qwen 3.6-Plus Just Dropped: The 1M-Context AI Changing the "Vibe Coding" Game</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Sat, 04 Apr 2026 04:55:21 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/qwen-36-plus-just-dropped-the-1m-context-ai-changing-the-vibe-coding-game-978</link>
      <guid>https://dev.to/siddhesh_surve/qwen-36-plus-just-dropped-the-1m-context-ai-changing-the-vibe-coding-game-978</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3f0z4g6yzy724tldxr8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3f0z4g6yzy724tldxr8.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The AI coding landscape is moving so fast it's almost impossible to keep up. Just when we thought we had our agentic workflows dialed in, Alibaba Cloud dropped a massive update: &lt;strong&gt;Qwen 3.6-Plus&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;If you've been relying on Claude Opus or GPT-4 for your autonomous coding agents, you need to pay attention to this release. Qwen 3.6-Plus is heavily optimized for "vibe coding" and repository-level problem-solving, and the benchmarks show it matching or beating industry heavyweights across the board.&lt;/p&gt;

&lt;p&gt;Here is a breakdown of what makes this new model so powerful, and how you can integrate its killer new features into your own apps today. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 1 Million Context Window (By Default)
&lt;/h2&gt;

&lt;p&gt;Let's start with the sheer size. Qwen 3.6-Plus ships with a &lt;strong&gt;1M context window out of the box&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;For everyday chat, this is overkill. But for autonomous agents? It's mandatory. This massive context allows you to dump entire codebases, API documentation, and huge log files into the prompt without worrying about truncation. &lt;/p&gt;

&lt;p&gt;Combined with its improved spatial intelligence and multimodal reasoning, you can now feed the model UI screenshots alongside thousands of lines of code and ask it to wire up the frontend autonomously. &lt;/p&gt;
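
&lt;p&gt;To put that in perspective, here is a quick sketch that packs a repo into a single prompt string and sanity-checks it against the window. The 4-characters-per-token ratio is a crude rule of thumb, not Qwen's actual tokenizer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Crude heuristic: roughly 4 characters per token. Real counts need a tokenizer.
const CONTEXT_WINDOW = 1_000_000;

function packRepo(dir) {
  let bundle = "";
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === "node_modules") continue;
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      bundle += packRepo(full);
    } else if (entry.name.endsWith(".ts")) {
      bundle += `\n// FILE: ${full}\n${readFileSync(full, "utf8")}`;
    }
  }
  return bundle;
}

const prompt = packRepo("./src");
const approxTokens = Math.ceil(prompt.length / 4);
console.log(`~${approxTokens} of ~${CONTEXT_WINDOW} tokens used`);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;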

&lt;h2&gt;
  
  
  🛡️ The Killer Feature: &lt;code&gt;preserve_thinking&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;When building my &lt;code&gt;secure-pr-reviewer&lt;/code&gt; GitHub App, one of the biggest hurdles I hit was "agent amnesia." When I feed the model a massive pull request containing complex TypeScript type definitions and Node.js backend logic, it needs to reason through the security implications across several turns. Historically, though, each time the conversation advanced (e.g., the agent called a tool, received a response, and started thinking again), the model discarded its previous internal "thinking" trace.&lt;/p&gt;

&lt;p&gt;Qwen 3.6-Plus solves this with a brand new API parameter: &lt;strong&gt;&lt;code&gt;preserve_thinking&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When enabled, the model actively retains the internal "thinking" content from &lt;em&gt;all preceding turns&lt;/em&gt; in the conversation. This drastically improves decision consistency for complex, multi-step agentic workflows, ensuring the AI doesn't lose its train of thought when executing complex automated tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 How to Use It (TypeScript Example)
&lt;/h2&gt;

&lt;p&gt;Because Alibaba's Model Studio provides an OpenAI-compatible endpoint, integrating this into your existing Node.js stack is incredibly simple. &lt;/p&gt;

&lt;p&gt;Here is how you can use the official &lt;code&gt;openai&lt;/code&gt; SDK to tap into Qwen 3.6-Plus and enable persistent reasoning for your agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Point the standard OpenAI client to Alibaba's DashScope endpoint&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DASHSCOPE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[https://dashscope-intl.aliyuncs.com/compatible-mode/v1](https://dashscope-intl.aliyuncs.com/compatible-mode/v1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runSecurityAudit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prDiff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🔍 Booting up Qwen 3.6-Plus Agent...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen3.6-plus&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are an autonomous security agent auditing code.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; 
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Please review the following PR diff:\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prDiff&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; 
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;// We pass the Qwen-specific features in the extra_body&lt;/span&gt;
    &lt;span class="c1"&gt;// @ts-ignore&lt;/span&gt;
    &lt;span class="na"&gt;extra_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;enable_thinking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;preserve_thinking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 The magic toggle for agentic workflows&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Qwen returns thinking logic under a custom property before the actual answer&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;thinkingDelta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;reasoning_content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;contentDelta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;thinkingDelta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`\x1b[90m&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;thinkingDelta&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\x1b[0m`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Print thinking in gray&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;contentDelta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;contentDelta&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Print final answer normally&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🏆 The Benchmarks: A New Standard
&lt;/h2&gt;

&lt;p&gt;If you are a numbers person, the benchmark data on Qwen 3.6-Plus is staggering. &lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;SWE-bench Verified&lt;/strong&gt;, it scores a 78.8 (edging out Claude Opus 4.5 at 76.8). It also excels at complex terminal operations, scoring a 61.6 on &lt;strong&gt;Terminal-Bench 2.0&lt;/strong&gt;. Anthropic and OpenAI have dominated the "Coding Agent" narrative for the last year, but Qwen has officially entered the chat with an "all-rounder" model that organically integrates deep logical reasoning and precise tool execution. &lt;/p&gt;

&lt;h2&gt;
  
  
  🔮 What’s Next?
&lt;/h2&gt;

&lt;p&gt;The API is available immediately via Alibaba Cloud Model Studio, and the team noted that it can be seamlessly integrated into popular open-source coding harnesses like OpenClaw and Cline.&lt;/p&gt;

&lt;p&gt;As the AI models get smarter and context windows expand, we are rapidly moving away from "AI autocomplete" and fully into the era of "AI Coworkers". &lt;/p&gt;

&lt;p&gt;Are you planning to test Qwen 3.6-Plus in your workflows? Drop your thoughts in the comments below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, don't forget to hit the ❤️ and bookmark the code snippet for your next agentic weekend project!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
