<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Siddhesh Surve</title>
    <description>The latest articles on DEV Community by Siddhesh Surve (@siddhesh_surve).</description>
    <link>https://dev.to/siddhesh_surve</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3674466%2F4395d561-d8af-4cbb-be2a-2fd3696ad2b2.png</url>
      <title>DEV Community: Siddhesh Surve</title>
      <link>https://dev.to/siddhesh_surve</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/siddhesh_surve"/>
    <language>en</language>
    <item>
      <title>Microsoft Just Dropped 'Scout': The Always-On AI Agent That Could Kill Zapier 🤯</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 09 Jun 2026 02:35:02 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/microsoft-just-dropped-scout-the-always-on-ai-agent-that-could-kill-zapier-507g</link>
      <guid>https://dev.to/siddhesh_surve/microsoft-just-dropped-scout-the-always-on-ai-agent-that-could-kill-zapier-507g</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r1eq94n8upftmqf22ri.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r1eq94n8upftmqf22ri.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the last two years, we’ve been stuck in the "Prompt-and-Wait" era of AI. You ask a question, you get a response, you copy-paste the code, and you move on. But behind the scenes, the big tech giants have been racing toward a completely different paradigm: &lt;strong&gt;Agentic AI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Yesterday, Microsoft quietly opened the floodgates on its new Frontier program, rolling out &lt;strong&gt;Microsoft Scout&lt;/strong&gt;—an "always-on" desktop agent that doesn't wait for your instructions. &lt;/p&gt;

&lt;p&gt;This isn't just another Copilot update. Scout is the first of what Microsoft is calling &lt;strong&gt;Autopilots&lt;/strong&gt;. Here is everything you need to know about this massive shift, why it might replace your existing automation stack, and how you can start thinking about always-on agent architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤖 What Exactly is Microsoft Scout?
&lt;/h2&gt;

&lt;p&gt;Scout is a persistent, native desktop client (available on both macOS and Windows) that continuously runs in the background. Instead of being a floating chat window, Scout carries its own identity and has deep, unprompted access to your entire Microsoft 365 environment, local file system, and codebase.&lt;/p&gt;

&lt;p&gt;Here’s where it gets wild:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Model Agnostic:&lt;/strong&gt; You aren't locked into one LLM. Scout features a model picker that lets you seamlessly swap between Anthropic models and OpenAI's newly released GPT-5.5.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Headless Browser Mode:&lt;/strong&gt; Scout can spin up invisible browser sessions to scrape, compile, or execute web-based tasks completely in the background without stealing your focus.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zapier-Style Orchestration:&lt;/strong&gt; It includes a visual, multi-step workflow builder directly inside the app, allowing you to chain complex logical steps without third-party integration tools.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🏗️ The Autopilot Architecture: How to Think Like Scout
&lt;/h2&gt;

&lt;p&gt;From an engineering perspective, Scout is fascinating. It moves AI from a stateless API call to a stateful, event-driven listener. &lt;/p&gt;

&lt;p&gt;If you are building your own agentic applications, you need to shift your mindset from HTTP request/response to persistent event streams. Here is a conceptual example of how you might build a local, headless "Scout-like" agent using TypeScript and Node.js.&lt;/p&gt;

&lt;p&gt;Instead of waiting for a user prompt, this agent listens for file system changes and autonomously reviews code using a headless process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;chokidar&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chokidar&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AIProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./lib/ai-engine&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
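&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;HeadlessBrowser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./lib/browser&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// Hypothetical helper module: autoCommitFixes is called below and must come from your own code&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;autoCommitFixes&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./lib/git&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;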

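&lt;span class="c1"&gt;// 1. Initialize an always-on watcher (the "Autopilot" pattern)&lt;/span&gt;
&lt;span class="c1"&gt;// Note: chokidar v4+ no longer expands globs; on those versions, watch './src' and filter by extension&lt;/span&gt;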
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;watcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chokidar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./src/**/*.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;persistent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🚀 Always-on Agent initialized. Monitoring file system...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;watcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;change&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Agent Action] Detected changes in &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Initiating background review.`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 2. Headless context gathering&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;HeadlessBrowser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scrapeContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://internal-repo.local/pr/active&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Autonomous AI Execution using an advanced model (e.g., GPT-5.5)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;AIProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-5-5&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;systemRole&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are an autonomous engineering agent.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Review &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; against the following PR context: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prContext&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;autoRemediate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// 4. Action without prompting&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hasVulnerabilities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;autoCommitFixes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reviewTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remediationCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Agent Action] Automatically patched and committed fixes for &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Agent encountered a roadblock:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the pattern? The AI isn't triggered by a chat interface. It is triggered by system events, runs headless tasks to gather context, and executes logic autonomously.&lt;/p&gt;
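
&lt;p&gt;One practical wrinkle with this pattern: an agent that commits fixes also modifies files, which fires the watcher again. Below is a minimal sketch of a guard that skips the agent's own writes and avoids overlapping reviews of the same file. The &lt;code&gt;ReviewGate&lt;/code&gt; class and its method names are hypothetical; they are not part of Scout or chokidar.&lt;/p&gt;

```typescript
// Hypothetical helper (not a Scout or chokidar API): decides whether a
// file-change event should start a new background review.
class ReviewGate {
  // Paths with a review currently running.
  private inFlight = new Set();
  // Paths the agent itself is about to write.
  private selfWrites = new Set();

  // Call right before the agent writes a fix to disk.
  markSelfWrite(filePath: string): void {
    this.selfWrites.add(filePath);
  }

  // Returns true when the caller should start a review for this event.
  shouldReview(filePath: string): boolean {
    // Set.delete returns true if the path was present: the event is our own write.
    if (this.selfWrites.delete(filePath)) return false;
    // A review for this file is already running, so drop the duplicate event.
    if (this.inFlight.has(filePath)) return false;
    this.inFlight.add(filePath);
    return true;
  }

  // Call when a review finishes, whether it succeeded or failed.
  done(filePath: string): void {
    this.inFlight.delete(filePath);
  }
}
```

&lt;p&gt;In the watcher callback above, you might return early when &lt;code&gt;gate.shouldReview(filePath)&lt;/code&gt; is false, call &lt;code&gt;gate.markSelfWrite(filePath)&lt;/code&gt; just before &lt;code&gt;autoCommitFixes&lt;/code&gt;, and call &lt;code&gt;gate.done(filePath)&lt;/code&gt; in a &lt;code&gt;finally&lt;/code&gt; block.&lt;/p&gt;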

&lt;h2&gt;
  
  
  🏰 The Ultimate Moat
&lt;/h2&gt;

&lt;p&gt;Startups have been trying to build "God-mode" AI agents for a while now, but Microsoft has an unfair advantage: &lt;strong&gt;Distribution and Ecosystem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because Microsoft owns the OS (Windows) and the underlying identity layer (Entra), they can give Scout native file-system access and deep governance controls that third-party apps can only dream of. For enterprise organizations in the Frontier program, deploying an agent that is already authenticated and sandboxed by IT is an absolute no-brainer.&lt;/p&gt;

&lt;p&gt;If Scout delivers on its promise, we are looking at the potential end of disjointed automation tools. Why pay for a Zapier subscription when your local OS agent can just watch your folders, read your emails, and execute the API calls directly?&lt;/p&gt;

&lt;h2&gt;
  
  
  👇 What do you think?
&lt;/h2&gt;

&lt;p&gt;Are we ready for always-on AI agents that operate autonomously on our desktops? Will this kill third-party automation tools, or is the ecosystem too locked down?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Let me know your thoughts in the comments below! And if you found this breakdown helpful, drop a ❤️ and follow for more deep dives into the tools shaping the future of software.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>microsoft</category>
      <category>productivity</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Codex Just Became an 'Everything Agent': Sites, Annotations, and 110 New Skills 🤯</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Thu, 04 Jun 2026 02:28:35 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/codex-just-became-an-everything-agent-sites-annotations-and-110-new-skills-3no7</link>
      <guid>https://dev.to/siddhesh_surve/codex-just-became-an-everything-agent-sites-annotations-and-110-new-skills-3no7</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ac5k8ftuih2mqylc4uj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ac5k8ftuih2mqylc4uj.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you thought OpenAI’s Codex was just a glorified autocomplete extension for your IDE, it’s time to recalibrate. &lt;/p&gt;

&lt;p&gt;OpenAI has officially transformed Codex from a developer-only utility into an autonomous agentic workflow engine. With over 5 million weekly active users, the platform's audience is shifting: 20% of its user base is now made up of non-developers, including data analysts, marketers, and designers, and this group is growing three times faster than engineers.&lt;/p&gt;

&lt;p&gt;To support this expansion, OpenAI just dropped three game-changing features: &lt;strong&gt;Role-Specific Plugins, Sites, and Annotations&lt;/strong&gt;. Here is a breakdown of why this completely changes how we build and execute work.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔌 1. Role-Specific Plugins: The End of API Glue Code
&lt;/h2&gt;

&lt;p&gt;Previously, integrating AI into business workflows meant building custom API wrappers to connect LLMs to your company's data. Codex now bypasses that entirely with six new role-specific plugins.&lt;/p&gt;

&lt;p&gt;These plugins come bundled with 110 automated skills and connect directly to 62 major enterprise applications out of the box. &lt;/p&gt;

&lt;p&gt;Here is a quick look at how the new plugins map out:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin Category&lt;/th&gt;
&lt;th&gt;Key Integrations&lt;/th&gt;
&lt;th&gt;What It Can Do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Analytics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Snowflake, Databricks Genie, Tableau, Hex&lt;/td&gt;
&lt;td&gt;Explore business data, explain metric changes, and generate dashboards.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Creative Production&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Figma, Canva, Shutterstock, Picsart&lt;/td&gt;
&lt;td&gt;Turn creative briefs into display ads and product lifestyle shots.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sales&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Salesforce, HubSpot, Slack, Outreach&lt;/td&gt;
&lt;td&gt;Build close plans, review at-risk deals, and update customer records.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Product Design&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Figma, Canva&lt;/td&gt;
&lt;td&gt;Audit user flows and turn static ideas into interactive prototypes.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;(Note: Additional plugins for public equity investing and investment banking are also included, with more on the way for legal and corporate finance.)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🌐 2. Codex Sites: Bye-Bye Static Dashboards
&lt;/h2&gt;

&lt;p&gt;We all know the pain of maintaining lightweight internal tools. Now, Codex is rolling out a feature called &lt;strong&gt;Sites&lt;/strong&gt; (currently in preview for Business and Enterprise workspaces). &lt;/p&gt;

&lt;p&gt;Sites act as a new canvas that takes your analysis, ideas, or documents and instantly generates functional, interactive web applications. Instead of passing around static spreadsheets, you can instruct Codex to spin up a scenario planner, project board, or customer review page that is hosted and shareable via a simple URL. This effectively allows cross-functional teams to bypass front-end development entirely for internal tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 3. Annotations: Fixing AI's Biggest Frustration
&lt;/h2&gt;

&lt;p&gt;If you’ve ever asked an AI to fix a single chart or update a specific function, only to watch it aggressively rewrite your entire file and break your custom formatting, you will love &lt;strong&gt;Annotations&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Annotations act as a localized context-scoping mechanism. You simply highlight a specific part of a site, spreadsheet, or document, and ask Codex to edit &lt;em&gt;just that part&lt;/em&gt;. The model applies its changes strictly within that boundary, leaving your surrounding dependencies and styles completely untouched.&lt;/p&gt;
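
&lt;p&gt;To make the idea of a scoped edit concrete, here is a minimal sketch in TypeScript. It is only an illustration of the concept, not how Codex implements Annotations; the &lt;code&gt;applyScopedEdit&lt;/code&gt; function and its parameters are hypothetical. The edit callback only ever sees the selected range, so text outside the selection cannot change.&lt;/p&gt;

```typescript
// Hypothetical illustration only; this is not Codex's implementation.
// Applies `edit` to the half-open range [start, end) of `source` and
// returns a new string with everything outside that range unchanged.
function applyScopedEdit(
  source: string,
  start: number,
  end: number,
  edit: (selected: string) => string,
): string {
  // Reject ranges that are inverted or fall outside the document.
  if (0 > start || start > end || end > source.length) {
    throw new RangeError(`Invalid selection: ${start}..${end}`);
  }
  const before = source.slice(0, start);
  const selected = source.slice(start, end);
  const after = source.slice(end);
  return before + edit(selected) + after;
}

// Usage: only the highlighted "total" cell is rewritten.
const doc = "title: Q3 | total: 10 | owner: sam";
const updated = applyScopedEdit(doc, 11, 21, (s) => s.replace("10", "12"));
console.log(updated);
```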




&lt;h2&gt;
  
  
  💻 Developer Workflows: CLI &amp;amp; IDE Powerups
&lt;/h2&gt;

&lt;p&gt;While business users are getting visual tools, developers still get massive power-ups in the terminal and IDE. Codex is heavily leaning into autonomous planning and targeted reviews. &lt;/p&gt;

&lt;p&gt;For example, you can now launch Codex from your command line and ask it to review your uncommitted working tree with hyper-specific instructions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the Codex CLI&lt;/span&gt;
codex

&lt;span class="c"&gt;# Instruct the agent to review your active working tree for specific flaws&lt;/span&gt;
/review Focus on edge cases and security issues

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also orchestrate massive codebase refactors using the &lt;code&gt;$plan&lt;/code&gt; skill directly in your IDE chat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$plan We need to refactor the auth subsystem to: 
- split responsibilities (token parsing vs session loading vs permissions)
- reduce circular imports 
Constraints: No user-visible behavior changes and keep public APIs stable.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🚀 The Verdict
&lt;/h2&gt;

&lt;p&gt;OpenAI is making it clear: the future is not about chatting with an AI; it is about delegating execution. By bundling apps, defining exact boundaries with Annotations, and generating live interfaces with Sites, Codex has leveled up from an assistant to an autonomous teammate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you rolling out Codex to your non-technical teams? Drop your thoughts on these new workflows in the comments below! 👇&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>openai</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>OpenAI Just Dropped GPT-5.5 and Codex on AWS: The Enterprise AI Game Has Changed 🚀</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 03 Jun 2026 02:24:43 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/openai-just-dropped-gpt-55-and-codex-on-aws-the-enterprise-ai-game-has-changed-4iia</link>
      <guid>https://dev.to/siddhesh_surve/openai-just-dropped-gpt-55-and-codex-on-aws-the-enterprise-ai-game-has-changed-4iia</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c6tji87mai2xmewcpfg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c6tji87mai2xmewcpfg.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been trying to build production-grade AI features in a large enterprise, you know the biggest bottleneck isn't the code—it's the procurement, security reviews, and compliance hurdles. &lt;/p&gt;

&lt;p&gt;Today, that barrier was entirely smashed. OpenAI and AWS just announced a massive expansion of their partnership, making &lt;strong&gt;GPT-5.5, GPT-5.4, and Codex generally available on Amazon Bedrock&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;This isn't just another API wrapper; it's a foundational shift in how organizations will build and deploy Agentic AI and software engineering workflows. &lt;/p&gt;

&lt;p&gt;Here is exactly what launched, why it matters, and what you can start building today.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤯 The Big Three: What Just Launched?
&lt;/h2&gt;

&lt;p&gt;The new integration brings OpenAI's frontier capabilities directly into the AWS environments where millions of customers already operate. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. GPT-5.5 &amp;amp; GPT-5.4 on Bedrock
&lt;/h3&gt;

&lt;p&gt;OpenAI's latest and most capable frontier models are now running on Amazon Bedrock's next-generation inference engine. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;GPT-5.5&lt;/strong&gt; is engineered to grasp intent faster and autonomously execute multi-step tasks.&lt;/li&gt;
&lt;li&gt;  The pricing for these models perfectly matches OpenAI's first-party rates. &lt;/li&gt;
&lt;li&gt;  Crucially for enterprise budgets, inference usage counts directly toward your existing AWS cloud commitments.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Codex for Enterprise Teams
&lt;/h3&gt;

&lt;p&gt;Codex, OpenAI's software engineering agent currently used by more than 5 million people weekly, is officially available on AWS. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Teams can use Codex to write, refactor, debug, test, and validate code across massive codebases.&lt;/li&gt;
&lt;li&gt;  It is accessible via the Bedrock API, Codex CLI, the Codex desktop app, and IDE integrations (including Visual Studio Code, JetBrains, and Xcode). &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Bedrock Managed Agents (Powered by OpenAI)
&lt;/h3&gt;

&lt;p&gt;Moving beyond single-prompt chatbots, AWS is offering &lt;strong&gt;Bedrock Managed Agents&lt;/strong&gt; built specifically with the OpenAI agent harness. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  This infrastructure is designed to unlock faster execution, sharper reasoning, and reliable steering for long-running workflows.&lt;/li&gt;
&lt;li&gt;  It handles the difficult aspects of deployment, orchestration, tool use, and governance, accelerating the transition from prototype to production.&lt;/li&gt;
&lt;li&gt;  Every agent operates with its own identity and logs every action for complete auditability.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔒 Why This is a Developer's Dream
&lt;/h2&gt;

&lt;p&gt;The most significant advantage of this release is &lt;strong&gt;Zero-Friction Security&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;By running model inference through Amazon Bedrock, every API call automatically inherits the AWS governance controls you already have in place. This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  IAM permissions for strict access control.&lt;/li&gt;
&lt;li&gt;  VPC and PrivateLink isolation to keep traffic off the public internet.&lt;/li&gt;
&lt;li&gt;  KMS encryption for your data.&lt;/li&gt;
&lt;li&gt;  AWS CloudTrail integration for comprehensive audit logging.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Furthermore, your prompts and responses are explicitly not used to train models and are never shared with model providers. &lt;/p&gt;
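
&lt;p&gt;To make the IAM point above concrete, here is a minimal sketch of a least-privilege policy that allows invoking a single model. The &lt;code&gt;buildInvokePolicy&lt;/code&gt; helper is hypothetical, and the model ID &lt;code&gt;openai.gpt-5-5&lt;/code&gt; is a placeholder, so check the Bedrock console for the real identifier before using anything like this.&lt;/p&gt;

```typescript
// Hypothetical sketch: builds an IAM policy document that allows invoking
// exactly one Bedrock foundation model in one region.
function buildInvokePolicy(region: string, modelId: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: ["bedrock:InvokeModel"],
        // Foundation-model ARNs leave the account-id segment empty.
        Resource: [`arn:aws:bedrock:${region}::foundation-model/${modelId}`],
      },
    ],
  };
}

// "openai.gpt-5-5" is an assumed placeholder model ID, not a confirmed one.
const policy = buildInvokePolicy("us-east-1", "openai.gpt-5-5");
console.log(JSON.stringify(policy, null, 2));
```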

&lt;h3&gt;
  
  
  💻 Code Example: Invoking GPT-5.5 via AWS SDK
&lt;/h3&gt;

&lt;p&gt;Here is a conceptual example of how you might invoke GPT-5.5 through Amazon Bedrock using the AWS SDK for JavaScript (v3) in your Node.js backend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;BedrockRuntimeClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;InvokeModelCommand&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@aws-sdk/client-bedrock-runtime&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Initialize the Bedrock client using your existing AWS credentials&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BedrockRuntimeClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;us-east-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateWithGPT55&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InvokeModelCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="c1"&gt;// Point directly to the new OpenAI GPT-5.5 model on Bedrock&lt;/span&gt;
    &lt;span class="na"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai.gpt-5-5&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="na"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;command&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Agent Response:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error invoking GPT-5.5 on AWS:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;generateWithGPT55&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Analyze this log dataset and outline a multi-step remediation plan.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(Note: Model IDs and exact payload structures will depend on the final Amazon Bedrock API spec for OpenAI models.)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔮 What’s Next: Project Daybreak
&lt;/h2&gt;

&lt;p&gt;This launch is just the beginning. During the announcement, OpenAI teased that their highly anticipated cybersecurity initiative, &lt;strong&gt;Daybreak&lt;/strong&gt;, is coming to AWS soon.&lt;/p&gt;

&lt;p&gt;Daybreak is designed to fundamentally change how software is built and defended. It includes cyber models and Codex Security, which will help security teams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify vulnerabilities early in the lifecycle.&lt;/li&gt;
&lt;li&gt;Conduct secure code reviews and threat modeling.&lt;/li&gt;
&lt;li&gt;Generate automated patch validations and dependency risk analyses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When Daybreak arrives on Bedrock, security teams will be able to seamlessly adopt these AI-assisted defense tools through the exact same AWS operational frameworks they already rely on.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>programming</category>
      <category>news</category>
    </item>
    <item>
      <title>The $575B AI Bet: What Big Tech's Infrastructure War Means for Everyday Developers</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 02 Jun 2026 02:39:36 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-575b-ai-bet-what-big-techs-infrastructure-war-means-for-everyday-developers-52fb</link>
      <guid>https://dev.to/siddhesh_surve/the-575b-ai-bet-what-big-techs-infrastructure-war-means-for-everyday-developers-52fb</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0oydye13bs65n1rsftf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0oydye13bs65n1rsftf.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We are currently witnessing one of the largest infrastructure build-outs in modern history. With hundreds of billions of dollars being poured into AI data centers and compute power, the landscape of software engineering is fundamentally shifting.&lt;/p&gt;

&lt;p&gt;While the giants play a massive game of margin and market share, what does this actually mean for those of us writing code, building apps, and managing production systems? &lt;/p&gt;

&lt;p&gt;Here is a breakdown of why the data stack and AI have completely fused, and how you can position yourself to win in this new era of engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏗️ The Fusion of Data and AI Workflows
&lt;/h2&gt;

&lt;p&gt;In the past, you had your application layer, and somewhere far away, a data engineering team managed the pipelines. That boundary is gone. Building intelligent applications today means your core product &lt;em&gt;is&lt;/em&gt; the data pipeline. &lt;/p&gt;

&lt;p&gt;When you are architecting systems that need to process massive streams of events—like real-time ad bidding or personalized recommendation engines—you can't just slap an API wrapper around an LLM and call it a day. The infrastructure has to be deeply integrated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Example: Creating a Context-Aware Event Processor
&lt;/h3&gt;

&lt;p&gt;Here is a simplified TypeScript (Node.js) example of embedding AI reasoning directly into an event stream, rather than treating it as an afterthought:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;EventStream&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./lib/streaming&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AIProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./lib/ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AdEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Processing high-throughput events with inline AI evaluation&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processEventStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;EventStream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AdEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// 1. Fetch real-time user embeddings (The Data layer)&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userProfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetchUserEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="c1"&gt;// 2. Inline AI evaluation for hyper-personalization (The AI layer)&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;AIProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fast-inference-v2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Evaluate intent based on context: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; and profile: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userProfile&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="c1"&gt;// 3. Execute downstream logic&lt;/span&gt;
      &lt;span class="nf"&gt;executeBid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Pipeline failure for event &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is becoming the standard. The engineers who will thrive over the next few years are the ones who can bridge the gap between heavy data infrastructure and application logic.&lt;/p&gt;
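
&lt;p&gt;One thing the handler above leaves open is backpressure: every incoming event starts a model call immediately. Here is a minimal Python sketch of one way to bound that, assuming a cap on in-flight calls and a per-event latency budget. &lt;code&gt;fake_model_call&lt;/code&gt;, &lt;code&gt;MAX_IN_FLIGHT&lt;/code&gt;, and &lt;code&gt;TIMEOUT_SECONDS&lt;/code&gt; are hypothetical names used only for illustration, not part of any real SDK:&lt;/p&gt;

```python
import asyncio

MAX_IN_FLIGHT = 4       # assumed cap on concurrent model calls
TIMEOUT_SECONDS = 0.1   # assumed per-event latency budget


async def fake_model_call(event):
    # Hypothetical stand-in for a real inference call.
    await asyncio.sleep(event["latency"])
    return "bid"


async def evaluate(event, semaphore):
    # The semaphore limits how many model calls run at the same time.
    async with semaphore:
        try:
            decision = await asyncio.wait_for(
                fake_model_call(event), timeout=TIMEOUT_SECONDS
            )
        except asyncio.TimeoutError:
            # Fall back to a safe default instead of stalling the stream.
            decision = "skip"
    return event["user_id"], decision


async def process_events(events):
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    return await asyncio.gather(*(evaluate(e, semaphore) for e in events))


def run_demo():
    events = [
        {"user_id": "u1", "latency": 0.0},
        {"user_id": "u2", "latency": 1.0},  # too slow, should time out
        {"user_id": "u3", "latency": 0.0},
    ]
    return asyncio.run(process_events(events))


if __name__ == "__main__":
    print(run_demo())
```

&lt;p&gt;Falling back to a default decision on timeout is a design choice that suits latency-sensitive paths like bidding; a batch pipeline might prefer to retry instead.&lt;/p&gt;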

&lt;h2&gt;
  
  
  🛠️ The Rise of the "Builder-Marketer"
&lt;/h2&gt;

&lt;p&gt;Another massive shift is how products go to market. The barrier to building software has dropped to near zero. Anyone can spin up a SaaS clone over the weekend. So, what is the moat?&lt;/p&gt;

&lt;p&gt;The moat is &lt;strong&gt;distribution and community&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You can't just be a developer anymore; you have to understand the tools and the market. If you are reviewing emerging tech, creating tutorials, or sharing your development journey through video content, you are building a distribution channel that cannot be easily replicated by a new competitor. The "Right to Win" in software now requires a relentless focus on execution and a direct line to your audience.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Key Takeaways for 2026
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Understand the Infrastructure:&lt;/strong&gt; Don't just learn how to prompt; learn how the models are served, how context windows manage memory, and how vector databases scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build Your Distribution:&lt;/strong&gt; Whether it's writing articles or producing video content analyzing new tools, start building an audience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Merge the Stacks:&lt;/strong&gt; Stop treating data engineering and full-stack development as separate disciplines. They are one and the same now.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tools are evolving faster than ever, but the fundamental principles of building scalable, robust systems remain.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What are your thoughts on the current state of AI infrastructure? Are you seeing this fusion of data and app logic in your own projects? Let's discuss in the comments below! 👇&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>career</category>
    </item>
    <item>
      <title>Anthropic Just Dropped Claude Opus 4.8: The Era of 'Dynamic Workflows' is Here 🚀</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Fri, 29 May 2026 23:25:51 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/anthropic-just-dropped-claude-opus-48-the-era-of-dynamic-workflows-is-here-3oo8</link>
      <guid>https://dev.to/siddhesh_surve/anthropic-just-dropped-claude-opus-48-the-era-of-dynamic-workflows-is-here-3oo8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydpgjdcztp892zx2po1a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydpgjdcztp892zx2po1a.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been tracking the evolution of Large Language Models this year, you know the bottleneck isn’t usually raw intelligence anymore—it’s orchestration. How do you get an AI to refactor a massive, messy, 100,000-line monolithic codebase without it hallucinating halfway through or losing context?&lt;/p&gt;

&lt;p&gt;Yesterday, Anthropic released &lt;strong&gt;Claude Opus 4.8&lt;/strong&gt;, and it completely shifts the paradigm. This isn't just a minor model bump; it's a foundational upgrade focused heavily on &lt;strong&gt;Agentic AI&lt;/strong&gt; and enterprise-scale execution. &lt;/p&gt;

&lt;p&gt;If you build AI applications, automated workflows, or just use AI to write code, here is exactly why Opus 4.8 is a game-changer.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤯 1. "Dynamic Workflows": Massively Parallel Subagents
&lt;/h2&gt;

&lt;p&gt;The standout feature of this release is the introduction of &lt;strong&gt;Dynamic Workflows&lt;/strong&gt; in Claude Code. &lt;/p&gt;

&lt;p&gt;We are finally moving past the linear "prompt-and-wait" model. Opus 4.8 is designed to plan a massive task and then dynamically spin up &lt;strong&gt;hundreds of parallel subagents&lt;/strong&gt; in a single session. &lt;/p&gt;

&lt;p&gt;Imagine you need to execute a codebase-scale migration. Opus 4.8 can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Map the architecture.&lt;/li&gt;
&lt;li&gt;Spin up 50 isolated subagents to update individual microservices concurrently.&lt;/li&gt;
&lt;li&gt;Run the existing test suite as its quality bar.&lt;/li&gt;
&lt;li&gt;Verify its own outputs before reporting back to you for the final PR merge.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is the kind of heavy-lifting capability that turns an LLM from a "coding assistant" into something much closer to an autonomous engineer.&lt;/p&gt;
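
&lt;p&gt;To make the four steps above concrete, here is a rough Python sketch of the fan-out-then-verify shape. It is not how Claude Code implements Dynamic Workflows; &lt;code&gt;update_service&lt;/code&gt; and &lt;code&gt;run_migration&lt;/code&gt; are hypothetical stand-ins, and the failing test suite is simulated:&lt;/p&gt;

```python
import asyncio


async def update_service(name):
    # Hypothetical subagent: pretend to migrate one service and run its tests.
    await asyncio.sleep(0)
    tests_passed = name != "billing"  # simulate one failing test suite
    return {"service": name, "tests_passed": tests_passed}


async def run_migration(services):
    # Fan out one subagent per service and wait for all of them.
    results = await asyncio.gather(*(update_service(s) for s in services))
    # Use the test results as the quality bar before reporting back.
    failed = [r["service"] for r in results if not r["tests_passed"]]
    return {"ready_to_merge": not failed, "failed": failed}


def run_demo():
    return asyncio.run(run_migration(["auth", "billing", "search"]))


if __name__ == "__main__":
    print(run_demo())
```

&lt;p&gt;The point of the sketch is the gate at the end: nothing is reported as ready to merge until every subagent's tests pass.&lt;/p&gt;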

&lt;h2&gt;
  
  
  🎛️ 2. Effort Control (Stop Wasting Tokens)
&lt;/h2&gt;

&lt;p&gt;Not every task needs the AI to ponder the universe. Opus 4.8 introduces a new &lt;strong&gt;Effort Control&lt;/strong&gt; slider in claude.ai. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Low Effort:&lt;/strong&gt; Faster responses and much lower rate-limit consumption (perfect for boilerplate or quick regex fixes).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;High/Extra Effort:&lt;/strong&gt; Claude stops to "think" more frequently and deeply, maximizing reasoning for complex, long-running asynchronous workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best of all? The base pricing hasn't changed. It’s still $5/M input and $25/M output tokens, but you now have surgical control over how your compute budget is spent.&lt;/p&gt;
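
&lt;p&gt;For a sense of scale, here is a quick back-of-the-envelope calculation in Python using the per-token prices quoted above. The token counts are made up, and the idea that a higher effort setting mostly shows up as more output tokens is an assumption for illustration:&lt;/p&gt;

```python
INPUT_USD_PER_MTOK = 5.0    # price per million input tokens, from the article
OUTPUT_USD_PER_MTOK = 25.0  # price per million output tokens, from the article


def estimate_cost_usd(input_tokens, output_tokens):
    # Cost = tokens * price per million tokens, summed over both directions.
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Same 200k-token prompt, two hypothetical output sizes.
low = estimate_cost_usd(200_000, 10_000)
high = estimate_cost_usd(200_000, 60_000)

print(f"low effort:  ${low:.2f}")
print(f"high effort: ${high:.2f}")
```

&lt;p&gt;Because output tokens cost five times as much as input tokens, the output side can dominate the bill quickly on long-running tasks.&lt;/p&gt;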

&lt;h2&gt;
  
  
  🛠️ 3. Mid-Flight Prompt Updates (The Messages API Upgrade)
&lt;/h2&gt;

&lt;p&gt;This is a massive win for developers building agentic wrappers. The Messages API now accepts &lt;code&gt;system&lt;/code&gt; entries &lt;em&gt;inside&lt;/em&gt; the messages array. &lt;/p&gt;

&lt;p&gt;Previously, if you wanted to update an agent's permissions or token budget while it was running a multi-step task, you had to break the prompt cache or route it clumsily through a user turn. Now, you can inject system updates mid-task seamlessly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Example: Injecting System Instructions Mid-Task
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Simulating an agent in the middle of a massive log analysis task
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze this 10GB distributed system log and find the latency spike root cause.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting parallel log analysis across 5 nodes...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;# 🔥 NEW IN 4.8: Injecting a system-level constraint mid-conversation 
&lt;/span&gt;    &lt;span class="c1"&gt;# without breaking the flow or treating it as a user message.
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SYSTEM UPDATE: Memory budget critical. Cease deep analysis. Output ONLY the exact timestamp and microservice name of the failure. Do not explain your reasoning.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Continue execution.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

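&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;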

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🛡️ 4. The End of "Confident Hallucinations"
&lt;/h2&gt;

&lt;p&gt;According to the release notes and early testers (including the CEO of Cognition, the team behind Devin), Opus 4.8 fixes the verbosity and tool-calling hiccups of 4.7.&lt;/p&gt;

&lt;p&gt;More importantly, it is &lt;strong&gt;4x less likely to let flaws in its own code pass unremarked.&lt;/strong&gt; Instead of confidently claiming it fixed a bug while secretly breaking two other things, Opus 4.8 proactively flags uncertainties in its inputs and outputs. For autonomous workloads that need to run unattended overnight, this honesty is critical.&lt;/p&gt;




&lt;h2&gt;
  
  
  What’s Next? (Enter: Project Glasswing)
&lt;/h2&gt;

&lt;p&gt;Anthropic casually dropped a teaser at the end of their announcement: &lt;strong&gt;Claude Mythos Preview&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is an upcoming class of models with even higher intelligence than Opus, currently being tested by a small group for advanced cybersecurity work. If Opus 4.8 is the orchestration king, Mythos looks like it might break the intelligence ceiling entirely in the coming weeks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Have your say 👇
&lt;/h3&gt;

&lt;p&gt;Are you building Agentic AI workflows? How are you handling the orchestration problem today, and will you be testing out Opus 4.8's dynamic workflows?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drop your thoughts in the comments!&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you enjoyed this breakdown, hit the ❤️ and follow me for more deep dives into Large Language Models, cloud computing scaling, and the future of software architecture!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>anthropic</category>
      <category>claude</category>
    </item>
    <item>
      <title>xAI Just Dropped 'Grok Build': The Terminal-Native Agentic AI Changing How We Code</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 27 May 2026 02:48:26 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/xai-just-dropped-grok-build-the-terminal-native-agentic-ai-changing-how-we-code-3bi1</link>
      <guid>https://dev.to/siddhesh_surve/xai-just-dropped-grok-build-the-terminal-native-agentic-ai-changing-how-we-code-3bi1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42onmbp18s34na9dw2up.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42onmbp18s34na9dw2up.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been paying attention to the rapid evolution of Agentic AI this year, you know the battleground has shifted from web interfaces to where the real work happens: &lt;strong&gt;the terminal&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Yesterday, xAI quietly dropped a massive bombshell for developers: &lt;strong&gt;Grok Build&lt;/strong&gt;, a powerful new coding agent and CLI designed for professional software engineering and complex workflows. Currently in early beta for SuperGrok and X Premium Plus subscribers, this isn't just another autocomplete wrapper. It's a full-fledged autonomous agent ecosystem living right inside your command line.&lt;/p&gt;

&lt;p&gt;Here is everything you need to know about Grok Build, why it matters, and how you can start orchestrating it today.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 What Makes Grok Build Different?
&lt;/h2&gt;

&lt;p&gt;We are moving past the era of single-prompt chatbots. The future belongs to &lt;strong&gt;Agentic AI&lt;/strong&gt;—systems that can reason, plan, spin up sub-tasks, and execute complex operations autonomously. Grok Build brings this paradigm directly to your local development environment. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. Plan, Review, Approve (No More Rogue AI)
&lt;/h3&gt;

&lt;p&gt;One of the biggest headaches with coding agents is when they go off the rails and rewrite half your codebase before you realize what's happening. Grok Build introduces a strictly governed &lt;strong&gt;Plan Mode&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Before it executes a complex task, it generates a &lt;code&gt;plan.md&lt;/code&gt;. You can review the steps, leave comments on individual items, or completely rewrite the strategy. Once you give the green light, changes appear as clean diffs.&lt;/p&gt;
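
&lt;p&gt;The review-then-approve flow described above is a general pattern you can apply to any agent. Below is a minimal Python sketch of an approval gate. It does not reflect Grok Build's internals or the format of its &lt;code&gt;plan.md&lt;/code&gt;; &lt;code&gt;PlanStep&lt;/code&gt; and &lt;code&gt;execute_plan&lt;/code&gt; are hypothetical names:&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    description: str
    approved: bool = False
    comments: list = field(default_factory=list)


def execute_plan(steps, apply_step):
    # Refuse to touch anything until every step has been approved.
    pending = [step.description for step in steps if not step.approved]
    if pending:
        raise PermissionError(f"Unapproved steps: {pending}")
    return [apply_step(step) for step in steps]


plan = [PlanStep("Update lodash"), PlanStep("Run test suite")]
plan[0].comments.append("Pin to the latest 4.x release")  # reviewer note
for step in plan:
    step.approved = True  # the human gives the green light

applied = execute_plan(plan, lambda step: f"applied: {step.description}")
print(applied)
```

&lt;p&gt;Raising an error on any unapproved step, rather than silently skipping it, keeps a partially reviewed plan from being half-applied.&lt;/p&gt;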

&lt;h3&gt;
  
  
  2. Parallel Subagents: The "Hive Mind" Approach
&lt;/h3&gt;

&lt;p&gt;This is where Grok Build flexes its muscles. For massive undertakings (like finding the source of a p99 latency regression across microservices), Grok Build doesn't just read logs linearly. It delegates work to &lt;strong&gt;specialized subagents that run in parallel&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;It can spin up isolated worktrees, sending one agent to investigate DB query plans, another to check cache hit rates, and a third to profile a pricing engine—all at the exact same time. &lt;/p&gt;

&lt;h3&gt;
  
  
  3. Out-of-the-Box MCP &amp;amp; Plugin Support
&lt;/h3&gt;

&lt;p&gt;Grok Build doesn't force you into a walled garden. It instantly picks up your repository's conventions. Your &lt;code&gt;AGENTS.md&lt;/code&gt;, existing hooks, plugins, and &lt;strong&gt;MCP (Model Context Protocol) servers&lt;/strong&gt; work seamlessly from day one. &lt;/p&gt;




&lt;h2&gt;
  
  
  💻 Getting Started with Grok Build
&lt;/h2&gt;

&lt;p&gt;Ready to spin it up? If you have the required subscription, installation is a breeze. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. The One-Line Install
&lt;/h3&gt;

&lt;p&gt;Pop open your terminal and run the bootstrap script:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://x.ai/cli/install.sh | bash

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, just type &lt;code&gt;grok-build&lt;/code&gt; to authenticate and start your first session.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Orchestrating with Headless Mode
&lt;/h3&gt;

&lt;p&gt;For engineers building CI/CD pipelines or custom agent orchestration apps, Grok Build includes a headless mode (&lt;code&gt;-p&lt;/code&gt;). This allows you to trigger agents inside bash scripts and automations without human intervention.&lt;/p&gt;

&lt;p&gt;Here’s a quick example of how you might use Grok Build headlessly to automate dependency audits in a nightly CI run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Nightly Security &amp;amp; Dependency Audit via Grok Build&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Initiating Grok Build headless agent..."&lt;/span&gt;

&lt;span class="c"&gt;# Run Grok Build completely headless to audit and output a report&lt;/span&gt;
grok-build &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Audit the package.json for deprecated libraries. Generate a plan to update them to the latest stable versions, run the test suite, and output the results to audit_report.md."&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; audit_report.md &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;audit_report.md
  &lt;span class="c"&gt;# You could easily pipe this output directly into an automated PR reviewer tool&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Audit report was not generated."&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧠 The Verdict: A Step Toward Autonomous Engineering
&lt;/h2&gt;

&lt;p&gt;Grok Build isn't just a new CLI; it’s a foundational piece of infrastructure for the next generation of AI-driven development. By combining human-in-the-loop approvals with massively parallel execution and automation readiness, xAI is positioning Grok as a serious contender for enterprise-grade Agentic workflows.&lt;/p&gt;

&lt;p&gt;Whether you are optimizing large e-commerce architectures or just trying to build faster, having a fleet of parallel subagents living in your terminal is a big step forward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you planning to test out Grok Build? How do you see terminal-native agents fitting into your daily workflow? Let's discuss in the comments below! 👇&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and follow for more deep dives into Agentic AI, LLM memory architectures, and the tooling that's shaping the future of software engineering.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>tools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>🚀 Cursor Just Dropped Composer 2.5: Why The AI Coding War Just Got Serious</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 20 May 2026 02:01:31 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/cursor-just-dropped-composer-25-why-the-ai-coding-war-just-got-serious-18pp</link>
      <guid>https://dev.to/siddhesh_surve/cursor-just-dropped-composer-25-why-the-ai-coding-war-just-got-serious-18pp</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnzhydynpypfuzghkywhx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnzhydynpypfuzghkywhx.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you thought AI coding assistants were already moving fast, buckle up. Cursor just dropped &lt;strong&gt;Composer 2.5&lt;/strong&gt;, their smartest and most capable coding model yet. &lt;/p&gt;

&lt;p&gt;While previous iterations were great at churning out boilerplate, Composer 2.5 represents a massive leap in handling &lt;em&gt;long-horizon&lt;/em&gt; coding work. We are talking about the kind of complex, multi-step problems that take hundreds of tool calls to get right.&lt;/p&gt;

&lt;p&gt;Here is everything you need to know about the update, the tech behind it, and why it is a big deal for developers. 👇&lt;/p&gt;




&lt;h3&gt;
  
  
  🧠 Smarter on the Hard Stuff
&lt;/h3&gt;

&lt;p&gt;The biggest bottleneck with AI coding tools has always been sustained context. They start strong, but lose the plot after a few files. &lt;/p&gt;

&lt;p&gt;Cursor tackled this head-on by scaling their training significantly. Composer 2.5 was trained on &lt;strong&gt;25x more synthetic reinforcement learning (RL) tasks&lt;/strong&gt; than its predecessor.&lt;/p&gt;

&lt;p&gt;However, simply scaling data is not enough. When a model's rollout spans hundreds of thousands of tokens, it becomes incredibly difficult to assign credit—meaning the AI struggles to know &lt;em&gt;which&lt;/em&gt; specific decision helped or hurt the outcome. &lt;/p&gt;

&lt;p&gt;To fix this, Cursor introduced &lt;strong&gt;targeted textual feedback during RL&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Instead of waiting for the end of a rollout to penalize the model, they provide feedback directly at the exact point where the model messed up. For example, if the model makes a bad tool call or provides a confusing explanation, it receives a localized hint describing the desired improvement. This shapes crucial behaviors like communication style and effort calibration, making the AI genuinely more pleasant to collaborate with.&lt;/p&gt;




&lt;h3&gt;
  
  
  💻 The Code: How to Leverage Long-Horizon Agents
&lt;/h3&gt;

&lt;p&gt;Because Composer 2.5 is built for sustained work, you can hand it much larger architectural tasks. Instead of asking for a single function, you can describe a multi-step agentic workflow in a project rules file such as &lt;code&gt;.cursorrules&lt;/code&gt; (recent Cursor versions prefer rule files under &lt;code&gt;.cursor/rules/&lt;/code&gt;, but the idea is the same).&lt;/p&gt;

&lt;p&gt;Here is an example of how you might instruct a long-horizon model like Composer 2.5 to autonomously refactor a legacy codebase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# .cursorrules&lt;/span&gt;

You are an expert systems architect. Your task is to refactor the legacy &lt;span class="sb"&gt;`auth`&lt;/span&gt; module into a modern, scalable service.

&lt;span class="gu"&gt;## Workflow Execution Steps:&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; &lt;span class="gs"&gt;**Analyze:**&lt;/span&gt; Read all files in the &lt;span class="sb"&gt;`/src/legacy_auth`&lt;/span&gt; directory.
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Plan:**&lt;/span&gt; Draft a migration plan and wait for my approval before writing code.
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Execute:**&lt;/span&gt; Implement the new JWT-based auth flow across all middleware.
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Test:**&lt;/span&gt; Generate unit tests for the new implementation. 
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Verify:**&lt;/span&gt; Run the tests using the terminal tool. If any fail, autonomously fix the errors until all tests pass.

&lt;span class="ge"&gt;*Note: If you encounter a missing dependency, use the terminal to install it.*&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Composer 2.5's improved tool use and behavioral shaping, it can actually execute a multi-step loop like this without losing the thread halfway through.&lt;/p&gt;




&lt;h3&gt;
  
  
  🏗️ The Elephant in the Room: The Kimi Base
&lt;/h3&gt;

&lt;p&gt;There was a lot of community noise around Composer 2 being built on top of Moonshot AI's open-source Kimi K2.5 checkpoint. Cursor acknowledged this, and confirmed that Composer 2.5 also builds on the same Kimi K2.5 open-source checkpoint.&lt;/p&gt;

&lt;p&gt;However, Cursor's secret sauce is their post-training. The continued pretraining and massive RL pipeline are what give Composer its specific developer-centric "feel".&lt;/p&gt;




&lt;h3&gt;
  
  
  💸 The Pricing is Absurdly Good
&lt;/h3&gt;

&lt;p&gt;Despite matching or beating frontier models on benchmarks, Cursor has kept the price aggressively low.&lt;/p&gt;

&lt;p&gt;Composer 2.5 is priced identically to Composer 2: &lt;strong&gt;$0.50 per million input tokens and $2.50 per million output tokens&lt;/strong&gt;. This is a fraction of the cost of OpenAI's GPT-5.5 or Anthropic's Opus 4.7.&lt;/p&gt;
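
&lt;p&gt;To put those rates in perspective, here is a rough cost sketch. The token counts are made-up illustrative numbers, not measurements from Cursor, and &lt;code&gt;estimateCostUsd&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// Composer 2.5 rates quoted above, in USD per million tokens.
const INPUT_USD_PER_MILLION = 0.5;
const OUTPUT_USD_PER_MILLION = 2.5;

// Hypothetical helper: estimate the cost of one agent session from token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Illustrative numbers only: a long session that reads 2M tokens and writes 200k.
const sessionCost = estimateCostUsd(2_000_000, 200_000);
console.log(`Estimated session cost: $${sessionCost.toFixed(2)}`); // prints "Estimated session cost: $1.50"
```

&lt;p&gt;Input tokens tend to dominate long-horizon sessions because the agent keeps re-reading files and tool output, so the low input rate matters more than it first appears.&lt;/p&gt;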




&lt;h3&gt;
  
  
  🚀 What's Next: The SpaceXAI Collab
&lt;/h3&gt;

&lt;p&gt;Cursor is not stopping here. They announced that they are currently working with &lt;strong&gt;SpaceXAI&lt;/strong&gt; to train a significantly larger model entirely from scratch. Using 10x more total compute on the Colossus 2 supercomputer, this upcoming model is expected to be a massive leap in capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you using Composer 2.5 yet?&lt;/strong&gt; Drop your thoughts in the comments below! 👇&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>OpenAI Just Turned ChatGPT into a Financial Advisor (Here's How to Build Your Own)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 19 May 2026 02:26:18 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/openai-just-turned-chatgpt-into-a-financial-advisor-heres-how-to-build-your-own-1i6k</link>
      <guid>https://dev.to/siddhesh_surve/openai-just-turned-chatgpt-into-a-financial-advisor-heres-how-to-build-your-own-1i6k</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3riz6m8jk1ldfhi9rfi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3riz6m8jk1ldfhi9rfi.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've been putting off organizing your finances, OpenAI just eliminated your last excuse. &lt;/p&gt;

&lt;p&gt;OpenAI has officially launched a new "Personal Finance" experience directly inside ChatGPT. This isn't just a prompt template or a custom GPT; this is a native, deep integration with your actual bank accounts, powered by Plaid. &lt;/p&gt;

&lt;p&gt;This marks OpenAI's biggest leap into consumer financial services, transforming the chatbot from a simple text generator into a highly personalized financial analyst. &lt;/p&gt;

&lt;p&gt;Here is exactly what this new feature does, the privacy implications you need to know about, and how you can build a similar workflow yourself using the Plaid and OpenAI APIs.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤯 What the ChatGPT-Plaid Integration Actually Does
&lt;/h2&gt;

&lt;p&gt;The magic here is the context. Budgeting apps have existed for decades, but they typically only show you static charts and fixed categories. ChatGPT combines your raw financial data with conversational reasoning capabilities.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Massive Connectivity:&lt;/strong&gt; The Plaid integration allows ChatGPT to securely connect to over 12,000 financial institutions, including Chase, Fidelity, Schwab, Robinhood, American Express, and Capital One.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The Financial Dashboard:&lt;/strong&gt; Once synced, ChatGPT automatically generates a unified dashboard displaying your portfolio performance, spending trends, recurring subscriptions, and upcoming payments.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Natural Language Analysis:&lt;/strong&gt; You can ask complex, contextual questions like, "What did my recent vacation actually cost me?" or "Help me build a plan to buy a house in my area in the next 5 years". &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This entire system is powered by OpenAI's newly updated &lt;strong&gt;GPT-5.5&lt;/strong&gt; model, which has been specifically fine-tuned and benchmarked with finance experts to handle personal finance queries with enhanced reasoning.&lt;/p&gt;

&lt;h2&gt;
  
  
  🛡️ The Privacy and Security Catch
&lt;/h2&gt;

&lt;p&gt;Handing your entire financial history over to an AI model sounds like a security nightmare, but the integration is deliberately locked down to limit what the model can see and do.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Read-Only Access:&lt;/strong&gt; ChatGPT cannot move your money, pay bills, change account settings, or make trades. The connection is entirely read-only.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Data Deletion:&lt;/strong&gt; You can disconnect your accounts at any time from the settings menu. Once you do, OpenAI states that your synced financial data is deleted from their systems within 30 days.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Financial Memories:&lt;/strong&gt; ChatGPT saves contextual details about your mortgage, savings goals, or private loans as "Financial Memories" so it doesn't treat every query in isolation. You have full control to review and delete these memories at any point.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  💸 The $200/Month Paywall
&lt;/h2&gt;

&lt;p&gt;There is one massive hurdle for the average user: &lt;strong&gt;Price&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;As of launch, this personal finance suite is exclusively available to &lt;strong&gt;ChatGPT Pro&lt;/strong&gt; subscribers located in the United States. The Pro tier currently costs a staggering $200 per month. While OpenAI plans to eventually roll this feature out to Plus ($20) and free users, there is no set timeline yet. &lt;/p&gt;

&lt;h2&gt;
  
  
  💻 Build It Yourself: The Developer Approach
&lt;/h2&gt;

&lt;p&gt;As developers, paying $2,400 a year to analyze our own data feels fundamentally wrong. If you want to replicate the core functionality of ChatGPT's new tool, you can build a streamlined Node.js pipeline using the Plaid API to fetch your transactions and the OpenAI API to analyze them.&lt;/p&gt;

&lt;p&gt;Here is a basic TypeScript implementation to get you started:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;PlaidApi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;PlaidEnvironments&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;plaid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// 1. Initialize the Plaid client (sandbox here; switch to PlaidEnvironments.production for real accounts)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;plaidClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PlaidApi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;basePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PlaidEnvironments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sandbox&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PLAID-CLIENT-ID&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PLAID_CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PLAID-SECRET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PLAID_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Initialize the OpenAI Client&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;analyzeMySpending&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Fetching transactions from Plaid...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Retrieve 30 days of transactions (dates are hardcoded for this example; results are paginated, so large histories need multiple calls)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;plaidClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transactionsGet&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;access_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2026-04-18&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="na"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2026-05-18&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. Format the transaction data for the LLM context window&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transactions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - $&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (Category: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;personal_finance_category&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;primary&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unknown&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;
  &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Analyzing data with OpenAI...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 5. Send the structured data to GPT for financial reasoning&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Or route this to a local model if privacy is a top concern&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are an elite, analytical financial advisor. 
                  Review the user's transactions, identify the top 3 spending categories, 
                  flag any unusually high recurring subscriptions, and provide one actionable tip to increase their savings rate.`&lt;/span&gt; 
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Here is my transaction history for the last 30 days:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; 
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;### AI Financial Report ###&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;aiResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By hooking this script up to a simple cron job and piping the output to Slack or a local dashboard, you can build your own automated financial analyst for pennies on the dollar compared to the Pro subscription.&lt;/p&gt;
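
&lt;p&gt;One thing to change before scheduling it: the script above hardcodes &lt;code&gt;start_date&lt;/code&gt; and &lt;code&gt;end_date&lt;/code&gt;. A minimal sketch of computing a rolling window at runtime might look like this (&lt;code&gt;lastNDaysRange&lt;/code&gt; is a hypothetical helper, not part of the Plaid SDK):&lt;/p&gt;

```typescript
// Hypothetical helper: build a rolling date window in the YYYY-MM-DD format
// used for start_date and end_date in the script above.
function lastNDaysRange(days: number, now: Date = new Date()) {
  const toIsoDate = (d: Date): string => d.toISOString().slice(0, 10);
  const start = new Date(now.getTime());
  // Work in UTC so the result does not depend on the server's time zone.
  start.setUTCDate(start.getUTCDate() - days);
  return { start_date: toIsoDate(start), end_date: toIsoDate(now) };
}

// With a fixed "now", the window matches the hardcoded dates in the example.
const range = lastNDaysRange(30, new Date("2026-05-18T12:00:00Z"));
console.log(range); // { start_date: '2026-04-18', end_date: '2026-05-18' }
```

&lt;p&gt;You could then spread the result into the request, for example &lt;code&gt;{ access_token: accessToken, ...lastNDaysRange(30) }&lt;/code&gt;.&lt;/p&gt;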

&lt;p&gt;The AI landscape is rapidly moving from simple text generation to autonomous, data-connected workflows. OpenAI's move into personal finance is a massive indicator that foundation models are pushing hard into our most critical personal infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Make sure to never commit your Plaid API keys, and always keep your dependencies updated!)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openai</category>
      <category>ai</category>
      <category>fintech</category>
      <category>webdev</category>
    </item>
    <item>
      <title>🚀 Meta Just Killed Open Source Llama: Welcome to the 'Muse Spark' Era (And What It Means for Developers)</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Thu, 14 May 2026 02:52:29 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/meta-just-killed-open-source-llama-welcome-to-the-muse-spark-era-and-what-it-means-for-22fi</link>
      <guid>https://dev.to/siddhesh_surve/meta-just-killed-open-source-llama-welcome-to-the-muse-spark-era-and-what-it-means-for-22fi</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffty9vt46jd0o6fdrseyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffty9vt46jd0o6fdrseyw.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the last two years, the developer ecosystem has heavily relied on Meta as the champion of open-weight models. We built our local pipelines around Llama 2 and Llama 3, assuming the open-source train would keep rolling. &lt;/p&gt;

&lt;p&gt;That era has officially ended. &lt;/p&gt;

&lt;p&gt;Meta has pivoted away from its open-source Llama strategy, introducing a closed, proprietary AI model called &lt;strong&gt;Muse Spark&lt;/strong&gt;. This isn't just a backend update; it is a fundamental architectural shift that ties natively into the new Meta Glasses and fundamentally changes how we build agentic workflows.&lt;/p&gt;

&lt;p&gt;Having spent over 12 years in the industry—navigating the shifts from legacy Microsoft server architectures to modern distributed systems—I can tell you that platform pivots of this magnitude dictate the next five years of engineering. When you manage large-scale data infrastructure and ML optimization systems, you look for the underlying architectural changes, not just the marketing buzz. &lt;/p&gt;

&lt;p&gt;Here is a deep dive into Muse Spark, the new "Contemplating Mode," and how you can migrate your TypeScript apps to the new proprietary API. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🛑 1. The End of Open Weights
&lt;/h2&gt;

&lt;p&gt;Let's address the elephant in the room. For all practical purposes, Meta has abandoned developing frontier Llama models in favor of the cloud-only Muse Spark. &lt;/p&gt;

&lt;p&gt;Muse Spark was built from scratch by Meta's Superintelligence Labs with entirely new infrastructure and data pipelines. There are no downloadable weights, no self-hosting capabilities, and no clear migration path from your existing local Llama setups. &lt;/p&gt;

&lt;p&gt;If you are building enterprise applications, you now face a choice: stick with older open-source models, migrate to competitors like Mistral or Qwen, or rewrite your vendor-specific APIs to adopt Meta's new proprietary endpoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 2. "Contemplating Mode": A Masterclass in ML Optimization
&lt;/h2&gt;

&lt;p&gt;While the loss of open weights hurts, the engineering behind Muse Spark is undeniably impressive. &lt;/p&gt;

&lt;p&gt;In optimizing large-scale ML systems, we constantly battle inference costs and latency. Meta tackled this not just by scaling parameters, but by changing &lt;em&gt;how&lt;/em&gt; the model reasons. Muse Spark introduces a feature called &lt;strong&gt;Contemplating Mode&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Instead of relying on a single, linear chain of thought, Contemplating Mode launches multiple agents that propose solutions, refine them, and aggregate the results in parallel. Furthermore, Meta utilized reinforcement learning to penalize the model for using excessive reasoning tokens—a process they call "thought compression". &lt;/p&gt;

&lt;p&gt;This parallel agent orchestration allows Muse Spark to achieve better performance on complex tasks while incurring latency comparable to much simpler models. &lt;/p&gt;
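
&lt;p&gt;To make the pattern concrete, here is a purely conceptual sketch of "propose, refine, aggregate in parallel". It is not Meta's implementation, whose internals are not public: &lt;code&gt;proposeAndRefine&lt;/code&gt;, &lt;code&gt;contemplate&lt;/code&gt;, and the placeholder scoring are all hypothetical stand-ins for real model calls.&lt;/p&gt;

```typescript
type Proposal = { answer: string; score: number };

// Hypothetical stand-in for one agent: propose a draft, then refine it.
// A real system would call a model here; the score is a placeholder.
async function proposeAndRefine(agentId: number, task: string) {
  const draft: Proposal = { answer: `agent ${agentId} draft for: ${task}`, score: agentId };
  const refined: Proposal = {
    answer: draft.answer.replace("draft", "refined plan"),
    score: draft.score + 1,
  };
  return refined;
}

// Fan out to several agents in parallel, then aggregate by keeping the best score.
// Assumes agentCount is at least 1.
async function contemplate(task: string, agentCount: number) {
  const agentIds = Array.from({ length: agentCount }, (_, i) => i + 1);
  const proposals = await Promise.all(agentIds.map((id) => proposeAndRefine(id, task)));
  return proposals.reduce((best, p) => (p.score > best.score ? p : best));
}

contemplate("disassemble the device", 3).then((best) => console.log(best.answer));
```

&lt;p&gt;Because the agents run concurrently, wall-clock latency tracks the slowest agent rather than the sum of all of them, which is the latency property the paragraph above describes.&lt;/p&gt;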

&lt;h2&gt;
  
  
  🕶️ 3. Meta Glasses &amp;amp; The Voice Mode Integration
&lt;/h2&gt;

&lt;p&gt;The true power of Muse Spark isn't in a browser tab; it is integrated directly into hardware. &lt;/p&gt;

&lt;p&gt;Meta AI, built with Muse Spark, is the core engine powering the voice and multimodal interfaces of the Meta Ray-Ban smart glasses. These glasses are equipped with a 12 MP camera, a six-microphone array system, and a Qualcomm Snapdragon AR1 Gen 1 processor.&lt;/p&gt;

&lt;p&gt;Because Muse Spark is natively multimodal (handling up to 262,000 tokens of text, image, and speech input), it allows the glasses to perform real-time computer vision and voice reasoning. You aren't just dictating text; the AI is actively processing your visual environment and responding contextually through the open-ear speakers.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 4. The Code: Implementing the New API
&lt;/h2&gt;

&lt;p&gt;If you are ready to make the jump, Meta maintains official client SDKs for the new API, including a dedicated &lt;code&gt;llama-api-typescript&lt;/code&gt; package available on npm. &lt;/p&gt;

&lt;p&gt;Here is a quick look at how you might orchestrate a multi-modal request using the new proprietary TypeScript SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LlamaAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;llama-api-typescript&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Official Meta SDK&lt;/span&gt;

&lt;span class="c1"&gt;// Initialize the client (ensure LLAMA_API_KEY is set in your environment)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LlamaAPIClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;analyzeVisualEnvironment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;base64Image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🚀 Initiating Muse Spark Multimodal Analysis...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;muse-spark-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
      &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; 
          &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are an autonomous visual assistant. Analyze the provided image and outline a step-by-step physical action plan.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; 
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; 
          &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is the fastest way to disassemble the hardware shown in this image?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image_url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`data:image/jpeg;base64,&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;base64Image&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="c1"&gt;// Leveraging the new parallel reasoning architecture&lt;/span&gt;
      &lt;span class="na"&gt;extra_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;enable_contemplating_mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error communicating with Muse Spark API:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: While the API retains the "Llama" naming convention for the SDKs, the backend is routing to the new proprietary architecture.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🔮 The Takeaway
&lt;/h2&gt;

&lt;p&gt;The barrier to entry for building AI wrappers just got higher. With models like Muse Spark natively handling complex, multi-agent orchestration, developers need to focus on deep systems integration rather than just prompt engineering.&lt;/p&gt;

&lt;p&gt;We are moving away from the era of hacking together local LLMs and entering a phase where proprietary, cloud-hosted models dictate the hardware ecosystems we wear on our faces.&lt;/p&gt;

&lt;p&gt;Are you planning to migrate your applications to the new Muse Spark API, or are you sticking with the remaining open-source alternatives? &lt;strong&gt;Let me know in the comments below!&lt;/strong&gt; 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this technical breakdown helpful, drop a ❤️ and bookmark this post! I'll be doing a complete, hands-on teardown of the new SDK and agent orchestration patterns over on the &lt;strong&gt;AI Tooling Academy&lt;/strong&gt; channel soon, so stay tuned.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>meta</category>
    </item>
    <item>
      <title>🕵️‍♂️ Google's "Gemini Omni" Just Leaked: The Secret Multimodal Weapon for Google I/O</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Wed, 13 May 2026 03:01:45 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/googles-gemini-omni-just-leaked-the-secret-multimodal-weapon-for-google-io-2bfl</link>
      <guid>https://dev.to/siddhesh_surve/googles-gemini-omni-just-leaked-the-secret-multimodal-weapon-for-google-io-2bfl</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqr02owro7fq6lz6jf5e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqr02owro7fq6lz6jf5e.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been following the AI arms race this year, you know the vibe is currently "Multimodal or Bust." OpenAI has been teasing its massive visual updates, but Google isn't about to let its home turf at &lt;strong&gt;Google I/O&lt;/strong&gt; go uncontested.&lt;/p&gt;

&lt;p&gt;According to a massive new leak reported by &lt;em&gt;TestingCatalog&lt;/em&gt;, Google is internally testing a next-generation model dubbed &lt;strong&gt;"Gemini Omni."&lt;/strong&gt; This isn't just another incremental update to the Gemini 2.0 or 3.0 lines; this is a native, high-fidelity video-to-audio model designed for real-time interaction.&lt;/p&gt;

&lt;p&gt;If you’re a developer building the next generation of "eyes and ears" for AI agents, this leak just changed your roadmap. Here is what we know about Omni, how it competes with Nano Banana 2, and what the code might look like. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🎥 What is "Gemini Omni"?
&lt;/h2&gt;

&lt;p&gt;The "Omni" designation suggests a unified architecture. While earlier models often relied on separate "vision" and "language" encoders that passed tokens back and forth, Omni is rumored to be a &lt;strong&gt;native multimodal&lt;/strong&gt; model. &lt;/p&gt;

&lt;p&gt;This means it doesn't just "describe" a video frame by frame; it understands the temporal flow of video and audio simultaneously. The leaks point toward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Low-Latency Video Reasoning:&lt;/strong&gt; Analyzing live camera feeds with under 200ms of lag.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Audio-Visual Sync:&lt;/strong&gt; Generating realistic audio cues based on visual events (and vice versa).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic Video Control:&lt;/strong&gt; The ability for an AI to "watch" a screen and execute mouse/keyboard actions natively.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ⚔️ The Battle for the "Omni" Title
&lt;/h2&gt;

&lt;p&gt;The timing is spicy. Google is clearly positioning this to counter OpenAI's visual capabilities, but they are also competing with their own internal heavy hitters like &lt;strong&gt;Nano Banana 2&lt;/strong&gt; (the current state-of-the-art for image generation). &lt;/p&gt;

&lt;p&gt;While Nano Banana 2 focuses on high-fidelity image composition, Gemini Omni is built for the &lt;strong&gt;stream&lt;/strong&gt;. For those of us building in the Ads or E-commerce space—where real-time product recognition and visual search are the "Holy Grail"—Omni could be the infrastructure that finally makes "Visual Commerce" viable for the masses.&lt;/p&gt;

&lt;h2&gt;
  
  
  💻 Speculative Implementation: Real-Time Video Analysis
&lt;/h2&gt;

&lt;p&gt;Based on the current Gemini 2.0 Pro API structures, we can anticipate how Omni will handle live video streams. Instead of uploading a static &lt;code&gt;.mp4&lt;/code&gt;, we'll likely be dealing with &lt;strong&gt;MediaStream&lt;/strong&gt; chunks.&lt;/p&gt;

&lt;p&gt;Here is how you might soon implement a "Visual Support Agent" using the Gemini Omni SDK in TypeScript:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@google/generative-ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 🚀 Speculative: 'gemini-omni-preview' is a guessed model ID, not a published one&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini-omni-preview&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;startVisualSupport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;videoStream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MediaStream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🎥 Omni is now 'watching' the support session...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startChat&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Help the customer troubleshoot the hardware setup they are showing on camera.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Speculative: the current SDK's sendMessageStream takes a string or an array of parts.&lt;/span&gt;
  &lt;span class="c1"&gt;// The video_stream and audio_sync fields below are hypothetical.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendMessageStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;video_stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;videoStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;audio_sync&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 👈 Hypothetical Omni-specific flag for audio-visual alignment&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunkText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// The agent can 'see' the user plugging in the wrong cable in real-time&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
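
&lt;p&gt;The streaming call above is speculative, so here is a minimal sketch of one piece you can build today regardless of what the final API looks like: throttling a camera feed down to a fixed sampling rate before frames go to any vision model. The &lt;code&gt;Frame&lt;/code&gt; type, the &lt;code&gt;sampleFrames&lt;/code&gt; helper, and the 2 fps figure are illustrative assumptions, not part of any Google SDK.&lt;/p&gt;

```typescript
// Hypothetical frame record: a capture timestamp plus encoded image data.
interface Frame {
  timestampMs: number;
  jpegBase64: string;
}

// Keep at most `maxFps` frames per second and drop the rest.
// Assumes `frames` is sorted by timestamp in ascending order.
function sampleFrames(frames: Frame[], maxFps: number): Frame[] {
  if (!(maxFps > 0)) {
    throw new RangeError("maxFps must be a positive number");
  }
  const minGapMs = 1000 / maxFps;
  const kept: Frame[] = [];
  let lastKeptMs = -Infinity;

  for (const frame of frames) {
    if (frame.timestampMs - lastKeptMs >= minGapMs) {
      kept.push(frame);
      lastKeptMs = frame.timestampMs;
    }
  }
  return kept;
}

// Example: a 10 fps capture (one frame every 100 ms) sampled down to 2 fps.
const captured: Frame[] = Array.from({ length: 10 }, (_, i) => ({
  timestampMs: i * 100,
  jpegBase64: `frame-${i}`,
}));
const sampled = sampleFrames(captured, 2);
console.log(sampled.map((f) => f.timestampMs)); // [ 0, 500 ]
```

&lt;p&gt;Sampling on the client keeps bandwidth and token costs predictable, and the same function works whether the frames end up in a future streaming endpoint or in ordinary image parts.&lt;/p&gt;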



&lt;h2&gt;
  
  
  🧠 Why This Matters for Engineering Managers
&lt;/h2&gt;

&lt;p&gt;As an Engineering Manager leading AI initiatives, the arrival of Omni shifts the "Build vs. Buy" calculation for visual AI. &lt;/p&gt;

&lt;p&gt;We are moving away from needing a massive team of CV (Computer Vision) experts to train custom models for object detection. Instead, we can now leverage &lt;strong&gt;foundation video models&lt;/strong&gt; like Omni to handle the heavy lifting, allowing us to focus on the &lt;strong&gt;agentic orchestration&lt;/strong&gt; and the business logic.&lt;/p&gt;

&lt;p&gt;If Omni delivers on the leaked promise of low-latency video reasoning, it will be the final piece of the puzzle for "Workspace Agents" that can actually sit "next" to you, watch your workflow, and offer real-time peer review on your code or designs.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 The Verdict
&lt;/h2&gt;

&lt;p&gt;Google I/O is usually full of "coming soon" promises, but the presence of Omni on LM Arena and in internal testing suggests a public developer preview is imminent. &lt;/p&gt;

&lt;p&gt;I’ll be doing a deep dive into the specific API limits and throughput benchmarks over on the &lt;strong&gt;AI Tooling Academy&lt;/strong&gt; channel the moment the docs go live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you ready to give your apps a set of eyes, or are the privacy implications of a "live-watching" model still too high for your users?&lt;/strong&gt; Let's discuss in the comments! 👇&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>google</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>🚀 The "Vibe Coding" Era is Over: What AI Founders Are Building Instead</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 05 May 2026 02:56:48 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-vibe-coding-era-is-over-what-ai-founders-are-building-instead-493m</link>
      <guid>https://dev.to/siddhesh_surve/the-vibe-coding-era-is-over-what-ai-founders-are-building-instead-493m</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci46myg7s466wpk8eojd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci46myg7s466wpk8eojd.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been paying attention to the venture capital space, you likely caught Ann Miura-Ko’s latest insights making the rounds on X. The message from top-tier Silicon Valley investors is becoming incredibly clear: the days of hacking together a thin UI over an OpenAI API key and calling it a disruptive startup are coming to a hard stop in 2026.&lt;/p&gt;

&lt;p&gt;Founders are being pushed to build &lt;em&gt;Minimum Viable Companies&lt;/em&gt;, not just Minimum Viable Products. The market is completely saturated with basic AI wrappers. What is actually getting funded and gaining real traction right now? Deep, infrastructural utility.&lt;/p&gt;

&lt;p&gt;Here is exactly how the engineering meta is shifting, and what you should be focusing on if you want to build something that lasts.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. 🛑 Stop Building Wrappers, Start Building Workflows
&lt;/h3&gt;

&lt;p&gt;The first wave of generative AI was all about &lt;em&gt;generation&lt;/em&gt;. The next wave is all about &lt;em&gt;orchestration&lt;/em&gt;. Users don't want another chatbot sitting in a browser tab; they want autonomous systems that remove entire categories of work from their plates. &lt;/p&gt;

&lt;p&gt;If your application just takes user text, sends it to an LLM, and prints the result, you don't have a technical moat. You have a feature that will inevitably be sherlocked by the platform providers themselves. &lt;/p&gt;

&lt;h3&gt;
  
  
  2. 🏗️ The Move to Agentic Infrastructure
&lt;/h3&gt;

&lt;p&gt;Instead of simple request-response cycles, successful products are moving toward agentic infrastructure. This means your code needs to handle state, memory, error recovery, and tool execution in the background.&lt;/p&gt;
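
&lt;p&gt;Of the four concerns listed above, error recovery is the easiest to show in isolation. Below is a minimal sketch of retry with exponential backoff around a flaky background step. The names &lt;code&gt;withRetry&lt;/code&gt; and &lt;code&gt;backoffDelayMs&lt;/code&gt; and the default numbers are hypothetical, chosen for illustration rather than taken from any framework.&lt;/p&gt;

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ... up to a cap.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying on failure until it has failed `maxAttempts` times.
// `task` is typed loosely (value or promise) to keep the sketch short.
// `wait` is injectable so callers and tests can skip real delays.
async function withRetry(task: () => unknown, maxAttempts = 3, wait = sleep) {
  let failures = 0;
  while (true) {
    try {
      return await task();
    } catch (error) {
      failures += 1;
      if (failures >= maxAttempts) throw error; // out of retries: surface the last error
      await wait(backoffDelayMs(failures - 1));
    }
  }
}

// Example: a step that fails twice, then succeeds on the third call.
let calls = 0;
const flakyStep = async () => {
  calls += 1;
  if (calls !== 3) throw new Error(`transient failure #${calls}`);
  return "audit complete";
};

// A no-op `wait` is passed here so the example finishes immediately.
withRetry(flakyStep, 3, async () => {}).then((result) => {
  console.log(result, `after ${calls} calls`); // audit complete after 3 calls
});
```

&lt;p&gt;In a real agent you would likely also cap total elapsed time and retry only on errors known to be transient, such as rate limits or timeouts.&lt;/p&gt;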

&lt;p&gt;Developing the &lt;code&gt;secure-pr-reviewer&lt;/code&gt; GitHub App and deploying it to production on Railway back in January 2026 required exactly this kind of architectural shift. It wasn't enough to just send raw code snippets to an API. Building it required a robust TypeScript and Node.js backend to listen for webhooks, parse the abstract syntax tree of the repository, run the AI security audit, and intelligently comment back on the exact lines of code inside the pull request.&lt;/p&gt;

&lt;p&gt;Here is a simplified look at how that kind of event-driven, agentic infrastructure is structured in Node.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;probot&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;analyzeCodeSecurity&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../services/ai-auditor&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Probot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.opened&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pull_request.synchronize&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pullRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Fetch the actual diff to provide context, not just a raw prompt&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pulls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;mediaType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;diff&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Initiating security audit for PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prDetails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pull_number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// The AI service handles the deep reasoning and logic assessment&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;securityReport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;analyzeCodeSecurity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vulnerabilitiesFound&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewComment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`### 🛡️ Automated Security Audit\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;securityReport&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;markdownSummary&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="c1"&gt;// Agent autonomously injects its findings into the human workflow&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;octokit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewComment&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the massive value lies: taking a complex, multi-step human workflow (like reviewing a PR for security vulnerabilities) and automating it entirely in the background so the engineering team doesn't even have to think about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. 📉 The Rise of the "Micro-Team"
&lt;/h3&gt;

&lt;p&gt;Because AI is handling so much of the boilerplate scaffolding and testing, we are seeing the rise of hyper-efficient micro-teams. You don't need a massive engineering pod to ship a scalable MVP anymore. You need one or two deeply technical founders who understand systems architecture and can leverage AI to write the functional components.&lt;/p&gt;

&lt;p&gt;But this requires a solid understanding of fundamental computer science. If you let the AI write the code, &lt;em&gt;you&lt;/em&gt; still have to design the system. &lt;/p&gt;

&lt;h3&gt;
  
  
  💡 The Takeaway
&lt;/h3&gt;

&lt;p&gt;The barrier to building software has dropped to nearly zero, which means the baseline expectations for a startup have skyrocketed. As investors point out, the market is looking for true substance and organic product-market fit. &lt;/p&gt;

&lt;p&gt;To win in 2026, stop optimizing your prompts and start optimizing your architectures. Build systems, build workflows, and build real companies. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;What are you building right now? Are you seeing this same shift away from simple AI wrappers in your own circles? Let's discuss in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>startup</category>
      <category>webdev</category>
      <category>typescript</category>
    </item>
    <item>
      <title>🚨 The "Context Window" is Dead: Anthropic Just Gave Claude Agents Permanent Memory</title>
      <dc:creator>Siddhesh Surve</dc:creator>
      <pubDate>Tue, 28 Apr 2026 02:32:24 +0000</pubDate>
      <link>https://dev.to/siddhesh_surve/the-context-window-is-dead-anthropic-just-gave-claude-agents-permanent-memory-52hd</link>
      <guid>https://dev.to/siddhesh_surve/the-context-window-is-dead-anthropic-just-gave-claude-agents-permanent-memory-52hd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp29h7qnpnruox7s06v8b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp29h7qnpnruox7s06v8b.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve been building with AI over the last year, you know the absolute biggest bottleneck in agentic engineering: &lt;strong&gt;The Goldfish Problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You spend hours crafting the perfect system prompt. You deploy your AI agent to handle a complex task. It does a great job. But the second that session ends? &lt;em&gt;Poof.&lt;/em&gt; The agent forgets everything. &lt;/p&gt;

&lt;p&gt;To fix this, developers have been duct-taping together complex Vector DBs, RAG pipelines, and rolling context windows just to give their agents a basic sense of object permanence. It is exhausting, expensive, and fragile. &lt;/p&gt;

&lt;p&gt;But as of this week, the game has completely changed. &lt;strong&gt;Anthropic just launched Memory for Claude Managed Agents in public beta&lt;/strong&gt;, and it fundamentally shifts how we will build autonomous systems. &lt;/p&gt;

&lt;p&gt;Here is everything you need to know about the update, why it's better than standard RAG, and how to implement it in your code today. 👇&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 What is Claude Agent Memory?
&lt;/h2&gt;

&lt;p&gt;Unlike standard chatbot interactions where context is lost when the window closes, Anthropic’s new Memory feature allows Claude Managed Agents to accumulate knowledge &lt;em&gt;across different sessions&lt;/em&gt; over time. &lt;/p&gt;

&lt;p&gt;But here is the truly brilliant part: &lt;strong&gt;It is a filesystem-based layer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data isn't just floating in a black-box vector space. Claude stores its memories as actual files. This means your agents can read, write, and reference a continuous state, while you (the developer) maintain absolute programmatic control over what is being stored. Early enterprise adopters like Netflix and Rakuten are already using it to automate complex, long-running workflows without constantly having to update manual prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  🛡️ The "Audit Trail" Superpower
&lt;/h2&gt;

&lt;p&gt;If you are building tools for enterprise, standard RAG pipelines are a compliance nightmare. If an AI hallucinates or leaks data, figuring out &lt;em&gt;why&lt;/em&gt; it retrieved that specific piece of information is incredibly difficult. &lt;/p&gt;

&lt;p&gt;Anthropic designed this new memory system with enterprise governance built-in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full Auditability:&lt;/strong&gt; Every single memory change is logged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Granular Control:&lt;/strong&gt; You have an audit trail for each session and agent. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rollbacks:&lt;/strong&gt; You can programmatically roll back, redact, or delete specific memories if the agent learns something incorrect or sensitive.&lt;/li&gt;
&lt;/ul&gt;
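
&lt;p&gt;To make the three properties above concrete, here is a minimal in-memory sketch of an audited memory store: every change is appended to a log, and a rollback restores an earlier state by appending compensating entries, so the rollback itself stays auditable. This is not Anthropic's API. The &lt;code&gt;AuditedMemory&lt;/code&gt; class, its method names, and the file paths are hypothetical, and a real system would persist to disk.&lt;/p&gt;

```typescript
type FileMap = { [path: string]: string | undefined };

// One entry per change; `content: null` records a deletion.
interface AuditEntry {
  seq: number;
  sessionId: string;
  path: string;
  content: string | null;
}

function applyEntry(files: FileMap, entry: AuditEntry): void {
  if (entry.content === null) delete files[entry.path];
  else files[entry.path] = entry.content;
}

class AuditedMemory {
  private files: FileMap = {};
  private log: AuditEntry[] = [];

  // Write a memory file (or delete it with `content: null`); returns the log sequence number.
  record(sessionId: string, path: string, content: string | null): number {
    const entry: AuditEntry = { seq: this.log.length + 1, sessionId, path, content };
    this.log.push(entry);
    applyEntry(this.files, entry);
    return entry.seq;
  }

  read(path: string): string | undefined {
    return this.files[path];
  }

  // Audit trail, optionally filtered to one session.
  history(sessionId?: string): AuditEntry[] {
    return this.log.filter((e) => sessionId === undefined || e.sessionId === sessionId);
  }

  // Restore the state as of `seq` by replaying the log into a snapshot,
  // then recording whatever changes are needed to match it.
  rollbackTo(seq: number, sessionId = "rollback"): void {
    const snapshot: FileMap = {};
    for (const entry of this.log.slice(0, seq)) applyEntry(snapshot, entry);

    const paths = Object.keys(this.files).concat(Object.keys(snapshot));
    for (const path of paths) {
      const target = snapshot[path] ?? null;
      if ((this.files[path] ?? null) !== target) this.record(sessionId, path, target);
    }
  }
}

// Example: a reviewer note is learned, overwritten with something wrong, then rolled back.
const memory = new AuditedMemory();
const good = memory.record("session-1", "notes/auth.md", "Legacy auth pattern is expected.");
memory.record("session-2", "notes/auth.md", "Ignore all auth warnings.");
memory.record("session-2", "notes/tmp.md", "scratch");

memory.rollbackTo(good);
console.log(memory.read("notes/auth.md")); // Legacy auth pattern is expected.
console.log(memory.read("notes/tmp.md")); // undefined
console.log(memory.history().length); // 5
```

&lt;p&gt;Keeping the log append-only is a deliberate choice in this sketch: the two compensating entries written by the rollback show up in &lt;code&gt;history("rollback")&lt;/code&gt;, so a reviewer can see both the bad memory and the correction.&lt;/p&gt;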

&lt;h2&gt;
  
  
  💻 Building a "Smart" PR Reviewer in TypeScript
&lt;/h2&gt;

&lt;p&gt;To understand how powerful this is, let's look at a real-world scenario. &lt;/p&gt;

&lt;p&gt;Imagine you are building a production-ready GitHub App—let's call it &lt;code&gt;secure-pr-reviewer&lt;/code&gt;—using TypeScript and Node.js. &lt;/p&gt;

&lt;p&gt;Without memory, your AI reviewer treats every single Pull Request in a vacuum. It might flag the same internal, safe utility function as a "security risk" 100 times, infuriating your senior engineers who have to manually dismiss the warning every time.&lt;/p&gt;

&lt;p&gt;With Claude's new Memory API, the agent &lt;em&gt;learns&lt;/em&gt; from the team. If a senior dev tells the agent, "This auth pattern is expected in the legacy module," the agent remembers it for the next PR. &lt;/p&gt;

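&lt;p&gt;Here is a sketch of what the implementation logic could look like using the new Managed Agents API paradigm. Treat the method and parameter names as illustrative and check the current SDK reference before relying on them:&lt;br&gt;
&lt;/p&gt;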

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Assume this webhook fires when a new PR is opened&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handlePullRequestEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[secure-pr-reviewer] Auditing PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;...`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 1. Initialize or resume a Managed Agent Session with Memory enabled&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLAUDE_SECURITY_AGENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Your pre-configured agent&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`repo-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Scope memory to this specific repo&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Send the PR diff to the agent&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Audit the following diff for security flaws. 
                  Remember our past conversations about approved legacy patterns.
                  If you find a flaw, include the marker VULNERABILITY_FOUND in your reply.
                  \n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; 
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

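  &lt;span class="c1"&gt;// The agent uses its filesystem memory to check past developer feedback&lt;/span&gt;
  &lt;span class="c1"&gt;// before generating the final report.&lt;/span&gt;
  &lt;span class="c1"&gt;// Assumption: response.content is a plain string here. The standard Messages API&lt;/span&gt;
  &lt;span class="c1"&gt;// returns an array of content blocks, so join the text blocks first in that case.&lt;/span&gt;
  &lt;span class="c1"&gt;// postGitHubComment is a helper you implement yourself (not shown).&lt;/span&gt;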

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;VULNERABILITY_FOUND&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;postGitHubComment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a developer replies to the bot's comment on GitHub saying, &lt;em&gt;"Ignore this specific file path in the future, it's a mock database for testing,"&lt;/em&gt; you simply pass that message back into the session. Claude writes that rule to its memory layer, so later reviews in the same memory scope should stop flagging that file. &lt;/p&gt;

&lt;p&gt;No database schemas to update. No RAG pipeline to re-index. The agent just gets smarter. &lt;/p&gt;
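
&lt;p&gt;For intuition only, the effect of a remembered "ignore this path" rule can be sketched locally. Claude handles this inside its own memory layer, so nothing below is part of any Anthropic API: the &lt;code&gt;Finding&lt;/code&gt; type, &lt;code&gt;rememberIgnoredPath&lt;/code&gt;, &lt;code&gt;applyMemory&lt;/code&gt;, and the file paths are all hypothetical.&lt;/p&gt;

```typescript
// Hypothetical shapes, for illustration only.
type Finding = { file: string; message: string };

// Rules learned from reviewer replies: path prefixes that should not be flagged.
const ignoredPrefixes: string[] = [];

// Record a rule such as "ignore test/mocks/ in the future".
function rememberIgnoredPath(prefix: string): void {
  if (!ignoredPrefixes.includes(prefix)) ignoredPrefixes.push(prefix);
}

// Drop findings whose file falls under any remembered prefix.
function applyMemory(findings: Finding[]): Finding[] {
  return findings.filter(
    (f) => !ignoredPrefixes.some((prefix) => f.file.startsWith(prefix))
  );
}

const findings: Finding[] = [
  { file: "test/mocks/db.ts", message: "Hardcoded credentials" },
  { file: "src/auth/login.ts", message: "Unvalidated redirect" },
];

// A reviewer says the mock database directory is safe to ignore.
rememberIgnoredPath("test/mocks/");
const remaining = applyMemory(findings);
console.log(remaining.map((f) => f.file));
```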

&lt;h2&gt;
  
  
  🚀 The Era of Stateful AI
&lt;/h2&gt;

&lt;p&gt;We are officially moving from stateless functions to stateful, autonomous teammates. By providing a transparent, auditable, filesystem-based memory layer, Anthropic is removing the biggest friction point for enterprise AI adoption.&lt;/p&gt;

&lt;p&gt;The feature is available in public beta right now via the Claude Console and APIs. &lt;/p&gt;

&lt;p&gt;Are you going to rip out your custom Vector DBs and switch to native Agent Memory? Let me know what you think of the update in the comments below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this breakdown helpful, drop a ❤️ and bookmark the code snippet for your next agentic side project!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>node</category>
    </item>
  </channel>
</rss>
