<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Richard Dillon</title>
    <description>The latest articles on DEV Community by Richard Dillon (@richard_dillon_b9c238186e).</description>
    <link>https://dev.to/richard_dillon_b9c238186e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849330%2F15d5a6d5-ef2a-430b-9760-3ac77ede5242.png</url>
      <title>DEV Community: Richard Dillon</title>
      <link>https://dev.to/richard_dillon_b9c238186e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/richard_dillon_b9c238186e"/>
    <language>en</language>
    <item>
      <title>Deep Agents: Building Long-Running Autonomous Agents with LangChain's New Framework</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 20 Apr 2026 12:03:14 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/deep-agents-building-long-running-autonomous-agents-with-langchains-new-framework-1bpn</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/deep-agents-building-long-running-autonomous-agents-with-langchains-new-framework-1bpn</guid>
      <description>&lt;h1&gt;Deep Agents: Building Long-Running Autonomous Agents with LangChain's New Framework&lt;/h1&gt;

&lt;p&gt;The era of single-turn, reactive agents is ending. As engineering teams push toward workflows that span hours instead of seconds—internal coding pipelines, multi-stage research synthesis, iterative design systems—the limitations of flat tool-calling architectures become painfully obvious. You cannot orchestrate a code review that spawns test generation, runs CI, waits for human approval, and then deploys without a fundamentally different approach to agent architecture. LangChain's Deep Agents framework, announced alongside &lt;a href="https://www.langchain.com/blog/march-2026-langchain-newsletter" rel="noopener noreferrer"&gt;Deep Agents Deploy&lt;/a&gt; in March 2026, represents the clearest attempt yet to provide production-grade abstractions for this class of problem.&lt;/p&gt;

&lt;p&gt;This isn't just another wrapper. Deep Agents introduces a layered architecture where planning loops, persistent memory, and sub-agent delegation are first-class concerns—not afterthoughts bolted onto a ReAct loop. The framework sits deliberately above LangGraph (which gives you fine-grained state machine control) and the standard LangChain abstractions (which optimize for quick iteration). If you're building agents that need to survive restarts, coordinate specialized sub-agents, and maintain coherent long-term memory across sessions, this is the stack to understand.&lt;/p&gt;

&lt;p&gt;The use cases driving adoption are telling: &lt;a href="https://www.langchain.com/blog/march-2026-langchain-newsletter" rel="noopener noreferrer"&gt;Open SWE&lt;/a&gt; for internal coding agents that autonomously fix bugs across repositories, GTM workflow automation that coordinates across CRM systems and communication channels, and design iteration systems like the Moda case study where agents loop through revision cycles with human feedback gates. In this deep-dive, we'll dissect the architecture, walk through the key APIs, examine deployment considerations, and build a working research-then-draft agent from scratch.&lt;/p&gt;

&lt;h2&gt;The Deep Agents Architecture: Planning, Memory, and Sub-Agents&lt;/h2&gt;

&lt;p&gt;The Deep Agents framework rests on three architectural pillars that distinguish it from simpler agent implementations: planning loops that decompose goals into task graphs, persistent memory that survives across sessions and crashes, and sub-agent delegation that enables specialized capabilities to run in parallel.&lt;/p&gt;

&lt;h3&gt;Planning Mechanism&lt;/h3&gt;

&lt;p&gt;Traditional agents operate on flat tool-calling sequences—the model decides the next action based on the current state, executes it, and repeats. This works for simple tasks but falls apart when you need hierarchical goal decomposition. Deep Agents introduces a planning layer that generates task graphs before execution begins. The planning model analyzes the high-level objective, identifies dependencies between subtasks, and produces a directed acyclic graph (DAG) of work items. Each node in this graph can represent a direct tool call, a sub-agent invocation, or a human approval checkpoint.&lt;/p&gt;
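&lt;p&gt;To make the task-graph idea concrete, here is a minimal, framework-free sketch of dependency-ordered scheduling. The &lt;code&gt;TaskNode&lt;/code&gt; shape and the wave scheduler are illustrative only, not part of the Deep Agents API:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Illustrative shape only -- not the Deep Agents API.
@dataclass
class TaskNode:
    name: str
    kind: str  # "tool", "sub_agent", or "human_approval"
    depends_on: list = field(default_factory=list)

def execution_waves(nodes):
    """Group tasks into waves; each wave's dependencies are satisfied by earlier waves."""
    done, waves = set(), []
    pending = {n.name: n for n in nodes}
    while pending:
        wave = [name for name, n in pending.items()
                if all(dep in done for dep in n.depends_on)]
        if not wave:
            raise ValueError("cycle detected in task graph")
        waves.append(sorted(wave))
        done.update(wave)
        for name in wave:
            pending.pop(name)
    return waves

plan = [
    TaskNode("gather_sources", "sub_agent"),
    TaskNode("scan_memory", "tool"),
    TaskNode("summarize", "sub_agent", depends_on=["gather_sources", "scan_memory"]),
    TaskNode("approve_draft", "human_approval", depends_on=["summarize"]),
]
# The first wave holds both independent roots; the human approval gate runs last.
print(execution_waves(plan))
```

&lt;p&gt;Everything in a wave is safe to run concurrently, which is exactly what the planner exploits when it spawns sub-agents in parallel.&lt;/p&gt;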

&lt;p&gt;This approach parallels findings from the &lt;a href="https://arxiv.org/pdf/2602.07359" rel="noopener noreferrer"&gt;W&amp;amp;D paper on parallel tool calling&lt;/a&gt;, which demonstrated that research agents achieve significant speedups when tool calls without mutual dependencies execute concurrently. Deep Agents extends this to sub-agent coordination: a parent agent can spawn multiple child agents in a single planning step, wait for their results in parallel, and then synthesize findings before proceeding.&lt;/p&gt;
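&lt;p&gt;The speedup from concurrency is easy to see with plain &lt;code&gt;asyncio&lt;/code&gt;. In this stand-in, the sub-agent bodies are simulated with sleeps; only the fan-out/gather pattern itself is the point:&lt;/p&gt;

```python
import asyncio
import time

async def run_subagent(name, seconds):
    # Stand-in for a real sub-agent invocation; sleeps to simulate work.
    await asyncio.sleep(seconds)
    return f"{name}: done"

async def fan_out():
    # Three independent sub-agents launched in a single planning step.
    started = time.perf_counter()
    results = await asyncio.gather(
        run_subagent("search_a", 0.2),
        run_subagent("search_b", 0.2),
        run_subagent("search_c", 0.2),
    )
    elapsed = time.perf_counter() - started
    # Wall time tracks the slowest child, not the sum of all three.
    return results, elapsed

results, elapsed = asyncio.run(fan_out())
print(results, round(elapsed, 1))
```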

&lt;h3&gt;Memory Subsystem&lt;/h3&gt;

&lt;p&gt;The memory architecture draws heavily from recent research on temporal reasoning in conversational AI. The &lt;a href="https://arxiv.org/html/2604.14362v1" rel="noopener noreferrer"&gt;APEX-MEM framework&lt;/a&gt; introduced property graphs with timestamp annotations for resolving when facts were learned and whether they remain valid. Deep Agents implements a similar approach: events are stored in append-only logs with temporal metadata, and the memory retrieval system can answer queries like "what did we learn about this repository before the last deployment?" rather than just "what do we know about this repository?"&lt;/p&gt;

&lt;p&gt;The MongoDB integration announced in the &lt;a href="https://blog.langchain.com/announcing-the-langchain-mongodb-partnership-the-ai-agent-stack-that-runs-on-the-database-you-already-trust/" rel="noopener noreferrer"&gt;LangChain + MongoDB partnership&lt;/a&gt; provides durable checkpointing out of the box. Every state transition, planning decision, and sub-agent result gets persisted to MongoDB Atlas, enabling crash recovery and long-running workflows that span days.&lt;/p&gt;

&lt;h3&gt;Sub-Agent Delegation&lt;/h3&gt;

&lt;p&gt;Sub-agents aren't just function calls with extra steps—they're fully autonomous agents with their own planning loops, memory contexts, and tool access. The parent agent maintains a registry of available sub-agents with capability descriptions, and the planning model can decide to delegate tasks to the most appropriate specialist. This mirrors the "harness" concept from LangChain's architecture documentation: the separation between orchestration logic (what needs to happen and when) and compute execution (the actual work). The parent agent is the orchestrator; sub-agents are the compute layer.&lt;/p&gt;

&lt;h2&gt;Hands-On: Code Walkthrough&lt;/h2&gt;

&lt;p&gt;Let's build a research-then-draft agent that demonstrates the core Deep Agents patterns. This agent accepts a topic, spawns specialized sub-agents for search and summarization, uses long-term memory to avoid redundant work, and produces a structured Markdown report.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
research_agent.py
A Deep Agent that coordinates search and summarization sub-agents
to produce research reports with persistent memory.

Requires:
- langchain-core &amp;gt;= 0.3.42
- langchain-deepagents &amp;gt;= 0.1.8
- langchain-mongodb &amp;gt;= 0.2.1
- langchain-community &amp;gt;= 0.3.20 (for TavilySearchResults)
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_deepagents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DeepAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SubAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PlanningConfig&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_mongodb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MongoDBCheckpointer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MongoDBMemory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TavilySearchResults&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatAnthropic&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the planning model - Claude 3.5 Sonnet works well for planning tasks
&lt;/span&gt;&lt;span class="n"&gt;planning_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;  &lt;span class="c1"&gt;# Low temperature for more deterministic planning
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure MongoDB checkpointer for crash recovery
# This persists every state transition to Atlas
&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoDBCheckpointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;connection_string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;database_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deep_agents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_agent_checkpoints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Enable TTL for automatic cleanup of old checkpoints
&lt;/span&gt;    &lt;span class="n"&gt;ttl_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;86400&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;  &lt;span class="c1"&gt;# 30 days retention
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Long-term memory backend with temporal resolution
# Stores facts with timestamps for "when did we learn this?" queries
&lt;/span&gt;&lt;span class="n"&gt;memory_backend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoDBMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;connection_string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;database_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deep_agents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Enable vector search for semantic retrieval
&lt;/span&gt;    &lt;span class="n"&gt;vector_index_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory_vector_index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding_dimensions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define the search sub-agent's tool
&lt;/span&gt;&lt;span class="n"&gt;search_tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TavilySearchResults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;search_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;advanced&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;include_raw_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_key_claims&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Extract key claims and their supporting evidence from content.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# In production, this would call a specialized extraction model
&lt;/span&gt;    &lt;span class="c1"&gt;# Here we're showing the tool interface pattern
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claim&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;]}]&lt;/span&gt;

&lt;span class="c1"&gt;# Define the search sub-agent
&lt;/span&gt;&lt;span class="n"&gt;search_subagent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SubAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Searches the web for current information on a topic. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use this for gathering primary sources and recent developments.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_tool&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="c1"&gt;# Sub-agents can have their own planning constraints
&lt;/span&gt;    &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Memory scope: this sub-agent shares memory with parent
&lt;/span&gt;    &lt;span class="n"&gt;memory_scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shared&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define the summarization sub-agent
&lt;/span&gt;&lt;span class="n"&gt;summarization_subagent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SubAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarization_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Condenses and synthesizes information from multiple sources. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use this after gathering raw sources to produce coherent summaries.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;extract_key_claims&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memory_scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shared&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure planning behavior
&lt;/span&gt;&lt;span class="n"&gt;planning_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PlanningConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;# Maximum depth of planning recursion
&lt;/span&gt;    &lt;span class="n"&gt;max_planning_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Timeout for individual sub-agent invocations (seconds)
&lt;/span&gt;    &lt;span class="n"&gt;sub_agent_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Enable parallel sub-agent execution when dependencies allow
&lt;/span&gt;    &lt;span class="n"&gt;parallel_execution&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Replan if a sub-agent fails rather than failing the whole workflow
&lt;/span&gt;    &lt;span class="n"&gt;replan_on_failure&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the Deep Agent
&lt;/span&gt;&lt;span class="n"&gt;research_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DeepAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_report_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Produces comprehensive research reports by coordinating &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search and summarization specialists.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;planning_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;planning_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memory_backend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory_backend&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sub_agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_subagent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summarization_subagent&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;planning_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;planning_config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Path to custom instructions file
&lt;/span&gt;    &lt;span class="n"&gt;instructions_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./AGENTS.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example invocation
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Generate a research report on the given topic.
    Uses session_id for memory continuity across invocations.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;research_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;objective&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research the topic &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; and produce a &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;structured Markdown report with citations.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;markdown&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_sources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="c1"&gt;# Enable LangSmith tracing for observability
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;callbacks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;  &lt;span class="c1"&gt;# LangSmith callbacks auto-inject when configured
&lt;/span&gt;            &lt;span class="c1"&gt;# Memory retrieval configuration
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory_config&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve_before_planning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_memory_items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="c1"&gt;# Skip sources we've already processed in this session
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dedupe_by&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;

&lt;span class="c1"&gt;# Run the agent
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

    &lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;generate_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Recent advances in parallel tool calling for AI agents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-session-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;AGENTS.md&lt;/code&gt; file referenced above provides custom instructions that guide agent behavior:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Research Report Agent Instructions&lt;/span&gt;

&lt;span class="gu"&gt;## Planning Guidelines&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Always check memory for previously researched sources before searching
&lt;span class="p"&gt;-&lt;/span&gt; Spawn search_agent first to gather sources, then summarization_agent to synthesize
&lt;span class="p"&gt;-&lt;/span&gt; If fewer than 3 high-quality sources found, replan with broader search terms

&lt;span class="gu"&gt;## Output Requirements&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Include inline citations linking to source URLs
&lt;span class="p"&gt;-&lt;/span&gt; Structure reports with: Executive Summary, Key Findings, Detailed Analysis, Sources
&lt;span class="p"&gt;-&lt;/span&gt; Flag any conflicting information between sources

&lt;span class="gu"&gt;## Safety Constraints&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Do not include speculation presented as fact
&lt;span class="p"&gt;-&lt;/span&gt; Mark any information older than 6 months as potentially outdated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To inspect execution traces and debug planning decisions, open the LangSmith dashboard: each planning step, sub-agent invocation, and memory operation appears as a nested span with latency and token-cost attribution.&lt;/p&gt;
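&lt;p&gt;Enabling those traces requires no agent-side code changes: LangSmith activates through the standard LangChain environment variables. The project name here is an arbitrary example; the API key should come from a secret store, never from source:&lt;/p&gt;

```python
import os

# Tracing switches on globally once these are set; LangChain components
# pick them up automatically when they initialize.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "deep-agents-research"  # groups runs in the dashboard
# LANGCHAIN_API_KEY should be injected by your deployment's secret manager.
```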

&lt;h2&gt;
  
  
  Deployment Patterns: Deep Agents Deploy and Sandbox Execution
&lt;/h2&gt;

&lt;p&gt;Operating long-running agents in production requires infrastructure that most teams don't want to build themselves. Deep Agents Deploy, &lt;a href="https://www.langchain.com/blog/march-2026-langchain-newsletter" rel="noopener noreferrer"&gt;announced as "an open alternative to Claude Managed Agents"&lt;/a&gt;, provides a hosted runtime specifically designed for agents that execute over minutes or hours rather than seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Infrastructure Abstraction
&lt;/h3&gt;

&lt;p&gt;Deep Agents Deploy handles the unglamorous but critical concerns: process isolation, crash recovery, timeout management, and resource allocation. When your agent spawns ten sub-agents in parallel for a research task, the runtime distributes these across worker pools and manages backpressure. When a network partition causes a checkpoint failure, the runtime automatically retries from the last durable state.&lt;/p&gt;

&lt;p&gt;This separation mirrors the architecture pattern OpenAI introduced in their &lt;a href="https://openai.com/index/the-next-evolution-of-the-agents-sdk/" rel="noopener noreferrer"&gt;Agents SDK evolution&lt;/a&gt;: the harness (orchestration logic) runs on managed infrastructure while compute (the actual LLM calls and tool executions) can be distributed across different environments. Deep Agents Deploy implements this pattern with first-class support for LangGraph-based sub-agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hardware Acceleration
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.langchain.com/blog/nvidia-enterprise" rel="noopener noreferrer"&gt;NVIDIA enterprise integration&lt;/a&gt; enables hardware acceleration for compute-intensive sub-agents. When a Deep Agent spawns multiple sub-agents that each need to process large documents or run inference on specialized models, the &lt;code&gt;langchain-nvidia&lt;/code&gt; package routes these to NIM microservices running on GPU infrastructure. This becomes significant when your planning DAG has wide parallelism—ten sub-agents each running a 70B parameter model benefit substantially from NIM's optimized batch inference.&lt;/p&gt;

&lt;h3&gt;
  
  
  Durability and Cost
&lt;/h3&gt;

&lt;p&gt;MongoDB-backed checkpoints provide durability guarantees that in-memory state cannot match. If your agent is three hours into a complex workflow and the host process dies, resumption picks up from the last checkpoint rather than starting over. The &lt;a href="https://blog.langchain.com/announcing-the-langchain-mongodb-partnership-the-ai-agent-stack-that-runs-on-the-database-you-already-trust/" rel="noopener noreferrer"&gt;MongoDB partnership&lt;/a&gt; specifically targets this use case: vector search for memory retrieval and document checkpointing unified in a single database you're likely already running.&lt;/p&gt;
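&lt;p&gt;The resume-from-checkpoint pattern can be sketched in a few lines. This is a toy illustration, not the MongoDB Checkpointer's actual API: the in-memory &lt;code&gt;CheckpointStore&lt;/code&gt; and the &lt;code&gt;run_workflow&lt;/code&gt; helper are hypothetical stand-ins:&lt;/p&gt;

```python
import json

class CheckpointStore:
    """Toy durable store standing in for a MongoDB-backed checkpointer."""
    def __init__(self):
        self._docs = {}

    def save(self, session_id, step, state):
        # Serialize so only durable (JSON-safe) state is checkpointed.
        self._docs[session_id] = {"step": step, "state": json.dumps(state)}

    def load(self, session_id):
        doc = self._docs.get(session_id)
        if doc is None:
            return 0, {}
        return doc["step"], json.loads(doc["state"])

def run_workflow(store, session_id, steps):
    """Execute steps in order, checkpointing after each one completes."""
    start, state = store.load(session_id)      # skip already-completed steps
    for i in range(start, len(steps)):
        state = steps[i](state)                # may raise mid-workflow
        store.save(session_id, i + 1, state)
    return state
```

&lt;p&gt;If the process dies mid-workflow, rerunning with the same &lt;code&gt;session_id&lt;/code&gt; resumes past the steps that already checkpointed rather than starting over.&lt;/p&gt;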

&lt;p&gt;Cost attribution in long-running agents requires tracking spend across sub-agent invocations. LangSmith Fleet provides identity-based cost tracking where each Deep Agent instance accumulates costs from all its sub-agent invocations, enabling accurate per-workflow billing and optimization.&lt;/p&gt;
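&lt;p&gt;Rolling sub-agent costs up to the parent workflow is, conceptually, aggregation over trace spans. The sketch below assumes an illustrative span schema and placeholder per-token rates; it is not LangSmith Fleet's actual data model:&lt;/p&gt;

```python
from collections import defaultdict

def attribute_costs(spans):
    """Roll per-span token costs up to each workflow (parent agent) id.

    Each span is a dict like:
      {"workflow_id": "wf-1", "agent": "search_agent",
       "prompt_tokens": 1200, "completion_tokens": 300}
    Rates are placeholders ($ per 1K tokens), not a real price card.
    """
    PROMPT_RATE, COMPLETION_RATE = 0.003, 0.015
    totals = defaultdict(float)
    for span in spans:
        cost = (span["prompt_tokens"] / 1000) * PROMPT_RATE
        cost += (span["completion_tokens"] / 1000) * COMPLETION_RATE
        totals[span["workflow_id"]] += cost
    return dict(totals)
```

&lt;p&gt;With per-workflow totals in hand, per-task billing and optimization become straightforward reporting queries.&lt;/p&gt;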

&lt;h3&gt;
  
  
  Security Considerations
&lt;/h3&gt;

&lt;p&gt;For agents that execute code—and most production agents eventually do—sandboxing is non-negotiable. LangSmith Sandboxes provide isolated execution environments for code-generation sub-agents, preventing arbitrary code execution from compromising your infrastructure. For runtime policy enforcement on tool calls, the &lt;a href="https://opensource.microsoft.com/blog/2026/04/02/introducing-the-agent-governance-toolkit-open-source-runtime-security-for-ai-agents/" rel="noopener noreferrer"&gt;Agent Governance Toolkit&lt;/a&gt; integrates with Deep Agents to enforce constraints like "this agent cannot call external APIs after 6 PM" or "this agent cannot modify files outside the /workspace directory."&lt;/p&gt;
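&lt;p&gt;Constraints like these can be modeled as predicates evaluated against each proposed tool call before execution. The following is a minimal sketch of that idea, not the Agent Governance Toolkit's API; the rule names and call schema are invented for illustration:&lt;/p&gt;

```python
from datetime import time

def no_external_apis_after(cutoff):
    """Rule: block external API tools after a time-of-day cutoff."""
    def rule(call):
        if call["tool"].startswith("external_api:") and call["time"] >= cutoff:
            return f"external API calls blocked after {cutoff}"
        return None
    return rule

def workspace_only(root="/workspace"):
    """Rule: restrict file writes to a workspace directory."""
    def rule(call):
        if call["tool"] == "write_file" and not call.get("path", "").startswith(root + "/"):
            return f"writes restricted to {root}"
        return None
    return rule

def enforce(call, rules):
    """Return (allowed, violations) for a proposed tool call."""
    violations = [msg for rule in rules if (msg := rule(call))]
    return (not violations, violations)
```

&lt;p&gt;The key design point is that rules run in the harness, outside the model's control, so a jailbroken sub-agent still cannot bypass them.&lt;/p&gt;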

&lt;h2&gt;
  
  
  Evaluation and Observability for Long-Running Agents
&lt;/h2&gt;

&lt;p&gt;Traditional LLM evaluation—comparing model outputs to gold-standard responses—breaks down for autonomous agents. When an agent takes 47 steps to complete a task, the final output might be correct even if step 23 was wildly inefficient. Conversely, an incorrect final output might result from a single bad decision at step 12 that cascaded through the rest of the workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Observability Gap
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.langchain.com/state-of-agent-engineering" rel="noopener noreferrer"&gt;State of Agent Engineering survey&lt;/a&gt; found that 89% of production teams use observability for their agents, but evaluation remains the biggest blocker to deployment confidence. Teams can see what their agents are doing but struggle to systematically assess whether the agents are doing it well.&lt;/p&gt;

&lt;p&gt;LangSmith addresses this with trace-based evaluation: each planning step, sub-agent result, and memory operation gets captured as a span that can be individually scored. You define evaluators that examine not just the final output but the quality of intermediate decisions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langsmith.evaluation&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunEvaluator&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PlanningQualityEvaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RunEvaluator&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Evaluates whether the agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s planning was efficient.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;evaluate_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Extract planning steps from the trace
&lt;/span&gt;        &lt;span class="n"&gt;planning_spans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;child_runs&lt;/span&gt; 
                         &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;planning_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

        &lt;span class="c1"&gt;# Score based on planning efficiency metrics
&lt;/span&gt;        &lt;span class="n"&gt;replan_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;planning_spans&lt;/span&gt; 
                          &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Penalize excessive replanning
&lt;/span&gt;        &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;replan_count&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;planning_efficiency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Skills as Evaluated Capabilities
&lt;/h3&gt;

&lt;p&gt;Deep Agents introduces "Skills"—reusable, versioned capabilities that agents can attach. A "web_research" skill encapsulates the ability to search, filter, and synthesize web content. Each skill has associated evaluators that run during CI to ensure capability regressions don't ship to production. When you upgrade from &lt;code&gt;web_research@1.2&lt;/code&gt; to &lt;code&gt;web_research@1.3&lt;/code&gt;, the skill's evaluation suite must pass before deployment.&lt;/p&gt;
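&lt;p&gt;The gating behavior amounts to "a version activates only if its evaluation suite passes against a benchmark." The registry below is a toy model of that contract; &lt;code&gt;SkillRegistry&lt;/code&gt; and its methods are hypothetical, not the Deep Agents API:&lt;/p&gt;

```python
class SkillRegistry:
    """Toy registry: a skill version activates only if its evaluators pass."""
    def __init__(self):
        self._active = {}   # skill name -> active version

    def publish(self, name, version, impl, evaluators, benchmark):
        # Run the skill's evaluation suite before activation (the CI gate).
        for case in benchmark:
            output = impl(case["input"])
            if not all(ev(case, output) for ev in evaluators):
                raise ValueError(f"{name}@{version} failed evaluation; not deployed")
        self._active[name] = version
        return version

    def active_version(self, name):
        return self._active.get(name)
```

&lt;p&gt;A failing candidate raises before activation, so the previously passing version stays live.&lt;/p&gt;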

&lt;h3&gt;
  
  
  Harness Optimization
&lt;/h3&gt;

&lt;p&gt;The hill-climbing approach to harness configuration uses evaluation feedback loops to optimize agent behavior. You define success metrics (task completion rate, average step count, cost per task), run the agent against a benchmark suite, and systematically adjust harness parameters—&lt;code&gt;max_planning_depth&lt;/code&gt;, &lt;code&gt;sub_agent_timeout&lt;/code&gt;, memory retrieval limits—to improve metrics. This transforms agent tuning from intuition-driven to data-driven.&lt;/p&gt;
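&lt;p&gt;A minimal version of this loop is greedy coordinate ascent over the harness configuration. Here &lt;code&gt;run_benchmark&lt;/code&gt; is a stand-in for your own evaluation suite returning a scalar score:&lt;/p&gt;

```python
def hill_climb(config, param_grid, run_benchmark, iterations=10):
    """Greedy coordinate ascent over harness parameters.

    config:        dict of current harness settings
    param_grid:    dict mapping each parameter to candidate values
    run_benchmark: callable(config) -> score (higher is better)
    """
    best = dict(config)
    best_score = run_benchmark(best)
    for _ in range(iterations):
        improved = False
        for param, candidates in param_grid.items():
            for value in candidates:
                trial = {**best, param: value}
                score = run_benchmark(trial)
                if score > best_score:
                    best, best_score, improved = trial, score, True
        if not improved:
            break   # local optimum reached
    return best, best_score
```

&lt;p&gt;In practice each &lt;code&gt;run_benchmark&lt;/code&gt; call means replaying a task suite, so the loop is expensive; that cost is exactly why the search should be metric-driven rather than ad hoc.&lt;/p&gt;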

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;When to choose Deep Agents over LangGraph&lt;/strong&gt;: Deep Agents adds value when your workflow requires persistent memory across sessions, autonomous goal decomposition, or sub-agent coordination. If you're building a customer support bot that handles single queries, LangGraph's lower-level abstractions are simpler and faster to iterate on. If you're building an internal coding agent that maintains context across PRs, learns from past reviews, and coordinates linter, test-runner, and documentation sub-agents, Deep Agents provides the right level of abstraction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory backend selection&lt;/strong&gt;: The &lt;a href="https://blog.langchain.com/announcing-the-langchain-mongodb-partnership-the-ai-agent-stack-that-runs-on-the-database-you-already-trust/" rel="noopener noreferrer"&gt;MongoDB Checkpointer&lt;/a&gt; makes sense for teams already running Atlas—you get vector search for memory retrieval and checkpointing in a single managed database. For teams with existing PostgreSQL infrastructure, PostgresSaver provides equivalent durability with pgvector for semantic retrieval. The temporal reasoning features from the &lt;a href="https://arxiv.org/html/2604.14362v1" rel="noopener noreferrer"&gt;APEX-MEM research&lt;/a&gt; are currently better supported in the MongoDB backend.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration path&lt;/strong&gt;: Existing LangGraph agents can be wrapped as sub-agents within a Deep Agent orchestrator. This enables incremental migration: start by wrapping your most complex workflow as a Deep Agent while keeping simpler agents on LangGraph. The shared memory scope ensures the orchestrator and sub-agents maintain consistent context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost and latency tradeoffs&lt;/strong&gt;: Long-running agents accumulate token costs across planning iterations and sub-agent invocations. Set &lt;code&gt;max_planning_depth&lt;/code&gt; limits based on your cost tolerance. Monitor &lt;code&gt;speculative_waste_ratio&lt;/code&gt; in LangSmith—this metric shows how often the planning model generates plans that get abandoned due to execution failures. A high ratio indicates either overly ambitious planning or unreliable tools.&lt;/p&gt;
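&lt;p&gt;One plausible way to compute such a ratio from trace records (illustrative only; this may not match LangSmith's exact definition):&lt;/p&gt;

```python
def speculative_waste_ratio(plan_spans):
    """Fraction of generated plans abandoned before completion.

    Each span is a dict like {"plan_id": "p1", "status": "abandoned"}
    or {"plan_id": "p2", "status": "completed"}.
    """
    if not plan_spans:
        return 0.0
    abandoned = sum(1 for s in plan_spans if s["status"] == "abandoned")
    return abandoned / len(plan_spans)
```

&lt;p&gt;Trending this per agent over time separates "planning got worse" from "a downstream tool got flaky."&lt;/p&gt;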

&lt;p&gt;&lt;strong&gt;Team readiness&lt;/strong&gt;: Deep Agents requires observability maturity. If your team isn't already using LangSmith tracing for existing LLM workflows, start there before deploying autonomous agents. The &lt;a href="https://www.langchain.com/state-of-agent-engineering" rel="noopener noreferrer"&gt;State of Agent Engineering&lt;/a&gt; findings make this clear: teams without observability infrastructure struggle to debug and optimize agent behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security posture&lt;/strong&gt;: Enable Sandboxes for any code-execution sub-agents—this is not optional for production deployments. Integrate with the &lt;a href="https://opensource.microsoft.com/blog/2026/04/02/introducing-the-agent-governance-toolkit-open-source-runtime-security-for-ai-agents/" rel="noopener noreferrer"&gt;Agent Governance Toolkit&lt;/a&gt; for runtime policy enforcement. Define explicit constraints on what tools each sub-agent can call and under what conditions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Build This Week
&lt;/h2&gt;

&lt;p&gt;Build a &lt;strong&gt;code review coordination agent&lt;/strong&gt; that demonstrates the full Deep Agents architecture:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parent Agent&lt;/strong&gt;: Accepts a PR URL, plans the review workflow, coordinates sub-agents, maintains memory of past reviews on this repository&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static Analysis Sub-Agent&lt;/strong&gt;: Runs linters and type checkers, reports findings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Review Sub-Agent&lt;/strong&gt;: Scans for common vulnerabilities, checks dependency updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test Coverage Sub-Agent&lt;/strong&gt;: Identifies untested code paths, suggests test cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation Sub-Agent&lt;/strong&gt;: Checks for missing docstrings, outdated README sections&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Configure MongoDB checkpointing so the review survives interruptions. Use the &lt;code&gt;AGENTS.md&lt;/code&gt; file to encode your team's code review standards. Set up LangSmith tracing and build a custom evaluator that scores review thoroughness against a benchmark of manually reviewed PRs.&lt;/p&gt;

&lt;p&gt;The goal isn't a production-ready tool—it's hands-on experience with planning DAGs, sub-agent coordination, persistent memory, and trace-based evaluation. These are the primitives that every non-trivial agent will require as we move from demos to deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/blog/march-2026-langchain-newsletter" rel="noopener noreferrer"&gt;March 2026: LangChain Newsletter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/state-of-agent-engineering" rel="noopener noreferrer"&gt;State of Agent Engineering - LangChain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/blog/nvidia-enterprise" rel="noopener noreferrer"&gt;LangChain Announces Enterprise Agentic AI Platform Built with NVIDIA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/announcing-the-langchain-mongodb-partnership-the-ai-agent-stack-that-runs-on-the-database-you-already-trust/" rel="noopener noreferrer"&gt;Announcing the LangChain + MongoDB Partnership: The AI Agent Stack That Runs On The Database You Already Trust&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/the-next-evolution-of-the-agents-sdk/" rel="noopener noreferrer"&gt;The next evolution of the Agents SDK - OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2602.07359" rel="noopener noreferrer"&gt;W&amp;amp;D: Scaling Parallel Tool Calling for Efficient Deep Research Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2604.14362v1" rel="noopener noreferrer"&gt;APEX-MEM: Agentic Semi-Structured Memory with Temporal Reasoning for Long-Term Conversational AI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opensource.microsoft.com/blog/2026/04/02/introducing-the-agent-governance-toolkit-open-source-runtime-security-for-ai-agents/" rel="noopener noreferrer"&gt;Introducing the Agent Governance Toolkit - Microsoft Open Source&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This is part of the Agentic Engineering Weekly series: a deep-dive every Monday into the frameworks, patterns, and techniques shaping the next generation of AI systems.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Agentic Engineering Weekly series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building something agentic? Drop a comment — I'd love to feature reader projects.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Weekly: Agent Wars Escalate as Anthropic Reclaims Benchmark Crown and Infrastructure Reality Bites</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 20 Apr 2026 12:02:37 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-agent-wars-escalate-as-anthropic-reclaims-benchmark-crown-and-infrastructure-reality-2fec</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-agent-wars-escalate-as-anthropic-reclaims-benchmark-crown-and-infrastructure-reality-2fec</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly: Agent Wars Escalate as Anthropic Reclaims Benchmark Crown and Infrastructure Reality Bites
&lt;/h1&gt;

&lt;p&gt;The battle for AI supremacy entered a new phase this week as Anthropic's Claude Opus 4.7 narrowly reclaimed the top spot on agentic coding benchmarks, while OpenAI responded by expanding Codex's desktop automation capabilities in a direct challenge to Anthropic's computer use features. Meanwhile, a sobering Reuters analysis put hard numbers on the gap between AI ambitions and physical reality—a $7 trillion gap, to be precise.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Beefs Up Codex With Desktop Control, Taking Direct Aim at Anthropic
&lt;/h2&gt;

&lt;p&gt;OpenAI announced significant enhancements to its Codex agent this week, rolling out expanded desktop automation capabilities that position the tool as a direct competitor to Anthropic's computer use features. The update, detailed in OpenAI's &lt;a href="https://openai.com/index/next-phase-of-enterprise-ai/" rel="noopener noreferrer"&gt;enterprise AI roadmap&lt;/a&gt;, gives Codex substantially more power over user desktop environments, including the ability to navigate file systems, manipulate application windows, and execute multi-step workflows across different programs.&lt;/p&gt;

&lt;p&gt;The timing is no coincidence. Anthropic's computer use capabilities, introduced with Claude 3.5 Sonnet in late 2024, established an early lead in the "agents that control your screen" category. OpenAI's enhanced Codex represents a calculated move to close that gap before enterprise adoption patterns solidify. According to &lt;a href="https://venturebeat.com/" rel="noopener noreferrer"&gt;VentureBeat's coverage&lt;/a&gt; of the announcement, the new features include improved error recovery when desktop automation encounters unexpected UI states—a critical pain point in earlier agentic systems.&lt;/p&gt;

&lt;p&gt;Industry observers note this escalation reflects broader "agent wars" dynamics between major AI labs. As models converge on similar benchmark performance, the competitive differentiation increasingly comes from how effectively these systems can operate autonomously in real-world computing environments. OpenAI appears to be betting that enterprise customers will favor tighter integration with existing productivity workflows over raw model capability alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic CPO Departs Figma Board Amid Reports of Competing Product Launch
&lt;/h2&gt;

&lt;p&gt;In a move that sent ripples through both AI and design tool communities, Anthropic's Chief Product Officer departed from Figma's board of directors this week. &lt;a href="https://techcrunch.com/category/artificial-intelligence/" rel="noopener noreferrer"&gt;TechCrunch reported&lt;/a&gt; that the departure stemmed from plans to launch a product that would compete directly with Figma's collaborative design platform.&lt;/p&gt;

&lt;p&gt;The resignation highlights an increasingly awkward tension: AI companies that once positioned themselves as infrastructure providers are now eyeing the application layer—including tools built by their own board affiliates. For Figma, which has spent years building a collaborative design ecosystem, the prospect of an AI-native competitor backed by one of the leading foundation model companies represents an existential threat.&lt;/p&gt;

&lt;p&gt;The departure raises broader questions about AI companies' expansion strategies. Anthropic has historically emphasized its role as a model provider and safety research organization, but the design tool space represents lucrative territory where AI capabilities could fundamentally reshape workflows. Single-prompt generation of complex design assets, AI-driven prototyping, and intelligent design system management are all areas where foundation model capabilities could disrupt incumbent tools.&lt;/p&gt;

&lt;p&gt;Industry analysts suggest this move may accelerate consolidation in the design tool market, as traditional players race to integrate AI capabilities before AI-native alternatives mature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;The agentic AI landscape continues its rapid evolution, with new data underscoring both the opportunity and organizational challenges ahead. According to UiPath's &lt;a href="https://huggingface.co/blog/daya-shankar/agentic-ai-trends-2026" rel="noopener noreferrer"&gt;2026 report&lt;/a&gt;, 78% of executives say they need to reinvent their operating models to capture agentic AI value—a stark admission that current organizational structures weren't designed for autonomous AI systems.&lt;/p&gt;

&lt;p&gt;Architecture patterns are maturing quickly. The era of solo agents handling tasks in isolation is giving way to &lt;a href="https://arxiv.org/html/2511.17332v2" rel="noopener noreferrer"&gt;multi-agent systems&lt;/a&gt; featuring centralized control layers that coordinate specialized agents, handle inter-agent communication, and maintain coherent state across complex workflows. This "orchestration layer" approach mirrors microservices patterns from traditional software architecture.&lt;/p&gt;

&lt;p&gt;Quality control practices are adapting accordingly. Agentic QA—where AI agents review AI-generated code for security vulnerabilities, architectural consistency, and adherence to coding standards—is becoming &lt;a href="https://huggingface.co/blog/Svngoku/agentic-coding-trends-2026" rel="noopener noreferrer"&gt;standard practice&lt;/a&gt; in mature development organizations. The irony of AI checking AI isn't lost on practitioners, but the approach has proven more scalable than human review alone.&lt;/p&gt;

&lt;p&gt;For those tracking the space, the &lt;a href="https://github.com/caramaschiHG/awesome-ai-agents-2026" rel="noopener noreferrer"&gt;awesome-ai-agents-2026&lt;/a&gt; GitHub repository now catalogs over 340 resources across 20 categories, updated monthly. Key framework categories that have emerged include IDE-native agents, terminal/CLI agents, autonomous software engineers, and multi-agent orchestration platforms—reflecting the field's rapid specialization.&lt;/p&gt;

&lt;h2&gt;
  
  
  InsightFinder Raises $15M to Debug Where AI Agents Go Wrong
&lt;/h2&gt;

&lt;p&gt;InsightFinder announced a &lt;a href="https://techcrunch.com/category/artificial-intelligence/" rel="noopener noreferrer"&gt;$15 million funding round&lt;/a&gt; this week, targeting the growing observability gap as enterprises deploy increasingly autonomous AI agents. The company's platform focuses specifically on helping organizations diagnose why and where AI agent workflows fail—a problem that grows more critical as agents handle longer, more complex task chains.&lt;/p&gt;

&lt;p&gt;The funding reflects a maturing market reality: deploying AI agents is relatively straightforward; understanding what they're doing and why they fail is considerably harder. Traditional application performance monitoring tools weren't designed for systems where the "application logic" is an emergent property of model behavior rather than deterministic code paths.&lt;/p&gt;

&lt;p&gt;InsightFinder's approach involves capturing detailed traces of agent reasoning, tool calls, and environmental interactions, then using specialized models to identify failure patterns and root causes. Enterprise customers report that debugging time for agent failures has dropped by 60-70% compared to manual log analysis.&lt;/p&gt;

&lt;p&gt;The round signals broader investor confidence in the AI operations tooling layer. As &lt;a href="https://venturebeat.com/" rel="noopener noreferrer"&gt;VentureBeat noted&lt;/a&gt;, the "AI observability" category has attracted over $200 million in venture funding in 2026 alone, suggesting that operational maturity—not just raw capability—is becoming the key enterprise adoption bottleneck.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Opus 4.7 Reclaims Top Spot With 64.3% SWE-bench Pro Score
&lt;/h2&gt;

&lt;p&gt;Anthropic &lt;a href="https://venturebeat.com/technology/anthropic-releases-claude-opus-4-7-narrowly-retaking-lead-for-most-powerful-generally-available-llm" rel="noopener noreferrer"&gt;released Claude Opus 4.7&lt;/a&gt; this week, narrowly reclaiming the crown for most powerful generally available large language model. The new release achieved 64.3% on SWE-bench Pro agentic coding tasks—a significant jump from the 53.4% score posted by its predecessor and enough to edge past both GPT-5.4 and &lt;a href="https://techcrunch.com/2026/02/19/googles-new-gemini-pro-model-has-record-benchmark-scores-again/" rel="noopener noreferrer"&gt;Gemini 3.1 Pro&lt;/a&gt; on key benchmarks.&lt;/p&gt;

&lt;p&gt;The visual processing improvements are equally striking. Performance on XBOW tests—which measure fine-grained visual understanding—jumped from 54.5% to 98.5%, suggesting Anthropic has made substantial progress on multimodal capabilities that translate directly to computer use and GUI automation tasks.&lt;/p&gt;

&lt;p&gt;Notably absent from the public release: Mythos, Anthropic's more powerful successor model that reportedly surpasses Opus 4.7 by a significant margin. According to the &lt;a href="https://venturebeat.com/technology/anthropic-releases-claude-opus-4-7-narrowly-retaking-lead-for-most-powerful-generally-available-llm" rel="noopener noreferrer"&gt;VentureBeat report&lt;/a&gt;, Mythos remains restricted to enterprise security partners, suggesting Anthropic is taking a staged approach to releasing its most capable systems.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/salttechno/LLM-Model-Comparison-2026" rel="noopener noreferrer"&gt;LLM comparison data&lt;/a&gt; tracking these releases shows benchmark competition has intensified dramatically—margins between leading models are now measured in single percentage points rather than the double-digit gaps common in 2024.&lt;/p&gt;

&lt;h2&gt;
  
  
  $7 Trillion Reality Check: AI Infrastructure Dreams Hit Power Wall
&lt;/h2&gt;

&lt;p&gt;A sobering &lt;a href="https://www.reuters.com/commentary/breakingviews/ai-dreams-crash-into-stark-7-trln-reality-2026-04-07/" rel="noopener noreferrer"&gt;Reuters analysis&lt;/a&gt; published this week put hard numbers on the gap between AI ambitions and physical reality. Meta, xAI, and other major players are collectively targeting data centers consuming 110 gigawatts—roughly equivalent to Japan's entire electricity consumption. The price tag for this infrastructure build-out: approximately $7 trillion.&lt;/p&gt;

&lt;p&gt;NVIDIA CEO Jensen Huang's recent estimate that each major AI data center requires a minimum $60 billion investment underscores the scale challenge. These aren't software problems solvable with clever engineering; they're physical infrastructure constraints that require years of permitting, construction, and grid upgrades to address.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.reuters.com/technology/artificial-intelligence/" rel="noopener noreferrer"&gt;Reuters analysis&lt;/a&gt; questions whether current AI business models can sustain these capital requirements. Revenue from AI services would need to grow by orders of magnitude to justify infrastructure investments at this scale—a bet that assumes continued exponential capability improvements translating into proportional commercial value.&lt;/p&gt;

&lt;p&gt;Power availability has emerged as the critical bottleneck. Even companies with unlimited capital face multi-year timelines to secure sufficient electricity, leading to creative solutions like locating facilities near nuclear plants or acquiring entire utility companies. The irony isn't lost on observers: the most advanced AI systems humanity has built are ultimately constrained by our ability to move electrons.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google AI Mode Now Enables Side-by-Side Web Exploration
&lt;/h2&gt;

&lt;p&gt;Google rolled out a significant update to its AI Mode this week, introducing side-by-side web exploration that lets users browse source materials while interacting with AI-generated responses. The feature, announced alongside the global rollout of &lt;a href="https://techcrunch.com/2026/02/19/googles-new-gemini-pro-model-has-record-benchmark-scores-again/" rel="noopener noreferrer"&gt;Gemini 3.1 Flash Live&lt;/a&gt;, addresses persistent user demand for source verification in AI-assisted search.&lt;/p&gt;

&lt;p&gt;The implementation reflects Google's attempt to thread a difficult needle: providing the convenience of AI-synthesized answers while preserving the web's role as a navigable information ecosystem. Users can now click on any cited source within an AI Mode response to open it in a split-screen view, examining the original context without losing their AI conversation thread.&lt;/p&gt;

&lt;p&gt;For publishers who have worried about AI reducing web traffic, the feature offers a partial olive branch—though whether it meaningfully increases click-through rates remains to be seen. Early data from Google suggests users do engage with source materials roughly 40% of the time when the side-by-side option is available.&lt;/p&gt;

&lt;p&gt;The broader trend points toward hybrid AI-augmented browsing becoming the default web experience. Rather than choosing between traditional search and AI chat interfaces, users increasingly expect both simultaneously—a pattern that &lt;a href="https://venturebeat.com/" rel="noopener noreferrer"&gt;VentureBeat suggests&lt;/a&gt; will reshape how websites are designed and optimized.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adobe Firefly Assistant Aims to Unify Creative Suite With Single-Prompt Control
&lt;/h2&gt;

&lt;p&gt;Adobe launched its Firefly AI Assistant this week, a cross-application tool that spans Photoshop, Premiere Pro, Illustrator, and other Creative Cloud applications. The headline capability: a single natural language prompt can orchestrate actions across multiple apps simultaneously—edit an image in Photoshop, apply matching color grading in Premiere, and generate complementary vector assets in Illustrator, all from one command.&lt;/p&gt;

&lt;p&gt;The integration represents Adobe's most aggressive move yet to defend its creative tool dominance against AI-native competitors. As &lt;a href="https://techcrunch.com/category/artificial-intelligence/" rel="noopener noreferrer"&gt;TechCrunch coverage&lt;/a&gt; notes, startups have been chipping away at individual Creative Cloud use cases with AI-first approaches; Adobe's response is to leverage its multi-application ecosystem as a moat that single-purpose AI tools can't easily cross.&lt;/p&gt;

&lt;p&gt;Early user feedback suggests the workflow consolidation delivers genuine productivity gains on complex projects, though the prompting learning curve remains steep: power users report that crafting prompts which reliably produce the desired cross-application results takes substantial experimentation.&lt;/p&gt;

&lt;p&gt;The assistant also introduces persistent project context—it remembers style decisions, brand guidelines, and user preferences across sessions and applications. For creative teams working on large-scale projects, this contextual memory could prove more valuable than the generation capabilities themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;The infrastructure reality check from Reuters deserves close attention in coming months—if power constraints force AI labs to slow capability scaling, the entire competitive dynamic could shift toward efficiency optimization rather than raw capability races. Meanwhile, the narrowing benchmark gaps between Claude, GPT, and Gemini suggest we're approaching a regime where model differentiation comes from tooling, ecosystem integration, and deployment flexibility rather than benchmark scores. Expect the agent wars to intensify as that new competitive landscape takes shape.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/technology/artificial-intelligence/" rel="noopener noreferrer"&gt;AI News | Latest Headlines and Developments | Reuters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/category/artificial-intelligence/" rel="noopener noreferrer"&gt;AI News &amp;amp; Artificial Intelligence | TechCrunch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://venturebeat.com/" rel="noopener noreferrer"&gt;VentureBeat | Transformative tech coverage that matters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/next-phase-of-enterprise-ai/" rel="noopener noreferrer"&gt;The next phase of enterprise AI | OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/commentary/breakingviews/ai-dreams-crash-into-stark-7-trln-reality-2026-04-07/" rel="noopener noreferrer"&gt;AI dreams crash into stark $7 trln reality | Reuters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2511.17332v2" rel="noopener noreferrer"&gt;Agentifying Agentic AI - AAAI 2026 Bridge Program&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/blog/daya-shankar/agentic-ai-trends-2026" rel="noopener noreferrer"&gt;Latest Agentic AI Trends to Watch in 2026 | Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/caramaschiHG/awesome-ai-agents-2026" rel="noopener noreferrer"&gt;caramaschiHG/awesome-ai-agents-2026 | GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/blog/Svngoku/agentic-coding-trends-2026" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends - Implementation Guide | Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2026/02/19/googles-new-gemini-pro-model-has-record-benchmark-scores-again/" rel="noopener noreferrer"&gt;Google's new Gemini Pro model has record benchmark scores | TechCrunch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/salttechno/LLM-Model-Comparison-2026" rel="noopener noreferrer"&gt;salttechno/LLM-Model-Comparison-2026 | GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://venturebeat.com/technology/anthropic-releases-claude-opus-4-7-narrowly-retaking-lead-for-most-powerful-generally-available-llm" rel="noopener noreferrer"&gt;Anthropic releases Claude Opus 4.7 | VentureBeat&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every week, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
    <item>
      <title>LangSmith Fleet: Managing Agent Identity, Permissions, and Skills at Enterprise Scale</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 13 Apr 2026 12:05:36 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/langsmith-fleet-managing-agent-identity-permissions-and-skills-at-enterprise-scale-19p7</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/langsmith-fleet-managing-agent-identity-permissions-and-skills-at-enterprise-scale-19p7</guid>
      <description>&lt;h1&gt;
  
  
  LangSmith Fleet: Managing Agent Identity, Permissions, and Skills at Enterprise Scale
&lt;/h1&gt;

&lt;p&gt;The LangChain team quietly shipped one of the most significant architectural changes to LangSmith in March 2026, and it wasn't a new model integration or a flashy UI overhaul. They renamed Agent Builder to &lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;Fleet&lt;/a&gt;—a seemingly cosmetic change that signals a fundamental shift in how enterprises should think about their AI agent portfolios. This isn't about building better individual agents anymore; it's about managing dozens or hundreds of agents as a coordinated organizational capability. If you're running more than a handful of agents in production, this is the infrastructure layer you didn't know you needed.&lt;/p&gt;

&lt;p&gt;The timing is deliberate. LangChain's &lt;a href="https://www.langchain.com/state-of-agent-engineering" rel="noopener noreferrer"&gt;State of Agent Engineering report&lt;/a&gt; revealed a troubling gap: 89% of organizations have observability in place, but most lack centralized governance for their agent portfolios. Teams are spinning up agents in silos, duplicating prompt engineering effort, and losing track of which agents have access to what resources. Fleet directly addresses this governance vacuum by introducing three primitives that enterprise deployments desperately need: agent identity, role-based permissions, and reusable Skills.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://www.langchain.com/blog/nvidia-enterprise" rel="noopener noreferrer"&gt;300+ enterprise customers now processing over 15 billion traces&lt;/a&gt;, LangSmith has accumulated hard-won lessons about what breaks at scale. Fleet codifies these patterns into infrastructure that prevents the debugging nightmare of "which agent caused this production incident?" before it happens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fleet Architecture: Identity and Permissions Model
&lt;/h2&gt;

&lt;p&gt;The core insight behind Fleet's design is deceptively simple: every agent in your organization needs a stable, auditable identity that persists across deployments, versions, and team ownership changes. This sounds obvious until you realize how most teams currently manage agents—as anonymous graph definitions deployed via CI/CD with no central registry.&lt;/p&gt;

&lt;p&gt;Fleet's identity system assigns each agent a unique organizational identifier that ties together its definition, deployment history, performance metrics, and access patterns. This identity travels with the agent across environments. When your compliance checking agent runs in staging versus production, it's the same identity with environment-specific configuration, not two unrelated deployments you have to mentally correlate.&lt;/p&gt;
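
&lt;p&gt;The mechanics are easy to model: one canonical record, with each environment supplying only a configuration overlay on top of it. A minimal plain-Python sketch of the idea (illustrative only, not the actual Fleet API):&lt;/p&gt;

```python
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    """One canonical identity; environments only overlay configuration."""
    fleet_id: str
    owner_team: str
    base_config: dict = field(default_factory=dict)
    env_overrides: dict = field(default_factory=dict)

    def config_for(self, env: str) -> dict:
        # Same identity everywhere; only the overlay differs per environment.
        merged = dict(self.base_config)
        merged.update(self.env_overrides.get(env, {}))
        return merged

agent = AgentRecord(
    fleet_id="agt-compliance-001",
    owner_team="legal-ops",
    base_config={"model": "default-llm", "temperature": 0.0},
    env_overrides={"staging": {"model": "default-llm-mini"}},
)
```

&lt;p&gt;Staging and production resolve to different configurations, but metrics, history, and ownership all hang off the single &lt;code&gt;fleet_id&lt;/code&gt;.&lt;/p&gt;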

&lt;p&gt;The permission model layers role-based access control on top of these identities. Fleet distinguishes between several permission levels: viewers can observe agent behavior and metrics, editors can modify prompts and tool configurations, deployers can push agents to production environments, and administrators can retire agents or transfer ownership. These map cleanly to organizational realities—your ML platform team shouldn't need to ask a Slack channel before deploying a new agent version, but they probably shouldn't be editing the legal team's contract review prompts without approval.&lt;/p&gt;
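
&lt;p&gt;Because deploy and edit are separate capabilities rather than rungs on a strict ladder, the model is naturally expressed as capability sets per role. The following is plain illustrative Python, not the Fleet API:&lt;/p&gt;

```python
# Hypothetical mapping of the four permission levels to discrete
# capabilities; note that deployers can deploy without being able to edit.
ROLE_CAPABILITIES = {
    "viewer": {"observe"},
    "editor": {"observe", "edit"},
    "deployer": {"observe", "deploy"},
    "admin": {"observe", "edit", "deploy", "retire", "transfer"},
}

def can(role: str, action: str) -> bool:
    """Return True if the given role grants the given capability."""
    return action in ROLE_CAPABILITIES.get(role, set())
```

&lt;p&gt;Under this model the ML platform team gets &lt;code&gt;deployer&lt;/code&gt; and can ship without touching the legal team's prompts, which require &lt;code&gt;editor&lt;/code&gt;.&lt;/p&gt;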

&lt;p&gt;Sharing mechanisms enable cross-team agent reuse without the current anti-pattern of exporting agent definitions as JSON and importing them into another workspace. When a shared agent is updated, teams consuming it can see the update and choose to adopt it—version control semantics applied to agent definitions. This preserves provenance: you always know which team owns the canonical definition and what modifications downstream teams have made.&lt;/p&gt;

&lt;p&gt;Integration with enterprise identity providers happens at the organization level. Fleet supports SSO and SAML authentication, which means your existing Okta groups or Azure AD roles can map directly to Fleet permissions. The compliance team's AD group gets viewer access to all agents; the ML platform team's group gets deployer access. No separate permission system to maintain.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;audit trail functionality&lt;/a&gt; captures agent lifecycle events comprehensively: creation timestamp and creator identity, every modification with diff visibility, deployment events across environments, permission grants and revocations. This isn't just for compliance checkbox exercises—it's the forensic evidence you need when a production agent starts behaving unexpectedly six weeks after someone made a "minor prompt tweak."&lt;/p&gt;
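
&lt;p&gt;Conceptually the audit trail is just an append-only event log keyed by agent identity, replayable per agent when something goes wrong. A stripped-down sketch (plain Python, not the real implementation):&lt;/p&gt;

```python
import time

class AuditLog:
    """Append-only log of agent lifecycle events (illustrative sketch)."""

    def __init__(self):
        self._events = []

    def record(self, agent_id: str, actor: str, event: str, detail=None):
        # Events are only ever appended, never mutated or deleted.
        self._events.append({
            "ts": time.time(),
            "agent_id": agent_id,
            "actor": actor,
            "event": event,
            "detail": detail or {},
        })

    def history(self, agent_id: str) -> list:
        # Forensics: replay everything that happened to one agent, in order.
        return [e for e in self._events if e["agent_id"] == agent_id]

log = AuditLog()
log.record("agt-001", "alice", "created")
log.record("agt-001", "bob", "prompt_modified", {"diff": "+2 lines"})
```

&lt;p&gt;Six weeks after the "minor prompt tweak," &lt;code&gt;history("agt-001")&lt;/code&gt; tells you exactly who changed what, and when.&lt;/p&gt;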

&lt;h2&gt;
  
  
  Skills: Equipping Agents with Reusable Capabilities
&lt;/h2&gt;

&lt;p&gt;If identity and permissions form Fleet's governance layer, Skills represent its knowledge management layer. A Skill is a modular, shareable knowledge package that can be attached to any agent in your fleet. The &lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;March 2026 newsletter announced the first open-source Skills&lt;/a&gt; alongside the Fleet rename, establishing an ecosystem pattern that will likely expand significantly.&lt;/p&gt;

&lt;p&gt;The architectural insight here addresses a real pain point: domain expertise shouldn't be trapped inside individual agent definitions. When your engineering team figures out the optimal way to interact with your internal APIs—the authentication dance, rate limiting patterns, error recovery logic—that knowledge currently lives in one agent's prompts. Every subsequent agent that needs API access must rediscover or copy-paste this expertise.&lt;/p&gt;

&lt;p&gt;Skills separate domain expertise from agent orchestration logic. A "Company API Integration" Skill encapsulates authentication patterns, retry strategies, and response parsing conventions. A "Legal Document Formatting" Skill knows your organization's citation styles, confidentiality markings, and section numbering conventions. These Skills attach to agents without modifying the agent's core orchestration graph.&lt;/p&gt;

&lt;p&gt;The skill attachment model supports both static and dynamic binding. Static attachment happens at agent definition time—you declare that your contract review agent always uses the Legal Document Formatting Skill. Dynamic attachment allows agents to discover and request Skills at runtime based on task requirements, though this requires more sophisticated capability negotiation that most teams won't need initially.&lt;/p&gt;
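
&lt;p&gt;The distinction is easy to illustrate: static skills live on the agent definition, while dynamic ones are looked up in a registry when a task arrives. A toy sketch with hypothetical names throughout, not the Fleet API:&lt;/p&gt;

```python
# Hypothetical org-wide skill registry; values stand in for knowledge content.
SKILL_REGISTRY = {
    "legal-document-formatting": "citation styles, confidentiality markings",
    "company-api-integration": "auth patterns, retry and parsing conventions",
}

class Agent:
    def __init__(self, static_skills):
        # Static binding: fixed when the agent is defined.
        self.static_skills = list(static_skills)

    def skills_for_task(self, task_tags):
        # Dynamic binding: pull registry skills requested by this task,
        # skipping anything already statically attached.
        dynamic = [
            name for name in SKILL_REGISTRY
            if name in task_tags and name not in self.static_skills
        ]
        return self.static_skills + dynamic

reviewer = Agent(static_skills=["legal-document-formatting"])
```

&lt;p&gt;Most teams can stop at the static list; the dynamic path only pays off once agents genuinely need to negotiate capabilities per task.&lt;/p&gt;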

&lt;p&gt;Version management for Skills solves the knowledge distribution problem. When your platform team improves the API Integration Skill—say, adding support for a new authentication method—that improvement propagates to all agents using the Skill. Teams can pin to specific Skill versions for stability or track the latest version for continuous improvement. The semantics mirror dependency management in traditional software development, which makes the mental model accessible to engineering teams.&lt;/p&gt;
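
&lt;p&gt;Those semantics reduce to a familiar resolution rule: honor an explicit pin if one exists, otherwise track the highest published version. A small sketch, assuming numeric x.y.z version strings:&lt;/p&gt;

```python
def resolve(available: list, pin: str = None) -> str:
    """Pick a Skill version: a pin wins; otherwise take the latest."""
    if pin is not None:
        if pin not in available:
            raise ValueError(f"pinned version {pin} not published")
        return pin
    # "Latest" by semantic-version order, not string order,
    # so that 1.10.0 correctly beats 1.2.0.
    return max(available, key=lambda v: tuple(int(p) for p in v.split(".")))
```

&lt;p&gt;Pinning buys stability at the cost of manually adopting improvements; tracking latest inverts that trade, exactly as with package dependencies.&lt;/p&gt;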

&lt;p&gt;The skill definition format establishes a standard structure that includes capability declarations (what the Skill enables), knowledge content (prompts, examples, patterns), tool bindings (if the Skill requires specific tool access), and configuration parameters (organization-specific customization points). This standardization means Skills can be shared not just within organizations but eventually across the emerging ecosystem of &lt;a href="https://arxiv.org/html/2604.05387v1" rel="noopener noreferrer"&gt;function calling improvements&lt;/a&gt; that the broader community is developing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hands-On: Code Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's build a practical Fleet setup with multiple agents, custom Skills, and proper permission configuration. This example creates a document processing fleet for a legal department with shared compliance capabilities.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# fleet_setup.py
# Requires: langsmith&amp;gt;=0.3.0, langchain&amp;gt;=0.4.0, langgraph&amp;gt;=0.3.0
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langsmith&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langsmith.fleet&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Fleet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;Skill&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;PermissionSet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SkillAttachment&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the LangSmith client with Fleet capabilities
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;fleet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Fleet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;organization_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-org-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define agent identity with comprehensive metadata
# This identity persists across deployments and environment changes
&lt;/span&gt;&lt;span class="n"&gt;contract_reviewer_identity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contract-reviewer-v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;display_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Contract Review Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;owner_team&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;legal-ops&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;purpose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reviews vendor contracts for compliance issues and risk factors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;deployment_environments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;legal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compliance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vendor-management&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;# Metadata for organizational classification
&lt;/span&gt;    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cost_center&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LEGAL-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_classification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidential&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review_frequency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quarterly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Register the identity with Fleet - this creates the audit trail
&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contract_reviewer_identity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Agent registered with ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define a reusable Skill for compliance checking
# Skills encapsulate domain expertise separate from orchestration
&lt;/span&gt;&lt;span class="n"&gt;compliance_checking_skill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Skill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;corporate-compliance-rules&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Encapsulates corporate compliance requirements for contract review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;# Knowledge content that will be injected into agent context
&lt;/span&gt;    &lt;span class="n"&gt;knowledge_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    ## Corporate Compliance Requirements

    All vendor contracts must be reviewed against these criteria:

    1. DATA HANDLING: Contracts involving PII must include:
       - Data processing addendum (DPA)
       - SOC 2 Type II certification requirement
       - Data residency clauses for EU customers (GDPR)
       - Breach notification timeline (max 72 hours)

    2. FINANCIAL TERMS: Flag for legal review if:
       - Auto-renewal clauses exceed 1 year
       - Liability caps below $1M for critical services
       - Payment terms shorter than Net 30
       - Price escalation clauses above 5% annually

    3. TERMINATION RIGHTS: Require:
       - Termination for convenience with 90-day notice
       - Immediate termination for material breach
       - Data return/deletion obligations post-termination

    4. INDEMNIFICATION: Standard requirements:
       - Mutual indemnification for IP infringement
       - Vendor indemnification for data breaches caused by vendor
       - Carve-outs for gross negligence
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;# Configuration parameters that can be customized per-organization
&lt;/span&gt;    &lt;span class="n"&gt;config_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;liability_threshold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto_renewal_max_years&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;breach_notification_hours&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;# Tags for discoverability in the Skill registry
&lt;/span&gt;    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;legal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compliance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contracts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vendor-management&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Register the Skill with Fleet
&lt;/span&gt;&lt;span class="n"&gt;registered_skill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register_skill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;compliance_checking_skill&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skill registered with ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;registered_skill&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;skill_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define the agent's state structure
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ContractReviewState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;contract_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;compliance_issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;risk_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;skill_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;  &lt;span class="c1"&gt;# Injected by Fleet at runtime
&lt;/span&gt;
&lt;span class="c1"&gt;# Build the LangGraph workflow for contract review
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_data_handling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ContractReviewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Check contract against data handling requirements from Skill.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Skill context is automatically injected by Fleet
&lt;/span&gt;    &lt;span class="n"&gt;skill_rules&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;skill_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;knowledge_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Your LLM call here would use skill_rules in the prompt
&lt;/span&gt;    &lt;span class="c1"&gt;# This is where domain expertise from the Skill gets applied
&lt;/span&gt;    &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Simplified example - real implementation would use LLM
&lt;/span&gt;    &lt;span class="n"&gt;contract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contract_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data processing addendum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dpa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DATA_HANDLING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing Data Processing Addendum (DPA)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;remediation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Request DPA from vendor before signing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compliance_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_financial_terms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ContractReviewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Check financial terms against Skill-defined thresholds.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;skill_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;skill_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;config&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;liability_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;skill_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;liability_threshold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation would parse contract and check against thresholds
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compliance_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_risk_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ContractReviewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Aggregate issues into overall risk score.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compliance_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="n"&gt;high_severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;medium_severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIUM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Risk score: 0-100, higher = more risk
&lt;/span&gt;    &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;high_severity&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;medium_severity&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;recommendation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;APPROVE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REVIEW&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recommendation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Construct the graph
&lt;/span&gt;&lt;span class="n"&gt;workflow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ContractReviewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_handling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analyze_data_handling&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_terms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analyze_financial_terms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_calculation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;calculate_risk_score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_handling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_terms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_terms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_calculation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_entry_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_handling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_finish_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_calculation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Attach the Skill to the agent identity
# This creates the binding between agent and reusable knowledge
&lt;/span&gt;&lt;span class="n"&gt;skill_attachment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SkillAttachment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;skill_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registered_skill&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;skill_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version_constraint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;~1.2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Accept 1.2.x patches automatically
&lt;/span&gt;    &lt;span class="n"&gt;config_overrides&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;liability_threshold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Org-specific override
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach_skill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skill_attachment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skill attached to agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure permissions for the agent
# RBAC model controls who can view, edit, deploy, or retire
&lt;/span&gt;&lt;span class="n"&gt;permissions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PermissionSet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rules&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="c1"&gt;# Legal ops team has full control
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;principal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group:legal-ops&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;# General counsel can view and deploy
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;principal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group:general-counsel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deployer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;# All legal staff can view metrics and outputs
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;principal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group:legal-all&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;# ML platform team can deploy but not modify prompts
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;principal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group:ml-platform&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deployer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;# Compliance team needs read access for audits
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;principal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group:compliance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_permissions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;permissions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Permissions configured&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Register the compiled graph with the agent identity
# This connects the LangGraph definition to the Fleet identity
&lt;/span&gt;&lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;environment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query the Fleet to verify setup
&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_agents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;team&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;legal-ops&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;display_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Fleet ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Skills: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_agent_skills&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Permissions: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_permissions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example: Run the agent with Fleet context injection
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fleet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registered_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fleet_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contract_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        VENDOR SERVICES AGREEMENT
        This agreement between Acme Corp and BigCloud Inc...
        Payment terms: Net 15
        Auto-renewal: 3 years
        Liability cap: $500,000
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;environment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Review Result:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Risk Score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;risk_score&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Recommendation: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recommendation&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Issues Found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;compliance_issues&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This implementation demonstrates Fleet's key capabilities: persistent agent identity, Skills as reusable knowledge packages, and permission configuration that maps to organizational structure. The &lt;code&gt;skill_context&lt;/code&gt; injection pattern keeps domain expertise separate from orchestration logic while making it available where needed.&lt;/p&gt;
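&lt;p&gt;For local testing outside Fleet's runtime, the same injection pattern can be reproduced by hand. The sketch below is illustrative only; the &lt;code&gt;inject_skill_context&lt;/code&gt; helper and the skill dictionary shape are assumptions, not Fleet API:&lt;/p&gt;

```python
# Illustrative only: Fleet injects skill_context at invoke time; this
# reproduces the pattern manually so graph nodes can be unit-tested.
def inject_skill_context(state: dict, skill: dict) -> dict:
    """Merge a Skill's knowledge and config into graph state, leaving
    the orchestration keys the nodes rely on untouched."""
    return {
        **state,
        "skill_context": {
            "knowledge_content": skill.get("knowledge", ""),
            "config": skill.get("config", {}),
        },
    }

state = inject_skill_context(
    {"contract_text": "VENDOR SERVICES AGREEMENT ..."},
    {
        "knowledge": "Flag any contract lacking a Data Processing Addendum.",
        "config": {"liability_threshold": 2_000_000},
    },
)
print(state["skill_context"]["config"])
```

&lt;p&gt;Each node then reads from &lt;code&gt;state["skill_context"]&lt;/code&gt; exactly as the graph functions above do, whether the context arrived from Fleet or from a test harness.&lt;/p&gt;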

&lt;h2&gt;
  
  
  Observability and Governance at Fleet Scale
&lt;/h2&gt;

&lt;p&gt;Fleet-level observability transforms how you monitor agent portfolios. Instead of drilling into individual agent dashboards and mentally aggregating patterns, Fleet provides organizational views that answer questions like "which agents had the highest error rates this week?" or "which teams are consuming the most LLM tokens?"&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;aggregated metrics dashboards&lt;/a&gt; span invocation counts, error rates, latency distributions, and token consumption across your entire agent fleet. You can slice these metrics by team, deployment environment, or custom tags. When leadership asks "how much are we spending on AI agents in the legal department?", you can answer with actual data rather than estimates.&lt;/p&gt;
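&lt;p&gt;If you export raw trace records, the same slicing can be done in a few lines of plain Python. The record shape below is hypothetical; Fleet's actual export schema may differ:&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical exported trace records; the real schema may differ.
traces = [
    {"team": "legal-ops", "env": "prod", "tokens": 1200, "error": False},
    {"team": "legal-ops", "env": "prod", "tokens": 900, "error": True},
    {"team": "marketing", "env": "staging", "tokens": 400, "error": False},
]

def usage_by(traces, key):
    """Aggregate invocations, errors, and token spend along one dimension
    (team, env, or any tag present on the records)."""
    out = defaultdict(lambda: {"invocations": 0, "errors": 0, "tokens": 0})
    for t in traces:
        row = out[t[key]]
        row["invocations"] += 1
        row["errors"] += int(t["error"])
        row["tokens"] += t["tokens"]
    return dict(out)

print(usage_by(traces, "team"))
```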

&lt;p&gt;Polly, LangSmith's AI assistant that &lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;reached general availability&lt;/a&gt; alongside Fleet, can analyze fleet-wide patterns and suggest optimizations. This isn't just a chatbot interface to your metrics—Polly can identify correlations across agents that humans miss: "Three agents in the legal-ops team show similar latency spikes every Monday morning, likely correlated with the weekly contract upload batch job."&lt;/p&gt;

&lt;p&gt;The correlation between agent identity and LangSmith traces enables per-agent performance drill-downs that were previously impractical. Every trace is automatically tagged with the Fleet identity, so you can see exactly how the contract-reviewer-v1 agent performed across all its invocations—not just one deployment, but the complete behavioral history. This longitudinal view reveals drift patterns: if an agent's error rate creeps up over weeks, you can correlate the regression with the Skill updates or permission changes that might explain it.&lt;/p&gt;
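&lt;p&gt;A minimal version of that drift check needs nothing Fleet-specific: compare each agent's latest weekly error rate against its recent baseline. The input shape and thresholds below are assumptions for illustration:&lt;/p&gt;

```python
# Flag agents whose latest weekly error rate jumped well above their
# recent baseline. Input shape and thresholds are assumptions.
weekly_error_rates = {
    "contract-reviewer-v1": [0.02, 0.03, 0.05, 0.09],
    "policy-checker-v2": [0.04, 0.04, 0.03, 0.04],
}

def drifting(rates, window=3, factor=2.0):
    """Return agents whose newest rate is at least `factor` times the
    mean of the preceding `window` weeks."""
    flagged = []
    for agent, series in rates.items():
        if window + 1 > len(series):
            continue  # not enough history yet
        baseline = sum(series[-window - 1:-1]) / window
        if baseline > 0 and series[-1] >= factor * baseline:
            flagged.append(agent)
    return flagged

print(drifting(weekly_error_rates))
```

&lt;p&gt;Agents flagged this way become the candidates for correlating against recent Skill or permission changes.&lt;/p&gt;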

&lt;p&gt;Compliance reporting leverages the audit trail to generate reports showing which agents accessed which tools and when. For regulated industries, this isn't optional—&lt;a href="https://arxiv.org/pdf/2603.27075" rel="noopener noreferrer"&gt;legal frameworks are struggling to keep pace with agentic AI&lt;/a&gt;, and organizations that can demonstrate clear governance have an advantage. Fleet's event export integrates with enterprise monitoring systems like Datadog and Splunk, feeding agent lifecycle events into existing security and compliance workflows.&lt;/p&gt;

&lt;p&gt;Cost attribution by agent identity enables chargeback and budget planning at the organizational level. When the CFO asks why the AI infrastructure budget tripled, you can show exactly which teams and which agents drove the increase—and more importantly, what business value they delivered.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;If you're operating five or more agents in production, Fleet's identity system addresses a concrete debugging problem: the "which agent did this?" nightmare. When a customer reports unexpected behavior, you need to trace from the symptom back to a specific agent, version, and invocation. Without stable identities, this requires manual correlation across deployment logs, trace IDs, and team knowledge about what's running where.&lt;/p&gt;

&lt;p&gt;Skills reduce duplicated prompt engineering effort across your organization. If you've ever watched multiple teams independently solve the same problem—formatting API responses correctly, handling authentication retries, structuring outputs for downstream systems—you've experienced the knowledge fragmentation that Skills address. The investment in creating a Skill pays dividends across every agent that attaches it.&lt;/p&gt;

&lt;p&gt;The permission model becomes essential before you give non-engineering teams access to modify agent behavior. The trend toward &lt;a href="https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/6-core-capabilities-to-scale-agent-adoption-in-2026/" rel="noopener noreferrer"&gt;democratized agent creation&lt;/a&gt; is accelerating, and without permission controls, you face either a security nightmare or a bottleneck where engineers review every change. Fleet's RBAC model lets you give marketing the ability to tweak their content agent's prompts without giving them access to the production deployment pipeline.&lt;/p&gt;
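&lt;p&gt;To make the role semantics concrete, here is a local sketch of how rules in the format used earlier could be evaluated. Fleet enforces permissions server-side, so the &lt;code&gt;ROLE_ACTIONS&lt;/code&gt; mapping and the &lt;code&gt;is_allowed&lt;/code&gt; helper are illustrative assumptions, not Fleet internals:&lt;/p&gt;

```python
# Local sketch of the rule semantics; Fleet enforces this server-side.
# The role-to-action mapping is an assumption for illustration.
ROLE_ACTIONS = {
    "admin": {"view", "edit", "deploy", "retire"},
    "deployer": {"view", "deploy"},
    "viewer": {"view"},
}

def is_allowed(rules, principals, action):
    """True if any of the caller's principals holds a role granting `action`."""
    for rule in rules:
        if rule["principal"] in principals and action in ROLE_ACTIONS.get(rule["role"], set()):
            return True
    return False

rules = [
    {"principal": "group:legal-ops", "role": "admin"},
    {"principal": "group:marketing", "role": "viewer"},
]
print(is_allowed(rules, {"group:marketing"}, "deploy"))
```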

&lt;p&gt;The migration path from Agent Builder is relatively smooth: existing agents automatically receive Fleet identities when you enable Fleet on your workspace. Skill extraction is manual—you'll need to identify reusable knowledge currently embedded in agent prompts and factor it into Skills. This is work worth doing regardless of Fleet, as it forces you to document tribal knowledge that currently exists only in specific prompts.&lt;/p&gt;

&lt;p&gt;Evaluate Fleet alongside whatever agent registry solution you currently have (or more likely, don't have). Many teams are using internal wikis, spreadsheets, or Notion pages to track agents—fine for two or three agents, increasingly untenable as portfolios grow. Fleet provides a programmatic registry with API access, which enables automation that informal registries can't support.&lt;/p&gt;

&lt;p&gt;Start with identity and permissions before investing heavily in Skills. Governance foundations enable safe skill sharing later; without them, Skills become another vector for uncontrolled changes propagating across your fleet. Get the audit trail and permission model in place first, then layer Skills on top once you've established who can modify what.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Build This Week
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Project: Agent Fleet Inventory and Skill Extraction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before you can benefit from Fleet, you need to understand what you're managing. This week's project is a comprehensive inventory of your current agent portfolio with identification of skill extraction candidates.&lt;/p&gt;

&lt;p&gt;Start by cataloging every agent your organization runs: the ones in production, the ones in staging that "will ship soon," and the forgotten experiments still consuming compute in someone's sandbox. For each agent, document: owner team, purpose, tools it accesses, data it processes, current deployment status, and rough monthly cost.&lt;/p&gt;
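&lt;p&gt;A lightweight way to make that inventory queryable from day one is a plain data structure rather than a spreadsheet. The field names below are suggestions, not a Fleet schema:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Illustrative inventory record -- field names are suggestions, not a schema.
@dataclass
class AgentRecord:
    name: str
    owner_team: str
    purpose: str
    tools: list = field(default_factory=list)
    status: str = "production"   # production | staging | sandbox
    monthly_cost_usd: float = 0.0

inventory = [
    AgentRecord("support-triage", "support", "route tickets",
                tools=["zendesk", "search"], monthly_cost_usd=1200.0),
    AgentRecord("content-drafter", "marketing", "draft blog posts",
                tools=["cms"], status="staging", monthly_cost_usd=300.0),
]

# Roll up spend by team -- the kind of query a spreadsheet makes painful.
spend = {}
for a in inventory:
    spend[a.owner_team] = spend.get(a.owner_team, 0.0) + a.monthly_cost_usd
```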

&lt;p&gt;Next, analyze the prompts across these agents looking for duplicated expertise. Common patterns include: API interaction conventions, output formatting requirements, domain-specific terminology definitions, error handling approaches, and citation/attribution styles. These are your Skill extraction candidates.&lt;/p&gt;

&lt;p&gt;Create one Skill from the most duplicated pattern you find. Define its knowledge content, configuration parameters, and tags. Attach it to two existing agents and verify they behave consistently. This gives you hands-on experience with the Skill model before you scale.&lt;/p&gt;
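&lt;p&gt;Fleet's Skill schema isn't public here, so the sketch below is a hypothetical model of the knowledge/configuration/tags split, with attachment approximated as prompt composition:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Hypothetical Skill shape -- Fleet's real schema may differ; this just makes
# the "knowledge + configuration + tags" split concrete.
@dataclass
class Skill:
    name: str
    knowledge: str                       # the prompt fragment the Skill injects
    config: dict = field(default_factory=dict)
    tags: tuple = ()

api_style = Skill(
    name="api-response-format",
    knowledge="Always return JSON with keys `status`, `data`, `errors`.",
    config={"max_retries": 3},
    tags=("formatting", "api"),
)

def build_prompt(base_prompt: str, skills: list) -> str:
    # Attaching a Skill is modeled as appending its knowledge to the prompt.
    return base_prompt + "\n" + "\n".join(s.knowledge for s in skills)

agent_a = build_prompt("You are the billing agent.", [api_style])
agent_b = build_prompt("You are the support agent.", [api_style])
# Both agents now carry the identical formatting instruction.
```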

&lt;p&gt;Finally, draft an RBAC policy for your fleet. Which teams should be viewers, editors, deployers, or administrators for which agents? Map this to your existing identity provider groups. You don't need Fleet to implement the policy—having it documented means you're ready when Fleet permissions become available in your workspace.&lt;/p&gt;
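&lt;p&gt;A policy document can be concrete enough to test. The sketch below assumes the viewer/editor/deployer/administrator tiers described above and invents example identity-provider group names:&lt;/p&gt;

```python
# A documented RBAC policy can be as simple as a mapping from IdP group to
# per-agent role grants. Role tiers mirror the viewer/editor/deployer/
# administrator split above; group and agent names are examples.
ROLE_RANK = {"viewer": 0, "editor": 1, "deployer": 2, "administrator": 3}

POLICY = {
    "grp-marketing": {"content-agent": "editor"},
    "grp-platform":  {"content-agent": "administrator", "support-agent": "deployer"},
    "grp-everyone":  {"support-agent": "viewer"},
}

def allowed(groups: list, agent: str, needed: str) -> bool:
    """True if any of the caller's groups grants at least the needed role."""
    have = max(
        (ROLE_RANK[POLICY[g][agent]] for g in groups
         if g in POLICY and agent in POLICY[g]),
        default=-1,
    )
    return have >= ROLE_RANK[needed]
```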

&lt;p&gt;The deliverable is a Fleet readiness document: agent inventory, three skill extraction candidates with rough definitions, and an RBAC policy ready for implementation. This preparation work ensures you can adopt Fleet deliberately rather than reactively when the tooling matures.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;March 2026: LangChain Newsletter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/blog/nvidia-enterprise" rel="noopener noreferrer"&gt;LangChain Announces Enterprise Agentic AI Platform Built with NVIDIA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;On Agent Frameworks and Agent Observability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.langchain.com/state-of-agent-engineering" rel="noopener noreferrer"&gt;State of Agent Engineering - LangChain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/6-core-capabilities-to-scale-agent-adoption-in-2026/" rel="noopener noreferrer"&gt;6 core capabilities to scale agent adoption in 2026 - Microsoft&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2603.27075" rel="noopener noreferrer"&gt;Mind The Gap: How The Technical Mechanism Of Agentic AI Outpace Global Legal Frameworks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2604.05387v1" rel="noopener noreferrer"&gt;Data-Driven Function Calling Improvements in Large Language Models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This is part of the Agentic Engineering Weekly series — a deep-dive every Monday into the frameworks, patterns, and techniques shaping the next generation of AI systems.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Agentic Engineering Weekly series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building something agentic? Drop a comment — I'd love to feature reader projects.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Weekly: Intel-Google CPU Alliance, Meta's Proprietary Pivot, and the $7 Trillion Infrastructure Reality Check</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 13 Apr 2026 12:04:55 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-intel-google-cpu-alliance-metas-proprietary-pivot-and-the-7-trillion-infrastructure-n4b</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-intel-google-cpu-alliance-metas-proprietary-pivot-and-the-7-trillion-infrastructure-n4b</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly: Intel-Google CPU Alliance, Meta's Proprietary Pivot, and the $7 Trillion Infrastructure Reality Check
&lt;/h1&gt;

&lt;p&gt;The battle lines in AI infrastructure are being redrawn this week. As GPU costs and power demands reach breaking points, we're seeing major players hedge their bets—Intel and Google doubling down on CPU-based inference, Big Tech signing nuclear power deals, and Meta surprising everyone by abandoning its open-source commitment for a proprietary model that crushes the competition. Meanwhile, Chinese AI continues its relentless advance with Zhipu AI's massive new open-source release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Intel and Google Forge Expanded Partnership to Double Down on AI-Optimized CPUs
&lt;/h2&gt;

&lt;p&gt;Intel and Google &lt;a href="https://www.reuters.com/business/intel-google-double-down-ai-cpus-with-expanded-partnership-2026-04-09/" rel="noopener noreferrer"&gt;announced an expanded partnership&lt;/a&gt; on April 9, 2026, signaling a significant strategic bet that CPU-based AI inference can offer a viable alternative to the GPU-dominated landscape that NVIDIA currently controls.&lt;/p&gt;

&lt;p&gt;The partnership focuses on developing AI-optimized CPU architectures specifically designed for inference workloads—the computationally intensive task of running trained models in production. While GPUs have dominated AI training and increasingly inference, the collaboration suggests both companies see an opportunity in the growing cost pressures facing enterprise AI deployments.&lt;/p&gt;

&lt;p&gt;The timing is notable. Data center operators are grappling with skyrocketing power consumption and GPU procurement costs that have become untenable for many organizations. CPUs, while historically slower for AI workloads, offer advantages in power efficiency, existing infrastructure compatibility, and procurement flexibility that enterprise customers increasingly value.&lt;/p&gt;

&lt;p&gt;For Intel, the partnership represents a lifeline in the AI hardware race where the company has struggled to compete with NVIDIA's dominance. For Google, which operates one of the world's largest inference infrastructures for services like Search and Gemini, diversifying beyond GPUs and its own TPUs makes strategic sense as AI becomes core to every product.&lt;/p&gt;

&lt;p&gt;The real question is whether optimized CPU inference can close the performance gap enough to matter for latency-sensitive applications—or whether this remains primarily a cost optimization play for batch workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Big Tech Pours Billions into Next-Gen Nuclear as AI Power Demands Explode
&lt;/h2&gt;

&lt;p&gt;The AI infrastructure buildout has collided head-on with physical reality: these systems need staggering amounts of electricity, and the tech industry is now &lt;a href="https://www.reuters.com/legal/litigation/big-tech-puts-financial-heft-behind-next-gen-nuclear-power-ai-demand-surges-2026-04-10/" rel="noopener noreferrer"&gt;putting financial heft behind next-gen nuclear power&lt;/a&gt; to secure it.&lt;/p&gt;

&lt;p&gt;Meta, xAI, and other hyperscalers are now targeting data center deployments requiring a combined 110 GW of power—roughly equivalent to the entire electricity consumption of Germany. NVIDIA CEO Jensen Huang has publicly estimated that each major AI infrastructure buildout requires a minimum $60 billion investment, a figure that doesn't even account for the long-term power generation infrastructure needed to sustain operations.&lt;/p&gt;

&lt;p&gt;Startups like Oklo are capitalizing on this demand, securing long-term power purchase agreements with tech customers desperate for clean, reliable baseload power. The attraction of nuclear is clear: unlike solar and wind, it provides consistent output regardless of weather, and next-generation small modular reactor designs promise faster deployment timelines than traditional nuclear plants.&lt;/p&gt;

&lt;p&gt;But the numbers remain daunting. As &lt;a href="https://www.reuters.com/commentary/breakingviews/ai-dreams-crash-into-stark-7-trln-reality-2026-04-07/" rel="noopener noreferrer"&gt;Reuters analysis notes&lt;/a&gt;, AI infrastructure ambitions are crashing into a $7 trillion reality check when accounting for sustainable power solutions. The gap between AI's appetite for compute and the planet's ability to power it sustainably is arguably the defining constraint of this technology era—one that neither algorithmic efficiency gains nor hardware improvements alone can solve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Meta Breaks from Open-Source Playbook with Proprietary Muse Spark Model
&lt;/h2&gt;

&lt;p&gt;In a move that caught the AI community off guard, Meta &lt;a href="https://venturebeat.com/technology/goodbye-llama-meta-launches-new-proprietary-ai-model-muse-spark-first-since" rel="noopener noreferrer"&gt;launched Muse Spark&lt;/a&gt;, its first proprietary model since the company committed to its open-weight Llama strategy. The shift represents a significant philosophical reversal for a company that had positioned open-source AI as a competitive moat against OpenAI and Google.&lt;/p&gt;

&lt;p&gt;The performance justification is substantial. Muse Spark achieved an Artificial Analysis Intelligence Index score of 52, nearly tripling the 18 scored by Llama 4 Maverick. On the CharXiv Reasoning benchmark, Muse Spark posted an 86.4, outperforming both Claude Opus 4.6's 65.3 and GPT-5.4's 82.8. These aren't incremental improvements—they suggest Meta has been holding back capabilities in its public releases.&lt;/p&gt;

&lt;p&gt;The timing provides context for the decision. Chinese models from Alibaba and DeepSeek now account for 41% of downloads on Hugging Face, effectively commoditizing the open-weights space that Meta pioneered. When your open-source strategy mainly benefits competitors with lower labor costs and fewer regulatory constraints, the calculus changes.&lt;/p&gt;

&lt;p&gt;The community reaction has been mixed. Some view this as a betrayal of Meta's open-source commitments; others see it as inevitable market maturation. What's undeniable is that Meta can no longer credibly position itself as the champion of AI democratization—and that the open-source vs. proprietary debate in AI is far more nuanced than partisans on either side admit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Zhipu AI's GLM-5.1 Sets New Open-Source Benchmark with 754B Parameters
&lt;/h2&gt;

&lt;p&gt;Chinese AI lab Zhipu AI released &lt;a href="https://venturebeat.com/technology/ai-joins-the-8-hour-work-day-as-glm-ships-5-1-open-source-llm-beating-opus-4" rel="noopener noreferrer"&gt;GLM-5.1&lt;/a&gt;, a 754 billion parameter model that establishes new benchmarks for open-source AI capabilities. The model includes a 202,752 token context window—large enough to process substantial codebases or document collections in a single pass.&lt;/p&gt;

&lt;p&gt;The benchmark results are striking. GLM-5.1 achieved 95.3 on AIME 2026, a rigorous mathematics assessment, and 68.7 on CyberGym, a cybersecurity evaluation spanning 1,507 tasks. Perhaps most impressively, the model passed what Zhipu calls "Scenario 3"—building a functional Linux-style desktop environment from scratch within an 8-hour timeframe, demonstrating sustained agentic capability over extended task horizons.&lt;/p&gt;

&lt;p&gt;For practitioners, the deployment story matters as much as the benchmarks. GLM-5.1 is available for local deployment via vLLM, SGLang, and xLLM frameworks, meaning organizations with sufficient hardware can run frontier-class capabilities entirely on-premises. This addresses data sovereignty and cost concerns that limit enterprise adoption of API-based models.&lt;/p&gt;

&lt;p&gt;The release comes as Zhipu AI IPO'd at a $52.83 billion valuation, reflecting investor confidence in Chinese AI development. For Western AI labs, GLM-5.1 represents yet another data point in an uncomfortable trend: the open-source lead they once held has definitively shifted east, with implications for talent flows, regulatory frameworks, and the geopolitics of AI development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stalking Victim's Lawsuit Against OpenAI Raises New Questions About AI Safety Guardrails
&lt;/h2&gt;

&lt;p&gt;A lawsuit filed against OpenAI by a stalking victim presents one of the most concrete tests yet of AI company liability when their products facilitate real-world harm. The victim claims ChatGPT fueled her abuser's delusional fixation and that the company ignored her direct warnings about the situation.&lt;/p&gt;

&lt;p&gt;The case highlights the gap between AI safety commitments—the red-teaming, the RLHF, the constitutional AI principles—and the actual mechanisms available to prevent harm when someone reports ongoing abuse. What processes exist for a victim to flag that an AI system is being weaponized against them? How do AI companies triage such reports against the millions of support tickets they receive? The lawsuit suggests these systems are either inadequate or non-existent.&lt;/p&gt;

&lt;p&gt;From a legal perspective, the case could establish precedent for when AI companies become liable for foreseeable misuse. Section 230 protections that shield platforms from user-generated content may not apply when a company has specific knowledge of harmful use and fails to act. The plaintiff's direct warnings to OpenAI—if documented—could prove pivotal.&lt;/p&gt;

&lt;p&gt;For AI companies, this lawsuit should prompt immediate review of their harm reporting mechanisms. The abstract safety research that dominates AI ethics discussions matters less than whether a stalking victim can effectively communicate that your product is being used to terrorize her—and whether anyone at your company is empowered to act on that information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;The theoretical foundations of agentic AI are rapidly solidifying into production-ready architectures. A comprehensive &lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;arXiv paper&lt;/a&gt; now formalizes the evolution toward orchestrated multi-agent systems, proposing a reference architecture that cleanly separates cognitive reasoning, hierarchical memory, typed tool invocation, and embedded governance layers.&lt;/p&gt;

&lt;p&gt;Multi-agent coordination patterns have reached standardization maturity. Analysis of frameworks including CAMEL, AutoGen, MetaGPT, LangGraph, Swarm, and MAKER reveals four dominant patterns: chain (sequential), star (hub-and-spoke), mesh (peer-to-peer), and explicit workflow graphs. The &lt;a href="https://huggingface.co/blog/Svngoku/agentic-coding-trends-2026" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends guide&lt;/a&gt; provides implementation details for each, noting that production deployments increasingly favor DAG-based task graphs with content-addressed artifacts for agent collaboration and audit trails.&lt;/p&gt;
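&lt;p&gt;A minimal sketch of the DAG-plus-content-addressed-artifacts pattern follows; it is not tied to any of the named frameworks, and the task bodies are placeholders:&lt;/p&gt;

```python
import hashlib
import json
from graphlib import TopologicalSorter

# Tasks declare dependencies, run in topological order, and publish results
# into a content-addressed store so any agent (or auditor) can retrieve an
# artifact by its hash -- the audit-trail property mentioned above.
store = {}

def publish(obj) -> str:
    blob = json.dumps(obj, sort_keys=True).encode()
    key = hashlib.sha256(blob).hexdigest()
    store[key] = obj
    return key

# node -> set of predecessor nodes (the explicit workflow graph)
graph = {"plan": set(), "research": {"plan"}, "write": {"plan", "research"}}
tasks = {
    "plan":     lambda deps: "outline",
    "research": lambda deps: f"notes on {deps['plan']}",
    "write":    lambda deps: f"draft from {deps['research']}",
}

refs = {}  # task name -> artifact hash
for name in TopologicalSorter(graph).static_order():
    deps = {d: store[refs[d]] for d in graph[name]}
    refs[name] = publish(tasks[name](deps))
```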

&lt;p&gt;On the tooling front, AWS Bedrock AgentCore has emerged as the enterprise-grade option, offering managed infrastructure for agent deployment at scale. CrewAI and LangGraph continue gaining traction for teams preferring role-based agent orchestration with more granular control. OpenAI's &lt;a href="https://openai.com/index/new-tools-for-building-agents/" rel="noopener noreferrer"&gt;Agents SDK&lt;/a&gt; (available in Python and TypeScript) has evolved to become provider-agnostic, with &lt;a href="https://developers.openai.com/blog/openai-for-developers-2025" rel="noopener noreferrer"&gt;documented paths&lt;/a&gt; for integrating non-OpenAI models—a notable concession to the multi-model reality of production systems.&lt;/p&gt;

&lt;p&gt;Research into &lt;a href="https://arxiv.org/html/2603.27075v1" rel="noopener noreferrer"&gt;agentic AI governance&lt;/a&gt; frameworks is accelerating, with particular focus on audit mechanisms for autonomous decision chains and liability attribution when agents operate across organizational boundaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Opens First Permanent London Office Amid UK Expansion Push
&lt;/h2&gt;

&lt;p&gt;OpenAI's decision to establish a permanent London office marks meaningful expansion beyond its San Francisco headquarters and signals serious intent in the European market. The move comes as UK enterprise demand for ChatGPT and API services has grown substantially, driven by financial services, healthcare, and government adoption.&lt;/p&gt;

&lt;p&gt;The timing coincides with ongoing regulatory uncertainty in the US market under the Trump administration, making geographic diversification strategically prudent. London offers access to European AI talent—particularly from universities like Imperial, Oxford, and Cambridge—without the regulatory complexity of establishing operations in EU member states post-Brexit.&lt;/p&gt;

&lt;p&gt;For UK enterprise customers, a local presence should mean faster sales cycles, easier procurement processes, and the relationship-building that remains essential for high-value B2B deals. It also positions OpenAI for potential UK government contracts that often require local presence or data residency.&lt;/p&gt;

&lt;p&gt;The office joins an increasingly competitive London AI scene that includes Google DeepMind's headquarters, Anthropic's growing European team, and numerous well-funded startups. Whether this creates a talent war that benefits workers or simply redistributes the same limited pool of experienced AI engineers remains to be seen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic Temporarily Bans OpenClaw Creator from Claude Access
&lt;/h2&gt;

&lt;p&gt;In an incident that underscores ongoing tensions between AI companies and the developer ecosystem building on their platforms, Anthropic temporarily blocked the creator of OpenClaw from accessing Claude. OpenClaw is a tool that programmatically interfaces with Claude's capabilities.&lt;/p&gt;

&lt;p&gt;The ban highlights the unclear boundaries of acceptable use for API customers. AI companies want developers building applications on their platforms—it's a significant revenue and ecosystem play—but grow concerned when those applications automate access in ways that stress infrastructure or circumvent rate limits. The challenge is that the line between "creative developer" and "problematic automation" often depends on scale and intent rather than technical implementation.&lt;/p&gt;

&lt;p&gt;This follows a broader pattern of AI companies tightening controls on programmatic access and wrapper applications. OpenAI has similarly cracked down on projects it views as competitive or abusive, creating uncertainty for developers investing in AI-dependent products.&lt;/p&gt;

&lt;p&gt;For the developer community, these incidents raise legitimate concerns about platform risk. Building a business on AI APIs means accepting that your access can be revoked with limited recourse or explanation. The OpenClaw creator's ban was temporary, but the precedent matters—and suggests developers should maintain fallback options across multiple providers wherever possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;The Intel-Google partnership will face its first real test when benchmark results for AI-optimized CPU inference emerge—expect performance comparisons against NVIDIA's latest within the quarter. The nuclear power agreements signal a 3-5 year buildout cycle that will determine whether AI scaling continues or hits hard physical limits. And Meta's proprietary pivot suggests the next Llama release may be significantly more restrictive, potentially fragmenting the open-source AI community that coalesced around previous versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/business/intel-google-double-down-ai-cpus-with-expanded-partnership-2026-04-09/" rel="noopener noreferrer"&gt;Intel and Google to double down on AI CPUs with expanded partnership&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/commentary/breakingviews/ai-dreams-crash-into-stark-7-trln-reality-2026-04-07/" rel="noopener noreferrer"&gt;AI dreams crash into stark $7 trln reality&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/legal/litigation/big-tech-puts-financial-heft-behind-next-gen-nuclear-power-ai-demand-surges-2026-04-10/" rel="noopener noreferrer"&gt;Big Tech puts financial heft behind next-gen nuclear power as AI demand surges&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://venturebeat.com/technology/goodbye-llama-meta-launches-new-proprietary-ai-model-muse-spark-first-since" rel="noopener noreferrer"&gt;Goodbye, Llama? Meta launches new proprietary AI model Muse Spark&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://venturebeat.com/technology/ai-joins-the-8-hour-work-day-as-glm-ships-5-1-open-source-llm-beating-opus-4" rel="noopener noreferrer"&gt;AI joins the 8-hour work day as GLM ships 5.1 open source LLM beating Opus 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;The Evolution of Agentic AI Software Architecture - arXiv&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/blog/Svngoku/agentic-coding-trends-2026" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends - Implementation Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.27075v1" rel="noopener noreferrer"&gt;How the Technical Mechanisms of Agentic AI Outpace Global Legal Frameworks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/new-tools-for-building-agents/" rel="noopener noreferrer"&gt;New tools for building agents - OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/blog/openai-for-developers-2025" rel="noopener noreferrer"&gt;OpenAI for Developers in 2025&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every week, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
    <item>
      <title>NVIDIA-Accelerated LangGraph — Parallel and Speculative Execution for Production Agents</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 06 Apr 2026 12:03:13 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/nvidia-accelerated-langgraph-parallel-and-speculative-execution-for-production-agents-4mg6</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/nvidia-accelerated-langgraph-parallel-and-speculative-execution-for-production-agents-4mg6</guid>
      <description>&lt;h1&gt;
  
  
  NVIDIA-Accelerated LangGraph — Parallel and Speculative Execution for Production Agents
&lt;/h1&gt;

&lt;p&gt;Your multi-step research agent takes 12 seconds to respond. Users are bouncing. You've optimized prompts, cached embeddings, and upgraded to faster models—yet the fundamental problem remains: sequential LLM calls compound latency in ways that no single-node optimization can fix. The &lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;LangChain-NVIDIA enterprise partnership&lt;/a&gt; announced in March 2026 addresses this head-on with compile-time execution strategies that analyze your graph structure and automatically parallelize independent operations. This isn't about writing faster code—it's about declaring your intent and letting the compiler find the optimal execution path.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Latency Problem in Multi-Step Agent Workflows
&lt;/h2&gt;

&lt;p&gt;Production agent systems rarely accomplish meaningful work with a single LLM call. A typical research agent might search the web, retrieve relevant documents, synthesize findings, evaluate completeness, and either iterate or produce a final answer. Each node in this workflow adds 500ms to 2 seconds of latency, depending on model size, context length, and inference provider. A five-node graph with one conditional loop easily hits 8-15 seconds—an eternity for interactive applications.&lt;/p&gt;

&lt;p&gt;The frustrating reality is that many of these operations could run simultaneously. Your web search doesn't depend on your document retrieval. Your conditional branches represent alternative futures that could both be computed before you know which path is correct. Yet traditional LangGraph execution respects the topological ordering of your graph, running nodes one after another even when the dependency structure allows parallelism.&lt;/p&gt;

&lt;p&gt;You might reach for &lt;code&gt;asyncio.gather()&lt;/code&gt; to manually parallelize, but this creates its own problems. State management becomes your responsibility. Reducer conflicts when parallel nodes write to the same state key need explicit handling. Rollback semantics for failed branches require careful coordination. The &lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;evolution of agentic AI architectures&lt;/a&gt; has highlighted that these orchestration concerns consume significant engineering effort that should instead go toward domain logic.&lt;/p&gt;
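&lt;p&gt;For comparison, here is what the manual approach looks like. The node bodies are stand-ins for real LLM or tool calls, and the merge loop at the end is exactly the responsibility the optimizer is meant to take over:&lt;/p&gt;

```python
import asyncio

# Manual fan-out with asyncio.gather: the two independent steps run
# concurrently, but merging their updates into shared state is your problem.
async def search_web(query: str) -> dict:
    await asyncio.sleep(0.1)               # simulated network latency
    return {"search_results": [f"web hit for {query}"]}

async def retrieve_documents(query: str) -> dict:
    await asyncio.sleep(0.1)
    return {"retrieved_docs": [f"doc about {query}"]}

async def run(query: str) -> dict:
    state = {"query": query}
    # Both coroutines run concurrently: roughly 0.1s total instead of 0.2s.
    updates = await asyncio.gather(search_web(query), retrieve_documents(query))
    for u in updates:                      # the merge step is now your code:
        for key, value in u.items():       # key conflicts, ordering, and
            state[key] = value             # rollback on failure are unhandled
    return state

state = asyncio.run(run("agent frameworks"))
```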

&lt;p&gt;The &lt;code&gt;langchain-nvidia&lt;/code&gt; package addresses this systematically. Rather than requiring you to restructure your graph or manually manage concurrency, it analyzes your StateGraph at compile time and produces an optimized execution plan. The promise: 40-60% latency reduction on complex graphs without touching your node logic or edge definitions.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the NVIDIA Execution Strategies Work Under the Hood
&lt;/h2&gt;

&lt;p&gt;When you compile a StateGraph with NVIDIA execution strategies enabled, the compiler performs a static analysis pass that builds a dependency DAG from your node and edge definitions. This analysis identifies which nodes read which state keys, which nodes write to which keys, and which edges create hard sequencing requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parallel execution&lt;/strong&gt; batches nodes that have no data dependencies between them. If your &lt;code&gt;search_web&lt;/code&gt; node only reads &lt;code&gt;query&lt;/code&gt; and writes &lt;code&gt;search_results&lt;/code&gt;, while your &lt;code&gt;retrieve_documents&lt;/code&gt; node reads &lt;code&gt;query&lt;/code&gt; and writes &lt;code&gt;retrieved_docs&lt;/code&gt;, these can execute concurrently—they touch disjoint portions of state. The compiler emits an execution plan that groups such nodes into parallel batches, using either asyncio coroutines or thread pools depending on your nodes' implementation.&lt;/p&gt;
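&lt;p&gt;The batching rule can be stated in a few lines. The sketch below hand-declares each node's read/write sets, which the real compiler would infer from the graph definition:&lt;/p&gt;

```python
# Two nodes can share a parallel batch when their write sets are disjoint and
# neither reads a key the other writes. Node metadata is hand-declared here.
nodes = {
    "search_web":         {"reads": {"query"}, "writes": {"search_results"}},
    "retrieve_documents": {"reads": {"query"}, "writes": {"retrieved_docs"}},
    "synthesize":         {"reads": {"search_results", "retrieved_docs"},
                           "writes": {"answer"}},
}

def independent(a: str, b: str) -> bool:
    na, nb = nodes[a], nodes[b]
    return (na["writes"].isdisjoint(nb["writes"])
            and na["writes"].isdisjoint(nb["reads"])
            and nb["writes"].isdisjoint(na["reads"]))
```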

&lt;p&gt;&lt;strong&gt;Speculative execution&lt;/strong&gt; goes further by running both branches of conditional edges before the routing function resolves. Consider a conditional edge that routes to either &lt;code&gt;continue_research&lt;/code&gt; or &lt;code&gt;generate_answer&lt;/code&gt; based on a quality check. Traditionally, you'd wait for the quality check, then invoke the selected branch. With speculation, both branches begin executing immediately. Once the routing function returns, the "wrong" branch is terminated and its state changes discarded.&lt;/p&gt;
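&lt;p&gt;A toy version of the mechanism, with illustrative names rather than the langchain-nvidia API: both branches start on snapshots of the state, and the losing branch is cancelled and its snapshot discarded once the router resolves:&lt;/p&gt;

```python
import asyncio
import copy

# Each branch gets its own deep-copied snapshot, so the cancelled branch's
# mutations never touch the surviving state.
async def continue_research(state):
    await asyncio.sleep(0.2)
    state["notes"] = state.get("notes", 0) + 1
    return state

async def generate_answer(state):
    await asyncio.sleep(0.2)
    state["answer"] = "done"
    return state

async def speculate(state, router):
    branches = {
        "continue_research": asyncio.create_task(
            continue_research(copy.deepcopy(state))),
        "generate_answer": asyncio.create_task(
            generate_answer(copy.deepcopy(state))),
    }
    choice = await router(state)           # branches keep running meanwhile
    for name, task in branches.items():
        if name != choice:
            task.cancel()                  # wrong branch: discard its snapshot
    return await branches[choice]

async def quality_router(state):
    await asyncio.sleep(0.1)               # stand-in for a quality-check call
    return "generate_answer"

state = asyncio.run(speculate({"query": "q"}, quality_router))
```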

&lt;p&gt;This differs fundamentally from naive &lt;code&gt;asyncio.gather()&lt;/code&gt; parallelization. The NVIDIA optimizer handles &lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;state merging automatically&lt;/a&gt;, applying reducers in dependency order even when writes arrive out of sequence. For speculative branches, it maintains state snapshots that can be rolled back without corrupting your primary state. Failed branches don't leave partial state mutations behind.&lt;/p&gt;

&lt;p&gt;Memory overhead is the primary trade-off. Speculative branches duplicate the entire state snapshot at branch entry. For graphs with large state objects—say, a &lt;code&gt;messages&lt;/code&gt; list containing hundreds of conversation turns—this duplication can be expensive. The compiler provides heuristics, but you may need to annotate branches where speculation isn't worth the memory cost.&lt;/p&gt;

&lt;p&gt;For full GPU acceleration, the optimizer integrates with &lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;NVIDIA NIM microservices&lt;/a&gt; to batch inference requests from parallel nodes. If you're running Nemotron models through NIM, multiple parallel LLM calls can be batched into a single GPU kernel launch, further reducing overhead. This is where the really dramatic speedups come from—not just concurrent execution, but fused inference at the hardware level.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enabling NVIDIA Optimizations in Your Existing LangGraph
&lt;/h2&gt;

&lt;p&gt;Getting started requires installing the &lt;code&gt;langchain-nvidia&lt;/code&gt; package alongside your existing LangGraph setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain-nvidia&amp;gt;&lt;span class="o"&gt;=&lt;/span&gt;0.2.0 langgraph&amp;gt;&lt;span class="o"&gt;=&lt;/span&gt;0.3.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're targeting full GPU acceleration (not just CPU-based parallelism), you'll also need CUDA 12.x and the NIM client libraries. For CPU-only environments, the optimizer falls back to thread pool parallelism—slower than GPU batching but still significantly faster than sequential execution.&lt;/p&gt;

&lt;p&gt;The integration surfaces through new keyword arguments on the &lt;code&gt;compile()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_nvidia&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NVIDIAExecutionStrategy&lt;/span&gt;

&lt;span class="c1"&gt;# Your existing graph definition
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retrieve_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;synthesize_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;should_continue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{...})&lt;/span&gt;

&lt;span class="c1"&gt;# Compile with NVIDIA optimizations
&lt;/span&gt;&lt;span class="n"&gt;compiled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;execution_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;NVIDIAExecutionStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARALLEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;speculative_branches&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;speculation_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="c1"&gt;# Max nested speculation levels
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler's auto-detection works well for most graphs, but sometimes state dependencies are implicit or dynamic. You can hint parallelizability with the &lt;code&gt;@independent&lt;/code&gt; decorator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_nvidia&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;independent&lt;/span&gt;

&lt;span class="nd"&gt;@independent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;writes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Explicitly declares state access pattern
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;search_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For conditional edges where one branch has side effects—database writes, external API calls with rate limits, or any non-idempotent operation—you can opt out of speculation per-edge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quality_check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;route_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_more&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_to_database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Has side effects
&lt;/span&gt;    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;speculative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Debugging the execution plan is crucial for understanding what the optimizer actually produces. The &lt;code&gt;explain_execution_plan()&lt;/code&gt; method prints a human-readable DAG with timing estimates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;compiled&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;explain_execution_plan&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output:
# Batch 1 (parallel): [search_node, retrieve_node] ~800ms
# Batch 2 (sequential): [synthesize_node] ~1200ms
# Conditional (speculative): [research_more | final_answer] ~600ms (one discarded)
# Estimated total: 2600ms (vs 4800ms sequential)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One critical gotcha: speculative execution and LangGraph's &lt;code&gt;interrupt()&lt;/code&gt; mechanism don't mix. If a node might raise an interrupt to request human input, that node and all nodes depending on its output must execute sequentially. The compiler enforces this, emitting a warning when it detects potential conflicts.&lt;/p&gt;
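
&lt;p&gt;The rule amounts to a reachability computation over the graph: mark every interrupt-capable node, then mark everything downstream of it. Here's a minimal sketch of that constraint—&lt;code&gt;sequential_nodes&lt;/code&gt; and the edge map are my own illustration, not compiler internals:&lt;br&gt;
&lt;/p&gt;

```python
# Minimal sketch of the constraint: a node that may call interrupt(), plus
# every node reachable downstream of it, must be scheduled sequentially.
def sequential_nodes(edges: dict, interrupting: set) -> set:
    forced, stack = set(), list(interrupting)
    while stack:
        node = stack.pop()
        if node in forced:
            continue
        forced.add(node)
        stack.extend(edges.get(node, []))  # propagate downstream
    return forced

edges = {
    "review": ["apply_fix"],   # review may pause for human approval
    "apply_fix": ["deploy"],
    "search": [],              # independent of review: still speculatable
}
print(sorted(sequential_nodes(edges, {"review"})))
# ['apply_fix', 'deploy', 'review']
```

&lt;p&gt;Note that &lt;code&gt;search&lt;/code&gt; stays eligible for speculation—only the interrupt-capable chain is forced sequential, which is why isolating human-in-the-loop steps in their own subgraph preserves most of the parallelism.&lt;/p&gt;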

&lt;h2&gt;
  
  
  Hands-On: Code Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's build a multi-tool research agent that demonstrates both parallel and speculative execution. This agent searches the web, retrieves documents from a vector store, synthesizes findings, and conditionally loops back for more research or produces a final answer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Research Agent with NVIDIA-Optimized Execution
Demonstrates parallel node batching and speculative conditional execution
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.checkpoint.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MemorySaver&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_nvidia&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NVIDIAExecutionStrategy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;independent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatAnthropic&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TavilySearchResults&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt;


&lt;span class="c1"&gt;# Define agent state with explicit reducers for parallel safety
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Reducer handles parallel writes
&lt;/span&gt;    &lt;span class="n"&gt;retrieved_docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;synthesis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;iteration_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;final_answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;


&lt;span class="c1"&gt;# Initialize components
&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;search_tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TavilySearchResults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# Node 1: Web Search (parallelizable with document retrieval)
&lt;/span&gt;&lt;span class="nd"&gt;@independent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;writes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Searches the web for relevant information.
    Marked @independent because it only reads &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; and writes &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,
    allowing parallel execution with other nodes that don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t touch these keys.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;search_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# Extract content strings from search results
&lt;/span&gt;    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# Node 2: Document Retrieval (parallelizable with web search)
&lt;/span&gt;&lt;span class="nd"&gt;@independent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;writes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieved_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Retrieves relevant documents from vector store.
    Runs in parallel with search_web since they access disjoint state keys.
    In production, this would query Pinecone/Chroma/etc.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Simulated retrieval - replace with actual vector store call
&lt;/span&gt;    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Simulate retrieval latency
&lt;/span&gt;    &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retrieved context for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                 &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector_store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieved_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# Node 3: Synthesis (depends on search_results and retrieved_docs)
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;synthesize_findings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Synthesizes all gathered information into a coherent summary.
    Must run after parallel nodes complete since it reads their outputs.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Combine all sources
&lt;/span&gt;    &lt;span class="n"&gt;all_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;doc_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieved_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Based on the following research for query &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:

Web Search Results:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;all_context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Retrieved Documents:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Provide a synthesis of the key findings. Note any gaps that require further research.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# Node 4: Continue Research (speculative branch - may be discarded)
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;continue_research&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Generates a refined query for additional research.
    This node runs speculatively alongside generate_answer.
    If routing selects generate_answer, this branch is discarded.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;The current synthesis has gaps: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Generate a refined search query to fill these gaps.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;# Updates query for next iteration
&lt;/span&gt;

&lt;span class="c1"&gt;# Node 5: Generate Final Answer (speculative branch - may be discarded)
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Produces the final answer from synthesized research.
    Runs speculatively - discarded if routing continues research instead.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Based on this research synthesis:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Provide a comprehensive final answer to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# Routing function for conditional edge
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;should_continue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Determines whether research is sufficient or needs another iteration.
    Both branches execute speculatively before this function returns.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Simple heuristic: max 3 iterations, or check synthesis quality
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# In production, use LLM to evaluate synthesis completeness
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;further research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="c1"&gt;# Build the graph
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_research_graph&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add nodes
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retrieve_documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;synthesize_findings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue_research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;continue_research&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;generate_answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add edges - search and retrieve can run in parallel
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Both start from START = parallel
&lt;/span&gt;
    &lt;span class="c1"&gt;# Both must complete before synthesis
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Conditional edge with speculative execution
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;should_continue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue_research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Loop back or end
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue_research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;


&lt;span class="c1"&gt;# Compile with NVIDIA optimizations
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_research_graph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Baseline compilation (sequential execution)
&lt;/span&gt;&lt;span class="n"&gt;baseline_compiled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MemorySaver&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# Optimized compilation with parallel + speculative execution
&lt;/span&gt;&lt;span class="n"&gt;optimized_compiled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MemorySaver&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;execution_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;NVIDIAExecutionStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARALLEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;speculative_branches&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;speculation_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# Benchmarking harness
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;compiled_graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Measures end-to-end latency across multiple runs.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;latencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;configurable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;compiled_graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the latest advances in quantum computing?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieved_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
        &lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;avg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p50&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;p95&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: avg=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;avg&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s, p50=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p50&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s, p95=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p95&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;latencies&lt;/span&gt;


&lt;span class="c1"&gt;# Run comparison
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Print execution plan for optimized graph
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== Optimized Execution Plan ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized_compiled&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;explain_execution_plan&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Run benchmarks
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== Benchmark Results (50 runs each) ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;baseline_latencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseline_compiled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Baseline&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;optimized_latencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized_compiled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Optimized&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate improvement
&lt;/span&gt;    &lt;span class="n"&gt;baseline_avg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseline_latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseline_latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;optimized_avg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized_latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized_latencies&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;improvement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseline_avg&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;optimized_avg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;baseline_avg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Latency reduction: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;improvement&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this benchmark, you'll see the execution plan clearly showing how &lt;code&gt;search_web&lt;/code&gt; and &lt;code&gt;retrieve_docs&lt;/code&gt; batch together in a parallel group, followed by &lt;code&gt;synthesize&lt;/code&gt;, then the speculative conditional where both &lt;code&gt;continue_research&lt;/code&gt; and &lt;code&gt;generate_answer&lt;/code&gt; start simultaneously. In &lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;LangSmith traces&lt;/a&gt;, the parallel batch appears as overlapping spans rather than sequential blocks—a visual confirmation that optimization is working.&lt;/p&gt;

&lt;p&gt;Expected results on this five-node graph with one conditional branch: approximately 45% latency reduction compared to baseline sequential execution. The exact improvement depends on your inference provider's concurrency support and network conditions.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use (and When to Avoid) Speculative Execution
&lt;/h2&gt;

&lt;p&gt;Speculative execution is powerful but not free. Every speculative branch consumes compute resources—tokens, GPU cycles, API calls—for work that may be discarded. Understanding when speculation pays off is critical for production deployments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ideal scenarios for speculation:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Read-only branches perform well because discarding them has no lasting effects. A branch that only runs inference and updates local state can be safely abandoned mid-execution. Similarly, idempotent operations—where running twice produces the same result as running once—are safe to speculate because even if partial work persists, it doesn't corrupt your system.&lt;/p&gt;

&lt;p&gt;Low-cost nodes are obvious candidates. If both branches of a conditional complete in under 500ms, speculating costs you at most 500ms of parallel compute. When routing takes 200ms to resolve, you've paid a small premium for potentially significant latency reduction.&lt;/p&gt;

&lt;p&gt;Cached or pre-computed results make speculation nearly free. If one branch just reads from a cache while the other runs full inference, speculating on the cached branch adds negligible overhead.&lt;/p&gt;
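&lt;p&gt;The criteria above can be made concrete with a small sketch. The two node functions below are illustrative stand-ins (not from the framework): the first is read-only and idempotent, so a discarded speculative run leaves no trace; the second simulates an external write, which is exactly the kind of branch that must never run speculatively.&lt;/p&gt;

```python
# Sketch: classifying nodes by speculation safety (hypothetical node functions).

# Read-only, idempotent node: discarding its result has no lasting effect,
# and running it twice yields the same answer as running it once.
def summarize_cached(state: dict) -> dict:
    cache = state.get("cache", {})
    summary = cache.get(state["query"], "no cached summary")
    return {"synthesis": summary}

# Side-effecting node: even a discarded speculative run has already
# performed the external write. Branches like this stay non-speculative.
def charge_customer(state: dict) -> dict:
    state.setdefault("charges", []).append(state["amount"])  # stand-in for a billing API call
    return {"receipt": f"charged {state['amount']}"}

# Idempotency check: the read-only node is safe to re-run or abandon.
s = {"query": "q1", "cache": {"q1": "summary of q1"}}
assert summarize_cached(s) == summarize_cached(s)

# The side-effecting node is not: each run appends another charge.
s2 = {"amount": 10}
charge_customer(s2)
charge_customer(s2)
assert s2["charges"] == [10, 10]  # two runs, two charges
```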

&lt;p&gt;&lt;strong&gt;Scenarios where speculation hurts:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;External writes are the biggest red flag. A branch that writes to a database, sends an email, or calls a billing API should never run speculatively. Even if you "discard" the branch result, the side effect has already occurred. The framework can't undo your Stripe charge.&lt;/p&gt;

&lt;p&gt;API rate limits compound the problem. If you're calling a third-party API with per-minute quotas, speculative branches double your request rate on conditional paths. As &lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;agentic AI architectures have evolved&lt;/a&gt;, practitioners have learned that rate limit exhaustion often manifests as cascading failures rather than graceful degradation.&lt;/p&gt;

&lt;p&gt;Vastly asymmetric branch costs make speculation inefficient. If one branch takes 200ms and the other takes 5 seconds, speculating on the expensive branch when the cheap branch would have been selected wastes 5 seconds of compute. The framework can't predict routing outcomes, so it speculates on both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost analysis template:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ROI calculation for speculative execution on a conditional edge
&lt;/span&gt;&lt;span class="n"&gt;cheap_branch_cost_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="n"&gt;expensive_branch_cost_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;
&lt;span class="n"&gt;routing_probability_cheap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;  &lt;span class="c1"&gt;# 70% of traffic takes cheap branch
&lt;/span&gt;&lt;span class="n"&gt;routing_probability_expensive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;

&lt;span class="c1"&gt;# Without speculation: expected tokens per request
&lt;/span&gt;&lt;span class="n"&gt;baseline_expected_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;cheap_branch_cost_tokens&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;routing_probability_cheap&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="n"&gt;expensive_branch_cost_tokens&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;routing_probability_expensive&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# = 500 * 0.7 + 3000 * 0.3 = 350 + 900 = 1250 tokens
&lt;/span&gt;
&lt;span class="c1"&gt;# With speculation: always pay for both branches
&lt;/span&gt;&lt;span class="n"&gt;speculative_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cheap_branch_cost_tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;expensive_branch_cost_tokens&lt;/span&gt;
&lt;span class="c1"&gt;# = 500 + 3000 = 3500 tokens
&lt;/span&gt;
&lt;span class="c1"&gt;# Speculation costs 2.8x more tokens
# Only worth it if latency reduction value exceeds 2.8x token cost increase
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Hybrid approaches work best.&lt;/strong&gt; Mark expensive or side-effect-producing branches as non-speculative while allowing cheap, read-only branches to speculate freely. The per-edge &lt;code&gt;speculative&lt;/code&gt; parameter gives you this granularity.&lt;/p&gt;
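&lt;p&gt;One way to organize such a hybrid policy is a per-edge lookup that defaults to non-speculative, so any edge you forget to classify fails safe. This is a framework-agnostic sketch, not the framework's own API; the edge names echo the research graph above, and &lt;code&gt;charge_customer&lt;/code&gt; is a hypothetical side-effecting node.&lt;/p&gt;

```python
# Sketch: a hybrid speculation policy keyed by (source, destination) edge.
SPECULATION_POLICY = {
    # Cheap, read-only branches: speculate freely.
    ("synthesize", "generate_answer"): True,
    ("synthesize", "continue_research"): True,
    # Side-effect-producing or expensive branches: never speculate.
    ("approve", "charge_customer"): False,
}

def may_speculate(src: str, dst: str, policy=SPECULATION_POLICY) -> bool:
    # Unlisted edges default to non-speculative: fail safe.
    return policy.get((src, dst), False)

assert may_speculate("synthesize", "generate_answer") is True
assert may_speculate("approve", "charge_customer") is False
assert may_speculate("unknown", "edge") is False  # unlisted edges stay safe
```

&lt;p&gt;Defaulting to &lt;code&gt;False&lt;/code&gt; for unlisted edges matters: a newly added side-effecting branch should require an explicit opt-in before it ever runs speculatively.&lt;/p&gt;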

&lt;p&gt;One subtle interaction: &lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;LangGraph's checkpointing mechanism&lt;/a&gt; doesn't persist speculative branch state until routing resolves. This is usually what you want—failed speculation shouldn't pollute your checkpoint history. However, it means you can't resume from a mid-speculation checkpoint if the process crashes. For long-running agents where crash recovery matters, consider whether speculation's latency benefits outweigh the recovery complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;The NVIDIA execution optimization represents a philosophical shift in how we build agent systems. Instead of meticulously hand-tuning async boundaries and managing concurrent state, you declare your node dependencies and let the compiler find parallelism. This is the same trajectory that took SQL from procedural cursor loops to declarative queries optimized by the database engine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate wins for existing deployments:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you have production LangGraph agents today, you can often get meaningful speedups without any refactoring. Install &lt;code&gt;langchain-nvidia&lt;/code&gt;, add the execution strategy flags to your &lt;code&gt;compile()&lt;/code&gt; call, and run &lt;code&gt;explain_execution_plan()&lt;/code&gt; to see what the optimizer finds. Many real-world graphs have latent parallelism—nodes that happen to be defined sequentially but don't actually depend on each other's outputs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.langchain.com/customers-kensho/" rel="noopener noreferrer"&gt;Kensho's multi-agent framework&lt;/a&gt; is a good case study here. Their financial research agents had multiple data-gathering nodes that were functionally independent but executed sequentially due to how the graph was originally authored. Adding parallel execution dropped their median latency by 38% with zero changes to node implementations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment considerations:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Full GPU acceleration requires &lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;NVIDIA NIM microservices&lt;/a&gt; running somewhere your agents can reach them—either self-hosted on GPU instances or via NVIDIA's cloud offerings. This adds infrastructure complexity but enables inference batching that further compounds the parallel execution benefits.&lt;/p&gt;

&lt;p&gt;For teams not ready to operate NIM infrastructure, CPU-only fallback still provides parallel execution via thread pools. You lose the GPU batching speedups but retain the concurrent node execution benefits. This is a reasonable starting point for evaluation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost-benefit for cloud deployments:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Parallel execution increases your instantaneous compute footprint. Instead of one inference call at a time, you might have three or four. On cloud GPU instances priced by the hour, this doesn't increase cost—you're already paying for the GPU. On API-based providers priced per token, parallel execution is cost-neutral (same tokens, just concurrent). Speculative execution, however, genuinely increases token spend on conditional paths.&lt;/p&gt;

&lt;p&gt;Run the numbers for your specific traffic patterns. A 45% latency reduction might justify a 20% increase in token costs for latency-sensitive applications. For batch processing where latency doesn't matter, speculation may not make sense.&lt;/p&gt;
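&lt;p&gt;That trade-off reduces to a simple break-even check. The function below is a sketch of the calculation, not part of any library; &lt;code&gt;value_per_saved_second&lt;/code&gt; is a hypothetical input you'd estimate for your own application.&lt;/p&gt;

```python
# Sketch: break-even check for enabling speculation, using the article's
# illustrative figures (45% latency reduction, 20% token cost increase).

def speculation_worth_it(baseline_latency_s: float,
                         latency_reduction: float,
                         value_per_saved_second: float,
                         baseline_token_cost: float,
                         token_cost_increase: float) -> bool:
    """True when the value of the saved latency exceeds the extra token spend."""
    saved_value = baseline_latency_s * latency_reduction * value_per_saved_second
    extra_cost = baseline_token_cost * token_cost_increase
    return saved_value > extra_cost

# Latency-sensitive app: each saved second is worth $0.01 per request.
assert speculation_worth_it(10.0, 0.45, 0.01, 0.05, 0.20) is True   # $0.045 vs $0.01
# Batch job: saved latency is worth nothing, so speculation never pays.
assert speculation_worth_it(10.0, 0.45, 0.0, 0.05, 0.20) is False
```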

&lt;p&gt;&lt;strong&gt;Migration path:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with parallel execution only (&lt;code&gt;speculative_branches=False&lt;/code&gt;). This is nearly always safe and provides immediate benefits. Monitor your &lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;LangSmith traces&lt;/a&gt; for the parallel execution patterns, validate that state merging works correctly for your reducers, and measure the actual latency improvement.&lt;/p&gt;

&lt;p&gt;Once comfortable, enable speculative execution on specific conditional edges where both branches are cheap and side-effect-free. Use the per-edge &lt;code&gt;speculative&lt;/code&gt; parameter rather than the global flag. This incremental approach lets you learn where speculation helps in your specific workloads.&lt;/p&gt;
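&lt;p&gt;In code, that incremental opt-in might look like the sketch below. The &lt;code&gt;speculative&lt;/code&gt; keyword mirrors the per-edge parameter described above; treat the exact signature as an assumption rather than settled LangGraph API:&lt;/p&gt;

&lt;div class="highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch only: `speculative=` follows the per-edge flag discussed above
# and is an assumed API shape, not confirmed LangGraph code.
graph.add_conditional_edges(
    "classify",                # source node
    route_ticket,              # routing function
    {"respond": "draft_response", "escalate": "notify_human"},
    speculative=True,          # opt in on this edge only
)
# Leave every other conditional edge at the default (no speculation)
# until traces show its branches are cheap and side-effect-free.
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;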

&lt;p&gt;&lt;strong&gt;New observability metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LangSmith's integration with the NVIDIA execution strategies surfaces two new metrics worth monitoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speculative waste ratio&lt;/strong&gt;: Percentage of speculative branch compute that gets discarded. High values (&amp;gt;50%) suggest your speculation targets are poorly chosen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel efficiency score&lt;/strong&gt;: Ratio of achieved parallelism to theoretical maximum. Low values indicate state dependencies you might be able to refactor away.&lt;/li&gt;
&lt;/ul&gt;
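&lt;p&gt;Both metrics are easy to compute from your own trace data while waiting for native dashboards. The record shape below is an illustrative assumption, not LangSmith's actual schema:&lt;/p&gt;

&lt;div class="highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Compute both metrics from per-branch trace records.
# The dict shape here is an illustrative assumption, not LangSmith's schema.

def speculative_waste_ratio(branches: list[dict]) -&amp;gt; float:
    """Fraction of speculative compute (in seconds) that was discarded."""
    spec = [b for b in branches if b["speculative"]]
    total = sum(b["compute_s"] for b in spec)
    wasted = sum(b["compute_s"] for b in spec if b["discarded"])
    return wasted / total if total else 0.0

def parallel_efficiency(achieved: float, theoretical_max: float) -&amp;gt; float:
    """Ratio of achieved parallelism to the theoretical maximum."""
    return achieved / theoretical_max if theoretical_max else 0.0

traces = [
    {"speculative": True, "compute_s": 2.0, "discarded": True},
    {"speculative": True, "compute_s": 6.0, "discarded": False},
    {"speculative": False, "compute_s": 3.0, "discarded": False},
]
print(speculative_waste_ratio(traces))  # 0.25: well under the 50% warning line
print(parallel_efficiency(2.6, 4.0))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;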

&lt;p&gt;Looking ahead, the &lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;LangChain roadmap&lt;/a&gt; hints at auto-tuning capabilities that learn optimal speculation targets from production traces. The vision: your agent framework observes which branches win routing decisions, estimates branch costs, and automatically adjusts speculation settings to minimize expected latency given observed traffic patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Build This Week
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Project: Latency-Optimized Customer Support Agent&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Build a customer support agent that handles incoming tickets by: (1) classifying the ticket category, (2) searching knowledge base documentation, (3) retrieving similar past tickets, and (4) either generating a draft response or escalating to a human—with the escalation/response decision made via conditional routing.&lt;/p&gt;

&lt;p&gt;Implementation steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define a StateGraph with four main nodes plus the conditional branch&lt;/li&gt;
&lt;li&gt;Annotate the knowledge base search and past ticket retrieval nodes as &lt;code&gt;@independent&lt;/code&gt;—they read the same input but write to different state keys&lt;/li&gt;
&lt;li&gt;Mark the "escalate to human" branch as &lt;code&gt;speculative=False&lt;/code&gt; since escalation triggers external notifications&lt;/li&gt;
&lt;li&gt;Allow the "draft response" branch to speculate since it's a pure LLM call&lt;/li&gt;
&lt;li&gt;Implement &lt;code&gt;explain_execution_plan()&lt;/code&gt; logging to verify optimization&lt;/li&gt;
&lt;li&gt;Build a test harness that submits 100 tickets and compares baseline vs. optimized latency distributions&lt;/li&gt;
&lt;li&gt;Calculate your actual ROI: latency reduction vs. token cost increase from speculation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Target metrics: 35-50% latency reduction on the happy path (response generation), with escalation paths unaffected by speculation. If your support tickets have an 80/20 response-to-escalation ratio, speculation should have positive ROI even with some wasted compute on the 20% that escalate.&lt;/p&gt;
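&lt;p&gt;Step 6's harness can be prototyped before wiring in real nodes. A minimal simulation of the happy path, assuming illustrative per-node latencies:&lt;/p&gt;

&lt;div class="highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Simulate baseline (sequential) vs. optimized latency for the happy path.
# Per-node latencies are illustrative assumptions, not measurements.

NODE_LATENCY_S = {
    "classify": 0.8,
    "kb_search": 2.0,       # independent of past_tickets
    "past_tickets": 1.5,    # independent of kb_search
    "draft_response": 3.0,
}

def baseline_latency() -&amp;gt; float:
    """All four nodes run sequentially."""
    return sum(NODE_LATENCY_S.values())

def optimized_latency() -&amp;gt; float:
    """The two independent retrieval nodes run concurrently, so their
    combined wall time is the max, not the sum."""
    parallel = max(NODE_LATENCY_S["kb_search"], NODE_LATENCY_S["past_tickets"])
    return NODE_LATENCY_S["classify"] + parallel + NODE_LATENCY_S["draft_response"]

base, opt = baseline_latency(), optimized_latency()
print(f"baseline {base:.1f}s, optimized {opt:.1f}s, "
      f"{(1 - opt / base) * 100:.0f}% reduction from parallelism alone")
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This isolates the parallel-execution contribution; speculating on the draft-response branch would shave additional time off the conditional hop.&lt;/p&gt;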

&lt;p&gt;This project directly applies to production use cases and gives you hands-on experience with the optimization trade-offs before deploying to real traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/nvidia-enterprise/" rel="noopener noreferrer"&gt;LangChain Announces Enterprise Agentic AI Platform Built ...&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/march-2026-langchain-newsletter/" rel="noopener noreferrer"&gt;March 2026: LangChain Newsletter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/on-agent-frameworks-and-agent-observability/" rel="noopener noreferrer"&gt;On Agent Frameworks and Agent Observability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/customers-kensho/" rel="noopener noreferrer"&gt;How Kensho built a multi-agent framework with LangGraph ...&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;The Evolution of Agentic AI Software Architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This is part of the &lt;strong&gt;Agentic Engineering Weekly&lt;/strong&gt; series — a deep-dive every Monday into the frameworks, patterns, and techniques shaping the next generation of AI systems.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Agentic Engineering Weekly series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building something agentic? Drop a comment — I'd love to feature reader projects.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>Primitive Shifts: The Async Task Primitive</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 06 Apr 2026 12:02:23 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/primitive-shifts-the-async-task-primitive-1aih</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/primitive-shifts-the-async-task-primitive-1aih</guid>
      <description>&lt;h1&gt;
  
  
  Primitive Shifts: The Async Task Primitive
&lt;/h1&gt;

&lt;p&gt;Every few months, the baseline of how AI systems work quietly moves. Engineers who noticed early weren't smarter — they were just paying attention to the right signals. The engineers who saw containerization coming didn't predict Docker's dominance; they noticed deployment friction patterns that VMs couldn't solve. The ones who caught the serverless shift weren't visionaries; they were tired of capacity planning for bursty workloads. Right now, the same kind of quiet shift is happening in agent orchestration — and if you're still building synchronous agent calls, you're about to feel the floor move.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is It?
&lt;/h2&gt;

&lt;p&gt;The shift is from synchronous agent invocations — where you call an agent, block, and wait for a response — to &lt;strong&gt;task-based asynchronous primitives&lt;/strong&gt; with first-class status tracking, timeout budgets, and structured resumption. The core pattern: agent calls return a task identifier immediately, the client polls or subscribes for status updates, and the agent can run for minutes to hours without blocking the caller.&lt;/p&gt;

&lt;p&gt;This isn't just "background jobs for AI." The &lt;a href="https://arxiv.org/html/2603.13417" rel="noopener noreferrer"&gt;design patterns research for deploying AI agents&lt;/a&gt; describes protocol-level primitives that are emerging specifically for agent orchestration: task TTLs (typically 15 minutes for orphaned tasks), adaptive timeout budgeting that propagates through multi-agent hierarchies, and structured error semantics that go far beyond HTTP status codes. When a task fails, you don't just get a 500 — you get typed failures like &lt;code&gt;budget_exhausted&lt;/code&gt;, &lt;code&gt;upstream_unavailable&lt;/code&gt;, or &lt;code&gt;partial_completion&lt;/code&gt; with resumable checkpoints.&lt;/p&gt;

&lt;p&gt;AWS Bedrock AgentCore already ships this pattern in production. The MCP specification is converging toward what's being called the "tasks pattern" as the standard approach for non-trivial agent interactions. But the key insight is architectural: in multi-agent systems where planner agents invoke specialist agents, synchronous blocking is &lt;a href="https://arxiv.org/html/2603.13417" rel="noopener noreferrer"&gt;guaranteed to fail against planner timeout budgets&lt;/a&gt;. The planner has 10 minutes to coordinate five specialists. If any specialist blocks for 8 minutes, the planner can't aggregate results before its own deadline. The math doesn't work.&lt;/p&gt;
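&lt;p&gt;That failure mode is worth making concrete. A sketch of the budget check implied by the example above, with illustrative specialist runtimes:&lt;/p&gt;

&lt;div class="highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Budget math for the scenario above: a 10-minute planner coordinating
# five specialists, one of which runs for roughly 8 minutes.
# Runtimes are illustrative assumptions.

PLANNER_BUDGET_S = 600.0        # 10-minute planner deadline
AGGREGATION_RESERVE_S = 60.0    # time reserved to merge specialist results
specialist_runtimes_s = [480.0, 120.0, 90.0, 60.0, 45.0]

# Sequential blocking calls: total runtime is the sum of all specialists.
sequential_total = sum(specialist_runtimes_s)
# Concurrent task dispatch: wall time is the slowest specialist.
concurrent_total = max(specialist_runtimes_s)

usable = PLANNER_BUDGET_S - AGGREGATION_RESERVE_S
print(sequential_total &amp;lt;= usable)  # False: 795s against a 540s window
print(concurrent_total &amp;lt;= usable)  # True: 480s fits with room to aggregate
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The same five specialists that can never complete sequentially fit comfortably once the planner dispatches them as concurrent tasks and only waits on the slowest.&lt;/p&gt;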

&lt;p&gt;The &lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;agentic AI architecture research&lt;/a&gt; frames this as moving from "orchestration through blocking calls" to "orchestration through task lifecycle management." It's the difference between a conductor waiting for each musician to finish before cueing the next, versus a conductor who starts all sections and monitors their progress in parallel.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It's Flying Under the Radar
&lt;/h2&gt;

&lt;p&gt;Most agent tutorials and demos still show synchronous patterns that work perfectly for 30-second completions. The canonical example — "build an agent that searches the web and summarizes results" — completes in under a minute. The tutorials work. The YouTube videos work. The blog posts work. And engineers reasonably conclude that their production systems should look like the tutorials.&lt;/p&gt;

&lt;p&gt;But &lt;a href="https://www.anthropic.com/news/measuring-agent-autonomy" rel="noopener noreferrer"&gt;Anthropic's data on agent autonomy&lt;/a&gt; shows that Claude Code's 99th percentile turn durations have nearly doubled — from around 25 minutes to over 45 minutes — as the system handles increasingly complex tasks. The &lt;a href="https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends Report&lt;/a&gt; confirms this trajectory: as agents take on more autonomous work, session lengths stretch. The ceiling you're building against isn't the 45-second median; it's the 45-minute tail that's growing faster than most architectures can accommodate.&lt;/p&gt;

&lt;p&gt;Engineers familiar with background job patterns — Celery, Sidekiq, Bull — assume they can retrofit existing infrastructure. But agent tasks need &lt;strong&gt;identity propagation&lt;/strong&gt; (who initiated this multi-hop request?), &lt;strong&gt;context-scoped routing&lt;/strong&gt; (which agent instance has the conversation state?), and &lt;strong&gt;structured continuation&lt;/strong&gt; (how do we resume from a checkpoint with partial results?). Traditional job queues give you "run this function later." Agent tasks need "run this function later, as this user, with this context, reporting status to this callback, resuming from this state if interrupted, and inheriting this timeout budget."&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://arxiv.org/html/2603.25100v1" rel="noopener noreferrer"&gt;research on separation of execution power in AI systems&lt;/a&gt; describes this as a fundamental architectural distinction: agent tasks aren't background jobs with extra metadata — they're a different primitive with different lifecycle guarantees.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hands-On: Try It Today
&lt;/h2&gt;

&lt;p&gt;The simplest starting point is implementing the MCP tasks pattern in your existing agent infrastructure. Here's a minimal Python implementation showing the core primitives — task creation, status polling, timeout budgeting, and structured errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# task_primitives.py
# A minimal implementation of async task primitives for agent orchestration
# Requires: pip install pydantic&amp;gt;=2.0 fastapi uvicorn
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid4&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;PENDING&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;           &lt;span class="c1"&gt;# Created, not yet started
&lt;/span&gt;    &lt;span class="n"&gt;RUNNING&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;running&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;           &lt;span class="c1"&gt;# Currently executing
&lt;/span&gt;    &lt;span class="n"&gt;COMPLETED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;       &lt;span class="c1"&gt;# Finished successfully
&lt;/span&gt;    &lt;span class="n"&gt;FAILED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;            &lt;span class="c1"&gt;# Finished with error
&lt;/span&gt;    &lt;span class="n"&gt;TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="c1"&gt;# Budget exhausted
&lt;/span&gt;    &lt;span class="n"&gt;PARTIAL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;partial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="c1"&gt;# Partial completion with checkpoint
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ErrorType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Structured error types beyond generic failures&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;BUDGET_EXHAUSTED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget_exhausted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;      &lt;span class="c1"&gt;# Timeout budget depleted
&lt;/span&gt;    &lt;span class="n"&gt;UPSTREAM_UNAVAILABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;upstream_unavailable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Dependency failed
&lt;/span&gt;    &lt;span class="n"&gt;PARTIAL_COMPLETION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;partial_completion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Can resume from checkpoint
&lt;/span&gt;    &lt;span class="n"&gt;CONTEXT_LOST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context_lost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;             &lt;span class="c1"&gt;# State invalidated
&lt;/span&gt;
&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TaskCheckpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Resumable state for partial completions&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;step_completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;intermediate_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;remaining_work&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt; 
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Core task primitive with timeout budgeting&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
    &lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;  &lt;span class="c1"&gt;# Time remaining for this task
&lt;/span&gt;    &lt;span class="n"&gt;parent_task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# For hierarchical tracking
&lt;/span&gt;    &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TaskCheckpoint&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ErrorType&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;error_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="c1"&gt;# TTL for orphaned task cleanup (default 15 minutes per MCP pattern)
&lt;/span&gt;    &lt;span class="n"&gt;ttl_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;900.0&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Calculate remaining timeout budget&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;child_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reserve_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;30.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Budget to allocate to child tasks, reserving time for aggregation&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;reserve_seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TaskRegistry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;In-memory task store (use Redis/Postgres in production)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_cleanup_task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;parent_task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a new task and return immediately (fire-and-forget pattern)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
            &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;parent_task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parent_task_id&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ErrorType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;error_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskCheckpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Update task status with structured error semantics&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error_type&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error_message&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;checkpoint&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cleanup_expired&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Remove orphaned tasks past TTL&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;expired&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ttl_seconds&lt;/span&gt;
            &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RUNNING&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expired&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# In production: log, emit metric, notify observers
&lt;/span&gt;            &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_tasks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Example: Multi-agent orchestration with budget propagation
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;orchestrator_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskRegistry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_budget_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;600.0&lt;/span&gt;  &lt;span class="c1"&gt;# 10 minutes
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Orchestrator that spawns specialist agents with propagated budgets.
    Demonstrates the &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fire, track, resume&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; pattern.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Create parent task immediately (fire)
&lt;/span&gt;    &lt;span class="n"&gt;parent_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_budget_seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RUNNING&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Spawn child tasks with reduced budgets (propagate)
&lt;/span&gt;    &lt;span class="n"&gt;child_budget&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reserve_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;60.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;child_tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;specialist&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;timeout_budget_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;child_budget&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Split among specialists
&lt;/span&gt;            &lt;span class="n"&gt;parent_task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;child_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="c1"&gt;# In production: dispatch to actual agent workers
&lt;/span&gt;        &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;simulate_specialist_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Poll for completion (track)
&lt;/span&gt;    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;60.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# Reserve aggregation time
&lt;/span&gt;        &lt;span class="n"&gt;all_done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; 
            &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FAILED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;child_tasks&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;all_done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Poll interval
&lt;/span&gt;
    &lt;span class="c1"&gt;# Aggregate results or handle partial completion (resume)
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;child_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;child_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;child_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;child_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;child_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Use checkpoint for partial results
&lt;/span&gt;            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;child_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;intermediate_results&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child_tasks&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Aggregated: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Partial completion - save checkpoint for resumption
&lt;/span&gt;        &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ErrorType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL_COMPLETION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;TaskCheckpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;step_completed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;intermediate_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;remaining_work&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;child_tasks&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parent_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simulate_specialist_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskRegistry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;specialist_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Simulate specialist agent work with timeout awareness&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RUNNING&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Check budget before starting work
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FAILED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ErrorType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BUDGET_EXHAUSTED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;error_message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Insufficient budget to start work&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="c1"&gt;# Simulate work with periodic budget checks
&lt;/span&gt;    &lt;span class="n"&gt;work_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;30.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;work_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Save checkpoint before timeout
&lt;/span&gt;            &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ErrorType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTIAL_COMPLETION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;TaskCheckpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;step_completed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;intermediate_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;partial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;specialist_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; interim&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="n"&gt;remaining_work&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finalize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;specialist_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; completed successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This implementation shows the three key shifts from synchronous patterns: immediate task ID return instead of blocking, explicit budget propagation to child tasks, and structured checkpoints for partial completions. The &lt;a href="https://arxiv.org/html/2603.13417" rel="noopener noreferrer"&gt;Context-Aware Broker Pattern&lt;/a&gt; described in deployment research extends this further with identity propagation and six-stage routing — but this foundation gets you 80% of the architectural benefit.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Synchronous agent wrappers become technical debt.&lt;/strong&gt; Every &lt;code&gt;await agent.complete(prompt)&lt;/code&gt; in your codebase is a blocking call that assumes sub-minute responses. The &lt;a href="https://arxiv.org/pdf/2601.15195" rel="noopener noreferrer"&gt;empirical study of AI coding agents&lt;/a&gt; documents failure modes that emerge specifically when synchronous assumptions break — cascading timeouts, lost context, and silent failures that only surface in logs hours later. The investment in task primitives pays off immediately for any agent interaction that might exceed your HTTP timeout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability requires task-native tracing.&lt;/strong&gt; Traditional request traces end when the HTTP connection closes. The &lt;a href="https://arxiv.org/html/2603.28735v1" rel="noopener noreferrer"&gt;research on architecture documentation for AI systems&lt;/a&gt; emphasizes that agent observability must span the full task lifecycle — creation, status transitions, checkpoint saves, and eventual completion or resumption. Your existing Datadog or Jaeger setup will show a request that returned quickly with a task ID, then nothing until the client polls. The actual work happens in a tracing blind spot.&lt;/p&gt;
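
&lt;p&gt;A minimal sketch of task-native tracing, with an assumed &lt;code&gt;emit&lt;/code&gt; sink standing in for your metrics or tracing backend: every lifecycle transition is a structured event carrying the same task ID, so creation, checkpoint saves, and completion can be joined even when they happen hours apart:&lt;/p&gt;

```python
import json
import time

# Hypothetical sketch: task-native tracing. Each lifecycle transition
# emits a structured event keyed by task ID, so the "blind spot" between
# the quick task-ID response and eventual completion stays observable.
# `emit` is an assumption; in production it would export to your backend.

EVENTS = []

def emit(task_id: str, event: str, **fields):
    record = {"task_id": task_id, "event": event, "ts": time.time(), **fields}
    EVENTS.append(record)  # stand-in for a Datadog/Jaeger exporter
    print(json.dumps(record, sort_keys=True))

# One task's whole lifecycle shares one ID, unlike a per-request trace:
emit("task-42", "created", ttl_seconds=600)
emit("task-42", "status_change", status="running")
emit("task-42", "checkpoint_saved", step_completed=2)
emit("task-42", "status_change", status="completed")
```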

&lt;p&gt;&lt;strong&gt;Multi-agent orchestration needs explicit budget propagation.&lt;/strong&gt; Without passing remaining TTL to child agents, you get exactly the failure mode the code above guards against: a child agent consumes the entire timeout budget, leaving the parent unable to aggregate results before its own deadline. The &lt;a href="https://arxiv.org/html/2603.27296v1" rel="noopener noreferrer"&gt;multi-agent AI systems research&lt;/a&gt; shows this is the primary failure mode in hierarchical agent architectures: not model errors, but coordination timeouts.&lt;/p&gt;
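
&lt;p&gt;The budget arithmetic is simple enough to state as a few lines, mirroring the &lt;code&gt;child_budget(reserve_seconds=60.0)&lt;/code&gt; split in the example above (the numbers here are illustrative):&lt;/p&gt;

```python
# Illustrative sketch of budget propagation: the parent holds back a
# reserve for its own aggregation work, then splits what remains evenly
# across child agents. A child never receives the parent's full TTL.

def split_budget(remaining_seconds: float, reserve_seconds: float,
                 n_children: int) -> float:
    """Seconds each child may consume; clamped so it is never negative."""
    usable = max(remaining_seconds - reserve_seconds, 0.0)
    return usable / n_children

# 10-minute parent budget, 60 s reserved for aggregation, 3 specialists:
print(split_budget(600.0, 60.0, 3))  # 180.0 seconds per child
```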

&lt;p&gt;&lt;strong&gt;Error handling shifts from "retry or fail" to "resume from checkpoint."&lt;/strong&gt; The structured error semantics in the code — &lt;code&gt;partial_completion&lt;/code&gt; with checkpoint state — reflect a fundamental change in how failures work. The &lt;a href="https://arxiv.org/html/2603.27249v1" rel="noopener noreferrer"&gt;growing burden of AI-assisted development&lt;/a&gt; notes that engineers spend increasing time recovering from partial completions rather than handling clean success/failure binaries.&lt;/p&gt;
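
&lt;p&gt;What "resume from checkpoint" buys you is that a restarted run skips the work the checkpoint already covers. A minimal sketch, with a &lt;code&gt;Checkpoint&lt;/code&gt; shaped like the &lt;code&gt;TaskCheckpoint&lt;/code&gt; used earlier and illustrative step names:&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Hypothetical sketch of "resume from checkpoint" rather than "retry from
# scratch": completed results are kept and only the remaining steps rerun.
# The Checkpoint fields mirror the TaskCheckpoint used in the article.

@dataclass
class Checkpoint:
    intermediate_results: dict = field(default_factory=dict)
    remaining_work: list = field(default_factory=list)

def run_step(step: str) -> str:
    return f"{step} done"  # stands in for a real specialist agent run

def resume(checkpoint: Checkpoint) -> dict:
    results = dict(checkpoint.intermediate_results)  # keep completed work
    for step in checkpoint.remaining_work:           # redo only the rest
        results[step] = run_step(step)
    return results

ckpt = Checkpoint(
    intermediate_results={"research": "research done"},
    remaining_work=["analysis", "synthesis"],
)
print(resume(ckpt))
```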

&lt;h2&gt;
  
  
  The Infrastructure Signal
&lt;/h2&gt;

&lt;p&gt;AWS Bedrock AgentCore shipping task patterns in production suggests AWS sees this as required infrastructure, not experimental nicety. When a major cloud provider builds timeout budgeting and structured resumption into their agent platform, they're responding to customer pain that's already widespread enough to justify platform investment.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://arxiv.org/html/2602.18764v2" rel="noopener noreferrer"&gt;schema-guided dialogue systems research&lt;/a&gt; describes enterprise deployments requiring what they call "CABP-style identity propagation" — context-aware broker patterns that synchronous architectures simply cannot support. The identity of who initiated a request must flow through every task hop; orphaned tasks with sensitive context need explicit cleanup policies. This is table-stakes for enterprise compliance, and it requires task primitives.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/ARUNAGIRINATHAN-K/awesome-ai-agents" rel="noopener noreferrer"&gt;AI agents ecosystem&lt;/a&gt; shows a clear pattern: agent frameworks are increasingly separating "tooling and infrastructure" from core agent logic. Task lifecycle is becoming a layer — like logging or metrics — that you import rather than build. The &lt;a href="https://arxiv.org/html/2603.14805v1" rel="noopener noreferrer"&gt;institutional knowledge primitives research&lt;/a&gt; frames this as inevitable: as agent capabilities grow, the operational complexity concentrates in lifecycle management rather than inference.&lt;/p&gt;

&lt;p&gt;Perhaps most telling: Anthropic's data shows that 99.9th-percentile turn durations &lt;a href="https://www.anthropic.com/news/measuring-agent-autonomy" rel="noopener noreferrer"&gt;continue to climb&lt;/a&gt; as agents handle more complex autonomous work. Architectures built for minutes will hit hours. The question isn't whether you'll need task primitives; it's whether you'll adopt them proactively or reactively after a production incident.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift Rating
&lt;/h2&gt;

&lt;p&gt;🟢 &lt;strong&gt;Adopt Now&lt;/strong&gt; — Teams building multi-agent systems or expecting agent turn durations beyond 2-3 minutes should implement task primitives immediately. The synchronous patterns that work today will fail silently as capabilities expand — not with clean errors, but with timeout cascades that surface as missing data and confused users.&lt;/p&gt;

&lt;p&gt;The refactoring cost grows with codebase size. Early adopters implement the pattern once in their agent abstraction layer. Late adopters retrofit dozens of call sites after production incidents reveal the architectural gap. The &lt;a href="https://arxiv.org/html/2602.10122v1" rel="noopener noreferrer"&gt;practical guide to agentic AI transition&lt;/a&gt; explicitly recommends treating task primitives as foundational infrastructure rather than optimization — not because current workloads require it, but because the ceiling is rising faster than codebases can reactively adapt.&lt;/p&gt;

&lt;p&gt;If you're still synchronously awaiting agent completions, start with the highest-duration calls. Wrap them in task primitives this month. When the floor moves — and &lt;a href="https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf" rel="noopener noreferrer"&gt;the data suggests it's moving soon&lt;/a&gt; — you'll already be standing on solid ground.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.13417" rel="noopener noreferrer"&gt;Design Patterns for Deploying AI Agents with Model Context Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.25100v1" rel="noopener noreferrer"&gt;From Logic Monopoly to Social Contract: Separation of Execution Power in AI Systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.14805v1" rel="noopener noreferrer"&gt;AI Skills as the Institutional Knowledge Primitive for Agentic Systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.10122v1" rel="noopener noreferrer"&gt;A Practical Guide to Agentic AI Transition in Organizations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;The Evolution of Agentic AI Software Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/news/measuring-agent-autonomy" rel="noopener noreferrer"&gt;Measuring AI agent autonomy in practice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.27296v1" rel="noopener noreferrer"&gt;A Multi-agent AI System for Deep Learning Model Lifecycle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends Report&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2601.15195" rel="noopener noreferrer"&gt;Where Do AI Coding Agents Fail? An Empirical Study&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.27249v1" rel="noopener noreferrer"&gt;The Growing Burden of AI-Assisted Software Development&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.18764v2" rel="noopener noreferrer"&gt;The Convergence of Schema-Guided Dialogue Systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2603.28735v1" rel="noopener noreferrer"&gt;RAD-AI: Rethinking Architecture Documentation for AI Systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ARUNAGIRINATHAN-K/awesome-ai-agents" rel="noopener noreferrer"&gt;Awesome AI Agents for 2026&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This is part of &lt;strong&gt;Primitive Shifts&lt;/strong&gt; — a monthly series tracking when new AI building blocks move from novel experiments to infrastructure you'll be expected to know.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Primitive Shifts series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Spotted a shift happening in your stack? Drop it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AI Weekly Roundup: Microsoft's Price War, Google's Open Model Push, and the Reliability Question Looming Over Enterprise AI</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 06 Apr 2026 12:02:21 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-roundup-microsofts-price-war-googles-open-model-push-and-the-reliability-question-2a49</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-roundup-microsofts-price-war-googles-open-model-push-and-the-reliability-question-2a49</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly Roundup: Microsoft's Price War, Google's Open Model Push, and the Reliability Question Looming Over Enterprise AI
&lt;/h1&gt;

&lt;p&gt;This week marks a pivotal moment in the AI landscape as the two largest cloud providers moved to capture developer mindshare: Microsoft through aggressive pricing and Google by releasing its most capable open-weight models yet. But beneath the product announcements lies a more fundamental question gaining traction: can AI systems actually achieve the reliability that justifies the hundreds of billions being wagered on enterprise adoption?&lt;/p&gt;

&lt;h2&gt;
  
  
  Microsoft Unveils MAI Foundry Models to Challenge OpenAI and Google on Price
&lt;/h2&gt;

&lt;p&gt;Microsoft's MAI Superintelligence team, led by Mustafa Suleyman, &lt;a href="https://techcrunch.com/2026/04/02/microsoft-takes-on-ai-rivals-with-three-new-foundational-models/" rel="noopener noreferrer"&gt;released three new foundational models&lt;/a&gt; this week designed to undercut competitors on inference costs while maintaining competitive performance. The models—MAI-1, MAI-2, and MAI-3—span a range of parameter counts optimized for different use cases, from lightweight edge deployment to full-scale reasoning tasks.&lt;/p&gt;

&lt;p&gt;The release represents Microsoft's first major in-house foundation model push since acquiring Inflection AI talent in 2024. Suleyman's team developed the models under what they're calling a "Humanist AI" philosophy, emphasizing practical human-centered communication over raw benchmark performance. In practice, this translates to models that prioritize coherent multi-turn dialogue and task completion over flashy single-shot capabilities.&lt;/p&gt;

&lt;p&gt;The key selling point is price: Microsoft claims MAI-2 delivers comparable performance to GPT-4-class models at roughly 40% lower inference costs, while MAI-3 targets the premium reasoning tier at prices significantly below OpenAI's o1 and Google's Gemini Ultra. The models are available immediately through Microsoft Foundry and will be integrated across Microsoft 365 Copilot, Azure AI services, and GitHub Copilot in coming weeks.&lt;/p&gt;

&lt;p&gt;Whether these cost savings hold up under production workloads remains to be seen, but the pricing pressure on OpenAI—Microsoft's own portfolio company—signals just how fragmented the foundation model market has become.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google Releases Gemma 4 as Most Capable Open Model Family
&lt;/h2&gt;

&lt;p&gt;Google DeepMind &lt;a href="https://deepmind.google/blog/gemma-4-byte-for-byte-the-most-capable-open-models/" rel="noopener noreferrer"&gt;announced Gemma 4&lt;/a&gt;, its first major update to the Gemma open model family in over a year, shipping four distinct model sizes targeting different deployment scenarios. The lineup includes E2B and E4B efficient variants for edge and mobile applications, a 26B Mixture-of-Experts model for cost-effective inference, and a 31B dense model positioned as the flagship for maximum capability.&lt;/p&gt;

&lt;p&gt;All models ship under the Apache 2.0 license, making them fully permissive for commercial use—a notable contrast to Meta's Llama licensing restrictions. The 31B dense model in particular targets complex logic and agentic workflows, with Google claiming competitive performance against closed-source alternatives on multi-step reasoning benchmarks.&lt;/p&gt;

&lt;p&gt;Perhaps most significant is the 1M token context window available across the model family, enabling document-scale processing without chunking. Early reports from the Hugging Face community, documented in their &lt;a href="https://huggingface.co/blog/huggingface/state-of-os-hf-spring-2026" rel="noopener noreferrer"&gt;Spring 2026 State of Open Source report&lt;/a&gt;, suggest the 26B MoE variant delivers particularly strong results on agentic coding tasks while requiring substantially less compute than the dense model.&lt;/p&gt;

&lt;p&gt;The timing positions Gemma 4 as a direct response to both Meta's Llama 4 release and the growing ecosystem of fine-tuned open models. For teams building production AI systems without deep pockets for API costs, this release significantly raises the bar for what's achievable with self-hosted inference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;The multi-agent orchestration landscape continues to mature, with &lt;a href="https://github.com/caramaschiHG/awesome-ai-agents-2026" rel="noopener noreferrer"&gt;LangChain, AutoGen, and CrewAI&lt;/a&gt; maintaining their positions as the dominant frameworks for building complex agent systems. However, the past quarter has seen significant movement in the tooling layer as teams push agents from demos into production environments.&lt;/p&gt;

&lt;p&gt;Several &lt;a href="https://github.com/ARUNAGIRINATHAN-K/awesome-ai-agents" rel="noopener noreferrer"&gt;new frameworks have emerged&lt;/a&gt; addressing specific pain points: VoltAgent brings a TypeScript-first approach with self-improving context management, while PraisonAI focuses on production multi-agent deployments with native Model Context Protocol (MCP) integration. The MCP standard, now approaching its first anniversary, has become the de facto protocol for tool integration across the ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://huggingface.co/blog/RDTvlokip/the-9-trends-that-will-explode-this-year" rel="noopener noreferrer"&gt;Smolagents from Hugging Face&lt;/a&gt; continues gaining traction with its code-first philosophy where agents write and execute Python directly rather than emitting JSON tool calls. This approach trades some safety guarantees for dramatically improved flexibility in complex workflows.&lt;/p&gt;

&lt;p&gt;On the MLOps side, ZenML's integration of "LangGraph swarms" into standard ML pipelines represents the &lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;ongoing convergence&lt;/a&gt; between traditional machine learning infrastructure and agentic systems. The industry is clearly shifting toward formalized inter-agent protocols and supervisor agent patterns, moving away from the ad-hoc agent chains that characterized early implementations.&lt;/p&gt;

&lt;p&gt;Research from this quarter emphasizes that &lt;a href="https://arxiv.org/html/2601.02749v1" rel="noopener noreferrer"&gt;production-ready agent architectures&lt;/a&gt; require explicit failure handling, state persistence, and human-in-the-loop checkpoints—capabilities that separate serious frameworks from toy implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take-Two Shutters Internal AI Team as Gaming Industry Resets AI Strategy
&lt;/h2&gt;

&lt;p&gt;Luke Dicken, Take-Two Interactive's head of AI, &lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/1" rel="noopener noreferrer"&gt;announced the dissolution&lt;/a&gt; of the company's internal AI team via LinkedIn this week, marking a notable retreat from the publisher's previous AI ambitions. Dicken previously served as senior director of applied AI at Zynga, the mobile gaming giant Take-Two acquired in 2022 for $12.7 billion.&lt;/p&gt;

&lt;p&gt;The move signals a potential industry-wide recalibration in how major gaming companies approach AI development. Rather than maintaining dedicated AI research teams, publishers may be shifting toward licensing external AI services or integrating AI capabilities through middleware providers. This approach trades potential competitive advantages for reduced overhead and access to rapidly evolving third-party capabilities.&lt;/p&gt;

&lt;p&gt;The dissolution contrasts sharply with continued heavy AI investment announcements from other sectors. While enterprise software companies race to embed AI into every product, gaming companies appear more cautious about where AI delivers genuine value versus hype-driven experimentation. NPCs powered by language models, procedural content generation, and AI-assisted development tools have all shown promise in demos but struggle to justify dedicated team costs at current capability levels.&lt;/p&gt;

&lt;h2&gt;
  
  
  German Publisher Sues OpenAI Over ChatGPT's Reproduction of Children's Book Series
&lt;/h2&gt;

&lt;p&gt;A German publisher &lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/2" rel="noopener noreferrer"&gt;filed suit against OpenAI&lt;/a&gt; in Munich this week, alleging that ChatGPT violated copyright by generating content "virtually indistinguishable from the original" Coconut the Dragon children's book series. The case represents one of the most detailed copyright claims yet filed against a major AI provider in European courts.&lt;/p&gt;

&lt;p&gt;According to the filing, ChatGPT not only reproduced narrative text closely matching the original books but also generated cover art, back cover marketing blurbs, and even instructions for self-publishing the generated content—essentially providing a turnkey system for producing unauthorized derivative works. The publisher's legal team documented dozens of prompts that reliably triggered near-verbatim reproduction of copyrighted material.&lt;/p&gt;

&lt;p&gt;The lawsuit tests the boundaries of fair use doctrine under European copyright law, which generally provides narrower exceptions than U.S. law. OpenAI has previously argued that training on copyrighted material constitutes transformative use, but cases involving near-exact reproduction of specific works present a harder legal challenge than claims about training data in general.&lt;/p&gt;

&lt;p&gt;The outcome could have significant implications for how AI companies handle clearly copyrighted creative works in training data and whether post-training mitigations against reproduction are legally sufficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  Musk Reportedly Requiring SpaceX IPO Advisers to Purchase Grok Subscriptions
&lt;/h2&gt;

&lt;p&gt;Banks, law firms, and auditors working on SpaceX's anticipated initial public offering have allegedly been told to purchase subscriptions to Grok, xAI's chatbot product, according to &lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/1" rel="noopener noreferrer"&gt;reporting from The New York Times&lt;/a&gt;. The requirement follows recent corporate restructuring that placed xAI's Grok product technically under SpaceX's corporate umbrella.&lt;/p&gt;

&lt;p&gt;The arrangement raises immediate questions about conflicts of interest in major financial transactions. Advisory firms typically maintain strict independence from clients to preserve the integrity of their guidance, but mandatory product purchases create financial entanglement however small the individual subscription costs.&lt;/p&gt;

&lt;p&gt;Beyond the ethics questions, the move underscores Musk's continued efforts to drive Grok adoption through unconventional channels. The chatbot has struggled to gain market share against ChatGPT and Claude despite significant infrastructure investment in xAI's Memphis supercomputer cluster. Requiring professional services firms to use the product at least ensures some enterprise exposure, even if adoption is coerced rather than organic.&lt;/p&gt;

&lt;p&gt;SpaceX's IPO, if it proceeds, would be one of the largest technology offerings in years, making the advisory relationships particularly high-stakes for all parties involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Suno Faces Mounting Copyright Concerns as AI Music Generation Enables Streaming Fraud
&lt;/h2&gt;

&lt;p&gt;Suno, the AI music generation platform, faces &lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/2" rel="noopener noreferrer"&gt;growing scrutiny&lt;/a&gt; as its technology enables increasingly sophisticated streaming fraud schemes. The platform's ability to generate music that closely mimics popular artists' styles has made it trivially easy to flood streaming services with AI-generated ripoffs designed to capture listener searches for established acts.&lt;/p&gt;

&lt;p&gt;The problem extends beyond simple copyright infringement. Fraudsters use AI-generated tracks to execute streaming manipulation schemes, uploading thousands of songs that algorithmically target popular search terms and playlist categories. When listeners search for well-known artists or genres, AI-generated content increasingly appears alongside legitimate recordings, siphoning royalty payments from actual creators.&lt;/p&gt;

&lt;p&gt;Streaming platforms have struggled to implement effective detection systems for AI-generated content. Unlike deepfakes of specific recordings, stylistic imitations exist in a legal gray zone—mimicking a musical style isn't clearly illegal, but doing so at industrial scale to deceive consumers raises different concerns.&lt;/p&gt;

&lt;p&gt;The situation highlights regulatory gaps in AI-generated content authentication and growing calls for platform accountability. Some industry groups are pushing for mandatory labeling of AI-generated audio, while others advocate for streaming services to implement more aggressive content moderation. Neither approach has gained sufficient traction to address the problem at its current scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Business Model Reliability Under Scrutiny as Billions Ride on Enterprise Adoption
&lt;/h2&gt;

&lt;p&gt;A &lt;a href="https://www.reuters.com/technology/does-ai-business-model-have-fatal-flaw-2026-04-01/" rel="noopener noreferrer"&gt;Reuters analysis published this week&lt;/a&gt; poses an uncomfortable question for the AI industry: can these systems actually achieve the reliability needed for high-stakes enterprise work? With hundreds of billions of dollars in investment predicated on the assumption that AI will handle critical business processes, the gap between demo-quality performance and production-grade consistency represents an existential risk for the current investment thesis.&lt;/p&gt;

&lt;p&gt;The analysis highlights a fundamental tension in AI deployment. Language models excel at generating plausible outputs but struggle with the kind of deterministic reliability that enterprise software typically requires. A system that's correct 95% of the time sounds impressive until you consider that a 5% error rate in financial transactions, medical records, or legal documents would be catastrophic.&lt;/p&gt;

&lt;p&gt;Current mitigation strategies—human review, confidence thresholds, restricted use cases—all work but dramatically limit the efficiency gains that justify AI investments. If every AI output requires human verification, the productivity benefits shrink considerably. Enterprise adoption ultimately hinges on solving reliability, not just demonstrating capability on cherry-picked benchmarks.&lt;/p&gt;

&lt;p&gt;The question looms particularly large as AI companies push toward agentic systems that take autonomous actions. A chatbot that occasionally hallucinates is annoying; an agent that occasionally executes the wrong transaction is dangerous.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;The next few weeks will likely see competitive responses to both Microsoft's pricing moves and Google's Gemma 4 release, potentially forcing further price cuts across the inference market. The German copyright case bears watching as a potential template for European regulatory approaches to AI training data. And as enterprise reliability concerns gain mainstream attention, expect increased focus on evaluation frameworks and production monitoring tools designed to quantify—and hopefully improve—real-world AI system dependability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2026/04/02/microsoft-takes-on-ai-rivals-with-three-new-foundational-models/" rel="noopener noreferrer"&gt;Microsoft takes on AI rivals with three new foundational models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deepmind.google/blog/gemma-4-byte-for-byte-the-most-capable-open-models/" rel="noopener noreferrer"&gt;Gemma 4: Our most capable open models to date&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/blog/huggingface/state-of-os-hf-spring-2026" rel="noopener noreferrer"&gt;State of Open Source on Hugging Face: Spring 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/caramaschiHG/awesome-ai-agents-2026" rel="noopener noreferrer"&gt;caramaschiHG/awesome-ai-agents-2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ARUNAGIRINATHAN-K/awesome-ai-agents" rel="noopener noreferrer"&gt;Awesome AI Agents for 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/blog/RDTvlokip/the-9-trends-that-will-explode-this-year" rel="noopener noreferrer"&gt;🎆 AI 2026 — The 9 trends that will EXPLODE this year!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2602.10479" rel="noopener noreferrer"&gt;The Evolution of Agentic AI Software Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2601.02749v1" rel="noopener noreferrer"&gt;The Path Ahead for Agentic AI: Challenges and Opportunities&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/1" rel="noopener noreferrer"&gt;Ai Artificial Intelligence Archive for April 2026 - Page 1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.theverge.com/archives/ai-artificial-intelligence/2026/4/2" rel="noopener noreferrer"&gt;Ai Artificial Intelligence Archive for April 2026 - Page 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reuters.com/technology/does-ai-business-model-have-fatal-flaw-2026-04-01/" rel="noopener noreferrer"&gt;Does the AI business model have a fatal flaw?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every week, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
    <item>
      <title>AI Weekly Digest: OpenAI Pivots Away from Sora, Agentic Frameworks Mature, and Open-Weight Models Gain Ground</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Tue, 31 Mar 2026 13:18:01 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-digest-openai-pivots-away-from-sora-agentic-frameworks-mature-and-open-weight-models-5ahh</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-digest-openai-pivots-away-from-sora-agentic-frameworks-mature-and-open-weight-models-5ahh</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly Digest: OpenAI Pivots Away from Sora, Agentic Frameworks Mature, and Open-Weight Models Gain Ground
&lt;/h1&gt;

&lt;p&gt;This week marked a strategic inflection point as OpenAI made the surprising decision to sunset its standalone Sora video application, signaling that even the most hyped consumer AI products face ruthless prioritization when compute resources run thin. Meanwhile, the agentic programming landscape continued its rapid maturation, with LangGraph cementing its position as the production-grade framework of choice while enterprises scramble to measure actual ROI from their AI agent deployments. The tension between building impressive demos and delivering reliable, cost-effective systems has never been more apparent.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Discontinues Sora Video App to Refocus on Core AI Infrastructure
&lt;/h2&gt;

&lt;p&gt;In a move that caught many observers off guard, OpenAI has &lt;a href="https://coaio.com/news/2026/03/breaking-tech-news-on-march-28-2026-ai-revolution-hardware-upgrades-2koc/" rel="noopener noreferrer"&gt;discontinued its standalone Sora video generation application&lt;/a&gt; despite the tool's initial popularity following its public launch. The company cited the substantial compute demands of high-fidelity video generation as the primary driver behind the decision, with internal metrics reportedly showing declining active usage after the initial novelty wore off.&lt;/p&gt;

&lt;p&gt;The resources previously dedicated to Sora consumer operations are being &lt;a href="https://www.marketingprofs.com/opinions/2026/54473/ai-update-march-27-2026-ai-news-and-views-from-the-past-week/" rel="noopener noreferrer"&gt;redirected toward foundational model scaling, infrastructure improvements, and enterprise products&lt;/a&gt;—areas where OpenAI sees clearer paths to sustainable revenue. However, the company emphasized that video and world simulation research continues internally, with applications targeted toward future robotics systems rather than consumer content creation.&lt;/p&gt;

&lt;p&gt;This pivot reflects a broader industry reckoning with the economics of generative media. Video models require orders of magnitude more compute than text or image generation, and monetization has proven challenging when users expect free or near-free access. For developers who built workflows around Sora's API, the shutdown underscores the platform risk inherent in depending on consumer-facing AI services from companies still searching for product-market fit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Revenium Launches AI Outcomes for Workflow-Level ROI Measurement
&lt;/h2&gt;

&lt;p&gt;As enterprises push deeper into agentic AI deployments, the question of actual return on investment has become increasingly urgent. Revenium's newly launched &lt;a href="https://www.crescendo.ai/news/latest-ai-news-and-updates" rel="noopener noreferrer"&gt;AI Outcomes tool addresses this challenge by measuring ROI at the granular workflow level&lt;/a&gt; for individual AI agent executions—a capability that's been notably absent from the market.&lt;/p&gt;

&lt;p&gt;The tool provides detailed cost attribution across multi-step agent workflows, tracking not just API spend but also infrastructure costs, human oversight time, and downstream business impact. For organizations running hundreds of distinct agent configurations, this visibility helps identify which deployments are generating value versus which are burning compute cycles without meaningful output.&lt;/p&gt;

&lt;p&gt;The timing is particularly relevant as companies move from experimental pilots to production-scale agent deployments. &lt;a href="https://www.ibm.com/think/news/ai-tech-trends-predictions-2026" rel="noopener noreferrer"&gt;Enterprises are increasingly focused on demonstrating concrete business value&lt;/a&gt; from their AI investments, and CFOs want more than anecdotal productivity gains. Revenium's approach of measuring outcomes at the workflow rather than model level acknowledges that AI value creation happens in the orchestration—how agents are chained, what tools they access, and how errors are handled—not just in raw model capability.&lt;/p&gt;
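
&lt;p&gt;The core accounting idea is simple enough to sketch. The rollup below is a hypothetical illustration of workflow-level attribution, not Revenium's product: per-step API spend, infrastructure cost, and human review time collapse into one attributable number per workflow. All step names and rates are invented.&lt;/p&gt;

```python
# Toy workflow-level cost attribution: roll up per-step API spend, infra
# cost, and human oversight time into one number per agent workflow.
# Step names and dollar figures are illustrative assumptions only.

steps = [
    {"step": "retrieve", "api_usd": 0.002, "infra_usd": 0.001, "review_min": 0},
    {"step": "draft",    "api_usd": 0.040, "infra_usd": 0.004, "review_min": 0},
    {"step": "verify",   "api_usd": 0.010, "infra_usd": 0.001, "review_min": 3},
]

REVIEW_USD_PER_MIN = 1.50  # assumed loaded cost of human review time

def workflow_cost(steps):
    total = 0.0
    for s in steps:
        total += s["api_usd"] + s["infra_usd"]
        total += s["review_min"] * REVIEW_USD_PER_MIN
    return round(total, 4)

print(workflow_cost(steps))
```

&lt;p&gt;Even in this toy example the model spend is dwarfed by human oversight, which is exactly why measuring at the workflow level, rather than the model level, changes which deployments look profitable.&lt;/p&gt;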

&lt;h2&gt;
  
  
  Lucid Software Expands MCP Server with New Process Agent for Team Collaboration
&lt;/h2&gt;

&lt;p&gt;Lucid Software has &lt;a href="https://www.marketingprofs.com/opinions/2026/54473/ai-update-march-27-2026-ai-news-and-views-from-the-past-week/" rel="noopener noreferrer"&gt;rolled out significant enhancements to its Model Context Protocol server integration&lt;/a&gt; alongside a new Process Agent designed to streamline complex diagram and documentation workflows. The updates extend Lucid's existing AI capabilities with tighter integration into the MCP ecosystem that's become the de facto standard for tool connectivity in agentic systems.&lt;/p&gt;

&lt;p&gt;The Process Agent specifically targets the labor-intensive work of translating meeting notes, requirements documents, and scattered communications into structured visual artifacts like flowcharts, architecture diagrams, and project timelines. Early users report substantial time savings on the kind of documentation work that often falls through the cracks in fast-moving engineering teams.&lt;/p&gt;

&lt;p&gt;This release reflects a &lt;a href="https://news.microsoft.com/source/features/ai/whats-next-in-ai-7-trends-to-watch-in-2026/" rel="noopener noreferrer"&gt;broader trend toward AI-assisted collaborative tooling&lt;/a&gt; where AI agents operate as persistent team members rather than on-demand utilities. By integrating through MCP, Lucid's agents can be invoked from other AI tools and workflows, fitting into the increasingly complex orchestration layers that enterprises are building. The approach suggests that specialized domain agents—rather than general-purpose assistants—may be the path to production-ready AI augmentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code Tool Suffers from "Amnesia" Bug in Extended Sessions
&lt;/h2&gt;

&lt;p&gt;Reports have surfaced of a frustrating bug in Anthropic's Claude Code tool where the assistant &lt;a href="https://blog.logrocket.com/ai-dev-tool-power-rankings/" rel="noopener noreferrer"&gt;forgets previous instructions during extended coding sessions&lt;/a&gt;, forcing users to repeatedly re-explain project context and coding conventions. The issue appears most pronounced in sessions exceeding several hours, with the AI seeming to "reset" its understanding of the codebase and prior decisions.&lt;/p&gt;

&lt;p&gt;Developers describe scenarios where Claude Code begins suggesting changes that directly contradict earlier agreed-upon patterns, or requests information that was provided earlier in the same session. The bug leads to significant productivity losses on complex, multi-day refactoring efforts where maintaining continuity is essential.&lt;/p&gt;

&lt;p&gt;The "amnesia" problem highlights ongoing &lt;a href="https://www.builder.io/blog/best-ai-tools-2026" rel="noopener noreferrer"&gt;reliability challenges in AI coding assistants&lt;/a&gt; when applied to substantial projects. While these tools excel at bounded tasks—writing a single function, explaining a code snippet, or generating boilerplate—maintaining coherent understanding across extended workflows remains technically difficult. The issue raises fundamental questions about memory management in agentic systems: how should context be persisted, summarized, and retrieved to enable the kind of sustained collaboration that mirrors human pair programming?&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;The agentic programming landscape has crystallized significantly this quarter, with &lt;a href="https://gurusup.com/blog/best-multi-agent-frameworks-2026" rel="noopener noreferrer"&gt;LangGraph emerging as the framework of choice&lt;/a&gt; for teams building production systems that require iterative, self-correcting reasoning loops. Its cyclic graph architecture enables the kind of retry logic and conditional branching that real-world agent deployments demand—agents can reconsider decisions, request human approval at key junctures, and gracefully handle the failures that inevitably occur when AI interfaces with external systems.&lt;/p&gt;
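
&lt;p&gt;The control-flow shape described above can be shown framework-agnostically. The sketch below is plain Python, not LangGraph code: it shows the cycle of generate, check, retry, and escalate-to-human that LangGraph formalizes as a graph with conditional edges. The &lt;code&gt;agent_step&lt;/code&gt; stub, which succeeds on its third attempt here, stands in for a model call.&lt;/p&gt;

```python
# Framework-agnostic sketch of the cyclic agent pattern: run a step, check
# its output, retry on failure, and route to human review after too many
# attempts. LangGraph expresses this same loop as conditional graph edges.

def agent_step(task, attempt):
    # stand-in for a model call; succeeds on the third try in this demo
    return {"ok": attempt == 2, "draft": f"{task} (attempt {attempt})"}

def run_with_retries(task, max_attempts=3):
    for attempt in range(max_attempts):
        result = agent_step(task, attempt)
        if result["ok"]:
            return {"route": "done", "output": result["draft"]}
    return {"route": "human_review", "output": result["draft"]}

print(run_with_retries("summarize incident report"))
```

&lt;p&gt;What the frameworks add on top of this loop is exactly what the paragraph names: persisted state between attempts, observability into each transition, and checkpoints where a human can approve before the graph continues.&lt;/p&gt;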

&lt;p&gt;&lt;a href="https://pub.towardsai.net/the-4-best-open-source-multi-agent-ai-frameworks-2026-9da389f9407a" rel="noopener noreferrer"&gt;CrewAI continues gaining traction&lt;/a&gt; for rapid multi-agent prototyping, particularly among teams exploring agent concepts before committing to production architecture. However, we're seeing a consistent migration pattern: teams that start with CrewAI's simpler abstractions often move to LangGraph when they need production-grade state management, persistence, and observability. The tradeoff between developer velocity and operational robustness remains a defining tension in framework selection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.instaclustr.com/education/agentic-ai/agentic-ai-frameworks-top-10-options-in-2026/" rel="noopener noreferrer"&gt;AutoGen has continued its evolution past v0.4&lt;/a&gt;, now offering a layered architecture with the event-driven Core runtime, Studio for no-code prototyping, and AgentChat as the primary Python API surface. This stratification aims to serve both researchers exploring novel agent architectures and enterprise teams deploying standardized patterns. Meanwhile, &lt;a href="https://vellum.ai/blog/top-ai-agent-frameworks-for-developers" rel="noopener noreferrer"&gt;Google's Agent Development Kit (ADK) and Claude's Agent SDK&lt;/a&gt; are expanding enterprise multi-agent options, though adoption remains concentrated among customers already invested in those respective ecosystems.&lt;/p&gt;

&lt;p&gt;The emerging &lt;a href="https://www.exabeam.com/explainers/agentic-ai/agentic-ai-frameworks-key-components-top-8-options/" rel="noopener noreferrer"&gt;industry consensus is clear: orchestration architecture now matters more than individual agent intelligence&lt;/a&gt;. Teams are finding that how agents communicate, maintain state, handle errors, and coordinate tool usage often determines system success more than the underlying model's raw capabilities. This realization is driving increased investment in observability tools, agent tracing systems, and standardized evaluation frameworks for multi-agent workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gemini 3.1 Flash Live Rolls Out Globally with Real-Time Audio and Camera Integration
&lt;/h2&gt;

&lt;p&gt;Google has &lt;a href="https://www.crescendo.ai/news/latest-ai-news-and-updates" rel="noopener noreferrer"&gt;expanded its Gemini 3.1 Flash Live capabilities globally&lt;/a&gt;, bringing real-time audio conversations and visual context integration to users worldwide. The system allows continuous voice conversations with the AI while optionally sharing phone camera input, enabling scenarios where users can ask questions about what they're seeing in real time.&lt;/p&gt;

&lt;p&gt;The technical implementation represents a significant advancement in multimodal streaming—the system processes audio and visual inputs simultaneously while maintaining conversational context and generating low-latency responses. Early demonstrations show the system identifying objects, reading text, providing navigation assistance, and answering questions about physical environments with reasonable accuracy.&lt;/p&gt;

&lt;p&gt;The global rollout has intensified discussions about &lt;a href="https://www.marketingprofs.com/opinions/2026/54473/ai-update-march-27-2026-ai-news-and-views-from-the-past-week/" rel="noopener noreferrer"&gt;distinguishing human from machine interactions&lt;/a&gt; as ambient AI assistants become more prevalent. The naturalness of real-time audio conversations—without the perceptible pauses of earlier systems—raises questions about disclosure and authenticity in communication. More broadly, the release signals a shift toward AI that observes and participates in the physical world rather than being confined to text boxes and chat interfaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic Secures Legal Victory Against Trump Administration Restrictions
&lt;/h2&gt;

&lt;p&gt;Anthropic has &lt;a href="https://www.crescendo.ai/news/latest-ai-news-and-updates" rel="noopener noreferrer"&gt;won an injunction against federal restrictions&lt;/a&gt; that had threatened to constrain its AI operations, providing the company with crucial operational clarity amid ongoing regulatory uncertainty. The legal victory addresses restrictions that would have impacted aspects of the company's enterprise AI services, though specific details of the contested regulations remain partially sealed.&lt;/p&gt;

&lt;p&gt;The ruling comes as Anthropic &lt;a href="https://www.ibm.com/think/news/ai-tech-trends-predictions-2026" rel="noopener noreferrer"&gt;continues an aggressive enterprise push&lt;/a&gt;, expanding its Claude model family and enterprise API offerings. The company has positioned itself as a safety-focused alternative to OpenAI and Google, and any operational restrictions would have significantly impacted that competitive positioning.&lt;/p&gt;

&lt;p&gt;The case signals ongoing tension between AI companies and government oversight efforts. While regulatory frameworks for AI remain unsettled, companies are increasingly turning to litigation to challenge restrictions they view as overreaching or technically uninformed. For enterprise customers evaluating AI partnerships, the regulatory environment has become a meaningful factor in vendor selection—platform stability depends not just on technical excellence but also on a company's ability to navigate the political landscape.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kimi K2.5 and Qwen 3 Coder Join Growing Open-Weight Model Ecosystem
&lt;/h2&gt;

&lt;p&gt;The open-weight model ecosystem continues expanding, with &lt;a href="https://llm-stats.com/ai-news" rel="noopener noreferrer"&gt;Kimi K2.5 and Qwen 3 Coder&lt;/a&gt; representing the latest entries that provide developers with alternatives to closed API dependencies. These releases reflect an accelerating trend toward &lt;a href="https://whatllm.org/blog/llm-releases-march-2026" rel="noopener noreferrer"&gt;efficient mixture-of-experts architectures&lt;/a&gt; and edge-capable reasoning models that can run on more modest infrastructure.&lt;/p&gt;

&lt;p&gt;Kimi K2.5 offers a partially open-source approach with self-hosting capabilities, targeting developers who need model control without full weight access. Meanwhile, &lt;a href="https://explodingtopics.com/blog/list-of-llms" rel="noopener noreferrer"&gt;Qwen 3 Coder provides full Apache 2.0 licensing&lt;/a&gt;, enabling enterprise deployment without licensing friction—an increasingly important consideration as legal teams scrutinize AI model dependencies.&lt;/p&gt;

&lt;p&gt;The practical impact is meaningful: organizations can now &lt;a href="https://www.reddit.com/r/AISEOInsider/comments/1qg3m5k/the_2026_open_source_ai_stack_thats_beating_paid/" rel="noopener noreferrer"&gt;build AI stacks that rival or exceed paid tools&lt;/a&gt; in specific domains while maintaining full control over deployment, fine-tuning, and data handling. The &lt;a href="https://builder.aws.com/content/38sWXfm1ewXHg9pdCLmHo3XWIQX/top-5-open-source-ai-model-api-providers-in-2026" rel="noopener noreferrer"&gt;expansion of open-source AI model API providers&lt;/a&gt; means teams can leverage these models without managing inference infrastructure, getting the benefits of open weights with managed deployment convenience.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;The OpenAI Sora shutdown and Anthropic's legal battle both point to an industry maturing past pure capability growth into harder questions of sustainability, economics, and governance. Watch for more companies to make similarly tough prioritization decisions as compute economics force focus. The agentic framework space is also approaching a consolidation phase—expect clearer winners and losers by midyear as production deployments reveal which architectures actually scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://coaio.com/news/2026/03/breaking-tech-news-on-march-28-2026-ai-revolution-hardware-upgrades-2koc/" rel="noopener noreferrer"&gt;Breaking Tech News on March 28, 2026: AI Revolution, Hardware ...&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.crescendo.ai/news/latest-ai-news-and-updates" rel="noopener noreferrer"&gt;Latest AI News and AI Breakthroughs that Matter Most: 2026 &amp;amp; 2025&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.marketingprofs.com/opinions/2026/54473/ai-update-march-27-2026-ai-news-and-views-from-the-past-week" rel="noopener noreferrer"&gt;AI Update, March 27, 2026: AI News and Views From the Past Week&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.ibm.com/think/news/ai-tech-trends-predictions-2026" rel="noopener noreferrer"&gt;The trends that will shape AI and tech in 2026 - IBM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.microsoft.com/source/features/ai/whats-next-in-ai-7-trends-to-watch-in-2026/" rel="noopener noreferrer"&gt;What's next in AI: 7 trends to watch in 2026 - Microsoft Source&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.instaclustr.com/education/agentic-ai/agentic-ai-frameworks-top-10-options-in-2026/" rel="noopener noreferrer"&gt;Agentic AI Frameworks: Top 10 Options in 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.exabeam.com/explainers/agentic-ai/agentic-ai-frameworks-key-components-top-8-options/" rel="noopener noreferrer"&gt;Agentic AI Frameworks: Key Components &amp;amp; Top 8 Options in 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pub.towardsai.net/the-4-best-open-source-multi-agent-ai-frameworks-2026-9da389f9407a" rel="noopener noreferrer"&gt;The 4 Best Open Source Multi-Agent AI Frameworks 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gurusup.com/blog/best-multi-agent-frameworks-2026" rel="noopener noreferrer"&gt;Best Multi-Agent Frameworks in 2026: LangGraph, CrewAI, OpenAI ...&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vellum.ai/blog/top-ai-agent-frameworks-for-developers" rel="noopener noreferrer"&gt;The Top 11 AI Agent Frameworks For Developers In September 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://llm-stats.com/ai-news" rel="noopener noreferrer"&gt;LLM News Today (March 2026) – AI Model Releases - LLM Stats&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://whatllm.org/blog/llm-releases-march-2026" rel="noopener noreferrer"&gt;New LLMs March 2026: GPT-5.4 Tied for #1. Nobody Talked About It.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://explodingtopics.com/blog/list-of-llms" rel="noopener noreferrer"&gt;Top 50+ Large Language Models (LLMs) in 2026 - Exploding Topics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://builder.aws.com/content/38sWXfm1ewXHg9pdCLmHo3XWIQX/top-5-open-source-ai-model-api-providers-in-2026" rel="noopener noreferrer"&gt;Top 5 Open-Source AI Model API Providers in 2026&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reddit.com/r/AISEOInsider/comments/1qg3m5k/the_2026_open_source_ai_stack_thats_beating_paid/" rel="noopener noreferrer"&gt;The 2026 Open Source AI Stack That's Beating Paid Tools - Reddit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.logrocket.com/ai-dev-tool-power-rankings/" rel="noopener noreferrer"&gt;AI dev tool power rankings &amp;amp; comparison [March 2026]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  - &lt;a href="https://www.builder.io/blog/best-ai-tools-2026" rel="noopener noreferrer"&gt;Best AI Coding Tools for Developers in 2026 - Builder.io&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every day, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
    <item>
      <title>AI Weekly: Gemini 3.1 Pro Leads a Week Where Open Source Closes In</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Mon, 30 Mar 2026 12:01:51 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-gemini-31-pro-leads-a-week-where-open-source-closes-in-2bj7</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-gemini-31-pro-leads-a-week-where-open-source-closes-in-2bj7</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly: Gemini 3.1 Pro Leads a Week Where Open Source Closes In
&lt;/h1&gt;

&lt;p&gt;The gap between frontier and open-source models is shrinking faster than anyone predicted. This week, Google DeepMind dropped Gemini 3.1 Pro with benchmark numbers that would have seemed impossible two years ago, but the real story is what's happening in the open-weights space—MiMo-V2-Flash and DeepSeek V3.2 are now within striking distance of proprietary systems at a fraction of the cost. Meanwhile, the infrastructure for agentic AI is maturing rapidly, robotics funding is surging, and Wikipedia is drawing a hard line against AI-generated content.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gemini 3.1 Pro Arrives with 1M-Token Context and 77% ARC-AGI-2 Score
&lt;/h2&gt;

&lt;p&gt;Google DeepMind has released Gemini 3.1 Pro, positioning it as the most capable Pro-tier model in its lineup. The headline numbers are impressive: a 1 million token context window and a 77.1% score on the ARC-AGI-2 benchmark, which specifically tests abstract reasoning capabilities that have historically challenged language models.&lt;/p&gt;

&lt;p&gt;The multimodal reasoning extends across text, images, audio, video, and code in a unified architecture—no separate models stitched together. In practice, this means developers can pass in a two-hour video alongside a codebase and ask questions that require understanding both. Google claims latency improvements over 2.0 Pro despite the expanded capabilities, though independent benchmarks are still pending.&lt;/p&gt;

&lt;p&gt;Availability is broad: the model is accessible through the Gemini API, Vertex AI for enterprise deployments, and Google's Antigravity platform. Pricing sits at the expected Pro tier, making it competitive with Claude 4.1 Sonnet and GPT-5.1 for most production use cases.&lt;/p&gt;

&lt;p&gt;The ARC-AGI-2 score deserves attention. This benchmark specifically targets the kind of novel pattern recognition that pure next-token prediction struggles with—think IQ test problems rather than memorized facts. Breaking 77% suggests meaningful progress on generalization, though the gap to human performance (mid-80s for average adults) remains real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Open-Source Models Close the Gap: MiMo-V2-Flash and DeepSeek V3.2 Challenge Frontier Systems
&lt;/h2&gt;

&lt;p&gt;The open-source revolution in LLMs just hit an inflection point. MiMo-V2-Flash from Xiaomi's AI lab achieved a 66 QI (Quality Index) score with a stunning 96% on AIME—that's the American Invitational Mathematics Examination, not some synthetic benchmark. This represents the best mathematical reasoning performance ever recorded in an open-weights model.&lt;/p&gt;

&lt;p&gt;DeepSeek V3.2, released the same week, matches that 66 QI while offering inference at $0.30 per million tokens through DeepInfra. For comparison, GPT-5.1 runs $3.50/M tokens—more than 10x the cost for roughly equivalent capability on most tasks.&lt;/p&gt;

&lt;p&gt;A detailed Reddit analysis examining 94 LLM API endpoints found the proprietary advantage has compressed to approximately 4 QI points. Two years ago, that gap was 15-20 points. The practical implication: for most production workloads that don't require the absolute bleeding edge, open-source models now offer better cost-performance ratios.&lt;/p&gt;
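&lt;p&gt;The arithmetic behind that claim is easy to sanity-check. Here is a minimal sketch using only the per-million-token prices quoted above; the 50M-tokens/day workload is an invented example, not a benchmark figure:&lt;/p&gt;

```python
# Back-of-envelope cost comparison using the per-token prices cited above.
# Prices are illustrative figures from the article, not live provider quotes.

PRICE_PER_M_TOKENS = {
    "deepseek-v3.2": 0.30,   # open-weight, via a hosted provider
    "gpt-5.1": 3.50,         # proprietary frontier tier
}

def monthly_cost(model: str, tokens_per_day: int, days: int = 30) -> float:
    """Estimated monthly spend for a given daily token volume."""
    return PRICE_PER_M_TOKENS[model] * tokens_per_day / 1_000_000 * days

# A hypothetical workload of 50M tokens/day:
open_cost = monthly_cost("deepseek-v3.2", 50_000_000)   # 450.0
closed_cost = monthly_cost("gpt-5.1", 50_000_000)       # 5250.0
ratio = closed_cost / open_cost                          # ~11.7x
print(f"open: ${open_cost:,.0f}/mo  closed: ${closed_cost:,.0f}/mo  ratio: {ratio:.1f}x")
```

&lt;p&gt;The exact ratio varies with provider and volume discounts, but the order of magnitude is what drives the build-versus-buy shift.&lt;/p&gt;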

&lt;p&gt;This isn't just academic. Companies running inference at scale are doing the math: a 10x cost reduction with a &amp;lt;5% capability hit changes build-versus-buy calculations dramatically. We're seeing migration patterns accelerate, particularly for classification, summarization, and code generation tasks where the gap is smallest. The remaining proprietary advantages cluster around complex multi-step reasoning and highly specialized domains—exactly the capabilities that justify premium pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;Anthropic's 2026 Agentic Coding Trends Report, published last Tuesday, makes a bold prediction: multi-agent systems will largely replace single-agent workflows for complex coding tasks by year's end. The report cites internal data showing 3-4x throughput improvements when decomposing tasks across specialized agents versus monolithic prompting. This aligns with what we're seeing in production deployments—the single-agent pattern is hitting scaling walls.&lt;/p&gt;

&lt;p&gt;Microsoft's Agent Framework, now in public preview, takes an explicitly enterprise-first approach. The emphasis is on durable orchestration (agents that survive process restarts) and human-in-the-loop scenarios where approval gates are non-negotiable. This matters for regulated industries where "the AI just did it" isn't an acceptable answer to auditors.&lt;/p&gt;

&lt;p&gt;AutoGen v0.4+ represents a significant architectural pivot to event-driven execution with full async support. The previous version's synchronous patterns created bottlenecks in large-scale multi-agent coordination; the new architecture allows hundreds of agents to operate concurrently without blocking. Migration guides are available, but expect some friction—the programming model changed substantially.&lt;/p&gt;

&lt;p&gt;The dominant architectural pattern emerging across all frameworks is &lt;strong&gt;role archetypes&lt;/strong&gt;: Planner, Researcher, Coder, Reviewer. These constrained personas improve explainability (you can trace which "role" made which decision) and reduce the unbounded exploration that makes agents unreliable. Framework selection now increasingly hinges on state management philosophy—durable execution platforms like Temporal versus stateless serverless approaches determine whether your agents survive infrastructure failures.&lt;/p&gt;
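&lt;p&gt;The role-archetype pattern can be sketched independently of any specific framework. The prompts, class names, and string-based handoff below are illustrative only; a real system would call an LLM at each step:&lt;/p&gt;

```python
# Minimal sketch of the role-archetype pattern: each persona gets a
# constrained system prompt, and every handoff is recorded so decisions can
# be traced back to a role. Nothing here is a real framework API.

from dataclasses import dataclass, field

@dataclass
class RoleAgent:
    role: str
    system_prompt: str

@dataclass
class Pipeline:
    agents: list
    trace: list = field(default_factory=list)  # (role, input) pairs for audit

    def run(self, task: str) -> str:
        artifact = task
        for agent in self.agents:
            # In production this would invoke an LLM with agent.system_prompt;
            # here we just record the handoff for explainability.
            self.trace.append((agent.role, artifact))
            artifact = f"[{agent.role}] processed: {artifact}"
        return artifact

pipeline = Pipeline(agents=[
    RoleAgent("Planner", "Break the task into ordered steps."),
    RoleAgent("Coder", "Implement each step; make no architectural decisions."),
    RoleAgent("Reviewer", "Check the result against the plan; flag deviations."),
])
result = pipeline.run("add retry logic to the HTTP client")
for role, _ in pipeline.trace:
    print(role)  # the audit trail: Planner, Coder, Reviewer
```

&lt;p&gt;The value is in the trace: when something goes wrong, you can see which role received what input, rather than debugging one undifferentiated prompt.&lt;/p&gt;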

&lt;h2&gt;
  
  
  Wikipedia Cracks Down on AI-Generated Articles Amid Misinformation Concerns
&lt;/h2&gt;

&lt;p&gt;Wikipedia editors have implemented significantly stricter detection and removal protocols for AI-generated content, marking an escalation in the platform's ongoing struggle with synthetic text. The trigger: a wave of articles containing hallucinated citations—papers that don't exist, quotes that were never said, statistics fabricated wholesale.&lt;/p&gt;

&lt;p&gt;The detection methods combine automated classifiers with human review, focusing on synthetic writing patterns (the telltale hedging phrases, the suspiciously comprehensive coverage, the lack of idiosyncratic human perspective) and citation verification. Editors report finding articles where every single source was either non-existent or misrepresented.&lt;/p&gt;
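&lt;p&gt;As a toy illustration of the surface-signal side of such detection, here is a crude phrase-frequency heuristic. The phrase list is invented for illustration; production detectors are trained classifiers combined with human review, not keyword matchers:&lt;/p&gt;

```python
# Toy heuristic: count telltale hedging boilerplate per 100 words as a crude
# synthetic-text signal. Phrase list is illustrative, not from any real tool.

import re

TELLTALE_PHRASES = [
    r"it is important to note",
    r"in today's fast-paced world",
    r"plays a crucial role",
    r"a testament to",
]
PATTERN = re.compile("|".join(TELLTALE_PHRASES), re.IGNORECASE)

def hedging_score(text: str) -> float:
    """Telltale phrases per 100 words."""
    words = max(len(text.split()), 1)
    hits = len(PATTERN.findall(text))
    return 100.0 * hits / words

sample = ("It is important to note that the bridge plays a crucial role "
          "in regional commerce, a testament to 19th-century engineering.")
print(f"{hedging_score(sample):.1f} hits per 100 words")
```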

&lt;p&gt;This raises uncomfortable questions for the AI ecosystem. Wikipedia has been a cornerstone training data source for language models; if AI-generated content infiltrates Wikipedia at scale, we get a feedback loop where models train on their own hallucinations. The platform is essentially defending the integrity of one of the most important knowledge repositories on the internet.&lt;/p&gt;

&lt;p&gt;The response is part of a broader platform governance trend. Stack Overflow's AI content restrictions, academic publishers requiring AI disclosure, and social media platforms labeling synthetic content all reflect the same tension: AI-generated text is now good enough to pass casual inspection but not reliable enough to trust without verification. Wikipedia's hard line may influence how other knowledge platforms approach the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  ByteDance Launches Dreamina Seedance 2.0 with Built-In Misuse Protections
&lt;/h2&gt;

&lt;p&gt;ByteDance escalated the AI video generation race this week with Dreamina Seedance 2.0, notable less for its generation quality (competitive with Runway Gen-3 and Pika 2.0) than for its deployment model. The tool is integrated directly into CapCut, ByteDance's video editing platform with over 200 million monthly users.&lt;/p&gt;

&lt;p&gt;The built-in safeguards are the more interesting story. Seedance 2.0 includes detection systems that refuse to generate content matching known individuals without explicit consent mechanisms, won't produce photorealistic violence or explicit content, and watermarks all outputs with invisible fingerprints. These aren't post-hoc additions—they're architectural decisions baked into the model.&lt;/p&gt;

&lt;p&gt;This "responsible-by-design" approach contrasts with the "release-then-patch" pattern we've seen from other players. ByteDance clearly learned from the deepfake controversies that plagued earlier tools; TikTok's content moderation challenges presumably informed the decision to build guardrails before launch rather than after the damage is done.&lt;/p&gt;

&lt;p&gt;For developers, CapCut integration means API access is coming—ByteDance typically follows consumer launches with developer tools within 3-6 months. The misuse protections will likely carry over, meaning anyone building on this platform inherits the guardrails. Whether that's a feature or a limitation depends on your use case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cohere Ships Open-Source Voice Model Supporting 14 Languages on Consumer GPUs
&lt;/h2&gt;

&lt;p&gt;Cohere released an open-source speech transcription model this week designed to run on consumer-grade hardware—think RTX 4070-class GPUs, not datacenter A100s. The model supports 14 languages out of the box: English, Spanish, Mandarin, Hindi, Arabic, Portuguese, French, German, Japanese, Korean, Russian, Italian, Dutch, and Turkish.&lt;/p&gt;

&lt;p&gt;The performance numbers are solid if not groundbreaking: word error rates within 10% of Whisper large-v3 while running 3x faster on equivalent hardware. The real value proposition is architectural—this is a single model handling all 14 languages, not 14 separate models, which simplifies deployment significantly.&lt;/p&gt;
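&lt;p&gt;Word error rate, the metric behind that comparison, is just word-level edit distance normalized by reference length. A self-contained sketch:&lt;/p&gt;

```python
# Word error rate via dynamic-programming edit distance over words:
# WER = (substitutions + insertions + deletions) / reference word count.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

ref = "turn the thermostat down to sixty eight degrees"
hyp = "turn the thermostat down to sixty-eight degrees"
print(f"WER: {wer(ref, hyp):.3f}")  # 0.250: one substitution plus one deletion
```

&lt;p&gt;Note how tokenization choices (hyphenation, punctuation handling) move the number, which is one reason cross-model WER comparisons need identical normalization.&lt;/p&gt;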

&lt;p&gt;For developers building voice-enabled applications, the implications are meaningful. No API dependencies means no per-minute costs, no network latency, and no privacy concerns about audio leaving local infrastructure. A small business building a voice assistant can now do so without ongoing API costs that scale with usage.&lt;/p&gt;

&lt;p&gt;This continues the edge deployment trend we've tracked throughout 2025-2026: capable models are migrating from cloud-only to local-first. Voice is particularly suited to this pattern because audio data is sensitive and latency-critical. Cohere's licensing (Apache 2.0) removes commercial use restrictions, making this viable for production deployments without negotiating enterprise agreements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Physical Intelligence Eyes $1B Raise as Robotics AI Funding Accelerates
&lt;/h2&gt;

&lt;p&gt;Physical Intelligence, the robotics AI startup founded by ex-Google and Berkeley researchers, is reportedly in discussions for a $1 billion funding round that would approximately double its valuation to $4.5 billion. The company's π0 foundation model for robot control demonstrated cross-embodiment generalization last year—the same weights controlling arms, hands, and mobile bases.&lt;/p&gt;

&lt;p&gt;The timing aligns with a broader thesis gaining momentum: 2026 is the year physical AI and robotics become the new scaling frontier. IBM Research's January predictions explicitly called this out, arguing that pure LLM scaling is hitting diminishing returns and that embodied intelligence represents the next capability unlock.&lt;/p&gt;

&lt;p&gt;The capital deployment in this space is accelerating. SoftBank's $40 billion loan announced last week—primarily targeting AI infrastructure—signals that major investors see robotics and physical AI as requiring the same massive capital intensity that drove the LLM buildout. Figure, Sanctuary AI, and 1X are all reportedly raising substantial rounds.&lt;/p&gt;

&lt;p&gt;The bulls argue that language models have essentially solved perception and reasoning; applying those capabilities to physical tasks is the obvious next step. The bears counter that sim-to-real transfer, hardware reliability, and safety certification create multi-year deployment timelines that investor patience may not accommodate. Physical Intelligence's ability to close this round will signal which narrative the market believes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code Platoon Integrates AI into Veteran Coding Bootcamp Curriculum
&lt;/h2&gt;

&lt;p&gt;Code Platoon, the nonprofit bootcamp focused on transitioning military veterans and their spouses into tech careers, unveiled a substantially modernized curriculum this week. The updated program combines full-stack engineering fundamentals with generative AI skills, reflecting what the organization calls "the new baseline for engineering competency."&lt;/p&gt;

&lt;p&gt;The AI integration goes beyond "how to use ChatGPT." Students learn ML fundamentals (enough to understand what's happening inside the models), prompt engineering for production systems, RAG architecture for building knowledge-augmented applications, and evaluation frameworks for AI-assisted code. The capstone projects require building applications with generative AI components.&lt;/p&gt;

&lt;p&gt;This matters because workforce training programs are leading indicators. When bootcamps—which optimize aggressively for job placement—embed AI throughout their curricula rather than treating it as an elective, it signals that employers now expect these skills by default. Code Platoon's placement partners reportedly requested the curriculum changes, wanting graduates who can build with AI tooling from day one.&lt;/p&gt;

&lt;p&gt;The veteran angle adds another dimension: this population often brings domain expertise (logistics, cybersecurity, operations) that translates well to building AI applications for those verticals. Combining that background with modern engineering skills creates a talent pipeline that enterprise employers are actively seeking.&lt;/p&gt;




&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;Next week brings the Anthropic developer conference where Claude 4.2 is expected alongside expanded MCP tooling. The open-source momentum shows no signs of slowing—Llama 4 is rumored for Q2, which could compress the proprietary gap further. Most immediately, keep an eye on how quickly enterprise adoption shifts as the cost-capability curves cross; the migration patterns we're seeing in Q1 data suggest 2026 may be the year open-source becomes the default choice for most production LLM workloads.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every day, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
    <item>
      <title>Primitive Shifts: AI Skills — The Knowledge Primitive Replacing Prompt Engineering</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Sun, 29 Mar 2026 15:44:24 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/primitive-shifts-ai-skills-the-knowledge-primitive-replacing-prompt-engineering-14gf</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/primitive-shifts-ai-skills-the-knowledge-primitive-replacing-prompt-engineering-14gf</guid>
      <description>&lt;h1&gt;
  
  
  Primitive Shifts: AI Skills — The Knowledge Primitive Replacing Prompt Engineering
&lt;/h1&gt;

&lt;p&gt;Every few months, the baseline of how AI systems work quietly moves. Engineers who noticed early weren't smarter — they were just paying attention to the right signals. The shift from "prompt engineering" to "function calling" caught teams off guard in 2023. The rise of MCP for tool integration did the same in late 2024. Now, the same pattern is playing out with AI Skills — and if you're still treating agent context as a prompt engineering problem, you're about to feel that familiar sting of realizing the floor moved while you were standing on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is It?
&lt;/h2&gt;

&lt;p&gt;AI Skills are portable, structured knowledge packages that teach agents &lt;em&gt;how&lt;/em&gt; to perform domain-specific tasks. This is categorically different from tools (which provide &lt;em&gt;access&lt;/em&gt; to external systems) and prompts (which provide &lt;em&gt;instructions&lt;/em&gt; for a single interaction). A skill encapsulates institutional knowledge: your team's code review standards, your API versioning conventions, the twelve edge cases in your payment processing flow that every new engineer learns the hard way.&lt;/p&gt;

&lt;p&gt;The pattern didn't emerge from a single announcement. It crystallized from convergence. Cursor rules, GitHub Copilot custom instructions, Windsurf rules, and the &lt;code&gt;.cursorrules&lt;/code&gt; file format all evolved toward the same abstraction independently throughout 2024 and early 2025. Different teams, solving the same underlying problem, arrived at remarkably similar solutions.&lt;/p&gt;

&lt;p&gt;Anthropic formalized this in December 2025 with the Agent Skills open standard. OpenAI, Google, and Microsoft tooling adopted compatible interfaces within months. This wasn't vendor lock-in theater — the abstraction was obvious enough that interoperability became the path of least resistance.&lt;/p&gt;

&lt;p&gt;Critically, skills are consumed at inference time, not training time. They extend agent capabilities without fine-tuning, sitting in context windows as structured, discoverable knowledge units. The primitive answers a specific architectural question that every production AI team eventually asks: "How do I make my agent understand our codebase conventions, compliance requirements, and domain logic without rebuilding the model?"&lt;/p&gt;

&lt;p&gt;The answer used to be "stuff it in the system prompt and hope for the best." Now there's actual infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It's Flying Under the Radar
&lt;/h2&gt;

&lt;p&gt;Skills suffer from the worst kind of invisibility: they look like something teams already know how to dismiss. Engineers see &lt;code&gt;.md&lt;/code&gt; files with instructions and assume it's prompt engineering with extra steps. The structural and discoverability properties — the actual innovation — are invisible at first glance. "We already have a &lt;code&gt;CONTRIBUTING.md&lt;/code&gt;," teams say, missing that the format, placement, and tooling integration are what transform documentation into agent-consumable knowledge.&lt;/p&gt;

&lt;p&gt;The adoption happened inside IDEs, not through announcement posts. Over 60,000 open-source projects adopted &lt;code&gt;AGENTS.md&lt;/code&gt; files before most engineering teams noticed the pattern. The signal was there — you just had to be looking at repository structures rather than press releases.&lt;/p&gt;

&lt;p&gt;Skills solve a problem teams don't name correctly. "Context finding" is the number-one developer pain point — 40% cite it in recent surveys. But teams frame this as a documentation problem or a search problem, not a knowledge-packaging problem. They invest in better docs, improved search indexing, more comprehensive onboarding materials. Meanwhile, the actual friction is that their AI agents can't &lt;em&gt;find&lt;/em&gt; the knowledge that already exists.&lt;/p&gt;
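&lt;p&gt;The discoverability property is concrete enough to sketch: an agent harness scans the repository for knowledge files at well-known paths before assembling context. The filename set below reflects the conventions mentioned in this article; the repo layout is a throwaway temp-dir fixture for illustration:&lt;/p&gt;

```python
# Sketch of skill discovery: collect agent-consumable knowledge files from a
# repo tree, shallowest first so project-wide conventions precede local ones.
# Filenames follow conventions named in the article; layout is a fixture.

import tempfile
from pathlib import Path

KNOWLEDGE_FILES = {"AGENTS.md", ".cursorrules", "CONTRIBUTING.md"}

def discover_skills(repo_root: Path) -> list:
    """Return knowledge files ordered by depth, shallowest first."""
    hits = [p for p in repo_root.rglob("*") if p.name in KNOWLEDGE_FILES]
    return sorted(hits, key=lambda p: len(p.relative_to(repo_root).parts))

# Build a throwaway repo layout and scan it.
root = Path(tempfile.mkdtemp())
(root / "AGENTS.md").write_text("# Conventions\nUse uv, not pip.\n")
(root / "services" / "billing").mkdir(parents=True)
(root / "services" / "billing" / "AGENTS.md").write_text("# Billing edge cases\n")

for path in discover_skills(root):
    print(path.relative_to(root))  # repo-wide file first, then the nested one
```

&lt;p&gt;The point is that the agent never has to guess where institutional knowledge lives; placement conventions make it mechanically findable.&lt;/p&gt;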

&lt;p&gt;Here's what should concern you: the format war already ended. While internal teams debated custom solutions and enterprise architects drew up plans for proprietary knowledge management systems, the major AI platforms quietly converged on a common interface. The window for "we'll build our own" closed without a memo.&lt;/p&gt;

&lt;p&gt;Flask creator Armin Ronacher's observation went viral in niche circles: moving several MCP integrations to skills improved agent performance meaningfully — fewer hallucinations, better adherence to conventions, reduced back-and-forth. But this signal stayed in technical blogs and Hacker News threads. It didn't reach engineering leadership channels. Which means right now, there are teams planning six-month "AI context management" initiatives that could be solved with a well-structured markdown file and thirty minutes of work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hands-On: Try It Today
&lt;/h2&gt;

&lt;p&gt;Let's make this concrete. The following Python script demonstrates how to create, validate, and test a skill file that teaches an AI agent your team's database migration conventions. This isn't a toy example — it's the pattern production teams are using.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
skill_builder.py - Create and validate AI Skills for your codebase

This script demonstrates the emerging skill file pattern:
1. Structured markdown with semantic sections
2. Validation against the Agent Skills schema
3. Testing skill discovery with a local agent

Requires: pip install pydantic&amp;gt;=2.0 anthropic&amp;gt;=0.40.0 pyyaml&amp;gt;=6.0
Tested with: Python 3.11+, Anthropic API v0.40.0
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;field_validator&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;A single section within a skill file.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Section heading (## level)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Section content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0-10, higher = more critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentSkill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Represents a portable AI Skill following the emerging standard.

    Skills differ from prompts in three ways:
    1. They&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re discoverable (agents find them based on task context)
    2. They&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re portable (same skill works across Cursor, Claude Code, Copilot)
    3. They&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re composable (multiple skills can be active simultaneously)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skill identifier, e.g., &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;database-migrations&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^\d+\.\d+\.\d+$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;triggers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Keywords/patterns that should activate this skill&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;max_rules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Guardrails should stay under 15 rules&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@field_validator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sections&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_guardrails_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Enforce the 15-rule limit for GUARDRAILS sections.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GUARDRAIL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="n"&gt;rule_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;^[-*]\s&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MULTILINE&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rule_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GUARDRAILS section has &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rule_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rules; max is 15. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Agents struggle with more — prioritize ruthlessly.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;to_markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Export skill as AGENTS.md compatible markdown.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;!-- skill-version: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; --&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;!-- triggers: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;triggers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; --&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_migration_skill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AgentSkill&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Example: A skill teaching database migration conventions.

    This encapsulates knowledge that would otherwise live in:
    - Onboarding docs (rarely read by AI)
    - Code review comments (learned too late)
    - Tribal knowledge (never written down)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;AgentSkill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database-migrations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;triggers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;migration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema change&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alembic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALTER TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="nc"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GUARDRAILS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
- NEVER use `DROP COLUMN` without a two-phase migration plan
- ALWAYS add new columns as nullable first, backfill, then add constraints
- NEVER modify migrations that have been applied to production
- ALWAYS include rollback instructions in migration docstring
- Use `op.execute()` for data migrations, not ORM models
- Foreign keys must use `ondelete=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CASCADE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;` or explicit alternative
- Index names must follow pattern: `ix_{table}_{columns}`
- ALWAYS run `alembic check` before committing migration files
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;SkillSection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CONVENTIONS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Our migrations follow a specific lifecycle:

1. **Development**: Create migration with `alembic revision --autogenerate -m &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;`
2. **Review**: Migration PRs require explicit sign-off from database owner
3. **Staging**: Migrations run automatically on merge to `staging` branch
4. **Production**: Migrations require manual trigger via deployment pipeline

Migration files live in `src/db/migrations/versions/`. The revision ID format
is timestamp-based (YYYYMMDD_HHMM) not random hex.

For large tables (&amp;gt;1M rows), use batch operations:
```python
# Good: batched update via chunked SQL (repeat until zero rows are affected)
op.execute(
    "UPDATE users SET flag = true "
    "WHERE id IN (SELECT id FROM users WHERE flag IS NULL LIMIT 10000)"
)

# Bad: one statement that rewrites and locks the whole table
op.execute("UPDATE users SET flag = true")
```
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
            ),
            SkillSection(
                heading="COMMON MISTAKES",
                priority=7,
                content="""
**The ORM Import Trap**: Never import application models in migrations.
Models change; migrations are immutable history. Use raw SQL or SQLAlchemy core.

**The Default Value Trap**: Adding `server_default` to existing column requires
two migrations — one to add the default, one to backfill NULL values.

**The Index Trap**: Creating indexes on large tables blocks writes. Use 
`CREATE INDEX CONCURRENTLY` via `op.execute()` with `postgresql_concurrently=True`.
"""
            ),
        ],
    )


def test_skill_discovery(skill: AgentSkill, test_query: str) -&amp;gt; dict:
    """
    Test whether an agent correctly discovers and applies a skill.

    This simulates what happens when a developer asks for help with
    a task that should trigger skill activation.
    """
    client = anthropic.Anthropic()  # Uses ANTHROPIC_API_KEY env var

    skill_content = skill.to_markdown()

    # Simulate skill injection (in production, IDEs do this automatically)
    system_prompt = f"""You are a coding assistant with access to project-specific skills.

Active skill:
---
{skill_content}
---

When the user's request relates to a skill's triggers, apply that skill's 
guardrails and conventions. Cite specific rules when relevant."""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": test_query}]
    )

    response_text = response.content[0].text

    # Check if skill was actually applied
    skill_indicators = [
        "nullable first" in response_text.lower(),
        "guardrail" in response_text.lower(),
        "two-phase" in response_text.lower(),
        "rollback" in response_text.lower(),
    ]

    return {
        "query": test_query,
        "response": response_text,
        "skill_applied": any(skill_indicators),
        "indicators_found": sum(skill_indicators),
    }


if __name__ == "__main__":
    # Create the skill
    skill = create_migration_skill()

    # Write to AGENTS.md (or skill-specific file)
    output_path = Path("AGENTS.md")
    output_path.write_text(skill.to_markdown())
    print(f"✓ Wrote skill to {output_path}")

    # Validate structure
    print(f"✓ Skill '{skill.name}' v{skill.version} validated")
    print(f"  Triggers: {skill.triggers}")
    print(f"  Sections: {[s.heading for s in skill.sections]}")

    # Test discovery (requires ANTHROPIC_API_KEY; skip cleanly when it's absent)
    import os
    if os.environ.get("ANTHROPIC_API_KEY"):
        result = test_skill_discovery(
            skill,
            "I need to add a new 'email_verified' boolean column to the users table"
        )
        print("\n✓ Discovery test:")
        print(f"  Query: {result['query']}")
        print(f"  Skill applied: {result['skill_applied']}")
        print(f"  Indicators found: {result['indicators_found']}/4")
    else:
        print("\n⚠ Skipping discovery test (set ANTHROPIC_API_KEY to test)")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start with a single &lt;code&gt;AGENTS.md&lt;/code&gt; file at your repository root. Document your team's code review standards, API conventions, and — critically — the things your AI assistant keeps getting wrong. The emerging hierarchy uses &lt;code&gt;MEMORY.md&lt;/code&gt; for long-term knowledge, &lt;code&gt;GUARDRAILS.md&lt;/code&gt; for hard constraints (keep it under 15 rules; agents struggle with more), and &lt;code&gt;WORKSTATE.md&lt;/code&gt; for session continuity.&lt;/p&gt;
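&lt;p&gt;As a starting point, a &lt;code&gt;GUARDRAILS.md&lt;/code&gt; might look like the sketch below. The rules are placeholders; substitute the constraints your own agents actually violate.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# GUARDRAILS

&amp;lt;!-- Hard constraints only. Keep under 15 rules; prune before adding. --&amp;gt;

- NEVER commit secrets or tokens; reference the vault instead
- ALWAYS run the test suite before proposing a diff
- NEVER edit generated files (anything under `src/gen/`)
- ALWAYS ask before deleting more than 20 lines at once
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;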

&lt;p&gt;Test skill discovery by creating a skill for one workflow and verifying your IDE's agent surfaces it unprompted when relevant context appears. Measure the delta: run identical coding tasks with and without skills loaded. Teams report 20-30% reduction in back-and-forth corrections, which compounds significantly over a sprint.&lt;/p&gt;
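&lt;p&gt;"Measure the delta" can start very small: count skill-specific markers in the agent's output for the same task, with and without the skill loaded. A minimal offline sketch, in which the indicator phrases and both transcripts are invented for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Score responses for skill adherence; markers are invented for illustration.
INDICATORS = ("nullable first", "two-phase", "rollback", "guardrail")

def indicator_score(response_text):
    """Count how many skill-specific markers appear in a response."""
    text = response_text.lower()
    return sum(1 for marker in INDICATORS if marker in text)

# Hypothetical transcripts for the same task, without and with the skill loaded:
without_skill = "Sure, I'll add the column with ALTER TABLE and a NOT NULL default."
with_skill = ("Per the guardrails: add the column nullable first, backfill, "
              "then tighten constraints as a two-phase migration with a rollback plan.")

delta = indicator_score(with_skill) - indicator_score(without_skill)
print(f"adherence delta: +{delta} indicators")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;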

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;If your team has per-developer &lt;code&gt;.cursorrules&lt;/code&gt; files, per-repo prompt templates, and scattered &lt;code&gt;system_prompt.txt&lt;/code&gt; files across different projects, you're accumulating non-portable knowledge silos. This is technical debt that will become painful when you need to switch tools, onboard new team members, or scale agent usage beyond individual developers.&lt;/p&gt;

&lt;p&gt;More immediately relevant: your RAG pipelines may be over-engineered for problems skills solve natively. If you're chunking internal documentation, embedding it into a vector database, and injecting retrieved chunks into prompts at inference time, ask yourself whether a curated 500-line skill file would perform better. For institutional knowledge — conventions, constraints, common mistakes — the answer is usually yes. Lower latency, lower cost, better coherence.&lt;/p&gt;

&lt;p&gt;The "context window economy" is real. Atlassian's 2025 developer productivity data shows AI gains are significantly offset by knowledge-finding friction. Skills function as the compression layer that makes institutional knowledge agent-consumable. You're not solving a documentation problem; you're solving a &lt;em&gt;context density&lt;/em&gt; problem.&lt;/p&gt;

&lt;p&gt;Here's the architectural insight that matters: Skills and MCP form a complete architecture. MCP gives agents &lt;em&gt;access&lt;/em&gt; to systems — databases, APIs, services, tools. Skills give agents &lt;em&gt;knowledge&lt;/em&gt; about how to use that access correctly. Most teams have built the former without the latter. They have agents that can query their databases and call their APIs, but those agents don't know the team's conventions, constraints, or context. That's why the outputs feel generic even when the capabilities are powerful.&lt;/p&gt;

&lt;p&gt;Your evaluation frameworks need updating too. You're not just testing model outputs anymore. You're testing whether the right skills were discovered and applied for the task. Add skill coverage to your agent evaluation suite.&lt;/p&gt;
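&lt;p&gt;A cheap first version of "skill coverage" is a routing check: given a task description, assert that the expected skill's triggers fire before you ever score the model's output. A sketch, with illustrative trigger lists (in practice, parse them from your skill files):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Routing check: does this task activate the expected skill?
# Trigger lists are illustrative; load them from your real skill files.
SKILL_TRIGGERS = {
    "database-migrations": ["migration", "schema change", "alembic", "alter table"],
    "api-conventions": ["endpoint", "openapi", "status code"],
}

def discovered_skills(task):
    """Return the names of skills whose triggers appear in the task text."""
    task_lower = task.lower()
    return {
        name for name, triggers in SKILL_TRIGGERS.items()
        if any(trigger in task_lower for trigger in triggers)
    }

# Evaluation-suite style assertions:
assert "database-migrations" in discovered_skills(
    "Add an email_verified column to users via an Alembic migration"
)
assert discovered_skills("Summarize yesterday's standup notes") == set()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;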

&lt;h2&gt;
  
  
  The Infrastructure Signal
&lt;/h2&gt;

&lt;p&gt;When Anthropic, OpenAI, Google, and Microsoft independently adopt the same abstraction within six months, you're watching an interface crystallize into infrastructure. This is the signal that separates "interesting experiment" from "new baseline." Skills crossed that threshold.&lt;/p&gt;

&lt;p&gt;Bessemer Venture Partners' 2025 State of AI report identifies skills as the "memory, context, and beyond" layer — what they call the dark matter of production AI systems. It's the factor that explains why two teams using the same model, same tools, and same prompts get meaningfully different results. One team's agents know things. The other team's agents are constantly relearning.&lt;/p&gt;

&lt;p&gt;The pattern mirrors function calling's trajectory from 2023. That capability went from prompt hack to first-class API primitive to "why are you still parsing JSON manually?" in under 18 months. Skills are on the same curve, roughly twelve months behind. We're currently in the "early adopters have integrated it; everyone else is about to feel pressure" phase.&lt;/p&gt;

&lt;p&gt;IDE vendors are building skill registries and discovery mechanisms into their agent UX. The primitives are becoming platform features, not user configurations. This is the point where opting out starts to cost more than opting in.&lt;/p&gt;

&lt;p&gt;The research-to-production gap is closing fast. Academic work on "institutional knowledge primitives" appeared in arXiv preprints in March 2025. Those same concepts shipped in production products by Q4 2025. The theory-to-deployment cycle has compressed to months, not years. By the time you read a research paper about emerging AI patterns, the major platforms have often already shipped an implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift Rating
&lt;/h2&gt;

&lt;p&gt;🟢 &lt;strong&gt;Adopt Now&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The standard exists. Major platforms support it. The migration cost is minimal — you're writing structured markdown files, not rebuilding infrastructure. Teams that formalize their institutional knowledge into skills today will compound that advantage as agent capabilities expand. Every skill you write now becomes more valuable as models get better at following them.&lt;/p&gt;

&lt;p&gt;Teams that wait will find themselves explaining the same context to every new AI tool, repeatedly, while competitors' agents already know the answers. The compounding effect here is real: a team with well-maintained skills gets better agent outputs, which means less time correcting agents, which means more time building, which means faster iteration on the skills themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start This Week
&lt;/h2&gt;

&lt;p&gt;Pick one workflow where your AI assistant consistently makes mistakes or asks for clarification. Document the implicit knowledge in a skill file. Measure the improvement. Then expand from there. The infrastructure is ready. The question is whether you are.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is part of &lt;strong&gt;Primitive Shifts&lt;/strong&gt;, a monthly series tracking when new AI building blocks move from novel experiments to infrastructure you'll be expected to know.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Next MCP Watch series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Spotted a shift happening in your stack? Drop it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>LangGraph 2.0: The Definitive Guide to Building Production-Grade AI Agents in 2026</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Sun, 29 Mar 2026 14:35:04 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/langgraph-20-the-definitive-guide-to-building-production-grade-ai-agents-in-2026-4j2b</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/langgraph-20-the-definitive-guide-to-building-production-grade-ai-agents-in-2026-4j2b</guid>
      <description>&lt;h1&gt;
  
  
  LangGraph 2.0: The Definitive Guide to Building Production-Grade AI Agents in 2026
&lt;/h1&gt;

&lt;p&gt;The agent framework wars are over, and LangGraph won. Not because it's the simplest tool—it isn't—but because production agents have proven to be fundamentally different from the linear pipelines we built in 2024. When your agent needs to retry a failed API call, loop back for clarification, pause for human approval, and recover gracefully from partial failures, you need a graph, not a chain. LangGraph 2.0, released in February 2026, codifies three years of hard-won production patterns into a framework that finally feels mature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why LangGraph Matters Now: From Chains to Graphs
&lt;/h2&gt;

&lt;p&gt;The mental model shift from LangChain to LangGraph is the same shift that took web development from CGI scripts to event-driven architectures. In LangChain, you think in sequences: retrieve, augment, generate, done. In LangGraph, you think in states and transitions: what state is my agent in, what conditions determine where it goes next, and what happens when something fails?&lt;/p&gt;

&lt;p&gt;This isn't academic. Consider a simple document research agent. In the linear world, you retrieve documents, synthesize findings, and return a response. But production requirements quickly break this model. What if the retrieval returns low-confidence results? You need to loop back and refine the query. What if the synthesized action requires human approval before execution? You need to pause, persist state, and resume potentially hours later. What if the external API you're calling has a 5% failure rate? You need retry logic with backoff.&lt;/p&gt;

&lt;p&gt;LangGraph represents these workflows as directed graphs that permit cycles, where nodes are actions and edges are conditional transitions. The graph can loop back on itself, branch into parallel paths, or pause indefinitely waiting for external input. This is the fundamental primitive that production agents require.&lt;/p&gt;
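&lt;p&gt;Stripped of the framework, the primitive is small enough to sketch in plain Python. The toy runner below is an illustration only (the node names and confidence threshold are invented, and real LangGraph layers state schemas, checkpointing, and persistence on top), but it shows how a routing table naturally expresses the loop-back behavior a chain cannot:&lt;/p&gt;

```python
# Toy graph runner: nodes are functions over state, edges is a routing
# table mapping each node to a function that picks the next node.
# Illustrative sketch only; names and thresholds here are invented.
END = "__end__"

def run(nodes, edges, state, entry):
    current = entry
    while current != END:
        state = nodes[current](state)
        current = edges[current](state)
    return state

def retrieve(state):
    # Each retry pretends to improve retrieval confidence a little.
    state["attempts"] += 1
    state["confidence"] += 0.4
    return state

def synthesize(state):
    state["answer"] = f"synthesized after {state['attempts']} attempt(s)"
    return state

nodes = {"retrieve": retrieve, "synthesize": synthesize}
edges = {
    # Low confidence loops back to retrieval -- a cycle, not a chain.
    "retrieve": lambda s: "retrieve" if s["confidence"] < 0.7 else "synthesize",
    "synthesize": lambda s: END,
}

final = run(nodes, edges, {"attempts": 0, "confidence": 0.0}, "retrieve")
```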

&lt;p&gt;The adoption numbers reflect this reality. Gartner's March 2026 report predicts 40% of enterprise applications will embed agentic capabilities by year-end, up from 12% in 2025. Among teams building these systems, LangGraph has emerged as the dominant orchestration layer—not because LangChain failed, but because it solved a different problem.&lt;/p&gt;

&lt;p&gt;That said, LangGraph is frequently overkill. If you're building a straightforward RAG pipeline, a Q&amp;amp;A bot without loops, or any single-turn interaction, LangChain remains the right choice. LangGraph introduces complexity—state schemas, checkpoint persistence, graph compilation—that pays dividends for complex workflows but adds friction for simple ones. The decision tree is straightforward: if your agent needs to loop, branch, retry, or pause, use LangGraph. If it doesn't, don't.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's New in LangGraph 2.0: Breaking Changes and Migration
&lt;/h2&gt;

&lt;p&gt;Version 2.0 represents LangGraph's graduation from "promising framework" to "production foundation." The API surface has been dramatically cleaned up, deprecated patterns removed, and type safety elevated to first-class status. This maturity comes with breaking changes that will require migration effort.&lt;/p&gt;

&lt;p&gt;The most significant breaking change is &lt;code&gt;StateGraph&lt;/code&gt; initialization. In v0.1.x, you could create a graph with minimal configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# v0.1.x (DEPRECATED)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;research_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Checkpointing was optional and configured separately
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In v2.0, checkpoint configuration is required at initialization, and the API enforces explicit persistence decisions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# v2.0 (CURRENT)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.checkpoint.postgres&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PostgresSaver&lt;/span&gt;

&lt;span class="n"&gt;checkpointer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PostgresSaver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_conn_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:pass@localhost/langgraph&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pool_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# New in 2.0: connection pooling configuration
&lt;/span&gt;    &lt;span class="n"&gt;schema_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# New in 2.0: multi-tenant schema support
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;interrupt_before&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# Moved from add_node to initialization
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second major change is interrupt handling. The previous pattern used exceptions to signal interrupts, which was fragile and made error handling ambiguous:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# v0.1.x (DEPRECATED)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.errors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NodeInterrupt&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;human_review_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;needs_approval&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;NodeInterrupt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Awaiting human approval&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Version 2.0 introduces a clean &lt;code&gt;interrupt()&lt;/code&gt; function that explicitly signals pause points:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# v2.0 (CURRENT)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;interrupt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Command&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;human_review_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;needs_approval&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;approval&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;interrupt&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Approve this action?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;approval&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The new approach makes interrupt payloads explicit, provides better typing for IDE support, and cleanly separates the interrupt signal from error handling.&lt;/p&gt;
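&lt;p&gt;The contract behind &lt;code&gt;interrupt()&lt;/code&gt; is easy to see in miniature with a plain generator: the node yields a payload, the caller holds it, and a later &lt;code&gt;send()&lt;/code&gt; injects the human's answer at the exact point execution stopped. This framework-free analogy is illustrative only; LangGraph additionally checkpoints the paused state so the resume can happen in a different process, hours later:&lt;/p&gt;

```python
# Generator analogy for interrupt()/resume. Illustrative only: real
# LangGraph persists the paused state through the checkpointer.
def human_review_node(state):
    if state["needs_approval"]:
        # "yield" plays the role of interrupt(): execution stops here
        # and the payload travels out to whoever handles the pause.
        answer = yield {
            "question": "Approve this action?",
            "proposed_action": state["proposed_action"],
        }
        state["approval_status"] = answer  # resumed with the human's reply
    return state

gen = human_review_node({"needs_approval": True, "proposed_action": "deploy"})
payload = next(gen)               # runs until the pause point
try:
    gen.send("approved")          # later: inject the approval and resume
except StopIteration as done:
    final_state = done.value      # node finished with the updated state
```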

&lt;p&gt;Guardrail Nodes represent the third major addition. Enterprise deployments have been rolling their own content filtering, rate limiting, and compliance logging for years. Version 2.0 makes these first-class primitives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.guardrails&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ContentFilter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AuditLogger&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_guardrail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;ContentFilter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PII_PATTERN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PROFANITY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# or "block", "flag"
&lt;/span&gt;    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Apply before these nodes
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_guardrail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;requests_per_minute&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;burst_limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# or "global", "per_thread"
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_guardrail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;AuditLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;destination&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloudwatch://agents/compliance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;include_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;redact_fields&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_pii&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For teams migrating existing applications, LangChain provides automated migration scripts in the &lt;code&gt;langchain-cli&lt;/code&gt; package. Running &lt;code&gt;langchain migrate langgraph&lt;/code&gt; will identify deprecated patterns and suggest fixes. However, the scripts can't automatically migrate custom checkpoint implementations or complex interrupt handlers—budget manual effort for those components.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Agent Protocol Support: A2A and MCP Integration
&lt;/h2&gt;

&lt;p&gt;Before 2026, connecting an agent to external tools meant writing custom integration code for every tool, every database, every API. Connecting agents to other agents—especially across frameworks—was effectively impossible without building custom message-passing infrastructure. Two protocols have emerged to solve this: MCP for agent-tool connections and A2A for agent-agent communication.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol (MCP), originally developed by Anthropic and now maintained by the Linux Foundation, has become the "USB port for agents." It provides a standardized way for agents to discover, authenticate with, and invoke external tools. Rather than writing custom code to connect your agent to Slack, you connect to a Slack MCP server that exposes standardized tool definitions.&lt;/p&gt;

&lt;p&gt;LangGraph 2.0 treats MCP as a first-class primitive. Tools can be imported directly from MCP servers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.tools.mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MCPToolkit&lt;/span&gt;

&lt;span class="c1"&gt;# Connect to MCP servers
&lt;/span&gt;&lt;span class="n"&gt;toolkit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MCPToolkit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;servers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp://tools.company.internal/slack&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp://tools.company.internal/jira&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp://tools.company.internal/postgres&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;auth_provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vault://secrets/mcp-tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Tools are automatically typed and documented
&lt;/span&gt;&lt;span class="n"&gt;available_tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;toolkit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# Returns: [SlackSendMessage, SlackReadChannel, JiraCreateTicket, ...]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The A2A (Agent-to-Agent) protocol addresses the harder problem: agents communicating with other agents, potentially running on different frameworks. A LangGraph orchestrator might need to delegate a subtask to a CrewAI crew, receive the result, and incorporate it into its own state.&lt;/p&gt;

&lt;p&gt;LangGraph 2.0's A2A support works through message queues and shared contracts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.a2a&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;A2AClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentContract&lt;/span&gt;

&lt;span class="c1"&gt;# Define the contract for the external agent
&lt;/span&gt;&lt;span class="n"&gt;research_crew_contract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentContract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;crewai://research-team.internal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;input_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ResearchRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ResearchReport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;a2a_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;broker_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nats://a2a-broker.internal:4222&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contracts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;research_crew_contract&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delegate_research_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Delegate deep research to specialized CrewAI team.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResearchRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;comprehensive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;academic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;news&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;internal_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Send request and await response (with timeout and retry)
&lt;/span&gt;    &lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;a2a_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;crewai://research-team.internal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;correlation_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delegation_complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The architecture flows like this: your LangGraph orchestrator node sends a message to the A2A broker, which routes it to the CrewAI crew. That crew processes the request, sends its response back through the broker, and your LangGraph node receives it and updates state. From LangGraph's perspective, it's just an async node; the complexity of cross-framework communication is hidden behind the protocol.&lt;/p&gt;
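&lt;p&gt;The round trip is easy to picture with two in-process queues standing in for the broker. This is a deliberately minimal sketch (real A2A adds authentication, schema validation, correlation IDs, and retries), but the shape of the request/reply exchange is the same:&lt;/p&gt;

```python
# Two in-process queues stand in for the NATS broker; a thread stands in
# for the CrewAI crew. Illustrative sketch only -- real A2A layers auth,
# schema contracts, and timeouts on top of this request/reply shape.
import queue
import threading

requests, replies = queue.Queue(), queue.Queue()

def crew_worker():
    # The "crew": consume one request, do the work, reply via the broker.
    topic = requests.get()
    replies.put({"topic": topic, "report": f"findings on {topic}"})

threading.Thread(target=crew_worker, daemon=True).start()

requests.put("Q4 revenue trends")       # orchestrator node delegates
report = replies.get(timeout=5)         # ...then awaits the response
```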

&lt;p&gt;One critical gotcha: MCP and A2A are versioned protocols, and version mismatches cause subtle bugs. Always pin your protocol versions in production, and test thoroughly when upgrading. The &lt;code&gt;mcp-version&lt;/code&gt; and &lt;code&gt;a2a-version&lt;/code&gt; fields in your configuration should be explicit, not &lt;code&gt;latest&lt;/code&gt;.&lt;/p&gt;
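&lt;p&gt;A cheap safeguard is to validate the remote side's declared version against your pin at connection time, rather than discovering the mismatch mid-run. The check below is a generic semver-style sketch, an assumption rather than an API from either protocol's SDK:&lt;/p&gt;

```python
# Generic semver-style compatibility check -- an assumption, not an API
# from the MCP or A2A SDKs. Same major version required; the remote
# minor must be at least the pinned minor.
def compatible(pinned: str, remote: str) -> bool:
    p_major, p_minor = (int(x) for x in pinned.split(".")[:2])
    r_major, r_minor = (int(x) for x in remote.split(".")[:2])
    return p_major == r_major and r_minor >= p_minor

assert compatible("1.4.0", "1.5.2")       # newer minor: fine
assert not compatible("1.4.0", "2.0.0")   # major bump: refuse to connect
assert not compatible("1.4.0", "1.3.9")   # remote too old: refuse
```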

&lt;h2&gt;
  
  
  Hands-On: Code Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's build a complete document research agent that demonstrates LangGraph 2.0's key patterns. This agent searches documents, synthesizes findings, pauses for human approval before taking actions, and can loop back for clarification when needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Document Research Agent with Human-in-the-Loop Approval
LangGraph 2.0 | Python 3.12+ | PostgreSQL persistence

Requirements:
    langgraph==2.0.3
    langchain-anthropic==0.3.1
    langchain-postgres==0.2.0
    psycopg[pool]==3.2.0
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatAnthropic&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.messages&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.checkpoint.postgres&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PostgresSaver&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;interrupt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Command&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.prebuilt&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ToolNode&lt;/span&gt;


&lt;span class="c1"&gt;# === State Schema Definition ===
# Annotated fields with reducer functions define how state updates accumulate
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Complete state for the research agent.

    Using Annotated with reducers means multiple node updates
    to the same field are combined rather than overwritten.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Messages accumulate across nodes
&lt;/span&gt;    &lt;span class="n"&gt;research_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Research findings accumulate
&lt;/span&gt;    &lt;span class="n"&gt;synthesis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;  &lt;span class="c1"&gt;# Latest synthesis (overwrites)
&lt;/span&gt;    &lt;span class="n"&gt;proposed_action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# Action awaiting approval
&lt;/span&gt;    &lt;span class="n"&gt;approval_status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;needs_clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
    &lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;  &lt;span class="c1"&gt;# Research confidence (0.0 - 1.0)
&lt;/span&gt;    &lt;span class="n"&gt;iteration_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;  &lt;span class="c1"&gt;# Track loops for circuit-breaker
&lt;/span&gt;

&lt;span class="c1"&gt;# === Tool Definitions ===
&lt;/span&gt;
&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search the document corpus for relevant information.

    Args:
        query: Natural language search query
        max_results: Maximum number of results to return

    Returns:
        List of document excerpts with relevance scores
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# In production, this would call your vector store
&lt;/span&gt;    &lt;span class="c1"&gt;# Simulated response for demonstration
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DOC-2024-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;excerpt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Q4 revenue exceeded projections by 12%...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relevance_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.92&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_reports&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DOC-2024-002&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;excerpt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Customer acquisition cost decreased to $45...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relevance_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metrics_dashboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;


&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute an approved action.

    Args:
        action_type: Type of action (e.g., &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update_dashboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)
        parameters: Action-specific parameters

    Returns:
        Execution result with status and details
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# In production, this would perform real actions
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;action_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-03-29T10:30:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;details&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Executed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;action_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# === Model Configuration ===
&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Low temperature for consistent research
&lt;/span&gt;    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Bind tools to model for automatic tool calling
&lt;/span&gt;&lt;span class="n"&gt;model_with_tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind_tools&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;search_documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;execute_action&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;


&lt;span class="c1"&gt;# === Node Implementations ===
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;research_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Invoke retrieval tools and accumulate research results.

    This node lets the model decide which tools to call based
    on the current conversation context and previous findings.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are a research assistant.
    Analyze the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s request and search for relevant documents.
    If previous research exists, determine if more searching is needed
    or if you have sufficient information to synthesize findings.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_with_tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;synthesis_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Synthesize research findings and propose next action.

    Evaluates accumulated research, generates a synthesis,
    calculates confidence, and proposes an action if warranted.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Based on the research results,
    synthesize the findings into a coherent summary. Then:
    1. Assess your confidence (0.0-1.0) in the completeness of research
    2. If confidence &amp;gt; 0.7, propose a specific action to take
    3. If confidence &amp;lt;= 0.7, indicate more research is needed

    Format your response as JSON with keys: synthesis, confidence, proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Include research results in context
&lt;/span&gt;    &lt;span class="n"&gt;research_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research findings: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;research_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;research_context&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Parse the structured response (in production, use structured output)
&lt;/span&gt;    &lt;span class="c1"&gt;# Simplified for demonstration
&lt;/span&gt;    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;human_review_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Pause for human approval using LangGraph 2.0 interrupt API.

    This node suspends graph execution and awaits external input.
    The interrupt payload is sent to the client and can be displayed
    in a UI for human review.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Create interrupt with full context for reviewer
&lt;/span&gt;    &lt;span class="n"&gt;approval_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;interrupt&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Based on &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;research_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;options&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approve&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;needs_clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# approval_response comes from the resume call
&lt;/span&gt;    &lt;span class="c1"&gt;# e.g., {"decision": "approved", "notes": "Looks good"}
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;approval_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reviewer decision: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;approval_response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;action_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute the approved action.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Action was not approved.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]}&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;execute_action&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Action executed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clarification_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Handle requests for clarification by refining the research query.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ll gather more specific information.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;  &lt;span class="c1"&gt;# Reset to trigger more research
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# === Conditional Edge Functions ===
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_after_synthesis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Determine next step based on confidence and iteration count.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Circuit breaker: max 5 research iterations
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Loop back for more research
&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_after_approval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Route based on human approval decision.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;needs_clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# rejected or other
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_after_clarification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Always loop back to research after clarification.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="c1"&gt;# === Graph Construction ===
&lt;/span&gt;
&lt;span class="c1"&gt;# Configure PostgreSQL persistence with 2.0 connection pooling
&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PostgresSaver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_conn_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://langgraph:secure_password@localhost:5432/agents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pool_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_overflow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pool_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;schema_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Build the graph
&lt;/span&gt;&lt;span class="n"&gt;graph_builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add nodes
&lt;/span&gt;&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;research_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ToolNode&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;search_documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;execute_action&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;synthesis_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;human_review_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clarification_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add edges
&lt;/span&gt;&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Always run tools after research node
&lt;/span&gt;&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route_after_synthesis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route_after_approval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clarification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route_after_clarification&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Compile the graph
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="c1"&gt;# === Execution Examples ===
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_research_session&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Demonstrate full agent execution with human-in-the-loop.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Initial invocation - will pause at human_review
&lt;/span&gt;    &lt;span class="n"&gt;thread_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;configurable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-session-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

    &lt;span class="n"&gt;initial_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze Q4 financial performance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# First run - executes until interrupt
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting research session...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;initial_state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Paused for approval. Synthesis: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Simulate time passing, human reviews in UI...
&lt;/span&gt;    &lt;span class="c1"&gt;# Resume with approval
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Resuming with approval...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;final_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analysis looks comprehensive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="n"&gt;thread_config&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Final result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;final_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stream_research_session&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Stream execution for real-time UI updates.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;thread_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;configurable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-session-002&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
    &lt;span class="n"&gt;initial_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What were our top customer complaints last quarter?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proposed_action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;approval_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Stream mode provides real-time updates
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;initial_state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stream_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;updates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;node_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] State update received&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# In a real app, send these updates via WebSocket to frontend
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Confidence: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;N/A&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;run_research_session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This implementation demonstrates several LangGraph 2.0 patterns. The state schema uses &lt;code&gt;Annotated&lt;/code&gt; fields with reducer functions to accumulate messages and research results across iterations. The &lt;code&gt;interrupt()&lt;/code&gt; function cleanly pauses execution with a structured payload that can be rendered in a review UI. Conditional edges route the graph based on confidence scores and approval decisions, enabling both loops (back to research) and branches (to action or clarification). The checkpointer configuration uses 2.0's connection pooling for production database usage.&lt;/p&gt;
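&lt;p&gt;The reducer behavior is easiest to see in isolation. Below is a stand-alone sketch in plain Python, not LangGraph internals: the &lt;code&gt;merge_field&lt;/code&gt; helper is hypothetical and only mimics what the framework does when a node returns a partial update for an &lt;code&gt;Annotated&lt;/code&gt; field.&lt;/p&gt;

```python
# Stand-alone sketch of the reducer pattern, not LangGraph internals.
# merge_field is a hypothetical helper that mimics what the framework does
# when a node returns a partial state update for an Annotated field.
import operator
from typing import Annotated, TypedDict

class SketchState(TypedDict):
    research_results: Annotated[list, operator.add]  # reducer: lists concatenate
    synthesis: str  # plain field: the latest update simply overwrites

def merge_field(reducer, current, update):
    """Apply the field's reducer instead of overwriting the value."""
    return reducer(current, update)

# Two research iterations each contribute results; the reducer accumulates them.
merged = merge_field(operator.add, ["q4-report.pdf"], ["board-memo.docx"])
print(merged)  # ['q4-report.pdf', 'board-memo.docx']
```

&lt;p&gt;Plain fields like &lt;code&gt;synthesis&lt;/code&gt; follow last-write-wins semantics, which is why the routing functions can read a single current value for confidence and approval status.&lt;/p&gt;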

&lt;h2&gt;
  
  
  What This Means for Your Stack
&lt;/h2&gt;

&lt;p&gt;The decision to adopt LangGraph should be based on workflow complexity, not hype. Start with this decision tree: Does your agent need to loop, retry, branch conditionally, or pause for external input? If yes, LangGraph is the right choice. If no—if you're building retrieval pipelines, Q&amp;amp;A bots, or single-turn interactions—stick with LangChain's simpler abstractions.&lt;/p&gt;

&lt;p&gt;For teams with existing LangChain applications, migration should be prioritized by value. The highest-priority candidates are applications that currently use hacky workarounds for agent loops, human-in-the-loop approval, or error recovery with retries. These workflows are fighting against LangChain's linear model and will benefit most from LangGraph's cyclic graph primitives. Lower priority are well-functioning RAG pipelines and simple chains—these work fine as-is.&lt;/p&gt;

&lt;p&gt;Observability investment is non-negotiable for production LangGraph deployments. LangSmith's trace visualization has matured significantly for graph-based executions, showing cycle counts, interrupt points, and branch decisions in a comprehensible format. Spend time understanding the trace UI before deploying to production—debugging a cyclic graph without proper observability is painful.&lt;/p&gt;

&lt;p&gt;Plan for team ramp-up time. Engineers familiar with LangChain will need 2-4 weeks to internalize graph-based design thinking. The concepts aren't harder, but they're different. State schemas, reducers, conditional edges, and interrupt handling require a mental model shift. Budget this time explicitly rather than assuming immediate productivity.&lt;/p&gt;

&lt;p&gt;Infrastructure costs increase with LangGraph. Checkpointing requires persistent storage—PostgreSQL, Redis, or cloud equivalents—with associated database costs. State persistence means more data stored longer. Estimate 10-20% infrastructure overhead compared to stateless chains, higher for workflows with frequent interrupts and long-running threads.&lt;/p&gt;

&lt;p&gt;For enterprise deployments, version 2.0's guardrail nodes simplify compliance requirements. Content filtering, rate limiting, and audit logging are now declarative configuration rather than custom code. If you're pursuing SOC 2 compliance for AI systems, review LangSmith's data handling policies—trace data may include sensitive information that needs appropriate handling.&lt;/p&gt;
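&lt;p&gt;If you are on a version without the declarative guardrail nodes, the same effect can be approximated with an ordinary node. A rough sketch, with illustrative regex patterns rather than production-grade PII detection:&lt;/p&gt;

```python
import re

# Rough sketch of a PII guardrail as an ordinary graph node. The declarative
# 2.0 configuration would replace hand-rolled code like this; the patterns
# below are illustrative only, not production-grade detection.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def pii_guardrail_node(state):
    """Redact PII from the synthesized answer before it leaves the graph."""
    text = state["synthesis"]
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return {"synthesis": text}

out = pii_guardrail_node({"synthesis": "Reach jane@example.com, SSN 123-45-6789"})
print(out["synthesis"])  # Reach [REDACTED email], SSN [REDACTED ssn]
```

&lt;p&gt;Because the node returns a partial update for &lt;code&gt;synthesis&lt;/code&gt; only, it slots into the graph between synthesis and any outbound action without touching the rest of the state.&lt;/p&gt;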

&lt;h2&gt;
  
  
  What to Build This Week
&lt;/h2&gt;

&lt;p&gt;Build a customer support escalation agent that demonstrates LangGraph's human-in-the-loop capabilities in a realistic context.&lt;/p&gt;

&lt;p&gt;The agent should handle incoming support tickets through this workflow: classify the ticket severity, attempt automated resolution for low-severity issues, escalate to human review for high-severity or low-confidence responses, and loop back for additional information gathering when the human reviewer requests clarification.&lt;/p&gt;

&lt;p&gt;Specific requirements:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use the new &lt;code&gt;interrupt()&lt;/code&gt; API for escalation pauses&lt;/li&gt;
&lt;li&gt;Implement confidence-based routing (auto-resolve if confidence &amp;gt; 0.85, escalate otherwise)&lt;/li&gt;
&lt;li&gt;Add a circuit breaker after 3 information-gathering loops&lt;/li&gt;
&lt;li&gt;Configure PostgreSQL checkpointing with connection pooling&lt;/li&gt;
&lt;li&gt;Add a guardrail node for PII detection before responses are sent&lt;/li&gt;
&lt;/ol&gt;
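&lt;p&gt;Requirements 2 and 3 reduce to a routing function you can unit-test before wiring up the graph. A sketch, where the field names (&lt;code&gt;severity&lt;/code&gt;, &lt;code&gt;confidence&lt;/code&gt;, &lt;code&gt;info_loops&lt;/code&gt;) and node names are assumptions for this exercise:&lt;/p&gt;

```python
# Sketch of confidence-based routing with a circuit breaker (requirements 2-3).
# Field names (severity, confidence, info_loops) and node names are assumptions
# for this exercise, not part of any framework API.
def route_ticket(state):
    """Return the next node for a support ticket."""
    if state["info_loops"] >= 3:
        return "escalate"  # circuit breaker: stop gathering, force human review
    if state["severity"] == "high":
        return "escalate"  # high-severity tickets always go to a human
    if state["confidence"] > 0.85:
        return "auto_resolve"
    return "gather_info"  # loop back for more context

print(route_ticket({"severity": "low", "confidence": 0.92, "info_loops": 0}))  # auto_resolve
print(route_ticket({"severity": "low", "confidence": 0.40, "info_loops": 3}))  # escalate
```

&lt;p&gt;Keeping routing pure (state in, node name out) is what makes these policies trivially testable in isolation.&lt;/p&gt;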

&lt;p&gt;This project will exercise state accumulation (ticket history), conditional routing (severity and confidence), human-in-the-loop (escalation), loops (information gathering), and guardrails (PII filtering)—the core patterns that make LangGraph valuable in production. Aim for a working prototype by end of week, then iterate on the human review UI and observability integration.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is part of the &lt;strong&gt;Agentic Engineering Weekly&lt;/strong&gt; series — a deep-dive every Monday into the frameworks, patterns, and techniques shaping the next generation of AI systems.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow the Agentic Engineering Weekly series on Dev.to to catch every edition.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building something agentic? Drop a comment — I'd love to feature reader projects.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Weekly: Musk Merges SpaceX with xAI, LeCun's AMI Labs Raises Record $1B Seed, and the Infrastructure Race Intensifies</title>
      <dc:creator>Richard Dillon</dc:creator>
      <pubDate>Sun, 29 Mar 2026 13:59:27 +0000</pubDate>
      <link>https://dev.to/richard_dillon_b9c238186e/ai-weekly-musk-merges-spacex-with-xai-lecuns-ami-labs-raises-record-1b-seed-and-the-4f8c</link>
      <guid>https://dev.to/richard_dillon_b9c238186e/ai-weekly-musk-merges-spacex-with-xai-lecuns-ami-labs-raises-record-1b-seed-and-the-4f8c</guid>
      <description>&lt;h1&gt;
  
  
  AI Weekly: Musk Merges SpaceX with xAI, LeCun's AMI Labs Raises Record $1B Seed, and the Infrastructure Race Intensifies
&lt;/h1&gt;

&lt;p&gt;This week marks an inflection point in AI's evolution from software phenomenon to physical-world force. Elon Musk's merger of SpaceX and xAI signals that the race to embed intelligence into hardware has officially begun, while Yann LeCun's billion-dollar bet on world models challenges the entire foundation of how we've been building AI systems. Meanwhile, the infrastructure war is heating up as NVIDIA and Amazon pour tens of billions into OpenAI, and cracks are showing in one of open-source AI's most important teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  Elon Musk Merges SpaceX and xAI to Build Autonomous Spacecraft and Robotic Mars Colonies
&lt;/h2&gt;

&lt;p&gt;Elon Musk announced the formal merger of SpaceX and xAI this week, creating what analysts are calling a potential "technological powerhouse" that combines world-leading aerospace engineering with advanced generative AI capabilities. The strategic rationale centers on embedding Grok models directly into SpaceX operations—from mission planning to real-time spacecraft decision-making.&lt;/p&gt;

&lt;p&gt;The technical ambitions are substantial. SpaceX engineers are reportedly working to integrate Grok-3's reasoning capabilities into autonomous trajectory optimization for deep-space missions, where communication latency with Earth makes human-in-the-loop control impractical. For Mars colonization efforts, the merged entity aims to deploy AI-driven robotics capable of habitat construction, resource extraction, and maintenance operations with minimal human oversight.&lt;/p&gt;

&lt;p&gt;Industry observers note that xAI's simulation and synthetic data generation capabilities could accelerate SpaceX's already aggressive development timeline. Training autonomous systems for Martian conditions is notoriously difficult given the lack of real-world data, but Grok's world-modeling capabilities could generate high-fidelity training environments.&lt;/p&gt;

&lt;p&gt;The merger also consolidates Musk's AI compute resources—xAI's Memphis data center reportedly houses over 100,000 H100 GPUs—with SpaceX's operational needs. Critics point out that this vertical integration raises questions about oversight of autonomous systems making life-critical decisions in space, but Musk has historically favored speed over regulatory caution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Yann LeCun's AMI Labs Raises $1.03 Billion Seed Round to Build World Models
&lt;/h2&gt;

&lt;p&gt;Advanced Machine Intelligence (AMI Labs), the startup founded by Meta's Chief AI Scientist Yann LeCun, closed a $1.03 billion seed round this week—likely the largest seed investment ever for a European company and one of the largest globally. The 890 million euro raise signals serious investor confidence in LeCun's long-standing critique of current AI architectures.&lt;/p&gt;

&lt;p&gt;LeCun has spent years arguing that autoregressive transformers—the foundation of GPT-4, Claude, and Gemini—are fundamentally limited in their ability to understand and reason about the physical world. AMI Labs is betting on "world models," systems that build internal representations of how reality works rather than simply predicting the next token in a sequence.&lt;/p&gt;

&lt;p&gt;The technical approach reportedly combines Joint Embedding Predictive Architectures (JEPA) with hierarchical planning systems capable of reasoning over multiple time scales. Unlike current LLMs that struggle with basic physics intuitions, world models aim to develop the kind of causal understanding that humans acquire through embodied experience.&lt;/p&gt;
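&lt;p&gt;AMI Labs has not published code, and JEPA implementations vary; but the core idea—predicting in representation space rather than raw-input space—can be sketched in a few lines. Everything here (dimensions, the tanh stand-ins for real encoder networks) is illustrative, not AMI Labs' actual architecture:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(1)
D = 32  # embedding dimension, illustrative

# Tiny linear stand-ins for the context encoder, target encoder, and predictor.
W_ctx = rng.standard_normal((D, D)) * 0.1
W_tgt = rng.standard_normal((D, D)) * 0.1
W_pred = rng.standard_normal((D, D)) * 0.1

def jepa_loss(context, target):
    """Predict the *embedding* of the target from the context embedding.

    Unlike next-token prediction, the loss lives in representation space:
    the model is never asked to reconstruct raw pixels or tokens, only to
    anticipate how the target's abstract representation will look.
    """
    z_ctx = np.tanh(context @ W_ctx)
    z_tgt = np.tanh(target @ W_tgt)   # in practice this branch is EMA/stop-gradient
    z_hat = z_ctx @ W_pred
    return float(np.mean((z_hat - z_tgt) ** 2))

loss = jepa_loss(rng.standard_normal(D), rng.standard_normal(D))
```

&lt;p&gt;The design choice that distinguishes this from autoregressive training is that irrelevant low-level detail (pixel noise, exact wording) can be discarded by the target encoder instead of being forced into the prediction objective.&lt;/p&gt;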

&lt;p&gt;LeCun remains at Meta in an advisory capacity, but AMI Labs represents his opportunity to test these theories at scale without the constraints of a larger organization's product roadmap. The fundraise includes participation from several European sovereign wealth funds eager to establish a competitive AI presence outside the US-China axis. Whether world models can deliver on their theoretical promise—or whether scale is the real bottleneck after all—should become clearer within the next 18 months.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Programming Updates
&lt;/h2&gt;

&lt;p&gt;NVIDIA released Nemotron 3 Super this week, featuring a hybrid mixture-of-experts (MoE) architecture specifically optimized for building high-throughput AI agents. The model activates only 17 billion of its 49 billion total parameters per forward pass, dramatically reducing inference costs while maintaining strong performance. Artificial Analysis ranked Nemotron 3 as the most open and efficient model in its class, with leading scores on coding benchmarks (HumanEval: 89.2%), reasoning tasks, and the emerging AgentBench evaluation suite.&lt;/p&gt;
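&lt;p&gt;NVIDIA hasn't released Nemotron 3 Super's routing code, but the "17B active out of 49B total" arithmetic follows from how any mixture-of-experts layer works: a router picks a small top-k subset of experts per token, so most parameters sit idle on each forward pass. A minimal sketch, with illustrative dimensions:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 64, 8, 2  # hidden size, expert count, experts active per token

# Each expert is a small feed-forward layer; only TOP_K of them run per token.
experts = [rng.standard_normal((D, D)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) * 0.02

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]          # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(D))

# Per-token compute scales with the *active* fraction, not the total parameter count.
active_fraction = TOP_K / N_EXPERTS
```

&lt;p&gt;Total capacity grows with every expert added, but per-token FLOPs stay pinned to the top-k budget—which is exactly why sparse MoE models are attractive for high-throughput agent workloads.&lt;/p&gt;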

&lt;p&gt;Ai2 countered with Olmo Hybrid, a 7B-parameter model that combines traditional transformer attention with linear recurrent layers (specifically, Mamba-2 blocks). The architecture achieves roughly 2× data efficiency compared to pure transformer models of equivalent size—a significant advantage as high-quality training data becomes increasingly scarce. Olmo Hybrid scores competitively with models twice its size on agentic tool-use benchmarks.&lt;/p&gt;
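&lt;p&gt;Why mix linear recurrence into a transformer at all? A linear recurrent layer mixes the sequence with a constant-size running state—O(T) time and O(1) memory per step—where attention pays O(T²) for its full pairwise matrix. The sketch below uses a fixed decay; real Mamba-2 blocks make the state-space parameters input-dependent ("selective"), so treat this as a simplification of the idea rather than Olmo Hybrid's actual layer:&lt;/p&gt;

```python
import numpy as np

def linear_recurrent_scan(x, decay=0.9):
    """O(T) sequence mixing: fold each new input into a running state.

    x: array of shape (T, D). No T-by-T attention matrix is ever built;
    the only memory carried across time is the state vector h.
    """
    h = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t, xt in enumerate(x):
        h = decay * h + (1 - decay) * xt   # constant-size state update
        out[t] = h
    return out

y = linear_recurrent_scan(np.ones((5, 3)), decay=0.5)
```

&lt;p&gt;A hybrid model interleaves layers like this with standard attention layers: the recurrent layers handle long-range mixing cheaply, while the attention layers preserve the precise token-to-token retrieval that pure recurrence struggles with.&lt;/p&gt;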

&lt;p&gt;On the framework side, Agno has emerged as a high-performance open-source Python framework for multi-agent systems. Its AgentOS runtime handles inter-agent communication, state management, and failure recovery, while the built-in FastAPI app provides immediate REST API access to deployed agents. Early adopters report 40% latency improvements over LangChain-based implementations for complex orchestration workflows.&lt;/p&gt;

&lt;p&gt;The broader enterprise trend is unmistakable: AI is shifting from individual productivity tools (Copilot-style code completion) toward team and workflow orchestration systems. Companies are increasingly deploying agent swarms that handle entire business processes—from customer inquiry to resolution—with human oversight at strategic checkpoints rather than individual steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alibaba's Qwen Team Exodus Threatens Open-Source AI Ecosystem
&lt;/h2&gt;

&lt;p&gt;Major departures from Alibaba's Qwen team this week have sent ripples of concern through the open-source AI community. Following the launch of Qwen3.5-small—which achieved impressive efficiency benchmarks—several key researchers began circulating a unified message: "Qwen is nothing without its people." The coordinated messaging echoes the dramatic exodus from OpenAI during its 2023 board crisis.&lt;/p&gt;

&lt;p&gt;Reports indicate that Alibaba Cloud leadership removed Qwen's technical lead amid internal disputes over the team's direction and resource allocation. The specifics remain murky, but sources suggest tension between Alibaba's commercial priorities and the team's commitment to open-source releases. Qwen models have been among the most capable freely available options, particularly for Chinese language tasks and efficient small-model deployments.&lt;/p&gt;

&lt;p&gt;The open-source community's concern is justified. Qwen has filled a critical niche: high-quality small models (1.8B to 72B parameters) with permissive licenses and strong multilingual capabilities. If the team implodes, that gap won't be easily filled. Meta's Llama team operates under different constraints, Mistral has shifted toward commercial offerings, and Google's Gemma remains limited in scope.&lt;/p&gt;

&lt;p&gt;Several departing researchers have reportedly been approached by both US and Chinese AI labs, but any reconstitution of the team's capabilities would take months at minimum. For developers who've built production systems on Qwen models, this week's news is a reminder that open-source sustainability remains fragile even when backed by corporate resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  NVIDIA and Amazon Pour Billions into OpenAI Amid Infrastructure Race
&lt;/h2&gt;

&lt;p&gt;NVIDIA committed $30 billion to OpenAI this week in what CEO Jensen Huang characterized as potentially his company's "last" major AI investment of this scale. The statement raised eyebrows—NVIDIA's cash position certainly allows for more—but Huang appears to be signaling that the infrastructure buildout phase is nearing completion.&lt;/p&gt;

&lt;p&gt;Amazon announced a strategic partnership with OpenAI simultaneously, joining SoftBank and NVIDIA in what's becoming an unprecedented concentration of capital around a single AI company. The move is notable given Amazon's existing 21% stake in Anthropic, meaning the e-commerce giant now holds significant positions in both leading frontier AI labs.&lt;/p&gt;

&lt;p&gt;Both OpenAI and Anthropic are reportedly planning IPOs in 2026, which would position Amazon as potentially the best-performing venture investor of the decade. The strategic logic extends beyond financial returns: Amazon Web Services gains preferred access to the most capable models for its enterprise customers, while the AI companies secure long-term compute commitments.&lt;/p&gt;

&lt;p&gt;The infrastructure race is intensifying as training runs for next-generation models approach $10 billion in compute costs alone. NVIDIA's investment secures its hardware position in OpenAI's data centers, while Amazon's participation suggests AWS will host significant portions of OpenAI's inference workloads. For competitors without access to this capital, the path to frontier capabilities is narrowing rapidly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Physical AI and Robotics Gaining Momentum as Scaling Hits Limits
&lt;/h2&gt;

&lt;p&gt;IBM Research published a widely circulated analysis this week predicting that 2026 marks the definitive shift toward robotics and physical AI as the industry's primary innovation vector. The thesis: pure scaling of language models is hitting diminishing returns, and the next wave of breakthroughs will come from AI systems that can sense, act, and learn in real environments.&lt;/p&gt;

&lt;p&gt;The evidence is mounting. GPT-5's capabilities, while impressive, represent incremental improvements over GPT-4 rather than the step-function gains seen in earlier generations. Training data is becoming scarce and expensive. Inference costs, while dropping, remain substantial for reasoning-heavy applications. Meanwhile, robotics costs have declined precipitously—capable manipulation hardware now costs under $20,000—while simulation environments for training embodied agents have matured significantly.&lt;/p&gt;

&lt;p&gt;The technical challenges of physical AI remain formidable. Real-world environments are messy, unpredictable, and unforgiving of errors. The sim-to-real gap—where policies learned in simulation fail to transfer to physical systems—hasn't been fully solved. Safety requirements for robots operating around humans add constraints that software agents don't face.&lt;/p&gt;

&lt;p&gt;But the investment thesis is compelling: AI that can perform physical labor addresses a larger economic opportunity than AI that can write code or answer questions. Figure, Boston Dynamics, and a wave of well-funded startups are betting that 2026's advances in world models and multimodal learning will finally crack the embodiment problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Arkansas Deploys AI-Powered Cameras to Detect Distracted Drivers
&lt;/h2&gt;

&lt;p&gt;Arkansas began deploying AI-powered cameras in highway work zones this week, targeting drivers using handheld cell phones. The system, manufactured by Acusensus, uses computer vision trained on millions of images to identify specific visual indicators of phone use while filtering out legal hands-free Bluetooth devices.&lt;/p&gt;

&lt;p&gt;Enforcement under Act 707 begins mid-January 2026, with an initial warning period allowing drivers to receive educational notices rather than citations. The technical implementation processes images locally on edge devices, with human review required before any citation is issued. Penalties start at $250 and escalate for repeat offenses.&lt;/p&gt;

&lt;p&gt;The privacy debates were immediate and predictable. Civil liberties advocates argue that the cameras capture far more than phone use—passenger faces, vehicle contents, and driving patterns that could be retained or misused. The legislation authorizing the cameras specified distracted driving detection, but the underlying technology is capable of much more.&lt;/p&gt;

&lt;p&gt;Arkansas transportation officials emphasize the safety rationale: work zone fatalities have increased 40% over the past five years, with distracted driving cited as a primary factor. The cameras represent a middle ground between unenforceable bans and more invasive alternatives like phone-disabling technology. Whether other states follow Arkansas's lead will depend largely on how effectively the privacy concerns are addressed in implementation rather than legislation.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Costs Approaching Zero Create New Entrepreneurial Opportunities
&lt;/h2&gt;

&lt;p&gt;The cost curve for AI inference continues its dramatic decline, with API pricing for GPT-4-class models now under $0.50 per million input tokens—roughly 100× cheaper than two years ago. For entrepreneurs and small businesses, capabilities that required enterprise budgets in 2024 are now accessible at near-trivial costs.&lt;/p&gt;
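&lt;p&gt;The "100× cheaper" claim is easy to make concrete with back-of-envelope arithmetic. The input price comes from the paragraph above; the output price and workload numbers below are illustrative assumptions, not quoted figures:&lt;/p&gt;

```python
def monthly_llm_cost(requests_per_day, in_tokens, out_tokens,
                     in_price_per_m=0.50, out_price_per_m=1.50):
    """Back-of-envelope API spend in USD. Prices are per million tokens."""
    daily = requests_per_day * (in_tokens * in_price_per_m +
                                out_tokens * out_price_per_m) / 1_000_000
    return daily * 30

# A hypothetical support bot: 10k requests/day, 2k input / 500 output tokens each.
cost_now = monthly_llm_cost(10_000, 2_000, 500)
# The same workload at 100x the per-token price, roughly the 2024 cost curve.
cost_2024 = monthly_llm_cost(10_000, 2_000, 500,
                             in_price_per_m=50.0, out_price_per_m=150.0)
```

&lt;p&gt;Under these assumptions the workload runs a few hundred dollars a month today versus tens of thousands two years ago—the difference between a side project and a funded line item.&lt;/p&gt;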

&lt;p&gt;The implications are reshaping the startup landscape. Several AI-native companies have reached eight-figure annual revenue with teams under 20 people, automating functions that would have required entire departments. Customer service, legal document review, code generation, and content production are increasingly handled by AI systems with minimal human oversight.&lt;/p&gt;

&lt;p&gt;The competitive dynamics have inverted. When everyone has access to the same powerful models at the same low prices, the differentiator isn't AI capability—it's understanding how to effectively deploy these systems. Domain expertise, workflow design, and integration with existing business processes matter more than raw model performance.&lt;/p&gt;

&lt;p&gt;For freelancers, the calculus is similar. An independent consultant with deep expertise and sophisticated AI tooling can now deliver output that rivals small agencies. The leverage that AI provides to individuals continues to increase, even as the tools themselves become commoditized. The winners in this environment won't be those with the best AI—they'll be those who understand their domains deeply enough to know where AI creates value and where human judgment remains essential.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch
&lt;/h2&gt;

&lt;p&gt;The next few weeks will be critical for several developing stories. Watch for concrete technical details from the SpaceX-xAI merger—specifically whether Grok integration begins with Starlink operations before moving to more ambitious space applications. The Qwen situation bears monitoring; if key researchers land at a single destination, we may see a spiritual successor emerge quickly. And with both OpenAI and Anthropic reportedly targeting 2026 IPOs, expect increasing financial disclosure that will finally reveal the true economics of frontier AI development.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this briefing? Follow this series for a fresh AI update every day, written for engineers who want to stay ahead.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow this publication on Dev.to to get notified of every new article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have a story tip or correction? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>technology</category>
    </item>
  </channel>
</rss>
