<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ajay Gupta</title>
    <description>The latest articles on DEV Community by Ajay Gupta (@ajay_gupta_60a0393643f3e9).</description>
    <link>https://dev.to/ajay_gupta_60a0393643f3e9</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3407391%2F28cae715-da0a-42ed-9227-59e932a6b7a8.png</url>
      <title>DEV Community: Ajay Gupta</title>
      <link>https://dev.to/ajay_gupta_60a0393643f3e9</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ajay_gupta_60a0393643f3e9"/>
    <language>en</language>
    <item>
      <title>AG-UI + LangGraph Streaming: Technical Implementation Guide</title>
      <dc:creator>Ajay Gupta</dc:creator>
      <pubDate>Thu, 11 Sep 2025 10:33:49 +0000</pubDate>
      <link>https://dev.to/ajay_gupta_60a0393643f3e9/ag-ui-langgraph-streaming-technical-implementation-guide-kbl</link>
      <guid>https://dev.to/ajay_gupta_60a0393643f3e9/ag-ui-langgraph-streaming-technical-implementation-guide-kbl</guid>
      <description>&lt;h2&gt;
  
  
  🎯 Purpose
&lt;/h2&gt;

&lt;p&gt;This guide shows how to achieve &lt;strong&gt;real-time event streaming from AI workflows to the UI&lt;/strong&gt; using the AG-UI protocol with LangGraph StateGraph execution. The approach targets &lt;strong&gt;sub-100ms latency&lt;/strong&gt; for live user feedback during complex AI operations.&lt;/p&gt;




&lt;h3&gt;
  
  
  🚀 &lt;strong&gt;Demo Application&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A complete working implementation of this architecture is available at:&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/cimulink/ai-workflow-engine" rel="noopener noreferrer"&gt;https://github.com/cimulink/ai-workflow-engine&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The repository includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pure LangGraph + AG-UI server implementation&lt;/li&gt;
&lt;li&gt;React frontend with real-time streaming&lt;/li&gt;
&lt;li&gt;Document processing workflow example&lt;/li&gt;
&lt;li&gt;Complete setup and deployment instructions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📋 AG-UI Protocol Context
&lt;/h2&gt;

&lt;p&gt;AG-UI defines a standardized set of event types for real-time agent-user interaction. Our goal is to &lt;strong&gt;adapt our entire LangGraph workflow&lt;/strong&gt; to generate these predefined events and &lt;strong&gt;stream them from backend to frontend&lt;/strong&gt; for real-time user feedback.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;AG-UI Message Types&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AG-UI defines several event categories for different aspects of agent communication:&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Lifecycle Events&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;RUN_STARTED&lt;/code&gt;, &lt;code&gt;RUN_FINISHED&lt;/code&gt;, &lt;code&gt;RUN_ERROR&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;STEP_STARTED&lt;/code&gt;, &lt;code&gt;STEP_FINISHED&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Text Message Events&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;TEXT_MESSAGE_START&lt;/code&gt;, &lt;code&gt;TEXT_MESSAGE_CONTENT&lt;/code&gt;, &lt;code&gt;TEXT_MESSAGE_END&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Tool Call Events&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;TOOL_CALL_START&lt;/code&gt;, &lt;code&gt;TOOL_CALL_ARGS&lt;/code&gt;, &lt;code&gt;TOOL_CALL_END&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;State Management Events&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;STATE_SNAPSHOT&lt;/code&gt;, &lt;code&gt;STATE_DELTA&lt;/code&gt;, &lt;code&gt;MESSAGES_SNAPSHOT&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Special Events&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;RAW&lt;/code&gt;, &lt;code&gt;CUSTOM&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
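
&lt;p&gt;The event names above can be collected in one place so nodes and the streaming layer agree on spelling. The following is a minimal sketch, not the official AG-UI SDK enum: &lt;code&gt;EventType&lt;/code&gt;, &lt;code&gt;make_event&lt;/code&gt;, and the &lt;code&gt;stepName&lt;/code&gt; payload field are illustrative assumptions.&lt;/p&gt;

```python
from enum import Enum


class EventType(str, Enum):
    """Event names from the lists above (local sketch, not the official SDK)."""

    # Lifecycle events
    RUN_STARTED = "RUN_STARTED"
    RUN_FINISHED = "RUN_FINISHED"
    RUN_ERROR = "RUN_ERROR"
    STEP_STARTED = "STEP_STARTED"
    STEP_FINISHED = "STEP_FINISHED"
    # Text message events
    TEXT_MESSAGE_START = "TEXT_MESSAGE_START"
    TEXT_MESSAGE_CONTENT = "TEXT_MESSAGE_CONTENT"
    TEXT_MESSAGE_END = "TEXT_MESSAGE_END"
    # Tool call events
    TOOL_CALL_START = "TOOL_CALL_START"
    TOOL_CALL_ARGS = "TOOL_CALL_ARGS"
    TOOL_CALL_END = "TOOL_CALL_END"
    # State management events
    STATE_SNAPSHOT = "STATE_SNAPSHOT"
    STATE_DELTA = "STATE_DELTA"
    MESSAGES_SNAPSHOT = "MESSAGES_SNAPSHOT"
    # Special events
    RAW = "RAW"
    CUSTOM = "CUSTOM"


def make_event(event_type, **payload):
    """Build a plain dict event; payload field names are assumptions."""
    return {"type": event_type.value, **payload}


step_event = make_event(EventType.STEP_STARTED, stepName="extract_text")
```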




&lt;h2&gt;
  
  
  🏗️ Architecture Overview
&lt;/h2&gt;

&lt;p&gt;The system combines three key components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph StateGraph&lt;/strong&gt; - Handles workflow orchestration with nodes and conditional edges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AG-UI Protocol&lt;/strong&gt; - Defines event types for real-time UI communication
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Streaming&lt;/strong&gt; - Uses Server-Sent Events (SSE) for browser-compatible streaming&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;High-Level Architecture&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92nd3e8o7abzm8oxc1pu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92nd3e8o7abzm8oxc1pu.png" alt="architectural overview" width="800" height="1136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Data Flow Overview&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5d3t6g2ot4ftke3srkc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5d3t6g2ot4ftke3srkc.png" alt=" " width="800" height="25"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔧 Core Technical Components
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;asyncio.Queue&lt;/strong&gt;: Event Management Hub
&lt;/h3&gt;

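&lt;p&gt;The &lt;code&gt;asyncio.Queue&lt;/code&gt; acts as a coroutine-safe buffer between LangGraph node execution and HTTP streaming. It is designed for tasks that share one event loop and is not thread-safe, so code running in another thread has to hand events over through the loop (for example with &lt;code&gt;loop.call_soon_threadsafe&lt;/code&gt;). When LangGraph nodes complete, they place events in the queue. The HTTP streaming endpoint continuously reads from the queue and sends events to the frontend.&lt;/p&gt;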

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feapy3smhobyoi6z8yh5g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feapy3smhobyoi6z8yh5g.png" alt=" " width="800" height="191"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
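&lt;strong&gt;Concurrency Safety&lt;/strong&gt;: Multiple coroutines on the same event loop can emit events concurrently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure&lt;/strong&gt;: With a &lt;code&gt;maxsize&lt;/code&gt; set, producers wait when the queue is full, which bounds memory use during heavy processing&lt;/li&gt;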
&lt;li&gt;
&lt;strong&gt;Order Preservation&lt;/strong&gt;: Events maintain chronological sequence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-blocking&lt;/strong&gt;: Event production and consumption happen independently&lt;/li&gt;
&lt;/ul&gt;
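
&lt;p&gt;A small producer/consumer sketch may make the queue's role concrete. It uses only the standard library; &lt;code&gt;producer&lt;/code&gt; and &lt;code&gt;consumer&lt;/code&gt; are hypothetical stand-ins for the workflow nodes and the streaming endpoint, and the &lt;code&gt;None&lt;/code&gt; sentinel is an assumed convention for marking the end of a run.&lt;/p&gt;

```python
import asyncio


async def producer(queue, node_names):
    """Stand-in for workflow nodes: put one event per finished node."""
    for name in node_names:
        await queue.put({"type": "STEP_FINISHED", "stepName": name})
    await queue.put(None)  # sentinel: no more events


async def consumer(queue):
    """Stand-in for the streaming endpoint: drain until the sentinel."""
    received = []
    while True:
        event = await queue.get()
        if event is None:
            break
        received.append(event)
    return received


async def main():
    # maxsize bounds the buffer; put() waits while the queue is full.
    queue = asyncio.Queue(maxsize=2)
    task = asyncio.create_task(producer(queue, ["extract", "analyze", "review"]))
    events = await consumer(queue)
    await task
    return events


events = asyncio.run(main())
```

&lt;p&gt;Because the queue holds at most two items here, the producer pauses on its third &lt;code&gt;put&lt;/code&gt; until the consumer catches up, and the events still arrive in the order they were produced.&lt;/p&gt;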

&lt;h3&gt;
  
  
  2. &lt;strong&gt;yield Keyword&lt;/strong&gt;: Python Streaming Pattern
&lt;/h3&gt;

&lt;p&gt;Using Python's &lt;code&gt;yield&lt;/code&gt; keyword inside an &lt;code&gt;async def&lt;/code&gt; function turns that function into an async generator. Instead of returning all results at once, the function yields events one by one as they're produced. This creates a memory-efficient streaming pipeline where events are processed immediately rather than accumulated into one large response.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9h03gm0leteajv471od5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9h03gm0leteajv471od5.png" alt=" " width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Streaming Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory Efficient&lt;/strong&gt;: Events are yielded as produced, not stored&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time&lt;/strong&gt;: No extra buffering step between event production and consumption
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lazy Evaluation&lt;/strong&gt;: Execution pauses until next event is requested&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Natural Flow Control&lt;/strong&gt;: Consumer controls processing pace&lt;/li&gt;
&lt;/ul&gt;
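
&lt;p&gt;As a rough illustration of the pattern, the async generator below yields events from a queue one at a time. The name &lt;code&gt;event_stream&lt;/code&gt; and the &lt;code&gt;None&lt;/code&gt; end-of-run sentinel are assumptions made for this sketch.&lt;/p&gt;

```python
import asyncio


async def event_stream(queue):
    """Async generator: yield each queued event as soon as it is available."""
    while True:
        event = await queue.get()
        if event is None:  # sentinel marks the end of the run
            return
        yield event


async def collect():
    queue = asyncio.Queue()
    for name in ["RUN_STARTED", "STEP_STARTED", "STEP_FINISHED", "RUN_FINISHED"]:
        queue.put_nowait({"type": name})
    queue.put_nowait(None)
    # The consumer pulls one event at a time; the generator pauses in between.
    return [event["type"] async for event in event_stream(queue)]


stream_types = asyncio.run(collect())
```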

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Custom AGUIStreamingCheckpointer&lt;/strong&gt;: State + Events
&lt;/h3&gt;

&lt;p&gt;This extends LangGraph's built-in &lt;code&gt;SqliteSaver&lt;/code&gt; checkpointer to automatically emit AG-UI events whenever workflow state changes. Every time a node completes and LangGraph saves a checkpoint, the custom checkpointer also emits a corresponding AG-UI event.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs4rs5mvqp309cyhj11mt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs4rs5mvqp309cyhj11mt.png" alt="state and events" width="800" height="1258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checkpointer Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic Events&lt;/strong&gt;: No manual event emission required in nodes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Consistency&lt;/strong&gt;: Events always reflect actual LangGraph state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified Persistence&lt;/strong&gt;: Database and streaming work together&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recovery Support&lt;/strong&gt;: Failed workflows can resume with complete event history&lt;/li&gt;
&lt;/ul&gt;
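
&lt;p&gt;The idea can be sketched without LangGraph installed. In the example below, &lt;code&gt;FakeSaver&lt;/code&gt; is a hypothetical stand-in for a real checkpointer such as &lt;code&gt;SqliteSaver&lt;/code&gt;, and &lt;code&gt;StreamingSaver&lt;/code&gt; is an illustrative name. The real &lt;code&gt;put&lt;/code&gt; method takes more arguments and its signature depends on the LangGraph version, so treat this as the shape of the technique rather than working integration code.&lt;/p&gt;

```python
import asyncio


class FakeSaver:
    """Hypothetical stand-in for a LangGraph checkpointer."""

    def __init__(self):
        self.saved = []

    def put(self, config, checkpoint, metadata):
        # A real checkpointer would write to its database here.
        self.saved.append(checkpoint)
        return config


class StreamingSaver(FakeSaver):
    """Persist the checkpoint first, then queue an event describing it."""

    def __init__(self, event_queue):
        super().__init__()
        self.event_queue = event_queue

    def put(self, config, checkpoint, metadata):
        result = super().put(config, checkpoint, metadata)
        # Assumed mapping: every saved checkpoint becomes a STATE_SNAPSHOT.
        self.event_queue.put_nowait(
            {"type": "STATE_SNAPSHOT", "snapshot": checkpoint["channel_values"]}
        )
        return result


async def demo():
    queue = asyncio.Queue()
    saver = StreamingSaver(queue)
    saver.put({"thread_id": "t1"}, {"channel_values": {"status": "analyzed"}}, {})
    return saver.saved, queue.get_nowait()


saved, emitted = asyncio.run(demo())
```

&lt;p&gt;Saving before emitting means an event is only sent for state that was actually persisted.&lt;/p&gt;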

&lt;h3&gt;
  
  
  4. &lt;strong&gt;HTTP Streaming&lt;/strong&gt;: Server-Sent Events (SSE)
&lt;/h3&gt;

&lt;p&gt;FastAPI serves AG-UI events using Server-Sent Events format. SSE is a web standard that allows servers to push data to browsers over a single HTTP connection. Each event is formatted as &lt;code&gt;data: {json}\n\n&lt;/code&gt; and sent immediately to the client.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSE Benefits Over WebSocket:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simpler Protocol&lt;/strong&gt;: Standard HTTP, no connection upgrades needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Reconnection&lt;/strong&gt;: Browsers handle reconnection automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firewall Friendly&lt;/strong&gt;: Uses standard HTTP ports, works through proxies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-Way Optimal&lt;/strong&gt;: Perfect for event streaming (no bidirectional needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better Debugging&lt;/strong&gt;: Standard HTTP tools work (curl, Postman)&lt;/li&gt;
&lt;/ul&gt;
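
&lt;p&gt;The wire format described above is small enough to show directly. This sketch assumes each event is a JSON-serializable dict; &lt;code&gt;format_sse&lt;/code&gt; and &lt;code&gt;parse_sse&lt;/code&gt; are illustrative helper names, not part of FastAPI or AG-UI.&lt;/p&gt;

```python
import json


def format_sse(event):
    """Serialize one event dict as a Server-Sent Events frame."""
    # json.dumps never emits raw newlines, so one data line is enough.
    return "data: " + json.dumps(event) + "\n\n"


def parse_sse(frame):
    """Inverse of format_sse for a single-line data frame (client view)."""
    prefix = "data: "
    if not (frame.startswith(prefix) and frame.endswith("\n\n")):
        raise ValueError("not a single-line SSE data frame")
    return json.loads(frame[len(prefix):].rstrip("\n"))


frame = format_sse({"type": "STEP_STARTED", "stepName": "analyze"})
```

&lt;p&gt;In a FastAPI app, frames like this would typically be yielded from an async generator passed to &lt;code&gt;StreamingResponse&lt;/code&gt; with &lt;code&gt;media_type="text/event-stream"&lt;/code&gt;.&lt;/p&gt;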




&lt;h2&gt;
  
  
  🔄 End-to-End Flow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Complete Sequence Diagram&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4087aa7nwbh1d5wyfmbg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4087aa7nwbh1d5wyfmbg.png" alt="sequence diagram" width="800" height="812"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Step-by-Step Flow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Request&lt;/strong&gt;: React component sends document content via &lt;code&gt;useAgent&lt;/code&gt; hook&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Streaming Setup&lt;/strong&gt;: FastAPI creates async generator for event streaming&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph Execution&lt;/strong&gt;: Workflow nodes execute with conditional routing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Generation&lt;/strong&gt;: Custom checkpointer emits AG-UI events on state changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Management&lt;/strong&gt;: Events flow through asyncio.Queue to HTTP response&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Processing&lt;/strong&gt;: Browser receives SSE events and updates UI in real-time&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Event Flow Visualization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8ymlq52jwhjy4cbqlvv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8ymlq52jwhjy4cbqlvv.png" alt="event flow" width="800" height="741"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ Pros and Cons
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Advantages&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real-Time UX&lt;/strong&gt;: Immediate progress feedback, responsive human-in-the-loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;True LangGraph&lt;/strong&gt;: Preserves StateGraph orchestration, conditional edges, checkpointing
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable&lt;/strong&gt;: &lt;code&gt;asyncio&lt;/code&gt; handles concurrent streams, and the event history doubles as an event-sourcing audit trail&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer Friendly&lt;/strong&gt;: Standard HTTP debugging, type-safe events, familiar React patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Disadvantages&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: Requires understanding async generators, custom checkpointer maintenance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource Usage&lt;/strong&gt;: Persistent connections, database growth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Stream interruptions, partial failures, connection recovery logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing&lt;/strong&gt;: Async streaming harder to unit test, timing considerations, race conditions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎯 Implementation Strategy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Progressive Implementation Roadmap&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F392li82626ch6ivmlnpa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F392li82626ch6ivmlnpa.png" alt="roadmap for implementation" width="800" height="3475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Event Design Patterns&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fieaghiepogk95vewpxfq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fieaghiepogk95vewpxfq.png" alt="event design patterns" width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Error Handling Strategy&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Graceful Degradation&lt;/strong&gt;: Emit error events, return partial results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Recovery&lt;/strong&gt;: Frontend auto-reconnection with exponential backoff&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Batching&lt;/strong&gt;: Reduce network overhead during high-volume periods&lt;/li&gt;
&lt;/ul&gt;
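
&lt;p&gt;For the reconnection item, a capped exponential schedule is one common choice. The sketch below only computes the delays; the default values are arbitrary assumptions, and a real frontend would implement the same arithmetic in its own language, usually with some random jitter added.&lt;/p&gt;

```python
def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6):
    """Return capped exponential delays in seconds, one per reconnect attempt."""
    return [min(cap, base * factor ** attempt) for attempt in range(attempts)]


delays = backoff_delays()
capped = backoff_delays(cap=10.0)
```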




&lt;h2&gt;
  
  
  🚀 When to Use This Architecture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Perfect For:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document Processing&lt;/strong&gt;: Multi-step analysis with human review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Pipelines&lt;/strong&gt;: Real-time ETL progress tracking
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Agents&lt;/strong&gt;: Conversational workflows with tool usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long Tasks&lt;/strong&gt;: Processes that run longer than 30 seconds and need progress updates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Consider Alternatives:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple Request/Response&lt;/strong&gt;: Single API calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Processing&lt;/strong&gt;: No real-time requirements
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource Constrained&lt;/strong&gt;: Limited memory/bandwidth&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💡 Key Takeaways
&lt;/h2&gt;

&lt;p&gt;This architecture delivers &lt;strong&gt;production-ready real-time AI workflow interfaces&lt;/strong&gt; by combining:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph's orchestration&lt;/strong&gt; (StateGraph, nodes, edges)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AG-UI's streaming protocol&lt;/strong&gt; (real-time events)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP SSE streaming&lt;/strong&gt; (browser-compatible, simple debugging)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;asyncio.Queue + yield&lt;/strong&gt; (memory-efficient event pipeline)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Success Factors&lt;/strong&gt;: Design events for user value, implement robust error handling, monitor performance, test streaming behavior, plan for horizontal scaling.&lt;/p&gt;

&lt;p&gt;The complexity is justified when user experience and workflow transparency are critical. Start simple and add streaming capabilities as real-time interaction becomes essential.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>python</category>
      <category>automation</category>
    </item>
    <item>
      <title>Don't Run it Twice: Mastering Idempotency in Production LangGraph Agents</title>
      <dc:creator>Ajay Gupta</dc:creator>
      <pubDate>Wed, 10 Sep 2025 07:15:02 +0000</pubDate>
      <link>https://dev.to/ajay_gupta_60a0393643f3e9/dont-run-it-twice-mastering-idempotency-in-production-langgraph-agents-2gmp</link>
      <guid>https://dev.to/ajay_gupta_60a0393643f3e9/dont-run-it-twice-mastering-idempotency-in-production-langgraph-agents-2gmp</guid>
      <description>&lt;p&gt;You've built an amazing AI agent with LangGraph, but what happens when things fail? What if an API times out, or a process restarts? Will you charge a customer twice or create duplicate users? If these questions make you nervous, you need to think about &lt;strong&gt;idempotency&lt;/strong&gt;. It's the unsung hero of reliable systems and a must for production-grade AI agents.&lt;/p&gt;

&lt;p&gt;This post covers what idempotency is, why it's critical for LangGraph, and how to implement it with practical code for both simple and concurrent scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What is Idempotency, and Why Should I Care?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;An operation is idempotent if calling it multiple times has the same effect as calling it once. Think of &lt;em&gt;setting&lt;/em&gt; a light to ON. Whether you send the command once or ten times, the result is the same: the light is on.&lt;/p&gt;

&lt;p&gt;Many actions in agentic workflows are &lt;em&gt;not&lt;/em&gt; naturally idempotent, like creating a booking (POST /api/bookings), charging a customer, or sending a notification. When your multi-step graph executes, any step can fail. Naively retrying a non-idempotent operation leads to duplicate data and unhappy users.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Core Pattern: The Idempotency Key&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The standard way to enforce idempotency is through a contract between your LangGraph node (the client) and the API you're calling (the server).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Client Generates Key:&lt;/strong&gt; Before the &lt;em&gt;first&lt;/em&gt; attempt, the client generates a unique &lt;strong&gt;idempotency key&lt;/strong&gt; for that specific operation.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client Sends Key:&lt;/strong&gt; The client sends this key with every request, usually in an HTTP header such as &lt;code&gt;Idempotency-Key&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server Checks Key:&lt;/strong&gt; The server tracks processed keys. If a request has a &lt;em&gt;new&lt;/em&gt; key, it processes it and stores the result. If the key has been &lt;em&gt;seen before&lt;/em&gt;, it skips processing and returns the stored result.&lt;/li&gt;
&lt;/ol&gt;
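
&lt;p&gt;The server side of this contract can be sketched with an in-memory dictionary. &lt;code&gt;BookingServer&lt;/code&gt; and its fields are hypothetical names; a real service would keep the key-to-result mapping in a durable store.&lt;/p&gt;

```python
class BookingServer:
    """Toy in-memory server that remembers results by idempotency key."""

    def __init__(self):
        self.results = {}  # idempotency key to stored response
        self.bookings_made = 0

    def create_booking(self, idempotency_key, flight):
        if idempotency_key in self.results:
            # Seen before: skip processing and replay the stored result.
            return self.results[idempotency_key]
        self.bookings_made += 1
        response = {"booking_id": self.bookings_made, "flight": flight}
        self.results[idempotency_key] = response
        return response


server = BookingServer()
first = server.create_booking("key-123", "DEL-BLR")
retry = server.create_booking("key-123", "DEL-BLR")
```

&lt;p&gt;The retry returns the stored response, and only one booking is ever created for the key.&lt;/p&gt;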

&lt;p&gt;This guarantees that even with multiple retries, the server-side logic runs only once.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example: The Idempotent Flight Booker ✈️&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let's implement this in LangGraph for a flaky flight booking agent using the tenacity library (pip install tenacity).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: The Graph State&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our State needs a field to hold the idempotency key, keeping it stable across retries of a node.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27jc6pu0fhg1etqki0ej.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27jc6pu0fhg1etqki0ej.png" alt="flight-state" width="800" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: The Graph Nodes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We'll use one node to generate the key and another to perform the retriable action.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwii7te2hheager2roxz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frwii7te2hheager2roxz.png" alt="graph-nodes" width="800" height="812"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Assemble and Run&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The flow is simple: generate the key, then attempt the booking.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsizlgv8kf78fdbo3uye0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsizlgv8kf78fdbo3uye0.png" alt="compile-and-run-nodes" width="800" height="546"&gt;&lt;/a&gt;&lt;/p&gt;
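
&lt;p&gt;Since the steps above are shown as screenshots, here is a rough text sketch of the same idea. It is not the code from the images: it replaces tenacity with a plain retry loop to stay dependency-free, skips the graph wiring, and &lt;code&gt;FlakyBookingAPI&lt;/code&gt;, &lt;code&gt;generate_key&lt;/code&gt;, and the state field names are assumptions.&lt;/p&gt;

```python
import uuid


class FlakyBookingAPI:
    """Hypothetical API whose first response is lost after the work is done."""

    def __init__(self):
        self.calls = 0
        self.bookings = {}  # idempotency key to confirmation code

    def book(self, idempotency_key, flight):
        self.calls += 1
        # setdefault makes repeat calls with the same key reuse the first booking.
        confirmation = self.bookings.setdefault(idempotency_key, "CONF-" + flight)
        if self.calls == 1:
            raise TimeoutError("response lost after the booking was created")
        return confirmation


def generate_key(state):
    """Node 1: create the key once and store it in the state."""
    return {**state, "idempotency_key": str(uuid.uuid4())}


def book_flight(state, api, max_attempts=3):
    """Node 2: retry with the same key taken from the state."""
    for _ in range(max_attempts):
        try:
            confirmation = api.book(state["idempotency_key"], state["flight"])
            return {**state, "confirmation": confirmation}
        except TimeoutError:
            continue
    raise RuntimeError("booking failed after retries")


api = FlakyBookingAPI()
final_state = book_flight(generate_key({"flight": "DEL-BLR"}), api)
```

&lt;p&gt;The API is called twice, but because both calls carry the same key, only one booking exists afterwards.&lt;/p&gt;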

&lt;p&gt;This pattern is perfect for a single process. But how do you handle concurrency?&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Hard Part: Idempotency with Concurrent Workers&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When your application is deployed with multiple replicas (e.g., on Kubernetes), two workers could retry the &lt;em&gt;exact same task&lt;/em&gt; at the &lt;em&gt;exact same time&lt;/em&gt;. This creates a race condition, undermining our idempotency guarantee.&lt;/p&gt;

&lt;p&gt;The solution is to use a &lt;strong&gt;shared, persistent state manager&lt;/strong&gt; that supports &lt;strong&gt;atomic operations&lt;/strong&gt;, like Redis.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The "Claim Check" Pattern with Redis 🎟️&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This pattern ensures only one worker can "claim" the right to execute an operation for a given key.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stable Key:&lt;/strong&gt; The idempotency key must be deterministic (e.g., a hash of the flight details) so any worker can regenerate it.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic SET:&lt;/strong&gt; Before acting, a worker tries to claim the key in Redis using the atomic SET ... NX command. NX means "only set this key if it does &lt;strong&gt;not&lt;/strong&gt; already exist."
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Race Solved:&lt;/strong&gt; The first worker's SET NX command succeeds, granting it a "lock" to proceed. Any other worker's attempt will fail, telling it to back off.&lt;/li&gt;
&lt;/ol&gt;
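
&lt;p&gt;The three steps above can be sketched as follows. To keep the example runnable without a Redis server, &lt;code&gt;FakeRedis&lt;/code&gt; is an in-memory stand-in that mimics the &lt;code&gt;set(name, value, nx=True, ex=...)&lt;/code&gt; call of the redis-py client; the key prefix, the flight fields, and the expiry are assumptions added for illustration.&lt;/p&gt;

```python
import hashlib
import json


class FakeRedis:
    """In-memory stand-in mimicking redis-py's set(name, value, nx=True)."""

    def __init__(self):
        self.store = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self.store:
            return None  # redis-py returns None when NX blocks the write
        self.store[name] = value  # expiry (ex) is ignored in this stand-in
        return True


def idempotency_key(flight_details):
    """Deterministic key: the same details always produce the same hash."""
    canonical = json.dumps(flight_details, sort_keys=True)
    return "booking:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def try_claim(client, key, worker_id, ttl_seconds=60):
    """Return True only for the worker whose SET NX succeeds."""
    return client.set(key, worker_id, nx=True, ex=ttl_seconds) is True


redis_client = FakeRedis()
details = {"from": "DEL", "to": "BLR", "date": "2025-09-10", "passenger": "p-42"}
key = idempotency_key(details)
claims = [try_claim(redis_client, key, worker) for worker in ("worker-a", "worker-b")]
```

&lt;p&gt;Only the first worker gets the claim. The expiry is a design choice worth considering with a real Redis, so that a worker that crashes mid-operation does not hold the claim forever.&lt;/p&gt;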

&lt;p&gt;LangGraph's persistent &lt;strong&gt;Checkpointers&lt;/strong&gt; (like RedisSaver) are a good fit for this: your graph's state already lives in a shared store, and you can use that same store for locking.&lt;/p&gt;

&lt;p&gt;Here's a conceptual snippet for a concurrent book_flight node:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flblr81ojq0x7sy8785tp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flblr81ojq0x7sy8785tp.png" alt="concurrency-handling" width="800" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Key Takeaways &amp;amp; Best Practices&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Identify Critical Actions:&lt;/strong&gt; Focus on nodes with external side effects (database writes, payments, etc.).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate Keys Before the Action:&lt;/strong&gt; The key must be created and saved to the state &lt;em&gt;before&lt;/em&gt; the fallible operation begins.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Persistent Checkpointers:&lt;/strong&gt; For any serious workload, use a persistent checkpointer (RedisSaver, SqliteSaver). This is the foundation for resilience.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embrace the Claim Check:&lt;/strong&gt; For concurrent workers, use a distributed locking mechanism like Redis SET NX to prevent race conditions.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log Everything:&lt;/strong&gt; Log key generation, retries, and lock statuses. These logs will be invaluable for debugging.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By mastering idempotency, you can turn a cool LangGraph prototype into a robust, reliable, and production-ready application.&lt;/p&gt;

&lt;p&gt;Happy building!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>langgraph</category>
      <category>programming</category>
      <category>distributedsystems</category>
    </item>
  </channel>
</rss>
