Understanding LangGraph Streaming Modes
Building AI applications that feel responsive and interactive is crucial for user experience. Nobody wants to stare at a loading spinner for 30 seconds while an AI agent completes its work. This is where LangGraph's streaming capabilities shine, allowing you to show progress, intermediate results, and real-time outputs as they happen.
Why Streaming Matters
Traditional AI workflows make you wait for the entire process to complete before showing any results. LangGraph flips this model by letting you stream outputs in real time. This means:
- Better UX: Users see progress as it happens, not just the final result
- Transparency: Watch your agent's reasoning unfold step-by-step
- Interactivity: Build applications that feel alive and responsive
- Debugging: Understand what's happening inside your graph during development
The Five Streaming Modes
LangGraph offers five distinct streaming modes, each designed for different use cases. You can even combine them to get the best of multiple worlds.
1. values: Complete State Snapshots
Streams the full state of your graph after every node execution. Think of it as taking a complete photograph of your entire state at each step.
When to use: You need to see the complete context and all state variables at each stage. Great for applications where the full state history matters.
Trade-off: More data-heavy than alternatives since you're getting everything, not just what changed.
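To make the snapshot idea concrete, here is a minimal plain-Python sketch (no LangGraph required; the two pipeline steps are hypothetical stand-ins for graph nodes) that yields a full copy of the state after every step, the way values mode does:

```python
# Plain-Python sketch of "values"-style streaming: yield the FULL state
# after every step. The steps are hypothetical stand-ins for graph nodes.
def stream_values(state):
    steps = [
        lambda s: {**s, "topic": "quantum computing"},      # "research" step
        lambda s: {**s, "summary": "qubits can entangle"},  # "summarize" step
    ]
    for step in steps:
        state = step(state)
        yield dict(state)  # complete snapshot, not just the delta

for snapshot in stream_values({"query": "explain qubits"}):
    print(snapshot)
```

Every snapshot carries the whole state, so the final one is self-contained, at the cost of repeating everything that hasn't changed.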
2. updates: Just the Changes
Streams only the state changes (deltas) after each node runs. Instead of the full picture, you get just what changed.
When to use: Building dashboards or UIs that track progress incrementally. Perfect when you only care about what's new or different.
Trade-off: More efficient than values, but you need to track history yourself if you need the complete context later.
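The same pipeline sketched delta-style: each chunk carries only what the step produced, keyed by the (hypothetical) node name, which mirrors the {node_name: changes} dicts that LangGraph's updates mode yields:

```python
# Plain-Python sketch of "updates"-style streaming: yield only the delta
# each step produced, keyed by a hypothetical node name.
def stream_updates(state):
    steps = {
        "research": lambda s: {"topic": "quantum computing"},
        "summarize": lambda s: {"summary": "qubits can entangle"},
    }
    for name, step in steps.items():
        delta = step(state)
        state.update(delta)      # caller must accumulate history itself
        yield {name: delta}      # just what changed, like stream_mode="updates"

for chunk in stream_updates({"query": "explain qubits"}):
    print(chunk)
```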
3. messages: Token-by-Token LLM Output
Streams LLM output tokens as they're generated, creating that familiar "typing" effect you see in modern chat interfaces.
When to use: Chat-style applications where you want users to see the AI "thinking" and responding in real time.
Trade-off: Requires your LLM provider to support streaming. Not all providers or models offer this.
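The "typing" effect itself is just incremental printing of tokens as they arrive. A minimal simulation, with a hypothetical fake_llm_stream generator standing in for a streaming LLM call:

```python
# Simulated token stream: this generator stands in for a streaming LLM call.
def fake_llm_stream(prompt):
    for token in ["Quantum ", "computers ", "use ", "qubits."]:
        yield token

# Print each token the moment it arrives, like a chat UI's typing effect.
for token in fake_llm_stream("Explain quantum computing"):
    print(token, end="", flush=True)
print()
```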
4. custom: Your Own Events
Allows your code and tools to emit custom events, logs, or data during execution. Think progress percentages, custom metrics, or tool-specific updates.
When to use: Long-running workflows where you want to signal progress or emit custom data that's specific to your application's needs.
Trade-off: Requires you to implement custom event emission in your tools and nodes.
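In LangGraph the stream writer handed to a node is a callable that you invoke with an arbitrary payload. The emission pattern can be sketched standalone, with a plain list's append playing the role of the writer:

```python
# Sketch of custom-event emission: the node receives a writer callable and
# calls it with arbitrary payloads. Here a list's append plays the writer.
def long_running_node(state, writer):
    total = 4
    for i in range(total):
        # ... do a chunk of work ...
        writer({"type": "progress", "percentage": (i + 1) * 100 // total})
    return state

events = []
long_running_node({}, events.append)
for event in events:
    print(f"Progress: {event['percentage']}%")
```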
5. debug: Complete Execution Trace
Streams a full execution trace including node entry/exit, state before/after, tool inputs/outputs, and errors. It's like having X-ray vision into your graph.
When to use: Development and debugging. When you need to understand exactly what's happening at every step.
Trade-off: Generates a lot of data. Not suitable for production UIs, but invaluable during development.
Mixing Modes for Maximum Power
Here's where it gets interesting: you can combine streaming modes. Want to show both the "typing" effect AND track state changes? Use both messages and updates:
stream_mode=["messages", "updates"]
This is perfect for building rich UIs that show both the AI's response and what's happening behind the scenes.
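When you pass a list of modes, LangGraph tags each chunk with the mode it came from, so you receive (mode, payload) tuples and can route each one to the right part of your UI. A plain-Python sketch of that demultiplexing pattern (the hand-built stream is illustrative, not real LangGraph output):

```python
# Sketch of consuming a multi-mode stream: each item is a (mode, payload)
# tuple, and the consumer routes on the mode tag. The stream is hand-built.
def combined_stream():
    yield ("messages", "Hello")
    yield ("updates", {"research": {"topic": "quantum computing"}})
    yield ("messages", " there!")

def consume(stream):
    text, changes = "", []
    for mode, payload in stream:
        if mode == "messages":
            text += payload          # feeds the typing effect
        elif mode == "updates":
            changes.append(payload)  # feeds the progress panel
    return text, changes

print(consume(combined_stream()))
```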
Practical Use Cases
Chat Interface:
- Use messages for the typing effect
- Add updates to show tool calls or reasoning steps
- Result: Users see both the response and understand what the AI is doing
Progress Dashboard:
- Use updates or values to track state evolution
- Add custom for progress percentages or custom metrics
- Result: Clear visibility into long-running workflows
Development & Testing:
- Use debug mode exclusively
- Get complete visibility into execution flow
- Result: Find and fix issues faster
Complex Workflows:
- Use custom to emit progress from time-consuming tools
- Combine with updates to show state changes
- Result: Users stay informed during long operations
Important Considerations
Data Volume: Combining multiple modes increases the amount of data flowing through your application. Consider the overhead, especially for production deployments.
State Management: The updates mode only shows changes. If you need complete history, either use values or implement your own state tracking.
Provider Support: Token-level streaming requires provider support. Check that your LLM provider supports streaming before relying on the messages mode.
Custom Events: The custom mode requires implementation work. You need to emit events from your tools and nodes explicitly.
Getting Started
The beauty of LangGraph's streaming is that you can start simple and add complexity as needed:
- Start with values or updates to see state changes
- Add messages when building chat interfaces
- Implement custom events for specialized needs
- Use debug during development to understand behavior
Streaming transforms AI applications from black boxes into transparent, responsive experiences. By choosing the right streaming mode—or combination of modes—you can build applications that keep users engaged and informed every step of the way.
"""
LangGraph Streaming Modes Examples
Demonstrates different streaming modes and how to use them
"""
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator

# Define the state
class GraphState(TypedDict):
    messages: Annotated[list, operator.add]
    step_count: int
    result: str

# Example nodes
def research_node(state: GraphState):
    """Simulates a research step"""
    return {
        "messages": ["Researching topic..."],
        "step_count": state["step_count"] + 1
    }

def analyze_node(state: GraphState):
    """Simulates an analysis step"""
    return {
        "messages": ["Analyzing data..."],
        "step_count": state["step_count"] + 1
    }

def summarize_node(state: GraphState):
    """Simulates a summary step"""
    return {
        "messages": ["Creating summary..."],
        "step_count": state["step_count"] + 1,
        "result": "Final analysis complete"
    }
# Build the graph
def create_example_graph():
    workflow = StateGraph(GraphState)

    # Add nodes
    workflow.add_node("research", research_node)
    workflow.add_node("analyze", analyze_node)
    workflow.add_node("summarize", summarize_node)

    # Add edges
    workflow.set_entry_point("research")
    workflow.add_edge("research", "analyze")
    workflow.add_edge("analyze", "summarize")
    workflow.add_edge("summarize", END)

    return workflow.compile()
# Example 1: Stream complete state snapshots
def example_values_mode():
    """Stream full state after each node"""
    print("=== VALUES MODE: Full State Snapshots ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with 'values' mode (the initial state is emitted first)
    for state in graph.stream(initial_state, stream_mode="values"):
        print(f"Step {state['step_count']}:")
        print(f"  Messages: {state['messages']}")
        print(f"  Result: {state['result']}")
        print()
# Example 2: Stream only state changes
def example_updates_mode():
    """Stream only what changed after each node"""
    print("=== UPDATES MODE: State Changes Only ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with 'updates' mode; each chunk is a {node_name: changes} dict
    for chunk in graph.stream(initial_state, stream_mode="updates"):
        for node_name, updates in chunk.items():
            print(f"Node '{node_name}' updated:")
            print(f"  Changes: {updates}")
            print()
# Example 3: Combine multiple modes
def example_combined_modes():
    """Use multiple streaming modes simultaneously"""
    print("=== COMBINED MODE: Updates + Debug Info ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with multiple modes; each chunk is a (mode, payload) tuple
    for chunk in graph.stream(
        initial_state,
        stream_mode=["updates", "debug"]
    ):
        print(f"Chunk received: {chunk}")
        print()
# Example 4: Real-world chat interface simulation
async def example_chat_with_streaming():
    """
    Simulates a chat interface using messages mode
    (This would work with an actual LLM that supports streaming)
    """
    print("=== MESSAGES MODE: Token-by-Token Streaming ===\n")

    # In a real implementation, you'd have a graph that includes LLM calls.
    # With stream_mode="messages", each event is a (message_chunk, metadata)
    # tuple. Here's the concept:
    """
    graph = create_chat_graph()
    async for message_chunk, metadata in graph.astream(
        {"query": "Explain quantum computing"},
        stream_mode="messages"
    ):
        # message_chunk.content holds the newly generated tokens
        print(message_chunk.content, end='', flush=True)
    """
    print("Chat: Hello! Let me help you with that...")
    print("(In production, this would stream token-by-token)")
# Example 5: Custom progress tracking
def example_custom_mode():
    """
    Demonstrates custom event streaming
    Your nodes would emit custom events for progress tracking
    """
    print("=== CUSTOM MODE: Custom Events ===\n")

    # In practice, your nodes would receive a StreamWriter (a callable)
    # and invoke it directly to emit events:
    """
    from langgraph.types import StreamWriter

    def long_running_node(state: GraphState, writer: StreamWriter):
        for i in range(100):
            # Do some work...
            writer({
                'type': 'progress',
                'percentage': i + 1,
                'message': f'Processing item {i+1}/100'
            })
        return state

    for event in graph.stream(initial_state, stream_mode="custom"):
        if event['type'] == 'progress':
            print(f"Progress: {event['percentage']}% - {event['message']}")
    """
    print("Custom events would appear here with progress updates")
    print("Percentage: 25% - Processing data chunk 1/4")
    print("Percentage: 50% - Processing data chunk 2/4")
    print("Percentage: 75% - Processing data chunk 3/4")
    print("Percentage: 100% - Complete!")
# Run examples
if __name__ == "__main__":
    print("LangGraph Streaming Examples\n")
    print("=" * 60)
    print()

    # Example 1: Full state snapshots
    example_values_mode()
    print("\n" + "=" * 60 + "\n")

    # Example 2: Just the changes
    example_updates_mode()
    print("\n" + "=" * 60 + "\n")

    # Example 3: Combined modes
    example_combined_modes()
    print("\n" + "=" * 60 + "\n")

    # Example 4: Chat streaming (conceptual)
    print("Example 4: Chat streaming (see code for async example)")
    print("\n" + "=" * 60 + "\n")

    # Example 5: Custom events (conceptual)
    example_custom_mode()
    print("\n" + "=" * 60)

    print("\nKey Takeaways:")
    print("- Use 'values' for complete state at each step")
    print("- Use 'updates' for efficient change tracking")
    print("- Use 'messages' for token-by-token LLM output")
    print("- Use 'custom' for application-specific events")
    print("- Use 'debug' for development and troubleshooting")
    print("- Combine modes with: stream_mode=['updates', 'messages']")
Thanks
Sreeni Ramadorai
