Understanding LangGraph Streaming Modes
Building AI applications that feel responsive and interactive is crucial for user experience. Nobody wants to stare at a loading spinner for 30 seconds while an AI agent completes its work. This is where LangGraph's streaming capabilities shine, allowing you to show progress, intermediate results, and real-time outputs as they happen.
Why Streaming Matters
Traditional AI workflows make you wait for the entire process to complete before showing any results. LangGraph flips this model by letting you stream outputs in real time. This means:
- Better UX: Users see progress as it happens, not just the final result
- Transparency: Watch your agent's reasoning unfold step-by-step
- Interactivity: Build applications that feel alive and responsive
- Debugging: Understand what's happening inside your graph during development
The Five Streaming Modes
LangGraph offers five distinct streaming modes, each designed for different use cases. You can even combine them to get the best of multiple worlds.
1. values: Complete State Snapshots
Streams the full state of your graph after every node execution. Think of it as taking a complete photograph of your entire state at each step.
When to use: You need to see the complete context and all state variables at each stage. Great for applications where the full state history matters.
Trade-off: More data-heavy than alternatives since you're getting everything, not just what changed.
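To make the snapshot idea concrete, here is a minimal plain-Python sketch (no LangGraph required; the two pipeline steps are hypothetical stand-ins for graph nodes) that yields a full copy of the state after every step, the way values mode does:

```python
# Plain-Python sketch of "values"-style streaming: yield the FULL state
# after every step. The steps are hypothetical stand-ins for graph nodes.
def stream_values(state):
    steps = [
        lambda s: {**s, "topic": "quantum computing"},      # "research" step
        lambda s: {**s, "summary": "qubits can entangle"},  # "summarize" step
    ]
    for step in steps:
        state = step(state)
        yield dict(state)  # complete snapshot, not just the delta

for snapshot in stream_values({"query": "explain qubits"}):
    print(snapshot)
```

Every snapshot carries the whole state, so the final one is self-contained, at the cost of repeating everything that hasn't changed.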
2. updates: Just the Changes
Streams only the state changes (deltas) after each node runs. Instead of the full picture, you get just what changed.
When to use: Building dashboards or UIs that track progress incrementally. Perfect when you only care about what's new or different.
Trade-off: More efficient than values, but you need to track history yourself if you need the complete context later.
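The same pipeline sketched delta-style: each chunk carries only what the step produced, keyed by the (hypothetical) node name, which mirrors the {node_name: changes} dicts that LangGraph's updates mode yields:

```python
# Plain-Python sketch of "updates"-style streaming: yield only the delta
# each step produced, keyed by a hypothetical node name.
def stream_updates(state):
    steps = {
        "research": lambda s: {"topic": "quantum computing"},
        "summarize": lambda s: {"summary": "qubits can entangle"},
    }
    for name, step in steps.items():
        delta = step(state)
        state.update(delta)      # caller must accumulate history itself
        yield {name: delta}      # just what changed, like stream_mode="updates"

for chunk in stream_updates({"query": "explain qubits"}):
    print(chunk)
```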
3. messages: Token-by-Token LLM Output
Streams LLM output tokens as they're generated, creating that familiar "typing" effect you see in modern chat interfaces.
When to use: Chat-style applications where you want users to see the AI "thinking" and responding in real time.
Trade-off: Requires your LLM provider to support streaming. Not all providers or models offer this.
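The "typing" effect itself is just incremental printing of tokens as they arrive. A minimal simulation, with a hypothetical fake_llm_stream generator standing in for a streaming LLM call:

```python
# Simulated token stream: this generator stands in for a streaming LLM call.
def fake_llm_stream(prompt):
    for token in ["Quantum ", "computers ", "use ", "qubits."]:
        yield token

# Print each token the moment it arrives, like a chat UI's typing effect.
for token in fake_llm_stream("Explain quantum computing"):
    print(token, end="", flush=True)
print()
```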
4. custom: Your Own Events
Allows your code and tools to emit custom events, logs, or data during execution. Think progress percentages, custom metrics, or tool-specific updates.
When to use: Long-running workflows where you want to signal progress or emit custom data that's specific to your application's needs.
Trade-off: Requires you to implement custom event emission in your tools and nodes.
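In LangGraph the stream writer handed to a node is a callable that you invoke with an arbitrary payload. The emission pattern can be sketched standalone, with a plain list's append playing the role of the writer:

```python
# Sketch of custom-event emission: the node receives a writer callable and
# calls it with arbitrary payloads. Here a list's append plays the writer.
def long_running_node(state, writer):
    total = 4
    for i in range(total):
        # ... do a chunk of work ...
        writer({"type": "progress", "percentage": (i + 1) * 100 // total})
    return state

events = []
long_running_node({}, events.append)
for event in events:
    print(f"Progress: {event['percentage']}%")
```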
5. debug: Complete Execution Trace
Streams a full execution trace including node entry/exit, state before/after, tool inputs/outputs, and errors. It's like having X-ray vision into your graph.
When to use: Development and debugging. When you need to understand exactly what's happening at every step.
Trade-off: Generates a lot of data. Not suitable for production UIs, but invaluable during development.
Mixing Modes for Maximum Power
Here's where it gets interesting: you can combine streaming modes. Want to show both the "typing" effect AND track state changes? Use both messages and updates:
stream_mode=["messages", "updates"]
This is perfect for building rich UIs that show both the AI's response and what's happening behind the scenes.
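When you pass a list of modes, LangGraph tags each chunk with the mode it came from, so you receive (mode, payload) tuples and can route each one to the right part of your UI. A plain-Python sketch of that demultiplexing pattern (the hand-built stream is illustrative, not real LangGraph output):

```python
# Sketch of consuming a multi-mode stream: each item is a (mode, payload)
# tuple, and the consumer routes on the mode tag. The stream is hand-built.
def combined_stream():
    yield ("messages", "Hello")
    yield ("updates", {"research": {"topic": "quantum computing"}})
    yield ("messages", " there!")

def consume(stream):
    text, changes = "", []
    for mode, payload in stream:
        if mode == "messages":
            text += payload          # feeds the typing effect
        elif mode == "updates":
            changes.append(payload)  # feeds the progress panel
    return text, changes

print(consume(combined_stream()))
```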
Practical Use Cases
Chat Interface:
- Use messages for the typing effect
- Add updates to show tool calls or reasoning steps
- Result: Users see both the response and understand what the AI is doing
Progress Dashboard:
- Use updates or values to track state evolution
- Add custom for progress percentages or custom metrics
- Result: Clear visibility into long-running workflows
Development & Testing:
- Use debug mode exclusively
- Get complete visibility into execution flow
- Result: Find and fix issues faster
Complex Workflows:
- Use custom to emit progress from time-consuming tools
- Combine with updates to show state changes
- Result: Users stay informed during long operations
Important Considerations
Data Volume: Combining multiple modes increases the amount of data flowing through your application. Consider the overhead, especially for production deployments.
State Management: The updates mode only shows changes. If you need complete history, either use values or implement your own state tracking.
Provider Support: Token-level streaming requires provider support. Check that your LLM provider supports streaming before relying on the messages mode.
Custom Events: The custom mode requires implementation work. You need to emit events from your tools and nodes explicitly.
Getting Started
The beauty of LangGraph's streaming is that you can start simple and add complexity as needed:
- Start with values or updates to see state changes
- Add messages when building chat interfaces
- Implement custom events for specialized needs
- Use debug during development to understand behavior
Streaming transforms AI applications from black boxes into transparent, responsive experiences. By choosing the right streaming mode—or combination of modes—you can build applications that keep users engaged and informed every step of the way.
"""
LangGraph Streaming Modes Examples
Demonstrates different streaming modes and how to use them
"""
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator

# Define the state
class GraphState(TypedDict):
    messages: Annotated[list, operator.add]
    step_count: int
    result: str

# Example nodes
def research_node(state: GraphState):
    """Simulates a research step"""
    return {
        "messages": ["Researching topic..."],
        "step_count": state["step_count"] + 1
    }

def analyze_node(state: GraphState):
    """Simulates an analysis step"""
    return {
        "messages": ["Analyzing data..."],
        "step_count": state["step_count"] + 1
    }

def summarize_node(state: GraphState):
    """Simulates a summary step"""
    return {
        "messages": ["Creating summary..."],
        "step_count": state["step_count"] + 1,
        "result": "Final analysis complete"
    }
# Build the graph
def create_example_graph():
    workflow = StateGraph(GraphState)

    # Add nodes
    workflow.add_node("research", research_node)
    workflow.add_node("analyze", analyze_node)
    workflow.add_node("summarize", summarize_node)

    # Add edges
    workflow.set_entry_point("research")
    workflow.add_edge("research", "analyze")
    workflow.add_edge("analyze", "summarize")
    workflow.add_edge("summarize", END)

    return workflow.compile()
# Example 1: Stream complete state snapshots
def example_values_mode():
    """Stream full state after each node"""
    print("=== VALUES MODE: Full State Snapshots ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with 'values' mode (the initial state is emitted first)
    for state in graph.stream(initial_state, stream_mode="values"):
        print(f"Step {state['step_count']}:")
        print(f"  Messages: {state['messages']}")
        print(f"  Result: {state['result']}")
        print()
# Example 2: Stream only state changes
def example_updates_mode():
    """Stream only what changed after each node"""
    print("=== UPDATES MODE: State Changes Only ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with 'updates' mode; each chunk is a {node_name: changes} dict
    for chunk in graph.stream(initial_state, stream_mode="updates"):
        for node_name, updates in chunk.items():
            print(f"Node '{node_name}' updated:")
            print(f"  Changes: {updates}")
            print()
# Example 3: Combine multiple modes
def example_combined_modes():
    """Use multiple streaming modes simultaneously"""
    print("=== COMBINED MODE: Updates + Debug Info ===\n")

    graph = create_example_graph()
    initial_state = {
        "messages": [],
        "step_count": 0,
        "result": ""
    }

    # Stream with multiple modes; each chunk is a (mode, payload) tuple
    for chunk in graph.stream(
        initial_state,
        stream_mode=["updates", "debug"]
    ):
        print(f"Chunk received: {chunk}")
        print()
# Example 4: Real-world chat interface simulation
async def example_chat_with_streaming():
    """
    Simulates a chat interface using messages mode
    (This would work with an actual LLM that supports streaming)
    """
    print("=== MESSAGES MODE: Token-by-Token Streaming ===\n")

    # In a real implementation, you'd have a graph that includes LLM calls.
    # With stream_mode="messages", each event is a (message_chunk, metadata)
    # tuple. Here's the concept:
    """
    graph = create_chat_graph()
    async for message_chunk, metadata in graph.astream(
        {"query": "Explain quantum computing"},
        stream_mode="messages"
    ):
        # message_chunk.content holds the newly generated tokens
        print(message_chunk.content, end='', flush=True)
    """
    print("Chat: Hello! Let me help you with that...")
    print("(In production, this would stream token-by-token)")
# Example 5: Custom progress tracking
def example_custom_mode():
    """
    Demonstrates custom event streaming
    Your nodes would emit custom events for progress tracking
    """
    print("=== CUSTOM MODE: Custom Events ===\n")

    # In practice, your nodes would receive a StreamWriter (a callable)
    # and invoke it directly to emit events:
    """
    from langgraph.types import StreamWriter

    def long_running_node(state: GraphState, writer: StreamWriter):
        for i in range(100):
            # Do some work...
            writer({
                'type': 'progress',
                'percentage': i + 1,
                'message': f'Processing item {i+1}/100'
            })
        return state

    for event in graph.stream(initial_state, stream_mode="custom"):
        if event['type'] == 'progress':
            print(f"Progress: {event['percentage']}% - {event['message']}")
    """
    print("Custom events would appear here with progress updates")
    print("Percentage: 25% - Processing data chunk 1/4")
    print("Percentage: 50% - Processing data chunk 2/4")
    print("Percentage: 75% - Processing data chunk 3/4")
    print("Percentage: 100% - Complete!")
# Run examples
if __name__ == "__main__":
    print("LangGraph Streaming Examples\n")
    print("=" * 60)
    print()

    # Example 1: Full state snapshots
    example_values_mode()
    print("\n" + "=" * 60 + "\n")

    # Example 2: Just the changes
    example_updates_mode()
    print("\n" + "=" * 60 + "\n")

    # Example 3: Combined modes
    example_combined_modes()
    print("\n" + "=" * 60 + "\n")

    # Example 4: Chat streaming (conceptual)
    print("Example 4: Chat streaming (see code for async example)")
    print("\n" + "=" * 60 + "\n")

    # Example 5: Custom events (conceptual)
    example_custom_mode()
    print("\n" + "=" * 60)

    print("\nKey Takeaways:")
    print("- Use 'values' for complete state at each step")
    print("- Use 'updates' for efficient change tracking")
    print("- Use 'messages' for token-by-token LLM output")
    print("- Use 'custom' for application-specific events")
    print("- Use 'debug' for development and troubleshooting")
    print("- Combine modes with: stream_mode=['updates', 'messages']")
Thanks
Sreeni Ramadorai
