When building complex AI applications, LangGraph is a powerful tool, offering flexible graph-structured programming. Today, we will take a close look at one of its key features: streaming response modes. Streaming not only improves perceived response speed but also gives users a smoother interactive experience.
Streaming Responses in LangGraph: How Are They Different from Traditional LLM Streaming?
In LangGraph, the compiled graph program is essentially a Runnable component. Unlike a traditional Large Language Model (LLM), which streams one token at a time, LangGraph supports multiple streaming modes in which each streamed chunk is the data state produced by a node. This design offers more granular control and richer data presentation.
Two Basic Streaming Modes
LangGraph provides two main streaming response modes, each with specific use cases:
- Values Mode: returns the complete state values of the graph (full snapshot)
  - After each node call, the complete state of the graph is returned.
  - Suitable for scenarios where the entire graph state needs to be known at every step.
- Updates Mode: returns state updates of the graph (incremental)
  - After each node call, only the changes to the state are returned.
  - Suitable when you only care about what changed, or want to save bandwidth.
How to Use Streaming Modes?
Using streaming modes is straightforward: when calling the stream() function, simply pass the stream_mode parameter to select the desired mode. Let's look at how to use these two modes with a ReAct agent as an example.
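First, suppose we have a prebuilt ReAct agent. The following is a minimal sketch: create_react_agent is LangGraph's prebuilt helper, but the model choice and the search tool below are illustrative assumptions, not part of the original example.

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def search(query: str) -> str:
    """Web search stub; a real app would call an actual search API here."""
    return "2024 Beijing Half Marathon podium: ..."

# The compiled graph is a Runnable, so it exposes stream()/astream().
agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[search])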
# Example of values mode: each chunk is the full graph state after a node runs
inputs = {"messages": [("human", "What are the top 3 results of the 2024 Beijing Half Marathon?")]}
for chunk in agent.stream(inputs, stream_mode="values"):
    # pretty_print() writes to stdout itself, so wrapping it in print()
    # would just print a stray "None" after each message.
    chunk["messages"][-1].pretty_print()

# Example of updates mode: each chunk maps a node name to its state changes
for chunk in agent.stream(inputs, stream_mode="updates"):
    print(chunk)
In values mode, each chunk is the complete graph state. In updates mode, each chunk is a dictionary of incremental changes, with node names as keys and their state updates as values.
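For intuition, the two shapes differ roughly as follows (illustrative output only; the exact node names and message objects depend on your graph and model, though "agent" and "tools" are the names used by the prebuilt ReAct graph):

# values mode: the whole state every time
# {"messages": [HumanMessage(...), AIMessage(...), ToolMessage(...), AIMessage(...)]}

# updates mode: only what each node just wrote, keyed by node name
# {"agent": {"messages": [AIMessage(...)]}}
# {"tools": {"messages": [ToolMessage(...)]}}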
Current Limitations and Future Prospects
Although LangGraph's streaming mechanism already provides powerful features, it still has limitations. While we can correctly retrieve each node's output, the wait between chunks can be long, especially for nodes that invoke large language models: in values and updates modes, a node's output is only emitted after the node finishes. Ideally, LLM nodes should keep their inherent token-by-token streaming under the graph's streaming output, rather than waiting for the complete response before returning.
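Depending on your LangChain and LangGraph versions, token-level streaming can be recovered through the Runnable astream_events API, which surfaces events from inside the graph's nodes, including individual chat-model chunks. A minimal sketch, assuming an async context and the event names used by recent LangChain releases:

import asyncio

async def stream_tokens():
    # astream_events yields fine-grained events from every Runnable
    # inside the graph, including per-token chat model output.
    async for event in agent.astream_events(inputs, version="v2"):
        if event["event"] == "on_chat_model_stream":
            token = event["data"]["chunk"]  # an AIMessageChunk
            print(token.content, end="", flush=True)

asyncio.run(stream_tokens())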
Ideal Agent Output Method
Common agent systems on the market (such as Coze, Dify, Zhipu, GPTs, etc.) adopt a better approach: each step's result (knowledge base retrieval, tool invocation, LLM content generation) is streamed back to the front end as soon as it completes, which yields faster perceived response times and a better user experience. Within relatively time-consuming steps, such as LLM content generation, output is itself streamed token by token, avoiding long silent periods that could cause the connection to be dropped. A rough sketch of this pattern is shown below.
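This sketch consumes updates-mode chunks and forwards each completed step to the client immediately; the emit() function and event schema are hypothetical placeholders for whatever transport you use (SSE, WebSockets, etc.):

import json

def emit(event: dict) -> None:
    # Placeholder transport: a real app would write an SSE frame
    # or a WebSocket message to the connected client here.
    print(json.dumps(event, default=str))

for chunk in agent.stream(inputs, stream_mode="updates"):
    for node_name, state_update in chunk.items():
        # Forward each step's result as soon as its node finishes,
        # instead of waiting for the whole run to complete.
        emit({"step": node_name, "update": state_update})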
Conclusion
LangGraph’s streaming response mechanism provides us with powerful tools for building efficient, responsive AI applications. By making good use of the values and updates modes, we can optimize application performance and user experience according to specific needs. Although there are still some limitations, with continuous technological advancement, we can expect LangGraph to offer more comprehensive and efficient streaming processing capabilities in the future.
In practical applications, developers are advised to choose the appropriate streaming mode for the scenario at hand and to keep an eye on LangGraph updates, so they can leverage the latest features to optimize their AI applications.