When building complex AI applications, effectively managing conversation history and contextual information is crucial. The LangChain framework provides a variety of memory components, enabling developers to easily implement chatbots with memory. This article takes a closer look at LangChain's memory components, Chain components, and Runnable interface to help developers better understand and use these tools.
Using and Analyzing Buffer Memory Components
Types of Buffer Memory Components
LangChain offers several types of buffer memory components, each with specific purposes and advantages:
- ConversationBufferMemory: The simplest buffer memory, storing all conversation information as memory.
- ConversationBufferWindowMemory: Retains only the most recent k conversation turns (2*k messages) as history, controlled by the k parameter.
- ConversationTokenBufferMemory: Decides when to prune stored interactions by setting a maximum token count (max_token_limit); when the conversation exceeds this limit, the oldest dialogue information is discarded (a minimal sketch follows this list).
- ConversationStringBufferMemory: Equivalent to buffer memory, but it always returns strings (an early memory component encapsulated by LangChain).
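For comparison, here is a minimal sketch of ConversationTokenBufferMemory; the model name, token limit, and example inputs are only illustrative:

from langchain.memory import ConversationTokenBufferMemory
from langchain_openai import ChatOpenAI

# The memory needs an llm so it can count the tokens of the buffered messages
memory = ConversationTokenBufferMemory(
    llm=ChatOpenAI(model="gpt-3.5-turbo-16k"),
    max_token_limit=100,
    return_messages=True,
)

memory.save_context({"input": "Hello, I am learning LangChain."}, {"output": "Great, how can I help?"})
memory.save_context({"input": "What is a memory component?"}, {"output": "It stores conversation history for a chain."})

# Once the buffer exceeds max_token_limit tokens, the oldest messages are pruned
print(memory.load_memory_variables({}))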
Example of Buffer Window Memory
Below is an example that uses ConversationBufferWindowMemory to implement a 2-turn conversation memory:
from operator import itemgetter

import dotenv
from langchain.memory import ConversationBufferWindowMemory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

# Keep only the 2 most recent conversation turns as history
memory = ConversationBufferWindowMemory(
    input_key="query",
    return_messages=True,
    k=2,
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a chatbot developed by OpenAI, please help users solve problems"),
    MessagesPlaceholder("history"),
    ("human", "{query}"),
])

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")

# Load the memory variables into the "history" key before the prompt is built
chain = RunnablePassthrough.assign(
    history=RunnableLambda(memory.load_memory_variables) | itemgetter("history")
) | prompt | llm | StrOutputParser()

while True:
    query = input("Human: ")
    if query == "q":
        exit(0)
    chain_input = {"query": query}
    print("AI: ", flush=True, end="")
    response = chain.stream(chain_input)
    output = ""
    for chunk in response:
        output += chunk
        print(chunk, flush=True, end="")
    print("\nhistory:", memory.load_memory_variables({}))
    # Persist the current turn into memory after the response completes
    memory.save_context(chain_input, {"output": output})
Using and Analyzing Summary Memory Components
Types of Summary Memory Components
LangChain provides two main types of summary memory components:
- ConversationSummaryMemory: Summarizes the historical conversation records into a summary for storage; what is remembered is the summary rather than the raw dialogue data (a minimal sketch follows this list).
- ConversationSummaryBufferMemory: Keeps the raw conversation history as long as it stays within max_token_limit; the parts that exceed this limit are extracted and condensed into a summary.
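For reference, here is a minimal sketch of ConversationSummaryMemory on its own; the model name and example inputs are only illustrative:

from langchain.memory import ConversationSummaryMemory
from langchain_openai import ChatOpenAI

# The memory uses this llm to generate the running summary
memory = ConversationSummaryMemory(
    llm=ChatOpenAI(model="gpt-3.5-turbo-16k"),
    return_messages=True,
)

memory.save_context({"input": "Hi, I am Mu Xiaoke and I am learning LangChain."}, {"output": "Nice to meet you!"})

# The loaded "history" is an LLM-generated summary, not the raw messages
print(memory.load_memory_variables({}))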
Example of Summary Buffer Mixed Memory
Below is an example using ConversationSummaryBufferMemory with max_token_limit set to 300:
from operator import itemgetter

import dotenv
from langchain.memory import ConversationSummaryBufferMemory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a powerful chatbot, please respond to user questions based on the corresponding context"),
    MessagesPlaceholder("history"),
    ("human", "{query}"),
])

# The memory needs its own llm to summarize the messages that overflow the limit
memory = ConversationSummaryBufferMemory(
    return_messages=True,
    input_key="query",
    llm=ChatOpenAI(model="gpt-3.5-turbo-16k"),
    max_token_limit=300,
)

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")

chain = RunnablePassthrough.assign(
    history=RunnableLambda(memory.load_memory_variables) | itemgetter("history")
) | prompt | llm | StrOutputParser()

while True:
    query = input("Human: ")
    if query == "q":
        exit(0)
    chain_input = {"query": query}
    response = chain.stream(chain_input)
    print("AI: ", flush=True, end="")
    output = ""
    for chunk in response:
        output += chunk
        print(chunk, flush=True, end="")
    memory.save_context(chain_input, {"output": output})
    print("")
    print("history:", memory.load_memory_variables({}))
When using summary memory components, pay attention to potential issues: the running summary is injected as an extra system message, so the message list sent to the model can contain multiple system role messages, and some chat models only accept a single system message at the start of a conversation.
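If the target model rejects such a message list, one possible workaround is to collapse all system messages into a single leading one before the prompt is sent. Below is a minimal sketch; merge_system_messages is a hypothetical helper, not a LangChain API:

from langchain_core.messages import BaseMessage, SystemMessage

def merge_system_messages(messages: list[BaseMessage]) -> list[BaseMessage]:
    # Hypothetical helper: fold every system message (e.g. the running summary
    # inserted by the memory) into one leading system message
    system_parts = [m.content for m in messages if isinstance(m, SystemMessage)]
    rest = [m for m in messages if not isinstance(m, SystemMessage)]
    if not system_parts:
        return rest
    return [SystemMessage(content="\n\n".join(system_parts))] + rest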
Using and Analyzing Entity Memory Components
Entity memory components are used to track entities mentioned in a conversation and remember established facts about specific entities. LangChain provides the ConversationEntityMemory class to achieve this functionality.
Example of Using ConversationEntityMemory
import dotenv
from langchain.chains.conversation.base import ConversationChain
from langchain.memory import ConversationEntityMemory
from langchain.memory.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# ConversationChain wires the entity memory into the dedicated entity prompt
chain = ConversationChain(
    llm=llm,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    memory=ConversationEntityMemory(llm=llm),
)

print(chain.invoke({"input": "Hello, I am currently learning LangChain."}))
print(chain.invoke({"input": "My favorite programming language is Python."}))
print(chain.invoke({"input": "I live in Guangzhou."}))

# Inspect the facts the memory has extracted about each entity
res = chain.memory.entity_store.store
print(res)
Persistence of Memory Components and Third-Party Integration
LangChain's memory components do not have built-in persistence capabilities, but conversation history can be persisted by passing a chat_memory implementation. LangChain integrates with over 50 third-party conversation message history storage solutions, including Postgres, Redis, Kafka, MongoDB, SQLite, etc.
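For example, here is a minimal sketch that backs ConversationBufferMemory with FileChatMessageHistory so the history survives restarts; the file name is illustrative, and backends such as RedisChatMessageHistory expose the same BaseChatMessageHistory interface:

from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import FileChatMessageHistory

# Back the in-process memory with a file-based message history
memory = ConversationBufferMemory(
    chat_memory=FileChatMessageHistory("chat_history.txt"),
    return_messages=True,
)

memory.save_context({"input": "Hello"}, {"output": "Hi, how can I help?"})

# On the next run, the saved messages are loaded back from the file
print(memory.load_memory_variables({}))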
Using and Understanding Built-in Chain Components
Introduction and Usage of Chain
In LangChain, a Chain is used to link multiple components (such as LLMs, prompt templates, vector stores, memory, output parsers, etc.) for joint usage. LangChain supports two types of chains:
- Chains built using LCEL (LangChain Expression Language)
- Chains built via subclasses of the Chain class (Legacy Chains)
Basic Example of Using LLMChain
import dotenv
from langchain.chains.llm import LLMChain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
prompt = ChatPromptTemplate.from_template("Tell a cold joke about {subject}")

chain = LLMChain(
    llm=llm,
    prompt=prompt,
)

# LLMChain supports several (mostly legacy) invocation styles:
print(chain("programmer"))                          # __call__ with a single positional input
print(chain.run("programmer"))                      # run returns only the output text
print(chain.apply([{"subject": "programmer"}]))     # apply processes a list of inputs
print(chain.generate([{"subject": "programmer"}]))  # generate returns an LLMResult with metadata
print(chain.predict(subject="programmer"))          # predict takes keyword arguments
print(chain.invoke({"subject": "programmer"}))      # invoke is the current Runnable-style entry point
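For comparison, here is a sketch of the same joke chain expressed with LCEL, composing the components with the | operator:

import dotenv
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

# The "|" operator composes prompt, model, and parser into one Runnable pipeline
lcel_chain = (
    ChatPromptTemplate.from_template("Tell a cold joke about {subject}")
    | ChatOpenAI(model="gpt-3.5-turbo-16k")
    | StrOutputParser()
)

print(lcel_chain.invoke({"subject": "programmer"}))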
Built-in Chains
LangChain provides various built-in Chains, including LCEL Chains and Legacy Chains. For example, the create_stuff_documents_chain function creates a chain that "stuffs" a list of documents into the prompt for document-based question answering:
import dotenv
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a powerful chatbot that can respond to user questions based on the corresponding context.\n\n<context>{context}</context>"),
    ("human", "{query}"),
])

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")

# The chain formats the documents and fills them into the {context} variable
chain = create_stuff_documents_chain(llm=llm, prompt=prompt)

documents = [
    Document(page_content="Xiao Ming likes green but not yellow."),
    Document(page_content="Xiao Wang likes pink and also a bit of red."),
    Document(page_content="Xiao Ze likes blue but prefers cyan."),
]

resp = chain.invoke({"query": "What colors does everyone like?", "context": documents})
print(resp)
Simplifying Code with RunnableWithMessageHistory
RunnableWithMessageHistory is a wrapper that allows a chain to automatically handle the process of filling in and storing historical messages.
Example of Using RunnableWithMessageHistory
import dotenv
from langchain_community.chat_message_histories import FileChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    # Each session gets its own file-backed message history
    if session_id not in store:
        store[session_id] = FileChatMessageHistory(f"chat_history_{session_id}.txt")
    return store[session_id]

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a powerful chatbot. Please respond to user queries based on their needs."),
    MessagesPlaceholder("history"),
    ("human", "{query}"),
])

llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
chain = prompt | llm | StrOutputParser()

# The wrapper loads history before each call and saves the new turn afterwards
with_message_chain = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="query",
    history_messages_key="history",
)

while True:
    query = input("Human: ")
    if query == "q":
        exit(0)
    response = with_message_chain.stream(
        {"query": query},
        config={"configurable": {"session_id": "muxiaoke"}},
    )
    print("AI: ", flush=True, end="")
    for chunk in response:
        print(chunk, flush=True, end="")
    print("")
By using RunnableWithMessageHistory, we can more easily manage conversation histories for multiple users and automatically handle the loading and storage of historical messages in the chain. This helps developers create smarter and more personalized conversational systems.
Conclusion
LangChain provides a rich set of memory components and Chain components, enabling developers to easily build context-aware AI applications. By properly utilizing these components, we can create smarter and more personalized conversational systems. With the continuous development of LangChain, the introduction of LCEL expressions and Runnable interfaces further simplifies the application building process. Developers should choose the appropriate memory components and Chain types based on specific needs to achieve optimal application performance and user experience.