LLMs are stateless. Every time you send a message, it's like the first time the AI has ever met you. Today, we're going to fix that. We're giving our agents a "notebook" so they can remember who you are and what you've discussed.
🐠 The "Goldfish Memory" Problem
In a standard chain, the flow is a one-way street: Input → AI → Output. Once the response is sent, the AI's "brain" resets.
To create a real conversation, we need to feed the past messages back into the AI every single time you hit enter. LangChain makes this incredibly easy by managing the "notebook" for us.
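Before reaching for LangChain, it helps to see what "feeding the past back in" looks like by hand. Here is a minimal, dependency-free sketch of that loop; the `history` list and `build_prompt` helper are illustrative names, not LangChain APIs:

```python
# A bare-bones chat loop: we replay the whole history on every turn.
history = []  # list of (role, text) tuples -- the AI's "notebook"

def build_prompt(history, new_input):
    """Flatten past turns plus the new message into one prompt string."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"Human: {new_input}")
    return "\n".join(lines)

# Turn 1
prompt = build_prompt(history, "Hi, I'm Gemini")
history.append(("Human", "Hi, I'm Gemini"))
history.append(("AI", "Hello Gemini! How can I help?"))

# Turn 2: the earlier exchange rides along inside the new prompt
prompt = build_prompt(history, "What's my name?")
print(prompt)
```

Every memory class below is, at heart, a smarter way of maintaining that `history` list for you.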
📝 1. The Simple Transcript: ConversationBufferMemory
This is the most straightforward memory type. It stores everything, every "Hi" and every "Bye", in a raw list.
Best for: Short conversations where you need exact wording.
The Catch: If the chat gets too long, it will eventually exceed the AI's "Context Window" (the amount of text it can read at once).
```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()

# Manually simulating a chat
memory.save_context({"input": "Hi, I'm Gemini"}, {"output": "Hello Gemini! How can I help?"})
memory.save_context({"input": "What's my name?"}, {"output": "Your name is Gemini."})

# Peek inside the notebook
print(memory.load_memory_variables({}))
```
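If you don't have LangChain installed, the behaviour above is easy to mimic in plain Python. This hypothetical `MiniBufferMemory` class is only an illustration of what buffer memory does conceptually, not LangChain's actual implementation:

```python
class MiniBufferMemory:
    """Toy stand-in for ConversationBufferMemory: a raw, verbatim transcript."""

    def __init__(self):
        self.turns = []  # every exchange, word for word

    def save_context(self, inputs, outputs):
        self.turns.append(f"Human: {inputs['input']}")
        self.turns.append(f"AI: {outputs['output']}")

    def load_memory_variables(self, _):
        return {"history": "\n".join(self.turns)}

memory = MiniBufferMemory()
memory.save_context({"input": "Hi, I'm Gemini"}, {"output": "Hello Gemini! How can I help?"})
memory.save_context({"input": "What's my name?"}, {"output": "Your name is Gemini."})
print(memory.load_memory_variables({})["history"])
```

Notice that nothing is ever thrown away, which is exactly why this type eventually overflows the context window.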
✂️ 2. The Efficient Pro: ConversationSummaryMemory
As your chat grows, a raw transcript becomes expensive and slow. Summary Memory uses an LLM to summarize the conversation as it happens.
Instead of remembering 50 pages of text, the AI keeps a 1-paragraph "story so far."
```python
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationSummaryMemory

llm = ChatOpenAI(model="gpt-4o-mini")
memory = ConversationSummaryMemory(llm=llm)

# The AI summarizes the interaction in the background,
# keeping your 'notebook' slim and focused!
```
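Under the hood, the pattern is "fold each new exchange into a running summary." In this dependency-free sketch, the `summarize` function stands in for the LLM call (in real LangChain code, that call goes to the model you pass as `llm`), and `MiniSummaryMemory` is a hypothetical name for illustration:

```python
def summarize(prev_summary, new_lines):
    """Stand-in for the LLM call. A real model would write a fluent
    condensed paragraph; here we just concatenate to show the data flow."""
    return (prev_summary + " " + new_lines).strip()

class MiniSummaryMemory:
    """Toy stand-in for ConversationSummaryMemory: keeps one rolling summary."""

    def __init__(self, summarizer):
        self.summarizer = summarizer
        self.summary = ""

    def save_context(self, inputs, outputs):
        new_lines = f"Human said: {inputs['input']}. AI said: {outputs['output']}."
        # Old turns are not stored -- they only survive inside the summary.
        self.summary = self.summarizer(self.summary, new_lines)

    def load_memory_variables(self, _):
        return {"history": self.summary}

memory = MiniSummaryMemory(summarize)
memory.save_context({"input": "Hi, I'm Gemini"}, {"output": "Hello!"})
memory.save_context({"input": "I love Python"}, {"output": "Great choice!"})

# However long the chat gets, only one summary string is ever stored.
print(memory.load_memory_variables({})["history"])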
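Under the hood, the pattern is "fold each new exchange into a running summary." In this dependency-free sketch, the `summarize` function stands in for the LLM call (in real LangChain code, that call goes to the model you pass as `llm`), and `MiniSummaryMemory` is a hypothetical name for illustration:

```python
def summarize(prev_summary, new_lines):
    """Stand-in for the LLM call. A real model would write a fluent
    condensed paragraph; here we just concatenate to show the data flow."""
    return (prev_summary + " " + new_lines).strip()

class MiniSummaryMemory:
    """Toy stand-in for ConversationSummaryMemory: keeps one rolling summary."""

    def __init__(self, summarizer):
        self.summarizer = summarizer
        self.summary = ""

    def save_context(self, inputs, outputs):
        new_lines = f"Human said: {inputs['input']}. AI said: {outputs['output']}."
        # Old turns are not stored -- they only survive inside the summary.
        self.summary = self.summarizer(self.summary, new_lines)

    def load_memory_variables(self, _):
        return {"history": self.summary}

memory = MiniSummaryMemory(summarize)
memory.save_context({"input": "Hi, I'm Gemini"}, {"output": "Hello!"})
memory.save_context({"input": "I love Python"}, {"output": "Great choice!"})

# However long the chat gets, only one summary string is ever stored.
print(memory.load_memory_variables({})["history"])
```

The trade-off: you pay an extra LLM call per turn, and exact wording from early in the chat is lost.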
🪟 3. The "Window" Trick: ConversationBufferWindowMemory
If you only care about what happened in the last 5 minutes, you use Window Memory. It only keeps the last k interactions. This is a great way to keep your token costs predictable.
```python
from langchain.memory import ConversationBufferWindowMemory

# 'k=2' means it only remembers the last 2 turns of the conversation
memory = ConversationBufferWindowMemory(k=2)

memory.save_context({"input": "My favorite color is Blue"}, {"output": "Got it!"})
memory.save_context({"input": "I live in New York"}, {"output": "Nice city!"})
memory.save_context({"input": "I am a coder"}, {"output": "That's cool!"})

# Now, if we ask "What is my favorite color?", it will have forgotten!
# It only remembers 'I live in New York' and 'I am a coder'.
print(memory.load_memory_variables({}))
```
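The sliding window is simple enough to build yourself with `collections.deque`. This sketch (the `MiniWindowMemory` name is illustrative, not a LangChain class) reproduces the forgetting behaviour shown above:

```python
from collections import deque

class MiniWindowMemory:
    """Toy stand-in for ConversationBufferWindowMemory: keeps the last k turns."""

    def __init__(self, k):
        # Each entry is one Human/AI exchange; deque evicts the oldest for us.
        self.turns = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        self.turns.append(f"Human: {inputs['input']}\nAI: {outputs['output']}")

    def load_memory_variables(self, _):
        return {"history": "\n".join(self.turns)}

memory = MiniWindowMemory(k=2)
memory.save_context({"input": "My favorite color is Blue"}, {"output": "Got it!"})
memory.save_context({"input": "I live in New York"}, {"output": "Nice city!"})
memory.save_context({"input": "I am a coder"}, {"output": "That's cool!"})

history = memory.load_memory_variables({})["history"]
print("Blue" in history)  # False -- the oldest turn has been evicted
```

Because the window is fixed, the prompt size (and therefore the cost per turn) stays bounded no matter how long the user keeps chatting.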
🎯 Day 6 Summary
Today, we graduated from building scripts to building Companions. We covered:
Statelessness: Why LLMs naturally forget.
Buffer Memory: Storing the full transcript.
Summary Memory: Condensing the history to save space.
Window Memory: Keeping only the most recent messages.
Your Homework: If you were building an AI Customer Support bot, which memory type would you choose? Think about the balance between "knowing the user's specific problem" and "not getting confused by a 2-hour long chat."
See you tomorrow!