Have you ever wondered how chatbots like ChatGPT remember your previous messages? Or how they maintain context across multiple conversations? If you're a student diving into generative AI, understanding memory management is one of the most crucial skills you'll need.
In this tutorial, we'll build a complete chatbot application using LangGraph that demonstrates two different memory strategies: temporary (in-memory) and persistent (database-backed). By the end, you'll understand how to implement conversation memory in your own AI applications.
👉 GitHub Repository: JaimeLucena/langgraph-memory-chatbot
🎯 What You'll Learn
- How to use LangGraph for building conversational AI workflows
- Implementing dual memory modes (temporary vs persistent)
- Managing conversation state with checkpoints
- Integrating tools (Wikipedia, Weather) into your chatbot
- Building a full-stack AI application with FastAPI and Streamlit
🧠 Why Memory Matters in Conversational AI
When you chat with an AI, each message needs context from previous messages. Without memory, every interaction would be isolated—the AI wouldn't remember your name, preferences, or what you discussed earlier.
Memory in AI chatbots serves two main purposes:
- Context Preservation: Maintains conversation history so the AI can reference earlier messages
- State Management: Tracks the conversation flow and user preferences across sessions
LangGraph provides powerful abstractions for managing this memory through checkpointers—components that save and restore conversation state.
🏗️ Project Architecture
Our chatbot uses a clean, modular architecture:
User Input → FastAPI Backend → LangGraph Workflow → Memory Store → Response
The LangGraph workflow handles:
- Processing user messages
- Deciding when to use tools
- Managing conversation state
- Persisting memory via checkpoints
📦 Setting Up the Project
First, let's set up our environment. We'll use uv for fast dependency management:
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository
git clone https://github.com/JaimeLucena/langgraph-memory-chatbot.git
cd langgraph-memory-chatbot
# Install dependencies
uv sync
Create a .env file:
OPENAI_API_KEY=sk-your-api-key-here
OPENAI_MODEL=gpt-4o-mini
SQLITE_PATH=.data/memory.sqlite
SYSTEM_PROMPT=You are a helpful and concise assistant.
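The rest of the code reads these values from a settings object (you'll see settings.system_prompt later). The repository's actual Settings class isn't shown in this post; here's a minimal sketch of how it could be loaded with pydantic-settings, with field names assumed from the .env keys above:

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Read values from .env; field names are assumed to mirror the keys above.
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    openai_api_key: str
    openai_model: str = "gpt-4o-mini"
    sqlite_path: str = ".data/memory.sqlite"
    system_prompt: str = "You are a helpful and concise assistant."

settings = Settings()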
🧩 Understanding LangGraph Memory with Checkpointers
The heart of our memory system is LangGraph's checkpointer concept. A checkpointer is responsible for saving and loading conversation state.
Two Memory Modes
1. Temporary Memory (MemorySaver)
- Stored in RAM
- Fast but lost on server restart
- Perfect for testing and ephemeral conversations
2. Persistent Memory (SqliteSaver)
- Stored in SQLite database
- Survives server restarts
- Essential for production applications
Let's see how we implement this in code:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, END

def build_graph(checkpointer=None):
    """
    Builds the LangGraph workflow.
    If no checkpointer is provided, uses the in-memory MemorySaver (temporary).
    For persistent mode, pass a SqliteSaver instance.
    """
    graph = StateGraph(ChatState)

    # Add nodes
    graph.add_node("respond", respond_node)
    graph.add_node("tools", TOOLS_NODE)

    # Set the entry point
    graph.set_entry_point("respond")

    # Conditional routing
    graph.add_conditional_edges("respond", needs_tools, {
        "tools": "tools",
        "end": END
    })

    # After tools, return to the respond node
    graph.add_edge("tools", "respond")

    # Use the provided checkpointer or default to MemorySaver
    return graph.compile(checkpointer=checkpointer or MemorySaver())
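Later, the FastAPI endpoint picks a graph per request via graphs[mode]. That dictionary isn't shown in this post, but building both variants is straightforward; here's a plausible sketch, assuming the mode keys "temporary" and "persistent" and the SQLITE_PATH from the .env file:

import os
import sqlite3
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.sqlite import SqliteSaver

# Temporary: state lives in RAM and disappears when the process stops.
temporary_graph = build_graph(MemorySaver())

# Persistent: state is written to SQLite and survives restarts.
os.makedirs(".data", exist_ok=True)
conn = sqlite3.connect(".data/memory.sqlite", check_same_thread=False)  # shared across FastAPI threads
persistent_graph = build_graph(SqliteSaver(conn))

graphs = {"temporary": temporary_graph, "persistent": persistent_graph}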
The ChatState: Our Conversation Container
The state is defined as a TypedDict that LangGraph uses to track conversation data:
from typing import Annotated, TypedDict, List
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
class ChatState(TypedDict):
    messages: Annotated[List[BaseMessage], add_messages]
    user_input: str
The add_messages reducer automatically merges new messages into the existing list, which is perfect for conversation history.
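To see what that means in practice, you can call the reducer directly. This is just an illustration of the merge behavior, not code from the repository:

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

history = [HumanMessage(content="Hi, I'm Ana.")]
new = [AIMessage(content="Nice to meet you, Ana!")]

# add_messages appends new messages (and updates existing ones by id)
# instead of replacing the list, so every node that returns
# {"messages": [...]} extends the conversation history.
merged = add_messages(history, new)
print([m.content for m in merged])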
🔄 How Memory Persistence Works
When a user sends a message, here's what happens:
- State Loading: LangGraph loads the previous conversation state using the thread_id (our session_id)
- Message Processing: The new message is added to the state
- LLM Invocation: The model processes the conversation with full context
- State Saving: The updated state is saved back to the checkpointer
Let's look at how we invoke the graph with memory:
@app.post("/chat", response_model=ChatResponse)
def chat(req: ChatRequest):
    mode = req.memory
    graph = graphs[mode]  # Select the temporary or persistent graph

    # Use session_id as thread_id for state management
    config = {"configurable": {"thread_id": req.session_id}}

    # Invoke with empty messages - the checkpointer loads the history automatically!
    result = graph.invoke(
        {"user_input": req.message, "messages": []},
        config=config
    )

    # Extract the AI's response
    messages = result.get("messages", [])
    reply = messages[-1].content if messages else ""

    return ChatResponse(
        session_id=req.session_id,
        reply=reply,
        mode=mode
    )
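The ChatRequest and ChatResponse models aren't shown above. Judging from the fields the endpoint uses (session_id, message, memory, reply, mode), they look roughly like this; the exact definitions in the repository may differ:

from typing import Literal
from pydantic import BaseModel

class ChatRequest(BaseModel):
    session_id: str   # also used as the LangGraph thread_id
    message: str      # the user's new message
    memory: Literal["temporary", "persistent"] = "temporary"

class ChatResponse(BaseModel):
    session_id: str
    reply: str
    mode: str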
The magic here is that even though we pass an empty messages list, the checkpointer automatically loads the previous conversation history for that thread_id!
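You can see this by invoking the graph twice with the same thread_id. A toy example, assuming the graph built earlier and an OPENAI_API_KEY in your environment:

graph = build_graph()  # the temporary MemorySaver is enough for this demo
config = {"configurable": {"thread_id": "demo-session"}}

graph.invoke({"user_input": "Hi, my name is Ana.", "messages": []}, config=config)
result = graph.invoke({"user_input": "What's my name?", "messages": []}, config=config)

# The second call starts from the saved checkpoint, so the model can answer "Ana".
print(result["messages"][-1].content)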
🛠️ Implementing the Respond Node
The respond_node is where the conversation logic happens. It needs to:
- Load and trim conversation history
- Decide whether to use tools
- Invoke the LLM
- Return the response
def respond_node(state: ChatState):
    # 1. Build the system prompt + trimmed conversation history
    system = SystemMessage(content=settings.system_prompt)
    history = state.get("messages", [])
    trimmed = trimmer.invoke([system, *history])

    # 2. Decide tool usage (only if the user requested it via slash commands)
    wants_tools = user_requested_tools(state["user_input"])
    just_ran_tool = last_message_is_tool(history)

    # 3. Bind tools only if needed and we haven't just run one
    if wants_tools and not just_ran_tool:
        llm = make_llm().bind_tools(TOOLS, tool_choice="auto")
    else:
        llm = make_llm()  # No tools, just conversation

    # 4. Append the user message (avoiding duplicates)
    msgs_in = trimmed[:]
    if should_append_user(history, state["user_input"]):
        user_msg = HumanMessage(state["user_input"])
        msgs_in.append(user_msg)
    else:
        user_msg = None

    # 5. Invoke the model
    ai_msg: AIMessage = llm.invoke(msgs_in)

    # 6. Return the messages to be persisted
    out = []
    if user_msg is not None:
        out.append(user_msg)
    out.append(ai_msg)
    return {"messages": out}
Token Management: Preventing Context Overflow
One critical aspect of memory management is preventing the conversation from exceeding the model's context window. We use LangChain's trim_messages utility:
from langchain_core.messages import trim_messages
trimmer = trim_messages(
    strategy="last",           # Keep the most recent messages
    max_tokens=1200,           # Stay within the token budget
    token_counter=make_llm()   # Accurate token counting via the model
)
This ensures we always keep the most recent, relevant messages while staying within token limits.
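As a quick illustration (not from the repository), the trimmer is a runnable, so you can invoke it on any message list and it returns only the newest messages that fit under the budget:

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

long_history = [SystemMessage(content="You are a helpful and concise assistant.")]
for i in range(200):
    long_history.append(HumanMessage(content=f"Question {i}"))
    long_history.append(AIMessage(content=f"Answer {i}"))

kept = trimmer.invoke(long_history)
print(len(long_history), "->", len(kept))  # only the most recent messages survive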
🔀 Conditional Routing: When to Use Tools
LangGraph's conditional edges let us create dynamic workflows. We route to the tools node only when the LLM requests tool usage:
def needs_tools(state: ChatState) -> str:
    """
    Check whether the last AI message contains tool calls.
    If yes, route to the tools node; otherwise, end the conversation.
    """
    for msg in reversed(state.get("messages", [])):
        if isinstance(msg, AIMessage):
            if getattr(msg, "tool_calls", None):
                return "tools"
            break
    return "end"
The graph uses this function to decide the next step:
graph.add_conditional_edges(
    "respond",
    needs_tools,
    {"tools": "tools", "end": END}
)
graph.add_edge("tools", "respond")  # After tools, return to respond
🚀 Running the Application
Start the backend:
uv run uvicorn app.main:app --reload --port 8000
In another terminal, start the Streamlit UI:
uv run streamlit run app/ui.py
Now you can test both memory modes:
- Temporary Mode: Messages are lost on page refresh
- Persistent Mode: Conversations survive restarts and are stored in SQLite
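You can also exercise both modes directly against the API. A small client sketch, assuming the request fields shown earlier, that memory accepts "temporary" or "persistent", and that the requests package is available:

import requests

API = "http://localhost:8000/chat"

def send(session_id: str, message: str, memory: str) -> str:
    payload = {"session_id": session_id, "message": message, "memory": memory}
    resp = requests.post(API, json=payload)
    resp.raise_for_status()
    return resp.json()["reply"]

# Persistent mode: restart the backend between these two calls and the
# second answer should still remember the first message.
print(send("demo-1", "My favorite language is Python.", "persistent"))
print(send("demo-1", "What is my favorite language?", "persistent"))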
💡 Key Takeaways for Students
1. Checkpointers are the Key to Memory
LangGraph's checkpointers abstract away the complexity of state management. You just need to:
- Choose the right checkpointer (MemorySaver vs SqliteSaver)
- Use thread_id to identify conversations
- Let LangGraph handle loading/saving automatically
2. State Design Matters
Your ChatState TypedDict defines what gets remembered. Use add_messages reducer for automatic message merging.
3. Token Management is Critical
Always trim conversation history to prevent context overflow. Use trim_messages with accurate token counting.
4. Conditional Routing Enables Complex Flows
Use conditional edges to create dynamic workflows that adapt based on conversation state.
🎓 Learning Path Recommendations
If you're new to LangGraph and conversational AI, here's a suggested learning path:
- Start Simple: Build a basic chatbot without memory
- Add Temporary Memory: Implement MemorySaver for in-memory conversations
- Upgrade to Persistent: Add SqliteSaver for production-ready persistence
- Add Tools: Integrate external APIs and function calling
- Optimize: Implement token trimming and advanced routing
🔗 Resources
- GitHub Repository: JaimeLucena/langgraph-memory-chatbot
🎯 Conclusion
Building a chatbot with proper memory management is a fundamental skill in conversational AI. LangGraph makes this accessible through its checkpointing system, allowing you to focus on the conversation logic rather than state management boilerplate.
The dual memory approach (temporary vs persistent) gives you flexibility for different use cases—from quick testing to production applications that need to remember users across sessions.
Next Steps:
- Experiment with different checkpointer strategies
- Add more tools to your chatbot
- Implement user authentication and multi-user support
- Explore advanced LangGraph features like human-in-the-loop workflows
Happy building! 🚀
If you found this tutorial helpful, consider starring the repository and sharing it with other students learning AI!