Jaime Lucena Pérez

Building a Memory-Powered Chatbot with LangGraph: A Student's Guide to Conversational AI

Have you ever wondered how chatbots like ChatGPT remember your previous messages? Or how they maintain context across multiple conversations? If you're a student diving into generative AI, understanding memory management is one of the most crucial skills you'll need.

In this tutorial, we'll build a complete chatbot application using LangGraph that demonstrates two different memory strategies: temporary (in-memory) and persistent (database-backed). By the end, you'll understand how to implement conversation memory in your own AI applications.

👉 GitHub Repository: JaimeLucena/langgraph-memory-chatbot

🎯 What You'll Learn

  • How to use LangGraph for building conversational AI workflows
  • Implementing dual memory modes (temporary vs persistent)
  • Managing conversation state with checkpoints
  • Integrating tools (Wikipedia, Weather) into your chatbot
  • Building a full-stack AI application with FastAPI and Streamlit

🧠 Why Memory Matters in Conversational AI

When you chat with an AI, each message needs context from previous messages. Without memory, every interaction would be isolated—the AI wouldn't remember your name, preferences, or what you discussed earlier.

Memory in AI chatbots serves two main purposes:

  1. Context Preservation: Maintains conversation history so the AI can reference earlier messages
  2. State Management: Tracks the conversation flow and user preferences across sessions

LangGraph provides powerful abstractions for managing this memory through checkpointers—components that save and restore conversation state.

🏗️ Project Architecture

Our chatbot uses a clean, modular architecture:

User Input → FastAPI Backend → LangGraph Workflow → Memory Store → Response

The LangGraph workflow handles:

  • Processing user messages
  • Deciding when to use tools
  • Managing conversation state
  • Persisting memory via checkpoints

📦 Setting Up the Project

First, let's set up our environment. We'll use uv for fast dependency management:

# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/JaimeLucena/langgraph-memory-chatbot.git
cd langgraph-memory-chatbot

# Install dependencies
uv sync

Create a .env file:

OPENAI_API_KEY=sk-your-api-key-here
OPENAI_MODEL=gpt-4o-mini
SQLITE_PATH=.data/memory.sqlite
SYSTEM_PROMPT=You are a helpful and concise assistant.

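If you're wondering how these variables reach the application code (the respond_node later in this post reads settings.system_prompt), one common pattern is a pydantic-settings class that loads the .env file at startup. The sketch below is just that idea, not necessarily the repository's actual settings module; the class and field names are assumptions:

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str
    openai_model: str = "gpt-4o-mini"
    sqlite_path: str = ".data/memory.sqlite"
    system_prompt: str = "You are a helpful and concise assistant."

settings = Settings()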
🧩 Understanding LangGraph Memory with Checkpointers

The heart of our memory system is LangGraph's checkpointer concept. A checkpointer is responsible for saving and loading conversation state.

Two Memory Modes

1. Temporary Memory (MemorySaver)

  • Stored in RAM
  • Fast but lost on server restart
  • Perfect for testing and ephemeral conversations

2. Persistent Memory (SqliteSaver)

  • Stored in SQLite database
  • Survives server restarts
  • Essential for production applications

Let's see how we implement this in code:

from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.sqlite import SqliteSaver

def build_graph(checkpointer=None):
    """
    Builds the LangGraph workflow.
    If no checkpointer is provided, uses in-memory MemorySaver (temporary).
    For persistent mode, pass a SqliteSaver instance.
    """
    graph = StateGraph(ChatState)

    # Add nodes
    graph.add_node("respond", respond_node)
    graph.add_node("tools", TOOLS_NODE)

    # Set entry point
    graph.set_entry_point("respond")

    # Conditional routing
    graph.add_conditional_edges("respond", needs_tools, {
        "tools": "tools", 
        "end": END
    })

    # After tools, return to respond node
    graph.add_edge("tools", "respond")

    # Use provided checkpointer or default to MemorySaver
    return graph.compile(checkpointer=checkpointer or MemorySaver())

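The snippet doesn't show where the SqliteSaver instance comes from, or how the two graphs used by the API are created. One way to wire it up is to open a SQLite connection at startup and build one graph per mode. Treat this as an illustrative sketch: the "temporary"/"persistent" keys and the connection details are assumptions based on how the /chat endpoint selects a graph below:

import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# One shared connection; check_same_thread=False lets FastAPI worker threads reuse it
conn = sqlite3.connect(settings.sqlite_path, check_same_thread=False)

graphs = {
    "temporary": build_graph(),                    # falls back to MemorySaver
    "persistent": build_graph(SqliteSaver(conn)),  # state survives restarts
}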
The ChatState: Our Conversation Container

The state is defined as a TypedDict that LangGraph uses to track conversation data:

from typing import Annotated, TypedDict, List
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class ChatState(TypedDict):
    messages: Annotated[List[BaseMessage], add_messages]
    user_input: str

The add_messages reducer automatically merges new messages into the existing list, which is perfect for conversation history.
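To make the reducer concrete, here's a tiny standalone example (not from the repository) showing how add_messages merges an update into the existing message list instead of overwriting it:

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

existing = [HumanMessage(content="Hi, I'm Jaime")]
update = [AIMessage(content="Nice to meet you, Jaime!")]

merged = add_messages(existing, update)
print(len(merged))  # 2 -- the update is appended, not replaced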

🔄 How Memory Persistence Works

When a user sends a message, here's what happens:

  1. State Loading: LangGraph loads previous conversation state using the thread_id (our session_id)
  2. Message Processing: The new message is added to the state
  3. LLM Invocation: The model processes the conversation with full context
  4. State Saving: The updated state is saved back to the checkpointer

Let's look at how we invoke the graph with memory:

@app.post("/chat", response_model=ChatResponse)
def chat(req: ChatRequest):
    mode = req.memory
    graph = graphs[mode]  # Select temporary or persistent graph

    # Use session_id as thread_id for state management
    config = {"configurable": {"thread_id": req.session_id}}

    # Invoke with empty messages - checkpointer loads history automatically!
    result = graph.invoke(
        {"user_input": req.message, "messages": []}, 
        config=config
    )

    # Extract the AI's response
    messages = result.get("messages", [])
    reply = messages[-1].content if messages else ""

    return ChatResponse(
        session_id=req.session_id,
        reply=reply,
        mode=mode
    )

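The ChatRequest and ChatResponse models aren't shown above, but the handler tells us roughly what they contain. Here's a minimal Pydantic sketch; the field names come from the code, while the exact types and defaults are assumptions:

from pydantic import BaseModel

class ChatRequest(BaseModel):
    session_id: str            # becomes the LangGraph thread_id
    message: str               # the user's new input
    memory: str = "temporary"  # "temporary" or "persistent"

class ChatResponse(BaseModel):
    session_id: str
    reply: str
    mode: str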
The magic here is that even though we pass an empty messages list, the checkpointer automatically loads the previous conversation history for that thread_id!
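A quick way to see this in action is to call the endpoint twice with the same session_id. The sketch below assumes the backend is running locally on port 8000, that the requests package is installed, and that the memory mode strings match the keys of the graphs dict:

import requests

BASE = "http://localhost:8000"

def ask(message: str, session_id: str = "demo", memory: str = "persistent") -> str:
    resp = requests.post(f"{BASE}/chat", json={
        "session_id": session_id,
        "message": message,
        "memory": memory,
    })
    resp.raise_for_status()
    return resp.json()["reply"]

print(ask("Hi, my name is Jaime."))
print(ask("What's my name?"))  # with persistent memory, the bot should remember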

🛠️ Implementing the Respond Node

The respond_node is where the conversation logic happens. It needs to:

  1. Load and trim conversation history
  2. Decide whether to use tools
  3. Invoke the LLM
  4. Return the response

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

def respond_node(state: ChatState):
    # 1. Build system prompt + trimmed conversation history
    system = SystemMessage(content=settings.system_prompt)
    history = state.get("messages", [])
    trimmed = trimmer.invoke([system, *history])

    # 2. Decide tool usage (only if user requested via slash commands)
    wants_tools = user_requested_tools(state["user_input"])
    just_ran_tool = last_message_is_tool(history)

    # 3. Bind tools only if needed and we haven't just run one
    if wants_tools and not just_ran_tool:
        llm = make_llm().bind_tools(TOOLS, tool_choice="auto")
    else:
        llm = make_llm()  # No tools, just conversation

    # 4. Append user message (avoid duplicates)
    msgs_in = trimmed[:]
    if should_append_user(history, state["user_input"]):
        user_msg = HumanMessage(state["user_input"])
        msgs_in.append(user_msg)
    else:
        user_msg = None

    # 5. Invoke the model
    ai_msg: AIMessage = llm.invoke(msgs_in)

    # 6. Return messages to be persisted
    out = []
    if user_msg is not None:
        out.append(user_msg)
    out.append(ai_msg)
    return {"messages": out}

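The respond_node leans on a few small helpers (user_requested_tools, last_message_is_tool, should_append_user, make_llm) that aren't shown above. Here are minimal sketches of what they could look like; the slash-command triggers and model wiring are assumptions, so check the repository for the real implementations:

from langchain_core.messages import HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI

def make_llm():
    # Assumes the model name comes from settings (see the .env file above)
    return ChatOpenAI(model=settings.openai_model, temperature=0)

def user_requested_tools(user_input: str) -> bool:
    # Hypothetical slash commands; the repo may use different triggers
    return user_input.strip().lower().startswith(("/wiki", "/weather"))

def last_message_is_tool(history) -> bool:
    # True when the previous turn ended with a tool result
    return bool(history) and isinstance(history[-1], ToolMessage)

def should_append_user(history, user_input: str) -> bool:
    # Skip appending if the same text is already the latest human turn
    for msg in reversed(history):
        if isinstance(msg, HumanMessage):
            return msg.content != user_input
    return True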
Token Management: Preventing Context Overflow

One critical aspect of memory management is preventing the conversation from exceeding the model's context window. We use LangChain's trim_messages utility:

from langchain_core.messages import trim_messages

trimmer = trim_messages(
    strategy="last",  # Keep most recent messages
    max_tokens=1200,  # Stay within token limit
    token_counter=make_llm()  # Accurate token counting
)

This ensures we always keep the most recent, relevant messages while staying within token limits.

🔀 Conditional Routing: When to Use Tools

LangGraph's conditional edges let us create dynamic workflows. We route to the tools node only when the LLM requests tool usage:

def needs_tools(state: ChatState) -> str:
    """
    Check if the last AI message contains tool calls.
    If yes, route to tools node; otherwise, end the conversation.
    """
    for msg in reversed(state.get("messages", [])):
        if isinstance(msg, AIMessage):
            if getattr(msg, "tool_calls", None):
                return "tools"
            break
    return "end"

The graph uses this function to decide the next step:

graph.add_conditional_edges(
    "respond", 
    needs_tools, 
    {"tools": "tools", "end": END}
)
graph.add_edge("tools", "respond")  # After tools, return to respond

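The TOOLS list and TOOLS_NODE referenced in build_graph aren't shown in this post. A common pattern is the @tool decorator plus LangGraph's prebuilt ToolNode. The Wikipedia/weather bodies below are only placeholder stubs to show the shape, not the repository's actual implementation:

from langchain_core.tools import tool
from langgraph.prebuilt import ToolNode

@tool
def wikipedia_summary(query: str) -> str:
    """Return a short Wikipedia summary for the query."""
    # Placeholder: the real project would call the Wikipedia API here
    return f"Summary for '{query}' (stub)"

@tool
def current_weather(city: str) -> str:
    """Return the current weather for a city."""
    # Placeholder: the real project would call a weather API here
    return f"Weather in {city}: sunny, 22°C (stub)"

TOOLS = [wikipedia_summary, current_weather]
TOOLS_NODE = ToolNode(TOOLS)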
🚀 Running the Application

Start the backend:

uv run uvicorn app.main:app --reload --port 8000

In another terminal, start the Streamlit UI:

uv run streamlit run app/ui.py

Now you can test both memory modes:

  1. Temporary Mode: conversation state lives in RAM and disappears when the backend restarts
  2. Persistent Mode: conversations are stored in SQLite and survive server restarts

💡 Key Takeaways for Students

1. Checkpointers are the Key to Memory

LangGraph's checkpointers abstract away the complexity of state management. You just need to:

  • Choose the right checkpointer (MemorySaver vs SqliteSaver)
  • Use thread_id to identify conversations
  • Let LangGraph handle loading/saving automatically

2. State Design Matters

Your ChatState TypedDict defines what gets remembered. Use add_messages reducer for automatic message merging.

3. Token Management is Critical

Always trim conversation history to prevent context overflow. Use trim_messages with accurate token counting.

4. Conditional Routing Enables Complex Flows

Use conditional edges to create dynamic workflows that adapt based on conversation state.

🎓 Learning Path Recommendations

If you're new to LangGraph and conversational AI, here's a suggested learning path:

  1. Start Simple: Build a basic chatbot without memory
  2. Add Temporary Memory: Implement MemorySaver for in-memory conversations
  3. Upgrade to Persistent: Add SqliteSaver for production-ready persistence
  4. Add Tools: Integrate external APIs and function calling
  5. Optimize: Implement token trimming and advanced routing

🔗 Resources

  • GitHub Repository: JaimeLucena/langgraph-memory-chatbot
  • LangGraph documentation: https://langchain-ai.github.io/langgraph/

🎯 Conclusion

Building a chatbot with proper memory management is a fundamental skill in conversational AI. LangGraph makes this accessible through its checkpointing system, allowing you to focus on the conversation logic rather than state management boilerplate.

The dual memory approach (temporary vs persistent) gives you flexibility for different use cases—from quick testing to production applications that need to remember users across sessions.

Next Steps:

  • Experiment with different checkpointer strategies
  • Add more tools to your chatbot
  • Implement user authentication and multi-user support
  • Explore advanced LangGraph features like human-in-the-loop workflows

Happy building! 🚀


If you found this tutorial helpful, consider starring the repository and sharing it with other students learning AI!
