Kumar Kislay

Posted on • Originally published at forg.to
Building a Conversational Agent with Context Awareness

This article was originally published at https://forg.to/articles/building-a-conversational-agent-with-context-awareness

Most Chatbots Have Goldfish Memory. Here's How to Fix That in 30 Lines of Python.

You ask a chatbot something. It answers. You follow up. It has no idea what you just talked about.

That's not a conversation. That's a search bar with extra steps.

The fix is not complicated. Here's how to build a conversational agent that actually remembers what you said, using LangChain and GPT-4o-mini.


What we're building

A chat agent that maintains conversation history across messages. Ask it something, follow up, and it knows exactly what you said before. The whole thing is about 30 lines of code.

Four moving parts:

  • A language model to generate responses
  • A prompt template that structures each conversation
  • A history manager that handles what gets remembered
  • A message store that keeps each session separate

Setup

Install the dependencies:

pip install langchain langchain-community langchain-openai openai python-dotenv

Then the imports:

from langchain_openai import ChatOpenAI
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY from a local .env file

Initialize the model

llm = ChatOpenAI(model="gpt-4o-mini", max_tokens=1000, temperature=0)

temperature=0 keeps responses consistent and predictable. Good default for most use cases.


Build the message store

This is where conversation history lives. Each user gets their own session, identified by a session_id.

store = {}

def get_chat_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

Simple dictionary. Session doesn't exist yet? Create it. Already exists? Return it. That's the whole thing.

This means you can run ten different conversations simultaneously and they won't bleed into each other.
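You can see that isolation at work without calling a model at all. Here's a minimal sketch where a hypothetical StubHistory class (just a list of role/text pairs) stands in for ChatMessageHistory:

```python
# Minimal stand-in for ChatMessageHistory: a list of (role, text) pairs.
class StubHistory:
    def __init__(self):
        self.messages = []

    def add(self, role, text):
        self.messages.append((role, text))

store = {}

def get_chat_history(session_id: str):
    if session_id not in store:
        store[session_id] = StubHistory()
    return store[session_id]

# Two sessions write interleaved; neither sees the other's messages.
get_chat_history("alice").add("human", "Hi, I'm Alice")
get_chat_history("bob").add("human", "Hi, I'm Bob")
get_chat_history("alice").add("human", "What's my name?")

print(len(get_chat_history("alice").messages))  # 2
print(len(get_chat_history("bob").messages))    # 1
```

Same lookup logic as the real code above; only the history object differs.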


Create the prompt template

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

Three parts: a system message that sets the AI's behavior, a placeholder where the conversation history gets injected, and the user's current message.

The MessagesPlaceholder is the important bit. Every time you send a new message, LangChain fills that slot with the full conversation history automatically. You don't manage this manually.


Wire it together

chain = prompt | llm

That pipe operator chains the prompt into the model. Prompt renders first, output goes into the model.

Now wrap it with history management:

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_chat_history,
    input_messages_key="input",
    history_messages_key="history"
)

RunnableWithMessageHistory is doing the heavy lifting here. It intercepts every call, fetches the right session history, injects it into the prompt, runs the chain, and saves the new messages back to the store. You don't write any of that logic yourself.


Use it

session_id = "user_123"

response1 = chain_with_history.invoke(
    {"input": "Hello! How are you?"},
    config={"configurable": {"session_id": session_id}}
)
print("AI:", response1.content)

response2 = chain_with_history.invoke(
    {"input": "What was my previous message?"},
    config={"configurable": {"session_id": session_id}}
)
print("AI:", response2.content)

Output:

AI: Hello! I'm just a computer program, so I don't have feelings, 
but I'm here and ready to help you. How can I assist you today?

AI: Your previous message was, "Hello! How are you?" 
How can I assist you further?

It remembered. That's the whole point.

You can inspect the full history at any point:

print("\nConversation History:")
for message in store[session_id].messages:
    print(f"{message.type}: {message.content}")
Conversation History:
human: Hello! How are you?
ai: Hello! I'm just a computer program, so I don't have feelings...
human: What was my previous message?
ai: Your previous message was, "Hello! How are you?"...

What this does not handle

This stores history in memory. The moment your process restarts, it's gone. For anything production-facing you'd swap ChatMessageHistory for a persistent store like Redis or a database. LangChain has built-in integrations for both.
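To make the idea concrete, here's a rough sketch of what a persistent drop-in could look like, using stdlib sqlite3. The SqliteHistory class is my own illustration, not a LangChain API; in a real app you'd use LangChain's built-in Redis or SQL chat history integrations instead:

```python
import sqlite3

# Hypothetical minimal persistent history, for illustration only.
# Each row stores one message, keyed by session_id, so history
# survives process restarts (when backed by a file on disk).
class SqliteHistory:
    def __init__(self, session_id, path="chat.db"):
        self.session_id = session_id
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(session_id TEXT, role TEXT, content TEXT)"
        )

    def add_message(self, role, content):
        self.conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?)",
            (self.session_id, role, content),
        )
        self.conn.commit()

    @property
    def messages(self):
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ?",
            (self.session_id,),
        )
        return rows.fetchall()

# ":memory:" here just keeps the demo self-contained; use a file path
# (or a real database) to actually persist across restarts.
history = SqliteHistory("user_123", path=":memory:")
history.add_message("human", "Hello! How are you?")
history.add_message("ai", "Hello! How can I help?")
print(history.messages)
```

The shape mirrors the in-memory version: same session_id keying, same list-of-messages view, just backed by a table instead of a dict.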

Long conversations also get expensive fast. The entire history gets sent with every message. At some point you'll want to summarize older messages instead of passing them verbatim. But that's a problem for when your conversations actually get long.
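A crude stopgap before full summarization is a sliding window: only send the most recent N messages. A sketch (the trim_history helper is hypothetical, not part of LangChain):

```python
def trim_history(messages, max_messages=6):
    """Keep only the most recent messages.

    A real setup might summarize the dropped prefix instead of
    discarding it outright, so old context isn't lost entirely.
    """
    return messages[-max_messages:]

history = [("human", f"message {i}") for i in range(10)]
print(trim_history(history, max_messages=4))  # the four most recent
```

You'd apply this to the history right before it's injected into the prompt; the rest of the pipeline is unchanged.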


Where to take it next

The foundation works. From here you can point it at your own data with a vector store, give it tools like web search or calculator access, or swap the system prompt to give it a specific persona or domain expertise.

The context management pattern stays the same regardless. That's why it's worth getting right first.
