Even in this AI era, user experience (UX) remains vital for applications. UX is key to ensuring that your users interact with your product comfortably: whatever you are building must not irritate the people using it.
The benefits of a good UX are numerous enough to deserve an article of their own. One of them is a lower churn rate. A reduced churn rate means fewer uninstalls after installs, fewer users exiting your site without fully using it, and fewer force-closures of your apps. Such behavior from your users is bad for business, which is why this article focuses on one specific way to improve a chatbot's user experience.
With today's Large Language Model (LLM) wrappers, creating chatbots can be as easy as importing a few packages/libraries and writing a few lines of code. One way to differentiate your chatbot from the rest is to personalize it by making it memorize conversations reliably. This kind of chatbot personalization improves UX by enabling tailored, flowing, and more human-like conversations.
Chatbot personalization through memorization will be explored more later on.
Since this is a hands-on article, it's important to check what you need in the prerequisites section below.
Prerequisites
You will need:
- Python installed in your working environment.
- Basic Python knowledge.
- A mem0 account. The hobby plan is sufficient to follow along.
- A Gemini API key. We will be using Gemini as our conversational AI provider.
- A basic understanding of artificial intelligence, specifically conversational AI.
What is personalization?
In software development terms, personalization is the practice of tuning an application's behavior to fit a specific experience instead of a one-size-fits-all default. For chatbots, personalization makes them adjust their conversational style based on how a user converses and what they converse about. Consider the following user prompt: "Do you have any military paths you'd advise me to pursue?"
Without personalization, the chatbot could respond with something like, "You can follow the Navy, the Air Force, or the Army."
With personalization, the bot's response could be, "Since you’ve previously shown interest in aircraft, I would advise you to join the Air Force."
One way to achieve that is by making the chatbot retain the conversation history and extract the appropriate contexts to be used for the next conversations.
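As a toy illustration in plain Python (no AI involved — the `user_memory` dict and `personalize` function are hypothetical names, not part of any library), recalled facts about a user can be appended to a prompt before it ever reaches the model:

```python
# Toy illustration of prompt personalization: facts previously recorded
# about a user are appended to the prompt before it reaches the model.
user_memory = {"alice": ["has previously shown interest in aircraft"]}

def personalize(user_id: str, prompt: str) -> str:
    facts = user_memory.get(user_id, [])
    if facts:
        return f"{prompt}\n\nKnown about this user: {'; '.join(facts)}"
    return prompt  # no stored facts: the prompt passes through unchanged

print(personalize("alice", "Do you have any military paths you'd advise me to pursue?"))
```

With the extra line of context attached, an LLM can answer "join the Air Force" instead of listing every branch.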
What is AI memory?
AI memory is simply a software layer that captures context information from a running AI system and stores it so it can be used for future operations. This retention enables an AI system to give better output during the next interactions.
In chatbots, AI memory stores conversation history and feeds the history to the chatbot in subsequent replies and new sessions. At this point, you may be wondering, “Doesn’t this add larger and heavier context to the chatbot?” or “How is this different from manually feeding context each time?” To answer that, a well-designed AI memory layer should only pick out the relevant context from the stored memory to feed it to the chatbot instead of blindly reusing all the stored data.
With that said, the AI memory layer must also be intelligent enough to filter, rank, and recall useful context. That may look complicated, more so if you are thinking of building it from scratch. This is why you need a robust and optimized solution like mem0.
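To make "filtering, ranking, and recalling" concrete, here is a deliberately naive sketch — simple word-overlap scoring, nothing like mem0's actual retrieval — of what a memory layer does under the hood:

```python
# Naive recall: rank stored memories by word overlap with the query and
# return only the top matches, instead of feeding everything back to the model.
def recall(query: str, memories: list[str], top_k: int = 1) -> list[str]:
    query_words = set(query.lower().split())
    scored = sorted(
        memories,
        key=lambda m: len(query_words & set(m.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

memories = [
    "User enjoys hiking on weekends",
    "User is allergic to peanuts",
    "User works as a nurse",
]
print(recall("any snack ideas? remember my peanuts allergy", memories))
# → ['User is allergic to peanuts']
```

A production memory layer replaces word overlap with embeddings, reranking, and entity scoping, but the shape of the operation — score, rank, return only the relevant slice — is the same.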
NOTE: The rest of the article will be using the terms AI memory, memory, memory layer, and AI memory layer interchangeably.
What is mem0?
mem0 is an intelligent AI memory layer designed to improve over time as it is fed more memories. All the heavy lifting is done for you, so you don't have to worry about creating and maintaining your own memory layer. At a high level, mem0 works by storing meaningful memory efficiently, retrieving only relevant context from the stored memory, and updating the memory over time.
It comes in three flavours:
1. mem0 Platform
This provides a hosted, fully managed, ready-to-use memory solution for those who want a plug-and-play experience free of the infrastructure management hassle.
2. mem0 Open Source
This is a self-hosted memory layer that gives you full control over implementation, customization, and general management.
3. OpenMemory
OpenMemory is a framework designed to work across multiple AI systems and tools and is fit for those looking to port AI memory between different platforms and build interoperable AI experiences.
Why mem0?
Some of mem0's features that make it worth considering include:
- Memory filters, which allow you to retrieve memories based on the criteria you set.
- Entity-Scoped memory that maps memory to the fields (entities) you set. Think of it as a key–value mapping, where entities act as keys and memories as values.
- Async memory operations for supporting non‑blocking requests so your chatbot remains responsive.
- Media support for extracting key information from images and documents in your chatbot's conversations.
- Custom categories for tagging memories using labels specific to your teams or organisations.
- Graph memory that creates relationships between entities, building interconnected memories for more accurate contexts.
- Retrieval quality boosters such as rerankers, custom instructions, and keyword searches to improve the relevance and ordering of memory results.
- Memory exports and imports to allow reusing of memories.
- A Model Context Protocol (MCP) server so any AI tool can consume its memory features, allowing for universal integration and interoperability.
- Prebuilt integrations and SDKs for ease of setup.
Having touched on why memory matters and mem0's role in it, the next step is to see it in action in a real chatbot setup.
Building the chatbot
In this section, you will build a simple demo therapy chatbot named wellness0. It will be terminal-based, so it will follow a Read–Eval–Print Loop (REPL) pattern: once running, it waits for user input, captures it, processes it, and displays the chatbot's output in a repeating cycle.
This section is divided into two parts:
- Building the chatbot without memorization
- Adding memorization to the chatbot
As the naming suggests, in the first section, we will build wellness0 without the memory feature and check its functioning. The second section covers the AI memory layer addition using mem0.
Building the chatbot without memorization
The tree structure of the project is shown below:
.
├── chatbot.py
├── .env
└── main.py
Creating the files
Start by creating the directory and the files by running the command below in your terminal:
mkdir mem0ry_chatbot && cd mem0ry_chatbot && touch chatbot.py main.py .env
main.py will be used to create the REPL functionality, while chatbot.py will house the Gemini calling logic. To store your API keys, you will use the .env file.
Creating a virtual environment
Proceed to create a virtual environment for easy dependency management.
python -m venv venv
source venv/bin/activate
Installing the dependencies
Install the dependencies for Gemini AI (google-genai) and loading environment variables (python-dotenv) using:
pip install google-genai python-dotenv
Setting the Gemini API key
Set your Gemini API key by opening the .env file and pasting the following variable into it:
GEMINI_API_KEY=<your-key>
Copy your API key and replace <your-key> like so:
GEMINI_API_KEY=AIxxxxxxxxxxxxxxxxxxxxxx
Calling the Gemini API
Open the chatbot.py and paste the following contents into it:
from google import genai
from dotenv import load_dotenv
import os
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
client = genai.Client(api_key=GEMINI_API_KEY)
def capture_and_process_input(user_input: str):
    context = "Imagine you are a virtual therapist to this user."
    user_input = f"{user_input}\n\n Context: {context}"
    try:
        response = client.models.generate_content(
            model="gemini-3-flash-preview",
            contents=user_input
        )
        return response.text
    except Exception as e:
        return f"Error: {e}"
- The script begins by importing the necessary packages, loading the environment variables, and initializing the Gemini client.
- It then defines a function, `capture_and_process_input`, that accepts one parameter: the user input captured from the terminal.
- Inside the function, a context string is appended to the user input, instructing Gemini to imagine it is the user's virtual therapist.
- The Gemini client's `generate_content` function is called, passing the model name and the user input to the `model` and `contents` parameters, respectively. You can find the available models in the Gemini API documentation.
- The response is then returned as a string; the `text` property returns the concatenation of all text parts in the response.
- The `try-except` block handles any exceptions raised by the API call.
Capturing the user input (REPL)
The main.py will be used for this. Paste the contents below into the file.
from chatbot import capture_and_process_input
def repl():
    print("--------------------------------\nYour virtual therapist is ready!\n\nType how you're feeling.\nNo rush. No judgement.\n\nType 'exit' or press CTRL + C to close.\n--------------------------------")
    try:
        while True:
            user_input = input("$ wellness0> ")
            if user_input.lower() == "exit":
                print("\nBYE!\n----------------------------------------------")
                break
            try:
                response = capture_and_process_input(user_input)
                print(f"{response}\n")
            except Exception as e:
                print("Error:", e)
    except KeyboardInterrupt:
        print("\nBYE!\n----------------------------------------------")

if __name__ == '__main__':
    repl()
- It starts by importing the `capture_and_process_input` function from the chatbot script discussed above.
- A function called `repl` is defined. The `print` statements with hard-coded string literals display a greeting message to the user and prettify the output.
- The `while` loop runs repeatedly until the user either types 'exit' or presses the CTRL + C key combination.
- Inside the `try-except` block, a request is sent to Gemini by calling the `capture_and_process_input` function with the user's typed input.
Running the chatbot
You can try out your chatbot by running the command below in your terminal:
python main.py
You should see something like:
--------------------------------
Your virtual therapist is ready!
Type how you're feeling.
No rush. No judgement.
Type 'exit' or press CTRL + C to close.
--------------------------------
$ wellness0>
Type in a message describing a past experience, for example, "I am a war veteran, and I was deployed in Afghanistan. Oh my, I still have PTSD from that experience. I keep having flashbacks and nightmares."
It will print out some information about counselling.
Proceed to ask it a question like, "Do you know of any physical therapy centers in the US?"
Finally, test the contextual memory by feeding this: "Thanks. Do you remember my war story? The country?" It will print out something like:
I don’t have a memory of our past conversations once a session is closed, so I don't currently know the name of the country or the details of your story.
However, I would love to hear about it again! If you give me a quick reminder—the name of the country, the setting, or even just a few characters—we can pick up right where you left off.
**What was the country called?**
Such a response can be very frustrating for your users. The chatbot displays that because it doesn't have the context of the conversation. Let's get rid of this by adding a memory layer to it.
Adding AI memory to the chatbot
To set things off, you need a primer on how mem0 adds memories; we will draw from the API reference docs. To add a memory, you use the add-memories endpoint: POST /v1/memories/.
The endpoint's request body needs something like:
{
  "user_id": "alice",
  "messages": [
    {"role": "user", "content": "<user-message>"},
    {"role": "assistant", "content": "<assistant-response>"},
    {"role": "user", "content": "<user-message>"},
    {"role": "assistant", "content": "<assistant-response>"}
  ],
  "metadata": {
    "source": "onboarding_form"
  }
}
This means that for each session, we need to dynamically populate the messages JSON array with the user's messages and the chatbot's responses. The question is, "How do we store them for as long as our REPL interactive session is active?" Storing them in a database or a file would add read/write overhead, which is overkill since the data is only needed temporarily.
A simpler solution is a Singleton class containing an in-memory list where we store the conversations. The class will also hold the mem0 API key. Proceed to the next section, where we create this class and set it up for use in our chatbot.
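The idea can be sketched in plain Python before we build the real thing (the class and method names below are illustrative, simplified versions of what comes next):

```python
# A class-level list acts as a lightweight, process-wide message buffer:
# every part of the program sees the same conversation history without
# touching a database or file.
class MessageStore:
    messages: list[dict[str, str]] = []

    @classmethod
    def add(cls, role: str, content: str) -> None:
        cls.messages.append({"role": role, "content": content})

MessageStore.add("user", "I haven't been sleeping well.")
MessageStore.add("assistant", "I'm sorry to hear that. How long has this been going on?")
print(len(MessageStore.messages))  # 2
```

Because the list lives on the class (not on instances), any module that imports the class reads and writes the same buffer, which is exactly what a single REPL session needs.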
Creating the Singleton class
Begin the addition by installing mem0's package, which houses the Python SDK you will be using.
pip install mem0ai
In the same manner that you added the environment variable for Gemini, create a second variable in the .env file:
MEM0_API_KEY=
Copy your mem0 API key from your developer dashboard and paste it in as the variable's value:
MEM0_API_KEY=m0-xxxxxxxxxxx
In the root of the mem0ry_chatbot directory, create a file called
ai_memory_engine.py.
touch ai_memory_engine.py
Paste the contents below into it:
from mem0 import MemoryClient
from dotenv import load_dotenv
import os
class ChatbotMem0ry:
    load_dotenv()
    MEM0_API_KEY = os.getenv("MEM0_API_KEY")
    mem0_client = MemoryClient(api_key=MEM0_API_KEY)
    messages_store: list[dict[str, str]] = []
    user_id = "chattymouse9880989"

    @classmethod
    def add(cls, role: str, content: str):
        cls.messages_store.append({
            "role": role,
            "content": content
        })
        cls.add_to_mem0()

    @classmethod
    def messages(cls):
        return cls.messages_store

    @classmethod
    def add_to_mem0(cls):
        try:
            cls.mem0_client.add(
                cls.messages_store,
                user_id=cls.user_id,
                version="v2",
                output_format="v1.1"
            )
        except Exception as e:
            print(f"Error: {e}")

    @classmethod
    def search_from_mem0(cls, user_input: str):
        try:
            filters = {"AND": [{"user_id": cls.user_id}]}
            memories_response = cls.mem0_client.search(filters=filters, query=user_input)
            if memories_response:
                # Extract all memory texts and join them into one string separated by full stops
                memories = ". ".join([m["memory"] for m in memories_response["results"]])
                return memories
        except Exception as e:
            print(f"Error: {e}")

    @classmethod
    def get_all_user_memories(cls):
        try:
            filters = {"AND": [{"user_id": cls.user_id}]}
            memories_response = cls.mem0_client.get_all(filters=filters)
            return memories_response.get("results", [])
        except Exception as e:
            print(f"Error: {e}")
- After the necessary imports and declarations, the class defines two key variables:
  - `messages_store`, a list for storing the conversations.
  - `user_id`, for mapping the user to their memory. This is very important.
- Messages are added to the list using the `add` method, which also calls `add_to_mem0`. The `add_to_mem0` method uses the mem0 client SDK's `add` function, passing in the list and the user ID as parameters.
- The `messages` method returns all messages stored in the list.
- The `search_from_mem0` method searches mem0 for context relevant to the user's new message, using a filter to limit the search to memories of the specified user ID. This method is called whenever a user sends a new message to the chatbot. Because results come back as a JSON array of dicts, it joins the memory strings with full stops so they read fluently. The mem0 client's `search` function is used for this, with the filter and the user input passed to the `filters` and `query` named arguments, respectively.
- `get_all_user_memories` works almost the same way as `search_from_mem0`, except no user input is passed in and all memory stored for the user is retrieved. You will use this later to check whether any memory tied to a user exists.
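To see what the joining step in `search_from_mem0` produces, here is the same transformation applied to a hand-written sample shaped like mem0's search results (not a live API response — the real payload may carry additional fields):

```python
# Sample shaped like mem0 search results: a dict with a "results" list of
# memory dicts. Joining the "memory" strings flattens them into one context
# string that can be appended to the user's prompt.
memories_response = {
    "results": [
        {"memory": "User is a war veteran who was deployed in Afghanistan"},
        {"memory": "User experiences PTSD flashbacks and nightmares"},
    ]
}
context = ". ".join(m["memory"] for m in memories_response["results"])
print(context)
```

The resulting single string is what gets injected as context, rather than the raw JSON structure.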
For this singleton class to work, you will need to modify the chatbot.py and main.py files.
Modifying the chatbot.py file
Two things will be done:
- Changing the user input capture flow
- Adding the chatbot's response to the `ChatbotMem0ry.messages_store` list.
To change the input capture flow, the code below will be used:
def capture_and_process_input(user_input: str) -> str:
    if ChatbotMem0ry.get_all_user_memories():
        searched_memories = ChatbotMem0ry.search_from_mem0(user_input)
        user_input = f"{user_input}\nContext: {searched_memories}"
    elif len(ChatbotMem0ry.messages()) < 2:
        context = "Imagine you are a virtual therapist to this user."
        user_input = f"{user_input}\n\n Context: {context}"
    ...
The code handles both new and returning sessions:
- If mem0 already holds memories for this user, relevant context is searched for and appended to the input.
- If not, and the `ChatbotMem0ry.messages_store` list has fewer than two entries (only the user's first message, no chatbot response yet), this is a brand-new user, so the therapist persona context is appended instead.
- For an ongoing, active session, the same checks run again on every message.
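The branching can be exercised in isolation by stubbing out the memory lookups (the `build_prompt` helper below is hypothetical, not one of the project files — it only mirrors the decision logic):

```python
# Stub of the context-selection logic: which context (if any) is appended
# depends on whether stored memories exist and how long the session is.
def build_prompt(user_input: str, stored_memories: list[str],
                 session_messages: list[dict]) -> str:
    if stored_memories:  # returning user: inject recalled context
        return f"{user_input}\nContext: {'. '.join(stored_memories)}"
    if len(session_messages) < 2:  # brand-new user: seed the therapist persona
        return f"{user_input}\n\n Context: Imagine you are a virtual therapist to this user."
    return user_input  # active session with nothing stored yet

# A new user's first message receives the persona context:
print(build_prompt("Hello", [], [{"role": "user", "content": "Hello"}]))
```

Separating the decision from the API calls like this also makes the logic trivially unit-testable, which is hard to do against live Gemini and mem0 endpoints.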
To add the chatbot's response to the ChatbotMem0ry.messages_store, we use the line below:
...
ChatbotMem0ry.add(role="assistant", content=txt_response)
...
Below is the fully modified chatbot.py file:
from google import genai
from dotenv import load_dotenv
import os
from ai_memory_engine import ChatbotMem0ry
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
gemini_client = genai.Client(api_key=GEMINI_API_KEY)
def capture_and_process_input(user_input: str) -> str:
    if ChatbotMem0ry.get_all_user_memories():
        searched_memories = ChatbotMem0ry.search_from_mem0(user_input)
        user_input = f"{user_input}\nContext: {searched_memories}"
    elif len(ChatbotMem0ry.messages()) < 2:
        context = "Imagine you are a virtual therapist to this user."
        user_input = f"{user_input}\n\n Context: {context}"
    try:
        response = gemini_client.models.generate_content(
            model="gemini-3-pro-preview",
            contents=user_input
        )
        txt_response = response.text
        ChatbotMem0ry.add(role="assistant", content=txt_response)
        return txt_response
    except Exception as e:
        return f"Error: {e}"
Modifying the main.py file
The only thing we have to do here is add the code for storing the user's captured input to the ChatbotMem0ry.messages_store list.
ChatbotMem0ry.add(role="user", content=user_input)
...
This is the fully modified code:
from chatbot import capture_and_process_input
from ai_memory_engine import ChatbotMem0ry
def repl():
    print("--------------------------------\nYour virtual therapist is ready!\n\nType how you're feeling.\nNo rush. No judgement.\n\nType 'exit' or press CTRL + C to close.\n--------------------------------")
    try:
        while True:
            user_input = input("$ wellness0> ")
            if user_input.lower() == "exit":
                print("\nBYE!\n----------------------------------------------")
                break
            try:
                ChatbotMem0ry.add(role="user", content=user_input)
                response = capture_and_process_input(user_input)
                print(f"{response}\n")
            except Exception as e:
                print("Error:", e)
    except KeyboardInterrupt:
        print("\nBYE!\n----------------------------------------------")

if __name__ == '__main__':
    repl()
Running the memorized chatbot
Just like the other bot, run this one using:
python main.py
Repeat the same prompts that we used earlier. You will get a different answer for the last prompt. On my end, I got this:
Yes, I do. Based on what you’ve shared, you were deployed in
**Afghanistan**.
I also remember that you are currently navigating the challenges of PTSD, including flashbacks and nightmares, and that you are looking for physical therapy centers within the United States.
To help you find the right physical therapy center, could you tell me which **state or city** you are located in? Also, are you looking for a center that specifically works with veterans or accepts VA benefits (like TRICARE or Community Care)?
***
*Note: If you are currently experiencing a difficult flashback or feeling overwhelmed, please remember you can contact the **Veterans Crisis Line** anytime by dialing **988 and pressing 1**, or texting **838255**.*
You can clearly see the difference a memory layer makes to your chatbot. In the next section, we compare the two approaches to building memory layers in a table.
Building your own memory vs using mem0
| Aspect | Building a custom memory layer | Using mem0 |
|---|---|---|
| Initial setup | It requires designing and developing storage, embeddings, indexing, and retrieval logic from scratch | It's a plug-and-play setup with SDKs and prebuilt integrations |
| Time to production | Long due to months of developing, testing, and iteration | Short. Depending on your needs, you technically need minutes to a few hours to get a working memory layer |
| Scalability | Needs careful planning for the growth of memory operations | It scales automatically |
| Maintenance overhead | High due to the self-management | Minimal, as mem0 handles the infrastructure for you |
| Reliability | This highly depends on the in-house experts that you have | It is well-tested and production-ready |
Next steps
We have barely scratched the surface of what mem0 can do for your chatbots. Now that you have an introduction, get your hands dirty and implement more mem0 features, such as:
- Graph memories
- Updating and deleting memories
- Memory imports and exports
- Advanced filtering operations
- Adding custom instructions to your filters
- Extracting information from media chats
- User feedback collection
- Webhooks
Happy building!