Terrence Aluda

Chatbot personalization: How to build a chatbot that uses memory to create a better user experience

Even in this AI era, user experience (UX) remains vital. Good UX ensures that users can interact with your product comfortably; whatever you build must not irritate them.

The benefits of a good UX are numerous enough to deserve an article of their own. One of them is a lower churn rate: fewer uninstalls after installs, fewer users leaving your site without fully using it, and fewer force-closures of your app. Those behaviors are bad for business, which is why this article focuses on one specific way to improve a chatbot's user experience.

With today's Large Language Model (LLM) wrappers, creating chatbots can be as easy as importing a few packages/libraries and writing a few lines of code. One way to differentiate your chatbot from the rest is to personalize it by making it memorize conversations reliably. This type of chatbot personalization improves UX by enabling tailored, flowing, and more human-like conversations.
Chatbot personalization through memorization will be explored more later on.

Since this is a hands-on article, it's important to check what you need in the prerequisites section below.

Prerequisites

You will need:

  • Python installed in your working environment.
  • Basic Python knowledge.
  • A mem0 account. The hobby plan is sufficient to follow through.
  • A Gemini API key. We will be using Gemini as our conversational AI provider.
  • Basics in Artificial Intelligence, specifically conversational AI.

What is personalization?

In software development terms, personalization is the practice of tuning an application's behavior to fit a specific experience instead of offering one-size-fits-all behavior. For chatbots, personalization means adjusting the conversation style based on how a user converses and what they converse about. Consider the following user prompt: "Do you have any army paths you'd advise me to pursue?"

  • Without personalization, the chatbot could respond with something like, "You can follow the Navy, the Air Force, or the Army."

  • With personalization, the bot's response could be, "Since you’ve previously shown interest in aircraft, I would advise you to join the Air Force."

One way to achieve that is by making the chatbot retain the conversation history and extract the appropriate contexts to be used for the next conversations.
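As a rough, library-free sketch of that idea (the `build_prompt` helper and the `stored` list are hypothetical names for illustration, not part of any SDK), attaching recalled history to a prompt can look like this:

```python
# Hypothetical sketch: attach recalled user history to a prompt.
def build_prompt(user_msg: str, recalled: list[str]) -> str:
    if not recalled:
        # No personalization: the model only sees the raw question.
        return user_msg
    context = " ".join(recalled)
    return f"{user_msg}\n\nContext about this user: {context}"

stored = ["The user has previously shown interest in aircraft."]
prompt = build_prompt("Do you have any army paths you'd advise me to pursue?", stored)
```

With the recalled context attached, the model has what it needs to produce the Air Force-style answer from the example above.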

What is AI memory?

AI memory is simply a software layer that captures context information from a running AI system and stores it so it can be used for future operations. This retention enables an AI system to give better output during the next interactions.

In chatbots, AI memory stores conversation history and feeds the history to the chatbot in subsequent replies and new sessions. At this point, you may be wondering, “Doesn’t this add larger and heavier context to the chatbot?” or “How is this different from manually feeding context each time?” To answer that, a well-designed AI memory layer should only pick out the relevant context from the stored memory to feed it to the chatbot instead of blindly reusing all the stored data.

With that said, the AI memory layer also needs to be intelligent enough to filter, rank, and recall useful context. That may look complicated, more so if you are thinking of building it from scratch. This is why you need a robust, optimized solution like mem0.
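To make the filter/rank/recall idea concrete, here is a deliberately naive, self-contained sketch. Real memory layers such as mem0 use embeddings and semantic search rather than the word-overlap scoring used here; this toy version only illustrates the three responsibilities.

```python
class TinyMemoryLayer:
    """Toy memory layer: stores facts and recalls only the best-matching ones."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def store(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, top_k: int = 2) -> list[str]:
        q = set(query.lower().split())
        # Filter: drop facts with no word overlap. Rank: sort by overlap size.
        scored = [(len(q & set(f.lower().split())), f) for f in self.facts]
        ranked = sorted((p for p in scored if p[0] > 0), reverse=True)
        return [f for _, f in ranked[:top_k]]

memory = TinyMemoryLayer()
memory.store("user enjoys hiking in the mountains")
memory.store("user works in finance")
memory.recall("recommend a mountains trip")  # only the hiking fact matches
```

Note how `recall` returns a small, relevant slice rather than the whole store: that is the property that keeps context from ballooning as memories accumulate.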

NOTE: The rest of the article will be using the terms AI memory, memory, memory layer, and AI memory layer interchangeably.

What is mem0?

mem0 is an intelligent AI memory layer designed to improve over time as it is fed more memories. All the heavy lifting is done for you, so you don't have to worry about creating and maintaining your own memory layer. At a high level, mem0 works by storing meaningful memory efficiently, retrieving only relevant context from the stored memory, and updating the memory over time.

It comes in three flavours:

1. mem0 Platform

This provides a hosted, fully managed, ready-to-use memory solution for those who want a plug-and-play experience free of the infrastructure management hassle.

2. mem0 Open Source

This is a self-hosted memory layer that gives you full control over implementation, customization, and general management.

3. OpenMemory

OpenMemory is a framework designed to work across multiple AI systems and tools and is fit for those looking to port AI memory between different platforms and build interoperable AI experiences.

Why mem0?

Some of mem0's features that make it worth adopting include:

  • Memory filters, which allow you to retrieve memories based on the criteria you set.
  • Entity-Scoped memory that maps memory to the fields (entities) you set. Think of it as a key–value mapping, where entities act as keys and memories as values.
  • Async memory operations for supporting non‑blocking requests so your chatbot remains responsive.
  • Media support for extracting key information from images and documents in your chatbot's conversations.
  • Custom categories for tagging memories using labels specific to your teams or organisations.
  • Graph memory that creates relationships between entities, building interconnected memories for more accurate contexts.
  • Retrieval quality boosters such as rerankers, custom instructions, and keyword searches to improve the relevance and ordering of memory results.
  • Memory exports and imports to allow reusing of memories.
  • A Model Context Protocol (MCP) server that lets any AI tool consume its memory features, allowing for universal integration and interoperability.
  • Prebuilt integrations and SDKs for ease of setup.
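The entity-scoped memory feature from the list above can be pictured as a plain key-value mapping. The sketch below is a simplified, library-free illustration (the function names are made up, not mem0 APIs):

```python
from collections import defaultdict

# Entities (e.g. user IDs) act as keys; their memories are the values.
entity_memories: dict[str, list[str]] = defaultdict(list)

def add_memory(entity_id: str, memory: str) -> None:
    entity_memories[entity_id].append(memory)

def memories_for(entity_id: str) -> list[str]:
    # Scoping guarantee: one entity never sees another entity's memories.
    return entity_memories[entity_id]

add_memory("alice", "Prefers morning appointments")
add_memory("bob", "Allergic to peanuts")
```

This per-entity isolation is what you will rely on later when tying the chatbot's memories to a single user ID.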

Having touched on why memory matters and mem0's role in it, the next step is to see it in action in a real chatbot setup.

Building the chatbot

In this section, you will build a simple demo therapy chatbot named wellness0. It will be terminal-based, so it will follow a Read-Evaluate-Print-Loop (REPL) pattern. Once running, it will wait for user input, capture it, process it, and display the chatbot's output in a repeating cycle.
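Stripped to its essentials, one cycle of that loop can be sketched as follows (`process_fn` is a placeholder for the real chatbot call, not an API from any library):

```python
from typing import Callable, Optional

def repl_once(user_input: str, process_fn: Callable[[str], str]) -> Optional[str]:
    # One Read-Evaluate-Print cycle; None signals that the loop should stop.
    if user_input.lower() == "exit":
        return None
    return process_fn(user_input)

# Example with a stand-in processor:
reply = repl_once("I feel great today", lambda text: f"echo: {text}")
```

The full implementation later in this article wraps exactly this cycle in a `while True` loop and swaps the stand-in processor for the Gemini call.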

This section is divided into two parts:

  • Building the chatbot without memorization
  • Adding memorization to the chatbot

As the naming suggests, in the first section, we will build wellness0 without the memory feature and check its functioning. The second section covers the AI memory layer addition using mem0.

Building the chatbot without memorization

The tree structure of the project is shown below:

.
├── chatbot.py
├── .env
└── main.py

Creating the files

Start by creating the directory and the files by running the command below in your terminal:

mkdir mem0ry_chatbot && cd mem0ry_chatbot && touch chatbot.py main.py .env

main.py will be used to create the REPL functionality, while chatbot.py will house the Gemini calling logic. To store your API keys, you will use the .env file.

Creating a virtual environment

Proceed to create a virtual environment for easy dependency management.

python -m venv venv
source venv/bin/activate

Installing the dependencies

Install the dependencies for Gemini AI (google-genai) and loading environment variables (python-dotenv) using:

pip install google-genai python-dotenv

Setting the Gemini API key

Set your Gemini API key by opening the .env file and pasting the following variable into it:

GEMINI_API_KEY=<your-key>

Copy your API key and replace <your-key> like so:

GEMINI_API_KEY=AIxxxxxxxxxxxxxxxxxxxxxx

Calling the Gemini API

Open the chatbot.py and paste the following contents into it:

from google import genai
from dotenv import load_dotenv
import os

load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
client = genai.Client(api_key=GEMINI_API_KEY)

def capture_and_process_input(user_input: str):
    context = "Imagine you are a virtual therapist to this user."
    user_input = f"{user_input}\n\n Context: {context}"
    try:
        response = client.models.generate_content(
            model="gemini-3-flash-preview",
            contents=user_input
        )
        return response.text
    except Exception as e:
        return f"Error: {e}"
  • The script begins by importing the necessary packages, loading the environment variables, and initializing the Gemini client.
  • It then defines a function, capture_and_process_input, that accepts one parameter: the user input captured from the terminal.
  • Inside the function, a context is appended to the user input, instructing Gemini to imagine it's the user's virtual therapist. The Gemini client's generate_content function is then called, passing the model to be used and the user input to the model and contents parameters, respectively. You can find the available models in the Gemini API documentation.
  • The response is then returned as a string. The text property returns the concatenation of all text parts in the response.
  • The try-except block handles any exceptions raised during the API call.

Capturing the user input (REPL)

The main.py will be used for this. Paste the contents below into the file.

from chatbot import capture_and_process_input

def repl():
    print("--------------------------------\nYour virtual therapist is ready!\n\nType how you're feeling.\nNo rush. No judgement.\n\nType 'exit' or press CTRL + C to close.\n--------------------------------")

    try:
        while True:
            user_input = input("$ wellness0> ")
            if user_input.lower() == "exit":
                print("\nBYE!\n----------------------------------------------")
                break

            try:
                response = capture_and_process_input(user_input)
                print(f"{response}\n")
            except Exception as e:
                print("Error:", e)
    except KeyboardInterrupt:
        print("\nBYE!\n----------------------------------------------")

if __name__ == '__main__':
    repl()
  • It starts by importing the capture_and_process_input function from the chatbot script discussed above.
  • A function called repl is defined. Print statements with hard-coded string literals display a greeting message and prettify the output.
  • The while loop runs continuously until the user either types 'exit' or presses the CTRL + C key combination.
  • Inside the try-except block, a request is sent to Gemini by calling the capture_and_process_input function with the user's typed input.

Running the chatbot

You can try out your chatbot by running the command below in your terminal:

python main.py

You should see something like:

--------------------------------
Your virtual therapist is ready!
Type how you're feeling.
No rush. No judgement.
Type 'exit' or press CTRL + C to close.
--------------------------------
$ wellness0>

Type in a message describing a past experience, for example, "I am a war veteran, and I was deployed in Afghanistan. Oh my, I still have PTSD from that experience. I keep having flashbacks and nightmares."

It will print out some information about counselling.
Proceed to ask it a question like, "Do you know of any physical therapy centers in the US?"

Finally, test the contextual memory by feeding this: "Thanks. Do you remember my war story? The country?" It will print out something like:

I don’t have a memory of our past conversations once a session is closed, so I don't currently know the name of the country or the details of your story.
However, I would love to hear about it again! If you give me a quick reminder—the name of the country, the setting, or even just a few characters—we can pick up right where you left off.

**What was the country called?**

Such a response can be very frustrating for your users. The chatbot displays that because it doesn't have the context of the conversation. Let's get rid of this by adding a memory layer to it.

Adding AI memory to the chatbot

To set things off, you will need a primer on how mem0 adds memories. We will refer to the API reference docs. To add a memory, you use the add memories endpoint: POST /v1/memories/.

The endpoint's request body needs something like:

{
  "user_id": "alice",
  "messages": [
    {"role": "user", "content": "<user-message>"},
    {"role": "assistant", "content": "<assistant-response>"},
    {"role": "user", "content": "<user-message>"},
    {"role": "assistant", "content": "<assistant-response>"}
  ],
  "metadata": {
    "source": "onboarding_form"
  }
}

This means that, for a session, we need to dynamically populate the messages JSON array with the user's messages and the chatbot's responses. The question is, “How do we store them for as long as our REPL interactive session is active?” Storing them in a database or a file would add read/write overhead, which is overkill since the data is only needed temporarily.

A simpler solution is a Singleton class containing an in-memory list where we store the conversations. The class will also hold the mem0 API key. Proceed to the next section, where we will create this class and set it up for use in our chatbot.

Creating the Singleton class

Begin the addition by installing the mem0 package, which houses the Python SDK you will be using.

pip install mem0ai

In the same manner that you added the environment variable for Gemini, create a second variable in the .env file:

MEM0_API_KEY=

Copy your mem0 API key from your developer dashboard and assign it to the variable by pasting it:

MEM0_API_KEY=m0-xxxxxxxxxxx

In the root of the mem0ry_chatbot directory, create a file called
ai_memory_engine.py.

touch ai_memory_engine.py

Paste the contents below into it:

from mem0 import MemoryClient
from dotenv import load_dotenv
import os


class ChatbotMem0ry:
    load_dotenv()

    MEM0_API_KEY = os.getenv("MEM0_API_KEY")
    mem0_client = MemoryClient(api_key=MEM0_API_KEY)

    messages_store: list[dict[str, str]] = []
    user_id = "chattymouse9880989"

    @classmethod
    def add(cls, role: str, content: str):
        cls.messages_store.append({
            "role": role,
            "content": content
        })
        cls.add_to_mem0()

    @classmethod
    def messages(cls):
        return cls.messages_store

    @classmethod
    def add_to_mem0(cls):
        try:
            cls.mem0_client.add(
                cls.messages_store,
                user_id=cls.user_id,
                version="v2",
                output_format="v1.1"
            )
        except Exception as e:
            print(f"Error: {e}")

    @classmethod
    def search_from_mem0(cls, user_input: str):
        try:
            filters = {"AND": [{"user_id": cls.user_id}]}
            memories_response = cls.mem0_client.search(filters=filters, query=user_input)
            if memories_response:
                # Extract all memory texts and join them into one string separated by full stops
                memories = ". ".join([m["memory"] for m in memories_response["results"]])
                return memories

        except Exception as e:
            print(f"Error: {e}")

    @classmethod
    def get_all_user_memories(cls):
        try:
            filters = {"AND": [{"user_id": cls.user_id}]}
            memories_response = cls.mem0_client.get_all(filters=filters)
            return memories_response.get("results", [])

        except Exception as e:
            print(f"Error: {e}")
  • After the necessary imports and declarations, the class defines two variables:
    • messages_store, a list for storing the conversations.
    • user_id, for mapping the user to the memory. This is very important.
  • Messages are added to the list using the add method, which also calls the add_to_mem0 method. add_to_mem0 uses the mem0 client SDK's add function, passing in the list and the user ID as parameters.
  • The messages method returns all messages stored in the list.
  • The search_from_mem0 method searches mem0 for context relevant to the user's new message. It uses a filter to limit the search to memories tied to the specified user ID.

    This method is called when a user sends a new message to the chatbot. search_from_mem0 first retrieves the relevant context from mem0 and then returns the memories as a single string joined by full stops for flow, since the messages come back as dicts in a JSON array.

    The mem0 client's search function is used for this, with the filter and the user input passed to the filters and query named arguments, respectively.

  • get_all_user_memories works almost the same way as search_from_mem0, except that no user input is passed in and all stored memories are retrieved. You will use it later to check whether any memory tied to a user exists.

For this singleton class to work, you will need to modify the chatbot.py and main.py files.

Modifying the chatbot.py file

Two things will be done:

  1. Changing the user input capture flow
  2. Adding the chatbot's response to the ChatbotMem0ry.messages_store list.

To change the input capture flow, the code below will be used:

def capture_and_process_input(user_input: str) -> str:
    if ChatbotMem0ry.get_all_user_memories():
        searched_memories = ChatbotMem0ry.search_from_mem0(user_input)
        user_input = f"{user_input}\nContext: {searched_memories}"
    elif len(ChatbotMem0ry.messages()) < 2:
        context = "Imagine you are a virtual therapist to this user."
        user_input = f"{user_input}\n\n Context: {context}"

    ...

The code handles both new and returning users.

  • It first checks whether any memories for this user are already stored in mem0. If there are, the relevant context is searched for and attached to the prompt.
  • If there are none, this is a new user, so the virtual-therapist context is attached instead. The length check confirms that the ChatbotMem0ry.messages_store list has no chatbot response entry yet (fewer than 2 entries, with only the user's first message).

To add the chatbot's response to the ChatbotMem0ry.messages_store, we use the line below:

...
ChatbotMem0ry.add(role="assistant", content=txt_response)
...

Below is the fully modified chatbot.py file:

from google import genai
from dotenv import load_dotenv
import os
from ai_memory_engine import ChatbotMem0ry

load_dotenv()

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
gemini_client = genai.Client(api_key=GEMINI_API_KEY)

def capture_and_process_input(user_input: str) -> str:
    if ChatbotMem0ry.get_all_user_memories():
        searched_memories = ChatbotMem0ry.search_from_mem0(user_input)
        user_input = f"{user_input}\nContext: {searched_memories}"
    elif len(ChatbotMem0ry.messages()) < 2:
        context = "Imagine you are a virtual therapist to this user."
        user_input = f"{user_input}\n\n Context: {context}"

    try:

        response = gemini_client.models.generate_content(
            model="gemini-3-pro-preview",
            contents=user_input
        )
        txt_response = response.text
        ChatbotMem0ry.add(role="assistant", content=txt_response)
        return txt_response

    except Exception as e:
        return f"Error: {e}"
Enter fullscreen mode Exit fullscreen mode

Modifying the main.py file

The only thing we have to do here is add the code for storing the user's captured input to the ChatbotMem0ry.messages_store list.

ChatbotMem0ry.add(role="user", content=user_input)

...

This is the fully modified code:

from chatbot import capture_and_process_input
from ai_memory_engine import ChatbotMem0ry

def repl():
    print("--------------------------------\nYour virtual therapist is ready!\n\nType how you're feeling.\nNo rush. No judgement.\n\nType 'exit' or press CTRL + C to close.\n--------------------------------")

    try:
        while True:
            user_input = input("$ wellness0> ")
            if user_input.lower() == "exit":
                print("\nBYE!\n----------------------------------------------")
                break

            try:
                ChatbotMem0ry.add(role="user", content=user_input)
                response = capture_and_process_input(user_input)
                print(f"{response}\n")
            except Exception as e:
                print("Error:", e)
    except KeyboardInterrupt:
        print("\nBYE!\n----------------------------------------------")

if __name__ == '__main__':
    repl()

Running the memorized chatbot

Just like the other bot, run this one using:

python main.py

Repeat the same prompts that we used earlier. You will get a different answer for the last prompt. On my end, I got this:

"Yes, I do. Based on what you’ve shared, you were deployed in
**Afghanistan**.
I also remember that you are currently navigating the challenges of PTSD, including flashbacks and nightmares, and that you are looking for physical therapy centers within the United States.
To help you find the right physical therapy center, could you tell me which **state or city** you are located in? Also, are you looking for a center that specifically works with veterans or accepts VA benefits (like TRICARE or Community Care)?
***

*Note: If you are currently experiencing a difficult flashback or feeling overwhelmed, please remember you can contact the **Veterans Crisis Line** anytime by dialing **988 and pressing 1**, or texting **838255**.*

You can clearly see the difference that a memory layer brings to your chatbot. In the next section, we will compare two approaches to building memory layers using a table.

Building your own memory vs using mem0

| Aspect | Building a custom memory layer | Using mem0 |
| --- | --- | --- |
| Initial setup | Requires designing and developing storage, embeddings, indexing, and retrieval logic from scratch | Plug-and-play setup with SDKs and prebuilt integrations |
| Time to production | Long: months of developing, testing, and iteration | Short: depending on your needs, minutes to a few hours for a working memory layer |
| Scalability | Needs careful planning for the growth of memory operations | Scales automatically |
| Maintenance overhead | High due to self-management | None, as mem0 handles everything for you |
| Reliability | Depends heavily on your in-house expertise | Well-tested and production-ready |

Next steps

We have barely scratched the surface of what mem0 can do for your chatbots. Since you have an introduction, get your hands dirty and implement more mem0 features like:

  • Graph memories
  • Updating and deleting memories
  • Memory imports and exports
  • Advanced filtering operations
  • Adding custom instructions to your filters
  • Extracting information from media chats
  • User feedback collection
  • Webhooks

Happy building!
