By default, an LLM forgets you the moment a conversation ends. Start a new chat and it has no idea who you are, what you told it last week, or what you prefer. For a real product, that is a dealbreaker. Users expect the app to remember.
The standard fix is a memory pipeline you build yourself. Extract the important facts from each conversation. Turn them into embeddings. Store the vectors in a database. On every new message, run a similarity search, pull the relevant facts, and inject them into the prompt. That is a meaningful chunk of engineering, and you maintain it forever.
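To make those steps concrete, here is a minimal sketch of the do-it-yourself pipeline. Everything in it is a stand-in chosen for illustration: `embed` is a toy bag-of-words counter rather than a real embedding model, `FactStore` is an in-memory list rather than a vector database, and `extract_facts` simply splits sentences where a real system would likely call an LLM.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class FactStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, fact: str) -> None:
        self.rows.append((fact, embed(fact)))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        scored = [(cosine(q, vec), fact) for fact, vec in self.rows]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        # Keep only the top-k facts that share at least one token with the query.
        return [fact for score, fact in ranked[:k] if score > 0]


def extract_facts(message: str) -> list[str]:
    """Naive extraction: treat each sentence as one fact."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", message) if s.strip()]


def build_prompt(store: FactStore, user_message: str) -> str:
    """Retrieve relevant facts and inject them ahead of the new message."""
    facts = store.search(user_message)
    context = "\n".join(f"- {f}" for f in facts)
    return f"Known facts about the user:\n{context}\n\nUser: {user_message}"


store = FactStore()
for fact in extract_facts("My name is Sarah. I work at Google as a software engineer."):
    store.add(fact)

prompt = build_prompt(store, "Where do I work?")
print(prompt)
```

Even this toy version has four moving parts (extraction, embedding, storage, retrieval), and a production build would replace each one with a real service you have to run.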
Backboard collapses that into one parameter: memory. Set it to "Auto" and your assistant remembers.
The one parameter
Memory is stored on the assistant, so pass the same assistant_id and memory="Auto". Facts the user shares in one conversation are recalled in the next.
Python
pip install backboard-sdk

import asyncio

from backboard import BackboardClient


async def main():
    client = BackboardClient(api_key="YOUR_API_KEY")

    # Conversation 1: tell it something
    await client.send_message(
        "My name is Sarah. I work at Google as a software engineer.",
        assistant_id="your-assistant-id",
        memory="Auto",
    )

    # Conversation 2: new thread, same assistant, it remembers
    reply = await client.send_message(
        "What do you remember about me?",
        assistant_id="your-assistant-id",
        memory="Auto",
    )
    print(reply.content)  # name, employer, and role


asyncio.run(main())
JavaScript (Node 18+)
// Run as an ES module (for example, a .mjs file) so top-level await works.
const send = (body) =>
  fetch("https://app.backboard.io/api/threads/messages", {
    method: "POST",
    headers: {
      "X-API-Key": "YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  }).then((r) => r.json());

// Conversation 1: tell it something
await send({
  content: "My name is Sarah. I work at Google as a software engineer.",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

// Conversation 2: same assistant, it remembers
const reply = await send({
  content: "What do you remember about me?",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

console.log(reply.content);
cURL
# Save: memory="Auto" extracts and stores facts
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "My name is Sarah. I work at Google as a software engineer.", "assistant_id": "your-assistant-id", "memory": "Auto"}'

# Recall: same assistant, new conversation
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "What do you remember about me?", "assistant_id": "your-assistant-id", "memory": "Auto"}'
No embedding step. No vector database. No retrieval code. One parameter, and the assistant extracts the facts, stores them, and recalls them when they are relevant.
What "Auto" actually does
Behind that single value, Backboard runs the full loop:
- Extraction pulls key facts from the conversation, like "works at Google" or "prefers dark mode."
- Storage saves them to a semantic knowledge base tied to the assistant.
- Retrieval finds the relevant facts on future messages and feeds them to the model.
Memory works across every thread under the same assistant, which is exactly the behavior you want: the user is remembered no matter which conversation they are in.
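The scoping is the part worth picturing. The sketch below is a toy model of it, not Backboard's implementation: `ToyMemory` and its `save` and `recall` methods are hypothetical names, and the only behavior it illustrates is that facts are keyed by assistant rather than by thread.

```python
from collections import defaultdict


class ToyMemory:
    """Hypothetical model: facts are keyed by assistant, never by thread."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = defaultdict(list)

    def save(self, assistant_id: str, thread_id: str, fact: str) -> None:
        # thread_id is accepted but deliberately not used as a storage key.
        self._facts[assistant_id].append(fact)

    def recall(self, assistant_id: str, thread_id: str) -> list[str]:
        # Any thread under the same assistant sees the same facts.
        return list(self._facts.get(assistant_id, []))


memory = ToyMemory()
memory.save("assistant-a", "thread-1", "works at Google")
memory.save("assistant-a", "thread-2", "prefers dark mode")

print(memory.recall("assistant-a", "thread-3"))  # both facts, from a brand-new thread
print(memory.recall("assistant-b", "thread-1"))  # a different assistant sees nothing
```

In this model a fact saved in one thread is visible from a thread that did not exist when it was saved, while a second assistant starts empty.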
The modes
memory is a per-turn parameter: pass it on each call where you want memory active. Choose one parameter and one value per message:
| Parameter | Value | Saves? | Retrieves? | Use it when |
|---|---|---|---|---|
| memory | "Auto" | Yes | Yes | The recommended default for most apps |
| memory | "Readonly" | No | Yes | Recall facts without writing new ones |
| memory | "off" | No | No | One-off requests that should not be remembered |
| memory_pro | "Auto" | Yes | Yes | You need higher-accuracy recall and accept higher cost |
| memory_pro | "Readonly" | No | Yes | High-accuracy recall only |
memory and memory_pro cannot be used together in the same message. Use memory for everyday recall and memory_pro when accuracy matters more than cost.
# Higher-accuracy retrieval
response = await client.send_message(
    "What were my project deadlines?",
    assistant_id="your-assistant-id",
    memory_pro="Auto",
)
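Because the two parameters are mutually exclusive, it can help to check the combination before a request goes out. The helper below is a hypothetical sketch, not part of the SDK: `build_message_payload` is an invented name, the allowed values are copied from the modes listed above, and sending `memory_pro` as a JSON field of the same name is an assumption.

```python
from typing import Optional

# Values copied from the modes listed in the article; the helper is hypothetical.
MEMORY_VALUES = {"Auto", "Readonly", "off"}
MEMORY_PRO_VALUES = {"Auto", "Readonly"}


def build_message_payload(
    content: str,
    assistant_id: str,
    memory: Optional[str] = None,
    memory_pro: Optional[str] = None,
) -> dict:
    """Build a request body, rejecting mode combinations the article rules out."""
    if memory is not None and memory_pro is not None:
        raise ValueError("memory and memory_pro cannot be used in the same message")
    if memory is not None and memory not in MEMORY_VALUES:
        raise ValueError(f"unsupported memory value: {memory!r}")
    if memory_pro is not None and memory_pro not in MEMORY_PRO_VALUES:
        raise ValueError(f"unsupported memory_pro value: {memory_pro!r}")

    payload = {"content": content, "assistant_id": assistant_id}
    # Only include the mode that was actually chosen.
    if memory is not None:
        payload["memory"] = memory
    if memory_pro is not None:
        payload["memory_pro"] = memory_pro
    return payload


payload = build_message_payload(
    "What were my project deadlines?",
    assistant_id="your-assistant-id",
    memory_pro="Auto",
)
print(payload)
```

Failing locally with a clear message is cheaper than finding out from an API error, and it keeps the per-turn choice explicit at every call site.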
When you want manual control
"Auto" covers most apps. When you need to manage memory directly, the assistant exposes full CRUD: list, add, search, update, and delete. You own the data and can export it whenever you want.
# Add a fact yourself
await client.add_memory(
    assistant_id,
    content="User prefers dark mode in all applications",
)

# Semantic search over what the assistant knows
results = await client.search_memories(
    assistant_id,
    query="user interface preferences",
    limit=5,
)
for m in results["memories"]:
    print(m["content"])
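Those two calls compose into small policies of your own. As one possible example, the sketch below skips a write when a search already returns the same text. It relies only on the `add_memory` and `search_memories` calls and the `results["memories"]` shape shown above; `remember_once` is a hypothetical helper, and `FakeClient` is an in-memory stand-in (with naive word-overlap search) so the example runs without the SDK or an API key.

```python
import asyncio


async def remember_once(client, assistant_id: str, content: str, limit: int = 5) -> bool:
    """Add `content` unless a search already returns the same text.

    Returns True if a new memory was written, False if it was skipped.
    """
    results = await client.search_memories(assistant_id, query=content, limit=limit)
    wanted = content.strip().lower()
    for m in results["memories"]:
        if m["content"].strip().lower() == wanted:
            return False
    await client.add_memory(assistant_id, content=content)
    return True


class FakeClient:
    """Hypothetical in-memory stand-in so the sketch runs without the SDK."""

    def __init__(self) -> None:
        self.memories: list[str] = []

    async def add_memory(self, assistant_id: str, content: str) -> None:
        self.memories.append(content)

    async def search_memories(self, assistant_id: str, query: str, limit: int = 5) -> dict:
        # Naive word-overlap match standing in for semantic search.
        words = set(query.lower().split())
        hits = [m for m in self.memories if words & set(m.lower().split())]
        return {"memories": [{"content": m} for m in hits[:limit]]}


async def demo() -> list[bool]:
    client = FakeClient()
    fact = "User prefers dark mode in all applications"
    first = await remember_once(client, "your-assistant-id", fact)
    second = await remember_once(client, "your-assistant-id", fact)
    return [first, second]


outcomes = asyncio.run(demo())
print(outcomes)  # [True, False]: the second write is skipped
```

The exact-text comparison is deliberately conservative: it only catches verbatim repeats within the top `limit` hits, so paraphrased duplicates would still be written.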
The point
Persistent memory is usually a project: an extraction pipeline, a vector store, retrieval code, and ongoing upkeep. Backboard makes it a parameter. Set memory="Auto", reuse the assistant, and your AI remembers your users across every conversation. When you need precision or control, switch to memory_pro or manage memories directly. No database required.
Grab a key and try it: app.backboard.io
Memory docs: docs.backboard.io/concepts/memory