Give your AI memory in one parameter


By default, an LLM forgets you the moment a conversation ends. Start a new chat and it has no idea who you are, what you told it last week, or what you prefer. For a real product, that is a dealbreaker. Users expect the app to remember.

The standard fix is a memory pipeline you build yourself. Extract the important facts from each conversation. Turn them into embeddings. Store the vectors in a database. On every new message, run a similarity search, pull the relevant facts, and inject them into the prompt. That is a meaningful chunk of engineering, and you maintain it forever.
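To make that list of steps concrete, here is a minimal sketch of the do-it-yourself loop in plain Python. Everything in it is a stand-in chosen for illustration: `embed` is a toy bag-of-words counter rather than a real embedding model, `MemoryStore` is a list in memory rather than a vector database, and `extract_facts` just splits sentences where a real pipeline would likely call an LLM. None of these names come from Backboard.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, fact: str) -> None:
        self.rows.append((fact, embed(fact)))

    def search(self, query: str, limit: int = 2) -> list[str]:
        q = embed(query)
        scored = [(cosine(q, vec), fact) for fact, vec in self.rows]
        # Keep only facts that share at least one token with the query.
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]


def extract_facts(message: str) -> list[str]:
    """Naive extraction: treat each sentence as one fact."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", message) if s.strip()]


def build_prompt(store: MemoryStore, user_message: str) -> str:
    """Retrieve relevant facts and inject them ahead of the user's message."""
    facts = "\n".join(f"- {fact}" for fact in store.search(user_message))
    return f"Known facts about the user:\n{facts}\n\nUser: {user_message}"


store = MemoryStore()
for fact in extract_facts("My name is Sarah. I work at Google as a software engineer."):
    store.add(fact)

prompt = build_prompt(store, "Where do I work?")
print(prompt)
```

Even this toy version has four moving parts, and a production version swaps each one for a service you have to run, monitor, and pay for.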

Backboard collapses that into one parameter: memory. Set it to "Auto" and your assistant remembers.

The one parameter

Memory is stored on the assistant, so pass the same assistant_id and memory="Auto". Facts the user shares in one conversation are recalled in the next.

Python

pip install backboard-sdk

import asyncio
from backboard import BackboardClient

async def main():
    client = BackboardClient(api_key="YOUR_API_KEY")

    # Conversation 1: tell it something
    await client.send_message(
        "My name is Sarah. I work at Google as a software engineer.",
        assistant_id="your-assistant-id",
        memory="Auto",
    )

    # Conversation 2: new thread, same assistant, it remembers
    reply = await client.send_message(
        "What do you remember about me?",
        assistant_id="your-assistant-id",
        memory="Auto",
    )
    print(reply.content)  # name, employer, and role

asyncio.run(main())

JavaScript (Node 18+)

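// Top-level await needs an ES module: save as .mjs or set "type": "module".
const send = async (body) => {
  const r = await fetch("https://app.backboard.io/api/threads/messages", {
    method: "POST",
    headers: {
      "X-API-Key": "YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  // Surface HTTP errors instead of parsing an error body as a reply
  if (!r.ok) throw new Error(`Request failed: ${r.status}`);
  return r.json();
};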

await send({
  content: "My name is Sarah. I work at Google as a software engineer.",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

const reply = await send({
  content: "What do you remember about me?",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

console.log(reply.content);

cURL

# Save: memory="Auto" extracts and stores facts
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "My name is Sarah. I work at Google as a software engineer.", "assistant_id": "your-assistant-id", "memory": "Auto"}'

# Recall: same assistant, new conversation
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "What do you remember about me?", "assistant_id": "your-assistant-id", "memory": "Auto"}'

No embedding step. No vector database. No retrieval code. One parameter, and the assistant extracts the facts, stores them, and recalls them when they are relevant.

What `"Auto"` actually does

Behind that single value, Backboard runs the full loop:

Extraction pulls key facts from the conversation, like "works at Google" or "prefers dark mode."
Storage saves them to a semantic knowledge base tied to the assistant.
Retrieval finds the relevant facts on future messages and feeds them to the model.

Memory is shared across every thread under the same assistant, so the user is remembered no matter which conversation they are in.
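One way to picture that scoping is a store keyed by assistant rather than by thread. The sketch below is a conceptual model only, not Backboard's implementation: `Thread` and `AssistantMemory` are hypothetical names made up for the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    """Hypothetical thread record: one conversation under one assistant."""
    thread_id: str
    assistant_id: str


class AssistantMemory:
    """Facts are keyed by assistant_id, so every thread under it shares them."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = defaultdict(list)

    def save(self, thread: Thread, fact: str) -> None:
        self._facts[thread.assistant_id].append(fact)

    def recall(self, thread: Thread) -> list[str]:
        return list(self._facts.get(thread.assistant_id, []))


memory = AssistantMemory()
monday = Thread("thread-1", "assistant-a")
friday = Thread("thread-2", "assistant-a")
other = Thread("thread-3", "assistant-b")

memory.save(monday, "works at Google")

print(memory.recall(friday))  # ['works at Google']: same assistant, new thread
print(memory.recall(other))   # []: a different assistant sees nothing
```

The practical consequence is that the assistant ID is the boundary of what gets remembered, so it is worth deciding up front what one assistant should represent in your app.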

The modes

memory is a per-turn parameter. Pass it on each call where you want memory active. Pick one value:

| Parameter | Value | Saves? | Retrieves? | Use it when |
| --- | --- | --- | --- | --- |
| `memory` | `"Auto"` | Yes | Yes | The recommended default for most apps |
| `memory` | `"Readonly"` | No | Yes | Recall facts without writing new ones |
| `memory` | `"off"` | No | No | One-off requests that should not be remembered |
| `memory_pro` | `"Auto"` | Yes | Yes | You need higher-accuracy recall and accept higher cost |
| `memory_pro` | `"Readonly"` | No | Yes | High-accuracy recall only |

memory and memory_pro cannot be used together in the same message. Use memory for everyday recall and memory_pro when accuracy matters more than cost.
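If you build request bodies yourself, a small guard can catch an invalid combination before it reaches the API. This is a client-side sketch under stated assumptions: `build_message` is a hypothetical helper, the accepted values are transcribed from the modes listed above, and it assumes `memory_pro` is sent as a JSON field with the same name as the Python keyword argument.

```python
from typing import Optional

# Accepted values per parameter, transcribed from the modes listed above.
MEMORY_MODES = {"Auto", "Readonly", "off"}
MEMORY_PRO_MODES = {"Auto", "Readonly"}


def build_message(
    content: str,
    assistant_id: str,
    memory: Optional[str] = None,
    memory_pro: Optional[str] = None,
) -> dict:
    """Build a request body, rejecting combinations the article rules out."""
    if memory is not None and memory_pro is not None:
        raise ValueError("memory and memory_pro cannot be used together")
    if memory is not None and memory not in MEMORY_MODES:
        raise ValueError(f"unknown memory mode: {memory!r}")
    if memory_pro is not None and memory_pro not in MEMORY_PRO_MODES:
        raise ValueError(f"unknown memory_pro mode: {memory_pro!r}")

    body = {"content": content, "assistant_id": assistant_id}
    if memory is not None:
        body["memory"] = memory
    if memory_pro is not None:
        # Assumption: the JSON key matches the Python keyword argument.
        body["memory_pro"] = memory_pro
    return body


# The mode is chosen per turn, so different calls can use different values.
remembered = build_message("I prefer dark mode.", "your-assistant-id", memory="Auto")
one_off = build_message("Translate 'hello' to French.", "your-assistant-id", memory="off")
print(remembered)
print(one_off)
```

Because the mode is chosen on every call, the same assistant can save facts on one turn and skip memory entirely on the next.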

# Higher-accuracy retrieval
response = await client.send_message(
    "What were my project deadlines?",
    assistant_id="your-assistant-id",
    memory_pro="Auto",
)

When you want manual control

"Auto" covers most apps. When you need to manage memory directly, the assistant exposes full CRUD: list, add, search, update, and delete. You own the data and can export it whenever you want.

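# Runs inside an async function, with `client` created as in the first example
assistant_id = "your-assistant-id"

# Add a fact yourself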
await client.add_memory(
    assistant_id,
    content="User prefers dark mode in all applications",
)

# Semantic search over what the assistant knows
results = await client.search_memories(
    assistant_id,
    query="user interface preferences",
    limit=5,
)
for m in results["memories"]:
    print(m["content"])

The point

Persistent memory is usually a project: an extraction pipeline, a vector store, retrieval code, and ongoing upkeep. Backboard makes it a parameter. Set memory="Auto", reuse the assistant, and your AI remembers your users across every conversation. When you need precision or control, switch to memory_pro or manage memories directly. No database required.

Grab a key and try it: app.backboard.io

Memory docs: docs.backboard.io/concepts/memory