LLMs are stateless. Every API call to a raw model is a blank slate. The model has no idea what was said two messages ago. So the moment you want a chatbot that remembers the conversation, you are on the hook for state.
The usual answer is infrastructure. Spin up Postgres to store message history. Add Redis to cache sessions. Stand up a vector database for long-term memory. Write the code that loads history, trims it to fit the context window, stitches it into every prompt, and saves the new turn. That is a lot of plumbing before the bot says hello.
Backboard handles state for you. Two ideas replace the whole stack: threads and assistants. You never run a database.
The model
Three things, nested:
- Message is one turn. A user message in, an assistant reply out.
-
Thread is one conversation. An ordered list of messages. Pass its
thread_idon the next call and the model sees the full history. - Assistant is the profile above the thread. It holds the name, default instructions, tools, and memory. One assistant can own many threads, for example one thread per end-user.
Memory lives on the assistant, so it is shared across every thread under it. History lives on the thread. Both persist on Backboard's side. Nothing to provision.
Threads: state within one conversation
Send a first message and a thread is created automatically. The response hands you a thread_id. Pass it back on the next call and the conversation continues with full context.
Python
pip install backboard-sdk
import asyncio
from backboard import BackboardClient
async def main():
client = BackboardClient(api_key="YOUR_API_KEY")
first = await client.send_message("My favorite color is blue.")
# Same thread: the model remembers the previous turn
second = await client.send_message(
"What did I just tell you?",
thread_id=first.thread_id,
)
print(second.content) # "You told me your favorite color is blue."
asyncio.run(main())
JavaScript (Node 18+)
const send = (body) =>
fetch("https://app.backboard.io/api/threads/messages", {
method: "POST",
headers: {
"X-API-Key": "YOUR_API_KEY",
"Content-Type": "application/json",
},
body: JSON.stringify(body),
}).then((r) => r.json());
const first = await send({ content: "My favorite color is blue." });
// Same thread: pass the thread_id back
const second = await send({
content: "What did I just tell you?",
thread_id: first.thread_id,
});
console.log(second.content);
cURL
# First message, thread auto-created
curl -X POST "https://app.backboard.io/api/threads/messages" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content": "My favorite color is blue."}'
# Continue: pass the thread_id from the first response
curl -X POST "https://app.backboard.io/api/threads/messages" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content": "What did I just tell you?", "thread_id": "THREAD_ID_FROM_FIRST_RESPONSE"}'
No history table. No prompt-stitching code. The thread_id is your conversation state, and Backboard stores it. When a thread gets long enough to crowd the context window, Backboard summarizes older messages automatically so you do not have to manage trimming.
Assistants: state across conversations
A thread remembers one chat. An assistant remembers the user across many chats. Memory is stored per assistant, so to carry facts into a brand new conversation you reuse the same assistant_id and start a fresh thread.
Python
# Conversation 1
await client.send_message(
"I'm allergic to peanuts.",
assistant_id="your-assistant-id",
memory="Auto",
)
# Conversation 2: new thread, same assistant, memory carries over
reply = await client.send_message(
"Any dietary restrictions you remember?",
assistant_id="your-assistant-id",
memory="Auto",
)
print(reply.content) # "You mentioned you're allergic to peanuts."
JavaScript (Node 18+)
await send({
content: "I'm allergic to peanuts.",
assistant_id: "your-assistant-id",
memory: "Auto",
});
const reply = await send({
content: "Any dietary restrictions you remember?",
assistant_id: "your-assistant-id",
memory: "Auto",
});
console.log(reply.content);
cURL
curl -X POST "https://app.backboard.io/api/threads/messages" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content": "I am allergic to peanuts.", "assistant_id": "your-assistant-id", "memory": "Auto"}'
curl -X POST "https://app.backboard.io/api/threads/messages" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"content": "Any dietary restrictions you remember?", "assistant_id": "your-assistant-id", "memory": "Auto"}'
This is the part that normally requires a vector database: embedding facts, storing vectors, running similarity search on every request. Here it is one parameter, memory="Auto", and the assistant owns it.
When to pass what
| Goal | Pass |
|---|---|
| Keep talking in the same chat | The same thread_id every call |
| New chat, but remember the user | Omit thread_id, reuse the same assistant_id with memory="Auto"
|
| One assistant, many users | One assistant_id, a separate thread_id per user |
That last row is the whole pattern for a multi-user app. One assistant defines your AI. Each user gets their own thread. State stays separated without a schema you designed, a migration you ran, or a database you babysit.
The point
Stateless models force you to build a state layer. Backboard makes that layer part of the API. Threads hold the conversation. Assistants hold the profile and the memory. Both persist server-side. You ship a stateful, multi-user AI app and never write a line of database code.
Grab a key and try it: app.backboard.io
Architecture in full: docs.backboard.io/concepts/architecture
Top comments (0)