DEV Community

Cover image for Stateful AI without a database: threads and assistants
Jonathan Murray for Backboard.io

Posted on

Stateful AI without a database: threads and assistants

LLMs are stateless. Every API call to a raw model is a blank slate. The model has no idea what was said two messages ago. So the moment you want a chatbot that remembers the conversation, you are on the hook for state.

The usual answer is infrastructure. Spin up Postgres to store message history. Add Redis to cache sessions. Stand up a vector database for long-term memory. Write the code that loads history, trims it to fit the context window, stitches it into every prompt, and saves the new turn. That is a lot of plumbing before the bot says hello.

Backboard handles state for you. Two ideas replace the whole stack: threads and assistants. You never run a database.

The model

Three things, nested:

  • Message is one turn. A user message in, an assistant reply out.
  • Thread is one conversation. An ordered list of messages. Pass its thread_id on the next call and the model sees the full history.
  • Assistant is the profile above the thread. It holds the name, default instructions, tools, and memory. One assistant can own many threads, for example one thread per end-user.

Memory lives on the assistant, so it is shared across every thread under it. History lives on the thread. Both persist on Backboard's side. Nothing to provision.

Threads: state within one conversation

Send a first message and a thread is created automatically. The response hands you a thread_id. Pass it back on the next call and the conversation continues with full context.

Python

pip install backboard-sdk
Enter fullscreen mode Exit fullscreen mode
import asyncio
from backboard import BackboardClient

async def main():
    client = BackboardClient(api_key="YOUR_API_KEY")

    first = await client.send_message("My favorite color is blue.")

    # Same thread: the model remembers the previous turn
    second = await client.send_message(
        "What did I just tell you?",
        thread_id=first.thread_id,
    )
    print(second.content)  # "You told me your favorite color is blue."

asyncio.run(main())
Enter fullscreen mode Exit fullscreen mode

JavaScript (Node 18+)

const send = (body) =>
  fetch("https://app.backboard.io/api/threads/messages", {
    method: "POST",
    headers: {
      "X-API-Key": "YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  }).then((r) => r.json());

const first = await send({ content: "My favorite color is blue." });

// Same thread: pass the thread_id back
const second = await send({
  content: "What did I just tell you?",
  thread_id: first.thread_id,
});

console.log(second.content);
Enter fullscreen mode Exit fullscreen mode

cURL

# First message, thread auto-created
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "My favorite color is blue."}'

# Continue: pass the thread_id from the first response
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "What did I just tell you?", "thread_id": "THREAD_ID_FROM_FIRST_RESPONSE"}'
Enter fullscreen mode Exit fullscreen mode

No history table. No prompt-stitching code. The thread_id is your conversation state, and Backboard stores it. When a thread gets long enough to crowd the context window, Backboard summarizes older messages automatically so you do not have to manage trimming.

Assistants: state across conversations

A thread remembers one chat. An assistant remembers the user across many chats. Memory is stored per assistant, so to carry facts into a brand new conversation you reuse the same assistant_id and start a fresh thread.

Python

# Conversation 1
await client.send_message(
    "I'm allergic to peanuts.",
    assistant_id="your-assistant-id",
    memory="Auto",
)

# Conversation 2: new thread, same assistant, memory carries over
reply = await client.send_message(
    "Any dietary restrictions you remember?",
    assistant_id="your-assistant-id",
    memory="Auto",
)
print(reply.content)  # "You mentioned you're allergic to peanuts."
Enter fullscreen mode Exit fullscreen mode

JavaScript (Node 18+)

await send({
  content: "I'm allergic to peanuts.",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

const reply = await send({
  content: "Any dietary restrictions you remember?",
  assistant_id: "your-assistant-id",
  memory: "Auto",
});

console.log(reply.content);
Enter fullscreen mode Exit fullscreen mode

cURL

curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "I am allergic to peanuts.", "assistant_id": "your-assistant-id", "memory": "Auto"}'

curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "Any dietary restrictions you remember?", "assistant_id": "your-assistant-id", "memory": "Auto"}'
Enter fullscreen mode Exit fullscreen mode

This is the part that normally requires a vector database: embedding facts, storing vectors, running similarity search on every request. Here it is one parameter, memory="Auto", and the assistant owns it.

When to pass what

Goal Pass
Keep talking in the same chat The same thread_id every call
New chat, but remember the user Omit thread_id, reuse the same assistant_id with memory="Auto"
One assistant, many users One assistant_id, a separate thread_id per user

That last row is the whole pattern for a multi-user app. One assistant defines your AI. Each user gets their own thread. State stays separated without a schema you designed, a migration you ran, or a database you babysit.

The point

Stateless models force you to build a state layer. Backboard makes that layer part of the API. Threads hold the conversation. Assistants hold the profile and the memory. Both persist server-side. You ship a stateful, multi-user AI app and never write a line of database code.

Grab a key and try it: app.backboard.io

Architecture in full: docs.backboard.io/concepts/architecture

Top comments (9)

Collapse
 
syedahmershah profile image
Syed Ahmer Shah

Calling the stateless nature of LLMs an 'engineering tax' is the perfect way to frame this. Everyone starts building an AI app thinking about prompt engineering, only to realize 80% of their time is actually spent babysitting database schemas, managing Redis caches, and manually stitching together conversation arrays.

Collapse
 
benjamin_nguyen_8ca6ff360 profile image
Benjamin Nguyen

I am curious! Are you using any security protocol to prevent any data breach's from hacker?

Collapse
 
jon_at_backboardio profile image
Jonathan Murray Backboard.io

Yes. We're SOC 2, with encryption in transit and at rest, and scoped API keys you control and can rotate to lock down access. You also keep full CRUD over your threads and memory, so you can export or delete your data anytime. And if you don't want data leaving your walls at all, you can run Backboard on an open-source model fully on-prem or air-gapped, so there's no external attack surface to breach. Happy to go deeper if you're evaluating for something specific.

Collapse
 
benjamin_nguyen_8ca6ff360 profile image
Benjamin Nguyen

neat! Sure. Do you have linkedin? If you feel more comfortable to continue our conservation there.

Thread Thread
 
jon_at_backboardio profile image
Jonathan Murray Backboard.io
Thread Thread
 
benjamin_nguyen_8ca6ff360 profile image
Benjamin Nguyen

Thank you! I will sent you an invitation.

Thread Thread
Collapse
 
uzoma_uche_3ec83974b4a8a5 profile image
Echo

Good walkthrough. The 'observe before you optimize' framing is what most agent benchmarks skip.

Collapse
 
jon_at_backboardio profile image
Jonathan Murray Backboard.io

Thanks! We've been doing a poor job communicating all of our stack value lol so I'm trying to release a blog a day explaining the different features and benefits!

Some comments may only be visible to logged-in visitors. Sign in to view all comments.