Jonathan Murray for Backboard.io

Chat with your documents: agentic RAG in a few lines

RAG is not new. Chunk a document, embed the chunks, store them in a vector database, run a retrieval step on each query, then feed the results to the model. Every team building with AI has wired this up at least once. It works, and it is also a stack of moving parts you have to assemble and keep running: a parser, an embedding model, a vector store, a retriever, and the glue between them.

Backboard does nothing novel here. It puts the whole pipeline behind one API: upload a file, wait for it to index, ask a question. Retrieval happens automatically inside the same send_message call you already use. The point is not a new idea; the point is that everything is unified and easy to use.

Three steps

  1. Upload a document to an assistant.
  2. Wait for it to reach indexed status.
  3. Ask a question. RAG runs on its own.

Python

pip install backboard-sdk

import asyncio
from backboard import BackboardClient

async def main():
    client = BackboardClient(api_key="YOUR_API_KEY")

    # 1. Create an assistant and upload a file to it
    assistant = await client.create_assistant(
        name="Docs Assistant",
        system_prompt="Answer questions using the uploaded documents.",
    )
    document = await client.upload_document_to_assistant(
        assistant.assistant_id,
        "knowledge-base.pdf",
    )

    # 2. Wait until the document is indexed
    while True:
        status = await client.get_document_status(document.document_id)
        if status.status == "indexed":
            break
        if status.status == "error":
            raise RuntimeError(status.status_message)
        await asyncio.sleep(2)

    # 3. Ask. Retrieval happens inside send_message
    reply = await client.send_message(
        "What are the key points in the document?",
        assistant_id=assistant.assistant_id,
    )
    print(reply.content)
    print(f"Files used: {reply.retrieved_files_count}")

asyncio.run(main())
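The polling loop above runs forever if a document never reaches a terminal status. Here is a minimal sketch of the same loop with a deadline. It assumes only the get_document_status call and the status and status_message fields shown above. The names wait_until_indexed, timeout, and interval are hypothetical, the "processing" status is a placeholder for any non-terminal value, and FakeClient exists only so the snippet runs without an API key.

```python
import asyncio
import time
from types import SimpleNamespace


async def wait_until_indexed(client, document_id, timeout=300.0, interval=2.0):
    """Poll until the document is indexed, fails, or the deadline passes.

    Hypothetical helper: relies only on get_document_status and the
    status / status_message fields used in the example above.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = await client.get_document_status(document_id)
        if status.status == "indexed":
            return status
        if status.status == "error":
            raise RuntimeError(status.status_message)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{document_id} not indexed after {timeout}s")
        await asyncio.sleep(interval)


class FakeClient:
    """Stand-in for BackboardClient so the sketch runs offline."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    async def get_document_status(self, document_id):
        # "processing" is a placeholder for any non-terminal status.
        return SimpleNamespace(status=next(self._statuses), status_message=None)


async def demo():
    client = FakeClient(["processing", "processing", "indexed"])
    final = await wait_until_indexed(client, "doc-123", timeout=5, interval=0.01)
    return final.status


result = asyncio.run(demo())
print(result)
```

In real use you would pass the BackboardClient instance and document.document_id from the example above, and pick a timeout that fits your largest files.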

JavaScript (Node 18+)

// Run as an ES module (an .mjs file or "type": "module") so top-level await works.
import { readFileSync } from "node:fs";

const KEY = "YOUR_API_KEY";
const base = "https://app.backboard.io/api";

// 1. Create an assistant
const assistant = await fetch(`${base}/assistants`, {
  method: "POST",
  headers: { "X-API-Key": KEY, "Content-Type": "application/json" },
  body: JSON.stringify({
    name: "Docs Assistant",
    system_prompt: "Answer questions using the uploaded documents.",
  }),
}).then((r) => r.json());

// Upload a file (multipart, no Content-Type header so fetch sets the boundary)
const form = new FormData();
form.append("file", new Blob([readFileSync("knowledge-base.pdf")]), "knowledge-base.pdf");

const document = await fetch(
  `${base}/assistants/${assistant.assistant_id}/documents`,
  { method: "POST", headers: { "X-API-Key": KEY }, body: form }
).then((r) => r.json());

// 2. Poll until indexed
let status;
do {
  await new Promise((r) => setTimeout(r, 2000));
  status = await fetch(`${base}/documents/${document.document_id}/status`, {
    headers: { "X-API-Key": KEY },
  }).then((r) => r.json());
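} while (status.status !== "indexed" && status.status !== "error");

// Stop on indexing failure, as the Python example does
// (assumes the response carries the same status_message field).
if (status.status === "error") throw new Error(status.status_message);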

// 3. Ask
const reply = await fetch(`${base}/threads/messages`, {
  method: "POST",
  headers: { "X-API-Key": KEY, "Content-Type": "application/json" },
  body: JSON.stringify({
    content: "What are the key points in the document?",
    assistant_id: assistant.assistant_id,
  }),
}).then((r) => r.json());

console.log(reply.content);
console.log(`Files used: ${reply.retrieved_files_count}`);

cURL

# 1. Create an assistant
curl -X POST "https://app.backboard.io/api/assistants" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Docs Assistant", "system_prompt": "Answer questions using the uploaded documents."}'

# Upload a file (use the assistant_id from above)
curl -X POST "https://app.backboard.io/api/assistants/ASSISTANT_ID/documents" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@knowledge-base.pdf"

# 2. Check status until it returns "indexed"
curl "https://app.backboard.io/api/documents/DOCUMENT_ID/status" \
  -H "X-API-Key: YOUR_API_KEY"

# 3. Ask a question
curl -X POST "https://app.backboard.io/api/threads/messages" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "What are the key points in the document?", "assistant_id": "ASSISTANT_ID"}'

That is the whole RAG pipeline. No vector database to provision, no embedding service to call, no retriever to write. You uploaded a file and asked a question.

What "agentic" means here

You never call a retrieval endpoint. When you send a message to an assistant that has documents, Backboard decides what to fetch and pulls the relevant chunks with hybrid search (keyword and vector together), then answers. The response tells you what it used:

reply = await client.send_message(
    "Summarize section 3.",
    assistant_id=assistant.assistant_id,
)
print(reply.retrieved_files)        # filenames used as context
print(reply.retrieved_files_count)  # how many
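Because the reply reports what was retrieved, you can surface that to users or flag answers that used no documents. A minimal sketch, assuming retrieved_files is a list of filenames and retrieved_files_count is an integer, as the comments above suggest. The describe_sources helper and its wording are hypothetical, and the reply objects are stand-ins so the snippet runs without an API key.

```python
from types import SimpleNamespace


def describe_sources(reply):
    """Return a one-line summary of which files backed an answer.

    Uses only the retrieved_files and retrieved_files_count fields shown
    above; the function name and message text are illustrative.
    """
    count = reply.retrieved_files_count or 0
    if count == 0:
        return "No documents were retrieved for this answer."
    names = ", ".join(reply.retrieved_files or [])
    return f"Answer drew on {count} file(s): {names}"


# Stand-in reply objects (not real API responses).
grounded = SimpleNamespace(
    content="Section 3 covers pricing.",
    retrieved_files=["knowledge-base.pdf"],
    retrieved_files_count=1,
)
ungrounded = SimpleNamespace(
    content="Hello!",
    retrieved_files=[],
    retrieved_files_count=0,
)

print(describe_sources(grounded))
print(describe_sources(ungrounded))
```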

Want deeper or shallower retrieval? Set tok_k on the assistant. It is the number of chunks pulled per query (default 10).

assistant = await client.create_assistant(
    name="Docs Assistant",
    system_prompt="Answer using the documents.",
    tok_k=20,  # retrieve more context per query
)
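This article does not describe how Backboard merges keyword and vector results. To make "hybrid search" and "chunks pulled per query" concrete, here is a toy sketch using reciprocal rank fusion, a common generic technique. It is not Backboard's implementation. Every name is hypothetical and the two rankings are hand-written; limit plays the role of the per-query chunk count.

```python
def reciprocal_rank_fusion(rankings, rrf_k=60):
    """Merge several ranked lists of chunk ids into one fused ranking.

    Each chunk scores sum(1 / (rrf_k + rank)) over the lists it appears
    in, with rank starting at 1. 60 is the constant commonly used for RRF.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (rrf_k + rank)
    # Highest score first; ties broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def retrieve(keyword_ranking, vector_ranking, limit):
    """Fuse both rankings and keep only the first `limit` chunks."""
    return reciprocal_rank_fusion([keyword_ranking, vector_ranking])[:limit]


# Hand-written rankings standing in for a keyword index and a vector index.
keyword_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c5", "c3"]

print(retrieve(keyword_hits, vector_hits, limit=2))  # ['c1', 'c3']
print(retrieve(keyword_hits, vector_hits, limit=4))  # ['c1', 'c3', 'c5', 'c7']
```

Chunks that rank well in both lists rise to the top, and a larger limit lets single-list hits such as c5 and c7 into the context.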

Two scopes

Where you upload decides who can see the document:

  • Assistant scope (upload_document_to_assistant): shared across every thread under that assistant. Use it for a knowledge base, product docs, or policies that all users should query.
  • Thread scope (upload_document_to_thread): visible only in that one conversation. Use it for a file a single user drops into a single chat.

# This file is only visible in one conversation
await client.upload_document_to_thread(thread_id, "meeting-notes.pdf")

Same upload, same query, different reach. No extra config.
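If your application handles both kinds of upload, one small wrapper can make the scope choice explicit. This is a sketch, assuming the two SDK methods take the positional arguments shown in the examples above. The upload_scoped helper is hypothetical, and RecordingClient is a stand-in for BackboardClient so the snippet runs offline.

```python
import asyncio


async def upload_scoped(client, path, *, assistant_id=None, thread_id=None):
    """Upload to exactly one scope: an assistant (shared) or a thread (private)."""
    if (assistant_id is None) == (thread_id is None):
        raise ValueError("pass exactly one of assistant_id or thread_id")
    if thread_id is not None:
        return await client.upload_document_to_thread(thread_id, path)
    return await client.upload_document_to_assistant(assistant_id, path)


class RecordingClient:
    """Stand-in for BackboardClient that records which method was called."""

    def __init__(self):
        self.calls = []

    async def upload_document_to_assistant(self, assistant_id, path):
        self.calls.append(("assistant", assistant_id, path))

    async def upload_document_to_thread(self, thread_id, path):
        self.calls.append(("thread", thread_id, path))


async def demo():
    client = RecordingClient()
    await upload_scoped(client, "policies.pdf", assistant_id="asst-1")
    await upload_scoped(client, "meeting-notes.pdf", thread_id="thread-9")
    return client.calls


calls = asyncio.run(demo())
print(calls)
```

Requiring exactly one id means a private file cannot land in the shared knowledge base by accident.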

Supported files

PDFs, Office files (.docx, .pptx, .xlsx), text and data (.txt, .csv, .md, .json, .xml), source code in most languages, and images. Upload it, and it is searchable once indexed.
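A quick local check before uploading can catch obviously unsupported files. The sketch below covers only the extensions named explicitly above. Source code and image formats are also supported but are not enumerated here, so the set is deliberately incomplete and a False result does not mean the API would reject the file. NAMED_EXTENSIONS and is_named_type are hypothetical names.

```python
from pathlib import Path

# Only the extensions this article names explicitly. Source code and image
# formats are supported too, but they are not listed individually here.
NAMED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".txt", ".csv", ".md", ".json", ".xml",
}


def is_named_type(path):
    """True if the file extension is one of those listed explicitly above."""
    return Path(path).suffix.lower() in NAMED_EXTENSIONS


print(is_named_type("knowledge-base.pdf"))  # True
print(is_named_type("Report.DOCX"))         # True (case-insensitive)
print(is_named_type("archive.zip"))         # False
```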

The point

Agentic RAG is not a new trick. The win is that you do not build it. One upload, one status check, one message, and your assistant answers from your documents with retrieval handled inside the call. It is all in the same API, and that is the entire feature.

Grab a key and try it: app.backboard.io

Documents docs: docs.backboard.io/concepts/documents