Introduction
Retrieval-Augmented Generation (RAG) enhances AI responses by grounding them in relevant external knowledge. In this tutorial, you'll learn how to build a Cloudflare Worker that runs a RAG pipeline using SurrealDB, Rig AI, and DeepSeek via GroqCloud - all at blazing speeds of 275 tokens per second!
By the end of this guide, you'll understand how you can effectively use SurrealDB to create a Cloudflare Worker that performs Retrieval-Augmented Generation (RAG), by creating a Cargo workspace with the following:
- An application that can seed a SurrealDB instance
- A Cloudflare Worker that can carry out cosine-similarity-based vector search in a SurrealDB instance and use the found document to help answer a user query
Interested in checking out the final code or just want to skip to the end result? Have a look at the code repository.
Getting Started
Prerequisites
To get started with this tutorial, you will need the following:
- A SurrealDB Cloud account
- A Cloudflare account
- An OpenAI account, with an API key
- A GroqCloud account, with an API key
Before we go any further, you will need to ensure that your OpenAI and Groq API keys are stored as local environment variables so that the code we'll write can use them. You can do this like so:
export OPENAI_API_KEY=<your-openai-api-key>
export GROQ_API_KEY=<your-groq-api-key>
Preparing our SurrealDB instance
To get started, register an account with SurrealDB Cloud. You'll then need to create a SurrealDB instance (they have a free instance available which you can use).
Next, open the instance in Surrealist (using the Connect drop-down) and you'll find a query box. Paste the following into the query and run it:
-- define and use namespace/database for queries
DEFINE NAMESPACE cfw_surrealdb_rig;
USE NAMESPACE cfw_surrealdb_rig;
DEFINE DATABASE cfw_surrealdb_rig;
USE DATABASE cfw_surrealdb_rig;
-- define root user
-- note that in prod, your roles should be properly locked down
DEFINE USER root ON ROOT PASSWORD 'root' ROLES OWNER;
-- define table & fields
DEFINE TABLE words SCHEMAFULL;
DEFINE FIELD word ON TABLE words TYPE string;
DEFINE FIELD description ON TABLE words TYPE string;
DEFINE FIELD embedding ON TABLE words TYPE array<float>;
-- define index on embedding field
DEFINE INDEX IF NOT EXISTS words_embedding_vector_index ON words
FIELDS embedding
MTREE DIMENSION 1536
DIST COSINE;
This will do the following:
- Create a new namespace & database, then use them for the remaining statements in the query
- Define a new user with Owner permissions
- Define a schemafull table with fields for the word, its description, and the embedding representation of the description
- Define an index on the embedding field to make vector search faster
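Once the table has data in it (we'll seed it in a later step), you can sanity-check the index straight from Surrealist. Here's a minimal sketch - it assumes SurrealDB's K-nearest-neighbour operator, and $vec is a placeholder for a 1536-dimension float array (e.g. one copied from a seeded record):
-- hypothetical index check: fetch the single nearest neighbour to $vec
SELECT word, description FROM words WHERE embedding <|1|> $vec;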
Note that you will need the namespace & database names (cfw_surrealdb_rig) and the username & password for logging in later (root/root). Make sure you also store the SurrealDB hostname somewhere - you can find this by clicking the three vertical dots at the top right of your instance (from the Instances menu), then clicking "Copy hostname".
Project Setup
To get started, we'll create a workspace:
mkdir cfw-surrealdb-rig
cd cfw-surrealdb-rig
This will eventually hold the following:
- A small program for filling the SurrealDB instance with data
- Our Cloudflare Worker
For now, we will create a Cargo.toml file in our root folder to signify that we're using Cargo workspaces:
# Cargo.toml
[workspace]
members = []
resolver = "2"
[workspace.dependencies]
rig-core = { version = "0.7.0", default-features = false, features = [
"worker",
"derive",
] }
surrealdb = { version = "2.1.4", features = ["protocol-ws"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = "1.0.138"
See the list below for an explanation of dependencies:
- rig-core: The AI framework we'll be using to connect with the various providers we need.
- surrealdb: The Rust crate for using SurrealDB. We connect to it via WebSocket, so the websocket protocol feature has been added.
- serde: A (de)serialization library. We add the derive feature for extremely easy derive macro usage on structs.
- serde_json: A library that allows (de)serialization to/from JSON.
Because most of our dependencies will be shared, we can store them in the workspace Cargo.toml file. When creating our new projects, we can then simply set workspace = true rather than having to manually add everything.
Seeding our SurrealDB instance
(Note: if you want to skip writing all the code for this step, you can do so by visiting the repo and following the instructions there.)
To get started we'll create a new program in our workspace:
cargo init seeder
cd seeder
Next, we will add Tokio to this workspace crate with cargo add - note that the macros and rt-multi-thread features are required for the #[tokio::main] attribute we'll be using shortly:
cargo add tokio --features macros,rt-multi-thread
We'll also add our workspace dependencies:
# cfw-surrealdb-rig/seeder/Cargo.toml
## .. rest of your .toml file up here
[dependencies]
tokio = "1.43.0" # or whatever the latest version is
surrealdb = { workspace = true }
rig-core = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
Now it's time to start coding!
We'll first set up our main function to use the Tokio runtime and return an error properly instead of forcing a panic at runtime - this allows us to use the ? operator, which saves space and propagates errors up the call stack:
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// .. your code here
}
Next, we'll connect to our SurrealDB instance (put this code in your main function!). Note that these values match the ones we assumed earlier, so if you used different ones, double-check that you have the right values here.
use surrealdb::{
engine::remote::ws::Wss,
opt::auth::Root,
Surreal,
};
let surreal =
Surreal::new::<Wss>("<your-surrealdb-hostname-here>")
.await?;
surreal.signin(Root {
username: "root",
password: "root",
})
.await?;
surreal.use_ns("cfw_surrealdb_rig")
.use_db("cfw_surrealdb_rig")
.await?;
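If you want to confirm that the connection and login actually worked before moving on, one option (a sketch, assuming the SDK's version() method, which round-trips to the server) is:
// optional sanity check: fetch the server version to prove
// the WebSocket connection and authentication both work
let version = surreal.version().await?;
println!("Connected to SurrealDB {version}");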
Next, we will create an OpenAI client that will be used to embed documents. We'll use the OPENAI_API_KEY environment variable to set up our OpenAI client, then select our embedding model.
use rig::providers::openai::{Client, TEXT_EMBEDDING_ADA_002};
use std::env;
// Create OpenAI client from the OPENAI_API_KEY environment variable
let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
let openai_client = Client::new(&openai_api_key);
// Select the embedding model we'll embed our documents with
let embedding_model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);
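As a quick optional check that the model matches our schema, you can embed a throwaway string and confirm it comes back with 1536 dimensions - a sketch using the same embed_text method we'll use in the worker later (this assumes the EmbeddingModel trait is in scope, which we import just below):
// optional: TEXT_EMBEDDING_ADA_002 should produce 1536-dimension vectors,
// matching the MTREE DIMENSION 1536 we defined in our schema
let probe = embedding_model.embed_text("hello").await?;
assert_eq!(probe.vec.len(), 1536);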
Next we'll create our data to be embedded. Note that we derive a new trait, rig::Embed, which allows the struct to be embedded automatically. During the embedding process, our seeder will only embed the description field (marked with the #[embed] attribute); when we iterate over the built result, we'll map each document and its embedding into the flat shape our words table expects.
use rig::embeddings::{EmbeddingModel, EmbeddingsBuilder};
use serde::{Deserialize, Serialize};
#[derive(rig::Embed, Serialize, Deserialize, Clone, Debug, Eq, PartialEq, Default)]
struct WordDefinition {
id: String,
word: String,
#[embed]
description: String,
}
#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Default)]
struct WordDefinitionEmbed {
word: String,
description: String,
embedding: Vec<f64>,
}
let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
.documents(vec![
WordDefinition {
id: "doc0".to_string(),
word: "flurbo".to_string(),
description:
"1. *flurbo* (name): A flurbo is a green alien that lives on cold planets.".to_string(),
},
WordDefinition {
id: "doc1".to_string(),
word: "glarb-glarb".to_string(),
description: "2. *glarb-glarb* (noun): A fictional creature found in the distant, swampy marshlands of the planet Glibbo in the Andromeda galaxy.".to_string()
},
WordDefinition {
id: "doc2".to_string(),
word: "linglingdong".to_string(),
description: "1. *linglingdong* (noun): A rare, mystical instrument crafted by the ancient monks of the Nebulon Mountain Ranges on the planet Quarm.".to_string()
},
])?
.build()
.await?;
let embeddings = embeddings
.into_iter()
.map(|(d, embeddings)| {
let embedding: Vec<f64> = embeddings.first().vec.iter().map(|&x| x as f64).collect();
WordDefinitionEmbed {
word: d.word,
description: d.description,
embedding,
}
})
.collect::<Vec<WordDefinitionEmbed>>();
for embedding in embeddings {
let _: Option<WordDefinitionEmbed> = surreal.create("words").content(embedding).await?;
}
Ok(())
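If you'd like the seeder to confirm its own work, a hedged sketch you could place just before the final Ok(()) is a count query using SurrealQL's GROUP ALL aggregation:
// optional: count the rows in the words table to confirm the inserts landed
let mut check = surreal
    .query("SELECT count() AS total FROM words GROUP ALL")
    .await?;
let total: Vec<serde_json::Value> = check.take(0)?;
println!("words table count result: {total:?}");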
If you run this with cargo run -p seeder, you should now have some records in your SurrealDB instance. You can verify this by going back to your SurrealDB instance and connecting with Surrealist (then running SELECT * FROM words as the query), or by using the Explorer.
Setting up the worker
Next, we'll need to build our worker. Note that to do this, we'll need to be able to use only libraries that compile to WebAssembly - fortunately everything we're using in this post does by default!
If you haven't already, you'll need to install the wasm32-unknown-unknown compilation target, as well as cargo-generate, which is used to initialise many different types of boilerplate:
rustup target add wasm32-unknown-unknown
cargo install cargo-generate
Next, we'll use cargo-generate to create our boilerplate:
cargo generate cloudflare/workers-rs
Make sure to select the hello world template, as that's what we'll be using. Note that upon generation, the package will be automatically added to your workspace - so you don't need to add it yourself.
We'll then add our workspace dependencies to the new worker crate:
# cfw-surrealdb-rig/<your-worker-name>/Cargo.toml
## .. rest of your .toml file up here
[lib]
crate-type = ["cdylib"]
[dependencies]
worker = { version = "0.5.0" }
worker-macros = { version = "0.5.0" }
console_error_panic_hook = { version = "0.1.1" }
rig-core = { workspace = true }
surrealdb = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
Like before, we need to connect to our SurrealDB instance - but this time we'll be using env.secret instead of std::env::var. This is because Cloudflare Workers read configuration from the Cloudflare deployment environment (secrets and bindings) rather than from process environment variables.
use serde::Deserialize;
use surrealdb::{
    engine::remote::ws::Wss,
    opt::auth::Root,
    Surreal,
};
use worker::*;
#[derive(Deserialize)]
struct UserQuery {
contents: String,
}
#[event(fetch)]
async fn fetch(mut req: Request, env: Env, _ctx: Context) -> Result<Response> {
// serialize raw json body to UserQuery struct type
let query: UserQuery = req.json().await?;
// environment secrets (we will talk about this later)
let surrealdb_url = env.secret("SURREALDB_URL")?.to_string();
let surrealdb_user = env.secret("SURREALDB_USER")?.to_string();
let surrealdb_password = env.secret("SURREALDB_PASSWORD")?.to_string();
let surrealdb_ns = env.secret("SURREALDB_NS")?.to_string();
let surrealdb_db = env.secret("SURREALDB_DB")?.to_string();
let openai_api_key = env.secret("OPENAI_API_KEY")?.to_string();
let groq_api_key = env.secret("GROQ_API_KEY")?.to_string();
// then we connect to SurrealDB as usual
let surreal = Surreal::new::<Wss>(&surrealdb_url)
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?;
// log into the database
surreal
    .signin(Root {
username: &surrealdb_user,
password: &surrealdb_password,
})
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?;
surreal
.use_ns(&surrealdb_ns)
.use_db(&surrealdb_db)
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?;
// .. rest of code goes here
}
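As an aside: env.secret is for sensitive values. For non-sensitive configuration, Workers also support plain variables defined in wrangler.toml, which you'd read with env.var instead of env.secret. A sketch of what that might look like, if you preferred the namespace and database names not to be secrets:
# wrangler.toml (sketch) - non-secret configuration, read via env.var("SURREALDB_NS")
[vars]
SURREALDB_NS = "cfw_surrealdb_rig"
SURREALDB_DB = "cfw_surrealdb_rig"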
Next, we'll create our OpenAI embedding model as usual and embed our text. We will query SurrealDB using the new embedding to find the document with the highest cosine similarity to our embedding, then use .take(0) to retrieve the result.
Note that because we haven't set a threshold for the similarity score, this should always return a record as long as there is at least one record in the words table (we'll look at a thresholded variant after the code).
use rig::embeddings::EmbeddingModel;
use rig::providers::openai::{Client, TEXT_EMBEDDING_ADA_002};
use serde::Serialize;
// create openAI client
let openai_client = Client::new(&openai_api_key);
#[derive(Serialize, Deserialize, Clone, Debug, Default)]
struct EmbedResult {
word: String,
description: String,
score: f32,
}
let embedding_model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);
let text: Vec<f64> = embedding_model
.embed_text(&query.contents)
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?
.vec
.iter()
.map(|&x| x as f64)
.collect();
let mut result = surreal
.query("SELECT word, description, vector::similarity::cosine($vec, embedding) as score FROM words order by score desc limit 1")
.bind(("vec", text))
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?;
let result: Vec<EmbedResult> = result
.take(0)
.map_err(|x| worker::Error::RustError(x.to_string()))?;
let result_as_string = result
.into_iter()
.map(|x| x.description)
.collect::<Vec<String>>()
.join("\n");
Finally, we just need to add our Groq client. We can leverage rig's openai module to quickly and efficiently access the GroqCloud chat completion API, as it has full compatibility with the OpenAI chat completion API.
use rig::completion::Prompt;
let groq_client = Client::from_url(&groq_api_key, "https://api.groq.com/openai/v1");
let chat_model = groq_client
.agent("deepseek-r1-distill-llama-70b")
.preamble("You are an agent designed to answer questions based on user input.")
.build();
let prompt = format!(
"{}
Context: {result_as_string}",
query.contents
);
let answer = chat_model
.prompt(&prompt)
.await
.map_err(|x| worker::Error::RustError(x.to_string()))?;
Response::ok(answer)
Deployment
Now for the fun part: deploying our worker! For this you'll need either the wrangler CLI installed, or npm (with which you can either use npm install to install wrangler, or use npx wrangler to run the CLI directly). Don't forget to use wrangler login to give wrangler access to your Cloudflare account if this is your first time!
Secrets
You can add secrets directly from the Workers web UI. That said, if you're already in your development environment, it's often easier to write your secrets to a git-ignored file and upload them from there. The wrangler CLI accepts uploading secrets in bulk from a JSON file - which we'll do below.
To get started, create a file called secrets.json that looks like this (make sure to fill out the relevant variables):
{
"SURREALDB_URL": "<surrealdb-instance-url-here>",
"SURREALDB_USER": "root",
"SURREALDB_PASSWORD": "root",
"SURREALDB_NS": "cfw_surrealdb_rig",
"SURREALDB_DB": "cfw_surrealdb_rig",
"OPENAI_API_KEY": "<openai-api-key-here>",
"GROQ_API_KEY": "<groq-api-key-here>"
}
Once done, use wrangler secret bulk --json secrets.json to upload your secrets in bulk. Nice and easy!
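One more note: uploaded secrets only apply to the deployed worker. If you want to test locally with npx wrangler dev, wrangler instead reads local secrets from a git-ignored .dev.vars file in dotenv syntax, e.g.:
# .dev.vars (git-ignored) - local secrets for `npx wrangler dev`
SURREALDB_URL=<surrealdb-instance-url-here>
OPENAI_API_KEY=<openai-api-key-here>
GROQ_API_KEY=<groq-api-key-here>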
Observability
Enabling logging for your Worker, especially in production, can be hugely useful for debugging issues and getting your worker back up again.
To enable it, add the observability object to your wrangler.toml file, then add the enabled = true property under it:
# wrangler.toml
[observability]
enabled = true
Now if there are any issues, you should be able to find out what the problem is by simply looking at your worker's logs in the web UI!
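You can also stream logs straight to your terminal instead of the web UI with wrangler's tail command:
npx wrangler tail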
How to deploy
To deploy, all you need to do is run wrangler deploy. That's it! The wrangler CLI will then build your worker and upload it to Cloudflare, ready to run whenever someone triggers the endpoint.
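Once deployed, you can give the endpoint a spin with curl. A sketch - the workers.dev URL below is hypothetical (wrangler prints your real one after deploying), and the JSON body matches the UserQuery struct our worker deserializes:
curl https://<your-worker>.<your-subdomain>.workers.dev \
  -H "Content-Type: application/json" \
  -d '{"contents": "What is a glarb-glarb?"}'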
Finishing Up
Thanks for reading! Hopefully this article has given you a better idea of how you can use SurrealDB with Rig to make effective AI agents.
Extending this further
Interested in extending this example? Here are a couple of ideas you can try:
- Got a bunch of related documents you want to embed? You could try implementing Graph RAG to improve on the initial RAG method.
- You could also try implementing semantic routing and other common RAG techniques that greatly improve the effectiveness of your AI agents.