
How to implement a vector database for AI

Written by Ovie Okeh✏️

If you’ve been online in the past year, you’ve most likely come across an AI model in the wild. And for good reason, too: these models are quickly becoming very powerful, with the best ones capable of producing some truly amazing results.

However, AI models are a bit like gourmet chefs. They can create wonders, but they need good quality ingredients. AI models do alright with most input, but they truly shine if they receive input in the format they’re most tuned to. This is the whole point of vector databases.

Over the course of this article, we’ll dive deep into what vector databases are, why they’re becoming increasingly important in the AI world, and then we’ll look at a step-by-step guide to implementing a vector database.

Jump ahead:

  • What is a vector database?
  • Why are vector databases necessary?
  • Implementing a vector database: Step-by-step guide
  • Combining vector embeddings and AI
  • Conclusion and next steps

What is a vector database?

Before we start exploring vector databases, it's important to understand what vectors are in the context of programming and machine learning.

In programming, a vector is essentially a one-dimensional array of numbers. If you've ever written code that involves 3D graphics or machine learning algorithms, chances are you've used vectors.

const vector4_example = [0.5, 1.5, 6.0, 3.4]

They're just arrays of numbers, typically floating point numbers, that we refer to by their dimensions. For example, a vector3 is a three-element array of floating point numbers, and a vector4 is a four-element array of floating point numbers. It's as simple as that!

But vectors are more than just arrays of numbers. In the context of machine learning, vectors play a critical role in representing and manipulating data in high-dimensional spaces. This enables us to perform complex operations and computations that drive AI models.

Now that we've got a grip on vectors, let's turn our attention to vector databases.

At first glance, you might think, "Hey, since vectors are just arrays of numbers, can't we use our regular databases to store them?" Well, technically, you can. But here's where things get interesting.

A vector database is a specialized type of database that is optimized for storing and performing operations on large amounts of vector data. So while your regular database can indeed store arrays, a vector database goes a step further by providing specialized tools and operations to handle vectors.

In the next section, we'll talk about why vector databases are necessary and the advantages they bring to the table. So stick around, because things are about to get even more interesting!

Why are vector databases necessary?

Now that we have a solid understanding of what a vector database is, let's dive into why they are so necessary in the realm of AI and machine learning.

The key word here is performance. Vector databases often process hundreds of millions of vectors per query, and this kind of performance is significantly faster than what traditional databases are able to achieve when handling vectors.

So, what makes vector databases so speedy and efficient? Let's take a look at some of the key features that set them apart.

Complex mathematical operations

Vector databases are designed to perform complex mathematical operations on vectors, such as filtering and locating "nearby" vectors. These operations are crucial in machine learning contexts, where models often need to find vectors that are close to each other in a high-dimensional space.

For example, cosine similarity, a common data analysis technique, is often used to measure how similar two vectors are. Vector databases excel at these kinds of computations, as the sketch below shows.
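
To make this concrete, here's a minimal sketch of cosine similarity in plain JavaScript. A vector database performs this comparison at scale, across millions of stored vectors, but the underlying math is just this:

// Cosine similarity: dot(a, b) / (|a| * |b|)
// Returns 1 for identical directions, 0 for orthogonal (unrelated) vectors
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([0.5, 1.5, 6.0], [0.4, 1.6, 5.8]); // ≈ 0.999 (very similar)
cosineSimilarity([1, 0, 0], [0, 1, 0]); // 0 (unrelated)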

Specialized vector indexes

Like a well-organized library, a database needs a good indexing system to quickly retrieve the data requested. Vector databases provide specialized vector indexes (such as HNSW) that make nearest-neighbor retrieval significantly faster than the exhaustive scans a traditional database would have to perform, usually by trading a small amount of accuracy for speed (approximate rather than exact search).

With these indexes, vector databases can rapidly locate the vectors that your AI model needs and produce results quickly.

Compact storage

In the world of big data, storage space is a precious commodity. Vector databases shine here as well, storing vectors in a way that makes them more compact. Techniques like compressing and quantizing vectors are used to keep as much data in memory as possible, which further reduces load and query latency.
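
As an illustration, here's a toy sketch of scalar quantization, one common compression technique: each 32-bit float is mapped to an 8-bit integer, cutting memory use roughly 4x at the cost of some precision. Real vector databases use more sophisticated schemes (such as product quantization), but the trade-off is the same:

// Toy scalar quantization: map each float in [min, max] to an 8-bit code (0–255)
function quantize(vector) {
  const min = Math.min(...vector);
  const max = Math.max(...vector);
  const scale = (max - min) / 255 || 1; // avoid divide-by-zero for constant vectors
  const codes = Uint8Array.from(vector, (v) => Math.round((v - min) / scale));
  return { codes, min, scale }; // min and scale let us approximately reconstruct
}

function dequantize({ codes, min, scale }) {
  return Array.from(codes, (c) => c * scale + min);
}

const compressed = quantize([0.5, 1.5, 6.0, 3.4]); // 4 bytes of codes vs 16 bytes of float32
dequantize(compressed); // ≈ [0.5, 1.49, 6.0, 3.39], within quantization error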

Sharding

When dealing with massive amounts of data, it can be beneficial to distribute the data across multiple machines, a process known as sharding. Many databases can do this, but SQL databases in particular take more effort to scale out. Vector databases, on the other hand, often have sharding built into their architecture, allowing them to easily handle large amounts of data.

In short, while traditional databases can store and perform operations on vectors, they aren't optimized for the task. Vector databases, on the other hand, are built for exactly this purpose. They provide the speed, efficiency, and specialized tools necessary to handle large amounts of vector data, making them an essential tool in the world of AI and machine learning.

In the next section, we'll roll up our sleeves and implement a vector database step by step. We're just getting started!

Implementing a vector database: Step-by-step guide

For the purposes of this guide, we will be using Weaviate, a popular vector database service, to implement a simple vector database that you can build on for any use case.

You can clone a starter template here and run npm install to set it up.

Prerequisites

  • Prior JS knowledge will be helpful: All code written in this tutorial will be in JavaScript and we’ll use the Weaviate JavaScript SDK as well
  • Node and npm: We’ll be working in a JavaScript environment on the server
  • An OpenAI API key: We’ll use their embedding model to convert our data into embeddings to store in our database
  • A Weaviate account: We’ll use their hosted database service; you can get a free account here

Setting up a Weaviate project

After creating an account, you'll need to set up a project through the Weaviate dashboard. Go to the WCS Console and click Create cluster. Select the Free sandbox tier and provide a Cluster name. When it asks you to enable authentication, select Yes, then click Create. After a few minutes, you should see a checkmark when it's finished.

Click Details to see your cluster details, as we’ll need them in the next section. These include:

  • A Weaviate URL
  • Authentication details (Weaviate API key; click the key icon to reveal)

Creating our Node.js project

With the prerequisites out of the way, we can create the vector database and query it. To follow along, you’ll need a new Node project; you can clone the template on GitHub here, which includes everything you’ll need to get started.

Or you can create one by running the following commands:

mkdir weaviate-vector-database && cd weaviate-vector-database
npm init -y && npm install dotenv openai weaviate-ts-client
mkdir src

Edit your package.json file and add a start script, like so:

// ...rest of package.json
  "scripts": {
    "start": "node src/index.js"
  },
// ...rest of package.json

Create a .env file to store sensitive information like API keys. Run the touch .env command and open the newly created .env file in your code editor, then paste in the following, making sure to replace the placeholders with your actual values:

# .env
OPENAI_API_KEY="<OPENAI_API_KEY>"
WEAVIATE_API_KEY="<WEAVIATE_API_KEY>"
WEAVIATE_URL="<WEAVIATE_URL>"
DATA_CLASSNAME="Document"

Setting up our vector database

With the project setup complete, we can add some code to set up and work with our vector database. Let’s quickly summarize what we’re going to implement:

  • Helper functions, which:
    • Connect to our database
    • Batch vectorize and upload documents
    • Query the most similar items
  • A main function, which uses the helper functions above to upload documents and query the database all in one go

Setting up the client

With that said, let’s create our first file to store our database connection and helper functions. Create a new file by running touch src/database.js and let’s start filling it out:

// src/database.js
import weaviate, { ApiKey } from "weaviate-ts-client";
import { config } from "dotenv";

config();

async function setupClient() {
  let client;

  try {
    client = weaviate.client({
      scheme: "https",
      host: process.env.WEAVIATE_URL,
      apiKey: new ApiKey(process.env.WEAVIATE_API_KEY),
      headers: { "X-OpenAI-Api-Key": process.env.OPENAI_API_KEY },
    });
  } catch (err) {
    console.error("error >>>", err.message);
  }

  return client;
}
// ... code continues below

Let’s break down what’s happening here. First, we import the necessary packages, the Weaviate client and the dotenv configuration. dotenv is a package that loads environment variables from a .env file into process.env. The Weaviate and OpenAI keys and URLs are typically stored in environment variables to keep them secret and out of your codebase.

Here’s what happens in the setupClient() function:

  1. We initialize a variable client
  2. Inside a try…catch block, we set up the connection to the Weaviate server. If any errors occur during this process, we print the error message to the console
    • Within the try block, we create a new Weaviate client using the weaviate.client() method. The scheme, host, and apiKey parameters are taken from the environment variables we've set
  3. Finally, we pass in the header for OpenAI, as we'll be using OpenAI's Ada model to vectorize our data
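
If you'd like to sanity-check the connection before moving on, one option (not part of the starter template, just a quick test) is to fetch the cluster's meta information, which weaviate-ts-client exposes under client.misc:

// Quick connection check: prints the server version if the cluster is reachable
const client = await setupClient();
const meta = await client.misc.metaGetter().do();
console.log("connected to Weaviate", meta.version);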

Migrating data

With the client set up, let’s run a migration with some dummy data: a collection of fictional beings, places, and events. Later on, we’ll query GPT-3.5 about this data.

If you did not clone the starter template, follow the steps below:

Take some time to explore the data in src/data.js. Then, add a new import at the top of the src/database.js file:

// ...other imports
import { FAKE_XORDIA_HISTORY } from "./data.js";

Below the setupClient function, add the following function:

async function migrate(shouldDeleteAllDocuments = false) {
  try {
    const classObj = {
      class: process.env.DATA_CLASSNAME,
      vectorizer: "text2vec-openai",
      moduleConfig: {
        "text2vec-openai": {
          model: "ada",
          modelVersion: "002",
          type: "text",
        },
      },
    };

    const client = await setupClient();

    try {
      const schema = await client.schema
        .classCreator()
        .withClass(classObj)
        .do();
      console.info("created schema >>>", schema);
    } catch (err) {
      console.error("schema already exists");
    }

    if (!FAKE_XORDIA_HISTORY.length) {
      console.error(`Data is empty`);
      process.exit(1);
    }

    if (shouldDeleteAllDocuments) {
      console.info(`Deleting all documents`);
      await deleteAllDocuments();
    }

    console.info(`Inserting documents`);
    await addDocuments(FAKE_XORDIA_HISTORY);
  } catch (err) {
    console.error("error >>>", err.message);
  }
}

Once again, let’s break down what’s happening here.

The migrate function takes in a single argument, shouldDeleteAllDocuments, which determines whether to clear the database when migrating data.

Within our try…catch block, we create a class object named classObj. This object represents the schema for a class in Weaviate (ensure that you've set DATA_CLASSNAME in your .env file), which uses the text2vec-openai vectorizer. This determines how text documents will be configured and represented in the database, and tells Weaviate to vectorize our data using OpenAI's Ada model.

We then create the schema using the client.schema.classCreator().withClass(classObj).do() method chain. This sends a request to the Weaviate server to create the document class defined in classObj. Once the schema is successfully created, we log it to the console with the message created schema >>>. If the class already exists, the error is caught and a simple message is logged instead.

We can check the length of the dummy data being migrated. If empty, the code ends here. We can clear the database using a deleteAllDocuments function (which we’ll add later) if shouldDeleteAllDocuments is true.

Finally, using an addDocuments function (which we’ll add next), we upload all the entries to be vectorized and stored in Weaviate.

Adding documents

We can move on to vectorizing and uploading our text documents. This is actually a two-step process where:

  1. The raw text string is converted into vectors using the OpenAI Ada model
  2. The converted vectors are uploaded to our Weaviate database

Thankfully, these are handled automatically by the Weaviate SDK we’re using. Let’s go ahead and create the function to do this. Open the same src/database.js file and paste in the following:

// code continues from above
const addDocuments = async (data = []) => {
  const client = await setupClient();
  let batcher = client.batch.objectsBatcher();
  let counter = 0;
  const batchSize = 100;

  for (const document of data) {
    const obj = {
      class: process.env.DATA_CLASSNAME,
      properties: { ...document },
    };

    batcher = batcher.withObject(obj);
    counter++;

    // upload the current batch once it reaches batchSize documents
    if (counter === batchSize) {
      await batcher.do();
      counter = 0;
      batcher = client.batch.objectsBatcher();
    }
  }

  const res = await batcher.do();
  return res;
};
// ... code continues below

As before, let’s break down what is happening here.

  1. First, we call the previously defined setupClient() function to set up and get a Weaviate client instance
  2. We initialize a batcher using client.batch.objectsBatcher(), which is used to collect documents and upload them to Weaviate all at once, making the process more efficient
  3. We also define a counter variable and a batchSize variable set to 100. The counter keeps track of how many documents have been added to the current batch, and batchSize defines how many documents each batch should contain
  4. Then, we loop over each document in the data array:
    • For each document, we create an object in the format that Weaviate expects, spreading the document's fields into the object's properties
    • We then add the document object to the batcher using batcher.withObject(obj)
    • If the counter is equal to the batch size (meaning the batch is full), we upload the batch to Weaviate using batcher.do(), reset the counter to 0, and create a new batcher for the next batch of documents

Once all the documents have been processed, if a partial batch remains that hasn't been uploaded yet (because it didn't reach batchSize), the final batcher.do() call after the loop uploads it.

The final step here occurs when the function returns the response from the last batcher.do() call. This response will contain details about the upload, such as if it was successful and any errors that occurred.

In essence, the addDocuments() function helps us efficiently upload a large number of documents to our Weaviate instance by grouping them into manageable batches.
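
As a quick illustration, calling it might look like the snippet below. The field names and entries here are made-up examples in the spirit of our dummy data; use whatever fields your own documents have:

// Weaviate vectorizes each document with the OpenAI module as it's ingested
await addDocuments([
  { title: "The Fall of Xordia", text: "A short account of the city's last days.", date: "1021-03-11" },
  { title: "The Great Calm", text: "Records describe a decade without storms.", date: "1034-07-02" },
]);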

Deleting documents

Let’s add the code for deleteAllDocuments which was used in the migrate function. Below the addDocuments function, add the code below:

// code continues from above
async function deleteAllDocuments() {
  const client = await setupClient();
  const documents = await client.graphql
    .get()
    .withClassName(process.env.DATA_CLASSNAME)
    .withFields("_additional { id }")
    .do();

  for (const document of documents.data.Get[process.env.DATA_CLASSNAME]) {
    await client.data
      .deleter()
      .withClassName(process.env.DATA_CLASSNAME)
      .withId(document._additional.id)
      .do();
  }
}
// ... code continues below

This function is relatively simple.

  1. We set up the client using setupClient and retrieve the ids of all documents with the class name of Document
  2. Then using a for ... of loop, we delete each document using its id

This approach works because we have a small amount of data. For larger databases, a batching technique would be required to delete all the documents, as the limit for each request is only 200 entries at a time.
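
For reference, newer versions of weaviate-ts-client also expose a batch-delete API that removes all objects matching a filter in chunks. A hedged sketch, assuming your client version supports objectsBatchDeleter, might look like this:

// Delete every document whose title matches the wildcard filter, in batches
async function deleteAllDocumentsBatch() {
  const client = await setupClient();
  await client.batch
    .objectsBatchDeleter()
    .withClassName(process.env.DATA_CLASSNAME)
    .withWhere({ path: ["title"], operator: "Like", valueText: "*" })
    .do();
}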

Adding querying functions to the database

Now that we have a way to upload data to our database, let’s add a function to query the database. In this case, we’ll be doing a “nearest neighbor search” to find documents that are similar to our query.

In the same src/database.js file, add in the following:

// code continues from above
async function nearTextQuery({
  concepts = [""],
  fields = "text category",
  limit = 1,
}) {
  const client = await setupClient();
  const res = await client.graphql
    .get()
    .withClassName(process.env.DATA_CLASSNAME)
    .withFields(fields)
    .withNearText({ concepts })
    .withLimit(limit)
    .do();

  return res.data.Get[process.env.DATA_CLASSNAME];
}

export { migrate, addDocuments, deleteAllDocuments, nearTextQuery };

Again, let’s do a breakdown of what’s going on here:

  1. nearTextQuery() is an asynchronous function that accepts an object as an argument. This object can contain three properties:
    • concepts: An array of string(s) that represents the term(s) we’re searching for
    • fields: A string that represents the fields we want to get back in our search results. In this case, we're requesting the text and category fields
    • limit: The maximum number of results we want to get back from our search query
  2. We call the setupClient() function to get a Weaviate client instance
  3. We build the GraphQL query using a chain of methods:
    • client.graphql.get(): Initializes a GraphQL query
    • .withClassName(process.env.DATA_CLASSNAME): We specify the class of documents we want to search within
    • .withFields(fields): We specify which fields we want to be returned in the results
    • .withNearText({ concepts }): This is where the magic happens! We specify the concepts that Weaviate will use to search for semantically similar documents
    • .withLimit(limit): We specify the maximum number of results to return
    • Finally, .do() executes the query
  4. The response from the query is stored in the res variable, and we return the matched documents from res.data.Get
  5. Finally, we export all functions defined here to be used elsewhere

In short, the nearTextQuery() function helps us search for semantically similar documents in our Weaviate instance based on the provided terms.
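
For example, a standalone query for the three documents most similar to a phrase (the search term here is just a placeholder) would look like this:

// Find the three documents most semantically similar to "ancient prophecy"
const results = await nearTextQuery({
  concepts: ["ancient prophecy"],
  fields: "title text",
  limit: 3,
});
console.log(results);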

Let’s migrate our data so we can query it in the next section. Open your terminal and run npm run start "migrate".

Combining vector embeddings and AI

Large language models like GPT-3 and ChatGPT are designed to process input and generate useful output, a task that requires understanding the intricate meanings and relationships between words and phrases.

They do this by representing words, sentences, or even entire documents as high-dimensional vectors. By analyzing the similarities and differences between these vectors, the AI model can understand the context, semantics, and even subtle nuances in our language.

So, where do vector databases come in? Let's think of vector databases as librarians for the AI model. In a vast library of books (or, in our case, vectors), the AI model needs to quickly find the books that are most relevant to a specific query. Vector databases enable this by efficiently storing these “books” and providing quick and precise retrieval when needed.

This is crucial for many AI applications. For example, in a chatbot application, the AI model needs to find the most relevant responses to a user's question. It does this by turning the user's question and potential responses into vectors, and then using the vector database to find the response that's most similar to the user's question.

With that in mind, we’ll be using our database above to provide an AI model, GPT-3.5, with context on our own data. This will allow the model to answer questions on data it wasn’t trained on.

AI model setup

Create a new file by running touch src/model.js and paste in the following:

// src/model.js
import { config } from "dotenv";
import { Configuration, OpenAIApi } from "openai";

// Load environment variables before constructing the OpenAI client; this
// module is evaluated at import time, before index.js runs its own config()
config();

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

async function getChatCompletion({ prompt, context }) {
  const chatCompletion = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content: `
        You are a knowledgebase oracle. You are given a question and a context. You answer the question based on the context.
        Analyse the information from the context and draw fundamental insights to accurately answer the question to the best of your ability.
        Context: ${context}
      `,
      },
      { role: "user", content: prompt },
    ],
  });

  return chatCompletion.data.choices[0].message;
}

export { getChatCompletion };

As usual, let’s break down the file:

  1. We import the required modules from the openai package, load our environment variables with dotenv, and initialize an openai instance
  2. We define a getChatCompletion function that takes in a prompt, some context, and configures the GPT-3.5 model to respond as a knowledge base oracle
  3. Finally, we return the response and export the function
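
In isolation, with a made-up context string, using it would look something like this:

// The model answers based only on the supplied context
const answer = await getChatCompletion({
  prompt: "Who ruled Xordia during the Great Calm?",
  context: "Document 1\nDate: 1034-07-02\nTitle: The Great Calm\n...",
});
console.log(answer.content);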

Querying our data

With our vector database and AI model set up, we can finally query our data by combining both systems. Using the power of embeddings and the impressive natural language capabilities of GPT-3.5, we’ll be able to interact with our data in a more expressive and customisable manner.

Start by creating a new file with touch src/index.js, then paste in the following:

import { config } from "dotenv";
import { migrate, nearTextQuery } from "./database.js";
import { getChatCompletion } from "./model.js";

config();

const queryDatabase = async (prompt) => {
  console.info(`Querying database`);
  const questionContext = await nearTextQuery({
    concepts: [prompt],
    fields: "title text date",
    limit: 50,
  });

  const context = questionContext
    .map((context, index) => {
      const { title, text, date } = context;
      return `
        Document ${index + 1}
        Date: ${date}
        Title: ${title}
        ${text}
  `;
    })
    .join("\n\n");

  const aiResponse = await getChatCompletion({ prompt, context });
  return aiResponse.content;
};

const main = async () => {
  const command = process.argv[2];
  const params = process.argv[3];

  switch (command) {
    case "migrate":
      return await migrate(params === "--delete-all");
    case "query":
      return console.log(await queryDatabase(params));
    default:
      // do nothing
      break;
  }
};

main();

In this file, we're bringing together all the work we’ve done so far to allow us to query our data through the command line. As usual, let’s explore what’s going on here:

  1. First, we import the necessary modules and set up our environment variables with the dotenv package
  2. Next, we create a queryDatabase function that takes in a text prompt, which we use to perform a “near text” query on our vector database. We limit our results to 50, and we're specifically asking for the 'title', 'text', and 'date' fields of the matched concepts
  3. This basically returns documents that are semantically similar to any significant terms in our search query (embeddings are powerful!)
  4. We then map over the received context, format it, and pass it on to the AI model to generate a completion. Using the context, the natural language processing (NLP) abilities of GPT-3.5 shine as it’s able to generate much more accurate and meaningful responses based on our data
  5. Finally, we reach the main function. Here, we use command-line arguments to perform various tasks. If we pass migrate, we get to migrate our data (with an optional --delete-all flag, just in case we want to clean our slate and start afresh), and with query, we can test our query function

Testing our query

Congratulations. If you got this far you deserve a pat on the back — and you can finally test your code.

Open the terminal and run this command:

npm run start "query" "what are the 3 most impressive achievements of humanity in the story?"

The query is sent to your Weaviate vector database, where it is vectorized and compared against the stored vectors, and the 50 most similar documents are returned based on their text. This context data is then formatted and sent along with your query to OpenAI’s GPT-3.5 model, where it is processed and a response is generated.

If all goes well, you should get a response similar to the one below:

[Image: the response from our query test]

Feel free to explore this fictional world with more queries, or better yet, bring your own data and witness the power of vectors and embeddings firsthand.

If you run into any errors at this point, compare your code with the final version here and ensure you have a .env file created and filled in.

Conclusion and next steps

Throughout this tutorial, we've lightly explored the powerful capabilities of vectors and vector databases. Using tools like Weaviate and GPT-3.5, we've seen firsthand the potential these technologies have to shape AI applications, from improving personalized chatbots to enhancing machine learning algorithms. Make sure you take a look at our GitHub too!

However, this is just the beginning. If you’d like to learn more about working with vector databases, consider:

  • Diving deeper into advanced concepts such as working with vector metadata, sharding, compression for more flexible and efficient data storage and retrieval
  • Experimenting with more sophisticated ways of integrating vector embeddings into AI applications for richer, more nuanced results

Thanks for sticking with it to the very end. Hopefully, this was a productive use of your time.

