What is a Vector Database?
A vector database stores vectors (points in a high-dimensional space) so that data with similar meaning is positioned close together. These vectors are generated by embedding models; one such model is nomic-embed-text, which can be downloaded using Ollama.
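For example, assuming Ollama is already installed locally, the model can be pulled with a single command:

```shell
# Download the nomic-embed-text embedding model (assumes Ollama is installed)
ollama pull nomic-embed-text
```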
Why Vector DB in RAG?
One-hot encoding is a technique used to convert categorical data (like words) into binary vectors.
How it works:
Each unique word in the vocabulary is mapped to a vector that is all zeros except for a single 1 at that word's index. A sentence can then be represented by combining the one-hot vectors of its words.
Example:
Today is Wednesday
Tomorrow is Thursday
I am travelling Today
Wednesday is a nice series
Vocabulary values:
[Today, is, Wednesday, Tomorrow, Thursday, I, am, travelling, a, nice, series]
Vector representation:
Line 1 = [1,1,1,0,0,0,0,0,0,0,0]
Line 2 = [0,1,0,1,1,0,0,0,0,0,0]
Line 3 = [1,0,0,0,0,1,1,1,0,0,0]
Line 4 = [0,1,1,0,0,0,0,0,1,1,1]
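The vocabulary and vectors above can be reproduced with a few lines of Python (a minimal sketch; the vocabulary is built in order of first appearance):

```python
# One-hot / bag-of-words encoding for the example sentences above
sentences = [
    "Today is Wednesday",
    "Tomorrow is Thursday",
    "I am travelling Today",
    "Wednesday is a nice series",
]

# Build the vocabulary in order of first appearance
vocab = []
for s in sentences:
    for word in s.split():
        if word not in vocab:
            vocab.append(word)

def encode(sentence):
    # 1 if the vocabulary word appears in the sentence, else 0
    words = sentence.split()
    return [1 if w in words else 0 for w in vocab]

for s in sentences:
    print(encode(s))
```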
Disadvantages:
No semantic meaning – "Wednesday" and "Thursday" are just as far apart as "Wednesday" and "travelling"
High dimensionality – each vector is as long as the entire vocabulary, and mostly zeros
Not scalable – adding a new word to the vocabulary changes the length of every vector
Because of these limitations, modern RAG systems use vector databases where chunks are converted into vectors in a high-dimensional space, where similar meanings are positioned close together.
How Data is Stored In a vector DB:
Documents will be split into chunks. Each chunk will be converted into a vector using an embedding model. The resulting vector will be stored in the vector DB. Chunks with similar semantic meaning are stored closer together in vector space.
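A minimal sketch of that pipeline, where `embed` is just a stand-in for a real embedding model and the "DB" is an in-memory list:

```python
import math

def embed(chunk: str) -> list[float]:
    # Stand-in for a real embedding model (e.g. nomic-embed-text);
    # here we just hash characters into a tiny 4-dimensional unit vector.
    vec = [0.0] * 4
    for i, ch in enumerate(chunk):
        vec[i % 4] += ord(ch)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# "Vector DB": a list of (vector, chunk) pairs
store = []

def ingest(document: str, chunk_size: int = 40):
    # Split the document into fixed-size chunks and store each chunk's vector
    for i in range(0, len(document), chunk_size):
        chunk = document[i:i + chunk_size]
        store.append((embed(chunk), chunk))

ingest("A vector database stores embeddings so that similar chunks sit close together.")
print(len(store))  # number of chunks stored
```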
Similarity Search
When a user query arrives, it is first converted into a vector with the same embedding model. The vector DB then searches for the stored vectors that are closest to the query vector.
To calculate the distance, we can use:
Euclidean distance (straight-line distance, based on the Pythagorean theorem)
Manhattan distance (sum of the absolute differences along each dimension)
Cosine similarity (measures the angle between two vectors; a smaller angle means the vectors are more similar)
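These three measures can be sketched in plain Python (assuming equal-length numeric vectors):

```python
import math

def euclidean(a, b):
    # Straight-line distance (Pythagorean theorem generalised to n dimensions)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences along each axis
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors: 1.0 means same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

q = [1.0, 0.0]
v = [0.0, 1.0]
print(euclidean(q, v))          # sqrt(2)
print(manhattan(q, v))          # 2.0
print(cosine_similarity(q, v))  # 0.0 (orthogonal vectors)
```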
Comparing the query against every stored vector (exact k-nearest-neighbour, or KNN, search) becomes computationally expensive as the collection grows. Vector DBs therefore use approximate nearest neighbour (ANN) algorithms, which trade a little accuracy for much faster search.
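The exact (brute-force) KNN search is easy to sketch; the `knn` helper and the sample `vectors` below are illustrative, not a real ANN index:

```python
import heapq
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, vectors, k=2):
    # Exact k-nearest neighbours: score every stored vector against the
    # query -- O(n) per query, which is what ANN indexes avoid.
    return heapq.nlargest(k, vectors,
                          key=lambda item: cosine_similarity(query, item[0]))

vectors = [
    ([1.0, 0.0], "chunk about dogs"),
    ([0.9, 0.1], "chunk about puppies"),
    ([0.0, 1.0], "chunk about taxes"),
]
for vec, text in knn([1.0, 0.05], vectors):
    print(text)
```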
Popular Vector DBs
Some of the popular vector databases are:
Chroma
FAISS
Pinecone
Qdrant – commonly used for embeddings, semantic search, and image similarity search.
MongoDB – also offers vector search support (Atlas Vector Search)
End-to-End Flow
Data Ingestion
Data or documents will be split into chunks.
Each chunk will be converted into a vector using an embedding model.
Stored in the vector DB.
Data Retrieval
The user query is converted into a vector using the same embedding model. Semantically related vectors are then retrieved from the vector DB using the search algorithms described above. The retrieved chunks, along with the user query, are provided to the LLM as context, and the LLM generates the final answer in human-readable form.
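The whole flow can be sketched end to end. `embed` here is a toy word-hashing stand-in for a real embedding model, and the final prompt is simply printed instead of being sent to an LLM:

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedder: hashes each word into one of 8 buckets, then normalises.
    # A real system would call an embedding model such as nomic-embed-text.
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[sum(map(ord, word)) % 8] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are already unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# --- Ingestion: chunk -> embed -> store ---
chunks = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
]
store = [(embed(c), c) for c in chunks]

# --- Retrieval: embed the query, fetch the closest chunks, build a prompt ---
query = "What is the capital of France?"
qvec = embed(query)
top = sorted(store, key=lambda item: cosine(qvec, item[0]), reverse=True)[:2]
context = "\n".join(text for _, text in top)

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM
```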