🚀 Next Up: Vector Databases (Hands-On!)
We’re about to build a tiny semantic search engine from scratch.
The secret ingredient? Vector databases — a tool that's surprisingly easy to grasp, yet powerful enough to underpin modern AI features like semantic search and RAG.
By the end of this article, you’ll:
- Understand what embeddings are (and why they’re magic)
- Learn what a vector DB does
- Spin up Qdrant in Docker
- Store and search your own embeddings
- See how this forms the foundation for RAG (Retrieval-Augmented Generation)
Step 1 — From Words to Numbers: Embeddings
Computers don’t understand “dog” or “car” like we do.
Instead, we turn these into embeddings — long lists of numbers that capture meaning and relationships between concepts.
Example:

```
"dog"   → [0.1, 0.6, -0.4, ...]
"puppy" → a vector very close to "dog"
"car"   → far away from both
```
Different embedding models (like OpenAI's or SentenceTransformers) can convert:
- Text → a fixed-length vector (384, 768, or 1536 dimensions, depending on the model)
- Images → a visual vector
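To make this concrete, here's a minimal sketch (using sentence-transformers, the same library we install in Step 4) showing that "dog" and "puppy" really do land closer together than "car". The exact scores will vary by model:

```python
# similarity_demo.py — sanity check that embeddings capture meaning
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # outputs 384-dim vectors

dog, puppy, car = model.encode(["dog", "puppy", "car"])

# Cosine similarity: 1.0 = same direction, ~0 = unrelated
print("dog vs puppy:", util.cos_sim(dog, puppy).item())  # noticeably higher...
print("dog vs car:  ", util.cos_sim(dog, car).item())    # ...than this one
```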
📖 Want to go deeper?
Step 2 — Why a Vector Database?
Once you have embeddings, you need a way to:
- Store them (there can be millions)
- Quickly find which ones are most similar to a new query
That’s exactly what a Vector Database does.
Instead of “WHERE id = 123” lookups, it supports similarity search — finding the closest k vectors to your query in high-dimensional space.
Key terms you’ll run into:
- ANN Search — Approximate Nearest Neighbor, fast search in huge vector spaces
- Index — special data structure to speed up similarity lookups
- Top-K Search — get the k most similar vectors (e.g., top 5 related docs)
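Before reaching for an index, it helps to see what Top-K search means at its simplest. Here's a brute-force sketch in plain NumPy over random vectors — exactly the full scan that an ANN index exists to avoid once you have millions of vectors:

```python
# brute_force_topk.py — exact top-k by cosine similarity (what ANN approximates)
import numpy as np

def top_k(query: np.ndarray, vectors: np.ndarray, k: int = 5):
    # Normalize so the dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                   # one similarity score per stored vector
    idx = np.argsort(-scores)[:k]   # indices of the k best matches
    return idx, scores[idx]

# 10,000 random 384-dim "documents" and one random query
db = np.random.rand(10_000, 384).astype(np.float32)
query = np.random.rand(384).astype(np.float32)
print(top_k(query, db, k=5))
```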
Some popular choices:
| Tool | Notes |
|---|---|
| FAISS | Local, in-memory, blazing fast |
| Qdrant | Rust-based, open-source, persistent — we'll use this |
| Pinecone / Weaviate | Fully managed cloud services |
📖 More on concepts: Qdrant Overview | Qdrant Concepts
🎥 Optional: FAISS Top-K Search explained
Step 3 — Setting Up Qdrant
We’ll run Qdrant locally using Docker:
```bash
docker run -p 6333:6333 qdrant/qdrant
```
(If Docker isn’t an option, try Qdrant Cloud Free Tier).
You can also explore the Qdrant dashboard locally by visiting http://localhost:6333/dashboard in your browser.
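Once the container is up (and qdrant-client from Step 4 is installed), a two-line check confirms you can reach the server:

```python
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())  # a fresh instance prints an empty collection list
```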
Step 4 — Creating Our Project
Let’s set up a Python environment for our experiment:
```bash
mkdir vector && cd vector
python -m venv venv
.\venv\Scripts\activate    # Windows — on macOS/Linux: source venv/bin/activate
pip install openai qdrant-client numpy sentence-transformers
```
Step 5 — Storing Embeddings in Qdrant
We’ll embed a few sample sentences and store them in Qdrant:
```python
# embed_and_store.py
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct, VectorParams, Distance
import uuid

# Sample documents
docs = [
    "Apple is a fruit",
    "Dogs are loyal animals",
    "SpaceX launches rockets",
    "Oranges are citrus",
    "Cats are independent",
]

# Connect to Qdrant
client = QdrantClient(host="localhost", port=6333)

# Create (or reset) a collection
client.recreate_collection(
    collection_name="embeddings",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Load embedding model (all-MiniLM-L6-v2 outputs 384-dim vectors)
embed_model = SentenceTransformer("all-MiniLM-L6-v2")

def get_embedding(txt):
    # .tolist() converts the NumPy array into the plain list Qdrant expects
    return embed_model.encode(txt).tolist()

# Store each document with its original text as payload
points = [
    PointStruct(id=str(uuid.uuid4()), vector=get_embedding(txt), payload={"text": txt})
    for txt in docs
]
client.upsert(collection_name="embeddings", points=points)
print("Stored docs.")
```
Step 6 — Running a Similarity Search
Now, let’s query for something related to rockets:
```python
# embed_and_store.py (continued)
query = get_embedding("Tell me about rockets")

hits = client.search(
    collection_name="embeddings",
    query_vector=query,
    limit=2,
)

for hit in hits:
    print(hit.payload["text"], "Score:", hit.score)
```
Example output:
```
# python embed_and_store.py
SpaceX launches rockets Score: 0.92
Dogs are loyal animals Score: 0.12
```
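Since the collection uses Distance.COSINE, the score is cosine similarity: values near 1 mean the stored text points in nearly the same semantic direction as the query, while values near 0 mean the two are essentially unrelated.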
Step 7 — From Search to AI: RAG
At this point, you’ve built the backbone of a semantic search system.
But right now, you’re only returning similar documents — not answers.
That’s where RAG (Retrieval-Augmented Generation) comes in:
- Fetch the top results from your vector DB
- Feed them into a large language model (LLM)
- Let the model generate an answer using that context
In the next part, we’ll hook Qdrant up to an LLM so it can respond to natural language questions with real context.
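As a preview, here's a minimal sketch of the retrieval half, reusing the client and get_embedding from our script. The answer_with_llm call is a hypothetical placeholder for whichever LLM client you end up using:

```python
# rag_preview.py — the retrieval half of RAG; the LLM call is a placeholder
question = "What do rockets have to do with SpaceX?"

hits = client.search(
    collection_name="embeddings",
    query_vector=get_embedding(question),
    limit=3,
)

# Stitch the retrieved documents into a context block for the prompt
context = "\n".join(hit.payload["text"] for hit in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# answer_with_llm is hypothetical — swap in your LLM of choice
# print(answer_with_llm(prompt))
print(prompt)
```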
💡 Key Takeaways:
- Embeddings turn meaning into numbers
- Vector DBs store and search embeddings efficiently
- Qdrant + SentenceTransformers is an easy, powerful local setup
- RAG is the magic step that turns similarity search into a conversational AI
💻 GitHub Repo
🔗 Other Parts: