🚀 Next Up: Vector Databases (Hands-On!)
We’re about to build a tiny semantic search engine from scratch.
The secret ingredient? Vector databases — a tool that's surprisingly easy to grasp, yet powerful enough to underpin modern AI features like semantic search and RAG.
By the end of this article, you’ll:
- Understand what embeddings are (and why they’re magic)
- Learn what a vector DB does
- Spin up Qdrant in Docker
- Store and search your own embeddings
- See how this forms the foundation for RAG (Retrieval-Augmented Generation)
Step 1 — From Words to Numbers: Embeddings
Computers don’t understand “dog” or “car” like we do.
Instead, we turn these into embeddings — long lists of numbers that capture meaning and relationships between concepts.
Example:

```
"dog"   → [0.1, 0.6, -0.4, ...]
"puppy" → a vector very close to "dog"
"car"   → far away from both
```
Different embedding models (like OpenAI's or SentenceTransformers) can convert:
- Text → a fixed-length vector (384, 768, or 1536 dimensions, depending on the model)
- Images → a visual vector
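To make this concrete, here's a minimal sketch (using sentence-transformers, the same library we install in Step 4) showing that "dog" and "puppy" really do land closer together than "car". The exact scores will vary by model:

```python
# similarity_demo.py — sanity check that embeddings capture meaning
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # outputs 384-dim vectors

dog, puppy, car = model.encode(["dog", "puppy", "car"])

# Cosine similarity: 1.0 = same direction, ~0 = unrelated
print("dog vs puppy:", util.cos_sim(dog, puppy).item())  # noticeably higher...
print("dog vs car:  ", util.cos_sim(dog, car).item())    # ...than this one
```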
📖 Want to go deeper?
Step 2 — Why a Vector Database?
Once you have embeddings, you need a way to:
- Store them (there can be millions)
- Quickly find which ones are most similar to a new query
That’s exactly what a Vector Database does.
Instead of “WHERE id = 123” lookups, it supports similarity search — finding the closest k vectors to your query in high-dimensional space.
Key terms you’ll run into:
- ANN Search — Approximate Nearest Neighbor, fast search in huge vector spaces
- Index — special data structure to speed up similarity lookups
- Top-K Search — get the k most similar vectors (e.g., top 5 related docs)
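Before reaching for an index, it helps to see what Top-K search means at its simplest. Here's a brute-force sketch in plain NumPy over random vectors — exactly the full scan that an ANN index exists to avoid once you have millions of vectors:

```python
# brute_force_topk.py — exact top-k by cosine similarity (what ANN approximates)
import numpy as np

def top_k(query: np.ndarray, vectors: np.ndarray, k: int = 5):
    # Normalize so the dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                   # one similarity score per stored vector
    idx = np.argsort(-scores)[:k]   # indices of the k best matches
    return idx, scores[idx]

# 10,000 random 384-dim "documents" and one random query
db = np.random.rand(10_000, 384).astype(np.float32)
query = np.random.rand(384).astype(np.float32)
print(top_k(query, db, k=5))
```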
Some popular choices:
| Tool | Notes |
|---|---|
| FAISS | Local, in-memory, blazing fast |
| Qdrant | Rust-based, open-source, persistent — we'll use this |
| Pinecone / Weaviate | Fully managed cloud services |
📖 More on concepts: Qdrant Overview | Qdrant Concepts
🎥 Optional: FAISS Top-K Search explained
Step 3 — Setting Up Qdrant
We’ll run Qdrant locally using Docker:
```bash
docker run -p 6333:6333 qdrant/qdrant
```
(If Docker isn’t an option, try Qdrant Cloud Free Tier).
You can also explore the Qdrant dashboard locally by visiting http://localhost:6333/dashboard in your browser.
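Once the container is up (and qdrant-client from Step 4 is installed), a two-line check confirms you can reach the server:

```python
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())  # a fresh instance prints an empty collection list
```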
Step 4 — Creating Our Project
Let’s set up a Python environment for our experiment:
```bash
mkdir vector && cd vector
python -m venv venv
.\venv\Scripts\activate    # Windows — on macOS/Linux: source venv/bin/activate
pip install openai qdrant-client numpy sentence-transformers
```
Step 5 — Storing Embeddings in Qdrant
We’ll embed a few sample sentences and store them in Qdrant:
```python
# embed_and_store.py
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct, VectorParams, Distance
import uuid

# Sample documents
docs = [
    "Apple is a fruit",
    "Dogs are loyal animals",
    "SpaceX launches rockets",
    "Oranges are citrus",
    "Cats are independent",
]

# Connect to Qdrant
client = QdrantClient(host="localhost", port=6333)

# Create (or reset) a collection
client.recreate_collection(
    collection_name="embeddings",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Load embedding model (all-MiniLM-L6-v2 outputs 384-dim vectors)
embed_model = SentenceTransformer("all-MiniLM-L6-v2")

def get_embedding(txt):
    # .tolist() converts the NumPy array into the plain list Qdrant expects
    return embed_model.encode(txt).tolist()

# Store each document with its original text as payload
points = [
    PointStruct(id=str(uuid.uuid4()), vector=get_embedding(txt), payload={"text": txt})
    for txt in docs
]
client.upsert(collection_name="embeddings", points=points)
print("Stored docs.")
```
Step 6 — Running a Similarity Search
Now, let’s query for something related to rockets:
```python
# embed_and_store.py (continued)
query = get_embedding("Tell me about rockets")

hits = client.search(
    collection_name="embeddings",
    query_vector=query,
    limit=2,
)

for hit in hits:
    print(hit.payload["text"], "Score:", hit.score)
```
Example output:
```
# python embed_and_store.py
SpaceX launches rockets Score: 0.92
Dogs are loyal animals Score: 0.12
```
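Since the collection uses Distance.COSINE, the score is cosine similarity: values near 1 mean the stored text points in nearly the same semantic direction as the query, while values near 0 mean the two are essentially unrelated.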
Step 7 — From Search to AI: RAG
At this point, you’ve built the backbone of a semantic search system.
But right now, you’re only returning similar documents — not answers.
That’s where RAG (Retrieval-Augmented Generation) comes in:
- Fetch the top results from your vector DB
- Feed them into a large language model (LLM)
- Let the model generate an answer using that context
In the next part, we’ll hook Qdrant up to an LLM so it can respond to natural language questions with real context.
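As a preview, here's a minimal sketch of the retrieval half, reusing the client and get_embedding from our script. The answer_with_llm call is a hypothetical placeholder for whichever LLM client you end up using:

```python
# rag_preview.py — the retrieval half of RAG; the LLM call is a placeholder
question = "What do rockets have to do with SpaceX?"

hits = client.search(
    collection_name="embeddings",
    query_vector=get_embedding(question),
    limit=3,
)

# Stitch the retrieved documents into a context block for the prompt
context = "\n".join(hit.payload["text"] for hit in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# answer_with_llm is hypothetical — swap in your LLM of choice
# print(answer_with_llm(prompt))
print(prompt)
```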
💡 Key Takeaways:
- Embeddings turn meaning into numbers
- Vector DBs store and search embeddings efficiently
- Qdrant + SentenceTransformers is an easy, powerful local setup
- RAG is the magic step that turns similarity search into a conversational AI
💻 GitHub Repo
🔗 Other Parts: