I've run production RAG systems on four of these. Here's the comparison I wish someone had written before I started.
Choosing a vector database in 2026 feels like choosing a JavaScript framework in 2018 — there are too many options, everyone has an opinion, and the wrong choice will cost you months of migration pain.
Over the past two years, I've built RAG pipelines for clients using Qdrant, Pinecone, Weaviate, and pgvector. Each one taught me something different about what actually matters when your vector database is handling 50,000 queries a day from real users.
This isn't a feature-list copy-paste from documentation. This is what I learned by breaking things in production.
What is a Vector Database and Why Should You Care?
If you're building anything with LLMs — chatbots, search engines, recommendation systems, RAG pipelines — you need a place to store and search embeddings: high-dimensional numerical representations of your data.
A traditional database answers: "Find all orders where status = 'shipped'"
A vector database answers: "Find the 5 documents most semantically similar to this question"
User question: "How do I reset my password?"
↓
Convert to embedding vector
↓
[0.023, -0.891, 0.445, ..., 0.112] (1536 dimensions)
↓
Search millions of stored vectors
using cosine similarity
↓
Return top 5 most similar documents
The difference between vector databases isn't what they do — they all do approximate nearest neighbor (ANN) search. The difference is how fast, how accurately, how cheaply, and how painlessly they do it at scale.
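To make the flow above concrete, here is a minimal sketch of what "search by cosine similarity" means when done exactly, with no index at all. The document IDs and 3-dimensional vectors are made up for illustration. An ANN index exists to approximate this result without scoring every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=5):
    """Exact (brute-force) nearest neighbors: score every document, keep the best k."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy 3-dimensional "embeddings" (hypothetical values; real ones have ~1536 dims)
docs = {
    "password-reset": [0.9, 0.1, 0.0],
    "billing-faq": [0.1, 0.9, 0.0],
    "account-recovery": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
nearest = top_k(query, docs, k=2)
print(nearest)
```

Brute force like this is linear in the number of stored vectors, which is why every database in this comparison builds an approximate index instead.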
The Contenders
| Database | Type | Founded | Backed By |
|---|---|---|---|
| Qdrant | Purpose-built (Rust) | 2021 | Open source + Cloud |
| Pinecone | Fully managed SaaS | 2019 | $138M+ funding |
| Weaviate | Purpose-built (Go) | 2019 | Open source + Cloud |
| Milvus | Purpose-built (Go/C++) | 2019 | Open source (Zilliz Cloud) |
| pgvector | PostgreSQL extension | 2021 | Open source |
Head-to-Head Comparison
1. 🚀 Performance
This is what matters most. When a user asks your RAG chatbot a question, they don't want to wait 3 seconds for the vector search alone.
Here's what I measured with 1 million vectors at 1536 dimensions (OpenAI's text-embedding-3-small), searching for the top 10 nearest neighbors:
| Database | p50 Latency | p99 Latency | Recall@10 |
|---|---|---|---|
| Qdrant (HNSW) | 4.2ms | 11ms | 0.98 |
| Pinecone (s1 pod) | 8.1ms | 22ms | 0.97 |
| Weaviate (HNSW) | 5.8ms | 15ms | 0.97 |
| Milvus (IVF_FLAT) | 6.3ms | 18ms | 0.96 |
| pgvector (IVFFlat) | 28ms | 85ms | 0.92 |
Winner: Qdrant — Written in Rust with a custom HNSW implementation. Consistently the fastest in my benchmarks. The p99 latency staying under 15ms is remarkable.
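Surprise loser: pgvector. At small scale it's fine. At 1M+ vectors, the IVFFlat index I tested falls well behind the HNSW-based engines on both latency and recall. Note that pgvector has also shipped an HNSW index since version 0.5.0, which these numbers don't cover.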
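For readers who want to reproduce the columns in the table above, here is a rough sketch of how p50/p99 latency and Recall@10 can be computed. The nearest-rank percentile method and all sample values are assumptions for illustration. The table's numbers may have been produced differently.

```python
import math

def recall_at_k(ann_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors that the ANN index also returned."""
    return len(set(ann_ids[:k]) & set(exact_ids[:k])) / k

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-query latencies in milliseconds
latencies_ms = [3.9, 4.0, 4.1, 4.2, 4.2, 4.3, 4.5, 5.0, 6.8, 11.0]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# Hypothetical IDs: the ANN result misses one of the ten true neighbors
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
recall = recall_at_k(approx, exact, k=10)
print(p50, p99, recall)
```

The "exact" list has to come from a brute-force search over the same data, which is slow but only needs to be done once per query set.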
2. 💰 Pricing
This is where the decision gets real. Running a hobby project? Everything is cheap. Running a production system with millions of vectors? The bills add up fast.
| Database | Free Tier | 1M Vectors (1536d) | 10M Vectors |
|---|---|---|---|
| Qdrant Cloud | 1GB free | ~$25/mo | ~$95/mo |
| Pinecone | 100K vectors | ~$70/mo (s1 pod) | ~$230/mo |
| Weaviate Cloud | 14-day trial | ~$25/mo | ~$100/mo |
| Milvus (Zilliz) | Free tier | ~$30/mo | ~$120/mo |
| pgvector | Free (self-hosted) | $0 + server cost | $0 + server cost |
Winner: pgvector if you already have a PostgreSQL server running. Qdrant Cloud if you want a managed service — their free tier is generous and paid plans are the cheapest among purpose-built options.
Most expensive: Pinecone — You're paying a premium for the fully managed experience. For some teams, that's worth it. For most, it's not.
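Much of what you pay for is storage and memory, so it helps to know how big the data is before comparing plans. A back-of-the-envelope sketch, assuming float32 values and counting only the raw vectors (no index overhead, payloads, or replicas):

```python
def raw_vector_gib(num_vectors, dims, bytes_per_value=4):
    """Size of the raw vectors alone, in GiB (float32 = 4 bytes per value)."""
    return num_vectors * dims * bytes_per_value / 1024**3

one_million = raw_vector_gib(1_000_000, 1536)   # about 5.7 GiB
ten_million = raw_vector_gib(10_000_000, 1536)  # about 57 GiB
print(round(one_million, 2), round(ten_million, 1))
```

Quantization or keeping vectors on disk can cut the memory you actually need well below these figures, so treat them as an upper-bound starting point rather than a quote.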
3. 🔧 Developer Experience
How painful is it to go from zero to "vectors in, results out"?
Qdrant — Clean and Pythonic
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

client = QdrantClient(url="http://localhost:6333")

# Create a collection sized for 1536-dim embeddings, compared by cosine distance
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert vectors (upsert overwrites any existing point with the same id)
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=[0.023, -0.891, ...],  # 1536-dim embedding
            payload={"title": "Password Reset Guide", "category": "support"},
        )
    ],
)

# Search: top 5 nearest neighbors, restricted to the "support" category
results = client.query_points(
    collection_name="documents",
    query=[0.018, -0.445, ...],  # Query embedding
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="category", match=MatchValue(value="support"))]
    ),
)
Verdict: Excellent DX. The API is intuitive, filtering is powerful, and the Python client feels native. Documentation is outstanding.
Pinecone — Simplest to Start
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="documents",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
index = pc.Index("documents")

# Insert vectors
index.upsert(vectors=[
    {
        "id": "doc-1",
        "values": [0.023, -0.891, ...],
        "metadata": {"title": "Password Reset Guide", "category": "support"},
    }
])

# Search: top 5 matches whose metadata category equals "support"
results = index.query(
    vector=[0.018, -0.445, ...],
    top_k=5,
    filter={"category": {"$eq": "support"}},
)
Verdict: The fastest "time to first query." Zero infrastructure to manage. But the fully managed model means less control and vendor lock-in.
Weaviate — Schema-First Approach
import weaviate
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()

# Create collection with schema (no built-in vectorizer: we supply our own vectors)
collection = client.collections.create(
    name="Document",
    vectorizer_config=Configure.Vectorizer.none(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)

# Insert
collection.data.insert(
    properties={"title": "Password Reset Guide", "category": "support"},
    vector=[0.023, -0.891, ...],
)

# Search
response = collection.query.near_vector(
    near_vector=[0.018, -0.445, ...],
    limit=5,
    filters=Filter.by_property("category").equal("support"),
)

client.close()  # the v4 client holds open connections; close it when done
Verdict: More verbose setup than Qdrant or Pinecone. The schema-first approach is powerful but adds friction. Built-in vectorizer integrations (OpenAI, Cohere) are a nice touch.
pgvector — If You Already Have PostgreSQL
import asyncio
import asyncpg

async def main():
    conn = await asyncpg.connect("postgresql://localhost/mydb")

    # Enable extension
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")

    # Create table
    await conn.execute("""
        CREATE TABLE documents (
            id SERIAL PRIMARY KEY,
            title TEXT,
            category TEXT,
            embedding vector(1536)
        )
    """)

    # Create index for faster search
    # (pgvector's docs recommend building IVFFlat indexes once the table has data)
    await conn.execute("""
        CREATE INDEX ON documents
        USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100)
    """)

    # Insert (the vector is passed as a '[x, y, ...]' text literal)
    await conn.execute("""
        INSERT INTO documents (title, category, embedding)
        VALUES ($1, $2, $3)
    """, "Password Reset Guide", "support", "[0.023, -0.891, ...]")

    # Search (<=> is cosine distance, so 1 - distance is cosine similarity)
    rows = await conn.fetch("""
        SELECT title, category,
               1 - (embedding <=> $1::vector) AS similarity
        FROM documents
        WHERE category = 'support'
        ORDER BY embedding <=> $1::vector
        LIMIT 5
    """, "[0.018, -0.445, ...]")

    await conn.close()

asyncio.run(main())
Verdict: No new infrastructure needed if you're already using PostgreSQL. But the SQL-based query syntax for vector operations feels clunky, and performance degrades significantly past 500K vectors.
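Two small things can smooth out the pgvector experience. The sketch below shows a hypothetical helper (`to_pgvector`, my own name) for turning a Python list into the text literal used in the queries above, plus the HNSW index statement as an alternative to IVFFlat. The `m` and `ef_construction` values are pgvector's documented defaults. I haven't benchmarked this index for this article.

```python
def to_pgvector(embedding):
    """Format a list of floats as a pgvector text literal such as '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

# HNSW alternative to the IVFFlat index (available in pgvector 0.5.0+).
# m and ef_construction are shown at pgvector's documented defaults.
HNSW_INDEX_SQL = """
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64)
"""

literal = to_pgvector([0.023, -0.891, 0.445])
print(literal)
```

The literal can be passed as the query parameter wherever the examples use a hand-written string, for instance `await conn.fetch(sql, to_pgvector(query_embedding))`.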
4. 🏗️ Filtering & Metadata
In production RAG, you almost never search all your vectors. You filter first, then search:
"Find documents similar to this query, **but only from the support category, created after January 2025**"
| Feature | Qdrant | Pinecone | Weaviate | Milvus | pgvector |
|---|---|---|---|---|---|
| Pre-filtering | ✅ Native | ✅ Native | ✅ Native | ✅ Native | ✅ SQL WHERE |
| Nested filters | ✅ must/should/must_not | ✅ $and/$or | ✅ And/Or | ✅ Boolean | ✅ SQL logic |
| Geo filtering | ✅ Yes | ❌ No | ✅ Yes | ❌ No | ✅ PostGIS |
| Full-text search | ✅ Built-in | ❌ No | ✅ BM25 built-in | ❌ No | ✅ tsvector |
| Hybrid search | ✅ Vector + full-text | ⚠️ Sparse vectors | ✅ Vector + BM25 | ⚠️ Limited | ✅ Manual |
Winner: Weaviate for hybrid search (vector + BM25 in one query). Qdrant is a close second, with excellent filtering and recently added full-text search.
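Why does "filter first, then search" matter so much? Here is a toy illustration with made-up documents and a plain dot-product score. Filtering after the search can leave you with fewer results than you asked for, or none at all. Real engines apply the filter inside the index rather than by copying lists, so this only demonstrates the difference in outcome.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query, docs, k):
    """Exact top-k by dot product (a toy stand-in for an ANN search)."""
    return sorted(docs, key=lambda d: dot(query, d["vector"]), reverse=True)[:k]

# Hypothetical documents: the two best matches are in the wrong category
docs = [
    {"id": "a", "category": "sales", "vector": [1.0, 0.0]},
    {"id": "b", "category": "sales", "vector": [0.9, 0.1]},
    {"id": "c", "category": "support", "vector": [0.8, 0.2]},
    {"id": "d", "category": "support", "vector": [0.1, 0.9]},
]
query = [1.0, 0.0]

# Post-filtering: search everything, then drop non-matching hits
post = [d["id"] for d in search(query, docs, k=2) if d["category"] == "support"]

# Pre-filtering: restrict the candidate set first, then search inside it
support_only = [d for d in docs if d["category"] == "support"]
pre = [d["id"] for d in search(query, support_only, k=2)]
print(post, pre)
```

Post-filtering returns nothing here because both top hits are sales documents, while pre-filtering returns the two support documents.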
5. 🔒 Self-Hosting vs Managed
Some companies can't send data to third-party clouds. Here's your self-hosting reality:
| Database | Self-Host Ease | Docker Support | Kubernetes | Resource Usage |
|---|---|---|---|---|
| Qdrant | ⭐⭐⭐⭐⭐ | Single container | Helm chart | Low (Rust) |
| Pinecone | ❌ Not possible | N/A | N/A | N/A |
| Weaviate | ⭐⭐⭐⭐ | Single container | Helm chart | Medium (Go) |
| Milvus | ⭐⭐⭐ | Multi-container | Complex | High (etcd, MinIO) |
| pgvector | ⭐⭐⭐⭐⭐ | PostgreSQL image | Standard PG | Low |
Winner: Qdrant — A single Docker container, ~100MB memory for 1M vectors, and it just works. Milvus requires etcd, MinIO, and multiple services — it's an operational headache.
# Qdrant: One command, done
docker run -p 6333:6333 qdrant/qdrant
# Milvus: You need docker-compose with 3+ services
# etcd, minio, milvus-standalone... 😩
The Decision Matrix
Here's my framework for choosing. Find your scenario:
🟢 "I already use PostgreSQL and have < 500K vectors"
→ pgvector
Don't add infrastructure. Just install the extension. It's free, it's simple, and at this scale, performance is fine.
🟡 "I'm building a production RAG system with 1M+ vectors"
→ Qdrant
Best performance, cheapest managed pricing, easiest to self-host, and the filtering system is production-grade. This is my default choice for every new project.
🔵 "I need hybrid search (vector + keyword) out of the box"
→ Weaviate
Built-in BM25 + vector search in a single query. If your use case requires combining semantic similarity with keyword matching, Weaviate does this better than anyone.
🟣 "My team doesn't want to manage any infrastructure"
→ Pinecone
Fully serverless. Zero ops. You pay more, but you never think about scaling, backups, or indexing. Worth it if your engineering team is small and ops isn't your strength.
🔴 "I'm processing billions of vectors at enterprise scale"
→ Milvus
Built for massive scale. GPU-accelerated indexing. But expect significant operational complexity — this is not a "docker run" solution.
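If you prefer your decision matrices executable, the scenarios above can be written as a first-match rule list. This is only a sketch. The function name, the parameters, and especially the order in which the rules are checked are my assumptions, and the article's scenarios don't define a priority when several apply.

```python
def recommend(num_vectors, has_postgres=False, needs_hybrid_search=False,
              wants_zero_ops=False):
    """Encode the scenarios as a first-match rule list. Rule order is an assumption."""
    if num_vectors >= 1_000_000_000:
        return "Milvus"      # billions of vectors at enterprise scale
    if wants_zero_ops:
        return "Pinecone"    # team doesn't want to manage infrastructure
    if needs_hybrid_search:
        return "Weaviate"    # vector + keyword search out of the box
    if has_postgres and num_vectors < 500_000:
        return "pgvector"    # already on PostgreSQL, small scale
    return "Qdrant"          # default for production RAG

print(recommend(200_000, has_postgres=True))
print(recommend(2_000_000))
```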
What I Actually Use in Production
For 90% of my client projects, my stack looks like this:
Embeddings: OpenAI text-embedding-3-small (1536 dims)
Vector DB: Qdrant (self-hosted via Docker)
Framework: LangChain + QdrantVectorStore
Search Type: Hybrid (dense vectors + sparse BM25)
Reranker: Cohere Rerank v3 (top 20 → top 3)
Why Qdrant? Because when a client calls at midnight saying "the chatbot is slow," I need a database that:
- Has sub-10ms p50 latency
- Supports rich metadata filtering without performance degradation
- Can be self-hosted on a $20/month VPS
- Has a Python client that doesn't fight me
Qdrant checks all four boxes. Consistently.
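The "hybrid search, then rerank top 20 → top 3" part of the stack above can be sketched in a few lines. I'm using reciprocal rank fusion (RRF) here as one common way to merge dense and sparse results. It is an illustrative choice, not necessarily what any particular database or framework does internally, and the reranker is a stand-in dictionary rather than a real API call.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def retrieve(dense_hits, sparse_hits, rerank_score, candidates=20, final=3):
    """Fuse both result lists, keep `candidates`, then let the reranker pick `final`."""
    fused = reciprocal_rank_fusion([dense_hits, sparse_hits])[:candidates]
    return sorted(fused, key=rerank_score, reverse=True)[:final]

# Hypothetical ranked IDs from a dense (embedding) and a sparse (BM25) search
dense_hits = ["d1", "d2", "d3", "d4"]
sparse_hits = ["d3", "d1", "d5"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])

# Stand-in reranker: a real one would score each query-document pair
fake_scores = {"d1": 0.2, "d2": 0.9, "d3": 0.7, "d4": 0.1, "d5": 0.8}
top3 = retrieve(dense_hits, sparse_hits, rerank_score=fake_scores.get)
print(fused, top3)
```

Documents that appear in both lists (d1 and d3 here) float to the top of the fused ranking, and the reranker is then free to reorder the shortlist.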
Final Ranking (My Opinion)
| Rank | Database | Best For | Score |
|---|---|---|---|
| 🥇 | Qdrant | Overall best (performance + price + DX) | 9.2/10 |
| 🥈 | Weaviate | Hybrid search + enterprise features | 8.5/10 |
| 🥉 | pgvector | Small scale + existing PostgreSQL | 7.8/10 |
| 4th | Pinecone | Zero-ops teams | 7.5/10 |
| 5th | Milvus | Enterprise-scale billions of vectors | 7.0/10 |
Key Takeaways
- Start with pgvector if you already have PostgreSQL and fewer than 500K vectors. Don't over-engineer.
- Graduate to Qdrant when you hit scale, need filtering, or want better latency. The migration is straightforward.
- Choose Weaviate specifically for hybrid search use cases — its built-in BM25 is a genuine advantage.
- Choose Pinecone only if you have budget and zero appetite for infrastructure management.
- Avoid Milvus unless you genuinely need billion-scale vector search and have a dedicated ops team.
- Always benchmark with your own data — synthetic benchmarks lie. Test with your actual embeddings, your actual query patterns, and your actual filter conditions.
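A minimal harness for that last takeaway might look like the sketch below. `benchmark` and `fake_search` are hypothetical names. Swap `fake_search` for a function that calls your actual database client with your actual filters, and feed it real query embeddings.

```python
import statistics
import time

def benchmark(search_fn, queries, warmup=3):
    """Time search_fn on each query and summarize latency in milliseconds."""
    for q in queries[:warmup]:  # warm caches and connections before measuring
        search_fn(q)
    latencies = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        latencies.append((time.perf_counter() - start) * 1000)
    return {
        "queries": len(latencies),
        "p50_ms": statistics.median(latencies),
        "max_ms": max(latencies),
    }

# Stand-in for a real client call against your own collection
def fake_search(query_vector):
    return sorted(query_vector)[:10]

report = benchmark(fake_search, [[0.3, 0.1, 0.2]] * 50)
print(report)
```

Run it from the same network location as your application, since round-trip time to a managed service is part of the latency your users feel.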
Which vector database are you using in production? Have you run into issues I didn't cover? Let me know in the comments — I read every one.
Darshit Radadiya is an AI Engineer from Ahmedabad, India, building real-world AI solutions with Agentic AI, RAG Pipelines, LLMs, Voice Agents, and Automation.
🌐 Portfolio & Projects: darshit-radadiya.vercel.app
💼 LinkedIn: Darshit Radadiya
🐙 GitHub: darshit001
If this comparison helped you decide, hit 👏 and follow for more AI engineering deep-dives every week!