Every team building a RAG (Retrieval-Augmented Generation) system faces the same question: which vector database should I use?
pgvector, Qdrant, and Milvus are the three dominant options today, representing three distinct philosophies: lightweight integration, high-performance specialization, and distributed scale. Choosing wrong means expensive migrations when your data grows.
This guide covers the core trade-offs to help you decide once and get it right.
Why Vector DB Selection Matters So Much
A vector database does one core job: find the most similar vectors in high-dimensional space, fast.
When a user asks a question, the LLM needs to retrieve the 5 most relevant passages from 100,000 documents. This isn't exact matching — it's Approximate Nearest Neighbor (ANN) search. Your choice of vector DB determines how fast, how accurate, and how scalable that search will be.
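To make "find the most similar vectors" concrete, here is the exact version of that search in plain Python: score every vector by cosine similarity, sort, take the top k. This linear scan is what ANN indexes like HNSW approximate in sublinear time — the data and k=2 below are toy values for illustration.

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, documents, k=5):
    # Exact nearest-neighbor search: O(n) over all vectors.
    # ANN indexes (HNSW, IVF) trade a little recall to avoid this full scan.
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

docs = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
]
print(top_k([1.0, 0.05], docs, k=2))  # the two vectors nearest the query: ['a', 'b']
```

At 100,000 documents this scan is already noticeably slow per query; that gap is the entire reason vector databases exist.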
Three reasons the choice matters deeply:
- Deep data coupling: Vector embeddings, index structures, and metadata are all stored inside — migrating means reprocessing everything
- Massive performance variance: Same data volume, 10x latency difference between systems
- Dramatic ops complexity difference: From "install a PostgreSQL extension" to "maintain a distributed cluster"
The Three Schools
┌────────────────────────────────────────────────────┐
│ Vector DB Philosophies │
├──────────────┬────────────────┬────────────────────┤
│ pgvector │ Qdrant │ Milvus │
│ Lightweight │ High-perf │ Distributed │
│ Integration │ Specialized │ Scale │
├──────────────┼────────────────┼────────────────────┤
│ PostgreSQL │ Rust-native │ Cloud-native arch │
│ extension │ standalone svc │ independent cluster│
│ Small-medium │ Medium-large │ Hyperscale │
└──────────────┴────────────────┴────────────────────┘
pgvector: Vector Search as a PostgreSQL Column Type
-- Enable the pgvector extension
CREATE EXTENSION vector;

-- 1536 dimensions matches common embedding models (e.g. OpenAI's text-embedding family)
CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(1536)
);

-- HNSW index using cosine distance
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- <=> is cosine distance; 1 - distance converts it to similarity
SELECT content, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1
LIMIT 5;
Strengths: Zero extra ops overhead, full SQL ecosystem, native hybrid search (vector + SQL filters in one query)
Weaknesses: Performance degrades beyond ~5M rows; no real-time multi-tenant index isolation
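The hybrid-search strength is worth a concrete illustration. Below is a minimal Python sketch using psycopg 3 that puts a metadata filter and the ANN `ORDER BY` in one query. It assumes the `documents` table has a `source` column (not part of the schema above — added here for illustration); `to_vector_literal` builds pgvector's text input format.

```python
def to_vector_literal(vec):
    # pgvector's text input format: "[v1,v2,...]"
    return "[" + ",".join(str(x) for x in vec) + "]"

def hybrid_search(conn, query_embedding, source, k=5):
    # conn: a psycopg 3 connection. The WHERE clause is an ordinary SQL
    # filter evaluated alongside the vector ORDER BY -- one query, one engine.
    # Assumes a "source" column exists on documents (illustrative).
    vec = to_vector_literal(query_embedding)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content, 1 - (embedding <=> %s::vector) AS similarity
            FROM documents
            WHERE source = %s
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, source, vec, k),
        )
        return cur.fetchall()
```

This is the pgvector selling point in one function: filtering, joins, and vector search share the same planner and the same transaction.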
Qdrant: A Rust Engine Born for Vector Search
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

results = client.search(
    collection_name="documents",
    query_vector=[0.1, 0.2, ...],  # your query embedding (truncated here)
    query_filter=models.Filter(
        must=[models.FieldCondition(key="source", match=models.MatchValue(value="blog"))]
    ),
    limit=5,
)
Strengths: Lowest latency (Rust), best filtering performance co-optimized with ANN, multi-vector support for multimodal
Weaknesses: Extra service to deploy; distributed mode requires paid plan
Milvus: Industrial-Grade Distributed Vector DB
Strengths: True horizontal scaling (separate data/query/index nodes), billion-scale support, GPU-accelerated indexing
Weaknesses: Complex architecture (etcd + MinIO + Pulsar + multiple node types), steep learning curve
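For symmetry with the snippets above, here is a hedged sketch of a Milvus search via pymilvus' MilvusClient. The collection name, URI, and nprobe value are illustrative assumptions, and the import is deferred into the function so the sketch loads without pymilvus or a running cluster.

```python
# IVF search parameter: higher nprobe = better recall, slower queries.
# 16 is an illustrative starting point, not a recommendation.
search_params = {"metric_type": "COSINE", "params": {"nprobe": 16}}

def search_documents(client, query_vector, limit=5):
    # client: a pymilvus MilvusClient, e.g.
    #   from pymilvus import MilvusClient
    #   client = MilvusClient(uri="http://localhost:19530")  # assumed local deployment
    return client.search(
        collection_name="documents",      # assumed collection name
        data=[query_vector],              # Milvus searches batches of query vectors
        limit=limit,
        search_params=search_params,
        filter='source == "blog"',        # Milvus boolean filter expression
    )
```

Note how much of the tuning surface (metric, nprobe, consistency level, node sizing) Milvus exposes compared to the two options above — that flexibility is exactly where the ops complexity comes from.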
Core Metrics Comparison
| Dimension | pgvector | Qdrant | Milvus |
|---|---|---|---|
| Scale | <5M rows | 1M–100M | 100M+ |
| P99 Latency | 10-100ms | 1-10ms | 5-50ms |
| Ops Complexity | ★☆☆☆☆ | ★★☆☆☆ | ★★★★☆ |
| Filter Queries | Excellent (SQL) | Strong | Medium |
| Horizontal Scale | Limited | Medium (paid) | Native |
| Consistency | Full ACID | Eventual | Eventual |
Decision Tree
How many vectors do you expect?
│
├─ < 1M
│ └─ Already using PostgreSQL? → YES: pgvector / NO: Qdrant
│
├─ 1M – 50M
│ └─ Complex filtering + joins? → YES: pgvector (tuned) or Qdrant
│ → NO: Qdrant (preferred)
│
└─ 50M+ / need horizontal scale → Milvus (only reasonable choice)
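The tree above can be expressed as a small function — the thresholds mirror the diagram and are heuristics, not hard limits:

```python
def pick_vector_db(num_vectors: int, using_postgres: bool = False,
                   complex_filtering: bool = False) -> str:
    # Thresholds taken from the decision tree; treat them as rules of thumb.
    if num_vectors < 1_000_000:
        return "pgvector" if using_postgres else "Qdrant"
    if num_vectors < 50_000_000:
        if complex_filtering:
            return "pgvector (tuned) or Qdrant"
        return "Qdrant"
    return "Milvus"

print(pick_vector_db(500_000, using_postgres=True))   # pgvector
print(pick_vector_db(10_000_000))                     # Qdrant
print(pick_vector_db(200_000_000))                    # Milvus
```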
Production War Stories
pgvector traps: HNSW parameters (m, ef_construction) cannot be changed after index creation — rebuild the index to tune them; always include ORDER BY ... LIMIT N so the planner uses the index instead of a full scan; indexes on the vector type are capped at 2,000 dimensions — for higher-dimensional embeddings use halfvec (pgvector 0.7.0+)
Qdrant traps: Collection vector dimension is immutable; payload indexes aren't auto-created; tune hnsw_config.m for your recall requirements
Milvus traps: Benchmark nlist/nprobe before production; compact() must be called manually to reclaim storage; never use Lite mode in production
Conclusion: There's No Best, Only Most Appropriate
- pgvector: Default choice for PostgreSQL shops, fully sufficient under 5M vectors
- Qdrant: Recommended for most standalone AI applications requiring high performance
- Milvus: Industrial-grade solution for genuinely large-scale distributed scenarios
Selection principle: Build first, optimize later. Don't spin up a Milvus cluster today for data you might have three years from now.
References: pgvector 0.7.0 docs | Qdrant Official Benchmarks | Milvus 2.4 docs | Knowledge Card W12D5