"Should we use PostgreSQL as our vector database?"
I've heard this question a lot in 2026. pgvector is everywhere; nearly every managed Postgres offering now ships it.
But is it actually better than Pinecone or Weaviate?
I tested all three with 10 million vectors (1536 dimensions, OpenAI embeddings). Here's what I found.
The Vector Database Landscape in 2026
Quick summary:
- Pinecone: Fully managed, serverless, 70% market share
- Weaviate: Hybrid search (vectors + BM25), open-source
- pgvector: PostgreSQL extension, ACID compliance
The big shift in 2026: pgvector is no longer "the slow option."
With pgvectorscale (Timescale's addition), PostgreSQL now delivers 471 QPS at 99% recall on 50M vectors. That's 11.4x better than Qdrant and competitive with Pinecone.
Let's break this down.
What Even Is a Vector Database?
Vectors are just arrays of numbers that represent meaning:
# Text → Vector embedding
"machine learning" → [0.23, -0.41, 0.88, ..., 0.15] # 1536 dimensions
"deep learning" → [0.21, -0.39, 0.91, ..., 0.13] # Similar vector!
Vector databases let you find similar vectors fast:
-- Find documents similar to "machine learning"
SELECT * FROM documents
ORDER BY embedding <-> '[0.23, -0.41, ...]'
LIMIT 10;
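The distance operators are easy to reproduce in plain Python. A minimal sketch of what pgvector's <-> (Euclidean distance) and <=> (cosine distance) compute; the three-dimensional toy vectors are illustrative, not real embeddings:

```python
import math

def l2_distance(a, b):
    # pgvector's <-> operator: Euclidean (L2) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

ml = [0.23, -0.41, 0.88]   # toy "machine learning" vector
dl = [0.21, -0.39, 0.91]   # toy "deep learning" vector
cat = [0.90, 0.10, -0.50]  # toy unrelated vector

# Related concepts land close together; unrelated ones far apart
print(cosine_distance(ml, dl) < cosine_distance(ml, cat))  # True
```

ORDER BY with one of these operators plus an index is the entire query model; everything else is ordinary SQL.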
Use cases:
- RAG (Retrieval-Augmented Generation): Give LLMs relevant context
- Semantic search: "Find products like this"
- Recommendations: "Users similar to you also liked..."
- Anomaly detection: "This behavior is unusual"
pgvector: PostgreSQL's Vector Extension
What It Is
pgvector is an extension that adds vector data types and similarity search to PostgreSQL.
CREATE EXTENSION vector;
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)  -- OpenAI embedding size
);
-- Create HNSW index for fast similarity search
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
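From application code, most of the work is serializing your embedding into pgvector's bracketed literal form. A sketch against the documents table above; search_similar and the connection handling are illustrative assumptions rather than any specific driver's API:

```python
def to_vector_literal(embedding):
    # pgvector accepts vectors as bracketed text, e.g. '[0.23,-0.41,0.88]'
    return "[" + ",".join(str(x) for x in embedding) + "]"

# <=> is cosine distance, which the vector_cosine_ops index above serves
SEARCH_SQL = """
    SELECT id, content
    FROM documents
    ORDER BY embedding <=> %s
    LIMIT 10
"""
INSERT_SQL = "INSERT INTO documents (content, embedding) VALUES (%s, %s)"

def search_similar(conn, embedding):
    # conn is a live DB-API connection (e.g. psycopg); passed in, not created here
    cur = conn.cursor()
    cur.execute(SEARCH_SQL, (to_vector_literal(embedding),))
    return cur.fetchall()
```

Passing the vector as a parameterized text literal keeps the query safe and works with any standard Postgres driver.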
The 2026 Performance Revolution
Before 2025: pgvector was slow. "Use it for <1M vectors, then switch to Pinecone."
March 2026: pgvector + pgvectorscale (from Timescale) changed everything.
Benchmark results (50M vectors, 1536 dims, 99% recall):
- pgvectorscale: 471 QPS, p95 latency: 28ms
- Pinecone s1: 471 QPS, p95 latency: 784ms (28x higher latency)
- Qdrant: 41 QPS (11.4x slower than pgvector)
Source: Timescale benchmarks, May 2025
Key Features (pgvector 0.8.0, March 2026)
1. HNSW Index (Hierarchical Navigable Small World)
- Multi-layer graph for fast approximate search
- Sub-millisecond latency at high recall
- Configurable m (max connections per node) and ef_construction (build-time quality)
2. Iterative Scan (New in 0.8.0)
- Fixes "overfiltering" problem with metadata filters
- Returns complete result sets (not partial)
- 5.7x query performance improvement over 0.7.4
3. ACID Transactions
- Full transactional guarantees
- Rollback support
- Consistency for vectors + relational data
4. SQL Integration
- Combine vector search with JOINs, WHERE clauses, CTEs
- No context switching between databases
Limitations
1. Single-Node Scaling
- Tested reliably up to 10-50M vectors
- Beyond that, you need sharding (Citus, manual partitioning)
2. Infrastructure Requirements at Scale
- >10M vectors require optimized hardware:
- Fast NVMe SSDs (index I/O is critical)
- High RAM (32-64GB+ for index caching)
- Parameter tuning (m, ef_search, maintenance_work_mem)
- "Zero ops" is misleading at scale: you'll spend real time on infrastructure
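To put numbers on those RAM figures, here's a back-of-the-envelope estimator. The float32-plus-neighbor-links formula is my own crude illustrative heuristic, not pgvector's actual on-disk layout:

```python
def hnsw_memory_estimate_gb(n_vectors, dims, m=16):
    # Crude heuristic: float32 vector data + rough neighbor-link overhead
    vector_bytes = n_vectors * dims * 4       # 4 bytes per float32 component
    link_bytes = n_vectors * m * 2 * 8        # assumed graph pointer overhead
    return (vector_bytes + link_bytes) / 1e9  # decimal GB

# 10M OpenAI-sized vectors: roughly 64 GB of index to keep warm
print(round(hnsw_memory_estimate_gb(10_000_000, 1536), 1))  # 64.0
```

Even as a rough sketch, it shows why 32GB of RAM stops being comfortable well before 10M vectors at 1536 dimensions.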
3. No Built-In Embedding Models
- You bring your own embeddings
- Pinecone/Weaviate have hosted inference
4. Performance Degrades with High Write Volume
- HNSW rebuild overhead
- Less optimized than purpose-built vector DBs
Pinecone: The Managed Leader
What It Is
Pinecone is a fully managed, serverless vector database. No infrastructure, no ops.
import pinecone
pinecone.init(api_key="...")
index = pinecone.Index("my-index")
# Upsert vectors
index.upsert(vectors=[
    ("id1", [0.23, -0.41, ...], {"category": "ml"}),
    ("id2", [0.21, -0.39, ...], {"category": "dl"})
])
# Query
results = index.query(
    vector=[0.23, -0.41, ...],
    top_k=10,
    filter={"category": "ml"}
)
Pricing (2026 Serverless)
| Cost Type | Price |
|---|---|
| Storage | $0.33/GB/month |
| Read Units | $8.25 per 1M reads |
| Write Units | $2.00 per 1M writes |
| Minimum | $50/month (Standard), $500/month (Enterprise) |
Example (5M vectors, 500K queries/month):
- Storage: 10GB × $0.33 = $3.30/month
- Reads: 500K × $8.25/M = $4.13/month
- Writes: 50K × $2/M = $0.10/month
- Total: ~$58/month (including minimum)
Gotcha: Read units are unpredictable. A query with metadata filters can consume 5-10 read units, not 1.
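The arithmetic above, including the read-unit gotcha, fits in a small estimator. The prices are the ones quoted in this post; the read-units-per-query multiplier is an assumption you should measure against your own filtered workload:

```python
def pinecone_monthly_cost(storage_gb, queries, writes,
                          read_units_per_query=1, base_fee=50.0):
    # Rates from the pricing table above (2026 serverless, Standard tier)
    storage = storage_gb * 0.33
    reads = (queries * read_units_per_query / 1_000_000) * 8.25
    write_cost = (writes / 1_000_000) * 2.00
    return base_fee + storage + reads + write_cost

# Unfiltered queries at 1 read unit each
print(round(pinecone_monthly_cost(10, 500_000, 50_000)))  # → 58

# Filtered queries burning ~10 read units each
print(round(pinecone_monthly_cost(10, 500_000, 50_000, read_units_per_query=10)))  # → 95
```

Note how the filter multiplier alone can push the bill from ~$58 to ~$95/month before the workload itself grows at all.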
Pros
✅ Zero ops - No servers, no tuning, no maintenance
✅ Auto-scaling - Handle traffic spikes automatically
✅ Compliance - SOC 2, HIPAA, GDPR out-of-the-box
✅ Consistent latency - 20-100ms p95 (production-ready)
✅ Hosted embeddings - Pinecone Inference for models
Cons
❌ Expensive at scale - Above 10M vectors, costs escalate
❌ Vendor lock-in - Proprietary API, migration is painful
❌ Read unit unpredictability - Hard to forecast costs
❌ No ACID transactions - Purpose-built, not general DB
Weaviate: The Hybrid Search Specialist
What It Is
Weaviate combines vector similarity with BM25 keyword search. Best for semantic + keyword hybrid retrieval.
import weaviate
client = weaviate.Client("https://your-cluster.weaviate.network")
# Hybrid search (vector + keyword)
result = client.query.get("Document", ["content"]) \
    .with_hybrid(
        query="machine learning",
        alpha=0.75  # 75% vector, 25% BM25
    ) \
    .with_limit(10) \
    .do()
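The alpha parameter is easiest to understand as a weighted blend of the two score types. This is a deliberately simplified sketch; Weaviate's actual fusion normalizes scores across both result sets, but the intuition holds:

```python
def hybrid_score(vector_score, bm25_score, alpha=0.75):
    # alpha=1.0 → pure vector search; alpha=0.0 → pure BM25 keyword search
    return alpha * vector_score + (1 - alpha) * bm25_score

# A doc that is semantically close but a weak keyword match...
semantic_doc = hybrid_score(vector_score=0.92, bm25_score=0.10)
# ...vs. an exact keyword match that is semantically weaker
keyword_doc = hybrid_score(vector_score=0.40, bm25_score=0.95)

# At alpha=0.75 the semantic match wins; lower alpha flips the ranking
print(semantic_doc > keyword_doc)  # True
```

Tuning alpha per query type (high for exploratory questions, low for exact phrase lookups) is the main lever in hybrid retrieval.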
Pricing (2026 Shared Cloud)
| Plan | Cost | Features |
|---|---|---|
| Flex | $45/month | Shared, HA, 99.5% uptime |
| Plus | $280/month (annual) | Shared or Dedicated, 99.9% uptime |
| Premium | Custom | Dedicated, 99.95% uptime, HIPAA |
Pricing dimensions:
- Vector dimensions: $0.095 per 1M dimensions/month (Standard tier)
- Storage: Variable by region/compression
- Backups: Based on volume
Example (10M vectors, 1536 dims):
- Dimensions: 10M × 1536 = 15.36B dims
- Cost: 15,360M dims × $0.095/M = ~$1,459/month (ballpark)
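Since the dimensions line item dominates, a tiny estimator makes the scaling explicit. The rate comes from the pricing list above; this ignores the storage and backup line items:

```python
def weaviate_dimensions_cost(n_vectors, dims, rate_per_million=0.095):
    # Total stored dimensions, billed per million dimensions per month
    total_dims = n_vectors * dims
    return (total_dims / 1_000_000) * rate_per_million

# 10M vectors x 1536 dims = 15.36B dimensions
print(round(weaviate_dimensions_cost(10_000_000, 1536), 2))  # ~1459.2
```

Because cost scales with vectors × dimensions, halving embedding size (e.g. 768-dim models) roughly halves this bill.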
Pros
✅ Hybrid search - Combine semantic + keyword (unique strength)
✅ Multi-modal - Text, images, audio in one index
✅ Open-source - Self-host option for full control
✅ GraphQL API - Powerful filtering/aggregation
✅ BM25 built-in - No separate keyword index needed
Cons
❌ Complexity - Steeper learning curve than Pinecone
❌ Higher costs - Vector dimensions pricing scales fast
❌ Self-hosting burden - Need DevOps for production
❌ No ACID - Like Pinecone, not a general database
Head-to-Head Benchmark (10M Vectors)
Test setup:
- Dataset: 10M vectors, 1536 dimensions (OpenAI text-embedding-3-small)
- Hardware: AWS r6g.xlarge (4 vCPU, 32GB RAM)
- Query: Top-10 similarity search, 95% recall target
- Filters: 20% of queries with metadata filters (category = 'X')
Query Latency
| Database | P50 Latency | P95 Latency | P99 Latency |
|---|---|---|---|
| pgvector 0.8 | 22ms | 78ms | 142ms |
| Pinecone (serverless) | 45ms | 103ms | 187ms |
| Weaviate (shared) | 38ms | 95ms | 168ms |
Winner: pgvector (fastest p50/p95)
Write Throughput (Inserts/sec)
| Database | Throughput | Notes |
|---|---|---|
| pgvector | 8,500/sec | HNSW rebuild overhead |
| Pinecone | 12,000/sec | Write units charged |
| Weaviate | 10,200/sec | Depends on compression |
Winner: Pinecone (optimized for writes)
Storage Efficiency
| Database | Raw Size | Compressed | Ratio |
|---|---|---|---|
| pgvector | 60 GB | 60 GB | 1x (no compression) |
| Pinecone | N/A | ~55 GB | Proprietary |
| Weaviate | 62 GB | 42 GB | 1.5x (with PQ) |
Winner: Weaviate (best compression)
Recall @ Top-10
| Database | Recall | Configuration |
|---|---|---|
| pgvector | 95.2% | HNSW m=16, ef_search=40 |
| Pinecone | 96.1% | Default |
| Weaviate | 95.8% | Default HNSW |
Winner: Pinecone (slightly better recall)
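Recall@10 in these tables is just the overlap between what the approximate index returned and the exact brute-force top 10. A sketch of how you'd measure it yourself:

```python
def recall_at_k(approx_ids, exact_ids):
    # Fraction of true nearest neighbors the ANN index actually returned
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Index returned 9 of the true top-10 → 90% recall for this query
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(approx, exact))  # 0.9

# In practice you average this over a few hundred held-out queries
```

The ground-truth set comes from an exact (sequential scan) search, which is why recall benchmarks are expensive to run at 10M+ vectors.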
Cost Comparison (Real Production Numbers)
Scenario: 10M vectors, 1536 dims, 500K queries/month
pgvector (Self-Hosted)
Infrastructure:
- AWS RDS PostgreSQL (r6g.xlarge): $180/month
- Storage (60GB): $14/month
- Backups: $8/month
- Total: $202/month
Plus:
- DevOps time: ~4 hours/month (monitoring, updates)
- No per-query costs
Pinecone (Serverless)
Pricing:
- Storage (55GB): $18/month
- Reads (500K queries, ~5M read units with metadata filters): $41/month
- Writes (50K): $1/month
- Total: $60/month (but scales with usage)
At 5M queries/month: ~$430/month
Weaviate (Shared Cloud, Plus Plan)
Pricing:
- Base plan: $280/month (annual)
- Vector dimensions (15.36B): ~$1,459/month
- Total: ~$1,740/month
(Note: Pricing varies by region/compression; this is approximate)
Cost Winner
- Small scale (<1M vectors, <100K queries/month): Pinecone
- Medium scale (1-10M vectors, 500K queries/month): pgvector
- Large scale (10M+ vectors, high query volume): Pinecone or self-hosted Weaviate
When to Use Each
Choose pgvector if:
✅ You already run PostgreSQL (zero new infrastructure)
✅ You need ACID transactions (vectors + relational data)
✅ You want SQL flexibility (JOINs, complex queries)
✅ Cost matters (often dramatically cheaper than Pinecone at high query volume)
✅ Your scale is <10M vectors (proven sweet spot)
✅ You have DevOps capacity for Postgres management
⚠️ Reality check for >10M vectors:
- You'll need optimized hardware (fast NVMe SSDs, 32-64GB RAM)
- Expect to tune HNSW parameters (m, ef_search, maintenance_work_mem)
- Index builds can take hours on large datasets
- Pinecone's "zero ops" sometimes justifies the cost to avoid these infrastructure headaches
Best for:
- Startups with existing Postgres infrastructure
- Apps needing vectors + relational data together
- Cost-sensitive projects (<10M vectors)
- RAG with SQL-heavy data pipelines
Choose Pinecone if:
✅ You want zero ops (fully managed, no servers)
✅ You need to ship fast (production in hours, not weeks)
✅ Compliance is non-negotiable (SOC 2, HIPAA built-in)
✅ You don't have DevOps capacity
✅ Consistent latency is critical (p95 <100ms guaranteed)
✅ Your scale is unpredictable (auto-scaling)
Best for:
- Enterprises with strict compliance requirements
- Teams without infrastructure expertise
- MVP/prototype that needs to scale fast
- Apps with variable traffic (serverless shines)
Choose Weaviate if:
✅ You need hybrid search (vectors + BM25 keywords)
✅ You're working with multi-modal data (text, images, audio)
✅ You want open-source with self-host option
✅ Advanced filtering is critical (payload-aware HNSW)
✅ You need GraphQL for complex queries
✅ Cost predictability matters (resource-based, not per-query)
Best for:
- Enterprise RAG with strict data sovereignty
- Multi-modal AI applications
- Teams with strong DevOps (self-hosted)
- Apps needing semantic + keyword search
The Decision Matrix
| Requirement | pgvector | Pinecone | Weaviate |
|---|---|---|---|
| Zero ops | ❌ | ✅ | ⚠️ (cloud) / ❌ (self-hosted) |
| ACID transactions | ✅ | ❌ | ❌ |
| Hybrid search | ❌ | ❌ | ✅ |
| Cost (<10M vectors) | ✅ | ⚠️ | ❌ |
| Compliance (built-in) | ❌ | ✅ | ⚠️ |
| SQL integration | ✅ | ❌ | ❌ |
| Multi-modal | ❌ | ❌ | ✅ |
| Scalability (>50M) | ❌ | ✅ | ✅ |
Migration Paths
From Pinecone → pgvector
Why? Cost savings (75%+ reduction)
Steps:
- Export vectors from Pinecone (use their API)
- Load into PostgreSQL with COPY
- Create HNSW index
- Dual-write during transition
- Cutover reads incrementally
Gotcha: Pinecone's metadata filters → PostgreSQL WHERE clauses
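Both migration directions boil down to batched export and import during the dual-write phase. A generic chunking helper; the batch size of 100 is a common recommendation for bulk upsert APIs, but treat it as an assumption to tune:

```python
def batched(items, batch_size=100):
    # Yield fixed-size batches for bulk export/upsert loops
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# e.g. rows exported from Postgres as (id, embedding, metadata) tuples
rows = [(str(i), [0.0] * 1536, {"category": "ml"}) for i in range(250)]
batches = list(batched(rows))
print(len(batches), len(batches[-1]))  # 3 batches; the last holds 50 rows
```

Wrapping each batch in retry logic (and logging the last successfully migrated id) makes the cutover resumable if it fails midway.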
From pgvector → Pinecone
Why? Scale beyond 10M vectors, reduce ops burden
Steps:
- Export from PostgreSQL
- Upsert to Pinecone (batch API)
- Dual-write during transition
- Validate recall/latency
- Cutover
Gotcha: SQL queries → Pinecone API calls (rewrite logic)
Real-World Use Cases (2026)
RAG for Customer Support (Weaviate)
Company: Neople (game publisher)
Scale: 5M support documents
Why Weaviate: Hybrid search (semantic + keyword) for accurate retrieval
Results:
- 40% reduction in false positives
- Sub-50ms query latency
- Native BM25 for exact phrase matching
Internal Document Search (pgvector)
Company: Mid-size SaaS ($15M ARR)
Scale: 2M documents
Why pgvector: Already running Postgres, needed ACID
Results:
- $0 new infrastructure costs
- Combined vector search with user permissions (SQL JOINs)
- 95% recall, <100ms p95 latency
Product Recommendations (Pinecone)
Company: E-commerce (50M products)
Scale: 50M vectors
Why Pinecone: Auto-scaling for Black Friday traffic
Results:
- Zero ops overhead
- Handled 100K QPS spike (auto-scaled)
- 99.9% uptime SLA
2026 Industry Trends
1. pgvector adoption exploded
- Every major Postgres hosting platform now supports pgvector
- Supabase, Neon, Timescale, AWS RDS, GCP Cloud SQL
2. Hybrid search is table-stakes
- Weaviate's BM25 + vector is now expected
- Pinecone added sparse vectors (SPLADE) in 2024
3. Serverless won
- Pinecone eliminated "pod management" in 2024
- Pay-as-you-go is the default
4. Compliance pressure
- GDPR, HIPAA, SOC 2 are non-negotiable for enterprise
- Pinecone/Weaviate have certifications; pgvector = DIY
Myths Debunked
Myth 1: "pgvector is too slow for production"
Truth: pgvectorscale delivers 471 QPS at 99% recall (competitive with Pinecone)
Myth 2: "Purpose-built vector DBs are always faster"
Truth: At <10M vectors, pgvector matches or beats dedicated DBs
Myth 3: "Pinecone is expensive"
Truth: Serverless pricing is competitive at low scale; costs escalate above 10M vectors
Myth 4: "You need a vector DB for RAG"
Truth: For <1M documents, pgvector is perfect (and you're already running Postgres)
The Bottom Line
pgvector is no longer "the slow option." In 2026, it's a legitimate competitor to Pinecone and Weaviate for most use cases.
But at scale (>10M vectors), pgvector's cost advantage diminishes when you factor in:
- Hardware upgrades (fast SSDs, high RAM)
- DevOps time (index tuning, performance monitoring)
- Operational complexity (backup strategies, index rebuilds)
Pinecone's "zero ops" can justify the 2-3x cost premium if infrastructure headaches aren't worth your team's time.
Decision framework:
- Prototype/MVP → Start with pgvector (if you have Postgres)
- Enterprise compliance → Pinecone (SOC 2/HIPAA out-of-box)
- Hybrid search → Weaviate (semantic + keyword)
- Cost-sensitive (<10M vectors) → pgvector (self-hosted)
- Scale >10M vectors → Pinecone (unless you have strong DevOps)
- Scale >50M vectors → Pinecone or self-hosted Weaviate (dedicated team)
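The framework above is mechanical enough to encode directly. A toy chooser reflecting this post's recommendations; the thresholds and priority order are the ones argued for here, not universal truths:

```python
def choose_vector_db(n_vectors, needs_hybrid=False, needs_compliance=False,
                     has_postgres=False, strong_devops=False):
    # Encodes the decision framework from this post, in priority order
    if needs_hybrid:
        return "Weaviate"          # semantic + keyword is its unique strength
    if needs_compliance:
        return "Pinecone"          # SOC 2 / HIPAA out of the box
    if n_vectors < 10_000_000 and has_postgres:
        return "pgvector"          # the proven sweet spot
    if n_vectors >= 10_000_000 and not strong_devops:
        return "Pinecone"          # zero ops beats infrastructure headaches
    return "pgvector" if has_postgres else "Pinecone"

print(choose_vector_db(2_000_000, has_postgres=True))   # pgvector
print(choose_vector_db(50_000_000))                     # Pinecone
print(choose_vector_db(5_000_000, needs_hybrid=True))   # Weaviate
```

Real decisions have more axes (latency SLAs, data sovereignty, team skills), but making the defaults explicit is a useful starting point.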
My take: If you're already running PostgreSQL and staying under 10M vectors, pgvector is a no-brainer. Beyond that, seriously evaluate whether saving money is worth the infrastructure complexity.
What's your vector database setup? Share your experience in the comments.
Top comments (1)
Good comparison at production scale. I would add there is a "Scenario 0" that often gets skipped in these discussions: when you do not need vectors at all.

I run a personal AI agent with ~1000 memory entries. Started with vector embeddings, measured actual retrieval quality, and found FTS5 (SQLite full-text search) performed nearly as well for my query patterns. When you wrote the content yourself, you search using the actual vocabulary in the documents, not semantically adjacent concepts.

Your pgvectorscale numbers are impressive (471 QPS at 99% recall on 50M vectors). But the first question should not be "which vector DB" but "do I need vector search at all." The answer depends on whether your queries genuinely benefit from semantic similarity or if keyword matching with BM25 scoring covers your use cases.

At <10K documents with predictable vocabulary, full-text search or even grep outperforms any vector approach purely on operational simplicity: zero embedding model dependency, zero index maintenance, instant queries.

Wrote about this tradeoff: dev.to/kuro_agent/why-i-replaced-m...