Choosing a Vector Database in 2026: pgvector vs Pinecone vs Qdrant vs Weaviate


Once your RAG prototype works, the next decision has real consequences: where do the embeddings live? Pick wrong and you either overpay for a managed service you didn't need or bolt a vector index onto a database that buckles at scale. This is the comparison I wish had existed when I had to choose.

We'll look at four representative options — pgvector, Pinecone, Qdrant, and Weaviate — and, more usefully, the questions that should drive the decision.

First, do you even need a dedicated vector database?

If you have under a few hundred thousand vectors and already run Postgres, the answer is often no. pgvector adds vector search to a database you already operate, back up, and trust.

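CREATE EXTENSION IF NOT EXISTS vector;  -- enable pgvector once per database
CREATE TABLE docs (id bigserial PRIMARY KEY, embedding vector(1536), content text);  -- 1536 must match your embedding model's dimension
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);  -- operator class matches the <=> query below

-- '[...]' stands for the query embedding, passed as a vector literal
SELECT content FROM docs
ORDER BY embedding <=> '[...]'   -- cosine distance
LIMIT 5;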

One datastore, transactional consistency with your real data, no new service. The trade-off is that very large or very high-QPS workloads will eventually outgrow it, and its filtering and index tuning are less specialized than those of purpose-built engines.
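
To make the `<=>` operator in the query above concrete, here is a minimal Python sketch of the same idea done by brute force: compute the cosine distance from the query to every row, sort, and keep the closest few. The function names and the toy 3-dimensional vectors are mine, purely for illustration; an HNSW index exists precisely so the database does not have to scan every row like this.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query, rows, k=5):
    """Exact (brute-force) counterpart of ORDER BY embedding <=> query LIMIT k."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]

# Toy 3-dimensional "embeddings"; real ones would have e.g. 1536 dimensions.
rows = [
    ("about cats", [1.0, 0.0, 0.0]),
    ("about dogs", [0.8, 0.6, 0.0]),
    ("about tax law", [0.0, 0.0, 1.0]),
]
nearest = [content for content, _ in top_k([1.0, 0.1, 0.0], rows, k=2)]
print(nearest)
```

The brute-force version is also handy later as ground truth when you want to check how much recall an approximate index gives up.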

The four, at a glance

	pgvector	Pinecone	Qdrant	Weaviate
Model	Postgres extension	fully managed SaaS	open-source + managed	open-source + managed
Run it yourself	yes (it's Postgres)	no	yes	yes
Best when	already on Postgres, modest scale	want zero ops, fast scale	want control + strong filtering	want built-in hybrid/modules
Metadata filtering	SQL `WHERE`	good	excellent	good
Ops burden	low (existing DB)	none	medium	medium

Pinecone — pay to not think about infrastructure

Pinecone is fully managed: no servers, no index tuning, scales horizontally on demand. You trade money and vendor lock-in for never paging anyone about your vector store.

Choose it when ops time is more expensive than service fees, you need to scale fast, and you don't want to own infrastructure. Be wary when cost at scale matters or you need to self-host for data-residency reasons.

Qdrant — open-source with serious filtering

Qdrant is a purpose-built vector engine (Rust) with excellent payload filtering — combining vector similarity with structured conditions efficiently, which is exactly what production RAG needs ("similar chunks, but only from this tenant, in the last 90 days").

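from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumption: local dev instance
# `embedding` is the query vector; recent qdrant-client releases steer toward query_points() over search()
hits = client.search(
    collection_name="docs",
    query_vector=embedding,
    # only return points whose payload field "tenant" equals "acme"
    query_filter=models.Filter(must=[models.FieldCondition(key="tenant", match=models.MatchValue(value="acme"))]),
    limit=5,
)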

Choose it when you want to self-host, need strong metadata filtering, and want a managed option available later. Run it in Docker for dev, scale to a cluster for prod.
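
If "similar chunks, but only from this tenant, in the last 90 days" feels abstract, the semantics can be sketched in plain Python. This is only an illustration of what a filtered search returns, under my own assumptions about the payload shape (`tenant`, `created_at`); a purpose-built engine applies such conditions inside the index rather than scanning a list, which is where the efficiency comes from.

```python
from datetime import datetime, timedelta, timezone

def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def filtered_search(points, query, tenant, max_age_days, now, limit=5):
    """Keep only points matching the payload conditions, then rank by similarity."""
    cutoff = now - timedelta(days=max_age_days)
    candidates = [
        p for p in points
        if p["payload"]["tenant"] == tenant and p["payload"]["created_at"] >= cutoff
    ]
    candidates.sort(key=lambda p: dot(query, p["vector"]), reverse=True)
    return candidates[:limit]

now = datetime(2026, 1, 1, tzinfo=timezone.utc)  # fixed "now" so the example is repeatable
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"tenant": "acme", "created_at": now - timedelta(days=10)}},
    {"id": 2, "vector": [1.0, 0.0], "payload": {"tenant": "acme", "created_at": now - timedelta(days=200)}},
    {"id": 3, "vector": [1.0, 0.0], "payload": {"tenant": "other", "created_at": now - timedelta(days=5)}},
    {"id": 4, "vector": [0.6, 0.8], "payload": {"tenant": "acme", "created_at": now - timedelta(days=30)}},
]
hits = filtered_search(points, [1.0, 0.0], tenant="acme", max_age_days=90, now=now)
hit_ids = [p["id"] for p in hits]  # ids 2 (too old) and 3 (wrong tenant) are excluded
print(hit_ids)
```

Note that points 2 and 3 are just as similar to the query as point 1; they are dropped purely because of their metadata, which is exactly the behavior multi-tenant RAG depends on.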

Weaviate — batteries-included with hybrid search

Weaviate bundles more out of the box: built-in hybrid search (combining keyword BM25 with vector similarity) and optional modules that generate embeddings for you. It's the "more than a vector store" option.

Choose it when you want hybrid search without wiring it yourself, or want the database to handle vectorization. Be wary when you prefer a minimal, do-one-thing component you fully control.
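
For a feel of what "wiring it yourself" involves, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. I'm not claiming this is exactly how Weaviate scores hybrid results; the document IDs and the constant `k=60` (a conventional default for RRF) are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both keyword and vector search rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 query
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from a nearest-neighbor query
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

The fusion step itself is small. The extra work with other backends is mostly in running and maintaining the keyword index alongside the vector one.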

The questions that actually decide it

Forget feature checklists; answer these:

Scale. Under ~1M vectors and modest QPS? pgvector likely wins on simplicity. Tens of millions and growing? Look at the purpose-built engines.
Ops appetite. Want zero infrastructure? Pinecone (or a managed Qdrant/Weaviate). Happy to run a container? Self-host and save.
Filtering. If most queries are "similar and matching these attributes," weight filtering quality heavily — Qdrant is strong here, and pgvector gets full SQL.
Data gravity. If your source data is already in Postgres, keeping vectors there removes a sync pipeline and a consistency headache.
Hybrid search. Need keyword + semantic together? Weaviate gives it natively; others need extra work.
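
If it helps to write your answers down, the five questions above can be turned into a rough vote tally. This is a deliberately crude sketch: the function name, the one-vote-per-question weighting, and the 1M cutoff (taken from the "Scale" question) are my simplifications, and the output is a shortlist to investigate, not a verdict.

```python
from collections import Counter

def shortlist(vector_count, source_in_postgres, zero_ops, filter_heavy, hybrid):
    """Tally one vote per question for each backend that the answer favors."""
    votes = Counter({"pgvector": 0, "pinecone": 0, "qdrant": 0, "weaviate": 0})
    # Scale: small collections favor pgvector, large ones the dedicated engines.
    if vector_count < 1_000_000:
        votes["pgvector"] += 1
    else:
        votes.update(["pinecone", "qdrant", "weaviate"])
    # Ops appetite: zero infrastructure favors Pinecone, otherwise self-hostable options.
    if zero_ops:
        votes["pinecone"] += 1
    else:
        votes.update(["pgvector", "qdrant", "weaviate"])
    # Filtering: Qdrant's payload filters and pgvector's SQL WHERE.
    if filter_heavy:
        votes.update(["qdrant", "pgvector"])
    # Data gravity: source data already in Postgres.
    if source_in_postgres:
        votes["pgvector"] += 1
    # Hybrid search: native in Weaviate.
    if hybrid:
        votes["weaviate"] += 1
    return votes.most_common()

small_team = shortlist(50_000, source_in_postgres=True, zero_ops=False,
                       filter_heavy=True, hybrid=False)
big_saas = shortlist(30_000_000, source_in_postgres=False, zero_ops=True,
                     filter_heavy=False, hybrid=False)
print(small_team[0], big_saas[0])
```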

My default recommendation

Start with pgvector. Most teams over-engineer this decision and reach for a distributed vector database to store 50,000 chunks. Ship on Postgres, measure real query latency and recall, and migrate to a dedicated engine only when you have evidence you've outgrown it. The migration is straightforward once you know your actual access patterns — and you'll choose far better with real numbers than with a benchmark blog post.
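
"Measure real query latency and recall" can be as simple as the sketch below. It assumes you can get two result lists per query: the IDs your approximate index returned, and the exact top-k from a brute-force scan of the same data. The helper names are mine, and `run_query` is a placeholder for whatever function actually hits your database.

```python
import statistics
import time

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k that the approximate index also returned.

    Assumes k >= 1 and a non-empty exact result list.
    """
    truth = set(exact_ids[:k])
    return len(truth & set(approx_ids[:k])) / len(truth)

def latency_percentiles(run_query, queries):
    """Time each query and report p50/p95 in milliseconds (needs >= 2 queries)."""
    timings_ms = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)
        timings_ms.append((time.perf_counter() - start) * 1000)
    cuts = statistics.quantiles(timings_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94]}

# Index returned 3 of the 5 true nearest neighbors.
recall = recall_at_k([1, 2, 3, 9, 8], [1, 2, 3, 4, 5], k=5)

# Stand-in query function; replace the lambda with a real database call.
stats = latency_percentiles(lambda q: sum(range(1000)), list(range(20)))
print(recall, stats)
```

Run this against a sample of real user queries rather than synthetic ones; the point is to have your own numbers before deciding you've outgrown anything.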

If you'd rather skip the boilerplate of wiring up embeddings, indexing, and a swappable retrieval layer, the Vector Database Toolkit provides adapters and ingestion patterns across these backends so you can prototype on pgvector and switch engines later without rewriting your app.
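
Whatever tooling you use, a swappable retrieval layer usually comes down to one small interface that your app depends on, with one adapter per backend behind it. The sketch below is my own hypothetical illustration of that pattern, not the toolkit's API: `VectorStore`, `InMemoryStore`, and `retrieve_context` are made-up names, and the in-memory backend stands in for a pgvector or Qdrant adapter.

```python
from typing import Protocol

class VectorStore(Protocol):
    """The only surface the application is allowed to depend on (hypothetical)."""
    def upsert(self, doc_id: str, vector: list[float], content: str) -> None: ...
    def search(self, query: list[float], limit: int = 5) -> list[str]: ...

class InMemoryStore:
    """Toy backend for tests; a real adapter would wrap a database client."""
    def __init__(self) -> None:
        self._rows: dict[str, tuple[list[float], str]] = {}

    def upsert(self, doc_id: str, vector: list[float], content: str) -> None:
        self._rows[doc_id] = (vector, content)

    def search(self, query: list[float], limit: int = 5) -> list[str]:
        # Rank by dot product, highest first (assumes normalized vectors).
        def score(row):
            return sum(q * v for q, v in zip(query, row[0]))
        ranked = sorted(self._rows.values(), key=score, reverse=True)
        return [content for _, content in ranked[:limit]]

def retrieve_context(store: VectorStore, query_vector: list[float], limit: int = 3) -> list[str]:
    """Application code: knows about VectorStore, not about any specific engine."""
    return store.search(query_vector, limit=limit)

store = InMemoryStore()
store.upsert("a", [1.0, 0.0], "pgvector notes")
store.upsert("b", [0.0, 1.0], "billing FAQ")
context = retrieve_context(store, [0.9, 0.1], limit=1)
print(context)
```

Because `retrieve_context` only sees the protocol, moving from Postgres to a dedicated engine means writing one new adapter class and re-ingesting, not touching application code.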

Bottom line

There's no universally best vector database — there's the one that matches your scale, ops appetite, and filtering needs. Resist premature optimization: begin with the simplest thing that could work (often pgvector), instrument it, and let real usage, not hype, tell you when to graduate.