This article was originally published on aifoss.dev
---
title: 'Chroma vs Qdrant vs Weaviate 2026: RAG Database Compared'
description: 'Compare Chroma, Qdrant, and Weaviate for local RAG in 2026: version snapshots, filtering tradeoffs, hybrid search, quantization, and a clear pick by use case.'
pubDate: 'May 27 2026'
tags: ["vectordb", "ai", "rag", "python", "opensource"]
---
The three most commonly recommended open-source vector databases for RAG — Chroma, Qdrant, and Weaviate — are not interchangeable. Chroma is a prototyping tool that grew into a real product. Qdrant is a production workhorse written in Rust with the best filtering performance of the three. Weaviate is an enterprise-grade platform with hybrid search and the most built-in integrations. Using Weaviate when you need Chroma adds unnecessary ops overhead. Using Chroma when you need Qdrant means migrating under pressure when your collection outgrows it.
Versions covered: ChromaDB v1.5.9 (May 2026), Qdrant v1.17.1 (March 2026), Weaviate v1.37 (May 2026).
The quick answer
| Situation | Best choice |
|---|---|
| Local prototyping, notebooks, under 100K vectors | Chroma |
| Embedded in a Python process — no separate service | Chroma |
| Production RAG with filtering-heavy queries | Qdrant |
| Multi-user deployment, concurrent queries | Qdrant |
| Memory-constrained deployment at millions of vectors | Qdrant |
| Hybrid search (BM25 + vector in one query) | Weaviate |
| Multi-modal retrieval (text + images + audio) | Weaviate |
| Built-in re-ranking or generative AI modules | Weaviate |
| Kubernetes, team-operated, agentic MCP workflows | Weaviate |
| Getting from zero to working RAG in 10 minutes | Chroma |
What each tool actually is
ChromaDB (Apache 2.0, chroma-core/chroma) started as a pure-Python embedded database and was rebuilt in Rust for the v1.0 release. The Rust core eliminates Python's GIL bottlenecks and delivers roughly 4× faster writes and queries compared to the pre-1.0 implementation — write throughput went from ~10K to ~40K+ vectors/second in server mode. Chroma's design priority is developer ergonomics: pip install chromadb, three lines of Python, and you have a working local vector store. The default mode runs in-process — no Docker, no service to start, no YAML. You can run it in server mode for multi-client access when you're ready.
Qdrant (Apache 2.0, qdrant/qdrant) is written entirely in Rust and optimized for production-grade vector similarity search. It runs as a standalone service via Docker with REST and gRPC APIs. Qdrant's main differentiators are its payload filtering system, which applies structured metadata filters during HNSW traversal rather than as a post-filter, and its quantization stack (Scalar, Binary, Product, TurboQuant), which can compress vector data by up to 32× to keep large collections within an affordable RAM budget. The Qdrant team publishes transparent benchmarks and consistently posts some of the lowest latencies at the highest recall levels in the ANN-benchmarks suite.
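To see why filtering during the search matters, here is a minimal sketch using brute-force cosine similarity over a toy collection. It is not Qdrant's implementation (Qdrant filters inside the HNSW graph traversal), and the `POINTS` data, the `lang` payload field, and both function names are hypothetical. It only shows the failure mode of post-filtering: the top-k can be consumed by points that the filter then throws away.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy collection: (id, vector, payload)
POINTS = [
    ("a", [1.0, 0.0], {"lang": "en"}),
    ("b", [0.9, 0.1], {"lang": "en"}),
    ("c", [0.8, 0.2], {"lang": "de"}),
    ("d", [0.1, 0.9], {"lang": "de"}),
]

def post_filter_search(query, k, predicate):
    """Take the top-k by similarity first, then drop non-matching payloads."""
    ranked = sorted(POINTS, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, payload in ranked[:k] if predicate(payload)]

def filtered_search(query, k, predicate):
    """Apply the predicate while selecting candidates, then take the top-k."""
    candidates = [p for p in POINTS if predicate(p[2])]
    ranked = sorted(candidates, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, _ in ranked[:k]]

def is_german(payload):
    return payload["lang"] == "de"

query = [1.0, 0.0]
print(post_filter_search(query, 2, is_german))  # → []
print(filtered_search(query, 2, is_german))     # → ['c', 'd']
```

With post-filtering, the two nearest points are both English, so the German-only query comes back empty even though matching documents exist. Filtering as part of candidate selection returns the two best German matches.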
Weaviate (BSD-3-Clause, weaviate/weaviate) is the most feature-complete of the three. Written in Go, it combines HNSW vector search with BM25 keyword search in a single unified query — what Weaviate calls hybrid search. Pure vector similarity fails on exact string matches like model names, product codes, and proper nouns; BM25 fills those gaps. Weaviate v1.37 added a built-in MCP (Model Context Protocol) server, meaning Claude Code, Cursor, and any MCP-compatible agent can read and write to your database natively without glue code. It also added Diversity Search (MMR) and query profiling with per-shard timing breakdowns.
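The idea behind hybrid search can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Weaviate's algorithm: the keyword score is a simple term-overlap stand-in for BM25, the 2-D embeddings in `VECTORS` are made up, and the `alpha` weighting (1.0 = vector only, 0.0 = keyword only) is a simplified blend rather than Weaviate's actual fusion methods.

```python
import math

# Hypothetical documents and made-up 2-D embeddings
DOCS = {
    "d1": "Qdrant v1.17 release notes",
    "d2": "How vector databases store embeddings",
    "d3": "Guide to approximate nearest neighbour search",
}
VECTORS = {"d1": [0.2, 0.8], "d2": [0.9, 0.1], "d3": [0.8, 0.3]}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def keyword_score(query, text):
    """Fraction of query terms found in the text (a crude stand-in for BM25)."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms)

def hybrid_search(query_text, query_vector, alpha=0.5):
    """Rank documents by alpha * vector score + (1 - alpha) * keyword score."""
    scores = {}
    for doc_id, text in DOCS.items():
        vec = cosine(query_vector, VECTORS[doc_id])
        kw = keyword_score(query_text, text)
        scores[doc_id] = alpha * vec + (1 - alpha) * kw
    return sorted(scores, key=scores.get, reverse=True)

# The query embedding sits close to d2 and d3, but the exact string "v1.17"
# only appears in d1.
print(hybrid_search("v1.17", [1.0, 0.0], alpha=1.0))  # → ['d2', 'd3', 'd1']
print(hybrid_search("v1.17", [1.0, 0.0], alpha=0.5))  # → ['d1', 'd2', 'd3']
```

With vector similarity alone, the document containing the exact version string ranks last. Blending in the keyword signal moves it to the top, which is the gap hybrid search is meant to close.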
Versions, licensing, and architecture
| | ChromaDB | Qdrant | Weaviate |
|---|---|---|---|
| Current version | v1.5.9 (May 2026) | v1.17.1 (Mar 2026) | v1.37 (May 2026) |
| License | Apache 2.0 | Apache 2.0 | BSD-3-Clause |
| Core language | Rust (Python + JS clients) | Rust | Go |
| API surface | REST, Python, JS | REST + gRPC | REST + GraphQL + gRPC |
| Self-hostable | Yes | Yes | Yes |
| Managed cloud | Chroma Cloud | Qdrant Cloud | Weaviate Cloud |
Hardware requirements
All three run on CPU; none needs a GPU. RAM is the real constraint, because vector indexes need to live in memory for fast queries.
| | ChromaDB v1.5.9 | Qdrant v1.17.1 | Weaviate v1.37 |
|---|---|---|---|
| Development minimum | 2 GB RAM | 2 GB RAM | 8 GB RAM (recommended) |
| Production minimum | 8 GB RAM | 4 GB RAM | 16 GB RAM |
| GPU required | No | No | No |
| Python required | Yes (client) | No (Docker binary) | No (Docker) |
| OS support | Linux, macOS, Windows | Linux, macOS, Windows | Linux, macOS, Windows |
Qdrant's numbers are worth digging into. One million vectors at 1536 dimensions (the output size of OpenAI text-embedding-3-small) stored as float32 take about 6.1 GB for the raw vectors alone (1,000,000 × 1536 × 4 bytes), before HNSW index and payload overhead. Scalar quantization stores each component as an 8-bit integer, which cuts the vector data to roughly 1.5 GB, a 4× reduction; the Qdrant memory consumption documentation has measured figures. This is the reason Qdrant wins for memory-constrained production setups.
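The arithmetic behind estimates like this is easy to check yourself. The sketch below computes raw vector storage only; it deliberately ignores the HNSW graph, payloads, and any allocator overhead, so real memory use will be higher. The function name and the `bits_per_component` parameter are illustrative, not part of any Qdrant API.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bits_per_component: int) -> int:
    """Bytes needed to store the vector components alone, with no index overhead."""
    return num_vectors * dims * bits_per_component // 8

NUM_VECTORS = 1_000_000
DIMS = 1536

# Bits per component for each encoding: float32 = 32, int8 scalar = 8, binary = 1
for name, bits in [("float32", 32), ("scalar int8", 8), ("binary", 1)]:
    gb = raw_vector_bytes(NUM_VECTORS, DIMS, bits) / 1e9
    print(f"{name}: {gb:.3f} GB")
# float32: 6.144 GB
# scalar int8: 1.536 GB
# binary: 0.192 GB
```

The float32-to-binary ratio is 32×, which is where the "up to 32×" compression figure for binary quantization comes from. Scalar quantization gives 4×.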
Weaviate's higher baseline memory usage comes from its module system. Each built-in vectorizer, re-ranker, or generative model you enable loads additional components into the container. For raw vector storage with external embeddings, Weaviate is comparable to Qdrant; the gap appears when you start enabling modules.
ChromaDB's storage overhead runs 2–4× your data size on disk (data + HNSW index + WAL). For development, the in-process client needs only enough RAM to load the collection.
Installation
Chroma — in-process or server
pip install chromadb
import chromadb
# Ephemeral (in-memory, lost on restart)
client = chromadb.Client()
# Persistent (saved to disk, no separate process)
client = chromadb.PersistentClient(path="./chroma_data")
collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["doc one", "doc two"],
    ids=["id1", "id2"],
)
results = collection.query(query_texts=["example query"], n_results=2)
For multi-client server mode:
chroma run --path ./chroma_data --port 8000
That's the complete setup. No Docker required for development.
Qdrant — Docker-first
docker run -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant
Or with Docker Compose (recommended for persistence across restarts):
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333" # REST API
      - "6334:6334" # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    restart: unless-stopped
pip install qdrant-client
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
client = QdrantClient("localhost", port=6333)
client.create_collection(
    collection_name="my_docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
Weaviate — Docker Compose with config
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      DEFAULT_VECTORIZER_MODULE: "none"
      ENABLE_API_BASED_MODULES: "true"
      CLUSTER_HOSTNAME: "node1"
volumes:
  weaviate_data:
docker compose up -d
pip install weaviate-client
import weaviate
client = weaviate.connect_to_local()
Weaviate is available at localhost:8080. If you enable built-in vectorizers (text2vec-openai, text2vec-cohere, etc.), the Compose file gets more involved — see [docs.weaviate.io](https://docs.weaviate.i