DEV Community

Jovan Chan

Posted on • Originally published at aifoss.dev

chroma-vs-qdrant-vs-weaviate-2026

This article was originally published on aifoss.dev

---
title: 'Chroma vs Qdrant vs Weaviate 2026: RAG Database Compared'
description: 'Compare Chroma, Qdrant, and Weaviate for local RAG in 2026: version snapshots, filtering tradeoffs, hybrid search, quantization, and a clear pick by use case.'
pubDate: 'May 27 2026'
tags: ["vectordb", "ai", "rag", "python", "opensource"]
---

The three most commonly recommended open-source vector databases for RAG — Chroma, Qdrant, and Weaviate — are not interchangeable. Chroma is a prototyping tool that grew into a real product. Qdrant is a production workhorse written in Rust with the best filtering performance of the three. Weaviate is an enterprise-grade platform with hybrid search and the most built-in integrations. Using Weaviate when you need Chroma adds unnecessary ops overhead. Using Chroma when you need Qdrant means migrating under pressure when your collection outgrows it.

Versions covered: ChromaDB v1.5.9 (May 2026), Qdrant v1.17.1 (March 2026), Weaviate v1.37 (May 2026).


The quick answer

| Situation | Best choice |
| --- | --- |
| Local prototyping, notebooks, under 100K vectors | Chroma |
| Embedded in a Python process, no separate service | Chroma |
| Production RAG with filtering-heavy queries | Qdrant |
| Multi-user deployment, concurrent queries | Qdrant |
| Memory-constrained deployment at millions of vectors | Qdrant |
| Hybrid search (BM25 + vector in one query) | Weaviate |
| Multi-modal retrieval (text + images + audio) | Weaviate |
| Built-in re-ranking or generative AI modules | Weaviate |
| Kubernetes, team-operated, agentic MCP workflows | Weaviate |
| Getting from zero to working RAG in 10 minutes | Chroma |

What each tool actually is

ChromaDB (Apache 2.0, chroma-core/chroma) started as a pure-Python embedded database and was rebuilt in Rust for the v1.0 release. The Rust core eliminates Python's GIL bottlenecks and delivers roughly 4× faster writes and queries compared to the pre-1.0 implementation — write throughput went from ~10K to ~40K+ vectors/second in server mode. Chroma's design priority is developer ergonomics: pip install chromadb, three lines of Python, and you have a working local vector store. The default mode runs in-process — no Docker, no service to start, no YAML. You can run it in server mode for multi-client access when you're ready.

Qdrant (Apache 2.0, qdrant/qdrant) is written entirely in Rust and optimized for production-grade vector similarity search. It runs as a standalone service via Docker with REST and gRPC APIs. Qdrant's main differentiators are its payload filtering system — which combines vector similarity with structured metadata filters inside the HNSW traversal rather than as a post-filter — and its quantization stack (Scalar, Binary, Product, TurboQuant), which lets you compress large collections by up to 32× to stay within affordable RAM budgets. The Qdrant team publishes transparent benchmarks and consistently posts among the lowest latency at the highest recall in the ANN-benchmarks suite.
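To see why it matters where the filter runs, here is a minimal sketch using a brute-force scan in place of HNSW. It is not Qdrant's implementation; the toy points, the `lang` payload field, and the function names are all hypothetical. It only illustrates the failure mode of post-filtering: the top-k is chosen first, so matching documents can be pushed out before the filter ever sees them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy collection: (id, vector, payload)
points = [
    ("a", [1.0, 0.0], {"lang": "en"}),
    ("b", [0.9, 0.1], {"lang": "de"}),
    ("c", [0.8, 0.2], {"lang": "de"}),
    ("d", [0.1, 0.9], {"lang": "en"}),
    ("e", [0.0, 1.0], {"lang": "en"}),
]

def post_filter_search(query, k, predicate):
    """Take the k nearest points first, then drop those failing the filter."""
    ranked = sorted(points, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, payload in ranked[:k] if predicate(payload)]

def filtered_search(query, k, predicate):
    """Only consider points that pass the filter, then take the k nearest."""
    candidates = [p for p in points if predicate(p[2])]
    ranked = sorted(candidates, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, _ in ranked[:k]]

def is_english(payload):
    return payload["lang"] == "en"

query = [1.0, 0.0]
post = post_filter_search(query, 2, is_english)  # fewer than k results survive
pre = filtered_search(query, 2, is_english)      # a full k results
print(post, pre)
```

With post-filtering, two of the nearest neighbours are German documents, so only one result survives a request for two. Applying the condition while selecting candidates returns the full two English results.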

Weaviate (BSD-3-Clause, weaviate/weaviate) is the most feature-complete of the three. Written in Go, it combines HNSW vector search with BM25 keyword search in a single unified query — what Weaviate calls hybrid search. Pure vector similarity fails on exact string matches like model names, product codes, and proper nouns; BM25 fills those gaps. Weaviate v1.37 added a built-in MCP (Model Context Protocol) server, meaning Claude Code, Cursor, and any MCP-compatible agent can read and write to your database natively without glue code. It also added Diversity Search (MMR) and query profiling with per-shard timing breakdowns.
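The idea behind hybrid search can be sketched in a few lines of plain Python. This is an illustration of weighted score fusion under stated assumptions, not Weaviate's actual code: the scores are made up, the document IDs are hypothetical, and `alpha` is used here in the sense of "1.0 means pure vector, 0.0 means pure keyword".

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend normalized keyword and vector scores, best document first."""
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    fused = {
        doc: alpha * vec.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    # Sort by fused score descending, then by id for a stable order.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))

# Hypothetical scores for a query containing an exact product code.
keyword_scores = {"doc_exact": 12.0, "doc_related": 3.0, "doc_offtopic": 1.0}
vector_scores = {
    "doc_related": 0.82,
    "doc_vague": 0.80,
    "doc_exact": 0.61,
    "doc_offtopic": 0.40,
}

hybrid_order = hybrid_rank(keyword_scores, vector_scores, alpha=0.5)
vector_only_order = hybrid_rank(keyword_scores, vector_scores, alpha=1.0)
print(hybrid_order)
print(vector_only_order)
```

In this toy data the document containing the exact code ranks third on vector similarity alone, and moves to first once the keyword score is blended in.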


Versions, licensing, and architecture

| | ChromaDB | Qdrant | Weaviate |
| --- | --- | --- | --- |
| Current version | v1.5.9 (May 2026) | v1.17.1 (Mar 2026) | v1.37 (May 2026) |
| License | Apache 2.0 | Apache 2.0 | BSD-3-Clause |
| Core language | Rust (Python + JS clients) | Rust | Go |
| API surface | REST, Python, JS | REST + gRPC | REST + GraphQL + gRPC |
| Self-hostable | Yes | Yes | Yes |
| Managed cloud | Chroma Cloud | Qdrant Cloud | Weaviate Cloud |

Hardware requirements

All three are CPU-capable for small workloads. RAM is the real constraint — vector indexes need to live in memory for fast queries.

| | ChromaDB v1.5.9 | Qdrant v1.17.1 | Weaviate v1.37 |
| --- | --- | --- | --- |
| Development minimum | 2 GB RAM | 2 GB RAM | 8 GB RAM (recommended) |
| Production minimum | 8 GB RAM | 4 GB RAM | 16 GB RAM |
| GPU required | No | No | No |
| Python required | Yes (client) | No (Docker binary) | No (Docker) |
| OS support | Linux, macOS, Windows | Linux, macOS, Windows | Linux, macOS, Windows |

Qdrant's numbers are worth digging into. Raw float32 storage for 1 million vectors at 1536 dimensions (the default output size of OpenAI's text-embedding-3-small) is 1,000,000 × 1536 × 4 bytes ≈ 6.1 GB, before HNSW index and payload overhead. Scalar quantization stores each component as an int8 instead of a float32, which cuts the in-memory vector footprint by about 4×, to roughly 1.5 GB for the same dataset. This is the reason Qdrant wins for memory-constrained production setups.
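A quick way to sanity-check RAM figures like these is to compute the vector footprint directly. The sketch below is back-of-envelope arithmetic, not a Qdrant API. The function name is hypothetical, and the 1.5× overhead factor is an assumption based on the rule of thumb commonly quoted for capacity planning; real usage depends on HNSW parameters and payload size.

```python
def vector_ram_bytes(num_vectors, dims, bytes_per_component=4.0, overhead=1.0):
    """Estimate RAM for the vectors alone.

    bytes_per_component: 4 for float32, 1 for int8 scalar quantization,
    0.125 (one bit) for binary quantization.
    overhead: multiplier for index and metadata (assumed, not measured).
    """
    return num_vectors * dims * bytes_per_component * overhead

GB = 1000 ** 3
n, d = 1_000_000, 1536

raw_f32 = vector_ram_bytes(n, d)                          # float32, no overhead
planned = vector_ram_bytes(n, d, overhead=1.5)            # with assumed 1.5x headroom
scalar = vector_ram_bytes(n, d, bytes_per_component=1)    # int8 scalar quantization
binary = vector_ram_bytes(n, d, bytes_per_component=1/8)  # 1-bit binary quantization

for label, size in [("float32", raw_f32), ("float32 + overhead", planned),
                    ("scalar int8", scalar), ("binary", binary)]:
    print(f"{label:>20}: {size / GB:.2f} GB")
```

The ratios are the useful part: int8 scalar quantization is 4× smaller than float32, and one bit per component is where the "up to 32×" figure comes from.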

Weaviate's higher baseline memory usage comes from its module system. Each built-in vectorizer, re-ranker, or generative model you enable loads additional components into the container. For raw vector storage with external embeddings, Weaviate is comparable to Qdrant; the gap appears when you start enabling modules.

ChromaDB's storage overhead runs 2–4× your data size on disk (data + HNSW index + WAL). For development, the in-process client needs only enough RAM to load the collection.


Installation

Chroma — in-process or server

pip install chromadb

Then create a client and a collection from Python:

import chromadb

# Ephemeral (in-memory, lost on restart)
client = chromadb.Client()

# Persistent (saved to disk, no separate process)
client = chromadb.PersistentClient(path="./chroma_data")

collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["doc one", "doc two"],
    ids=["id1", "id2"]
)
results = collection.query(query_texts=["example query"], n_results=2)

For multi-client server mode:

chroma run --path ./chroma_data --port 8000

That's the complete setup. No Docker required for development.

Qdrant — Docker-first

docker run -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant

Or with Docker Compose, which adds a restart policy so the service comes back up after a crash or reboot:

services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    restart: unless-stopped

Then install the Python client:

pip install qdrant-client

Connect and create a collection:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient("localhost", port=6333)
client.create_collection(
    collection_name="my_docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

Weaviate — Docker Compose with config

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
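    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      # Store data under the mounted volume so it survives container restarts
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      DEFAULT_VECTORIZER_MODULE: "none"
      ENABLE_API_BASED_MODULES: "true"
      CLUSTER_HOSTNAME: "node1"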

volumes:
  weaviate_data:

Start the service and install the Python client:

docker compose up -d
pip install weaviate-client

Then connect from Python:

import weaviate

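# Connects to localhost:8080 (REST) and localhost:50051 (gRPC) by default
client = weaviate.connect_to_local()
print(client.is_ready())  # confirm the server is reachable
client.close()            # release the connection when done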

Weaviate is available at localhost:8080. If you enable built-in vectorizers (text2vec-openai, text2vec-cohere, etc.), the Compose file gets more involved — see [docs.weaviate.io](https://docs.weaviate.i
