Jovan Chan

Posted on Jun 2 • Originally published at aifoss.dev

chroma-vs-qdrant-vs-weaviate-2026

#opensource #ai #selfhosted #linux

---
title: 'Chroma vs Qdrant vs Weaviate 2026: RAG Database Compared'
description: 'Compare Chroma, Qdrant, and Weaviate for local RAG in 2026: version snapshots, filtering tradeoffs, hybrid search, quantization, and a clear pick by use case.'
pubDate: 'May 27 2026'

tags: ["vectordb", "ai", "rag", "python", "opensource"]
---

The three most commonly recommended open-source vector databases for RAG — Chroma, Qdrant, and Weaviate — are not interchangeable. Chroma is a prototyping tool that grew into a real product. Qdrant is a production workhorse written in Rust with the best filtering performance of the three. Weaviate is an enterprise-grade platform with hybrid search and the most built-in integrations. Using Weaviate when you need Chroma adds unnecessary ops overhead. Using Chroma when you need Qdrant means migrating under pressure when your collection outgrows it.

Versions covered: ChromaDB v1.5.9 (May 2026), Qdrant v1.17.1 (March 2026), Weaviate v1.37 (May 2026).

The quick answer

| Situation | Best choice |
| --- | --- |
| Local prototyping, notebooks, under 100K vectors | Chroma |
| Embedded in a Python process (no separate service) | Chroma |
| Production RAG with filtering-heavy queries | Qdrant |
| Multi-user deployment, concurrent queries | Qdrant |
| Memory-constrained deployment at millions of vectors | Qdrant |
| Hybrid search (BM25 + vector in one query) | Weaviate |
| Multi-modal retrieval (text + images + audio) | Weaviate |
| Built-in re-ranking or generative AI modules | Weaviate |
| Kubernetes, team-operated, agentic MCP workflows | Weaviate |
| Getting from zero to working RAG in 10 minutes | Chroma |

What each tool actually is

ChromaDB (Apache 2.0, chroma-core/chroma) started as a pure-Python embedded database and was rebuilt in Rust for the v1.0 release. The Rust core eliminates Python's GIL bottlenecks and delivers roughly 4× faster writes and queries compared to the pre-1.0 implementation — write throughput went from ~10K to ~40K+ vectors/second in server mode. Chroma's design priority is developer ergonomics: pip install chromadb, three lines of Python, and you have a working local vector store. The default mode runs in-process — no Docker, no service to start, no YAML. You can run it in server mode for multi-client access when you're ready.

Qdrant (Apache 2.0, qdrant/qdrant) is written entirely in Rust and optimized for production-grade vector similarity search. It runs as a standalone service via Docker with REST and gRPC APIs. Qdrant's main differentiators are its payload filtering system — which combines vector similarity with structured metadata filters inside the HNSW traversal rather than as a post-filter — and its quantization stack (Scalar, Binary, Product, TurboQuant), which lets you compress large collections by up to 32× to stay within affordable RAM budgets. The Qdrant team publishes transparent benchmarks and consistently posts among the lowest latency at the highest recall in the ANN-benchmarks suite.
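The practical difference between filtering during the search and filtering afterwards can be shown with a brute-force sketch in plain Python. This is not Qdrant's implementation (per the description above, Qdrant applies filters during HNSW traversal); the toy vectors, the `lang` payload field, and both helper functions are invented for illustration. It only demonstrates how a post-filter can return fewer results than requested.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy collection: (id, vector, payload)
POINTS = [
    ("a", [1.0, 0.0], {"lang": "en"}),
    ("b", [0.9, 0.1], {"lang": "en"}),
    ("c", [0.8, 0.2], {"lang": "de"}),
    ("d", [0.1, 0.9], {"lang": "de"}),
]

def post_filter_search(query, k, lang):
    """Take the global top-k first, then drop points that fail the filter."""
    ranked = sorted(POINTS, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, payload in ranked[:k] if payload["lang"] == lang]

def filtered_search(query, k, lang):
    """Restrict candidates to matching points, then take the top-k."""
    candidates = [p for p in POINTS if p[2]["lang"] == lang]
    ranked = sorted(candidates, key=lambda p: cosine(query, p[1]), reverse=True)
    return [pid for pid, _, _ in ranked[:k]]

query = [1.0, 0.0]
post = post_filter_search(query, k=2, lang="de")
pre = filtered_search(query, k=2, lang="de")
print(post)  # the two nearest points overall are both "en", so none survive
print(pre)   # both "de" points come back, best match first
```

With a selective filter, the post-filter variant has to over-fetch and retry to fill `k` results, which is the cost that in-search filtering avoids.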

Weaviate (BSD-3-Clause, weaviate/weaviate) is the most feature-complete of the three. Written in Go, it combines HNSW vector search with BM25 keyword search in a single unified query — what Weaviate calls hybrid search. Pure vector similarity fails on exact string matches like model names, product codes, and proper nouns; BM25 fills those gaps. Weaviate v1.37 added a built-in MCP (Model Context Protocol) server, meaning Claude Code, Cursor, and any MCP-compatible agent can read and write to your database natively without glue code. It also added Diversity Search (MMR) and query profiling with per-shard timing breakdowns.
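To make the hybrid idea concrete, here is a minimal sketch of score fusion in plain Python. It is a simplified illustration, not Weaviate's actual fusion code: the min-max normalization, the `hybrid_rank` helper, the document IDs, and every score are assumptions made up for the example. The `alpha` convention (1 is pure vector, 0 is pure keyword) follows the way Weaviate's hybrid queries weight the two sides.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(vector_scores, keyword_scores, alpha=0.5):
    """Blend normalized scores; alpha=1 is pure vector, alpha=0 is pure keyword."""
    v, kw = min_max(vector_scores), min_max(keyword_scores)
    docs = set(v) | set(kw)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in docs}
    # Sort by fused score (descending), breaking ties by document ID
    return sorted(fused, key=lambda d: (-fused[d], d))

# Invented scores for a query containing an exact product name
vector_scores = {"gpu-overview": 0.82, "rtx-4090-review": 0.78, "cpu-guide": 0.60}
keyword_scores = {"rtx-4090-review": 7.5, "gpu-overview": 1.0, "cpu-guide": 0.5}

vector_only = hybrid_rank(vector_scores, keyword_scores, alpha=1.0)
hybrid = hybrid_rank(vector_scores, keyword_scores, alpha=0.5)
print(vector_only)
print(hybrid)
```

In this toy data the vector side slightly prefers the general overview, while the keyword side strongly prefers the document that contains the exact product name, so blending the two moves the exact match to the top.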

Versions, licensing, and architecture

| | ChromaDB | Qdrant | Weaviate |
| --- | --- | --- | --- |
| Current version | v1.5.9 (May 2026) | v1.17.1 (Mar 2026) | v1.37 (May 2026) |
| License | Apache 2.0 | Apache 2.0 | BSD-3-Clause |
| Core language | Rust (Python + JS clients) | Rust | Go |
| API surface | REST, Python, JS | REST + gRPC | REST + GraphQL + gRPC |
| Self-hostable | Yes | Yes | Yes |
| Managed cloud | Chroma Cloud | Qdrant Cloud | Weaviate Cloud |

Hardware requirements

All three are CPU-capable for small workloads. RAM is the real constraint — vector indexes need to live in memory for fast queries.

| | ChromaDB v1.5.9 | Qdrant v1.17.1 | Weaviate v1.37 |
| --- | --- | --- | --- |
| Development minimum | 2 GB RAM | 2 GB RAM | 8 GB RAM (recommended) |
| Production minimum | 8 GB RAM | 4 GB RAM | 16 GB RAM |
| GPU required | No | No | No |
| Python required | Yes (client) | No (Docker binary) | No (Docker) |
| OS support | Linux, macOS, Windows | Linux, macOS, Windows | Linux, macOS, Windows |

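Qdrant's numbers are worth digging into. For 1 million vectors at 1536 dimensions (the output size of OpenAI text-embedding-3-small; text-embedding-3-large defaults to 3072), the raw float32 vectors alone take about 6.1 GB (1,000,000 × 1536 × 4 bytes), before HNSW graph and payload overhead. Enable Scalar quantization (int8, one byte per dimension) and the in-RAM vector footprint drops 4× to roughly 1.5 GB. See the Qdrant memory consumption documentation for measured figures. This is the reason Qdrant wins for memory-constrained production setups.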
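As a back-of-the-envelope check, the raw vector footprint is just count × dimensions × bytes per component. The sketch below works that out for float32, int8 (scalar quantization), and 1-bit (binary quantization) storage. The helper names are hypothetical, and the 1.5× overhead multiplier is a rough planning assumption for index and metadata, not a measured value.

```python
def vector_bytes(num_vectors, dims, bytes_per_dim):
    """Raw storage for the vectors alone, without index or payload overhead."""
    return num_vectors * dims * bytes_per_dim

def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes."""
    return num_bytes / 1e9

N, DIMS = 1_000_000, 1536

float32_gb = to_gb(vector_bytes(N, DIMS, 4))  # 4 bytes per float32 component
int8_gb = to_gb(vector_bytes(N, DIMS, 1))     # scalar quantization: 1 byte each
binary_gb = to_gb(N * DIMS / 8)               # binary quantization: 1 bit each

# ASSUMPTION: rough planning multiplier for HNSW graph and metadata overhead
OVERHEAD = 1.5

print(f"float32 raw:          {float32_gb:.2f} GB")
print(f"float32 with overhead: {float32_gb * OVERHEAD:.2f} GB")
print(f"int8 raw:             {int8_gb:.2f} GB")
print(f"binary raw:           {binary_gb:.2f} GB")
print(f"float32 / binary:     {float32_gb / binary_gb:.0f}x")
```

The float32-to-binary ratio is where the "up to 32×" compression figure comes from: 32 bits per component shrink to 1 bit.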

Weaviate's higher baseline memory usage comes from its module system. Each built-in vectorizer, re-ranker, or generative model you enable loads additional components into the container. For raw vector storage with external embeddings, Weaviate is comparable to Qdrant; the gap appears when you start enabling modules.

ChromaDB's storage overhead runs 2–4× your data size on disk (data + HNSW index + WAL). For development, the in-process client needs only enough RAM to load the collection.

Installation

Chroma — in-process or server

pip install chromadb

import chromadb

# Ephemeral (in-memory, lost on restart)
client = chromadb.Client()

# Persistent (saved to disk, no separate process)
client = chromadb.PersistentClient(path="./chroma_data")

collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["doc one", "doc two"],
    ids=["id1", "id2"]
)
results = collection.query(query_texts=["example query"], n_results=2)

For multi-client server mode:

chroma run --path ./chroma_data --port 8000

That's the complete setup. No Docker required for development.

Qdrant — Docker-first

docker run -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant

Or with Docker Compose, which keeps the same volume mapping in a file and restarts the container automatically (recommended for anything long-running):

services:
  qdrant:
    image: qdrant/qdrant:latest  # pin a release tag (for example v1.17.1) in production
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    restart: unless-stopped

pip install qdrant-client

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient("localhost", port=6333)
client.create_collection(
    collection_name="my_docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

Weaviate — Docker Compose with config

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"  # must match the volume mount above
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      DEFAULT_VECTORIZER_MODULE: "none"
      ENABLE_API_BASED_MODULES: "true"
      CLUSTER_HOSTNAME: "node1"

volumes:
  weaviate_data:

docker compose up -d
pip install weaviate-client

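import weaviate

client = weaviate.connect_to_local()  # defaults: HTTP 8080, gRPC 50051
print(client.is_ready())  # True once the server accepts requests
client.close()  # the v4 client holds open connections; close it when done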

Weaviate is available at localhost:8080. If you enable built-in vectorizers (text2vec-openai, text2vec-cohere, etc.), the Compose file gets more involved — see [docs.weaviate.io](https://docs.weaviate.i
