DEV Community

Zayd Mulani
I built a local-first hybrid vector database in Rust from scratch

A few months ago I started building vecdb — a vector database that
runs entirely on your own machine. No cloud, no API keys, no subscription.

The problem

Most vector databases make you choose — semantic search OR keyword search.
Semantic search finds meaning but misses exact keywords. Keyword search
finds exact matches but misses meaning.

vecdb combines both in a two-stage retrieval pipeline, then fuses the scores:

  1. HNSW dense index retrieves candidates by meaning
  2. BM25 sparse index re-scores by keyword relevance
  3. A fusion function combines both scores
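
To make step 3 concrete, here is a minimal sketch of one common way to fuse the two scores: min-max normalize each score list, then take a weighted sum. This is an assumption for illustration, not necessarily what vecdb does internally. The `Candidate` struct, the `fuse` function, and the `alpha` weight are hypothetical names.

```rust
/// A candidate that survived the dense (HNSW) stage and was re-scored by BM25.
/// Hypothetical type for illustration; not taken from the vecdb codebase.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u64,
    pub dense: f32,  // similarity from the dense stage
    pub sparse: f32, // BM25 score from the sparse stage
}

/// Min-max normalize scores into [0, 1].
/// If every score is identical, all normalized values are 0.0.
fn min_max(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|s| if range > 0.0 { (s - min) / range } else { 0.0 })
        .collect()
}

/// Weighted-sum fusion (an assumed strategy):
/// fused = alpha * dense_norm + (1 - alpha) * sparse_norm.
/// Returns (id, fused score) pairs sorted from best to worst.
pub fn fuse(candidates: &[Candidate], alpha: f32) -> Vec<(u64, f32)> {
    let dense: Vec<f32> = candidates.iter().map(|c| c.dense).collect();
    let sparse: Vec<f32> = candidates.iter().map(|c| c.sparse).collect();
    let dense_norm = min_max(&dense);
    let sparse_norm = min_max(&sparse);

    let mut fused: Vec<(u64, f32)> = candidates
        .iter()
        .zip(dense_norm.iter().zip(sparse_norm.iter()))
        .map(|(c, (d, s))| (c.id, alpha * d + (1.0 - alpha) * s))
        .collect();

    // Highest fused score first.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1));
    fused
}
```

Normalizing first matters because BM25 scores are unbounded while similarity scores usually sit in a fixed range, so a raw sum would let the keyword score dominate. With `alpha = 1.0` the ranking falls back to pure semantic order, and with `alpha = 0.0` to pure keyword order.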

What it can do

  • Hybrid HNSW + BM25 retrieval
  • SQL-like query language with a VECTOR_SIM predicate
  • Python and TypeScript SDKs
  • Single binary, Docker support
  • 187 tests
  • MIT license

Example query

SELECT * FROM documents
WHERE VECTOR_SIM(vec, [0.1, 0.2, 0.3]) > 0.75
AND payload->>'region' = 'US'
LIMIT 10;
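
To show what this query asks for, here is a brute-force sketch of its semantics in Rust. It assumes VECTOR_SIM is cosine similarity and that the payload is a flat string map. Both are assumptions for illustration. The `Doc` type and the `cosine` and `run_query` functions are hypothetical and do not reflect vecdb's actual engine, which would use the HNSW index rather than a full scan.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory row: an id, an embedding, and a string payload.
pub struct Doc {
    pub id: u64,
    pub vec: Vec<f32>,
    pub payload: HashMap<String, String>,
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero-norm vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Full-scan equivalent of:
///   WHERE VECTOR_SIM(vec, query) > threshold
///   AND payload->>'region' = region
///   LIMIT limit
/// Matches are returned in scan order, since the query has no ORDER BY.
pub fn run_query<'a>(
    docs: &'a [Doc],
    query: &[f32],
    threshold: f32,
    region: &str,
    limit: usize,
) -> Vec<&'a Doc> {
    docs.iter()
        .filter(|d| cosine(&d.vec, query) > threshold)
        .filter(|d| d.payload.get("region").map(String::as_str) == Some(region))
        .take(limit)
        .collect()
}
```

Calling `run_query(&docs, &[0.1, 0.2, 0.3], 0.75, "US", 10)` mirrors the query above: the vector predicate and the payload filter both have to hold before a row counts toward the limit.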

Try it

GitHub: https://github.com/zaydmulani09/vecdb

I would love feedback from the community, especially on the
architecture and what to tackle in v0.2.0.