Rhea Kapoor

Milvus or FAISS? What I Learned Building a High-Performance Vector Search Engine

As an open-source database engineer with years of experience diving into low-level internals, I’ve been closely exploring vector databases while building a semantic search engine. Recently, I dug deep into two of the most prominent players in the vector search space: Milvus and FAISS. What I found was a set of fascinating trade-offs that go far beyond benchmarks or marketing claims.

In this post, I’ll walk you through my hands-on reflections, design considerations, and some microbenchmarking notes I gathered while testing these tools on a 10M vector dataset.


Understanding the Tools: Milvus vs. FAISS

At their core, both Milvus and FAISS address the same fundamental challenge: performing efficient similarity search across high-dimensional vector spaces, a task critical for applications like semantic search, retrieval-augmented generation (RAG), and recommendation systems.
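Before looking at how the two differ, it helps to see what that fundamental operation looks like with no index at all. Here's a minimal NumPy sketch of brute-force nearest-neighbor search; both tools exist to make exactly this computation fast at scale (the sizes and variable names are placeholders for illustration only):

```python
import numpy as np

# Toy corpus: 1,000 vectors of 128 dimensions (deliberately tiny)
rng = np.random.default_rng(42)
corpus = rng.random((1000, 128), dtype=np.float32)
query = rng.random((1, 128), dtype=np.float32)

# Brute-force L2 distance from the query to every corpus vector
distances = np.linalg.norm(corpus - query, axis=1)

# Indices of the top-5 nearest neighbors
top5 = np.argsort(distances)[:5]
print(top5, distances[top5])
```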

But their approaches differ.

  • Milvus is a purpose-built, open-source vector database designed for large-scale deployments. It wraps vector indexing with a full database engine: storage management, distributed deployments, and ecosystem integrations.
  • FAISS (Facebook AI Similarity Search) is a C++ library (with Python bindings) hyper-optimized for fast similarity search, especially on GPUs. It offers blazing speed for core vector operations but skips database features like replication, persistence, or query language layers.

If you’re familiar with how PostgreSQL differs from something like RocksDB, the analogy roughly holds here: a full database server on one side, an embeddable storage-and-search library on the other.


Architecture: Database vs. Library

One of the most striking contrasts lies in their architectural philosophy.

| Feature | Milvus | FAISS |
| --- | --- | --- |
| Storage & Persistence | Built-in, supports disk + memory | In-memory; persistence handled externally |
| Distributed Support | Yes (sharding, replication) | No (single-node) |
| Index Types | IVF, HNSW, DiskANN, etc. | IVF, HNSW, PQ, OPQ |
| Hardware Acceleration | CPU + GPU | CPU + GPU |
| Language Interface | API (REST, gRPC, SDKs) | C++ / Python library |

This means that if you just want to embed FAISS into a custom Go or C++ application, you can—no server required. But if you need something that comes with built-in high availability or scales horizontally across machines, Milvus offers a much more complete package.
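To make that contrast concrete, here's a rough pymilvus sketch of the client/server interaction Milvus expects. It assumes a Milvus instance is already running at localhost:19530; the collection name and data are purely illustrative, not a production setup.

```python
from pymilvus import MilvusClient
import numpy as np

# Connect to a running Milvus server (assumed to be at localhost:19530)
client = MilvusClient(uri="http://localhost:19530")

# Create a collection for 128-dimensional vectors (name is illustrative)
client.create_collection(collection_name="demo_collection", dimension=128)

# Insert a handful of vectors; Milvus persists these on the server side
vectors = np.random.random((10, 128)).astype("float32")
client.insert(
    collection_name="demo_collection",
    data=[{"id": i, "vector": vectors[i].tolist()} for i in range(10)],
)

# Search: the query ships over the network, the index lives in the cluster
results = client.search(
    collection_name="demo_collection",
    data=[vectors[0].tolist()],
    limit=5,
)
print(results)
```

The shape of the interaction is the point: storage, indexing, and (in a cluster) replication live behind the server, whereas the FAISS snippet later in this post runs entirely inside your own process.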


Microbenchmarking Insights

To get beyond surface-level comparisons, I ran some small-scale tests using a 10 million vector dataset (128 dimensions) under two scenarios:

1️⃣ Exact Search (Brute Force)

  • FAISS (CPU): ~1200 QPS
  • Milvus (CPU): ~1000 QPS

2️⃣ Approximate Search (IVF, nprobe=10)

  • FAISS (GPU): ~9500 QPS
  • Milvus (GPU): ~8700 QPS

These aren’t definitive numbers—they depend heavily on hardware, configuration, and dataset—but they illustrate a general pattern: FAISS tends to edge out Milvus in raw search speed, especially when GPU-optimized, while Milvus offers “good enough” performance with the added benefit of persistence and cluster-wide scaling.

Here’s a simple FAISS Python snippet to illustrate approximate search:

```python
import faiss
import numpy as np

d = 128           # vector dimension
nb = 10_000_000   # database size (~5 GB of float32 vectors, so mind your RAM)
nq = 10           # number of queries

xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')

# IVF index: 1000 coarse clusters over flat (uncompressed) vectors
index = faiss.index_factory(d, "IVF1000,Flat")
index.train(xb)   # learn the cluster centroids
index.add(xb)

index.nprobe = 10           # probe 10 clusters per query, matching the scenario above
D, I = index.search(xq, 5)  # top-5 nearest: distances in D, ids in I
```
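Raw QPS only means something alongside recall, so alongside the two scenarios above I also sanity-checked how many of the true neighbors the IVF index actually returns. Here's a rough sketch of that check; the sizes are deliberately scaled down so it runs quickly, and it is not the benchmark harness itself.

```python
import faiss
import numpy as np

d, nb, nq, k = 128, 100_000, 100, 5   # small sizes, just to illustrate the check
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')

# Ground truth from exact (brute-force) search
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, gt = flat.search(xq, k)

# Approximate search with IVF, nprobe=10
ivf = faiss.index_factory(d, "IVF1000,Flat")
ivf.train(xb)
ivf.add(xb)
ivf.nprobe = 10
_, approx = ivf.search(xq, k)

# recall@k: fraction of true neighbors the IVF index recovered
recall = np.mean([len(set(gt[i]) & set(approx[i])) / k for i in range(nq)])
print(f"recall@{k}: {recall:.3f}")
```

Raising nprobe pushes recall toward the exact-search baseline at the cost of QPS, which is exactly the knob behind the approximate numbers above.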

Deployment Trade-offs

Here’s where things get practical.

| Scenario | Recommendation |
| --- | --- |
| Embedded system, no persistence needed | FAISS |
| Scalable cloud-native search service | Milvus |
| GPU-heavy approximate search pipelines | FAISS (GPU) |
| Multi-modal search with text + image data | Milvus (multi-modal APIs) |

A key note: FAISS leaves a lot of system concerns (like checkpointing or failover) up to you. For some custom pipelines, that’s fine—even preferable. But if you’re running a production service where consistency and uptime matter, Milvus’s built-in clustering, replication, and monitoring become serious advantages.
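To illustrate what "up to you" means in practice: FAISS gives you serialization primitives, but deciding when to checkpoint and how to recover after a failure is entirely your application's job. A minimal sketch (the file path is just a placeholder):

```python
import faiss
import numpy as np

d = 128
index = faiss.IndexFlatL2(d)
index.add(np.random.random((1000, d)).astype('float32'))

# Checkpointing is on you: FAISS only provides the serialization call
faiss.write_index(index, "my_index.faiss")   # path is illustrative

# ...after a restart or failover, you reload it yourself
restored = faiss.read_index("my_index.faiss")
D, I = restored.search(np.random.random((1, d)).astype('float32'), 5)
```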


Benchmarking Tools

I also explored VectorDBBench, an open-source benchmarking suite designed to compare vector databases across datasets and workloads. It’s useful if you want structured comparisons on your own hardware rather than relying on vendor claims.

The VectorDBBench Leaderboard gives a quick overview of how mainstream systems stack up, but I always recommend running your own benchmarks because workload patterns vary so much.


Final Thoughts

What I came away with was this: choosing between Milvus and FAISS isn’t just about raw performance—it’s about system design. Do you want a low-level, embeddable search library or a full-fledged database that handles storage, scaling, and fault tolerance?

Both tools are excellent at what they’re designed for, but they sit at different points on the system design spectrum. As always, understanding your own workload’s needs—latency, throughput, consistency, scalability—should guide the choice.

If you’re experimenting with vector search at scale, I’d love to hear what design decisions you’re wrestling with. Let’s swap notes.
