I’ve spent months stress-testing vector search algorithms, and Hierarchical Navigable Small Worlds (HNSW) consistently stands out for mid-sized datasets. But it’s no silver bullet. Here’s what I’ve learned from implementing it, benchmarking trade-offs, and seeing it fail.
Why Naive Search Fails at Scale
Calculating Euclidean distances for all vectors works for tiny datasets. At 1 million 768-dim vectors, a naive Python scan takes ~1.2 seconds per query on an A100 GPU—unacceptable for real-time applications. This collapses completely beyond 10M vectors. Graph-based indices like HNSW reduce this to milliseconds, but introduce other constraints.
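For reference, the naive baseline is just a full distance scan. A minimal NumPy version (the sizes here are illustrative, not my benchmark setup):

```python
import numpy as np

def brute_force_search(vectors, query, k=10):
    """Exact k-NN: Euclidean distance to every stored vector, O(n * dim) per query."""
    distances = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(distances)[:k]

# Illustrative scale: 1M vectors, 768 dimensions, a single query
vectors = np.random.rand(1_000_000, 768).astype(np.float32)
query = np.random.rand(768).astype(np.float32)
print(brute_force_search(vectors, query))
```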
Navigable Small Worlds (NSW): Simple but Brittle
How NSW Builds Connections
- Start with an empty graph.
- For each new vector:
  - Find `R` nearest neighbors in the existing graph (greedy search from a random entry point; sketched below).
  - Connect the vector to these neighbors.
  - Prune excess edges (default `R=16`).
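Since both NSW construction and search lean on this greedy routine, here is a minimal sketch of it. The graph layout (a dict of node id to neighbor ids) and the helper names are assumptions for illustration, not any particular library's API.

```python
import numpy as np

def greedy_search(graph, vectors, query, entry_point, ef=16):
    """Best-first search over one proximity graph.

    graph:   dict mapping node id -> list of neighbor ids (assumed layout)
    vectors: np.ndarray of shape (n, dim) holding the stored vectors
    Returns up to ef node ids, closest to the query first.
    """
    dist = lambda i: float(np.linalg.norm(vectors[i] - query))
    visited = {entry_point}
    frontier = [(dist(entry_point), entry_point)]   # candidates still to expand
    best = list(frontier)                           # best results found so far

    while frontier:
        frontier.sort()
        d, node = frontier.pop(0)
        # Stop once the closest unexpanded candidate is worse than our worst result
        if len(best) >= ef and d > best[-1][0]:
            break
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((dist(neighbor), neighbor))
                best.append((dist(neighbor), neighbor))
        best = sorted(best)[:ef]

    return [node for _, node in best]
```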
Search Limitations I’ve Observed
In my tests on 10M GloVe vectors, NSW often got stuck in local minima. Starting from 10 random entry points improved recall@10 from 72% to 88%, but doubled latency. Worse, in low dimensions (e.g., 2D embeddings), NSW’s graph became entangled, causing 30% longer search paths.
HNSW’s Hierarchy Fixes NSW’s Flaws
HNSW adds layers to NSW. Each layer is a separate graph. Top layers (fewer nodes) allow long hops; bottom layers (all nodes) refine results.
Construction: A Top-Down Process
```python
# Pseudocode for HNSW insertion (helper functions assumed to exist)
def insert_vector(vector, max_layers=5):
    # Draw the vector's top layer from a truncated geometric distribution
    layer = random_layer(max_layers)
    entry_point = top_layer_entry

    # Phase 1: greedily descend from the top layer to one layer above 'layer',
    # carrying the closest node found as the entry point for the next layer down
    for current_layer in range(max_layers - 1, layer, -1):
        neighbors = greedy_search(vector, entry_point, ef=16, layer=current_layer)
        entry_point = neighbors[0]

    # Phase 2: insert into every layer from 'layer' down to the bottom,
    # connecting to nearest neighbors and pruning to at most max_edges
    for l in range(layer, -1, -1):
        connect_to_neighbors(vector, l, max_edges=32)
```
Key parameters:
- `max_layers`: Balances build time vs. search speed.
- `efConstruction`: Trades recall for faster indexing (tested below).
Search: From Coarse to Fine
- Start at top layer, find nearest neighbor to query.
- Use this neighbor as entry point to the layer below.
- Repeat until the bottom layer.
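In the same pseudocode style as the insertion routine above, a query descends the hierarchy roughly like this (the helpers mirror the insertion pseudocode and are not a specific library's API):

```python
# Pseudocode for HNSW search
def search(query, max_layers=5, ef_search=64, k=10):
    entry_point = top_layer_entry

    # Upper layers: a single greedy hop (ef=1) is enough; we only need
    # a good entry point for the next layer down
    for current_layer in range(max_layers - 1, 0, -1):
        nearest = greedy_search(query, entry_point, ef=1, layer=current_layer)
        entry_point = nearest[0]

    # Bottom layer: widen the beam to efSearch and keep the top-k results
    candidates = greedy_search(query, entry_point, ef=ef_search, layer=0)
    return candidates[:k]
```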
Benchmarks: Where HNSW Excels and Stumbles
I tested on 10M Cohere embeddings (768-dim), NVIDIA A100, efSearch=64:
| Metric | NSW | HNSW (max_layers=5) |
| --- | --- | --- |
| Avg. Latency | 42ms | 9ms |
| Recall@10 | 88% | 98% |
| Build Time | 18 min | 34 min |
| Memory Overhead | 12 GB | 28 GB |
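If you want to reproduce this kind of measurement yourself, a minimal latency benchmark with the open-source hnswlib package looks roughly like this. This is not the code behind the table above; the data, sizes, and parameter values are placeholders.

```python
import time
import numpy as np
import hnswlib

dim, n = 768, 100_000                       # placeholder scale; the table above used 10M vectors
data = np.random.rand(n, dim).astype(np.float32)
queries = np.random.rand(1_000, dim).astype(np.float32)

index = hnswlib.Index(space="l2", dim=dim)
index.init_index(max_elements=n, M=32, ef_construction=200)
index.add_items(data)
index.set_ef(64)                            # efSearch, matching the setting above

start = time.perf_counter()
labels, distances = index.knn_query(queries, k=10)
print(f"avg latency: {(time.perf_counter() - start) / len(queries) * 1e3:.2f} ms/query")
```

Measuring recall additionally requires a brute-force pass over the same queries to establish ground-truth neighbors.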
When I’d Avoid HNSW:
- Memory-bound systems: HNSW uses ~3–5x more RAM than PQ-based indices.
- Static datasets: For read-heavy workloads, consider disk-optimized indices like DiskANN.
- Ultra-high dimensions (>1K): HNSW’s recall drops below ANN alternatives like ScaNN.
Implementation Pitfalls I’ve Encountered
- Edge Pruning: Not limiting edges during insertion (`max_edges=32`) bloated memory by 40%.
- Layer Distribution: Skipping geometric sampling caused unbalanced graphs, increasing latency variance (see the sampling sketch after this list).
- Hardware Mismatch: On CPUs, `efSearch>128` often caps throughput below 100 QPS.
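For the layer-distribution point, the assignment described in the HNSW paper is a simple truncated geometric draw. A sketch, assuming the level multiplier mL = 1/ln(M) from the original paper (variable names are mine):

```python
import math
import random

def random_layer(max_layers, M=32):
    """Draw a node's top layer from an exponentially decaying distribution,
    truncated so it never exceeds the index's highest layer."""
    m_l = 1.0 / math.log(M)                          # level multiplier from the HNSW paper
    layer = int(-math.log(1.0 - random.random()) * m_l)
    return min(layer, max_layers - 1)
```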
Is HNSW Right for Your Stack?
Opt for HNSW when:
- Your dataset fits in memory (≤100M vectors).
- You need <20ms latency at high recall.
- Index build time isn’t critical (e.g., batch updates nightly).
Avoid if:
- You’re on embedded devices/low RAM.
- Your vectors change frequently in real time (HNSW handles inserts, but frequent deletions and updates degrade the graph).
- Recall >99% is non-negotiable (brute-force still wins).
Open-source vector databases like Milvus use HNSW as a default for good reason—but always validate against your data distribution. I once saw a 20% latency spike on medical images vs. text embeddings due to clustered vector spaces.
What I’m Exploring Next
While HNSW dominates mid-scale search, I’m testing hybrid approaches:
- Coupling HNSW with product quantization to cut memory.
- Layer-free hierarchies for streaming data.
- Failure mode analysis when vectors follow power-law distributions.
No algorithm is universally optimal. HNSW trades memory and build time for speed and recall. Measure twice, implement once.