The Memory Wall Problem
Most vector indexes prioritize RAM for low latency. HNSW, for example, achieves 95% recall at <5ms for 100M vectors but requires ~500GB RAM. At 1B vectors, RAM costs exceed $10k/month on cloud instances—prohibitively expensive for many teams. DiskANN flips this model:
- Core Innovation: Stores graph structure and full vectors on SSDs
- Memory Footprint: Requires 15–50× less RAM than HNSW
- Tradeoff: Accepts ~10–20ms latency (NVMe SSDs) for billion-scale searches
In my 1B vector test (768-dim), DiskANN used 32GB RAM versus HNSW’s 512GB while maintaining 95%+ recall.
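To see where those figures come from, here is a rough back-of-envelope sketch. The assumptions are mine, not measured values: float32 vectors, 32-neighbor adjacency lists for HNSW, and roughly 32 bytes of PQ codes per vector for DiskANN; real indexes add allocator and metadata overhead on top.

```python
# Back-of-envelope RAM estimate. Assumptions (mine, not from the benchmark):
# float32 vectors, 32 neighbors per node for HNSW, ~32-byte PQ codes for DiskANN.
def hnsw_ram_gb(n_vectors: int, dim: int, neighbors: int = 32) -> float:
    vectors = n_vectors * dim * 4          # full-precision vectors kept in RAM
    graph = n_vectors * neighbors * 4      # adjacency lists (4-byte node ids)
    return (vectors + graph) / 1e9

def diskann_ram_gb(n_vectors: int, pq_bytes: int = 32) -> float:
    return n_vectors * pq_bytes / 1e9      # only PQ codes stay in RAM

print(f"HNSW, 100M x 768-dim: ~{hnsw_ram_gb(100_000_000, 768):.0f} GB before index overhead")
print(f"DiskANN, 1B vectors:  ~{diskann_ram_gb(1_000_000_000):.0f} GB of PQ codes")
```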
How DiskANN Breaks the Tradeoff
DiskANN’s performance stems from two key innovations:
1. Vamana Graph Construction
Unlike HNSW's layered, incrementally built graph, DiskANN's Vamana graph is built flat and then aggressively pruned:
- Step 1: Build an initial graph with random edges
- Step 2: Run a greedy search from the medoid (the dataset's most central point) toward each point, recording the nodes visited
- Step 3: Prune neighbors lacking angular diversity (see Fig. 1)
- Result: Fewer long-range hops → fewer SSD reads
My observation: Vamana reduced average search path length by 60% versus unpruned graphs, cutting SSD access.
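For reference, the pruning rule behind Step 3 (RobustPrune in the DiskANN paper) fits in a few lines. This is a simplified, in-memory sketch assuming Euclidean distance over a NumPy array; the real build is batched and disk-aware.

```python
import numpy as np

def robust_prune(p, candidates, vectors, alpha=1.2, max_degree=64):
    """Choose a diverse out-neighbor set for point p (Vamana-style pruning)."""
    dist = lambda a, b: np.linalg.norm(vectors[a] - vectors[b])
    # Consider everything the greedy search visited, nearest to p first.
    pool = sorted(set(candidates) - {p}, key=lambda v: dist(p, v))
    neighbors = []
    while pool and len(neighbors) < max_degree:
        best = pool.pop(0)            # closest remaining candidate becomes a neighbor
        neighbors.append(best)
        # Drop candidates already "covered" by best: if the detour through best
        # is within an alpha factor of the direct edge, that edge is redundant.
        pool = [v for v in pool if alpha * dist(best, v) > dist(p, v)]
    return neighbors
```

Setting `alpha` above 1.0 deliberately keeps a few longer "highway" edges, which is what shortens search paths, and therefore SSD reads, at query time.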
2. Quantization-Guided Search
DiskANN combines two search phases:
- Phase 1: In-memory (PQ-compressed vectors)
  - Fast approximate scoring
  - Identifies candidate nodes
- Phase 2: On SSD (full-precision vectors)
  - Validates candidates with exact distances
  - Final ranking
During my tests, Phase 1 filtered out roughly 90% of candidate nodes, reducing SSD reads to 10–20 per query.
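To make the filter-then-rerank idea concrete, here is a toy sketch in which a memory-mapped file stands in for the SSD-resident full-precision vectors and a low-precision in-RAM copy stands in for real PQ codes. Real DiskANN interleaves the cheap scoring with Vamana graph traversal instead of brute-forcing every vector, so treat this purely as an illustration of the two phases.

```python
import numpy as np

N, DIM, TOP_K, CANDIDATES = 100_000, 768, 10, 1_000

# One-time setup: write dummy full-precision vectors to disk
# (this file stands in for the SSD-resident DiskANN data).
np.random.rand(N, DIM).astype(np.float32).tofile("vectors.f32")
full = np.memmap("vectors.f32", dtype=np.float32, mode="r", shape=(N, DIM))

# Phase 1 stand-in: a compact in-RAM copy. Real DiskANN keeps ~32-64-byte PQ
# codes per vector; float16 here just plays the role of "cheap, approximate".
approx = np.asarray(full, dtype=np.float16)

def search(query: np.ndarray):
    # Phase 1: approximate inner-product scores in RAM, keep a small candidate set.
    scores = approx @ query.astype(np.float16)
    cand = np.argpartition(-scores, CANDIDATES)[:CANDIDATES]
    # Phase 2: read only those candidates' full-precision vectors from "disk"
    # and rerank with exact scores.
    exact = full[cand] @ query.astype(np.float32)
    order = np.argsort(-exact)[:TOP_K]
    return cand[order], exact[order]

ids, scores = search(np.random.rand(DIM).astype(np.float32))
```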
Deployment Reality Check
Deploying DiskANN demands hardware awareness:
| Factor | Recommendation | Impact if Ignored |
|---|---|---|
| SSD Type | NVMe (≥1GB/s read) | Latency spikes to >100ms with SATA SSDs |
| Memory | ≥32GB for PQ data | OOM crashes at query time |
| Graph Tuning | `max_degree=64`, `beamwidth_ratio=4` | Recall drops below 90% |
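Before sizing a deployment, it can be worth probing the target drive's random-read latency directly, since that is what dominates DiskANN's per-query cost. A rough sketch (Unix only; the path, read count, and block size are placeholders, and results will be optimistic if the file is already in the OS page cache):

```python
import os, random, time

# Rough random-read latency probe for the drive that will hold the index.
# PATH must point at a large existing file on that drive (placeholder below).
PATH, READS, BLOCK = "/mnt/nvme/large_test_file", 1_000, 4096

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size
start = time.perf_counter()
for _ in range(READS):
    os.pread(fd, BLOCK, random.randrange(0, size - BLOCK))  # one random 4 KiB read
elapsed = time.perf_counter() - start
os.close(fd)

print(f"avg random {BLOCK}-byte read: {elapsed / READS * 1e6:.0f} µs")
```

On NVMe this typically lands around 100µs or less per read; multiply by the 10–20 reads per query above to see how much of the latency budget raw I/O consumes.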
Sample Python Configuration:
```python
index_params = {
    "index_type": "DISKANN",
    "params": {
        "metric_type": "IP",       # Inner Product
        "max_degree": 64,          # Max neighbors per node
        "beamwidth_ratio": 4.0     # Search breadth vs. depth
    }
}
client.create_index("my_collection", index_params)
```
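For completeness, a query against that index might look roughly like the call below. The method name and parameter keys are placeholders of my own, since the snippet above does not say which client library is in use; check your vector database's documentation for the actual search API and its DiskANN-specific knobs.

```python
# Hypothetical query call mirroring the client object above; the method name
# and parameter keys are placeholders, not a specific library's API.
query_vector = [0.1] * 768  # a 768-dim query, matching the indexed vectors

results = client.search(
    "my_collection",
    data=[query_vector],
    limit=10,                            # top-10 neighbors
    params={"beamwidth_ratio": 4.0},     # wider beam: higher recall, more SSD reads per query
)
```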
When Not to Use DiskANN
DiskANN underperforms in:
- Small datasets (<50M vectors): the whole index fits affordably in RAM, so HNSW's sub-5ms latency wins
- Write-heavy workloads: Index rebuilds take hours
- Slower storage: SATA SSDs add 5–10ms latency per query
FreshDiskANN partially addresses updates but adds 30% latency overhead in my tests.
Key Takeaways
- Cost Efficiency: DiskANN cuts memory costs 15× for billion-scale search
- Latency Budget: Plan for ~20ms per query (NVMe required)
- Optimal Workloads: Read-heavy, static datasets (e.g., historical archives)
What I’m Exploring Next
- Integrating disk caching for warmer queries
- Testing GraphANN (DiskANN’s successor) for dynamic data
- Cold-start optimization for ephemeral instances
Final Thought
DiskANN isn’t a universal replacement but a specialized tool for extreme-scale search. Its SSD-centric design democratizes billion-vector workloads—provided engineers architect for its disk-bound nature. For teams with SATA SSDs or sub-millisecond requirements, HNSW/NSSG remain preferable.