Vector Databases and Retrieval Stores Keep Failing in Subtle Ways (FAISS, pgvector, Qdrant, Redis)
This is part of the Global Fix Map series — a practical guide to debugging LLM pipelines at scale.
👉 Full index here: Global Fix Map README
Why this matters
If you’ve ever worked with vector databases like FAISS, pgvector, Qdrant, or Redis, you’ve probably seen it:
your data is in the store, ingestion looks successful, dashboards are green… but queries come back empty or off-target.
This is not just bad luck. These issues are systematic and repeatable, and they break real-world AI apps at scale.
Common failure modes in VectorDBs
- Index drift — documents ingested but not searchable until a background index build finishes (FAISS, Milvus).
- Metric mismatch — one side uses cosine similarity, the other defaults to L2 or dot product, silently tanking recall.
- Chunk fracture — embeddings split inconsistently between ingestion and query, so alignment is lost.
- Vector ghosts — deleted embeddings remain retrievable (seen in Qdrant / Redis setups).
- Sharded blind spots — cross-shard queries miss slices of the data when routing isn’t aligned.
What’s really breaking under the hood
Most of these failures come down to broken contracts between ingestion and retrieval:
- Index =/= query surface. Async builds leave gaps.
- Metric defaults differ per library (
cosine
vs.L2
vs.IP
). - Tokenization at ingestion != tokenization at query → chunk contracts drift.
- Delete ops don’t enforce tombstones, so “ghost vectors” live on.
- Sharding rules lack guardrails → partial coverage under load.
In other words: your vectorstore says “success” but the retrieval contract is silently broken.
Minimal fixes (works across FAISS, pgvector, Qdrant, Redis)
- Post-ingest probes: immediately query new vectors back to confirm availability.
- Metric alignment: explicitly set distance metric, don’t trust defaults.
- Chunk contract enforcement: unify tokenization + window size at both ingest and query time.
- Delete fences: add tombstones and verify zero recall after delete ops.
- Shard probes: random test queries per shard, enforce recall coverage ratios.
Acceptance targets (for production reliability)
- Ingest-to-query ΔS ≤ 0.25 across 10k+ documents.
- Metric mismatch error rate ≤ 0.05.
- Recall coverage ≥ 0.90 under shard load.
- Ghost vector retrieval ≤ 0.5%.
How to apply this in practice
- Open the Global Fix Map README.
- Navigate to VectorDBs & Retrieval Stores section.
- Run the minimal fix checklist above in your pipeline.
- Validate against the acceptance targets with stress tests.
💡 This episode is about vector database stability.
Next episode (5): Embeddings pipeline — why normalization, casing, and chunk contracts drift more than you think.
Top comments (0)