In my last post, I talked about how dynamic embeddings keep your knowledge base fresh as documents evolve. But freshness is only half the story.
When a user asks your assistant a question, **how you search the vector database** determines whether they get:
- the single most relevant snippet,
- a broader set of context, or
- results filtered by metadata like timestamps or document type.
Here are the five main search strategies, explained simply.
1️⃣ Similarity Search (k-NN)
When you type a query, the system converts it into a vector. Then it looks around the vector space for the "neighbors" that sit closest. Those become your top results.
📌 Example:
Query: "What is the required capital reserve?"
Result: "Banks must maintain 12% capital reserves."
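A minimal sketch of k-NN over in-memory embeddings, using NumPy and cosine similarity (the vectors and dimensions here are toy values; real vector databases use approximate indexes such as HNSW to make this fast at scale):

```python
import numpy as np

def knn_search(query_vec, doc_vecs, k=3):
    """Return indices of the k documents closest to the query by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                  # cosine similarity of every document to the query
    return np.argsort(-sims)[:k]  # indices of the top-k nearest neighbors

# Toy 2-D embeddings: docs 0 and 2 point the same way as the query, doc 1 does not.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.05])
print(knn_search(query, docs, k=2))  # prints [0 2]
```

The whole strategy is one sort over similarity scores, which is why it is fast but can surface several near-identical snippets.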
2️⃣ Max Marginal Relevance (MMR)
MMR makes sure you don't get the same answer five times in a row.
Here's how it works: after selecting the most relevant snippet, it penalizes candidates that are too similar to what it has already picked, balancing relevance with diversity.
📌 Example:
Query: "Explain capital reserve requirements."
Results: "Banks must maintain 12% capital reserves."
"These reserves are adjusted annually based on regulations."
Notice how the second snippet doesn't just repeat the first; it brings in a new angle. That's MMR at work.
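The greedy selection above can be sketched in a few lines. This is a simplified version of the standard MMR formula (`lam` weights relevance against redundancy; the toy vectors are illustrative):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedy MMR: each pick maximizes lam*relevance - (1-lam)*redundancy."""
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; doc 2 is relevant but adds a new angle.
query = np.array([1.0, 0.0, 0.0])
docs = np.array([[1.0, 0.2, 0.0], [1.0, 0.22, 0.0], [1.0, 0.0, 0.5]])
print(mmr(query, docs, k=2))  # prints [0, 2] - the duplicate (doc 1) is skipped
```

With `lam=1.0` this degenerates into plain similarity search; lowering it buys more diversity at the cost of raw relevance.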
3️⃣ Filtered / Metadata Search
Sometimes "closest meaning" isn't the whole story: you also care about context and constraints. That's where metadata filtering comes in.
Think of it as adding a funnel on top of similarity search. You still find the closest matches, but only those that meet extra rules like date, document type, source, or author.
📌 Example:
Query: "What's the latest capital reserve requirement?"
Filter: updated_at > 2025-01-01
Result: The system ignores older documents and only shows the most recent rule, even if the older ones are technically "closer" in meaning.
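A sketch of the funnel idea: filter on metadata first, then rank the survivors by similarity. The field names (`vector`, `text`, `updated_at`) and the sample documents are illustrative, not a specific database's schema:

```python
import numpy as np
from datetime import date

def filtered_search(query_vec, docs, predicate, k=1):
    """Apply a metadata predicate first, then rank the survivors by cosine similarity."""
    q = np.asarray(query_vec, dtype=float)
    q = q / np.linalg.norm(q)
    survivors = [d for d in docs if predicate(d)]
    def sim(d):
        v = np.asarray(d["vector"], dtype=float)
        return float(v @ q / np.linalg.norm(v))
    return sorted(survivors, key=sim, reverse=True)[:k]

docs = [
    {"text": "Banks must maintain 10% capital reserves.",   # older rule, closest vector
     "vector": [1.0, 0.0], "updated_at": date(2024, 6, 1)},
    {"text": "Banks must maintain 12% capital reserves.",   # current rule
     "vector": [0.95, 0.1], "updated_at": date(2025, 3, 1)},
]
query = [1.0, 0.0]
hits = filtered_search(query, docs, lambda d: d["updated_at"] > date(2025, 1, 1))
print(hits[0]["text"])  # only the post-2025 document survives the filter
```

Note that the older document is actually closer in vector space here; the date filter is what keeps it out of the results.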
4️⃣ Hybrid Search (Keyword + Vector)
Sometimes, meaning alone isn't enough. What if your query includes an exact code, acronym, or ID? A pure semantic search might blur it, but a keyword search nails it.
Hybrid search combines the two:
Vector search captures the context and meaning.
Keyword search makes sure specific terms (like "CRR-2025") get the priority they deserve.
📌 Example:
Query: "Capital Reserve Rule CRR-2025"
Vector search → understands it's about capital reserves.
Keyword search → ensures documents mentioning CRR-2025 are ranked higher.
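Production systems typically combine a BM25 engine (Elasticsearch, OpenSearch) with vector scores, often via rank fusion. As a dependency-free sketch of the same idea, here is a simple weighted blend of cosine similarity and exact keyword overlap (`alpha` and the toy vectors are assumptions):

```python
import numpy as np

def hybrid_score(query_text, query_vec, doc_text, doc_vec, alpha=0.5):
    """Blend semantic (cosine) similarity with exact keyword overlap."""
    qv, dv = np.asarray(query_vec, float), np.asarray(doc_vec, float)
    semantic = float(qv @ dv / (np.linalg.norm(qv) * np.linalg.norm(dv)))
    q_terms = set(query_text.lower().split())
    d_terms = set(doc_text.lower().split())
    keyword = len(q_terms & d_terms) / max(len(q_terms), 1)  # fraction of query terms matched exactly
    return alpha * semantic + (1 - alpha) * keyword

query = "capital reserve rule crr-2025"
qvec = [1.0, 0.0]
doc_a = ("Overview of capital reserve policy.", [0.98, 0.05])           # semantic match only
doc_b = ("Rule CRR-2025 sets the capital reserve at 12%.", [0.9, 0.2])  # mentions the exact ID
print(hybrid_score(query, qvec, *doc_b) > hybrid_score(query, qvec, *doc_a))  # prints True
```

Even though doc A is slightly closer in vector space, doc B wins because it contains the literal token "CRR-2025" that the keyword side rewards.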
5️⃣ Cross-Encoder Reranking
This strategy starts with a fast similarity search, then uses a deeper model (such as a BERT cross-encoder) to re-score the top candidates for accuracy.
📌 Query: "What are the capital reserve rules for 2025?"
Step 1: Initial retrieval → 10 candidates
Step 2: Reranker → re-scores and picks the single best snippet
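The two-stage flow can be sketched as below. A real reranker would score each (query, document) pair jointly with a cross-encoder model; here a word-overlap function stands in for it so the example stays dependency-free (all names and sample data are illustrative):

```python
import numpy as np

def overlap_score(query, text):
    """Stand-in for a cross-encoder: real systems score (query, doc) pairs with BERT."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t)

def retrieve_then_rerank(query_text, query_vec, texts, doc_vecs,
                         cross_score=overlap_score, n_candidates=10, k=1):
    """Stage 1: fast vector retrieval. Stage 2: re-score the shortlist with a deeper model."""
    q = np.asarray(query_vec, float)
    q = q / np.linalg.norm(q)
    d = np.asarray(doc_vecs, float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    shortlist = np.argsort(-(d @ q))[:n_candidates]            # cheap first pass
    reranked = sorted(shortlist, key=lambda i: -cross_score(query_text, texts[i]))
    return [texts[i] for i in reranked[:k]]                    # expensive second pass

texts = ["Reserves are reviewed each year.",
         "Capital reserve rules for 2025 require a 12% buffer."]
vecs = [[1.0, 0.1], [0.9, 0.3]]
best = retrieve_then_rerank("capital reserve rules for 2025", [1.0, 0.0], texts, vecs)
print(best[0])
```

The key design point is that the expensive scorer only ever sees `n_candidates` documents, not the whole corpus, which is what keeps this approach affordable.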
Want to explore the full code base?
https://lnkd.in/eec9AiHy
📊 Search Strategies at a Glance
| Strategy | How it Works | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| Similarity Search (k-NN) | Finds nearest neighbors in vector space | Fast & simple | Can return repetitive or narrow results | Quick lookups, FAQs |
| Max Marginal Relevance (MMR) | Balances relevance + diversity | Avoids duplicates, adds variety | Slightly slower | Explanations, multi-fact answers |
| Filtered / Metadata Search | Adds constraints (date, type, source) on top of similarity | Ensures results match business rules | Needs clean, consistent metadata | Compliance, regulations, versioned docs |
| Hybrid Search | Combines keyword search with vector similarity | Best of both worlds (context + exact match) | Requires extra infra (Elasticsearch, OpenSearch) | IDs, codes, acronyms, technical docs |
| Cross-Encoder Reranking | Re-scores initial candidates with a deeper model (e.g., BERT) | Highest precision | Computationally heavy | Mission-critical answers, high-accuracy apps |
🔑 Key Takeaway
- Static embeddings = stale snapshots
- Dynamic embeddings = living knowledge
This pipeline keeps context fresh and supports multiple retrieval modes so you can choose the right strategy for your production needs.