RajeevaChandra

๐๐ž๐ฒ๐จ๐ง๐ ๐…๐ซ๐ž๐ฌ๐ก๐ง๐ž๐ฌ๐ฌ: ๐‡๐จ๐ฐ ๐ญ๐จ ๐ฎ๐ฌ๐ž ๐’๐ž๐š๐ซ๐œ๐ก ๐Œ๐จ๐๐ž๐ฌ ๐ข๐ง ๐•๐ž๐œ๐ญ๐จ๐ซ ๐ƒ๐š๐ญ๐š๐›๐š๐ฌ๐ž๐ฌ (๐ฐ๐ข๐ญ๐ก ๐‹๐š๐ง๐ ๐œ๐ก๐š๐ข๐ง)

In my last post, I talked about how dynamic embeddings keep your knowledge base fresh as documents evolve. But freshness is only half the story.
When a user asks your assistant a question, how you search the vector database determines whether they get:

  • the single most relevant snippet,
  • a broader set of context, or
  • results filtered by metadata like timestamps or document type.

Here are the five main search strategies, explained simply.

[Image: search types]

1๏ธโƒฃ Similarity Search (k-NN)

When you type a query, the system converts it into a vector. Then it looks around the vector space for the “neighbors” that sit closest. Those become your top results.

👉 Example:
Query: “What is the required capital reserve?”
Result: “Banks must maintain 12% capital reserves.”
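
Here is a minimal LangChain sketch of this mode. It assumes a small in-memory FAISS index and OpenAI embeddings (any embedding model would do); the snippets are just the toy examples from this post, not a real knowledge base.

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS  # needs the faiss-cpu package

# Toy knowledge base (illustrative snippets only)
texts = [
    "Banks must maintain 12% capital reserves.",
    "These reserves are adjusted annually based on regulations.",
    "Liquidity ratios are reported quarterly.",
]

# Build an in-memory vector store (requires OPENAI_API_KEY; swap in any embedding model)
vectorstore = FAISS.from_texts(texts, OpenAIEmbeddings())

# Plain similarity search: return the k nearest neighbors of the query vector
docs = vectorstore.similarity_search("What is the required capital reserve?", k=1)
print(docs[0].page_content)  # -> "Banks must maintain 12% capital reserves."
```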

2๏ธโƒฃ Max Marginal Relevance (MMR)

MMR makes sure you don’t get the same answer five times in a row.
Here’s how it works: after finding the most relevant snippet, it deliberately looks for other results that are still relevant but not near-duplicates of what it has already picked, balancing relevance with diversity.

👉 Example:
Query: “Explain capital reserve requirements.”
Results: “Banks must maintain 12% capital reserves.”
“These reserves are adjusted annually based on regulations.”

Notice how the second snippet doesn’t just repeat the first; it brings in a new angle. That’s MMR at work.
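
In LangChain, MMR is just a different call on the same store. A rough sketch, assuming the same kind of FAISS setup as above, with a deliberately near-duplicate snippet so the diversification is visible; fetch_k and lambda_mult are illustrative values, not tuned ones.

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

texts = [
    "Banks must maintain 12% capital reserves.",
    "Capital reserves of 12% are mandatory for banks.",  # near-duplicate on purpose
    "These reserves are adjusted annually based on regulations.",
]
vectorstore = FAISS.from_texts(texts, OpenAIEmbeddings())

# MMR: fetch a larger candidate pool, then pick results that are relevant
# AND different from each other. lambda_mult=1.0 -> pure relevance,
# 0.0 -> pure diversity (0.5 is an illustrative middle ground).
docs = vectorstore.max_marginal_relevance_search(
    "Explain capital reserve requirements.",
    k=2,          # final number of results
    fetch_k=10,   # candidates considered before diversification
    lambda_mult=0.5,
)
for d in docs:
    print(d.page_content)
```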

3๏ธโƒฃ Filtered / Metadata Search
Sometimes โ€œclosest meaningโ€ isnโ€™t the whole storyโ€”you also care about context and constraints. Thatโ€™s where metadata filtering comes in.
Think of it as adding a funnel on top of similarity search. You still find the closest matches, but only those that meet extra rules like date, document type, source, or author.

👉 Example:
Query: “What’s the latest capital reserve requirement?”
Filter: updated_at > 2025-01-01
Result: The system ignores older documents and only shows the most recent rule, even if the older ones are technically “closer” in meaning.
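
A sketch of the same idea in LangChain, assuming a Chroma store and an updated_at metadata field stored as an integer YYYYMMDD so it can be compared with an operator-style filter:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Snippets with metadata; updated_at stored as an integer YYYYMMDD for easy comparison
texts = [
    "Banks must maintain 10% capital reserves.",
    "Banks must maintain 12% capital reserves.",
]
metadatas = [
    {"updated_at": 20230601, "doc_type": "regulation"},
    {"updated_at": 20250315, "doc_type": "regulation"},
]

vectorstore = Chroma.from_texts(texts, OpenAIEmbeddings(), metadatas=metadatas)

# Similarity search restricted to documents updated after 2025-01-01
docs = vectorstore.similarity_search(
    "What's the latest capital reserve requirement?",
    k=1,
    filter={"updated_at": {"$gt": 20250101}},
)
print(docs[0].page_content)  # -> only the 2025 rule survives the filter
```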

4๏ธโƒฃ Hybrid Search (Keyword + Vector)

Sometimes, meaning alone isn’t enough. What if your query includes an exact code, acronym, or ID? A pure semantic search might blur it, but a keyword search nails it.

Hybrid search combines the two:

  • Vector search captures the context and meaning.
  • Keyword search makes sure specific terms (like “CRR-2025”) get the priority they deserve.

👉 Example:

Query: “Capital Reserve Rule CRR-2025”
Vector search → understands it’s about capital reserves.
Keyword search → ensures documents mentioning CRR-2025 are ranked higher.
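
You don’t need a full ElasticSearch or OpenSearch cluster to try the idea: one common LangChain pattern is to blend a BM25 keyword retriever with a vector retriever in an EnsembleRetriever. A sketch under that assumption (BM25Retriever needs the rank_bm25 package; the weights are illustrative, not tuned):

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever

texts = [
    "Capital Reserve Rule CRR-2025 raises the requirement to 12%.",
    "General guidance on capital reserves for commercial banks.",
]

# Keyword side: BM25 rewards exact tokens like "CRR-2025"
keyword_retriever = BM25Retriever.from_texts(texts)
keyword_retriever.k = 2

# Semantic side: vector similarity captures meaning
vector_retriever = FAISS.from_texts(texts, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 2}
)

# Blend the two ranked lists
hybrid = EnsembleRetriever(
    retrievers=[keyword_retriever, vector_retriever],
    weights=[0.5, 0.5],
)

docs = hybrid.invoke("Capital Reserve Rule CRR-2025")
for d in docs:
    print(d.page_content)
```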

5๏ธโƒฃ Cross-Encoder Reranking

Cross-encoder reranking starts with a fast similarity search, then uses a deeper model (such as a BERT-style cross-encoder) to re-score the top candidates for accuracy.

👉 Query: “What are the capital reserve rules for 2025?”
Step 1: Initial retrieval → 10 candidates
Step 2: Reranker → re-scores and picks the single best snippet
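
A minimal sketch of the retrieve-then-rerank pattern, scoring candidates directly with a sentence-transformers CrossEncoder. The model name below is a common public checkpoint, and the candidate list stands in for whatever your first-stage similarity search returned.

```python
from sentence_transformers import CrossEncoder

query = "What are the capital reserve rules for 2025?"

# Step 1: pretend these came back from a fast similarity search (top-N candidates)
candidates = [
    "Banks must maintain 12% capital reserves.",
    "Capital Reserve Rule CRR-2025 raises the requirement to 12% from January 2025.",
    "Liquidity ratios are reported quarterly.",
]

# Step 2: a cross-encoder scores each (query, candidate) pair jointly, which is
# slower than comparing pre-computed vectors but far more precise
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, doc) for doc in candidates])

# Keep the single best snippet
best = candidates[int(scores.argmax())]
print(best)
```

If you would rather keep this behind a retriever interface, LangChain also offers cross-encoder reranker integrations (via its document compressor / contextual compression retriever) that wrap this same step.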

Want to explore the full code base?
https://lnkd.in/eec9AiHy

📊 Search Strategies at a Glance

| Strategy | How it Works | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| Similarity Search (k-NN) | Finds nearest neighbors in vector space | Fast & simple | Can return repetitive or narrow results | Quick lookups, FAQs |
| Max Marginal Relevance (MMR) | Balances relevance + diversity | Avoids duplicates, adds variety | Slightly slower | Explanations, multi-fact answers |
| Filtered / Metadata Search | Adds constraints (date, type, source) on top of similarity | Ensures results match business rules | Needs clean, consistent metadata | Compliance, regulations, versioned docs |
| Hybrid Search | Combines keyword search with vector similarity | Best of both worlds (context + exact match) | Requires extra infra (ElasticSearch, OpenSearch) | IDs, codes, acronyms, technical docs |
| Cross-Encoder Reranking | Re-scores initial candidates with a deeper model (e.g., BERT) | Highest precision | Computationally heavy | Mission-critical answers, high-accuracy apps |

🔑 Key Takeaway

  • Static embeddings = stale snapshots
  • Dynamic embeddings = living knowledge

This pipeline keeps context fresh and supports multiple retrieval modes so you can choose the right strategy for your production needs.
