In my last post, I talked about how dynamic embeddings keep your knowledge base fresh as documents evolve. But freshness is only half the story.
When a user asks your assistant a question, **how you search the vector database** determines whether they get:
- the single most relevant snippet,
- a broader set of context, or
- results filtered by metadata like timestamps or document type.
Here are the five main search strategies, explained simply.
1️⃣ Similarity Search (k-NN)
When you type a query, the system converts it into a vector. Then it looks around the vector space for the "neighbors" that sit closest. Those become your top results.
📌 Example:
Query: "What is the required capital reserve?"
Result: "Banks must maintain 12% capital reserves."
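A minimal sketch of k-NN over in-memory embeddings, using NumPy and cosine similarity (the vectors and dimensions here are toy values; real vector databases use approximate indexes such as HNSW to make this fast at scale):

```python
import numpy as np

def knn_search(query_vec, doc_vecs, k=3):
    """Return indices of the k documents closest to the query by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                  # cosine similarity of every document to the query
    return np.argsort(-sims)[:k]  # indices of the top-k nearest neighbors

# Toy 2-D embeddings: docs 0 and 2 point the same way as the query, doc 1 does not.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.05])
print(knn_search(query, docs, k=2))  # prints [0 2]
```

The whole strategy is one sort over similarity scores, which is why it is fast but can surface several near-identical snippets.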
2️⃣ Max Marginal Relevance (MMR)
MMR makes sure you don't get the same answer five times in a row.
Here's how it works: after selecting the most relevant snippet, it penalizes candidates that are too similar to what it has already picked, balancing relevance with diversity.
📌 Example:
Query: "Explain capital reserve requirements."
Results: "Banks must maintain 12% capital reserves."
"These reserves are adjusted annually based on regulations."
Notice how the second snippet doesn't just repeat the first; it brings in a new angle. That's MMR at work.
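The greedy selection above can be sketched in a few lines. This is a simplified version of the standard MMR formula (`lam` weights relevance against redundancy; the toy vectors are illustrative):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedy MMR: each pick maximizes lam*relevance - (1-lam)*redundancy."""
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; doc 2 is relevant but adds a new angle.
query = np.array([1.0, 0.0, 0.0])
docs = np.array([[1.0, 0.2, 0.0], [1.0, 0.22, 0.0], [1.0, 0.0, 0.5]])
print(mmr(query, docs, k=2))  # prints [0, 2] - the duplicate (doc 1) is skipped
```

With `lam=1.0` this degenerates into plain similarity search; lowering it buys more diversity at the cost of raw relevance.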
3️⃣ Filtered / Metadata Search
Sometimes "closest meaning" isn't the whole story: you also care about context and constraints. That's where metadata filtering comes in.
Think of it as adding a funnel on top of similarity search. You still find the closest matches, but only those that meet extra rules like date, document type, source, or author.
📌 Example:
Query: "What's the latest capital reserve requirement?"
Filter: updated_at > 2025-01-01
Result: The system ignores older documents and only shows the most recent rule, even if the older ones are technically "closer" in meaning.
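A sketch of the funnel idea: filter on metadata first, then rank the survivors by similarity. The field names (`vector`, `text`, `updated_at`) and the sample documents are illustrative, not a specific database's schema:

```python
import numpy as np
from datetime import date

def filtered_search(query_vec, docs, predicate, k=1):
    """Apply a metadata predicate first, then rank the survivors by cosine similarity."""
    q = np.asarray(query_vec, dtype=float)
    q = q / np.linalg.norm(q)
    survivors = [d for d in docs if predicate(d)]
    def sim(d):
        v = np.asarray(d["vector"], dtype=float)
        return float(v @ q / np.linalg.norm(v))
    return sorted(survivors, key=sim, reverse=True)[:k]

docs = [
    {"text": "Banks must maintain 10% capital reserves.",   # older rule, closest vector
     "vector": [1.0, 0.0], "updated_at": date(2024, 6, 1)},
    {"text": "Banks must maintain 12% capital reserves.",   # current rule
     "vector": [0.95, 0.1], "updated_at": date(2025, 3, 1)},
]
query = [1.0, 0.0]
hits = filtered_search(query, docs, lambda d: d["updated_at"] > date(2025, 1, 1))
print(hits[0]["text"])  # only the post-2025 document survives the filter
```

Note that the older document is actually closer in vector space here; the date filter is what keeps it out of the results.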
4️⃣ Hybrid Search (Keyword + Vector)
Sometimes, meaning alone isn't enough. What if your query includes an exact code, acronym, or ID? A pure semantic search might blur it, but a keyword search nails it.
Hybrid search combines the two:
Vector search captures the context and meaning.
Keyword search makes sure specific terms (like "CRR-2025") get the priority they deserve.
📌 Example:
Query: "Capital Reserve Rule CRR-2025"
Vector search → understands it's about capital reserves.
Keyword search → ensures documents mentioning CRR-2025 are ranked higher.
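Production systems typically combine a BM25 engine (Elasticsearch, OpenSearch) with vector scores, often via rank fusion. As a dependency-free sketch of the same idea, here is a simple weighted blend of cosine similarity and exact keyword overlap (`alpha` and the toy vectors are assumptions):

```python
import numpy as np

def hybrid_score(query_text, query_vec, doc_text, doc_vec, alpha=0.5):
    """Blend semantic (cosine) similarity with exact keyword overlap."""
    qv, dv = np.asarray(query_vec, float), np.asarray(doc_vec, float)
    semantic = float(qv @ dv / (np.linalg.norm(qv) * np.linalg.norm(dv)))
    q_terms = set(query_text.lower().split())
    d_terms = set(doc_text.lower().split())
    keyword = len(q_terms & d_terms) / max(len(q_terms), 1)  # fraction of query terms matched exactly
    return alpha * semantic + (1 - alpha) * keyword

query = "capital reserve rule crr-2025"
qvec = [1.0, 0.0]
doc_a = ("Overview of capital reserve policy.", [0.98, 0.05])           # semantic match only
doc_b = ("Rule CRR-2025 sets the capital reserve at 12%.", [0.9, 0.2])  # mentions the exact ID
print(hybrid_score(query, qvec, *doc_b) > hybrid_score(query, qvec, *doc_a))  # prints True
```

Even though doc A is slightly closer in vector space, doc B wins because it contains the literal token "CRR-2025" that the keyword side rewards.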
5️⃣ Cross-Encoder Reranking
This strategy starts with a fast similarity search, then uses a deeper model (such as a BERT cross-encoder) to re-score the top candidates for accuracy.
📌 Query: "What are the capital reserve rules for 2025?"
Step 1: Initial retrieval → 10 candidates
Step 2: Reranker → re-scores and picks the single best snippet
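The two-stage flow can be sketched as below. A real reranker would score each (query, document) pair jointly with a cross-encoder model; here a word-overlap function stands in for it so the example stays dependency-free (all names and sample data are illustrative):

```python
import numpy as np

def overlap_score(query, text):
    """Stand-in for a cross-encoder: real systems score (query, doc) pairs with BERT."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t)

def retrieve_then_rerank(query_text, query_vec, texts, doc_vecs,
                         cross_score=overlap_score, n_candidates=10, k=1):
    """Stage 1: fast vector retrieval. Stage 2: re-score the shortlist with a deeper model."""
    q = np.asarray(query_vec, float)
    q = q / np.linalg.norm(q)
    d = np.asarray(doc_vecs, float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    shortlist = np.argsort(-(d @ q))[:n_candidates]            # cheap first pass
    reranked = sorted(shortlist, key=lambda i: -cross_score(query_text, texts[i]))
    return [texts[i] for i in reranked[:k]]                    # expensive second pass

texts = ["Reserves are reviewed each year.",
         "Capital reserve rules for 2025 require a 12% buffer."]
vecs = [[1.0, 0.1], [0.9, 0.3]]
best = retrieve_then_rerank("capital reserve rules for 2025", [1.0, 0.0], texts, vecs)
print(best[0])
```

The key design point is that the expensive scorer only ever sees `n_candidates` documents, not the whole corpus, which is what keeps this approach affordable.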
Want to explore the full code base?
https://lnkd.in/eec9AiHy
📊 Search Strategies at a Glance
| Strategy | How it Works | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| Similarity Search (k-NN) | Finds nearest neighbors in vector space | Fast & simple | Can return repetitive or narrow results | Quick lookups, FAQs |
| Max Marginal Relevance (MMR) | Balances relevance + diversity | Avoids duplicates, adds variety | Slightly slower | Explanations, multi-fact answers |
| Filtered / Metadata Search | Adds constraints (date, type, source) on top of similarity | Ensures results match business rules | Needs clean, consistent metadata | Compliance, regulations, versioned docs |
| Hybrid Search | Combines keyword search with vector similarity | Best of both worlds (context + exact match) | Requires extra infra (Elasticsearch, OpenSearch) | IDs, codes, acronyms, technical docs |
| Cross-Encoder Reranking | Re-scores initial candidates with a deeper model (e.g., BERT) | Highest precision | Computationally heavy | Mission-critical answers, high-accuracy apps |
🔑 Key Takeaway
- Static embeddings = stale snapshots
- Dynamic embeddings = living knowledge
This pipeline keeps context fresh and supports multiple retrieval modes so you can choose the right strategy for your production needs.