DEV Community

Sparsh Jain

I Built a RAG Search Engine from Scratch to Understand How Modern Search Actually Works

Everyone is building RAG apps.

But most tutorials skip the most important part — search quality.

So instead of just plugging in a framework, I decided to build my own RAG Search Engine from scratch to deeply understand how retrieval systems work under the hood.

This project helped me explore the real mechanics behind:

  • Keyword search
  • Semantic search
  • Hybrid ranking
  • Reranking models
  • Evaluation metrics
  • Multimodal retrieval
  • Retrieval-Augmented Generation (RAG)

You can watch the full breakdown here:

YouTube video: https://www.youtube.com/watch?v=HEx0A5R-_Tc

GitHub repository:

Source code: https://github.com/SplinterSword/RAG-Search-Engine


What This Project Implements

Keyword Search (BM25)

I implemented classical information retrieval techniques like:

  • TF (Term Frequency)
  • IDF (Inverse Document Frequency)
  • BM25 scoring

This helped me understand why traditional keyword search is still extremely powerful in production systems.
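To make the scoring concrete, here is a minimal BM25 sketch over pre-tokenized documents. The parameter values (k1=1.5, b=0.75) are common defaults, not necessarily what the project uses:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many docs contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            # Smoothed IDF: rare terms contribute more.
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Saturating TF, normalized by document length relative to the average.
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Note how term frequency saturates (a term appearing 100 times isn't 100x more relevant) and longer documents are penalized toward the average length.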


Semantic Search (Embeddings + Vector Similarity)

I added dense retrieval using embeddings and cosine similarity to capture meaning instead of exact keyword matches.

This allows the system to handle:

  • Synonyms
  • Context
  • Conceptual similarity
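The core of dense retrieval is just cosine similarity over embedding vectors. A minimal sketch, with toy vectors standing in for the output of a real embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def semantic_search(query_vec, doc_vecs, top_k=3):
    """Return indices of the top_k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:top_k]
```

In practice the vectors come from an embedding model and a vector index replaces the exhaustive sort, but the ranking principle is the same.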

Hybrid Search

Instead of choosing between keyword or semantic search, I combined them using:

  • Weighted fusion
  • Reciprocal Rank Fusion (RRF)

This is closer to how modern production search systems operate — combining precision and semantic understanding.
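Reciprocal Rank Fusion is surprisingly simple: each document earns 1/(k + rank) from every ranked list it appears in, so documents ranked well by both retrievers rise to the top. A minimal sketch (k=60 is the commonly used constant):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists of document ids into one ranking via RRF."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # rank is 0-based here, so the best item contributes 1 / (k + 1).
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Unlike weighted score fusion, RRF only uses rank positions, so it needs no normalization between BM25 scores and cosine similarities, which live on very different scales.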


Reranking with Cross-Encoders

After retrieving top results, I added a reranking stage to refine relevance.

This noticeably improved result quality: a cross-encoder scores each query–document pair jointly, so it captures interactions between query and document that the first-stage retrievers (which encode them independently) miss.
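The reranking stage itself is a small wrapper around a pairwise scorer. In this sketch `score_fn` stands in for a real cross-encoder (e.g. `CrossEncoder(...).predict` from the sentence-transformers library); the toy scorer in the usage example is my own stand-in:

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-score (query, doc) pairs with a cross-encoder-style scorer
    and return the candidates sorted by the new score."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

Because the scorer sees the full pair, it is far more expensive per document, which is why it only runs on the top results from the cheaper retrieval stage.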


Evaluation (The Part Most People Skip)

I didn’t just build the system — I measured it.

I implemented:

  • Precision
  • Recall
  • F1 Score
  • Manual evaluation
  • LLM-as-a-judge evaluation

This helped me understand how to properly assess retrieval performance.
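The three core metrics reduce to set operations over retrieved vs. relevant document ids. A minimal sketch:

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query, given document id collections."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # relevant docs we actually retrieved
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Precision asks "how much of what I returned was relevant?", recall asks "how much of what was relevant did I return?", and F1 is their harmonic mean, punishing systems that trade one entirely for the other.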


Multimodal Search

I also experimented with text + image retrieval using embedding-based similarity.


Retrieval-Augmented Generation (RAG)

Finally, I connected the hybrid retrieval system to an LLM to generate grounded responses.
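The "connect retrieval to an LLM" step mostly comes down to prompt assembly: stuffing the retrieved chunks into the context and instructing the model to stay grounded in them. A minimal sketch (the exact prompt wording is my own assumption, not the repo's):

```python
def build_grounded_prompt(query, retrieved_chunks):
    """Assemble an LLM prompt grounded in retrieved context chunks."""
    # Number the chunks so the model (and the reader) can cite sources.
    context = "\n\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string is what gets sent to the LLM; everything upstream (hybrid retrieval, reranking) determines how good those numbered chunks are.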

This reinforced one important lesson:

Better retrieval > Bigger model.


Why I Built This

Most RAG demos abstract away the hardest part — retrieval.

I wanted to understand:

  • How ranking algorithms work
  • Why hybrid systems outperform pure approaches
  • How reranking improves precision
  • How evaluation should actually be done
  • What tradeoffs exist in search system design

This project was about learning how modern search systems are engineered — not just calling an API.


If you're interested in:

  • Search engines
  • Information retrieval
  • RAG systems
  • AI system design
  • Hybrid search architectures

I’d love to hear your thoughts.

Let me know what you’d improve or explore next 👇
