DEV Community

Sparsh Jain

I Built a RAG Search Engine from Scratch to Understand How Modern Search Actually Works

Everyone is building RAG apps.

But most tutorials skip the most important part — search quality.

So instead of just plugging in a framework, I decided to build my own RAG Search Engine from scratch to deeply understand how retrieval systems work under the hood.

This project helped me explore the real mechanics behind:

  • Keyword search
  • Semantic search
  • Hybrid ranking
  • Reranking models
  • Evaluation metrics
  • Multimodal retrieval
  • Retrieval-Augmented Generation (RAG)

You can watch the full breakdown here:

YouTube video: https://www.youtube.com/watch?v=HEx0A5R-_Tc

GitHub repository:

Source code: https://github.com/SplinterSword/RAG-Search-Engine


What This Project Implements

Keyword Search (BM25)

I implemented classical information retrieval techniques like:

  • TF (Term Frequency)
  • IDF (Inverse Document Frequency)
  • BM25 scoring

This helped me understand why traditional keyword search is still extremely powerful in production systems.
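To make the scoring concrete, here is a minimal BM25 sketch over pre-tokenized documents. The parameter values (k1=1.5, b=0.75) are common defaults, not necessarily what the project uses:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many docs contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            # Smoothed IDF: rare terms contribute more.
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Saturating TF, normalized by document length relative to the average.
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Note how term frequency saturates (a term appearing 100 times isn't 100x more relevant) and longer documents are penalized toward the average length.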


Semantic Search (Embeddings + Vector Similarity)

I added dense retrieval using embeddings and cosine similarity to capture meaning instead of exact keyword matches.

This allows the system to handle:

  • Synonyms
  • Context
  • Conceptual similarity
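The core of dense retrieval is just cosine similarity over embedding vectors. A minimal sketch, with toy vectors standing in for the output of a real embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def semantic_search(query_vec, doc_vecs, top_k=3):
    """Return indices of the top_k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:top_k]
```

In practice the vectors come from an embedding model and a vector index replaces the exhaustive sort, but the ranking principle is the same.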

Hybrid Search

Instead of choosing between keyword or semantic search, I combined them using:

  • Weighted fusion
  • Reciprocal Rank Fusion (RRF)

This is closer to how modern production search systems operate — combining precision and semantic understanding.
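Reciprocal Rank Fusion is surprisingly simple: each document earns 1/(k + rank) from every ranked list it appears in, so documents ranked well by both retrievers rise to the top. A minimal sketch (k=60 is the commonly used constant):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists of document ids into one ranking via RRF."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # rank is 0-based here, so the best item contributes 1 / (k + 1).
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Unlike weighted score fusion, RRF only uses rank positions, so it needs no normalization between BM25 scores and cosine similarities, which live on very different scales.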


Reranking with Cross-Encoders

After retrieving top results, I added a reranking stage to refine relevance.

This noticeably improved result quality: a cross-encoder scores each query–document pair jointly, so it captures interactions between query and document that the first-stage retrievers (which encode them independently) miss.
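The reranking stage itself is a small wrapper around a pairwise scorer. In this sketch `score_fn` stands in for a real cross-encoder (e.g. `CrossEncoder(...).predict` from the sentence-transformers library); the toy scorer in the usage example is my own stand-in:

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-score (query, doc) pairs with a cross-encoder-style scorer
    and return the candidates sorted by the new score."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

Because the scorer sees the full pair, it is far more expensive per document, which is why it only runs on the top results from the cheaper retrieval stage.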


Evaluation (The Part Most People Skip)

I didn’t just build the system — I measured it.

I implemented:

  • Precision
  • Recall
  • F1 Score
  • Manual evaluation
  • LLM-as-a-judge evaluation

This helped me understand how to properly assess retrieval performance.
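The three core metrics reduce to set operations over retrieved vs. relevant document ids. A minimal sketch:

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query, given document id collections."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # relevant docs we actually retrieved
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Precision asks "how much of what I returned was relevant?", recall asks "how much of what was relevant did I return?", and F1 is their harmonic mean, punishing systems that trade one entirely for the other.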


Multimodal Search

I also experimented with text + image retrieval using embedding-based similarity.


Retrieval-Augmented Generation (RAG)

Finally, I connected the hybrid retrieval system to an LLM to generate grounded responses.
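The "connect retrieval to an LLM" step mostly comes down to prompt assembly: stuffing the retrieved chunks into the context and instructing the model to stay grounded in them. A minimal sketch (the exact prompt wording is my own assumption, not the repo's):

```python
def build_grounded_prompt(query, retrieved_chunks):
    """Assemble an LLM prompt grounded in retrieved context chunks."""
    # Number the chunks so the model (and the reader) can cite sources.
    context = "\n\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string is what gets sent to the LLM; everything upstream (hybrid retrieval, reranking) determines how good those numbered chunks are.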

This reinforced one important lesson:

Better retrieval > Bigger model.


Why I Built This

Most RAG demos abstract away the hardest part — retrieval.

I wanted to understand:

  • How ranking algorithms work
  • Why hybrid systems outperform pure approaches
  • How reranking improves precision
  • How evaluation should actually be done
  • What tradeoffs exist in search system design

This project was about learning how modern search systems are engineered — not just calling an API.


If you're interested in:

  • Search engines
  • Information retrieval
  • RAG systems
  • AI system design
  • Hybrid search architectures

I’d love to hear your thoughts.

Let me know what you’d improve or explore next 👇
