Everyone is building RAG apps.
But most tutorials skip the most important part — search quality.
So instead of just plugging in a framework, I decided to build my own RAG Search Engine from scratch to deeply understand how retrieval systems work under the hood.
This project helped me explore the real mechanics behind:
- Keyword search
- Semantic search
- Hybrid ranking
- Reranking models
- Evaluation metrics
- Multimodal retrieval
- Retrieval-Augmented Generation (RAG)
You can watch the full breakdown here:
YouTube video: https://www.youtube.com/watch?v=HEx0A5R-_Tc
Source code: https://github.com/SplinterSword/RAG-Search-Engine
What This Project Implements
Keyword Search (BM25)
I implemented classical information retrieval techniques like:
- TF (Term Frequency)
- IDF (Inverse Document Frequency)
- BM25 scoring
This helped me understand why traditional keyword search is still extremely powerful in production systems.
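A minimal sketch of how BM25 puts TF and IDF together (the corpus, the whitespace tokenizer, and the default parameters k1=1.5, b=0.75 are illustrative assumptions, not the project's exact implementation):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with BM25.

    Toy corpus and tokenizer; real systems stem/normalize tokens.
    """
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency per term, then smoothed IDF
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    idf = {t: math.log(1 + (N - n + 0.5) / (n + 0.5)) for t, n in df.items()}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            f = tf[term]
            # TF saturation (k1) and length normalization (b)
            score += idf[term] * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "information retrieval with bm25 ranking",
]
print(bm25_scores("bm25 ranking", docs))
```

The k1 term caps how much repeated occurrences of a word can help, and b penalizes long documents, which is why BM25 usually beats raw TF-IDF.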
Semantic Search (Embeddings + Vector Similarity)
I added dense retrieval using embeddings and cosine similarity to capture meaning instead of exact keyword matches.
This allows the system to handle:
- Synonyms
- Context
- Conceptual similarity
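The core of dense retrieval is just cosine similarity over embedding vectors. A sketch with toy 4-dimensional vectors standing in for a real embedding model's output:

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity once both sides are unit-normalized
    top = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in top]

# Toy 4-dim "embeddings"; a real system would get these from a model
doc_vecs = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.1, 0.0, 0.9, 0.3],
])
query_vec = np.array([1.0, 0.0, 0.1, 0.0])
print(cosine_top_k(query_vec, doc_vecs))
```

Because similarity is computed in vector space rather than over exact tokens, a query and a document can match without sharing a single word.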
Hybrid Search
Instead of choosing between keyword or semantic search, I combined them using:
- Weighted fusion
- Reciprocal Rank Fusion (RRF)
This is closer to how modern production search systems operate — combining precision and semantic understanding.
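Reciprocal Rank Fusion is appealing because it needs only the rank positions, not the raw scores, so keyword and semantic results can be fused without score calibration. A sketch with the commonly used constant k=60:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum over lists of 1 / (k + rank_d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two retrievers
keyword = ["d3", "d1", "d2"]
semantic = ["d1", "d4", "d3"]
print(reciprocal_rank_fusion([keyword, semantic]))
```

A document that appears near the top of both lists (here "d1") outranks one that is first in only a single list, which is exactly the behavior you want from a hybrid system.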
Reranking with Cross-Encoders
After the initial retrieval stage, I added a reranking step to refine relevance.
Because a cross-encoder scores each query-document pair jointly rather than comparing precomputed vectors, this significantly improved result quality.
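The reranking stage has a simple shape: re-score a small candidate set pairwise, then sort. In this sketch the pair scorer is a toy token-overlap function, purely a stand-in for a real cross-encoder model (e.g. a trained sentence-pair classifier), which would be far too heavy to inline here:

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Re-score (query, doc) pairs with score_fn and return the top_k docs.

    score_fn stands in for a cross-encoder; a real system would run a
    trained model over each (query, doc) pair.
    """
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def overlap_score(query, doc):
    # Toy stand-in scorer: fraction of query tokens present in the doc
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

candidates = ["hybrid search systems", "cats and dogs", "hybrid ranking for search"]
print(rerank("hybrid search", candidates, overlap_score, top_k=2))
```

Reranking only the top candidates keeps the expensive pairwise scoring affordable: the cheap retrievers narrow thousands of documents to a few dozen, and the cross-encoder spends its compute where it matters.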
Evaluation (The Part Most People Skip)
I didn’t just build the system — I measured it.
I implemented:
- Precision
- Recall
- F1 Score
- Manual evaluation
- LLM-as-a-judge evaluation
This helped me understand how to properly assess retrieval performance.
Multimodal Search
I also experimented with text + image retrieval using embedding-based similarity.
Retrieval-Augmented Generation (RAG)
Finally, I connected the hybrid retrieval system to an LLM to generate grounded responses.
This reinforced one important lesson:
Better retrieval > Bigger model.
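The "grounding" step is mostly prompt assembly: stuff the retrieved passages into the context and instruct the model to answer from them alone. A sketch of that glue (the template wording is an assumption, and the actual LLM call, which could go to any chat-completion API, is omitted):

```python
def build_rag_prompt(query, retrieved_docs, max_docs=3):
    """Assemble a grounded prompt from retrieved context.

    Numbering the passages lets the model cite its sources, which makes
    hallucinations easier to spot.
    """
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs[:max_docs], start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their [number].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is BM25?",
    ["BM25 is a ranking function built on TF and IDF.", "An unrelated passage."],
)
print(prompt)
```

This is where "better retrieval > bigger model" shows up concretely: if the wrong passages land in the context, no amount of model scale fixes the answer.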
Why I Built This
Most RAG demos abstract away the hardest part — retrieval.
I wanted to understand:
- How ranking algorithms work
- Why hybrid systems outperform pure approaches
- How reranking improves precision
- How evaluation should actually be done
- What tradeoffs exist in search system design
This project was about learning how modern search systems are engineered — not just calling an API.
If you're interested in:
- Search engines
- Information retrieval
- RAG systems
- AI system design
- Hybrid search architectures
I’d love to hear your thoughts.
Let me know what you’d improve or explore next 👇