<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hector Hernandez Cruz</title>
    <description>The latest articles on DEV Community by Hector Hernandez Cruz (@hector_hernndez_cruz).</description>
    <link>https://dev.to/hector_hernndez_cruz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4010186%2Fabed1247-2a5b-4fa8-a386-e405228fd872.png</url>
      <title>DEV Community: Hector Hernandez Cruz</title>
      <link>https://dev.to/hector_hernndez_cruz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hector_hernndez_cruz"/>
    <language>en</language>
    <item>
      <title>Building a Production RAG Pipeline with Hybrid Retrieval and LangChain</title>
      <dc:creator>Hector Hernandez Cruz</dc:creator>
      <pubDate>Wed, 01 Jul 2026 00:34:16 +0000</pubDate>
      <link>https://dev.to/hector_hernndez_cruz/building-a-production-rag-pipeline-with-hybrid-retrieval-and-langchain-4cdm</link>
      <guid>https://dev.to/hector_hernndez_cruz/building-a-production-rag-pipeline-with-hybrid-retrieval-and-langchain-4cdm</guid>
      <description>&lt;p&gt;Most RAG tutorials get you 70% of the way there. This is about the other 30% that actually matters in production.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Why basic RAG fails&lt;br&gt;
Embed your docs, retrieve the top-k chunks, pass them to the LLM. Simple. But in production you quickly hit a wall. Dense vector search misses exact keyword matches. Keyword search misses semantic meaning. Retrieval quality plateaus and the LLM starts hallucinating because it is being fed the wrong context.&lt;/p&gt;&lt;/li&gt;
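&lt;li&gt;&lt;p&gt;Hybrid retrieval fixes this&lt;br&gt;
Combine dense vector search with BM25 keyword search, then fuse the ranked results using Reciprocal Rank Fusion (RRF). RRF scores each document by summing 1/(k + rank) over the lists it appears in, so it works on ranks alone and you never have to reconcile BM25 scores with cosine similarities. You get the best of both worlds, and retrieval precision typically improves noticeably.&lt;/p&gt;&lt;/li&gt;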
&lt;li&gt;&lt;p&gt;Add a reranker&lt;br&gt;
After retrieval, run a cross-encoder reranker on your top candidates. It scores each query-document pair jointly, so it's slower than embedding similarity but usually far more accurate. This is the highest-ROI improvement you can make once basic RAG is working.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Measure everything&lt;br&gt;
Most people skip evaluation entirely. Build a harness that measures hit rate, mean reciprocal rank (MRR), and faithfulness before you change anything. Otherwise you're flying blind every time you swap a model or tweak a prompt.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
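
&lt;p&gt;To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not taken from the pipeline described above: the function name, the document IDs, and the constant k=60 (a commonly used default) are all assumptions for illustration, and a real setup would feed in the ranked IDs returned by your dense and BM25 retrievers.&lt;/p&gt;

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs, best match first.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

&lt;p&gt;Documents that both retrievers agree on (doc_a and doc_c here) rise above documents that only one of them found, which is the behavior you want before handing candidates to a reranker.&lt;/p&gt;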

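&lt;p&gt;The retrieval half of an evaluation harness can start very small. The sketch below computes hit rate and MRR over a labeled query set; the function names, query IDs, and relevance labels are hypothetical, and faithfulness is left out because it needs an LLM or human judge rather than a simple formula.&lt;/p&gt;

```python
def hit_rate(results, relevant, k=5):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(
        1
        for query, ranked in results.items()
        if any(doc in relevant[query] for doc in ranked[:k])
    )
    return hits / len(results)


def mean_reciprocal_rank(results, relevant):
    """Average of 1 / rank of the first relevant doc (0 if none is found)."""
    total = 0.0
    for query, ranked in results.items():
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(results)


# Hypothetical retriever output and ground-truth labels.
results = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9", "d4"],
    "q3": ["d8", "d6", "d5"],
}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d0"}}

print(hit_rate(results, relevant, k=3))
print(mean_reciprocal_rank(results, relevant))
```

&lt;p&gt;Run this against a fixed query set before and after each change, so a new embedding model or chunking strategy is judged by the same numbers.&lt;/p&gt;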
</description>
      <category>ai</category>
      <category>rag</category>
      <category>langchain</category>
      <category>python</category>
    </item>
  </channel>
</rss>
