<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Eduardo Borges</title>
    <description>The latest articles on DEV Community by Eduardo Borges (@eduardo_borges_7a50083176).</description>
    <link>https://dev.to/eduardo_borges_7a50083176</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3886137%2F48b1f8f7-a140-41bf-9e9d-3e471e0097a9.png</url>
      <title>DEV Community: Eduardo Borges</title>
      <link>https://dev.to/eduardo_borges_7a50083176</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/eduardo_borges_7a50083176"/>
    <language>en</language>
    <item>
      <title>RAG Is Failing in Production — Here’s Why (and What I’m Testing Instead)</title>
      <dc:creator>Eduardo Borges</dc:creator>
      <pubDate>Tue, 21 Apr 2026 14:49:42 +0000</pubDate>
      <link>https://dev.to/eduardo_borges_7a50083176/rag-is-failing-in-production-heres-why-and-what-im-testing-instead-1o28</link>
      <guid>https://dev.to/eduardo_borges_7a50083176/rag-is-failing-in-production-heres-why-and-what-im-testing-instead-1o28</guid>
      <description>&lt;p&gt;RAG (Retrieval-Augmented Generation) looks great in demos.&lt;/p&gt;

&lt;p&gt;But in real-world systems, it often fails in subtle ways.&lt;/p&gt;

&lt;p&gt;Not because retrieval is bad.&lt;/p&gt;

&lt;p&gt;But because RAG lacks something more fundamental.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem I Kept Seeing
&lt;/h2&gt;

&lt;p&gt;Everything worked fine… until it didn’t.&lt;/p&gt;

&lt;p&gt;Simple questions? Great.&lt;/p&gt;

&lt;p&gt;But anything that depended on multiple systems?&lt;/p&gt;

&lt;p&gt;That’s where things started to break.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How does the production deploy process work?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A typical RAG system retrieves documents like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD pipeline&lt;/li&gt;
&lt;li&gt;Kubernetes deployment&lt;/li&gt;
&lt;li&gt;Monitoring setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All relevant.&lt;/p&gt;

&lt;p&gt;All correct.&lt;/p&gt;

&lt;p&gt;And still… incomplete.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Answer Is Still Wrong
&lt;/h2&gt;

&lt;p&gt;Because the real answer is not inside a single document.&lt;/p&gt;

&lt;p&gt;It’s in how they connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD triggers Kubernetes&lt;/li&gt;
&lt;li&gt;Deploy emits metrics&lt;/li&gt;
&lt;li&gt;Monitoring consumes those metrics&lt;/li&gt;
&lt;li&gt;Alerts trigger incident response&lt;/li&gt;
&lt;li&gt;Incident response can trigger rollback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a list.&lt;/p&gt;

&lt;p&gt;This is a &lt;strong&gt;system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And RAG doesn’t understand systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Issue
&lt;/h2&gt;

&lt;p&gt;RAG retrieves by similarity.&lt;/p&gt;

&lt;p&gt;But real-world knowledge is structured by &lt;strong&gt;relationships&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So even when retrieval is "correct", the model gets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fragments of truth
&lt;/li&gt;
&lt;li&gt;without the structure to connect them
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s why answers feel incomplete.&lt;/p&gt;




&lt;h2&gt;
  
  
  “Just Use Better Embeddings” Doesn’t Fix It
&lt;/h2&gt;

&lt;p&gt;I tried that.&lt;/p&gt;

&lt;p&gt;Better embeddings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;improve ranking
&lt;/li&gt;
&lt;li&gt;reduce noise
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But they don’t fix the core problem.&lt;/p&gt;

&lt;p&gt;You still get isolated chunks.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Started Testing
&lt;/h2&gt;

&lt;p&gt;Instead of treating documents as independent pieces, I tried:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;semantic search (same as RAG)&lt;/li&gt;
&lt;li&gt;building a graph of relationships between documents&lt;/li&gt;
&lt;li&gt;retrieving connected context&lt;/li&gt;
&lt;/ul&gt;
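
&lt;p&gt;A minimal Python sketch of the combination above, assuming a toy word-overlap score in place of real embeddings and a hand-written edge list. The document names, texts, and the &lt;code&gt;retrieve_connected&lt;/code&gt; helper are hypothetical and exist only for this example:&lt;/p&gt;

```python
# Toy corpus: names and texts are made up for illustration.
DOCS = {
    "ci_cd": "pipeline builds the image and starts the production deploy",
    "kubernetes": "deployment rollout applies the new image to the cluster",
    "monitoring": "dashboards and alerts watch metrics after each rollout",
    "incident_response": "on-call runbook for alerts, including rollback",
    "billing": "invoices are generated on day one of each month",
}

# Hand-written directed relationships between documents (an assumption;
# building these automatically is the hard part).
EDGES = {
    "ci_cd": ["kubernetes"],
    "kubernetes": ["monitoring"],
    "monitoring": ["incident_response"],
    "incident_response": ["kubernetes"],
}


def score(query: str, text: str) -> float:
    """Stand-in for embedding similarity: Jaccard overlap of word sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q.intersection(t)) / len(q.union(t))


def semantic_search(query: str, k: int = 2) -> list[str]:
    """Step 1: rank documents by similarity and keep the top k."""
    ranked = sorted(DOCS, key=lambda name: score(query, DOCS[name]), reverse=True)
    return ranked[:k]


def retrieve_connected(query: str, k: int = 2) -> dict:
    """Steps 2 and 3: add one-hop graph neighbours and the edges among them."""
    seeds = semantic_search(query, k)
    selected = list(seeds)
    for name in seeds:
        for neighbour in EDGES.get(name, []):
            if neighbour not in selected:
                selected.append(neighbour)
    links = [(src, dst) for src in selected for dst in EDGES.get(src, []) if dst in selected]
    return {"documents": selected, "links": links}


result = retrieve_connected("how does the production deploy pipeline work")
print(result["documents"])
print(result["links"])
```

&lt;p&gt;In this toy run, the monitoring document is not among the top similarity hits. It only enters the result because the graph links it to the Kubernetes document, and the returned links say how the selected documents relate.&lt;/p&gt;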

&lt;p&gt;So instead of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here are 3 relevant documents"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here’s how these documents connect"&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;In scenarios where context spans multiple domains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;answers became more complete
&lt;/li&gt;
&lt;li&gt;reasoning had fewer gaps
&lt;/li&gt;
&lt;li&gt;the model did less "guessing"
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s not perfect — but the difference is noticeable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tradeoff Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;This approach adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;complexity
&lt;/li&gt;
&lt;li&gt;processing overhead
&lt;/li&gt;
&lt;li&gt;graph construction challenges
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I’m still figuring out:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When is this actually worth it?&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Built Around This
&lt;/h2&gt;

&lt;p&gt;I ended up building a small tool to explore this idea in practice.&lt;/p&gt;

&lt;p&gt;It ingests documents, maps relationships, and retrieves connected context instead of isolated chunks.&lt;/p&gt;

&lt;p&gt;If you want to see it:&lt;br&gt;
👉 &lt;a href="https://usemindex.dev/" rel="noopener noreferrer"&gt;https://usemindex.dev/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Open Question
&lt;/h2&gt;

&lt;p&gt;I’m not convinced this is always the right direction.&lt;/p&gt;

&lt;p&gt;Curious to hear from others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have you seen RAG fail like this in production?&lt;/li&gt;
&lt;li&gt;Are you solving this at retrieval time?&lt;/li&gt;
&lt;li&gt;Or relying on the model to stitch context together?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Would love to compare notes.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
      <title>Why RAG Breaks in Real-World Systems (and How I’m Trying to Fix It)</title>
      <dc:creator>Eduardo Borges</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:07:05 +0000</pubDate>
      <link>https://dev.to/eduardo_borges_7a50083176/why-rag-breaks-in-real-world-systems-and-how-im-trying-to-fix-it-35p</link>
      <guid>https://dev.to/eduardo_borges_7a50083176/why-rag-breaks-in-real-world-systems-and-how-im-trying-to-fix-it-35p</guid>
      <description>&lt;p&gt;Most Retrieval-Augmented Generation (RAG) setups work well for simple questions.&lt;/p&gt;

&lt;p&gt;But once you move to real-world systems, things start to break.&lt;/p&gt;

&lt;p&gt;I kept running into the same issue over and over again — and it wasn’t obvious at first why.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem Isn’t Retrieval — It’s Context
&lt;/h2&gt;

&lt;p&gt;Let’s take a simple example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How does the production deploy process work?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A typical RAG system will retrieve documents like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD pipeline&lt;/li&gt;
&lt;li&gt;Kubernetes deployment&lt;/li&gt;
&lt;li&gt;Monitoring setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Individually, these are relevant.&lt;/p&gt;

&lt;p&gt;But they’re treated as isolated chunks of information.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where It Breaks
&lt;/h2&gt;

&lt;p&gt;In reality, the answer depends on how these systems connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD triggers Kubernetes&lt;/li&gt;
&lt;li&gt;The deploy emits metrics&lt;/li&gt;
&lt;li&gt;Monitoring consumes those metrics&lt;/li&gt;
&lt;li&gt;Alerts trigger incident response&lt;/li&gt;
&lt;li&gt;Incident response may trigger rollback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a list of documents.&lt;/p&gt;

&lt;p&gt;This is a &lt;strong&gt;chain of relationships&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And this is exactly where traditional RAG struggles.&lt;/p&gt;

&lt;p&gt;Even when retrieval is technically "correct", the model lacks the structure to connect these pieces.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Better Embeddings Don’t Solve It
&lt;/h2&gt;

&lt;p&gt;A common reaction is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We just need better embeddings."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I tried that.&lt;/p&gt;

&lt;p&gt;It improves ranking — but it doesn’t solve the core issue.&lt;/p&gt;

&lt;p&gt;You still get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;relevant documents&lt;/li&gt;
&lt;li&gt;but no understanding of how they relate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model gets fragments, not structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Started Experimenting With
&lt;/h2&gt;

&lt;p&gt;To address this, I started exploring a different approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use embeddings for semantic search (same as RAG)&lt;/li&gt;
&lt;li&gt;Build a &lt;strong&gt;knowledge graph&lt;/strong&gt; connecting documents&lt;/li&gt;
&lt;li&gt;Retrieve not just matches, but &lt;strong&gt;connected context&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of returning:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here are 3 similar documents"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here are the relevant documents AND how they connect"&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;In scenarios where the answer spans multiple systems, the difference is noticeable.&lt;/p&gt;

&lt;p&gt;Instead of partial answers, the model can follow the chain:&lt;/p&gt;

&lt;p&gt;CI/CD → Kubernetes → Monitoring → Incident Response&lt;/p&gt;

&lt;p&gt;In my experiments so far, this leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;more complete answers&lt;/li&gt;
&lt;li&gt;fewer hallucinations&lt;/li&gt;
&lt;li&gt;better reasoning across systems&lt;/li&gt;
&lt;/ul&gt;
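
&lt;p&gt;The chain above can be followed with a plain breadth-first walk over directed edges. This is a sketch, assuming a hand-written edge list; the relation labels and the &lt;code&gt;follow_chain&lt;/code&gt; helper are hypothetical:&lt;/p&gt;

```python
from collections import deque

# Directed, labelled edges: (source, relation, target). Labels are illustrative.
EDGES = [
    ("CI/CD", "triggers", "Kubernetes"),
    ("Kubernetes", "emits metrics to", "Monitoring"),
    ("Monitoring", "alerts", "Incident Response"),
    ("Incident Response", "may roll back", "Kubernetes"),
]


def follow_chain(start: str, goal: str) -> list[str]:
    """Breadth-first search; returns the shortest path of node names, or []."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for source, _relation, target in EDGES:
            if source == path[-1] and target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return []


print(" → ".join(follow_chain("CI/CD", "Incident Response")))
```

&lt;p&gt;Because the edges are directed, asking for a path from Monitoring back to CI/CD returns an empty list. A flat list of similar documents cannot express that distinction.&lt;/p&gt;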




&lt;h2&gt;
  
  
  The Tradeoff
&lt;/h2&gt;

&lt;p&gt;This approach is not free.&lt;/p&gt;

&lt;p&gt;It adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;complexity&lt;/li&gt;
&lt;li&gt;processing overhead&lt;/li&gt;
&lt;li&gt;graph construction challenges&lt;/li&gt;
&lt;/ul&gt;
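
&lt;p&gt;To give a sense of where the graph construction challenges come from, here is a deliberately naive sketch that links one document to another whenever its text mentions the other’s title. The documents are invented, and this heuristic is an assumption for illustration, not how any particular tool builds its graph:&lt;/p&gt;

```python
# Invented documents keyed by title.
DOCS = {
    "CI/CD": "After tests pass, the pipeline deploys to Kubernetes.",
    "Kubernetes": "Rollouts are watched by Monitoring for error spikes.",
    "Monitoring": "Alerts page the on-call engineer, who follows Incident Response.",
    "Incident Response": "If the release is bad, roll back the Kubernetes deployment.",
}


def build_edges(docs: dict[str, str]) -> set[tuple[str, str]]:
    """Add an edge (a, b) when document a mentions the title of document b."""
    edges = set()
    for source, text in docs.items():
        for target in docs:
            if target != source and target.lower() in text.lower():
                edges.add((source, target))
    return edges


for source, target in sorted(build_edges(DOCS)):
    print(source, "mentions", target)
```

&lt;p&gt;This only works here because every document happens to name its neighbour. It says nothing about the kind of relationship, it misses links that are implied rather than stated, and it compares every document against every other one, which is where the complexity and processing overhead start to show.&lt;/p&gt;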

&lt;p&gt;And I’m still figuring out:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When is this actually worth it, and when is it just overengineering RAG?&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I ended up building a small tool to explore this idea in practice.&lt;/p&gt;

&lt;p&gt;It ingests documents, builds relationships between them, and retrieves connected context instead of isolated chunks.&lt;/p&gt;

&lt;p&gt;If you’re curious, you can check it out here:&lt;br&gt;
👉 &lt;a href="https://usemindex.dev/" rel="noopener noreferrer"&gt;https://usemindex.dev/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Open Question
&lt;/h2&gt;

&lt;p&gt;I’m still early in this exploration, and I’m not convinced this is always the right approach.&lt;/p&gt;

&lt;p&gt;Curious to hear from others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have you hit similar limitations with RAG?&lt;/li&gt;
&lt;li&gt;How are you handling cross-document context today?&lt;/li&gt;
&lt;li&gt;Are you solving this at retrieval time, or leaving it to the model?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Would love to learn how others are approaching this.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>llm</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Stop using naive RAG</title>
      <dc:creator>Eduardo Borges</dc:creator>
      <pubDate>Sat, 18 Apr 2026 13:45:37 +0000</pubDate>
      <link>https://dev.to/eduardo_borges_7a50083176/stop-using-naive-rag-pmg</link>
      <guid>https://dev.to/eduardo_borges_7a50083176/stop-using-naive-rag-pmg</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febhjc4xq0nv1sqpofs7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febhjc4xq0nv1sqpofs7u.png" alt=" " width="800" height="900"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most RAG setups look good in demos — until things get slightly complex.&lt;/p&gt;

&lt;p&gt;You ask a question, it retrieves “relevant” chunks, and everything seems fine.&lt;/p&gt;

&lt;p&gt;But as soon as your system spans multiple documents — APIs, billing, infra, workflows — things start breaking down.&lt;/p&gt;

&lt;p&gt;Not because the information isn’t there.&lt;br&gt;
But because the relationships between those documents are lost.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with RAG
&lt;/h2&gt;

&lt;p&gt;RAG works by retrieving chunks based on their embedding similarity to the query.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It finds text that &lt;em&gt;looks&lt;/em&gt; relevant&lt;/li&gt;
&lt;li&gt;But doesn’t understand how pieces connect&lt;/li&gt;
&lt;li&gt;And can’t reconstruct system behavior&lt;/li&gt;
&lt;/ul&gt;
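
&lt;p&gt;To make the limitation concrete, here is a sketch of similarity-only retrieval, assuming tiny hand-made vectors in place of a real embedding model. The chunk names and vector values are invented for illustration:&lt;/p&gt;

```python
import math

# Hypothetical 3-dimensional "embeddings"; a real system would get these
# from an embedding model.
CHUNKS = {
    "ci_cd_pipeline": [0.9, 0.1, 0.0],
    "kubernetes_deployment": [0.8, 0.3, 0.1],
    "monitoring_setup": [0.6, 0.2, 0.5],
    "billing_faq": [0.0, 0.1, 0.9],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], k: int = 3) -> list[str]:
    """Return the k chunk names most similar to the query, best first."""
    ranked = sorted(CHUNKS, key=lambda name: cosine(query_vec, CHUNKS[name]), reverse=True)
    return ranked[:k]


# The result is a ranked list and nothing more: no edges, no ordering of
# events, no hint that one chunk's system triggers another's.
print(top_k([1.0, 0.2, 0.1]))
```

&lt;p&gt;The three deploy-related chunks come back and the billing chunk is filtered out, which is exactly what similarity is good at. What the caller receives, though, is only names and a ranking.&lt;/p&gt;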

&lt;p&gt;So you end up with answers that are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;technically correct&lt;/li&gt;
&lt;li&gt;but incomplete&lt;/li&gt;
&lt;li&gt;and often misleading&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Real systems aren’t flat
&lt;/h2&gt;

&lt;p&gt;In real systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a deploy triggers a pipeline&lt;/li&gt;
&lt;li&gt;the pipeline applies changes to Kubernetes&lt;/li&gt;
&lt;li&gt;monitoring evaluates the rollout&lt;/li&gt;
&lt;li&gt;failures trigger rollback logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this lives in a single document.&lt;/p&gt;

&lt;p&gt;And RAG doesn’t connect these dots.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I built instead
&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;Mindex&lt;/strong&gt;:&lt;br&gt;
&lt;a href="https://usemindex.dev/" rel="noopener noreferrer"&gt;https://usemindex.dev/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of just retrieving chunks, it builds a &lt;strong&gt;knowledge graph&lt;/strong&gt; on top of your documents.&lt;/p&gt;

&lt;p&gt;So your AI can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;connect documents&lt;/li&gt;
&lt;li&gt;follow relationships&lt;/li&gt;
&lt;li&gt;reconstruct flows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not just match text.&lt;/p&gt;




&lt;h2&gt;
  
  
  RAG vs Graph-based context
&lt;/h2&gt;

&lt;p&gt;Here’s a simplified comparison:&lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ Naive RAG
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Returns a flat list of documents&lt;/li&gt;
&lt;li&gt;No relationships&lt;/li&gt;
&lt;li&gt;No ordering&lt;/li&gt;
&lt;li&gt;No system understanding&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ Mindex (GraphRAG)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Connects documents&lt;/li&gt;
&lt;li&gt;Traverses relationships&lt;/li&gt;
&lt;li&gt;Infers flows (cause → effect)&lt;/li&gt;
&lt;li&gt;Provides structured context&lt;/li&gt;
&lt;/ul&gt;
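
&lt;p&gt;As an illustration of what “structured context” could look like, here is a sketch that renders retrieved documents and their relationships into one text block for a prompt. The layout and the &lt;code&gt;build_context&lt;/code&gt; helper are assumptions made for this example, not Mindex’s actual output format:&lt;/p&gt;

```python
def build_context(documents: dict[str, str], relations: list[tuple[str, str, str]]) -> str:
    """Render documents and (source, relation, target) triples as plain text."""
    lines = ["Documents:"]
    for name, summary in documents.items():
        lines.append(f"- {name}: {summary}")
    lines.append("Relationships:")
    for source, relation, target in relations:
        # Skip edges that point at documents we did not retrieve.
        if source in documents and target in documents:
            lines.append(f"- {source} --{relation}--> {target}")
    return "\n".join(lines)


# Invented summaries and relation labels, loosely following the flow above.
docs = {
    "CI/CD pipeline": "Builds the image and starts the rollout.",
    "Kubernetes deployment": "Applies the new image to the cluster.",
    "Monitoring setup": "Evaluates the rollout and raises alerts.",
}
relations = [
    ("CI/CD pipeline", "applies changes to", "Kubernetes deployment"),
    ("Kubernetes deployment", "is evaluated by", "Monitoring setup"),
    ("Monitoring setup", "alerts", "Incident runbook"),  # not retrieved, so skipped
]
print(build_context(docs, relations))
```

&lt;p&gt;The model then sees the same documents a naive setup would return, plus explicit lines stating which one feeds into which.&lt;/p&gt;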




&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The difference is subtle at first.&lt;/p&gt;

&lt;p&gt;But when you’re working with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal documentation&lt;/li&gt;
&lt;li&gt;APIs&lt;/li&gt;
&lt;li&gt;distributed systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It becomes critical.&lt;/p&gt;

&lt;p&gt;You don’t just need relevant text.&lt;/p&gt;

&lt;p&gt;You need to understand how things work together.&lt;/p&gt;




&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Mindex combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;semantic search&lt;/li&gt;
&lt;li&gt;a knowledge graph layer&lt;/li&gt;
&lt;li&gt;relationship traversal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s available via:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CLI&lt;/li&gt;
&lt;li&gt;MCP (works with tools like Claude Code, Cursor, etc.)&lt;/li&gt;
&lt;li&gt;REST API&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;You can try it here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://usemindex.dev/" rel="noopener noreferrer"&gt;https://usemindex.dev/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Feedback welcome
&lt;/h2&gt;

&lt;p&gt;I’m especially interested in feedback from people who are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;building with RAG&lt;/li&gt;
&lt;li&gt;working with internal knowledge bases&lt;/li&gt;
&lt;li&gt;building AI dev tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Curious to hear how you’re handling this today.&lt;/p&gt;

</description>
      <category>aiops</category>
      <category>rag</category>
      <category>claude</category>
      <category>cli</category>
    </item>
  </channel>
</rss>
