DEV Community: Eduardo Borges

RAG Is Failing in Production — Here’s Why (and What I’m Testing Instead)

Eduardo Borges — Tue, 21 Apr 2026 14:49:42 +0000

RAG (Retrieval-Augmented Generation) looks great in demos.

But in real-world systems, it often fails in subtle ways.

Not because retrieval is bad.

But because it lacks something more fundamental.

The Problem I Kept Seeing

Everything worked fine… until it didn’t.

Simple questions? Great.

But anything that depended on multiple systems?

That’s where things started to break.

Example:

"How does the production deploy process work?"

A typical RAG system retrieves documents like:

CI/CD pipeline
Kubernetes deployment
Monitoring setup

All relevant.

All correct.

And still… incomplete.

Why the Answer Is Still Wrong

Because the real answer is not inside a single document.

It’s in how they connect:

CI/CD triggers Kubernetes
Deploy emits metrics
Monitoring consumes those metrics
Alerts trigger incident response
Incident response triggers rollback

This is not a list.

This is a system.

And similarity-based RAG has no model of how a system fits together.

The Core Issue

RAG retrieves by similarity.

But real-world knowledge is structured by relationships.

So even when retrieval is "correct", the model gets:

fragments of truth
without the structure to connect them

That’s why answers feel incomplete.
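To make "retrieves by similarity" concrete, here is a minimal sketch of top-k retrieval. The three-dimensional vectors are made up for illustration (real embeddings come from a model and have hundreds of dimensions), and `top_k` is a hypothetical helper, not any library's API.

```python
import math

# Toy "embeddings", invented for this example.
DOCS = {
    "CI/CD pipeline": [0.9, 0.1, 0.2],
    "Kubernetes deployment": [0.8, 0.3, 0.1],
    "Monitoring setup": [0.6, 0.2, 0.6],
    "Vacation policy": [0.0, 0.9, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vec, docs, k=3):
    """Return the k document titles most similar to the query, best first."""
    ranked = sorted(docs, key=lambda title: cosine(query_vec, docs[title]), reverse=True)
    return ranked[:k]


# Stand-in vector for "How does the production deploy process work?"
query = [0.9, 0.2, 0.3]
print(top_k(query, DOCS))
# → ['CI/CD pipeline', 'Kubernetes deployment', 'Monitoring setup']
```

The result is a ranked list of titles and nothing else. Nothing in it says that one document's system triggers, feeds, or depends on another's.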

“Just Use Better Embeddings” Doesn’t Fix It

I tried that.

Better embeddings:

improve ranking
reduce noise

But they don’t fix the core problem.

You still get isolated chunks.

What I Started Testing

Instead of treating documents as independent pieces, I tried combining:

semantic search (same as RAG)
a graph of relationships between documents
retrieval of connected context
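The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: the graph contents, the "Incident response" and "Rollback runbook" titles, and the `expand` helper are all hypothetical, and the seed list stands in for whatever semantic search returned.

```python
# Hypothetical relationship graph: document title -> titles it links to.
GRAPH = {
    "CI/CD pipeline": ["Kubernetes deployment"],
    "Kubernetes deployment": ["Monitoring setup"],
    "Monitoring setup": ["Incident response"],
    "Incident response": ["Rollback runbook"],
}


def expand(seeds, graph, hops=1):
    """Return the seed documents plus neighbors within `hops` steps,
    together with the edges that were followed to reach them."""
    docs = list(seeds)
    edges = []
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for src in frontier:
            for dst in graph.get(src, []):
                edges.append((src, dst))
                if dst not in docs:
                    docs.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return docs, edges


# Pretend semantic search returned these three documents.
seeds = ["CI/CD pipeline", "Kubernetes deployment", "Monitoring setup"]
docs, edges = expand(seeds, GRAPH)
for src, dst in edges:
    print(f"{src} -> {dst}")
```

With one hop, the context gains a document that similarity search missed ("Incident response") and, more importantly, the edges that say how the pieces connect.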

So instead of:

"Here are 3 relevant documents"

You get:

"Here’s how these documents connect"

What Changed

In scenarios where context spans multiple domains:

answers became more complete
reasoning had fewer gaps
the model did less "guessing"

It’s not perfect — but the difference is noticeable.

The Tradeoff Nobody Talks About

This approach adds:

complexity
processing overhead
graph construction challenges

And I’m still figuring out:

When is this actually worth it?
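Part of the graph construction challenge is that relationships are rarely spelled out in the text. A deliberately naive sketch, assuming made-up document texts and a simple "A mentions B's title" heuristic (chosen for illustration only, not how any particular tool builds its graph):

```python
def build_edges(docs):
    """Link A -> B whenever A's text mentions B's title (case-insensitive)."""
    edges = []
    for src, text in docs.items():
        lowered = text.lower()
        for dst in docs:
            if dst != src and dst.lower() in lowered:
                edges.append((src, dst))
    return edges


# Invented document texts for the example.
DOC_TEXTS = {
    "CI/CD pipeline": "On merge, the pipeline applies manifests; see Kubernetes deployment.",
    "Kubernetes deployment": "Rollouts emit metrics to the dashboards.",
    "Monitoring setup": "Dashboards and alerts for rollout metrics.",
}

print(build_edges(DOC_TEXTS))
# → [('CI/CD pipeline', 'Kubernetes deployment')]
```

The heuristic finds the explicit reference but misses the deploy-to-monitoring link, because neither document names the other. Recovering implicit relationships like that one is where most of the extra complexity comes from.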

What I Built Around This

I ended up building a small tool to explore this idea in practice.

It ingests documents, maps relationships, and retrieves connected context instead of isolated chunks.

If you want to see it:
👉 https://usemindex.dev/

Open Question

I’m not convinced this is always the right direction.

Curious to hear from others:

Have you seen RAG fail like this in production?
Are you solving this at retrieval time?
Or relying on the model to stitch context together?

Would love to compare notes.

Why RAG Breaks in Real-World Systems (and How I’m Trying to Fix It)

Eduardo Borges — Mon, 20 Apr 2026 16:07:05 +0000

Most Retrieval-Augmented Generation (RAG) setups work well for simple questions.

But once you move to real-world systems, things start to break.

I kept running into the same issue over and over again — and it wasn’t obvious at first why.

The Problem Isn’t Retrieval — It’s Context

Let’s take a simple example:

"How does the production deploy process work?"

A typical RAG system will retrieve documents like:

CI/CD pipeline
Kubernetes deployment
Monitoring setup

Individually, these are relevant.

But they’re treated as isolated chunks of information.

Where It Breaks

In reality, the answer depends on how these systems connect:

CI/CD triggers Kubernetes
The deploy emits metrics
Monitoring consumes those metrics
Alerts trigger incident response
Incident response may trigger rollback

This is not a list of documents.

This is a chain of relationships.

And this is exactly where traditional RAG struggles.

Even when retrieval is technically "correct", the model lacks the structure to connect these pieces.
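One way to hand the model that structure is to store the chain as explicit relations and render them into the prompt context. A small sketch, where the `Relation` type and the `render_context` helper are hypothetical names introduced for this example:

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    verb: str
    target: str


# The chain from the list above, written as (source, verb, target) triples.
CHAIN = [
    Relation("CI/CD", "triggers", "Kubernetes"),
    Relation("Deploy", "emits", "metrics"),
    Relation("Monitoring", "consumes", "metrics"),
    Relation("Alerts", "trigger", "incident response"),
    Relation("Incident response", "may trigger", "rollback"),
]


def render_context(relations):
    """Turn relations into plain text that can be placed next to the retrieved chunks."""
    lines = ["Known relationships:"]
    lines += [f"- {r.source} {r.verb} {r.target}" for r in relations]
    return "\n".join(lines)


print(render_context(CHAIN))
```

The retrieved chunks stay the same. The difference is that the model no longer has to guess the connections from three unrelated pieces of text.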

Why Better Embeddings Don’t Solve It

A common reaction is:

"We just need better embeddings."

I tried that.

It improves ranking — but it doesn’t solve the core issue.

You still get:

relevant documents
but no understanding of how they relate

The model gets fragments, not structure.

What I Started Experimenting With

To address this, I started exploring a different approach:

Use embeddings for semantic search (same as RAG)
Build a knowledge graph connecting documents
Retrieve not just matches, but connected context

So instead of returning:

"Here are 3 similar documents"

You get:

"Here are the relevant documents AND how they connect"

What Changed

In scenarios where the answer spans multiple systems, the difference is noticeable.

Instead of partial answers, the model can follow the chain:

CI/CD → Kubernetes → Monitoring → Incident Response

This leads to:

more complete answers
fewer hallucinations
better reasoning across systems
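Following the chain is, at its simplest, a path search over the graph. A minimal sketch using breadth-first search, assuming a hypothetical adjacency map with the four nodes from the chain above:

```python
from collections import deque

# Hypothetical adjacency map for the chain described above.
EDGES = {
    "CI/CD": ["Kubernetes"],
    "Kubernetes": ["Monitoring"],
    "Monitoring": ["Incident Response"],
    "Incident Response": [],
}


def find_chain(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(" → ".join(find_chain(EDGES, "CI/CD", "Incident Response")))
# → CI/CD → Kubernetes → Monitoring → Incident Response
```

The edges are directed, so searching from "Incident Response" back to "CI/CD" returns `None`. That direction is exactly the cause-and-effect information a flat list of documents does not carry.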

The Tradeoff

This approach is not free.

It adds:

complexity
processing overhead
graph construction challenges

And I’m still figuring out:

When is this actually worth it, and when is it just overengineering RAG?
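A rough way to reason about the processing overhead, assuming (and this is only an assumption) that relationships are detected by comparing documents pairwise. The function names are hypothetical.

```python
import math


def all_pairs(n_docs):
    """Candidate pairs if every document is compared with every other one."""
    return math.comb(n_docs, 2)


def pruned_pairs(n_docs, k):
    """Upper bound if each document is only compared with its k nearest neighbors."""
    return n_docs * k


for n in (1_000, 10_000):
    print(n, all_pairs(n), pruned_pairs(n, 10))
# → 1000 499500 10000
# → 10000 49995000 100000
```

Exhaustive comparison grows quadratically, so some pruning step (for example, using the embeddings already computed for search to shortlist candidates) is likely needed before the graph pays for itself on larger corpora.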

What I Built

I ended up building a small tool to explore this idea in practice.

It ingests documents, builds relationships between them, and retrieves connected context instead of isolated chunks.

If you're curious, you can check it out here:
👉 https://usemindex.dev/

Open Question

I’m still early in this exploration, and I’m not convinced this is always the right approach.

Curious to hear from others:

Have you hit similar limitations with RAG?
How are you handling cross-document context today?
Are you solving this at retrieval time, or leaving it to the model?

Would love to learn how others are approaching this.

Stop using naive RAG

Eduardo Borges — Sat, 18 Apr 2026 13:45:37 +0000

Most RAG setups look good in demos — until things get slightly complex.

You ask a question, it retrieves “relevant” chunks, and everything seems fine.

But as soon as the answer spans multiple documents (APIs, billing, infra, workflows), things start breaking down.

Not because the information isn’t there.
But because the relationships between the documents are lost.

The problem with RAG

RAG works by retrieving chunks based on similarity.

That means:

It finds text that looks relevant
But doesn’t understand how pieces connect
And can’t reconstruct system behavior

So you end up with answers that are:

technically correct
but incomplete
and often misleading

Real systems aren’t flat

In real systems:

a deploy triggers a pipeline
the pipeline applies changes to Kubernetes
monitoring evaluates the rollout
failures trigger rollback logic

None of this lives in a single document.

And RAG doesn’t connect these dots.
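The flow above has an order, and that order can be recovered once the cause-and-effect edges are known. A minimal sketch using Python's standard `graphlib` module (Python 3.9+). The `FLOW` map is a hypothetical encoding of the four steps listed above.

```python
from graphlib import TopologicalSorter

# Hypothetical cause -> effects map for the flow described above.
FLOW = {
    "deploy": ["pipeline"],
    "pipeline": ["kubernetes rollout"],
    "kubernetes rollout": ["monitoring"],
    "monitoring": ["rollback"],
}


def ordered_steps(flow):
    """Return the steps so that every cause comes before its effects."""
    sorter = TopologicalSorter()
    for cause, effects in flow.items():
        for effect in effects:
            # TopologicalSorter.add takes a node followed by its predecessors.
            sorter.add(effect, cause)
    return list(sorter.static_order())


print(" → ".join(ordered_steps(FLOW)))
# → deploy → pipeline → kubernetes rollout → monitoring → rollback
```

A similarity ranking could return the same five pieces in any order. The ordering here comes entirely from the edges, which is the information that chunk retrieval throws away.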

What I built instead

I built Mindex:
https://usemindex.dev/

Instead of just retrieving chunks, it builds a knowledge graph on top of your documents.

So your AI can:

connect documents
follow relationships
reconstruct flows

Not just match text.

RAG vs Graph-based context

Here’s a simplified comparison:

❌ Naive RAG

Returns a flat list of documents
No relationships
No ordering
No system understanding

✅ Mindex (GraphRAG)

Connects documents
Traverses relationships
Infers flows (cause → effect)
Provides structured context

Why this matters

The difference is subtle at first.

But when you're working with:

internal documentation
APIs
distributed systems

it becomes critical.

You don’t just need relevant text.

You need to understand how things work together.

How it works

Mindex combines:

semantic search
a knowledge graph layer
relationship traversal

It’s available via:

CLI
MCP (Model Context Protocol; works with tools like Claude Code, Cursor, etc.)
REST API

Try it

You can try it here:

https://usemindex.dev/

Feedback welcome

I’m especially interested in feedback from people:

building with RAG
working with internal knowledge bases
building AI dev tools

Curious to hear how you're handling this today.