When working on integrations (Stripe, APIs, SDKs, etc.), I kept running into the same problem.
You search something simple like:
“how to generate an API key”
And then:
- you open 3–5 documentation pages
- each page explains a different piece
- code examples are there… but not clearly tied to what you need
- you end up stitching everything together manually
Even with AI tools, things aren't much better:
- answers are often generic
- sometimes not grounded in the actual docs
- or missing key implementation details
After going through this over and over again, I decided to build something for it.
🚀 Introducing DocsRAG
DocsRAG is an open-source, local-first platform that turns public documentation into a grounded reasoning layer.
Instead of treating docs as just text to search, it tries to understand their structure and answer questions like an engineer would:
- explanation first
- then relevant code examples
- backed by actual documentation
- with citations
👉 Repo: https://github.com/Ando22/rag-docs
💡 The idea
Most documentation is written for browsing, not for implementation-time reasoning.
But when you’re coding, you don’t want:
- long explanations
- full pages
- or unrelated examples
You want:
👉 “What exactly do I need to do?”
DocsRAG tries to bridge that gap.
😵 The problem
From my experience (and probably yours too):
- Docs are large and fragmented
- Useful answers are spread across multiple pages
- Code examples and explanations are not tightly connected
- AI tools hallucinate when context is weak
- Docs chatbots return “related” answers, not precise ones
And most RAG systems treat everything the same:
- explanation
- reference
- examples
Which leads to… mediocre answers.
🛠️ The approach
DocsRAG is built as a multi-stage pipeline, not just “retrieve and prompt”.
Ingestion
- Crawl public documentation
- Extract structured sections
- Separate explanation chunks from code examples
- Keep them linked by section/page
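The chunk-separation step above can be sketched in a few lines. This is a minimal illustration (not DocsRAG's actual code, and the `(kind, text)` block format is an assumption for the sketch): explanation chunks and code chunks are stored separately but stay linked through the section they came from.

```python
# Minimal sketch of the ingestion idea (not DocsRAG's actual code):
# explanation chunks and code chunks live in separate lists, linked
# by the section they belong to.
from dataclasses import dataclass, field

@dataclass
class SectionChunks:
    section: str
    explanations: list[str] = field(default_factory=list)
    code_examples: list[str] = field(default_factory=list)

def split_blocks(blocks: list[tuple[str, str]]) -> dict[str, SectionChunks]:
    """blocks: (kind, text) pairs, where kind is 'heading', 'text', or 'code'."""
    sections: dict[str, SectionChunks] = {}
    current = "intro"  # chunks before the first heading
    for kind, text in blocks:
        if kind == "heading":
            current = text
            continue
        chunk = sections.setdefault(current, SectionChunks(section=current))
        if kind == "code":
            chunk.code_examples.append(text)
        else:
            chunk.explanations.append(text)
    return sections
```

Because each chunk keeps its section, a retrieved explanation can always pull in the code examples that sit next to it in the docs.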
Ask flow
- Analyze intent
- Retrieve explanation-first
- Rerank results
- Attach only relevant code examples
- Validate whether the docs actually support the answer
- Generate grounded response with citations
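The ask flow above can be sketched as a short pipeline. Everything here is a simplified stand-in (word overlap instead of vector retrieval and reranking, and no LLM generation step), just to show the shape:

```python
# Toy sketch of the ask flow's shape; every helper is a simplified
# stand-in, not a real DocsRAG component.
def _overlap(question: str, text: str) -> int:
    return len(set(question.lower().split()) & set(text.lower().split()))

def retrieve(question: str, chunks: list[dict], k: int = 3) -> list[dict]:
    # stand-in for vector retrieval + reranking: lexical overlap score
    return sorted(chunks, key=lambda c: -_overlap(question, c["text"]))[:k]

def supported(question: str, hits: list[dict]) -> bool:
    # validation gate: refuse to answer when nothing actually matches
    return any(_overlap(question, h["text"]) > 0 for h in hits)

def answer(question: str, chunks: list[dict]) -> dict:
    hits = retrieve(question, chunks)
    if not supported(question, hits):
        return {"answer": None, "code": [], "citations": []}
    return {
        "answer": hits[0]["text"],                           # explanation first
        "code": [h["code"] for h in hits if h.get("code")],  # linked examples only
        "citations": [h["source"] for h in hits],            # ground the answer
    }
```

The key detail is the validation gate: when the retrieved docs don't actually support the question, the pipeline declines instead of generating a confident-sounding guess.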
🧠 High-level architecture
Roughly: crawler → extraction → chunking (explanation vs code) → vector store + metadata store → retrieval → reranking → validation → grounded answer.
⚙️ Tech stack
- FastAPI (backend)
- Next.js + React (frontend)
- Chroma (vector DB)
- SQLite (metadata)
- Trafilatura + BeautifulSoup (extraction)
- OpenAI-compatible models (BYO provider)
- Typer CLI
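The storage split in the stack is worth a note: the vector DB only needs chunk ids and embeddings for retrieval, while SQLite keeps the chunk ↔ page/section links. A minimal sketch of that division (sqlite3 from the stdlib; a plain dict stands in for Chroma, and the schema is an illustration, not the project's actual one):

```python
import sqlite3

# Metadata side: SQLite stores which page/section each chunk belongs to,
# so a retrieved explanation can be joined back to its code examples.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chunks (
        id      TEXT PRIMARY KEY,
        page    TEXT NOT NULL,
        section TEXT NOT NULL,
        kind    TEXT CHECK (kind IN ('explanation', 'code'))
    )
""")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?, ?, ?)",
    [
        ("c1", "auth.html", "API keys", "explanation"),
        ("c2", "auth.html", "API keys", "code"),
    ],
)

# Vector side: only ids + embeddings (a dict stands in for Chroma here).
vector_db = {"c1": [0.1, 0.9]}

def code_for(chunk_id: str) -> list[str]:
    """Given a retrieved explanation chunk, find its linked code chunks."""
    page_section = conn.execute(
        "SELECT page, section FROM chunks WHERE id = ?", (chunk_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT id FROM chunks WHERE page = ? AND section = ? AND kind = 'code'",
        page_section,
    ).fetchall()
    return [r[0] for r in rows]
```

Keeping linkage in SQLite means the vector index stays small and the "attach only relevant code examples" step is a cheap relational lookup.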
🔥 Why I think this matters (especially now)
With AI-assisted coding (or what people call “vibe coding”), we move faster.
But:
- docs are still the source of truth
- AI answers are not always reliable
DocsRAG tries to combine both:
👉 fast iteration + grounded answers
🧪 Current state
This is still early, but the core loop already works:
- ingest public docs
- ask questions
- get explanation-first answers
- see citations
- attach relevant code examples
🤝 Looking for contributors
If this resonates with you, I’d love contributions.
Interesting areas:
- better parsing (docs are messy 😅)
- retrieval improvements
- code example ranking
- UI/UX improvements
- evaluation / benchmarking
- MCP / agent integrations
Even small contributions (docs, testing, feedback) are super helpful.
🎯 Goal
The long-term goal is simple:
👉 Make documentation actually usable during coding
Not just readable — but actionable.
🙌 Closing
This started from a very personal frustration: constantly jumping between docs while coding.
If you've experienced the same thing, I'd love to hear your thoughts.