DEV Community

Erando Putra
I built DocsRAG — because reading docs during coding is still painful

When working on integrations (Stripe, APIs, SDKs, etc.), I kept running into the same problem.

You search something simple like:

“how to generate an API key”

And then:

  • you open 3–5 documentation pages
  • each page explains a different piece
  • code examples are there… but not clearly tied to what you need
  • you end up stitching everything together manually

Even with AI tools, it’s not much better:

  • answers are often generic
  • sometimes not grounded in the actual docs
  • or missing key implementation details

After going through this over and over again, I decided to build something for it.


🚀 Introducing DocsRAG

DocsRAG is an open-source, local-first platform that turns public documentation into a grounded reasoning layer.

Instead of treating docs as just text to search, it tries to understand their structure and answer questions like an engineer would:

  • explanation first
  • then relevant code examples
  • backed by actual documentation
  • with citations

👉 Repo: https://github.com/Ando22/rag-docs


💡 The idea

Most documentation is written for browsing, not for implementation-time reasoning.

But when you’re coding, you don’t want:

  • long explanations
  • full pages
  • or unrelated examples

You want:

👉 “What exactly do I need to do?”

DocsRAG tries to bridge that gap.


😵 The problem

In my experience (and probably yours too):

  • Docs are large and fragmented
  • Useful answers are spread across multiple pages
  • Code examples and explanations are not tightly connected
  • AI tools hallucinate when context is weak
  • Docs chatbots return “related” answers, not precise ones

On top of that, most RAG systems treat every kind of content the same way:

  • explanation
  • reference
  • examples

Which leads to… mediocre answers.


🛠️ The approach

DocsRAG is built as a multi-stage pipeline, not just “retrieve and prompt”.

Ingestion

  • Crawl public documentation
  • Extract structured sections
  • Separate:
    • explanation chunks
    • code examples
  • Keep them linked by section/page
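
To make the ingestion split concrete, here is a minimal sketch of the idea. It uses only Python's standard-library `html.parser`, not the Trafilatura + BeautifulSoup extraction listed in the tech stack. The names `Chunk`, `SectionSplitter`, and `split_page` are hypothetical and not taken from the repo, and the rule "headings start a section, `<pre>` holds code" is an assumption about how a docs page is laid out.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Chunk:
    kind: str       # "explanation" or "code"
    section: str    # heading the chunk belongs to
    page_url: str   # page the chunk was extracted from
    text: str


class SectionSplitter(HTMLParser):
    """Splits one HTML page into explanation and code chunks per section."""

    HEADINGS = {"h1", "h2", "h3"}
    BLOCKS = {"p", "li", "div"}

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.chunks: list[Chunk] = []
        self._section = "(untitled)"
        self._mode = "explanation"  # "heading", "code", or "explanation"
        self._buffer: list[str] = []

    def flush(self, kind: str) -> None:
        """Emits the buffered text as one chunk of the given kind, if non-empty."""
        raw = "".join(self._buffer)
        # Code keeps its inner whitespace; prose is collapsed to single spaces.
        text = raw.strip("\n") if kind == "code" else " ".join(raw.split())
        self._buffer = []
        if text:
            self.chunks.append(Chunk(kind, self._section, self.page_url, text))

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.flush("explanation")   # close the previous section's prose
            self._mode = "heading"
        elif tag == "pre":
            self.flush("explanation")   # prose before the code block
            self._mode = "code"

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            # The buffered heading text becomes the current section name.
            self._section = " ".join("".join(self._buffer).split())
            self._buffer = []
            self._mode = "explanation"
        elif tag == "pre":
            self.flush("code")
            self._mode = "explanation"
        elif tag in self.BLOCKS and self._mode != "code":
            self._buffer.append(" ")    # keep adjacent paragraphs apart

    def handle_data(self, data):
        self._buffer.append(data)


def split_page(html: str, page_url: str) -> list[Chunk]:
    parser = SectionSplitter(page_url)
    parser.feed(html)
    parser.close()
    parser.flush("explanation")  # trailing prose after the last tag
    return parser.chunks
```

Because every chunk carries its `section` and `page_url`, a code example can later be looked up from the explanation it sits next to, which is the link the last bullet describes.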

Ask flow

  • Analyze intent
  • Retrieve explanation-first
  • Rerank results
  • Attach only relevant code examples
  • Validate whether the docs actually support the answer
  • Generate grounded response with citations
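
The ask flow can be sketched as a toy, in-memory pipeline. This is not the project's implementation: token overlap stands in for vector retrieval and reranking, a keyword check stands in for intent analysis, and the final LLM call is left out. The names `ask`, `overlap`, and `min_support` are hypothetical. In this sketch the support check runs before code is attached, so unsupported questions return early.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str      # "explanation" or "code"
    section: str
    url: str
    text: str


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the text (toy relevance score)."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def ask(query: str, chunks: list[Chunk], min_support: float = 0.5) -> dict:
    # Intent: a crude keyword check standing in for real intent analysis.
    wants_code = bool(tokens(query) & {"how", "example", "generate", "create"})

    # Retrieve explanations first, then rank them by the toy score.
    explanations = [c for c in chunks if c.kind == "explanation"]
    ranked = sorted(explanations, key=lambda c: overlap(query, c.text), reverse=True)
    best = ranked[0] if ranked else None

    # Validate: refuse when the docs do not support the question well enough.
    if best is None or overlap(query, best.text) < min_support:
        return {"supported": False, "answer": None, "code": [], "citations": []}

    # Attach only code examples linked to the same section and page.
    code = [
        c.text
        for c in chunks
        if wants_code and c.kind == "code"
        and (c.section, c.url) == (best.section, best.url)
    ]

    # A real system would prompt an LLM here; this returns the grounded pieces.
    return {"supported": True, "answer": best.text, "code": code,
            "citations": [best.url]}
```

The design point is the early return: when nothing in the docs clears the support threshold, the pipeline says so instead of generating an answer from weak context.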

🧠 High-level architecture

[Diagram: DocsRAG architecture concept]


⚙️ Tech stack

  • FastAPI (backend)
  • Next.js + React (frontend)
  • Chroma (vector DB)
  • SQLite (metadata)
  • Trafilatura + BeautifulSoup (extraction)
  • OpenAI-compatible models (BYO provider)
  • Typer CLI
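
As one illustration of how the SQLite metadata layer could keep explanations and code examples linked by section, here is a small sketch using Python's built-in `sqlite3` module. The schema, table names, and helper functions are assumptions made for this example, not the repo's actual layout, and storing chunk text in SQLite (rather than only in Chroma) is also an assumption.

```python
import sqlite3

# Hypothetical schema: one row per docs section, with chunks pointing back to it.
SCHEMA = """
CREATE TABLE sections (
    id       INTEGER PRIMARY KEY,
    page_url TEXT NOT NULL,
    heading  TEXT NOT NULL
);
CREATE TABLE chunks (
    id         INTEGER PRIMARY KEY,
    section_id INTEGER NOT NULL REFERENCES sections(id),
    kind       TEXT NOT NULL CHECK (kind IN ('explanation', 'code')),
    text       TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Opens a database and creates the sketch schema (fresh databases only)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_section(conn, page_url, heading, explanation, code_examples) -> int:
    """Stores one section with its explanation and code chunks; returns the section id."""
    cur = conn.execute(
        "INSERT INTO sections (page_url, heading) VALUES (?, ?)", (page_url, heading)
    )
    section_id = cur.lastrowid
    rows = [(section_id, "explanation", explanation)]
    rows += [(section_id, "code", code) for code in code_examples]
    conn.executemany(
        "INSERT INTO chunks (section_id, kind, text) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return section_id


def code_for_section(conn, section_id: int) -> list[str]:
    """Returns the code examples linked to a section, in insertion order."""
    cur = conn.execute(
        "SELECT text FROM chunks WHERE section_id = ? AND kind = 'code' ORDER BY id",
        (section_id,),
    )
    return [row[0] for row in cur.fetchall()]
```

With a layout like this, once retrieval picks an explanation chunk, its `section_id` is enough to fetch only the code examples that belong to it.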

🔥 Why I think this matters (especially now)

With AI-assisted coding (or what people call “vibe coding”), we move faster.

But:

  • docs are still the source of truth
  • AI answers are not always reliable

DocsRAG tries to combine both:

👉 fast iteration + grounded answers


🧪 Current state

This is still early, but the core flow already works:

  • ingest public docs
  • ask questions
  • get explanation-first answers
  • see citations
  • attach relevant code examples

🤝 Looking for contributors

If this resonates with you, I’d love contributions.

Interesting areas:

  • better parsing (docs are messy 😅)
  • retrieval improvements
  • code example ranking
  • UI/UX improvements
  • evaluation / benchmarking
  • MCP / agent integrations

Even small contributions (docs, testing, feedback) are super helpful.


🎯 Goal

The long-term goal is simple:

👉 Make documentation actually usable during coding

Not just readable, but actionable.


🙌 Closing

This started from a very personal frustration:

constantly jumping between docs while coding

If you’ve experienced the same thing, I’d love to hear your thoughts.
