When I first launched Docify, the goal was to build a reliable way to chat with your documents locally. It worked well for a handful of PDFs, but as the library grew, the limitations of a "one-size-fits-all" search became clear. A single giant index works for simple questions, but it struggles when you want to compare two research papers or find deep insights hidden across hundreds of files.
What’s New: The "Agentic" Shift
In the first version, Docify would look through every single chunk of text in your workspace at once. This was slow and often noisy.
In v2, every document you upload is essentially treated as its own "mini-expert" or Document Agent. When you ask a question, the system doesn't just dive into the data—it actually plans its approach.
Smarter Query Planning
Instead of just searching, Docify now "thinks" first. If you ask for a comparison between two documents, it recognizes that intent and orchestrates a search across those specific entities. If you ask a general question, it sweeps the workspace to find the most relevant "experts" to consult.
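To make the idea concrete, here is a minimal sketch of intent-based query planning. The names (`QueryPlan`, `plan_query`) and the keyword heuristic are illustrative assumptions, not Docify's actual planner:

```python
from dataclasses import dataclass, field

@dataclass
class QueryPlan:
    intent: str                                   # "compare" or "general"
    targets: list = field(default_factory=list)   # document "experts" to consult

def plan_query(question: str, doc_titles: list) -> QueryPlan:
    """Decide which document agents to consult before retrieving anything."""
    q = question.lower()
    mentioned = [t for t in doc_titles if t.lower() in q]
    if len(mentioned) >= 2 and any(w in q for w in ("compare", "versus", " vs")):
        # Comparison intent: orchestrate a search across the named documents only.
        return QueryPlan(intent="compare", targets=mentioned)
    # General question: sweep the whole workspace for relevant experts.
    return QueryPlan(intent="general", targets=doc_titles)
```

A real planner would likely use the LLM itself to classify intent, but the shape is the same: the plan is produced before any retrieval happens.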
Parallel Document Retrieval
Once the system knows which documents are relevant, it searches them in parallel. By treating documents as independent agents, we can fetch information from multiple sources simultaneously. This makes the system feel much snappier, even as your workspace grows.
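The fan-out itself is straightforward. This sketch assumes each document agent exposes a `search(query)` method; `DocumentAgent` and its placeholder scoring are hypothetical stand-ins for the real retrieval logic:

```python
from concurrent.futures import ThreadPoolExecutor

class DocumentAgent:
    """One 'mini-expert' wrapping a single document's chunks."""
    def __init__(self, title, chunks):
        self.title = title
        self.chunks = chunks

    def search(self, query):
        # Placeholder match: a real agent would combine vector and keyword search.
        return [c for c in self.chunks if query.lower() in c.lower()]

def retrieve_parallel(agents, query, max_workers=8):
    """Query every relevant document agent simultaneously."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda a: (a.title, a.search(query)), agents)
    return dict(results)
```

Because each agent only touches its own index, the workers never contend with each other, and total latency tracks the slowest single document rather than the sum of all of them.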
Better Retrieval, Better Answers
Finding the right information is only half the battle; ensuring the AI uses it correctly is the other half.
- Hybrid Search (SQL-Native): We’ve combined the best of both worlds—semantic "meaning-based" search and traditional keyword search. This is now handled directly inside the database, making it incredibly fast and much better at finding exact names or technical terms that semantic search sometimes misses.
- Strict Grounding: One of the biggest issues with AI is "hallucination." In v2, we’ve implemented stricter rules for how the AI cites its sources. If the information isn’t in your documents, the system will tell you, rather than making up a plausible-sounding answer.
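For a feel of what "SQL-native" hybrid search can look like, here is a sketch using SQLite's built-in FTS5 for the keyword leg and reciprocal-rank fusion to merge it with a semantic ranking. The fusion constant and the shape of `semantic_ranks` are assumptions; Docify's real queries and weighting may differ:

```python
import sqlite3

def hybrid_search(conn, query, semantic_ranks, k=60):
    """Fuse FTS5 keyword ranks with semantic ranks via reciprocal-rank fusion."""
    rows = conn.execute(
        "SELECT rowid FROM chunks WHERE chunks MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
    keyword_ranks = {rowid: i for i, (rowid,) in enumerate(rows)}
    scores = {}
    for ranks in (keyword_ranks, semantic_ranks):
        for rowid, r in ranks.items():
            scores[rowid] = scores.get(rowid, 0.0) + 1.0 / (k + r)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

The keyword leg is what catches exact names and technical terms; the semantic leg catches paraphrases. Fusing ranks rather than raw scores sidesteps the problem that BM25 and cosine similarity live on incomparable scales.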
Hardware-Aware Performance
Since Docify is a local-first application, everyone’s computer is different. v2 includes Hardware Detection that tunes the system to your specific machine.
- If you have a GPU or Apple Silicon (Metal), it automatically enables more powerful models and larger context windows for deeper reading.
- If you’re on a standard CPU, it intelligently scales down to more efficient models and optimized thread counts so your computer doesn't lock up while searching.
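A detection pass like this can be sketched in a few lines. The tier names, context sizes, and the environment-variable check for CUDA are illustrative guesses, not Docify's real configuration:

```python
import os
import platform

def detect_profile():
    """Pick a backend, context size, and thread count for the local machine."""
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        # Apple Silicon: Metal acceleration, larger context window.
        return {"backend": "metal", "context_tokens": 8192, "threads": os.cpu_count()}
    if os.environ.get("CUDA_VISIBLE_DEVICES"):
        # NVIDIA GPU visible: enable the larger model.
        return {"backend": "cuda", "context_tokens": 8192, "threads": os.cpu_count()}
    # CPU fallback: smaller model, and leave headroom so the machine stays responsive.
    return {"backend": "cpu", "context_tokens": 2048,
            "threads": max(1, (os.cpu_count() or 2) // 2)}
```

The important design point is that the same codebase degrades gracefully: the profile only changes model choice and parallelism, never the retrieval logic itself.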
Looking Ahead
The move to a multi-document agent architecture isn't just a performance boost—it changes how you interact with your knowledge. Instead of searching a database, you're essentially orchestrating a team of experts that live in your documents.
I'm continuing to refine the system to make it even more intuitive. If you’re interested in building local-first RAG or want to see the code behind the agents, check out the Docify repository on GitHub.