DEV Community

Ayush Patel

We Built a GraphRAG System Over 14,000 Research Papers!! Here's What We Learned

For the TigerGraph GraphRAG Hackathon, we built GyanCortex — a Q&A system that answers factual and multi-hop questions over 14,247 AI/ML research papers.

The core question: does adding a knowledge graph on top of vector search actually help?


What We Built

Three retrieval pipelines, one benchmark (16 hand-authored questions):

  1. LLM-Only — keyword filter → dump papers into Gemini. Simple baseline.
  2. Hybrid RAG — Qdrant dense + sparse retrieval, cross-encoder reranking, query decomposition for multi-hop.
  3. GraphRAG — everything in Pipeline 2, plus TigerGraph for citation expansion (CITES edges) and topic linking (HAS_TOPIC edges).
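The post does not say how the dense and sparse result lists from Qdrant were merged before reranking. As a rough illustration only, here is one common way to do it, reciprocal rank fusion (RRF). The function name, the paper IDs, and the constant `k=60` are assumptions for this sketch, not details of GyanCortex.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of paper IDs into a single ranking.

    Each paper earns 1 / (k + rank) from every list it appears in,
    so papers ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, paper_id in enumerate(ranking, start=1):
            scores[paper_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical result lists (best hit first) from the two retrievers.
dense_hits = ["p3", "p1", "p7"]
sparse_hits = ["p1", "p9", "p3"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

In a pipeline like the one described, the fused list would then go to the cross-encoder reranker, which rescores the top candidates against the full question.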

Results

Pipeline      Pass Rate   Avg Latency
LLM-Only      31.2%       29s
Hybrid RAG    93.8%       115s
GraphRAG      100%        50s

GraphRAG was both more accurate than Hybrid RAG alone and 2.3× faster on average (50s vs. 115s).
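To make the numbers concrete, the arithmetic can be reproduced in a few lines. The per-pipeline pass counts below (5, 15, and 16 out of 16) are not stated in the post; they are inferred as the only whole-number counts consistent with the reported percentages on a 16-question benchmark.

```python
TOTAL_QUESTIONS = 16

# Inferred pass counts (assumption): the only integers that match the reported rates.
passed = {"LLM-Only": 5, "Hybrid RAG": 15, "GraphRAG": 16}

# Average latency in seconds, as reported in the results.
avg_latency_s = {"LLM-Only": 29, "Hybrid RAG": 115, "GraphRAG": 50}

# Pass rate as a percentage, rounded to one decimal place.
pass_rate = {
    name: round(100 * count / TOTAL_QUESTIONS, 1)
    for name, count in passed.items()
}

# How many times faster GraphRAG is than Hybrid RAG on average.
speedup = avg_latency_s["Hybrid RAG"] / avg_latency_s["GraphRAG"]

for name in passed:
    print(f"{name}: {pass_rate[name]}% at {avg_latency_s[name]}s")
print(f"GraphRAG speedup over Hybrid RAG: {speedup:.1f}x")
```

With only 16 questions, each question moves the pass rate by 6.25 percentage points, so the gap between 93.8% and 100% is a single question.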


Why the Graph Helps

Vector search is good at finding semantically similar papers. It struggles with
papers that are related but phrased differently — exactly what multi-hop
questions need.

TigerGraph let us traverse citation networks and topic clusters to surface papers
the vector index ranked poorly. The one question Hybrid RAG failed was a
multi-hop synthesis question — the graph found the right papers, vector search
didn't.

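The graph traversal adds ~2–5s per query. Even with that overhead, GraphRAG's average latency (50s) stayed well below Hybrid RAG's (115s), and it answered the one question Hybrid RAG missed, so the cost is worth it.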
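The expansion step described above can be sketched with plain dictionaries standing in for the graph. This is not TigerGraph code and not the GyanCortex implementation: the function name, the one-hop limit, and the vote-count ranking are all assumptions made for illustration. Only the edge names CITES and HAS_TOPIC come from the post.

```python
def expand_with_graph(seeds, cites, has_topic, max_extra=5):
    """Add one-hop graph neighbours to the papers found by vector search.

    seeds:     paper IDs returned by the vector index
    cites:     paper ID -> list of paper IDs it cites (CITES edges)
    has_topic: paper ID -> list of topic names (HAS_TOPIC edges)
    """
    seed_set = set(seeds)

    # Invert HAS_TOPIC so we can look up all papers sharing a topic.
    papers_by_topic = {}
    for paper, topics in has_topic.items():
        for topic in topics:
            papers_by_topic.setdefault(topic, set()).add(paper)

    votes = {}  # candidate paper -> number of seeds it is linked to
    for seed in seeds:
        neighbours = set(cites.get(seed, ()))                         # papers the seed cites
        neighbours |= {p for p, refs in cites.items() if seed in refs}  # papers citing the seed
        for topic in has_topic.get(seed, ()):                         # papers sharing a topic
            neighbours |= papers_by_topic[topic]
        for candidate in neighbours - seed_set:
            votes[candidate] = votes.get(candidate, 0) + 1

    # Keep the candidates linked to the most seeds; ties broken by ID.
    extra = sorted(votes, key=lambda p: (-votes[p], p))[:max_extra]
    return list(seeds) + extra


# Hypothetical toy graph.
cites = {"A": ["B"], "C": ["A"], "D": ["E"]}
has_topic = {"A": ["rag"], "B": ["rag"], "F": ["rag"], "E": ["vision"]}

expanded = expand_with_graph(["A"], cites, has_topic)
print(expanded)
```

Here "C" and "F" reach the candidate set purely through graph links, without any text similarity to the query, which is the kind of paper the vector index can rank poorly.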


Stack

TigerGraph · Qdrant · Llama 3.3 70B via Groq · FastAPI · React
