<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kavyanjali</title>
    <description>The latest articles on DEV Community by Kavyanjali (@kavyanjali_lingam).</description>
    <link>https://dev.to/kavyanjali_lingam</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936603%2F24cea3de-5916-4842-8c07-f03c67c9353e.jpg</url>
      <title>DEV Community: Kavyanjali</title>
      <link>https://dev.to/kavyanjali_lingam</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kavyanjali_lingam"/>
    <language>en</language>
    <item>
      <title>Building a Biomedical GraphRAG Inference System: Comparing LLM-Only, Basic RAG, and GraphRAG Pipelines</title>
      <dc:creator>Kavyanjali</dc:creator>
      <pubDate>Sun, 17 May 2026 17:44:52 +0000</pubDate>
      <link>https://dev.to/kavyanjali_lingam/building-a-biomedical-graphrag-inference-system-with-tigergraph-and-llm-benchmarking-410n</link>
      <guid>https://dev.to/kavyanjali_lingam/building-a-biomedical-graphrag-inference-system-with-tigergraph-and-llm-benchmarking-410n</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As enterprise adoption of LLMs grows, inference costs, hallucinations, and retrieval inefficiencies are becoming major production challenges.&lt;/p&gt;

&lt;p&gt;Traditional vector-based Retrieval-Augmented Generation (RAG) improves grounding, but it still struggles with multi-hop reasoning and relationship-aware retrieval.&lt;/p&gt;

&lt;p&gt;For the TigerGraph GraphRAG Inference Hackathon, our team built a complete biomedical GraphRAG inference system that compares:&lt;/p&gt;

&lt;p&gt;• LLM-only inference&lt;br&gt;
• Basic RAG (Vector + LLM)&lt;br&gt;
• GraphRAG (Knowledge Graph + LLM)&lt;/p&gt;

&lt;p&gt;across latency, token usage, cost, grounded accuracy, and reasoning quality.&lt;/p&gt;

&lt;p&gt;Our goal was simple:&lt;/p&gt;

&lt;p&gt;Can GraphRAG reduce token usage while maintaining grounded and explainable answers?&lt;/p&gt;

&lt;p&gt;Main benchmarking dashboard comparing LLM-only, Basic RAG, and GraphRAG pipelines.&lt;/p&gt;

&lt;p&gt;🔗 GitHub Repository:&lt;br&gt;
&lt;a href="https://github.com/SIDHANTH-S/graphrag-inference-system" rel="noopener noreferrer"&gt;https://github.com/SIDHANTH-S/graphrag-inference-system&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🌐 Live Demo:&lt;br&gt;
&lt;a href="http://52.172.150.0:3000/" rel="noopener noreferrer"&gt;http://52.172.150.0:3000/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🎥 Demo Video:&lt;br&gt;
&lt;a href="https://drive.google.com/file/d/1CKCUYpRbdjh9qdTHKyu5V2V8J5c0lgRr/view?usp=sharing" rel="noopener noreferrer"&gt;https://drive.google.com/file/d/1CKCUYpRbdjh9qdTHKyu5V2V8J5c0lgRr/view?usp=sharing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why We Built This&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LLMs are powerful, but production AI systems face several challenges:&lt;/p&gt;

&lt;p&gt;• Hallucinated answers&lt;br&gt;
• Expensive context windows&lt;br&gt;
• Retrieval noise&lt;br&gt;
• Weak explainability&lt;br&gt;
• Difficulty performing multi-hop reasoning&lt;/p&gt;

&lt;p&gt;Basic RAG pipelines solve part of the problem by retrieving semantically similar chunks from vector databases.&lt;/p&gt;

&lt;p&gt;However, semantic similarity alone is often insufficient for domains like biomedicine, where relationships between drugs, diseases, enzymes, and pathways are highly structured.&lt;/p&gt;

&lt;p&gt;This is where GraphRAG becomes powerful.&lt;/p&gt;

&lt;p&gt;Instead of retrieving only semantically similar text, GraphRAG retrieves entities and relationships from a structured knowledge graph, enabling explainable and relationship-aware reasoning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our platform combines:&lt;/p&gt;

&lt;p&gt;• FAISS for semantic vector retrieval&lt;br&gt;
• TigerGraph for structured biomedical relationships&lt;br&gt;
• LLM-based entity extraction and answer synthesis&lt;br&gt;
• A benchmarking dashboard for evaluation and analytics&lt;/p&gt;

&lt;p&gt;End-to-End Architecture&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyozuofjq1scxg4mbqrhb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyozuofjq1scxg4mbqrhb.png" alt=" " width="800" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Three Pipelines&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. LLM-Only Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This serves as the baseline pipeline.&lt;/p&gt;

&lt;p&gt;The user query is sent directly to the LLM without any retrieval or grounding.&lt;/p&gt;

&lt;p&gt;Advantages:&lt;br&gt;
• Fast&lt;br&gt;
• Simple&lt;/p&gt;

&lt;p&gt;Limitations:&lt;br&gt;
• High hallucination risk&lt;br&gt;
• No evidence grounding&lt;br&gt;
• Poor explainability&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Basic RAG Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Basic RAG pipeline retrieves semantically similar chunks using FAISS embeddings.&lt;/p&gt;

&lt;p&gt;Pipeline flow:&lt;/p&gt;

&lt;p&gt;Query&lt;br&gt;
→ Embedding generation&lt;br&gt;
→ Vector retrieval&lt;br&gt;
→ Context injection&lt;br&gt;
→ LLM answer generation&lt;/p&gt;
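&lt;p&gt;The flow above can be sketched in a few lines of Python. This is a toy stand-in, not the project's implementation: embed() is a hashed bag-of-words function rather than a real embedding model, and a plain in-memory list replaces the FAISS index, but the retrieve-then-inject shape is the same.&lt;/p&gt;

```python
# Toy sketch of the Basic RAG flow. embed() is a hypothetical
# stand-in (normalized bag-of-words), not a real embedding model,
# and a Python list stands in for the FAISS index.
import math
from collections import Counter

def embed(text):
    """Toy embedding: L2-normalized word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}

def cosine(a, b):
    """Dot product of two sparse unit vectors."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def retrieve(query, chunks, k=2):
    """Vector retrieval step: rank chunks by similarity to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, context_chunks):
    """Context injection step: retrieved chunks ground the LLM call."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Alloxan induces diabetes in animal models.",
    "Diabetes increases the risk of arteriosclerosis.",
    "Aspirin inhibits platelet aggregation.",
]
top = retrieve("alloxan diabetes", chunks)
prompt = build_prompt("How does alloxan relate to diabetes?", top)
```

&lt;p&gt;In the real pipeline the final prompt goes to the LLM; here it is just assembled to show where the retrieved context lands.&lt;/p&gt;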

&lt;p&gt;Advantages:&lt;br&gt;
• Better grounding than pure LLM inference&lt;br&gt;
• Reduced hallucinations&lt;/p&gt;

&lt;p&gt;Limitations:&lt;br&gt;
• Retrieval noise&lt;br&gt;
• Weak relationship understanding&lt;br&gt;
• Difficulty with multi-hop reasoning&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. GraphRAG Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The GraphRAG pipeline combines semantic retrieval with structured graph traversal.&lt;/p&gt;

&lt;p&gt;The workflow includes:&lt;/p&gt;

&lt;p&gt;• Query entity extraction&lt;br&gt;
• Entity-to-graph resolution&lt;br&gt;
• Multi-hop graph expansion in TigerGraph&lt;br&gt;
• Evidence fusion&lt;br&gt;
• Grounded answer synthesis&lt;/p&gt;
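&lt;p&gt;The multi-hop expansion step can be illustrated with a plain adjacency dict standing in for TigerGraph (a real deployment would issue a GSQL query against the graph); the entities and edge labels here are illustrative, not the project's actual schema.&lt;/p&gt;

```python
# Hedged sketch of k-hop graph expansion. A dict stands in for
# TigerGraph; edge labels and entities are illustrative only.
from collections import deque

GRAPH = {
    "Alloxan": [("causes", "Diabetes")],
    "Diabetes": [("increases", "Arteriosclerosis")],
    "Arteriosclerosis": [],
}

def expand(seed, hops=2):
    """Breadth-first k-hop expansion returning (src, relation, dst) triples."""
    triples, seen = [], {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for rel, nbr in GRAPH.get(node, []):
            triples.append((node, rel, nbr))
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return triples

evidence = expand("Alloxan", hops=2)
```

&lt;p&gt;The returned triples are the "evidence fusion" input: they get merged with the semantically retrieved chunks before answer synthesis.&lt;/p&gt;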

&lt;p&gt;This enables the system to retrieve not only semantically similar text, but also biologically meaningful relationships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Biomedical Dataset and Knowledge Graph Construction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We used PubMed-style biomedical literature from the MedRAG dataset hosted on Hugging Face.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset Source&lt;/strong&gt;: &lt;a href="https://huggingface.co/datasets/MedRAG/pubmed" rel="noopener noreferrer"&gt;https://huggingface.co/datasets/MedRAG/pubmed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ingestion pipeline performs:&lt;/p&gt;

&lt;p&gt;• Document chunking&lt;br&gt;
• Biomedical entity extraction&lt;br&gt;
• Relation extraction&lt;br&gt;
• Dense embedding generation&lt;br&gt;
• TigerGraph vertex/edge creation&lt;br&gt;
• FAISS index construction&lt;/p&gt;
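&lt;p&gt;As one example of these stages, the document-chunking step might look like the following fixed-size word window with overlap; the size and overlap values are assumptions for illustration, not the project's actual parameters.&lt;/p&gt;

```python
# Illustrative chunking sketch: fixed-size word windows with overlap.
# size/overlap are assumed values, not the project's real settings.
def chunk_document(text, size=50, overlap=10):
    """Split text into overlapping word windows ready for embedding."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        if window:
            chunks.append(" ".join(window))
        if start + size >= len(words):
            break
    return chunks

doc = " ".join(f"word{i}" for i in range(120))
parts = chunk_document(doc, size=50, overlap=10)
```

&lt;p&gt;The overlap keeps sentences that straddle a chunk boundary retrievable from both sides, at the cost of some duplicated tokens in the index.&lt;/p&gt;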

&lt;p&gt;The system extracts biomedical entities such as:&lt;/p&gt;

&lt;p&gt;• Drugs&lt;br&gt;
• Diseases&lt;br&gt;
• Genes&lt;br&gt;
• Side effects&lt;br&gt;
• Anatomical entities&lt;/p&gt;

&lt;p&gt;and stores their relationships in TigerGraph for graph-based retrieval.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbhz29kq3wkaldeezyojb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbhz29kq3wkaldeezyojb.png" alt=" " width="800" height="625"&gt;&lt;/a&gt;&lt;br&gt;
            High-throughput biomedical ingestion pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmarking and Evaluation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the main goals of this project was not just building a GraphRAG system, but evaluating it rigorously.&lt;/p&gt;

&lt;p&gt;Our dashboard compares:&lt;/p&gt;

&lt;p&gt;• Token usage&lt;br&gt;
• Latency&lt;br&gt;
• Estimated API cost&lt;br&gt;
• Grounded accuracy&lt;br&gt;
• BERTScore&lt;br&gt;
• LLM-as-a-Judge evaluation&lt;/p&gt;
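&lt;p&gt;The token and cost comparison reduces to simple arithmetic. The sketch below uses made-up token counts and a hypothetical flat per-1K-token price; the real dashboard reads these numbers from the benchmark runs and the provider's actual pricing.&lt;/p&gt;

```python
# Back-of-the-envelope cost comparison. The price and token counts
# are hypothetical placeholders, not measured benchmark values.
PRICE_PER_1K_TOKENS = 0.002  # assumed flat rate, USD

def estimate_cost(prompt_tokens, completion_tokens):
    """Estimated API cost for one call at the assumed flat rate."""
    return (prompt_tokens + completion_tokens) / 1000 * PRICE_PER_1K_TOKENS

def reduction(baseline, candidate):
    """Percentage reduction of candidate relative to baseline."""
    return round(100 * (baseline - candidate) / baseline, 1)

basic_rag_tokens = 4200   # illustrative per-query total
graphrag_tokens = 2000    # illustrative per-query total
savings = reduction(basic_rag_tokens, graphrag_tokens)
```

&lt;p&gt;Aggregating this per query across the evaluation set yields the average reduction figures reported below.&lt;/p&gt;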

&lt;p&gt;&lt;strong&gt;Example Query: Causal Biomedical Reasoning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One benchmark query asked:&lt;/p&gt;

&lt;p&gt;“What is the causal path from alloxan to arteriosclerosis?”&lt;/p&gt;

&lt;p&gt;Expected reasoning:&lt;/p&gt;

&lt;p&gt;Alloxan&lt;br&gt;
→ causes&lt;br&gt;
Diabetes&lt;br&gt;
→ increases&lt;br&gt;
Arteriosclerosis&lt;/p&gt;
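&lt;p&gt;That chain is what a shortest-path search over the knowledge graph recovers. The sketch below encodes the two illustrative edges directly rather than querying TigerGraph, so the edge set is an assumption for demonstration.&lt;/p&gt;

```python
# Sketch of recovering the expected causal chain via BFS shortest
# path. EDGES hard-codes two illustrative relations in place of a
# live TigerGraph query.
from collections import deque

EDGES = {
    "Alloxan": [("causes", "Diabetes")],
    "Diabetes": [("increases", "Arteriosclerosis")],
}

def causal_path(src, dst):
    """BFS shortest path; returns alternating node and relation labels."""
    frontier = deque([(src, [src])])
    seen = {src}
    while frontier:
        node, path = frontier.popleft()
        if node == dst:
            return path
        for rel, nbr in EDGES.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, path + [rel, nbr]))
    return None

path = causal_path("Alloxan", "Arteriosclerosis")
```

&lt;p&gt;Because the path carries the relation labels, the answer can cite each hop as evidence instead of asserting the conclusion outright.&lt;/p&gt;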

&lt;p&gt;The LLM-only pipeline generated a plausible but unverified answer.&lt;/p&gt;

&lt;p&gt;Basic RAG retrieved semantically relevant evidence but struggled with structured causal reasoning.&lt;/p&gt;

&lt;p&gt;GraphRAG successfully combined semantic retrieval with graph-grounded biomedical relationships to generate a grounded causal explanation with supporting evidence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Results&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Across evaluation queries, our GraphRAG pipeline achieved:&lt;/p&gt;

&lt;p&gt;• ~52% average token reduction&lt;br&gt;
• ~58% retrieval token savings&lt;br&gt;
• ~61% estimated API cost reduction&lt;br&gt;
• Strong grounded biomedical reasoning&lt;br&gt;
• Improved explainability through graph traces&lt;/p&gt;

&lt;p&gt;One of the most important findings was that GraphRAG reduced unnecessary retrieval context while maintaining answer quality through structured graph relationships.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdhanpn07w1so4memmkq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdhanpn07w1so4memmkq.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GraphRAG Benchmark Highlights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✔ ~52% average token reduction&lt;br&gt;
✔ ~61% estimated API cost savings&lt;br&gt;
✔ Grounded biomedical reasoning&lt;br&gt;
✔ Multi-hop graph-based retrieval&lt;br&gt;
✔ Explainable evidence-backed answers&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What We Learned&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of our biggest takeaways was that GraphRAG is not simply “better retrieval.”&lt;/p&gt;

&lt;p&gt;Its real strength comes from:&lt;br&gt;
• structured reasoning&lt;br&gt;
• relationship-aware retrieval&lt;br&gt;
• explainability&lt;br&gt;
• context compression&lt;/p&gt;

&lt;p&gt;This becomes especially valuable in biomedical AI systems, where trust, traceability, and multi-hop reasoning are critical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As enterprise AI systems continue to scale, inference efficiency and explainability will become increasingly important.&lt;/p&gt;

&lt;p&gt;This project demonstrated that GraphRAG can reduce retrieval overhead while maintaining grounded and explainable reasoning through structured biomedical knowledge graphs.&lt;/p&gt;

&lt;p&gt;The combination of vector retrieval and graph traversal opens exciting possibilities for production-grade GenAI systems that are not only accurate, but also interpretable and cost-efficient.&lt;/p&gt;

&lt;p&gt;This project was developed as part of the TigerGraph GraphRAG Inference Hackathon.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>llm</category>
      <category>rag</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
