<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Likhitha M</title>
    <description>The latest articles on DEV Community by Likhitha M (@likhitha_m_4ace61f190b3f8).</description>
    <link>https://dev.to/likhitha_m_4ace61f190b3f8</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3935734%2F2c35bb08-12d1-41f5-8e0c-b44580c215a8.png</url>
      <title>DEV Community: Likhitha M</title>
      <link>https://dev.to/likhitha_m_4ace61f190b3f8</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/likhitha_m_4ace61f190b3f8"/>
    <language>en</language>
    <item>
      <title>Tiger Graph Hackathon</title>
      <dc:creator>Likhitha M</dc:creator>
      <pubDate>Sun, 17 May 2026 03:57:24 +0000</pubDate>
      <link>https://dev.to/likhitha_m_4ace61f190b3f8/tiger-graph-hackathon-2p1i</link>
      <guid>https://dev.to/likhitha_m_4ace61f190b3f8/tiger-graph-hackathon-2p1i</guid>
      <description>&lt;h1&gt;
  
  
  🚀 Beating the Token Explosion: How GraphRAG Outperforms Vector Search in Medical AI
&lt;/h1&gt;

&lt;p&gt;As &lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt; scale across industries, developers are hitting a massive wall: the &lt;strong&gt;token explosion&lt;/strong&gt;. Shoving massive document dumps into an LLM's context window isn't just slow—it is incredibly expensive. In domains like healthcare, where precision is everything, hallucinating because the model lost track of complex relationships isn't just an error; it's a critical failure.&lt;/p&gt;

&lt;p&gt;For the &lt;strong&gt;🏆 TigerGraph GraphRAG Inference Hackathon&lt;/strong&gt;, my teammate and I wanted to prove that graphs make LLM inference &lt;strong&gt;faster, cheaper, and fundamentally smarter&lt;/strong&gt;. The goal wasn't just to build a pipeline, but to benchmark token reduction while rigorously maintaining answer accuracy through a custom-built, interactive UI.&lt;/p&gt;

&lt;p&gt;Here is a technical breakdown of how we built an interactive GraphRAG benchmark using Python and &lt;code&gt;tkinter&lt;/code&gt;, the three pipelines we compared, and the data that proves why graphs are the future of retrieval.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏥 The Problem and The Medical Dataset
&lt;/h2&gt;

&lt;p&gt;Building a robust architecture for healthcare applications requires absolute precision. That is why we chose a dense &lt;strong&gt;Medical Dataset&lt;/strong&gt;, mapping specific diseases and their overlapping symptoms, for this hackathon. &lt;/p&gt;

&lt;p&gt;Medical data is inherently a graph. A symptom links to multiple possible diseases, and a disease links to specific treatments. Standard vector search struggles to connect these dots safely because it treats text as isolated, independent chunks. We needed a &lt;strong&gt;knowledge graph&lt;/strong&gt; to preserve the true multi-hop relationships.&lt;/p&gt;
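&lt;p&gt;A minimal sketch of that idea, with a hypothetical three-disease adjacency map standing in for the real dataset (disease and symptom names are illustrative only):&lt;/p&gt;

```python
# Hypothetical miniature of the Disease -HAS_SYMPTOM-> Symptom structure,
# modeled as a plain adjacency dict for illustration.
HAS_SYMPTOM = {
    "Migraine": {"headache", "nausea", "light sensitivity"},
    "Vestibular migraine": {"headache", "nausea", "vertigo"},
    "Labyrinthitis": {"vertigo", "nausea", "hearing loss"},
}

def diseases_with(*symptoms):
    """Multi-hop lookup: diseases whose symptom sets cover every query symptom."""
    query = set(symptoms)
    return sorted(d for d, s in HAS_SYMPTOM.items() if query.issubset(s))

print(diseases_with("nausea", "vertigo"))  # ['Labyrinthitis', 'Vestibular migraine']
```

&lt;p&gt;Because the symptom-to-disease hops are explicit edges rather than text co-occurrence, the intersection is exact instead of approximate.&lt;/p&gt;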




&lt;h2&gt;
  
  
  🖥️ System Architecture &amp;amp; The Tkinter Dashboard
&lt;/h2&gt;

&lt;p&gt;To make the comparison tangible and user-friendly, we bypassed standard web frameworks and built a lightweight, interactive &lt;strong&gt;GUI using Python's native &lt;code&gt;tkinter&lt;/code&gt; library&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;This dashboard acts as our command center. When you enter a single patient symptom query, the GUI simultaneously routes it through all three pipelines, displaying the responses side-by-side along with real-time performance metrics. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmea10opit4bls6jm21ty.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmea10opit4bls6jm21ty.jpeg" alt="Architecture" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
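&lt;p&gt;The fan-out behind the dashboard can be sketched independently of the &lt;code&gt;tkinter&lt;/code&gt; widgets. The three pipeline callables below are hypothetical stand-ins, and whitespace splitting is only a crude proxy for real token counting:&lt;/p&gt;

```python
import time

def run_all(query, pipelines):
    """Route one query through every pipeline and collect side-by-side metrics."""
    results = {}
    for name, fn in pipelines.items():
        start = time.perf_counter()
        answer = fn(query)
        results[name] = {
            "answer": answer,
            "latency_s": time.perf_counter() - start,
            "tokens": len(answer.split()),  # crude whitespace proxy for tokens
        }
    return results

# Hypothetical stand-ins for the three pipelines:
pipelines = {
    "llm_only": lambda q: "generic advice",
    "basic_rag": lambda q: "long noisy context answer " * 5,
    "graph_rag": lambda q: "vestibular migraine",
}
report = run_all("migraine plus vertigo?", pipelines)
```

&lt;p&gt;In the real GUI, each entry of &lt;code&gt;report&lt;/code&gt; feeds one column of the comparison view.&lt;/p&gt;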

&lt;h2&gt;
  
  
  🧠 The Three Inference Pipelines: A Medical Case Study
&lt;/h2&gt;

&lt;p&gt;To truly measure the impact, our app runs the three pipelines side by side on the exact same symptom-based queries. Let's look at how they handle a complex, multi-hop query like: &lt;em&gt;"What diseases share the symptoms of chronic migraines, acute nausea, and severe vertigo?"&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ Pipeline 1: LLM-Only (The Baseline)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; A prompt goes in, and an output comes out. &lt;strong&gt;Zero external retrieval&lt;/strong&gt;. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Medical Result:&lt;/strong&gt; The LLM relies entirely on its pre-trained weights. It frequently generates generic advice or hallucinates a completely unrelated disease. In healthcare AI, this worst-case baseline is highly risky.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4qrnk732e0ooy047hxg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4qrnk732e0ooy047hxg.png" alt="PIPELINE 1" width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚠️ Pipeline 2: Basic RAG (Vector + LLM)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; The current industry standard. Vector embeddings retrieve mathematically similar text chunks regarding the listed symptoms, dumping them into the context window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Medical Result:&lt;/strong&gt; It pulls a massive, noisy context dump of raw symptom descriptions. The LLM has to sift through thousands of tokens to find the overlap, and it frequently misses crucial relationship data because vector search lacks explicit entity linking.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ev4okywz1fuax9yb49f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ev4okywz1fuax9yb49f.png" alt="PIPELINE 2" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;
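&lt;p&gt;To make the contrast concrete, here is a toy version of vector-style retrieval that ranks chunks by bag-of-words cosine similarity; real systems use learned embeddings, and every chunk here is illustrative. Note that it can only rank whole chunks by surface similarity, with no notion of which entities connect:&lt;/p&gt;

```python
# Toy vector retrieval: rank chunks by cosine similarity of word-count vectors.
from collections import Counter
from math import sqrt

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    qv = Counter(query.lower().split())
    scored = [(cosine(qv, Counter(c.lower().split())), c) for c in chunks]
    return [c for _, c in sorted(scored, reverse=True)[:k]]

chunks = [
    "Vertigo and nausea often co-occur in vestibular disorders.",
    "Chronic migraines can present with light sensitivity.",
    "Dietary advice for general wellness.",
]
print(retrieve("nausea vertigo migraines", chunks, k=2))
```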

&lt;h3&gt;
  
  
  ✅ Pipeline 3: GraphRAG (Graph + LLM)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; Built using the &lt;strong&gt;TigerGraph GraphRAG&lt;/strong&gt; framework. We modeled the data as interconnected entities (&lt;code&gt;Disease&lt;/code&gt; and &lt;code&gt;Symptom&lt;/code&gt;) and relationships (&lt;code&gt;HAS_SYMPTOM&lt;/code&gt;). &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Medical Result:&lt;/strong&gt; TigerGraph performs &lt;strong&gt;true multi-hop reasoning&lt;/strong&gt;. It identifies the symptoms, traces the graph edges to find the exact diseases where they intersect, and filters out the noise. The LLM receives a clean, highly structured, and tightly filtered prompt. &lt;/li&gt;
&lt;/ul&gt;
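&lt;p&gt;A sketch of the tight, structured context this traversal yields. The entity and edge names mirror our schema, but the helper and the data are illustrative stand-ins, not TigerGraph API calls:&lt;/p&gt;

```python
# Illustrative sketch: turn graph-traversal hits into the small, structured
# context block that gets handed to the LLM (data is made up).
def build_graph_context(symptoms, has_symptom):
    query = set(symptoms)
    hits = {d: s for d, s in has_symptom.items() if query.issubset(s)}
    lines = [
        f"Disease: {d} | HAS_SYMPTOM: {', '.join(sorted(s))}"
        for d, s in sorted(hits.items())
    ]
    return "\n".join(lines)

HAS_SYMPTOM = {
    "Vestibular migraine": {"chronic migraines", "acute nausea", "severe vertigo"},
    "Tension headache": {"chronic migraines"},
}
prompt_context = build_graph_context(
    ["chronic migraines", "acute nausea", "severe vertigo"], HAS_SYMPTOM
)
```

&lt;p&gt;Instead of thousands of tokens of raw paragraphs, the model sees a handful of lines that already encode the multi-hop answer.&lt;/p&gt;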

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhytx1o14qwgp3a4wjcaz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhytx1o14qwgp3a4wjcaz.png" alt="PIPELINE 3" width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📊 Evaluating the Metrics: The Comparison Dashboard
&lt;/h2&gt;

&lt;p&gt;The heart of this project is the interactive comparison dashboard. Here is how the performance metrics stacked up during testing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Pipeline 1 (LLM-Only)&lt;/th&gt;
&lt;th&gt;Pipeline 2 (Basic RAG)&lt;/th&gt;
&lt;th&gt;Pipeline 3 (GraphRAG)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tokens Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Baseline (Prompt only)&lt;/td&gt;
&lt;td&gt;Extremely High (Massive context dump)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Significantly Reduced&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost Per Query&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Optimized &amp;amp; Predictable&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Response Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ultra-Fast&lt;/td&gt;
&lt;td&gt;Slowest (Processing massive chunks)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Balanced &amp;amp; Efficient&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Hop Reasoning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fails completely&lt;/td&gt;
&lt;td&gt;Fails frequently&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Excels (Traverses explicit edges)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qdirs45gbwu9xx3wdjb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qdirs45gbwu9xx3wdjb.png" alt="DASHBOARD" width="800" height="423"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Caption: Live side-by-side comparison inside our Python GUI dashboard.&lt;/em&gt;&lt;/p&gt;
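&lt;p&gt;For reference, the headline token-reduction figure is simple arithmetic; the counts below are made-up placeholders, not our measured numbers:&lt;/p&gt;

```python
# Illustrative arithmetic for the headline metric (placeholder counts):
# token reduction of GraphRAG relative to Basic RAG on one query.
basic_rag_tokens = 4200
graph_rag_tokens = 650
reduction = 100 * (1 - graph_rag_tokens / basic_rag_tokens)
print(f"{reduction:.1f}% fewer tokens")  # 84.5% fewer tokens
```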




&lt;h2&gt;
  
  
  🔬 Deep-Diving Into the Knowledge Graph Data
&lt;/h2&gt;

&lt;p&gt;To better visualize how our application structures facts, note how the schema explicitly connects diseases directly to symptoms instead of scattering them inside raw text paragraphs. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpsgtuf6dsyao8na49c9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpsgtuf6dsyao8na49c9.png" alt="Tiger Graph Schema Exploration" width="800" height="448"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Caption: A view of our interconnected nodes inside TigerGraph.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 Ensuring Answer Accuracy (The Hugging Face Evaluation)
&lt;/h2&gt;

&lt;p&gt;Cutting tokens is useless if the answer quality drops—especially with medical data. To prove Pipeline 3's superiority, every response was rigorously evaluated using two complementary Hugging Face approaches:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LLM-as-a-Judge:&lt;/strong&gt; A hosted Hugging Face model graded each answer on a PASS/FAIL basis, targeting a pass rate of &lt;strong&gt;≥ 90%&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BERTScore:&lt;/strong&gt; We measured the semantic similarity of the generated response against the ground-truth correct answer, aiming for an F1 rescaled score of &lt;strong&gt;≥ 0.55&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Through iterative parameter tuning, the GraphRAG pipeline consistently hit these high-accuracy thresholds while using a fraction of the tokens required by Basic RAG.&lt;/p&gt;
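&lt;p&gt;The acceptance gate combining both signals can be sketched as follows; the verdicts and F1 scores are made-up stand-ins for the Hugging Face judge and BERTScore outputs:&lt;/p&gt;

```python
# Sketch of the acceptance gate on the two evaluation signals.
# judge_verdicts and f1_scores would come from the LLM-as-a-Judge and
# BERTScore runs; here they are made-up stand-ins.
def passes_gate(judge_verdicts, f1_scores, pass_rate_min=0.90, f1_min=0.55):
    pass_rate = judge_verdicts.count("PASS") / len(judge_verdicts)
    mean_f1 = sum(f1_scores) / len(f1_scores)
    return pass_rate >= pass_rate_min and mean_f1 >= f1_min

verdicts = ["PASS"] * 19 + ["FAIL"]   # 95% pass rate
f1s = [0.61, 0.58, 0.57, 0.60]        # mean rescaled F1 of 0.59
print(passes_gate(verdicts, f1s))  # True
```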

&lt;h2&gt;
  
  
  💡 The Verdict
&lt;/h2&gt;

&lt;p&gt;The numbers tell the story. By feeding the LLM structured graph relationships instead of raw vector chunks, TigerGraph extracts exactly what is needed for complex, symptom-to-disease routing. This means &lt;strong&gt;optimized API costs, faster generation times, and highly accurate, hallucination-resistant answers.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj1odjrfc0kjm42h7wlv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flj1odjrfc0kjm42h7wlv.png" alt=" " width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🛠️ Open Source &amp;amp; Code Walkthrough
&lt;/h3&gt;

&lt;p&gt;We have open-sourced our benchmarking suite! Check out how we wired up the graph traversal and built the GUI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;💻 &lt;strong&gt;GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/Lochan-Visnu/GraphRAG-Hackathon-Lcube" rel="noopener noreferrer"&gt;https://github.com/Lochan-Visnu/GraphRAG-Hackathon-Lcube&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🎥 &lt;strong&gt;Demo Video:&lt;/strong&gt; &lt;a href="https://drive.google.com/file/d/1eF7Ahm6laaiajQhQDBlMzY-m1IrczWZJ/view?usp=drivesdk" rel="noopener noreferrer"&gt;https://drive.google.com/file/d/1eF7Ahm6laaiajQhQDBlMzY-m1IrczWZJ/view?usp=drivesdk&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A huge thank you to the TigerGraph team for providing the open-source repo, the Savanna environment, and the opportunity to tackle this industry-wide problem during the hackathon!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devchallenge</category>
      <category>llm</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
