Bhuvi D
How We Built CyberGraph RAG: A 3.5M Token Cybersecurity GraphRAG System with TigerGraph

Traditional Vector RAG struggles with highly connected cybersecurity data.

Threat actors, malware, CVEs, and attack techniques exist as relationships, not as isolated text chunks.

To explore whether graph-based retrieval performs better, we built CyberGraph RAG, a cybersecurity benchmarking platform comparing:

  • LLM-only
  • Basic Vector RAG
  • TigerGraph GraphRAG

using a 3.5M+ token cybersecurity corpus built from MITRE ATT&CK, CISA KEV, and NVD feeds.


The Problem with Traditional RAG

Most retrieval systems today use vector similarity search.

Documents are chunked into fixed-size text blocks, embedded into vectors, and retrieved using cosine similarity.

While this works reasonably well for generic QA tasks, cybersecurity intelligence is heavily relationship-driven.
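To make the chunk-retrieval step concrete, here is a toy cosine-similarity retriever. The chunks and their "embeddings" are made up for illustration; a real pipeline would use a learned embedding model:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy 3-dimensional "embeddings" -- illustrative values only
chunks = {
    "APT41 exploited Log4Shell in 2021.": [0.9, 0.1, 0.2],
    "ShadowPad is a modular backdoor.":   [0.2, 0.8, 0.1],
    "Healthcare breaches rose sharply.":  [0.1, 0.2, 0.9],
}
query_vec = [0.85, 0.15, 0.25]  # stand-in embedding of the user query

# Retrieve the top-1 chunk by similarity. Note that the relationships
# BETWEEN entities living in different chunks are invisible to this step.
best = max(chunks, key=lambda c: cosine(chunks[c], query_vec))
```

Each chunk is scored independently, which is exactly why cross-chunk relationships get lost.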

For example:

Which threat actors exploited Log4Shell to deploy ShadowPad in healthcare systems?

Answering this requires understanding relationships between:

  1. Threat actors
  2. Vulnerabilities
  3. Malware families
  4. Target industries

Traditional RAG retrieves nearby chunks of text, but often loses the actual relationships between entities.

This leads to:

  • noisy context
  • larger prompts
  • hallucinations
  • incorrect attack attribution

Our GraphRAG Approach

Instead of storing cybersecurity intelligence as disconnected chunks, we modeled it as a graph inside TigerGraph.

Entities included:

  • Threat Actors
  • Malware
  • CVEs
  • Attack Techniques
  • Target Sectors

Relationships included:

  • USES
  • TARGETS
  • EXPLOITS
  • DELIVERS

Example traversal:

APT41 → EXPLOITS → Log4Shell
Log4Shell → DELIVERS → ShadowPad
ShadowPad → TARGETS → Healthcare
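The traversal above can be sketched in a few lines of Python over an in-memory edge list. This is purely illustrative; in CyberGraph the equivalent expansion runs as GSQL queries inside TigerGraph:

```python
# Toy edge list mirroring the traversal above (illustrative data only)
edges = [
    ("APT41", "EXPLOITS", "Log4Shell"),
    ("Log4Shell", "DELIVERS", "ShadowPad"),
    ("ShadowPad", "TARGETS", "Healthcare"),
    ("APT41", "USES", "Cobalt Strike"),
]

def traverse(start, hops):
    """Expand outward from a start entity, one relationship hop at a time."""
    frontier, path = {start}, []
    for _ in range(hops):
        next_frontier = set()
        for src, rel, dst in edges:
            if src in frontier:
                path.append((src, rel, dst))
                next_frontier.add(dst)
        frontier = next_frontier
    return path

context = traverse("APT41", hops=3)
```

Three hops from APT41 are enough to reach the Healthcare node via Log4Shell and ShadowPad.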

When a query is received:

  1. The system identifies the central entity.
  2. TigerGraph performs multi-hop graph traversal.
  3. Only the most relevant relationships are retrieved.
  4. Gemini generates the final response using focused graph context.

This avoids injecting massive unrelated text chunks into the prompt.
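The four steps above can be sketched as a single pipeline function. Everything here is a stand-in: the entity list, the relationship lookup, and the entity spotting are hypothetical simplifications (the real system uses TigerGraph traversal and Gemini, which are stubbed out):

```python
KNOWN_ENTITIES = {"APT41", "Log4Shell", "ShadowPad"}  # illustrative subset

def identify_entity(query):
    # Step 1: naive entity spotting (a real system might use NER or LLM extraction)
    for token in query.replace("?", "").split():
        if token in KNOWN_ENTITIES:
            return token
    return None

def graph_context(entity):
    # Steps 2-3: stand-in for TigerGraph multi-hop traversal + relevance filtering
    relationships = {
        "Log4Shell": ["APT41 EXPLOITS Log4Shell", "Log4Shell DELIVERS ShadowPad"],
    }
    return relationships.get(entity, [])

def build_prompt(query):
    # Step 4: only focused graph facts go into the prompt -- no bulky text chunks
    entity = identify_entity(query)
    facts = "\n".join(graph_context(entity))
    return f"Context:\n{facts}\n\nQuestion: {query}"

prompt = build_prompt("Which threat actors exploited Log4Shell")
```

The resulting prompt carries a handful of relationship facts instead of several large overlapping chunks.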


System Architecture


CyberGraph compares LLM-only, Basic RAG, and TigerGraph GraphRAG pipelines side-by-side using a shared cybersecurity dataset.

While Basic RAG retrieves nearby text chunks using vector similarity search, GraphRAG performs multi-hop traversal over structured cybersecurity relationships stored inside TigerGraph.

The retrieved context is then passed to Gemini for final response generation, benchmark evaluation, and graph visualization.


Dashboard Comparison

CyberGraph dashboard comparing LLM-only, Basic RAG, and GraphRAG side-by-side.



Building the Cybersecurity Dataset

We created a lightweight dataset aggregation pipeline that automatically collected and normalized cybersecurity intelligence from:

  1. MITRE ATT&CK Enterprise Matrix
  2. CISA Known Exploited Vulnerabilities (KEV)
  3. NVD CVE Feeds

Final Dataset Scale

  • 3.5M+ tokens
  • 21,029 processed documents
  • 35,072 graph relationships

The processed dataset was converted into relationship-rich graph structures optimized for GraphRAG traversal.
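The conversion step can be sketched as flattening each normalized record into (source, relationship, target) edges. The field names below are illustrative, not the exact CISA KEV or NVD schema:

```python
def record_to_edges(record):
    """Flatten one normalized vulnerability record into (src, rel, dst) edges."""
    edges = [(actor, "EXPLOITS", record["cve_id"]) for actor in record["actors"]]
    edges += [(record["cve_id"], "DELIVERS", m) for m in record["malware"]]
    edges += [(m, "TARGETS", s) for m in record["malware"] for s in record["sectors"]]
    return edges

# Illustrative record (not actual feed data)
record = {
    "cve_id": "CVE-2021-44228",  # Log4Shell
    "actors": ["APT41"],
    "malware": ["ShadowPad"],
    "sectors": ["Healthcare"],
}
edges = record_to_edges(record)
```

Running this over all 21,029 documents is what produces the tens of thousands of graph relationships.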


Interactive Graph Visualization

To improve explainability, we integrated graph visualization directly into the dashboard.

Each query dynamically renders the retrieved graph neighborhood showing:

  • threat actors
  • malware families
  • CVEs
  • attack techniques
  • target industries

This made multi-hop cybersecurity reasoning much easier to validate visually.


Interactive graph visualization generated during GraphRAG traversal.
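One lightweight way to render a retrieved neighborhood (a sketch only; the dashboard itself may use a JavaScript graph library) is to serialize the edges as Graphviz DOT:

```python
def to_dot(edges):
    """Serialize (src, rel, dst) triples as a Graphviz DOT digraph."""
    lines = ["digraph retrieved {"]
    for src, rel, dst in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{rel}"];')
    lines.append("}")
    return "\n".join(lines)

dot = to_dot([
    ("APT41", "EXPLOITS", "Log4Shell"),
    ("Log4Shell", "DELIVERS", "ShadowPad"),
])
```

The DOT string can then be rendered with any Graphviz tool to inspect exactly which relationships the traversal retrieved.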


Benchmark Results

We benchmarked all three pipelines side-by-side using complex cybersecurity reasoning queries.

Pipeline     Avg Tokens    Avg Latency    Accuracy
LLM-only        950          10.15s          20%
Basic RAG      1280           6.45s          60%
GraphRAG        685           3.80s         100%

Key Improvements with GraphRAG

  1. ~46.5% lower token usage than Basic RAG (685 vs 1280 tokens)
  2. ~62.6% lower latency than LLM-only (3.80s vs 10.15s), and ~41% lower than Basic RAG
  3. Lower estimated API cost
  4. Higher factual consistency on multi-hop cybersecurity queries
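The headline reductions can be reproduced directly from the benchmark table. Note that the two figures use different baselines: tokens are compared against Basic RAG, latency against LLM-only:

```python
def reduction(new, old):
    # Percentage reduction of `new` relative to baseline `old`
    return round(100 * (1 - new / old), 1)

token_saving = reduction(685, 1280)      # GraphRAG vs Basic RAG tokens
latency_saving = reduction(3.80, 10.15)  # GraphRAG vs LLM-only latency
```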

Because GraphRAG retrieves focused entity relationships instead of large overlapping chunks, the prompts become:

  • smaller
  • cleaner
  • more explainable

Metrics Comparison


Benchmark comparison showing lower latency and token usage for GraphRAG.


Key Learnings

  1. Graph relationships matter more than raw chunk similarity in cybersecurity workflows.

  2. Smaller, focused prompts improved both latency and factual consistency.

  3. Multi-hop graph traversal produced more explainable retrieval than traditional vector search.

  4. Graph visualization significantly improved debugging and trust in retrieved threat intelligence.


Future Improvements

Next steps for CyberGraph include:

  1. Real-time threat feed ingestion
  2. Live attack-chain visualization
  3. Autonomous community detection for emerging threat clusters
  4. Adaptive graph traversal strategies

Conclusion

CyberGraph showed us that GraphRAG is not just a retrieval upgrade — it fundamentally changes how complex cybersecurity intelligence can be explored, explained, and validated.

By combining TigerGraph with Gemini, we built a system that is:

  • faster
  • more token-efficient
  • more explainable
  • and significantly more reliable for multi-hop threat intelligence reasoning.

Project Links

GitHub Repo

Live Demo

Built for the TigerGraph GraphRAG Inference Hackathon 2026.
