DEV Community
feminist

The Answer Is an Edge, Not a Sentence — Building a Topology-Native GraphRAG Intelligence Platform with TigerGraph

How we built Shadow Network Intelligence, a GraphRAG-powered fraud investigation platform that demonstrates why topology-aware retrieval outperforms traditional RAG for financial crime investigations.


Introduction

Most retrieval systems are built around documents.

Financial crime investigations are not.

Fraud networks, laundering chains, shell company ecosystems, mule accounts, and intermediary ownership structures do not exist as clean paragraphs inside a single document. They exist across:

  • transactions
  • shared devices
  • addresses
  • shell corporations
  • account transfers
  • ownership chains
  • hidden intermediaries
  • multi-hop relationships

Traditional retrieval systems can retrieve text.

But financial investigations are fundamentally about reconstructing relationships.

That realization became the foundation for our TigerGraph GraphRAG Inference Hackathon project:

Shadow Network Intelligence

A topology-native intelligence platform built to prove one thing:

Traditional retrieval preserves documents.

GraphRAG preserves relationships.

Or more simply:

The answer is an edge, not a sentence.


The Problem with Traditional Retrieval

Most modern AI retrieval systems rely on semantic similarity.

That works well when:

  • the answer exists inside a chunk
  • semantic similarity is enough
  • relationships are shallow
  • the retrieval target is local

But financial crime investigations behave very differently.

The answer often emerges only after reconstructing:

  • multi-hop ownership chains
  • indirect transaction flows
  • hidden intermediary entities
  • shell-company cascades
  • device-sharing patterns
  • ring structures
  • laundering topology

In these situations:

semantic similarity alone breaks down.

A chunk may contain a clue.

But the relationship continuity between clues disappears.

That is the core limitation of VectorRAG.


Our Hypothesis

We hypothesized that:

  • PureLLM systems would hallucinate or miss hidden structural relationships
  • VectorRAG systems would retrieve partial clues but fail to reconstruct topology
  • GraphRAG systems would recover hidden investigative structure through graph traversal

To test this properly, we needed:

  • adversarial datasets
  • relationship-dense ecosystems
  • hidden rings
  • multi-hop structures
  • topology-aware benchmarks

Not simple Q&A datasets.


Building the Dataset

We built a synthetic financial crime ecosystem specifically designed to stress retrieval systems.

The generated graph included:

  • 6,000 people
  • 5,000 companies
  • 10,000 accounts
  • 150,000+ transactions
  • shared devices
  • shared addresses
  • ownership structures
  • hidden fraud rings
  • intermediary laundering chains

Final graph scale:

| Graph Component | Count |
| --- | --- |
| Vertices | 175,204 |
| Edges | 373,439 |
| Transaction Vertices | 150,054 |
| Reverse Edge Types | 6 |

The important part was not scale alone.

It was:

Structural density.

We intentionally designed adversarial investigation scenarios where:

  • topology mattered
  • intermediary entities mattered
  • chunk retrieval failed structurally
  • graph traversal became necessary
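The generator itself isn't reproduced in this post, but the adversarial idea above — hide ring members behind chains of intermediary shells so that no single chunk of text ever mentions two members together — can be sketched in a few lines. All names and parameters here are illustrative, not the actual data engine:

```python
def generate_fraud_ring(ring_size: int, chain_depth: int):
    """Generate one hidden fraud ring: members transact with each other
    only through chains of shell accounts, so no single chunk of text
    ever mentions two ring members together."""
    members = [f"person_{i}" for i in range(ring_size)]
    edges = []  # (src, edge_type, dst)
    for i, src in enumerate(members):
        dst = members[(i + 1) % ring_size]  # ring topology: member i pays member i+1
        prev = src
        for hop in range(chain_depth):  # hidden intermediaries
            shell = f"shell_{i}_{hop}"
            edges.append((prev, "TRANSFER", shell))
            prev = shell
        edges.append((prev, "TRANSFER", dst))
    return members, edges

members, edges = generate_fraud_ring(ring_size=4, chain_depth=2)
```

The point of the design: every individual edge looks innocuous in isolation, and only a traversal that crosses the shells recovers the ring.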

Architecture Overview

The platform evolved into a full operational intelligence environment.

```
1_data_engine/
├── synthetic fraud generation
├── topology-aware ecosystems
├── adversarial benchmark generation
└── ingestion pipelines

2_baseline_systems/
├── PureLLM baseline
├── VectorRAG baseline
└── benchmark orchestration

3_graph_intelligence_core/
├── TigerGraph integration
├── GraphRAG traversal
├── topology-aware retrieval
└── structural expansion

4_orchestrator_api/
├── FastAPI orchestration
├── SSE investigation streaming
├── benchmark APIs
└── cognitive orchestration

5_agent_swarm/
├── retrieval analyst
├── topology investigator
├── sanctions tracer
└── fraud ring analyst

6_reasoning_engine/
├── grounded claims
├── contradiction detection
├── confidence scoring
└── explainability

7_reporting_engine/
├── operational reports
├── markdown exports
└── benchmark summaries

8_dashboard_ui/
├── operational workspace
├── graph investigation UI
├── benchmark comparison
└── cognitive reasoning surfaces
```

Why We Chose TigerGraph

This project required:

  • high-performance traversal
  • multi-hop exploration
  • topology-native reasoning
  • relationship continuity
  • structural neighborhood expansion

TigerGraph became the backbone of the entire intelligence system.

We used TigerGraph to:

  • reconstruct hidden ownership chains
  • detect fraud rings
  • traverse laundering paths
  • surface intermediary entities
  • expand graph neighborhoods
  • support topology-aware retrieval

The graph became:

not just storage,

but the reasoning substrate itself.
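The production traversals run inside TigerGraph itself (as installed GSQL queries), which we don't reproduce here. As a stand-in, this pure-Python BFS illustrates the core idea those queries implement: expand along directed OWNS edges until the target is reached, keeping every intermediary on the path. Entity names are invented for the example:

```python
from collections import deque

def ownership_chain(edges, start, target, max_hops=6):
    """BFS over directed OWNS edges to recover an ownership chain from
    `start` to `target`, surfacing every hidden intermediary on the way."""
    adj = {}
    for src, etype, dst in edges:
        if etype == "OWNS":
            adj.setdefault(src, []).append(dst)
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path  # includes the intermediaries, not just the endpoints
        if len(path) > max_hops:
            continue
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

edges = [
    ("alice", "OWNS", "shellA"),
    ("shellA", "OWNS", "shellB"),
    ("shellB", "OWNS", "acme_corp"),
]
chain = ownership_chain(edges, "alice", "acme_corp")
# → ['alice', 'shellA', 'shellB', 'acme_corp']
```

Notice what a vector retriever can never return from this data: the path itself. Each edge might live in a different document, but the answer is the chain.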


Building the Benchmark

One of the biggest goals of the project was to avoid fake benchmark theater.

We wanted:

real adversarial evaluation.

So instead of asking simplistic questions, we built investigation tasks such as:

  • tracing hidden ownership cascades
  • reconstructing laundering paths
  • identifying hidden ring members
  • detecting intermediary shell structures
  • recovering topology continuity

Each investigation was executed across:

  1. PureLLM
  2. VectorRAG
  3. GraphRAG

inside the same operational environment.
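The scoring logic behind "structural success" can be sketched simply: a task counts as solved only if every entity in the hidden structure was recovered, with no partial credit. The harness below is a simplified sketch with stub retrievers standing in for the three real systems:

```python
def structural_success(expected_entities, retrieved_entities):
    """A task succeeds only if every entity in the hidden structure
    (ring members, intermediaries) was actually recovered."""
    return set(expected_entities) <= set(retrieved_entities)

def run_benchmark(tasks, systems):
    """tasks: list of (question, expected_entities);
    systems: dict of name -> callable(question) -> retrieved entity set."""
    scores = {}
    for name, retrieve in systems.items():
        wins = sum(
            structural_success(expected, retrieve(q))
            for q, expected in tasks
        )
        scores[name] = f"{wins} / {len(tasks)}"
    return scores

tasks = [
    ("Who ultimately controls acme_corp?",
     {"alice", "shellA", "shellB", "acme_corp"}),
]
systems = {
    "VectorRAG": lambda q: {"alice", "acme_corp"},  # partial clues, no intermediaries
    "GraphRAG":  lambda q: {"alice", "shellA", "shellB", "acme_corp"},
}
scores = run_benchmark(tasks, systems)
```

The all-or-nothing criterion is deliberate: in an investigation, recovering two endpoints without the intermediaries is not a partially correct answer — it is a missing case.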


The Results

The benchmark results became the strongest validation of the project thesis.

Structural Recall

| System | Structural Success |
| --- | --- |
| PureLLM | 0 / 20 |
| VectorRAG | 0 / 20 |
| GraphRAG | 20 / 20 |

GraphRAG successfully reconstructed:

  • hidden rings
  • intermediary chains
  • ownership topology
  • laundering paths
  • multi-hop relationships

while the other systems failed structurally.

And importantly:

this was not a tuning failure.

Vector retrieval fundamentally lacks topology.

A chunk cannot retrieve an edge that no longer exists.


The Most Important Realization

During development, one insight became impossible to ignore:

The answer was never hidden in a document.

It was hidden in the relationships.

That single realization completely shaped the rest of the platform.


Building the Cognitive Layer

We wanted the system to do more than retrieve graph neighborhoods.

We wanted:

grounded investigative reasoning.

So we added a cognitive reasoning layer capable of:

  • structural claim synthesis
  • contradiction detection
  • confidence scoring
  • evidence grounding
  • entity explanation
  • topology-based justification

Most importantly:

we intentionally prevented fake AI theater.

The reasoning system:

  • validates real graph IDs
  • rejects hallucinated entities
  • lowers confidence when evidence is weak
  • grounds every structural claim in topology
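The checks above reduce to a small contract: no claim leaves the reasoning layer unless its entities exist in the graph, and confidence scales with the evidence behind it. A minimal sketch — the real scoring function is not specified in this post, so the evidence threshold here is an assumption:

```python
def ground_claim(claim_entities, known_graph_ids, evidence_edges):
    """Ground a structural claim: reject hallucinated entity IDs outright,
    and scale confidence down when supporting edges are sparse."""
    unknown = [e for e in claim_entities if e not in known_graph_ids]
    if unknown:
        return {"status": "rejected", "unknown_entities": unknown}
    # Assumption: confidence grows with edge count, saturating at 3 edges;
    # the platform's actual scoring model may differ.
    confidence = min(1.0, len(evidence_edges) / 3)
    return {"status": "grounded", "confidence": round(confidence, 2)}

graph_ids = {"person_1", "shell_3", "acct_9"}
# "ghost_corp" is not a real vertex, so the whole claim is rejected
verdict = ground_claim(["person_1", "ghost_corp"], graph_ids, [])
```

A claim backed by a single edge comes back grounded but with low confidence — which is exactly the behavior the sanctions tracer exhibited below.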

One of the most satisfying moments was watching the sanctions tracer correctly return:

low confidence.

Not because the system failed.

But because:

there genuinely was no sanctions evidence in the graph.

That honesty made the platform feel operationally credible.


Real-Time Investigation Streaming

To make investigations feel operational, we built a real SSE-driven orchestration flow.

Investigations stream live events such as:

  • entity discovered
  • ring reconstructed
  • hidden relationship surfaced
  • neighborhood expanded
  • reasoning synthesized
  • report finalized

Importantly:

these were not fake loading animations.

The events reflected:

real graph state transitions.
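The wire format is plain Server-Sent Events: each graph state transition is serialized as an `event:`/`data:` frame. A stripped-down sketch of the serialization (event names are examples; in FastAPI this generator would back a `StreamingResponse` with `media_type="text/event-stream"`):

```python
import json

def sse_frame(event: str, data: dict) -> str:
    """Serialize one investigation event as a Server-Sent Events frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def investigation_stream(events):
    """Yield SSE frames as graph state transitions occur."""
    for name, payload in events:
        yield sse_frame(name, payload)

frames = list(investigation_stream([
    ("entity_discovered", {"id": "shell_42"}),
    ("ring_reconstructed", {"members": 4}),
]))
```

Because each frame is emitted only when the underlying graph operation completes, the UI's progress display is a log of real work, not an animation.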


The Frontend Philosophy

At first, we built a conventional dashboard.

It quickly felt wrong.

Financial investigations are not spreadsheet experiences.

So the frontend evolved into:

an operational intelligence environment.

We built:

  • Worldspace continuity
  • TacticalRail navigation
  • investigation-first layouts
  • graph-native interaction flows
  • cognitive reasoning surfaces
  • operational benchmark visualization

The UI was intentionally designed to feel:

calm,
focused,
and structurally investigative.

Not flashy.


The Biggest Engineering Challenge

Ironically, the hardest problem was not graph traversal.

It was:

credibility.

Late in development, we realized something important:

A visually impressive system is meaningless if judges cannot trust the metrics.

So we performed a full credibility hardening pass.

We:

  • removed fake benchmark routes
  • eliminated synthesized frontend metrics
  • surfaced real TigerGraph counts
  • exposed reproducible benchmark artifacts
  • clearly labeled synthetic vs live surfaces
  • ensured every visible benchmark number traced back to real JSON artifacts
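The last point is mechanical: the dashboard reads benchmark numbers from the JSON artifacts, never recomputes or synthesizes them. A sketch of that loading path — the field names here are assumptions, not the platform's actual artifact schema:

```python
import json
import pathlib
import tempfile

# Stand-in artifact written to a temp dir; the real files live with the repo.
artifact = {
    "runs": [
        {"system": "PureLLM", "structural_success": "0 / 20"},
        {"system": "VectorRAG", "structural_success": "0 / 20"},
        {"system": "GraphRAG", "structural_success": "20 / 20"},
    ]
}
path = pathlib.Path(tempfile.mkdtemp()) / "benchmark.json"
path.write_text(json.dumps(artifact))

def load_benchmark_artifact(p):
    """Every number shown in the UI traces back to an artifact like this;
    the dashboard only reads, it never invents."""
    raw = json.loads(pathlib.Path(p).read_text())
    return {run["system"]: run["structural_success"] for run in raw["runs"]}

scores = load_benchmark_artifact(path)
```

If a number on screen can't be produced by a loader like this from a committed artifact, it doesn't belong on screen.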

This completely changed the maturity of the platform.

The system stopped feeling like:

"a cool hackathon UI"

and started feeling like:

an actual reviewable intelligence platform.


The Final Platform

By the end of the project, Shadow Network Intelligence supported:

  • live TigerGraph integration
  • GraphRAG retrieval
  • adversarial benchmarking
  • cognitive reasoning
  • topology-aware investigation
  • multi-agent analysis
  • grounded structural claims
  • operational reporting
  • live SSE investigation streaming
  • graph-native UI workflows
  • real benchmark reproducibility

The platform evolved far beyond what we originally planned.


What We Learned

The biggest lesson was surprisingly simple:

AI retrieval systems fail when relationships matter more than text.

That is exactly where graphs become essential.

We learned that:

  • semantic similarity is not structural intelligence
  • topology changes retrieval fundamentally
  • graph traversal enables hidden relationship recovery
  • operational credibility matters more than visual complexity
  • grounded reasoning is more valuable than theatrical confidence

Most importantly:

we learned that:

GraphRAG is not just “RAG with a graph.”

It is a fundamentally different retrieval philosophy.


Final Thought

Traditional retrieval systems search for relevant documents.

Graph-native systems reconstruct hidden structure.

And in fraud investigations:

structure is the investigation.

That became the foundation of Shadow Network Intelligence.

And ultimately:

the answer was never a sentence.

It was an edge.


Tech Stack

  • TigerGraph
  • FastAPI
  • React + TypeScript
  • LangGraph
  • Ollama
  • ChromaDB
  • Docker
  • SSE Streaming

Join the Discussion

  • Have you found semantic similarity to become a bottleneck in relationship-heavy RAG systems?
  • How are you handling multi-hop topology reconstruction in your own GraphRAG pipelines?
