vijaya kumari

How I Built a Fintech AI Agent That Detects Fraud and Generates AML Risk Reports — With Zero Hallucinations

The Problem

AI in fintech cannot afford to hallucinate.

A fabricated regulation reference = legal liability.
A missed fraud pattern = financial crime.
A wrong compliance answer = regulatory penalty.

Yet most AI demos ship without any eval layer.
I built one.


What I Built

A 4-tab Fintech AI Agent deployed on HuggingFace:

Tab 1 — Fraud Detector
Analyzes transactions for fraud patterns.
Returns: a risk score (0-10), a list of red flags,
and an approve/review/reject recommendation.

Test input:
"Transfer $9,800 to Cayman Islands
at 3:47am from unrecognized device"

Result: 9/10 HIGH RISK → REJECT

Red flags caught:

  • Amount just below $10K CTR threshold (structuring)
  • High-risk jurisdiction (Cayman Islands)
  • Unusual transaction time (3:47am)
  • New unrecognized device
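
The four red flags above can also be expressed as deterministic checks, which is handy as a baseline to compare the model's output against. This is a minimal sketch, not the agent's actual logic (the post's detector is Claude-based). The margin below the threshold, the jurisdiction watchlist, the "unusual hours" window, and the flag-count-to-recommendation mapping are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import time

# All constants below are illustrative assumptions, not values from the agent.
CTR_THRESHOLD = 10_000                            # USD threshold named in the post
STRUCTURING_MARGIN = 0.05                         # hypothetical: within 5% below threshold
HIGH_RISK_JURISDICTIONS = {"Cayman Islands"}      # hypothetical watchlist
NIGHT_START, NIGHT_END = time(0, 0), time(5, 0)   # hypothetical "unusual hours"


@dataclass
class Transaction:
    amount_usd: float
    destination: str
    local_time: time
    device_recognized: bool


def red_flags(tx: Transaction) -> list[str]:
    """Return the rule-based red flags raised by a transaction."""
    flags = []
    lower_bound = CTR_THRESHOLD * (1 - STRUCTURING_MARGIN)
    if lower_bound <= tx.amount_usd < CTR_THRESHOLD:
        flags.append("amount just below CTR threshold (possible structuring)")
    if tx.destination in HIGH_RISK_JURISDICTIONS:
        flags.append("high-risk jurisdiction")
    if NIGHT_START <= tx.local_time < NIGHT_END:
        flags.append("unusual transaction time")
    if not tx.device_recognized:
        flags.append("unrecognized device")
    return flags


def recommend(flags: list[str]) -> str:
    """Hypothetical mapping from flag count to a recommendation."""
    if len(flags) >= 3:
        return "REJECT"
    if flags:
        return "REVIEW"
    return "APPROVE"


test_tx = Transaction(9_800, "Cayman Islands", time(3, 47), device_recognized=False)
print(red_flags(test_tx), recommend(red_flags(test_tx)))
```

Running the test input from above through these rules raises all four flags, so the sketch lands on the same recommendation as the agent.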

Tab 2 — Compliance Q&A
RAG over hardcoded financial regulations:
KYC, AML, GDPR, SOX, PCI-DSS

Every answer:

  • Cites specific regulation + section
  • Shows confidence score
  • Flags hallucination risk
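
To make the "cite or refuse" idea concrete, here is a minimal sketch of retrieval over a hardcoded corpus. It is an assumption-heavy stand-in: it uses keyword overlap instead of the Pinecone vector search from the stack, the corpus entries are one-line paraphrases rather than legal text, and the names `REGULATIONS`, `retrieve`, and `answer` are hypothetical. The point it illustrates is that an answer is only returned when a source passage backs it.

```python
import re

# Hypothetical hardcoded corpus: paraphrased one-line summaries, not legal text.
REGULATIONS = [
    {"source": "GDPR, Article 17",
     "text": "Individuals have the right to erasure of their personal data."},
    {"source": "SOX, Section 404",
     "text": "Management must assess and report on internal controls over financial reporting."},
    {"source": "PCI DSS, Requirement 3",
     "text": "Stored account data must be protected."},
]

STOPWORDS = {"a", "an", "the", "of", "to", "is", "do", "have", "what", "on", "and", "be"}


def tokens(text: str) -> set[str]:
    """Lowercase word set with a few common stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question: str, min_overlap: int = 2):
    """Return the best-matching passage, or None if nothing matches well enough."""
    q = tokens(question)
    best, best_score = None, 0
    for doc in REGULATIONS:
        score = len(q & tokens(doc["text"]))
        if score > best_score:
            best, best_score = doc, score
    return best if best_score >= min_overlap else None


def answer(question: str) -> dict:
    """Answer only from a retrieved passage; refuse and flag risk otherwise."""
    doc = retrieve(question)
    if doc is None:
        return {"answer": None, "citation": None, "hallucination_risk": "HIGH"}
    return {"answer": doc["text"], "citation": doc["source"], "hallucination_risk": "LOW"}


print(answer("Do customers have a right to erasure of personal data?"))
print(answer("What is the weather tomorrow?"))
```

In a real pipeline the retrieved passage would be passed to the model as context rather than returned verbatim; the refusal branch is the part worth keeping.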

Tab 3 — AML Risk Report Generator
Generates formal 6-section risk assessments:

  • Customer Risk Profile
  • Transaction Pattern Analysis
  • Red Flags Identified
  • Regulatory Considerations
  • Recommended Actions
  • Compliance Officer Notes
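
Since the report format is fixed, a cheap structural check can run before any LLM-based scoring. The sketch below assumes the generated report is plain text that contains each section title verbatim; the function name `missing_sections` is hypothetical.

```python
# Section titles taken from the list above.
REQUIRED_SECTIONS = [
    "Customer Risk Profile",
    "Transaction Pattern Analysis",
    "Red Flags Identified",
    "Regulatory Considerations",
    "Recommended Actions",
    "Compliance Officer Notes",
]


def missing_sections(report: str) -> list[str]:
    """Return required section titles that do not appear in the report text."""
    lowered = report.lower()
    return [title for title in REQUIRED_SECTIONS if title.lower() not in lowered]


# Example: a draft that stops after the first two sections.
draft = "Customer Risk Profile\n...\nTransaction Pattern Analysis\n..."
print(missing_sections(draft))
```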

Tab 4 — Eval Dashboard
Real-time metrics across all tabs:

  • Total queries processed
  • Avg quality score
  • Hallucinations flagged
  • Risk alerts triggered
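
The four dashboard metrics are simple aggregates over per-query eval records. Here is a minimal sketch of that aggregation, assuming each query is logged as a record with a quality score in the 0 to 1 range; the `EvalRecord` fields and the `summarize` helper are hypothetical names, not the agent's actual schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    tab: str                  # which tab produced the output
    quality_score: float      # assumed to be in [0, 1]
    hallucination_risk: str   # e.g. "LOW" or "HIGH"
    risk_alert: bool          # True if the query triggered a risk alert


def summarize(records: list[EvalRecord]) -> dict:
    """Aggregate per-query records into the four dashboard metrics."""
    if not records:
        return {"total_queries": 0, "avg_quality_score": None,
                "hallucinations_flagged": 0, "risk_alerts": 0}
    return {
        "total_queries": len(records),
        "avg_quality_score": round(mean(r.quality_score for r in records), 3),
        # Anything not rated LOW counts as flagged in this sketch.
        "hallucinations_flagged": sum(r.hallucination_risk != "LOW" for r in records),
        "risk_alerts": sum(r.risk_alert for r in records),
    }


log = [
    EvalRecord("fraud", 1.0, "LOW", True),
    EvalRecord("compliance", 0.9, "LOW", False),
    EvalRecord("aml_report", 0.8, "HIGH", True),
]
print(summarize(log))
```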

The Eval Layer

Every Claude output is scored for:

{
  "faithfulness_score": 0.95,
  "confidence": 0.85,
  "hallucination_risk": "LOW"
}

This is the LLM-as-Judge pattern —
Claude evaluating Claude's own outputs.
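
A minimal sketch of the judge step, under some assumptions: the judge is asked to reply with a JSON object shaped like the one above, `call_model` is a hypothetical callable that wraps whatever client sends the prompt to Claude, and the `MEDIUM` risk level is my assumption (the example only shows `LOW`). The sketch fails closed, so an unparseable verdict is treated as high risk.

```python
import json
from typing import Callable

# Hypothetical judge prompt; the wording is illustrative only.
JUDGE_PROMPT = (
    "You are grading an answer against its source context.\n"
    "Context:\n{context}\n\nAnswer:\n{answer}\n\n"
    "Reply with a JSON object containing the keys faithfulness_score, "
    "confidence (both between 0 and 1) and hallucination_risk (LOW, MEDIUM or HIGH)."
)

ALLOWED_RISK = {"LOW", "MEDIUM", "HIGH"}   # MEDIUM is an assumed extra level
FAIL_CLOSED = {"faithfulness_score": 0.0, "confidence": 0.0, "hallucination_risk": "HIGH"}


def parse_verdict(raw: str) -> dict:
    """Parse and validate the judge's JSON reply; raise on anything malformed."""
    data = json.loads(raw)
    faithfulness = float(data["faithfulness_score"])
    confidence = float(data["confidence"])
    risk = str(data["hallucination_risk"]).upper()
    if not (0.0 <= faithfulness <= 1.0 and 0.0 <= confidence <= 1.0):
        raise ValueError("scores must be within [0, 1]")
    if risk not in ALLOWED_RISK:
        raise ValueError(f"unexpected risk level: {risk}")
    return {"faithfulness_score": faithfulness, "confidence": confidence,
            "hallucination_risk": risk}


def judge(context: str, answer: str, call_model: Callable[[str], str]) -> dict:
    """Score an answer with the judge model; treat a bad verdict as high risk."""
    prompt = JUDGE_PROMPT.format(context=context, answer=answer)
    try:
        return parse_verdict(call_model(prompt))
    except (ValueError, KeyError, TypeError):
        return dict(FAIL_CLOSED)


# Usage with a stubbed model so the example runs offline.
def stub(prompt: str) -> str:
    return '{"faithfulness_score": 0.95, "confidence": 0.85, "hallucination_risk": "LOW"}'


print(judge("source passage", "model answer", stub))
```

One design note: a judge that shares a model with the generator can share its blind spots, so it is worth spot-checking a sample of verdicts by hand.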

Results from first run:

  • 98% avg quality score
  • 0 hallucinations flagged by the judge
  • Faithfulness: 95-100% per tab

Tech Stack

  • Claude (claude-sonnet-4-20250514)
  • Pinecone (vector store + semantic dedup)
  • LangSmith (production tracing)
  • TruLens (eval monitoring dashboard)
  • Gradio (HF Space UI)

Live Demo

huggingface.co/spaces/Vijayarv07/fintech-ai-agent

GitHub:
github.com/vijayarjun7


What's Next

Adding quantum-inspired compression to the
inference layer (QuantRot-PQC research).

Because reliable AI + efficient AI =
production-ready AI.

BuildInPublic
