Prakhar Shukla

Posted on Mar 10 • Originally published at builder.aws.com

TruthLayer — How I Built an AI Hallucination Firewall on AWS

#aws #python #serverless #ai

"An AI that hallucinates in a hospital could cost a life. In a law firm, a lawsuit. In a bank, millions. The question is not whether AI makes mistakes — it is whether you catch them before your users do."

The Problem

In 2025, a law firm submitted a legal brief containing case citations that did not exist. Their AI assistant had fabricated case names, dates, and rulings with full confidence. The lawyers trusted it. The judge sanctioned them.

This was not a failure of AI technology. It was a failure of the infrastructure around AI — there was no layer between the model and the real world that simply asked: "Is this actually true?"

That is what I built.

What TruthLayer Does

TruthLayer is a production-ready verification API deployed on AWS. It sits between any AI model and its users, intercepting every response before it reaches a human and labeling each claim with one of three statuses according to how well your source documents support it.

Status	Meaning	Action
✅ VERIFIED	Factually grounded in your source documents	Safe to display
⚠️ UNCERTAIN	Topically related but not fully confirmed	Display with caveat
❌ UNSUPPORTED	Not found in any source — likely hallucinated	Block or flag

One API call. No model changes. No fine-tuning. Sub-second latency.

🌐 Try it free: truth-layer.vercel.app

The Core Innovation: Two Signals, Not One

Many hallucination detectors rely on a single signal: embedding similarity. Here's why that fails.

"GDPR fines are up to 2% of revenue" vs "GDPR fines are up to 4% of revenue"

Cosine similarity between these two sentences: 0.97 out of 1.0. To an embedding model they are nearly identical. In a compliance audit, one of them is simply wrong.

An embedding-only system classifies the wrong answer as VERIFIED. TruthLayer catches it using a second signal.

Signal 1 — Amazon Bedrock Titan Embeddings V2: Claims and source chunks are embedded into 1,024-dimensional semantic vectors. Cosine similarity finds the best-matching source chunk for each claim.
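The matching step in Signal 1 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not TruthLayer's actual code: the function names `cosine_similarity` and `best_matching_chunk` are hypothetical, and the vectors are assumed to have already been returned by the embedding model.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_matching_chunk(
    claim_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
) -> Tuple[int, float]:
    """Return (index, similarity) of the source chunk closest to the claim."""
    if not chunk_vecs:
        raise ValueError("at least one source chunk is required")
    scores = [cosine_similarity(claim_vec, vec) for vec in chunk_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

In production the vectors would have 1,024 dimensions; the logic is the same at any size.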

Signal 2 — Entity Contradiction Checker (Custom): A rule-based system that applies multiplicative penalties for three contradiction classes embeddings fundamentally cannot detect:

Contradiction	Example	Penalty
Numerical mismatch	"2% fine" vs "4% fine"	× 0.5
Negation mismatch	"non-refundable" vs "refundable"	× 0.6
Superlative vs specific	"unlimited" vs "1,000/month"	× 0.6

Final Score = Cosine Similarity × Contradiction Penalty

The 2%/4% GDPR claim: 0.97 × 0.5 = 0.485 → UNSUPPORTED. Caught.
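Here is a rough sketch of how the penalty and the final score could fit together. The three penalty values come from the table above; everything else is an assumption made for illustration: the regular expressions, the word lists, the choice to multiply penalties together when several apply, and the status thresholds (0.75 and 0.50) are hypothetical and are not TruthLayer's actual rules.

```python
import re
from typing import Set

# Penalties from the table above.
NUMERIC_PENALTY = 0.5
NEGATION_PENALTY = 0.6
SUPERLATIVE_PENALTY = 0.6

# Hypothetical patterns, chosen only to make the sketch runnable.
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")
NEGATION_RE = re.compile(r"\b(?:not|no|never|without|non-\w+)\b|n't\b", re.IGNORECASE)
SUPERLATIVE_RE = re.compile(r"\b(?:unlimited|always|every|all|any)\b", re.IGNORECASE)

# Hypothetical thresholds; the only constraint taken from the text is that
# a final score of 0.485 must come out as UNSUPPORTED.
VERIFIED_THRESHOLD = 0.75
UNCERTAIN_THRESHOLD = 0.50


def numbers(text: str) -> Set[str]:
    """Extract numeric tokens, normalising '1,000' to '1000'."""
    return {match.replace(",", "") for match in NUMBER_RE.findall(text)}


def contradiction_penalty(claim: str, source: str) -> float:
    """Multiply together the penalties for every contradiction class found."""
    penalty = 1.0
    claim_nums, source_nums = numbers(claim), numbers(source)
    # Numerical mismatch: the claim cites a number the source does not contain.
    if claim_nums and source_nums and not claim_nums <= source_nums:
        penalty *= NUMERIC_PENALTY
    # Negation mismatch: exactly one side is negated.
    if bool(NEGATION_RE.search(claim)) != bool(NEGATION_RE.search(source)):
        penalty *= NEGATION_PENALTY
    # Superlative vs specific: the claim is absolute, the source gives a figure.
    if SUPERLATIVE_RE.search(claim) and source_nums and not SUPERLATIVE_RE.search(source):
        penalty *= SUPERLATIVE_PENALTY
    return penalty


def classify(similarity: float, penalty: float) -> str:
    """Final Score = Cosine Similarity x Contradiction Penalty."""
    score = similarity * penalty
    if score >= VERIFIED_THRESHOLD:
        return "VERIFIED"
    if score >= UNCERTAIN_THRESHOLD:
        return "UNCERTAIN"
    return "UNSUPPORTED"


if __name__ == "__main__":
    p = contradiction_penalty(
        "GDPR fines are up to 2% of revenue",
        "GDPR fines are up to 4% of revenue",
    )
    print(p, classify(0.97, p))
```

A real checker would need more careful entity extraction (units, dates, currencies), but the shape is the same: detect the contradiction class, then scale the similarity down.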

The AWS Architecture

Everything runs serverless on AWS — Amazon Bedrock, AWS Lambda, Amazon API Gateway, Amazon DynamoDB, deployed via AWS SAM. Four Lambda functions, four DynamoDB tables, one template.yaml file, one command to deploy.

The embedding cache is the key architectural decision. Early TruthLayer hit Bedrock on every request — a 3-document verification took 3–4 seconds. After adding DynamoDB as an embedding cache, the same verification dropped to 750ms. Documents don't change. Their embeddings shouldn't be recomputed every time.
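The caching idea can be sketched as a read-through cache keyed by a hash of the chunk text. This is an illustration under stated assumptions, not the deployed code: the `EmbeddingCache` class and `chunk_cache_key` helper are hypothetical names, the `store` is any dict-like object standing in for the DynamoDB table, and `embed_fn` stands in for the Bedrock call.

```python
import hashlib
from typing import Callable, List, MutableMapping


def chunk_cache_key(text: str, model_id: str) -> str:
    """Derive a stable key from the model ID and the exact chunk text."""
    return hashlib.sha256(f"{model_id}\n{text}".encode("utf-8")).hexdigest()


class EmbeddingCache:
    """Read-through cache: look up first, embed and store only on a miss."""

    def __init__(
        self,
        embed_fn: Callable[[str], List[float]],
        store: MutableMapping[str, List[float]],
        model_id: str,
    ) -> None:
        self.embed_fn = embed_fn  # stand-in for the embedding model call
        self.store = store        # stand-in for the DynamoDB table
        self.model_id = model_id
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> List[float]:
        key = chunk_cache_key(text, self.model_id)
        cached = self.store.get(key)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        vector = self.embed_fn(text)
        self.store[key] = vector
        return vector
```

Including the model ID in the key is a design choice made in this sketch: it means a later switch to a different embedding model cannot return stale vectors from the old one.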

Scenario	Latency	Embedding source / cost
First verification (cache miss)	~900ms	Bedrock call
All subsequent verifications (cache hit)	~750ms	DynamoDB only (~5ms)
Monthly cost at 50,000 verifications	—	~$1.50 total

Security

API keys are stored in DynamoDB only as SHA-256 hashes. The raw key is shown once at creation and never stored, the same model Stripe and GitHub use. Rate limiting is enforced at the database level via DynamoDB conditional writes, not in the application layer. Each Lambda function holds only the IAM permissions it needs. There are no third-party dependencies to package: the code uses only the Python standard library and boto3, which the Lambda Python runtime already includes.
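Both ideas can be sketched with the standard library alone. The names below are assumptions for illustration: the `tl_` key prefix, the `key_hash`, `window_start`, and `request_count` attribute names, and the helper functions are hypothetical, not TruthLayer's actual schema. The rate-limit helper only builds the arguments for a boto3 `Table.update_item` call, so the sketch runs without AWS credentials.

```python
import hashlib
import hmac
import secrets
from typing import Any, Dict, Tuple


def hash_api_key(raw_key: str) -> str:
    """SHA-256 hex digest; this is the only form that gets stored."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def generate_api_key() -> Tuple[str, str]:
    """Return (raw_key, key_hash). Show raw_key once, persist key_hash only."""
    raw_key = "tl_" + secrets.token_urlsafe(32)  # hypothetical prefix
    return raw_key, hash_api_key(raw_key)


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


def rate_limit_update_kwargs(key_hash: str, window_start: int, limit: int) -> Dict[str, Any]:
    """Arguments for Table.update_item that increment a per-window counter
    only while it is below `limit`. When the condition fails, DynamoDB
    rejects the write with ConditionalCheckFailedException, which the
    caller can translate into an HTTP 429."""
    return {
        "Key": {"key_hash": key_hash, "window_start": window_start},
        "UpdateExpression": "ADD request_count :one",
        "ConditionExpression": "attribute_not_exists(request_count) OR request_count < :limit",
        "ExpressionAttributeValues": {":one": 1, ":limit": limit},
    }
```

Because the check and the increment happen in one conditional write, two concurrent Lambda invocations cannot both slip past the limit the way they could with a read-then-write in application code.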

Live Demo

The dashboard tracks verification analytics, claim-level results, source attribution, API key management, and cache performance in real time.

Try it yourself: Go to truth-layer.vercel.app → Get API Key → paste any AI response and source document → see claim-by-claim results in under 1 second.

What I Learned

Embeddings are brilliant — and dangerously incomplete. "$2 million" and "$20 million" score 0.97 cosine similarity. They mean nearly the same thing semantically while being completely different factually. You need both signals.

The cache is as important as the algorithm. Without the DynamoDB embedding cache, TruthLayer was unusable at 3–4 seconds. Caching is infrastructure design, not optimization.

Data staying in AWS changes the enterprise conversation. Healthcare and legal organizations often cannot send patient records or contracts to external APIs. Bedrock keeps everything within the AWS ecosystem, which removes one of the biggest compliance objections from the start.

Full technical breakdown on AWS Builder Center

Stack: Amazon Bedrock · AWS Lambda · API Gateway · DynamoDB · AWS SAM · Next.js 16 · Python 3.9 · Kiro IDE
