DEV Community

AgentOracle

How to Add Claim Verification to Your LangChain Agent in 5 Minutes

Your LangChain agent is wrong about 10% of the time. Not occasionally — consistently, confidently, and silently.

The problem isn't the model. It's that your agent has no way to know when it's wrong. It receives information, formats a response, and acts. No second opinion. No fact-check. No circuit breaker.

This tutorial shows you how to add a verification layer in 5 minutes that catches hallucinations before your agent acts on them.

The Problem

LLM hallucination rates in 2026 range from 3% to 20% depending on the task. On a summarization benchmark, GPT-4 looks great. On open-ended factual questions — the kind your agent asks constantly — it's a different story.

The deeper problem: reasoning models hallucinate more on factual tasks, not less. The more a model "thinks through" an answer, the more likely it is to fill gaps with plausible-sounding fiction.

In a simple chatbot, a hallucination is embarrassing. In an autonomous agent pipeline, it's a wrong action. A refunded order, a bad recommendation, a compliance violation, a message sent to the wrong person.

The standard fix is human review. But human review defeats the purpose of an autonomous agent.

The real fix is a verification layer that runs before your agent acts — independently of the model that generated the claim.

Install

pip install langchain-agentoracle

That's it. No API keys. No configuration. The free tier gives you 20 preview verifications per hour to test with.

Quick Start: Verify Before Your Agent Acts

The simplest integration — verify a piece of text and get per-claim verdicts:

from langchain_agentoracle import AgentOracleEvaluateTool

verifier = AgentOracleEvaluateTool()

# Your agent just generated this text — is it true?
agent_output = """
OpenAI released GPT-4 in March 2023.
Bitcoin was created by Elon Musk.
The Python programming language was created by Guido van Rossum.
"""

result = verifier.run(agent_output)
print(result)

Here's what comes back:

EVALUATION RESULT
Overall confidence: 0.61
Recommendation: ACT
Claims found: 3 | Supported: 2 | Refuted: 1 | Unverifiable: 0
Sources used: sonar, sonar-pro, adversarial, gemma-4

CLAIMS:
  ✓ [SUPPORTED] (1.00) OpenAI released GPT-4 in March 2023
    Evidence: Widely documented historical fact; GPT-4 was announced
    and released on March 14, 2023.

  ✗ [REFUTED] (0.83) Bitcoin was created by Elon Musk
    Evidence: Bitcoin's creator is the pseudonymous Satoshi Nakamoto.
    Correction: Bitcoin was created by Satoshi Nakamoto, not Elon Musk.

  ✓ [SUPPORTED] (1.00) Python was created by Guido van Rossum
    Evidence: Confirmed in official Python documentation and
    Van Rossum's own statements.

Three claims went in. Two came back supported with evidence. One came back refuted with a correction. Your agent now knows claim #2 is wrong before it acts on it.
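
To act on individual claims rather than the overall score, you can parse the verdict lines out of the formatted result. A minimal sketch that assumes the output format shown above; the parsing is my own, not a package API:

```python
import re

def extract_claims(result: str):
    """Pull [VERDICT] (confidence) claim lines out of the formatted result."""
    pattern = re.compile(r"\[(SUPPORTED|REFUTED|UNVERIFIABLE)\]\s+\(([\d.]+)\)\s+(.*)")
    claims = []
    for line in result.splitlines():
        m = pattern.search(line)
        if m:
            claims.append({
                "verdict": m.group(1),
                "confidence": float(m.group(2)),
                "claim": m.group(3).strip(),
            })
    return claims

sample = """
  ✓ [SUPPORTED] (1.00) OpenAI released GPT-4 in March 2023
  ✗ [REFUTED] (0.83) Bitcoin was created by Elon Musk
"""
refuted = [c for c in extract_claims(sample) if c["verdict"] == "REFUTED"]
```

Filtering for refuted claims this way lets your pipeline strip or correct the bad sentence instead of discarding the whole output.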

Add It to Your Agent's Toolbelt

Want your agent to verify claims on its own? Add the tools directly:

from langchain_agentoracle import get_agentoracle_tools

# Returns all 6 AgentOracle tools ready for your agent
tools = get_agentoracle_tools()

# Or pick specific ones:
from langchain_agentoracle import (
    AgentOracleEvaluateTool,    # Per-claim verification ($0.01)
    AgentOracleVerifyGateTool,  # Quick pass/fail gate (free)
    AgentOraclePreviewTool,     # Research preview (free, 20/hr)
)

The tools follow LangChain's BaseTool interface, so they plug into any agent:

from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI
from langchain_agentoracle import AgentOracleEvaluateTool, AgentOraclePreviewTool

llm = ChatOpenAI(model="gpt-4")

tools = [
    AgentOracleEvaluateTool(),
    AgentOraclePreviewTool(),
]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
)

# The agent can now verify claims before acting
agent.run("Check if this is true: Tesla's market cap exceeded $2 trillion in 2024")

The Verify-Then-Act Pattern

The most useful pattern: gate your agent's actions on verification confidence.

from langchain_agentoracle import AgentOracleEvaluateTool

verifier = AgentOracleEvaluateTool()

def verify_then_act(text, confidence_threshold=0.8):
    """Only act if verification confidence exceeds threshold."""
    result = verifier.run(text)

    # Parse the confidence from the result
    # The tool returns a formatted string with overall confidence
    if "Overall confidence:" in result:
        conf_line = next(line for line in result.splitlines()
                         if "Overall confidence" in line)
        confidence = float(conf_line.split(": ")[1].strip())

        if confidence >= confidence_threshold:
            print(f"✅ VERIFIED ({confidence}) — safe to act")
            return True
        else:
            print(f"⚠️ LOW CONFIDENCE ({confidence}) — hold for review")
            return False
    return False

# In your agent pipeline:
claim = "The Federal Reserve raised interest rates in March 2024"
if verify_then_act(claim):
    # proceed with the action
    pass
else:
    # flag for human review or use a fallback
    pass

Free Quick Check: The Verify Gate

Don't need per-claim breakdowns? The verify gate gives you a fast pass/fail:

from langchain_agentoracle import AgentOracleVerifyGateTool

gate = AgentOracleVerifyGateTool()

# Quick binary check — free, no payment needed
result = gate.run("The speed of light is approximately 300,000 km per second")
print(result)
# VERIFY GATE: FAIL
# Confidence: 1.00
# Recommendation: ACT
# ("FAIL" = gate found no issues — content is safe to act on)
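
If you script against the gate, the `Recommendation:` line is the safer signal to key off than the PASS/FAIL label. A small helper, assuming the output format shown above:

```python
def gate_passes(gate_output: str) -> bool:
    """True when the verify gate recommends acting on the content."""
    for line in gate_output.splitlines():
        if line.startswith("Recommendation:"):
            return line.split(":", 1)[1].strip() == "ACT"
    return False  # no recommendation line: treat as unsafe

sample = "VERIFY GATE: FAIL\nConfidence: 1.00\nRecommendation: ACT"
print(gate_passes(sample))  # True
```

Defaulting to False when the line is missing keeps a malformed response from silently passing the gate.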

Why AgentOracle

Most hallucination detection tools are built for humans — dashboards, observability platforms, monitoring UIs. They tell you what went wrong after the fact.

AgentOracle is built for agents. It sits in the pipeline, takes any text, runs it through 4 independent verification sources in parallel, and returns a machine-readable verdict before your agent acts.

No dashboards. No subscriptions. No API keys to configure. Your agent calls /evaluate, gets ACT / VERIFY / REJECT with a confidence score and evidence, and decides what to do next.

What's under the hood:

  • 4 independent sources: Sonar, Sonar Pro, Adversarial challenge, and Gemma 4
  • Per-claim decomposition — complex text gets broken into individual verifiable claims
  • Confidence calibration across sources
  • Evidence and corrections for every verdict
  • 1,900+ claim fingerprints in the database and growing daily
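
The three recommendations map naturally onto a dispatch step in your pipeline. A minimal sketch; the handler names (`execute_action`, `queue_for_review`, `reject_and_log`) are illustrative placeholders, not part of the package:

```python
def execute_action(payload):
    return ("acted", payload)       # placeholder: your real side effect

def queue_for_review(payload):
    return ("queued", payload)      # placeholder: human-review queue

def reject_and_log(payload):
    return ("rejected", payload)    # placeholder: drop and audit-log

def route(recommendation: str, payload):
    """Dispatch an agent step on an ACT / VERIFY / REJECT verdict."""
    if recommendation == "ACT":
        return execute_action(payload)
    if recommendation == "VERIFY":
        return queue_for_review(payload)
    return reject_and_log(payload)

print(route("ACT", "refund order #123"))  # ('acted', 'refund order #123')
```

Treating anything other than ACT or VERIFY as a rejection is the conservative default for an autonomous pipeline.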

Try It Now

Playground — no setup, no payment: agentoracle.co
Paste any text and see per-claim verdicts in under 15 seconds.

Packages:

  • pip install langchain-agentoracle (PyPI)
  • pip install crewai-agentoracle (PyPI)
  • npm install agentoracle-verify (npm)

Source: GitHub


Hallucinations aren't going away. The models are getting better, but "better" still means wrong 3-10% of the time on the tasks your agents actually run.

A verification layer doesn't replace a good model. It catches the cases where even a good model is confidently wrong — which is exactly when you need it most.
