Your LangChain agent is wrong about 10% of the time. Not occasionally — consistently, confidently, and silently.
The problem isn't the model. It's that your agent has no way to know when it's wrong. It receives information, formats a response, and acts. No second opinion. No fact-check. No circuit breaker.
This tutorial shows you how to add a verification layer in 5 minutes that catches hallucinations before your agent acts on them.
## The Problem
LLM hallucination rates in 2026 range from 3% to 20% depending on the task. On a summarization benchmark, GPT-4 looks great. On open-ended factual questions — the kind your agent asks constantly — it's a different story.
The deeper problem: reasoning models hallucinate more on factual tasks, not less. The more a model "thinks through" an answer, the more likely it is to fill gaps with plausible-sounding fiction.
In a simple chatbot, a hallucination is embarrassing. In an autonomous agent pipeline, it's a wrong action. A refunded order, a bad recommendation, a compliance violation, a message sent to the wrong person.
The standard fix is human review. But human review defeats the purpose of an autonomous agent.
The real fix is a verification layer that runs before your agent acts — independently of the model that generated the claim.
## Install
```shell
pip install langchain-agentoracle
```
That's it. No API keys. No configuration. The free tier gives you 20 preview verifications per hour to test with.
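If you want to stay inside that 20-per-hour limit programmatically, a small client-side throttle is enough. This is a sketch of our own, not part of `langchain-agentoracle`; the only number taken from the package is the free tier's 20/hr window:

```python
import time
from collections import deque

class HourlyQuota:
    """Client-side throttle: allow at most `limit` calls per rolling window.

    Local convenience only -- not part of langchain-agentoracle.
    The default of 20 matches the free preview tier described above.
    """
    def __init__(self, limit=20, window=3600.0):
        self.limit = limit
        self.window = window
        self.calls = deque()  # timestamps of recent allowed calls

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the rolling window
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False

quota = HourlyQuota(limit=20)
# if quota.allow():
#     result = preview_tool.run(text)
```

Wrap any preview-tier call in `quota.allow()` and fall back (or queue the request) when it returns `False`.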
## Quick Start: Verify Before Your Agent Acts
The simplest integration — verify a piece of text and get per-claim verdicts:
```python
from langchain_agentoracle import AgentOracleEvaluateTool

verifier = AgentOracleEvaluateTool()

# Your agent just generated this text — is it true?
agent_output = """
OpenAI released GPT-4 in March 2023.
Bitcoin was created by Elon Musk.
The Python programming language was created by Guido van Rossum.
"""

result = verifier.run(agent_output)
print(result)
```
Here's what comes back:
```
EVALUATION RESULT
Overall confidence: 0.61
Recommendation: ACT
Claims found: 3 | Supported: 2 | Refuted: 1 | Unverifiable: 0
Sources used: sonar, sonar-pro, adversarial, gemma-4

CLAIMS:
✓ [SUPPORTED] (1.00) OpenAI released GPT-4 in March 2023
  Evidence: Widely documented historical fact; GPT-4 was announced
  and released on March 14, 2023.

✗ [REFUTED] (0.83) Bitcoin was created by Elon Musk
  Evidence: Bitcoin's creator is the pseudonymous Satoshi Nakamoto.
  Correction: Bitcoin was created by Satoshi Nakamoto, not Elon Musk.

✓ [SUPPORTED] (1.00) Python was created by Guido van Rossum
  Evidence: Confirmed in official Python documentation and
  Van Rossum's own statements.
```
Three claims went in. Two came back supported with evidence. One came back refuted with a correction. Your agent now knows claim #2 is wrong before it acts on it.
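If you want those verdicts as data rather than formatted text, the claim lines can be parsed with a few lines of stdlib Python. A sketch, assuming the line format shown above (`✓ [SUPPORTED] (1.00) ...`) is stable; the helper is ours, not the package's:

```python
import re

# Matches claim lines of the form shown above: a ✓/✗ marker,
# a bracketed verdict, a parenthesized confidence, then the claim text.
CLAIM_RE = re.compile(r"^[✓✗]\s*\[(\w+)\]\s*\(([\d.]+)\)\s*(.+)$")

def parse_claims(result_text):
    """Extract (verdict, confidence, claim) tuples from an evaluation result."""
    claims = []
    for line in result_text.splitlines():
        m = CLAIM_RE.match(line.strip())
        if m:
            verdict, conf, claim = m.groups()
            claims.append((verdict, float(conf), claim))
    return claims

sample = """\
CLAIMS:
✓ [SUPPORTED] (1.00) OpenAI released GPT-4 in March 2023
✗ [REFUTED] (0.83) Bitcoin was created by Elon Musk
✓ [SUPPORTED] (1.00) Python was created by Guido van Rossum
"""

# The one claim the agent should not act on:
refuted = [c for c in parse_claims(sample) if c[0] == "REFUTED"]
```

From here you can drop refuted claims, rewrite them with the supplied corrections, or escalate to a human.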
## Add It to Your Agent's Toolbelt
Want your agent to verify claims on its own? Add the tools directly:
```python
from langchain_agentoracle import get_agentoracle_tools

# Returns all 6 AgentOracle tools ready for your agent
tools = get_agentoracle_tools()

# Or pick specific ones:
from langchain_agentoracle import (
    AgentOracleEvaluateTool,    # Per-claim verification ($0.01)
    AgentOracleVerifyGateTool,  # Quick pass/fail gate (free)
    AgentOraclePreviewTool,     # Research preview (free, 20/hr)
)
```
The tools follow LangChain's BaseTool interface, so they plug into any agent:
```python
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI
from langchain_agentoracle import AgentOracleEvaluateTool, AgentOraclePreviewTool

llm = ChatOpenAI(model="gpt-4")
tools = [
    AgentOracleEvaluateTool(),
    AgentOraclePreviewTool(),
]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
)

# The agent can now verify claims before acting
agent.run("Check if this is true: Tesla's market cap exceeded $2 trillion in 2024")
```
## The Verify-Then-Act Pattern
The most useful pattern: gate your agent's actions on verification confidence.
```python
from langchain_agentoracle import AgentOracleEvaluateTool

verifier = AgentOracleEvaluateTool()

def verify_then_act(text, confidence_threshold=0.8):
    """Only act if verification confidence exceeds threshold."""
    result = verifier.run(text)

    # Parse the confidence from the result.
    # The tool returns a formatted string with overall confidence.
    if "Overall confidence:" in result:
        conf_line = [l for l in result.split('\n') if 'Overall confidence' in l][0]
        confidence = float(conf_line.split(': ')[1])

        if confidence >= confidence_threshold:
            print(f"✅ VERIFIED ({confidence}) — safe to act")
            return True
        else:
            print(f"⚠️ LOW CONFIDENCE ({confidence}) — hold for review")
            return False

    return False

# In your agent pipeline:
claim = "The Federal Reserve raised interest rates in March 2024"
if verify_then_act(claim):
    pass  # proceed with the action
else:
    pass  # flag for human review or use a fallback
```
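The boolean gate generalizes naturally to the three-way ACT / VERIFY / REJECT decision the service returns. Here is one way to band a confidence score yourself; the thresholds below are illustrative choices of ours, not AgentOracle's own calibration:

```python
def route(confidence, act_threshold=0.8, reject_threshold=0.4):
    """Map an overall confidence score to a three-way decision.

    Thresholds are illustrative defaults, not the package's bands:
    high confidence -> act autonomously, very low -> reject outright,
    anything in between -> hold for verification or human review.
    """
    if confidence >= act_threshold:
        return "ACT"
    if confidence < reject_threshold:
        return "REJECT"
    return "VERIFY"

print(route(0.92))  # ACT
print(route(0.61))  # VERIFY
print(route(0.15))  # REJECT
```

Tune the bands per action: a refund workflow might demand 0.95 to act, while a draft-only workflow can tolerate 0.7.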
## Free Quick Check: The Verify Gate
Don't need per-claim breakdowns? The verify gate gives you a fast pass/fail:
```python
from langchain_agentoracle import AgentOracleVerifyGateTool

gate = AgentOracleVerifyGateTool()

# Quick binary check — free, no payment needed
result = gate.run("The speed of light is approximately 300,000 km per second")
print(result)
# VERIFY GATE: FAIL
# Confidence: 1.00
# Recommendation: ACT
# ("FAIL" = gate found no issues — content is safe to act on)
```
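Because the gate's output uses "FAIL" to mean "no issues found" (per the note above), it is worth isolating that inversion in a single helper so the rest of your pipeline reads naturally. A sketch against the output format shown; the helper is ours, not the package's:

```python
def gate_is_safe(gate_output):
    """Return True when the verify gate found no issues.

    Per the sample output above, 'VERIFY GATE: FAIL' means the gate
    failed to find problems, i.e. the content is safe to act on.
    Unparseable output is treated as unsafe.
    """
    for line in gate_output.splitlines():
        if line.startswith("VERIFY GATE:"):
            status = line.split(":", 1)[1].strip()
            return status == "FAIL"
    return False

sample = "VERIFY GATE: FAIL\nConfidence: 1.00\nRecommendation: ACT"
print(gate_is_safe(sample))  # True
```

Defaulting to "unsafe" on a parse miss keeps the gate fail-closed, which is the right posture for a circuit breaker.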
## Why AgentOracle
Most hallucination detection tools are built for humans — dashboards, observability platforms, monitoring UIs. They tell you what went wrong after the fact.
AgentOracle is built for agents. It sits in the pipeline, takes any text, runs it through 4 independent verification sources in parallel, and returns a machine-readable verdict before your agent acts.
No dashboards. No subscriptions. No API keys to configure. Your agent calls /evaluate, gets ACT / VERIFY / REJECT with a confidence score and evidence, and decides what to do next.
What's under the hood:
- 4 independent sources: Sonar, Sonar Pro, Adversarial challenge, and Gemma 4
- Per-claim decomposition — complex text gets broken into individual verifiable claims
- Confidence calibration across sources
- Evidence and corrections for every verdict
- 1,900+ claim fingerprints in the database and growing daily
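To make "confidence calibration across sources" concrete: given several independent verdicts on the same claim, one simple combiner is a weighted average of per-source scores. This is a sketch of the general idea only; AgentOracle's actual calibration is not published here, and the weights below are invented:

```python
def combine_confidence(source_scores, weights=None):
    """Weighted average of per-source confidence scores in [0, 1].

    source_scores: dict of source name -> confidence.
    weights: optional dict of source name -> weight (defaults to 1.0).
    Illustrative only; the real calibration scheme may differ.
    """
    weights = weights or {}
    total = sum(weights.get(s, 1.0) for s in source_scores)
    return sum(score * weights.get(s, 1.0)
               for s, score in source_scores.items()) / total

scores = {"sonar": 0.9, "sonar-pro": 0.95, "adversarial": 0.7, "gemma-4": 0.85}
print(round(combine_confidence(scores), 3))  # 0.85 — unweighted mean
```

Weighting lets you trust some sources more than others, e.g. upweighting the adversarial challenge when the claim is high-stakes.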
## Try It Now
Playground — no setup, no payment: agentoracle.co
Paste any text and see per-claim verdicts in under 15 seconds.
Packages:

- `pip install langchain-agentoracle` — PyPI
- `pip install crewai-agentoracle` — PyPI
- `npm install agentoracle-verify` — npm

Source: GitHub
Hallucinations aren't going away. The models are getting better, but "better" still means wrong 3-10% of the time on the tasks your agents actually run.
A verification layer doesn't replace a good model. It catches the cases where even a good model is confidently wrong — which is exactly when you need it most.