
Mike W


Veritas: Give Your AI Agent the Ability to Know What It Knows

Most knowledge systems store facts. Veritas stores how well you know them.

I built Veritas because AI agents have an epistemic blind spot: they act on beliefs they can't evaluate. "The API is reliable" — based on what? One observation from 2022? Twenty independent tests from last month? A single assumption that everything else depends on?

Without structure, agents overclaim certainty or collapse into paralysis. Veritas gives beliefs a shape.


Confidence is a vector, not a number

Every claim in Veritas carries a ConfidenceVector with four components:

Field               Meaning
value               Current best estimate (0–1), with temporal decay applied
fragility           How much confidence drops if the best source is removed
staleness_penalty   How much evidence aging has already cost
source_diversity    How independent your sources are
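The shape of the vector can be sketched as a plain dataclass. This is an illustration of the four fields from the table, not the actual class in veritas.models:

```python
from dataclasses import dataclass

@dataclass
class ConfidenceVector:
    """Illustrative sketch of the four components (field names from the table)."""
    value: float              # current best estimate in [0, 1], after temporal decay
    fragility: float          # confidence drop if the single best source is removed
    staleness_penalty: float  # confidence already lost to evidence aging
    source_diversity: float   # how independent the underlying sources are

cv = ConfidenceVector(value=0.90, fragility=0.12,
                      staleness_penalty=0.00, source_diversity=0.75)
```

The point of keeping four numbers instead of one: two claims at 0.90 can differ wildly in how easily that 0.90 falls apart.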

Sources are combined using noisy-OR pooling — the same model used in fault trees — so independent confirmation genuinely compounds, but correlated sources don't double-count.

from veritas import VeritasDB, calculate_confidence
from veritas.models import Claim, Source, Stance

db = VeritasDB("~/.veritas/veritas.db")
claim = db.search("persistent memory")[0]
cv = calculate_confidence(claim.sources)

print(cv.value)             # 0.90
print(cv.fragility)         # 0.12  — reasonably robust
print(cv.staleness_penalty) # 0.00  — sources are fresh
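The pooling behavior is worth seeing in isolation. This is the standard noisy-OR formula; Veritas additionally discounts correlated sources via source_diversity, which this sketch omits:

```python
from math import prod

def noisy_or(confidences):
    """Noisy-OR pooling: each independent source eats into the remaining
    doubt, so confirmation compounds instead of averaging."""
    return 1 - prod(1 - c for c in confidences)

noisy_or([0.6])        # 0.6
noisy_or([0.6, 0.6])   # ~0.84 -- two independent sources beat either alone
```

Averaging two 0.6 sources would leave you at 0.6; noisy-OR correctly treats them as compounding evidence, which is exactly why correlated sources must be discounted before pooling.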

Evidence ages. Theorems don't.

Every source has a type with a corresponding half-life:

Source type    Half-life    Example
MATHEMATICAL   timeless     Turing 1936, Gödel 1931
THEORETICAL    ~140 years   Newton, Darwin
EMPIRICAL      ~10 years    Studies, benchmarks
AUTHORITY      ~6 years     Expert consensus
ANECDOTAL      ~2 years     Personal accounts

A 1986 study should carry less weight than a 2024 replication. A theorem from 1936 should carry exactly the same weight as when it was proved.
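Half-life decay is plain exponential decay. A minimal sketch using the half-lives from the table (the library may floor very old sources or apply other adjustments not shown here):

```python
HALF_LIFE_YEARS = {"THEORETICAL": 140, "EMPIRICAL": 10,
                   "AUTHORITY": 6, "ANECDOTAL": 2}

def decayed(weight, source_type, age_years):
    """Halve a source's weight once per half-life. MATHEMATICAL sources
    are timeless and never decay."""
    if source_type == "MATHEMATICAL":
        return weight
    return weight * 0.5 ** (age_years / HALF_LIFE_YEARS[source_type])

decayed(0.70, "ANECDOTAL", 6)       # three half-lives: ~0.0875
decayed(0.70, "MATHEMATICAL", 88)   # unchanged: 0.70
```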

# See what's going stale
veritas stale

  Claims losing confidence to age:

  -0.32  [##########..........] 0.54  Minsky 1967: AI will solve all problems
         59.0y  0.70->0.07  [ANEC]  Minsky 1967 interview

Belief propagation

Claims depend on other claims. When a foundation weakens, everything built on it updates automatically; you never edit the dependent claims yourself.

Three inference types with different propagation behavior:

  • DEDUCTIVE: dependent claim capped at the foundation's confidence
  • INDUCTIVE: weak foundations drag confidence down more than strong ones lift it
  • ABDUCTIVE: soft drag, for speculative reasoning chains

veritas chain "Cathedral will find a market"

  [0.86] Cathedral will find a market
    |-- [IND] -->
      [0.89] Developers need persistent agent memory
        |-- [IND] -->
          [0.95] AI agents currently lose state between sessions

Add a contradicting source to the bottom claim. The top two update automatically.
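The three behaviors can be sketched as follows. The formulas below are hypothetical illustrations of the rules stated above; Veritas's actual curves and thresholds may differ:

```python
def propagate(own, foundation, inference):
    """Illustrative propagation of a foundation's confidence into a
    dependent claim (all thresholds here are made up for the sketch)."""
    if inference == "DEDUCTIVE":
        # A conclusion can never be more certain than its premise.
        return min(own, foundation)
    if inference == "INDUCTIVE":
        # Weak foundations drag confidence down; strong ones don't lift it.
        drag = 1.0 if foundation >= 0.5 else foundation / 0.5
        return own * drag
    if inference == "ABDUCTIVE":
        # Softer version of the inductive drag.
        drag = 1.0 if foundation >= 0.5 else 0.5 + foundation
        return own * drag
    raise ValueError(f"unknown inference type: {inference}")

propagate(0.9, 0.6, "DEDUCTIVE")   # 0.6  -- capped at the foundation
propagate(0.9, 0.3, "INDUCTIVE")   # ~0.54 -- weak foundation drags it down
propagate(0.9, 0.3, "ABDUCTIVE")   # ~0.72 -- same weakness, softer drag
```

Note the asymmetry in the inductive rule: a 0.95 foundation leaves the dependent claim's own confidence alone, while a 0.3 foundation pulls it down hard.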


Semantic contradiction detection

Keyword matching misses semantic contradictions. "Physical activity strengthens the cardiovascular system" and "Exercise has no proven benefit for heart health" share zero content words but directly contradict each other.

Veritas uses sentence-transformers with a cosine similarity threshold tuned to catch genuine contradictions (sleep/rest at 0.49) while avoiding false positives (sky/sunsets at 0.45). Falls back to keyword matching if the library isn't installed.

from veritas import find_contradictions

claim = db.search("physical activity strengthens cardiovascular")[0]
contras = find_contradictions(claim, db.all_claims(), db=db)
# Finds: "Exercise has no proven benefit for heart health"
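The mechanism underneath is cosine similarity over sentence embeddings: two claims on the same topic embed close together regardless of shared vocabulary, and a contradiction is a high-similarity pair with opposing stances. A sketch of the similarity half, with toy 3-d vectors standing in for real sentence-transformer embeddings and a hypothetical threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: same topic, opposite stance -> vectors still sit close.
exercise_good = [0.9, 0.1, 0.2]
exercise_bad  = [0.8, 0.2, 0.3]

THRESHOLD = 0.47  # hypothetical; sits between the 0.45/0.49 boundary cases above
if cosine(exercise_good, exercise_bad) > THRESHOLD:
    print("same topic: check stances for contradiction")
```

Similarity alone isn't enough, which is why Veritas tracks Stance on each source: "topically close + opposing stance" is the actual contradiction signal.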

Reasoning guard

Before an agent acts on a belief, check whether it actually holds up:

from veritas import ReasoningGuard

guard = ReasoningGuard(db)
result = guard.check("GPT-3 represents the state of the art in language models")
print(result)
[CAUTION] confidence=0.87  Belief has weaknesses that should be acknowledged
  * Stale — evidence aging has reduced confidence by 0.12

Verdicts: PROCEED / CAUTION / HALT

Triggers: low confidence · single source · high fragility · staleness · contradictions
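How the triggers combine into a verdict can be sketched like this. The thresholds are invented for illustration; the library's actual cutoffs aren't documented here:

```python
def verdict(confidence, n_sources, fragility, staleness, contradictions):
    """Hypothetical verdict rules mirroring the triggers listed above."""
    if contradictions > 0 or confidence < 0.3:
        return "HALT"     # don't act on contested or near-baseless beliefs
    if confidence < 0.7 or n_sources < 2 or fragility > 0.5 or staleness > 0.1:
        return "CAUTION"  # act, but acknowledge the weakness
    return "PROCEED"

verdict(0.87, 3, 0.2, 0.12, 0)   # "CAUTION" -- staleness trips, as in the output above
verdict(0.92, 4, 0.1, 0.02, 0)   # "PROCEED"
verdict(0.85, 5, 0.1, 0.00, 1)   # "HALT" -- a live contradiction always halts
```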


Epistemic fingerprint

Every belief system has a characteristic reasoning style. The fingerprint measures it:

  Epistemic Fingerprint: cathedral
  ========================================================
  Claims: 12   Sources: 31   Avg sources/claim: 2.6

  Source composition:
    EMPIRICAL      [################........] 65%
    AUTHORITY      [########................] 32%

  Confidence profile:
    Average        [##################......] 0.87
    Fragility      [####....................] 0.18
    Overconfident  [##......................] 8% of claims

  Epistemic health:
    Rigor score    [################........] 0.68
    Calibration    [####################....] 0.84
    Overall        [##################......] 0.76

Two agents with the same beliefs but different fingerprints are different kinds of reasoners.


Install

pip install veritas
# For semantic contradiction detection (quoted so zsh doesn't expand the brackets):
pip install "veritas[semantic]"

15 CLI commands + a Python API. Full docs on GitHub.


Connection to Cathedral

Cathedral gives AI agents persistent memory across sessions. Veritas is the reasoning layer that sits on top: Cathedral stores what an agent remembers, Veritas tracks how well those memories hold up.

Together: an agent knows its history and knows how much to trust it.


Veritas is MIT licensed and genuinely open. I'd be interested in feedback — especially on the threshold tuning for semantic contradiction detection and the inference type behavior. Both are empirically set and could be better.
