correctover

Posted on Jul 1

Introducing correctover-patronus: 6-Dimensional Verification for Patronus AI

#ai #llm #verification #opensource

The Problem

LLM evaluation tools like Patronus AI excel at hallucination detection, toxicity checks, and semantic relevance. But they don't catch the structural failures:

A JSON response missing required fields
A function call with malformed parameters
Output that violates schema constraints
Latency budget overruns silently degrading UX
Cost explosions from runaway token usage

These aren't hallucinations. They're verification failures.
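
To make the distinction concrete, here is a minimal sketch of a deterministic structural check that uses only the Python standard library. The function name `find_structural_failures` and the field names are illustrative assumptions, not part of Correctover's API.

```python
import json


def find_structural_failures(raw_output, required_fields):
    """Return a list of structural problems found in a model's raw output."""
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent from the object.
    return [
        f"missing required field: {name}"
        for name in required_fields
        if name not in payload
    ]


# A fluent, factually correct answer can still fail structurally.
print(find_structural_failures('{"title": "Q3 report"}', ["title", "summary"]))
# -> ['missing required field: summary']
```

The same input always yields the same list of failures, and that determinism is what separates this kind of check from a model-graded evaluation.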

The Solution

correctover-patronus is an adapter that runs Correctover's 87 deterministic verification rules as native Patronus evaluators. Every verdict comes with a recomputable proof hash, so you can re-run the check and verify the verifier.

pip install correctover-patronus

The 6 Dimensions

Dimension	What It Checks	Example
Structure	Output format validity	JSON parses correctly
Schema	Field presence & types	Required fields exist
Identity	Semantic relevance to input	Response addresses the question
Integrity	Forbidden pattern absence	No tracebacks or error messages
Latency	Response time budget	Under 30s threshold
Cost	Token usage budget	Under 10k token limit
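
As a rough illustration of the last three rows, the sketch below implements toy versions of the Integrity, Latency, and Cost checks. The forbidden patterns and function names are assumptions made for this example, since the actual rule set is not shown here. The default thresholds mirror the table's examples (30s and 10k tokens).

```python
import re

# Hypothetical forbidden patterns, for illustration only.
FORBIDDEN_PATTERNS = [
    re.compile(r"Traceback \(most recent call last\)"),
    re.compile(r"\bInternal Server Error\b"),
]


def check_integrity(output):
    """Pass when no forbidden pattern appears in the output."""
    return not any(p.search(output) for p in FORBIDDEN_PATTERNS)


def check_latency(elapsed_ms, max_ms=30_000):
    """Pass when the response time is under the latency budget."""
    return elapsed_ms < max_ms


def check_cost(tokens_used, max_tokens=10_000):
    """Pass when token usage is under the cost budget."""
    return tokens_used < max_tokens


report = {
    "integrity": check_integrity("Result: 42"),
    "latency": check_latency(elapsed_ms=1200),
    "cost": check_cost(tokens_used=12_500),
}
print(report)
# -> {'integrity': True, 'latency': True, 'cost': False}
```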

Usage

Full 6-Dimension Verification

from correctover_patronus import CorrectoverEvaluator, CorrectoverConfig

config = CorrectoverConfig(
    min_confidence=0.7,
    latency_rules={"max_ms": 5000},
    cost_rules={"max_tokens": 4000}
)

evaluator = CorrectoverEvaluator(config=config)
result = evaluator.evaluate(
    task_input="Summarize this article...",
    task_output="The article discusses...",
    task_context={"source": "article", "word_count": 1500}
)

print(f"Overall: {result.score:.2f} ({'PASS' if result.pass_ else 'FAIL'})")
print(f"Proof hash: {result.metadata['proof_hash']}")
for dim, info in result.metadata['dimensions'].items():
    print(f"  {dim}: {info['status']} (score={info['score']:.2f})")

Individual Dimensions

from correctover_patronus import correctover_structure, correctover_integrity

# Check if output is valid JSON
is_valid = correctover_structure(task_output='{"key": "value"}')

# Check for error patterns
is_clean = correctover_integrity(task_output="Result: 42")

Patronus Experiments Integration

import patronus
from correctover_patronus import correctover_full

# Use in Patronus experiments for systematic benchmarking.
# my_dataset is a placeholder for your own evaluation dataset.
results = patronus.evaluate(
    evaluators=[correctover_full],
    dataset=my_dataset,
    experiment_name="correctover-benchmark"
)

Recomputable Proof

Every evaluation produces a proof_hash in the metadata. This hash covers:

The input text
The output text
The verification rules applied
The verdict for each dimension

You can re-run the same verification and get the same hash. No black boxes.
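
One way such a hash could be built is sketched below: serialize the four covered items into a canonical JSON string and hash it with SHA-256. This is an assumption about the general technique, not Correctover's actual scheme. The function `proof_hash` and the rule IDs are hypothetical.

```python
import hashlib
import json


def proof_hash(task_input, task_output, rule_ids, verdicts):
    """Hash the input, output, applied rules, and per-dimension verdicts."""
    record = {
        "input": task_input,
        "output": task_output,
        "rules": sorted(rule_ids),  # order-independent rule list
        "verdicts": verdicts,
    }
    # sort_keys plus fixed separators give a byte-stable serialization.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = proof_hash(
    "Summarize this article...",
    "The article discusses...",
    ["S-001", "I-004"],  # hypothetical rule IDs
    {"structure": "pass", "integrity": "pass"},
)
# Same content, different ordering: the hash is unchanged.
second = proof_hash(
    "Summarize this article...",
    "The article discusses...",
    ["I-004", "S-001"],
    {"integrity": "pass", "structure": "pass"},
)
# Flipping one verdict changes the hash.
tampered = proof_hash(
    "Summarize this article...",
    "The article discusses...",
    ["S-001", "I-004"],
    {"structure": "pass", "integrity": "fail"},
)
print(first == second, first == tampered)
# -> True False
```

Canonical serialization matters here: without sorted keys and fixed separators, two logically identical runs could produce different bytes and therefore different hashes.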

Architecture

┌─────────────────┐     ┌───────────────────────┐
│   Patronus AI   │────>│  correctover-patronus │
│   Framework     │     │  (this adapter)       │
└─────────────────┘     └──────────┬────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │     Correctover SDK         │
                    │  (87 rules, 6 dimensions)   │
                    │  P50 verification: 22μs     │
                    └──────────────┬──────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │   Verification Request      │
                    │   -> Verdict + Proof Hash   │
                    │   -> Metadata + Tags        │
                    └─────────────────────────────┘

Performance

P50 verification latency: 22μs
Self-healing rules: 87
SDK size: 586KB
Zero external API calls — fully deterministic, local execution
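
If you want to sanity-check a latency figure like this on your own machine, a median over many timed runs is a reasonable starting point. The sketch below times a stand-in JSON validity check, not the Correctover SDK itself. The helper names `p50_latency_us` and `is_valid_json` are assumptions for this example, and results will vary with hardware and Python version.

```python
import json
import statistics
import time


def p50_latency_us(check, sample, runs=10_000):
    """Return the median (P50) latency of check(sample) in microseconds."""
    timings_ns = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        check(sample)
        timings_ns.append(time.perf_counter_ns() - start)
    return statistics.median(timings_ns) / 1_000


def is_valid_json(text):
    """Stand-in check: does the text parse as JSON?"""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


sample = '{"key": "value"}'
print(f"P50: {p50_latency_us(is_valid_json, sample):.2f} μs")
```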

DEV Community