The model returned JSON. It is missing a required field. Your code crashes. Or the model returned a URL that points to a different domain than expected. Your redirect handler follows it. Or the model returned a number that is outside the valid range. Your downstream system processes an invalid value.
LLM outputs can be wrong in many ways. llm-output-validator lets you define rules for what "correct" looks like and reject bad responses before they propagate.
The Shape of the Fix
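```python
import json

from llm_output_validator import OutputValidator, Rule, ValidationError

validator = OutputValidator([
    Rule.required_fields(["action", "reason", "confidence"]),
    # JSON `1` parses as int, so accept both numeric types
    Rule.field_type("confidence", (int, float)),
    Rule.field_range("confidence", min=0.0, max=1.0),
    Rule.field_in("action", ["approve", "reject", "escalate"]),
    Rule.max_length("reason", 500),
])


def get_decision(prompt: str) -> dict:  # illustrative wrapper name
    # llm_call is your own function: it sends a prompt and returns the raw text
    response_text = llm_call(prompt)
    try:
        parsed = json.loads(response_text)
        validator.validate(parsed)
        return parsed  # Valid
    except ValidationError as e:
        # Retry once with error context
        retry_response = llm_call(
            prompt + f"\n\nYour previous response failed validation: {e.message}. Please fix it."
        )
        retry_parsed = json.loads(retry_response)
        validator.validate(retry_parsed)  # The retry can fail too; raises if still invalid
        return retry_parsed
```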
Define rules once. Validate every response. On failure, retry with the specific error message so the model can self-correct.
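The single retry above generalizes to a bounded loop. Here is a minimal sketch of that idea, using only the standard library. The names `call_with_retries`, `retry_on`, and `fake_llm` are made up for illustration and are not part of llm-output-validator; in real use you would pass `validator.validate` as `validate` and add the library's `ValidationError` to `retry_on`.

```python
import json
from typing import Callable


def call_with_retries(
    llm_call: Callable[[str], str],
    prompt: str,
    validate: Callable[[dict], None],
    max_attempts: int = 3,
    retry_on: tuple = (ValueError,),  # json.JSONDecodeError is a ValueError subclass
) -> dict:
    """Call the model until its output parses and validates, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = llm_call(attempt_prompt)
        try:
            parsed = json.loads(raw)
            validate(parsed)
            return parsed
        except retry_on as e:
            last_error = e
            # Feed the specific error back so the model can self-correct
            attempt_prompt = (
                f"{prompt}\n\nYour previous response failed validation: {e}. Please fix it."
            )
    raise last_error


# Toy stand-ins so the sketch runs without a real model
responses = iter(['{"action": "approve"}', '{"action": "approve", "confidence": 0.8}'])
prompts_seen = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(responses)


def needs_confidence(data: dict) -> None:
    if "confidence" not in data:
        raise ValueError("[confidence] Required field 'confidence' is missing")


result = call_with_retries(fake_llm, "Decide.", needs_confidence)
```

Each retry rebuilds the prompt from the original plus the latest error, so error messages do not pile up across attempts.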
What It Does NOT Do
llm-output-validator does not parse JSON or structured output from raw text. It validates a Python dict (already parsed). For structured output extraction from raw LLM text, use a JSON parser or the model's structured output mode, then validate the parsed result.
It does not validate semantic correctness. Rules cover structural constraints: field presence, type, range, length, enum membership. "Is this answer factually correct?" is not a structural constraint; it requires a separate evaluation pass.
It does not auto-correct. When validation fails, it raises ValidationError with a message. You decide how to handle it: retry, return an error to the user, or fall back to a default value.
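The three boundaries above (parsing is yours, validation is structural, handling is yours) can be sketched in a few lines. This is an illustration under assumptions: `parse_llm_json`, `decide`, and `require_action` are hypothetical names, the fence-stripping heuristic is just one common approach, and the `ValidationError` class is a local stand-in so the snippet runs without the library installed.

```python
import json
import re
from typing import Any, Callable

FENCE = "`" * 3  # Markdown code fence marker


class ValidationError(Exception):
    """Local stand-in mirroring the library's exception."""

    def __init__(self, field: str, message: str, value: Any = None):
        self.field, self.message, self.value = field, message, value
        super().__init__(f"[{field}] {message}")


def parse_llm_json(text: str) -> dict:
    """Hypothetical helper: extract a JSON object from raw model text."""
    # Models often wrap JSON in a Markdown fence; unwrap it if present
    match = re.search(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text.strip()
    parsed = json.loads(payload)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(parsed, dict):
        raise ValueError(f"Expected a JSON object, got {type(parsed).__name__}")
    return parsed


def decide(raw: str, validate: Callable[[dict], None], fallback: dict) -> dict:
    """Parse, then validate, then pick a failure strategy (here: a default value)."""
    try:
        data = parse_llm_json(raw)
        validate(data)
        return data
    except (ValueError, ValidationError):
        # json.JSONDecodeError is a ValueError subclass, so bad JSON lands here too
        return fallback


def require_action(data: dict) -> None:
    # Toy check standing in for validator.validate
    if "action" not in data:
        raise ValidationError("action", "Required field 'action' is missing")


FALLBACK = {"action": "escalate", "reason": "Automated evaluation failed"}
ok = decide(FENCE + 'json\n{"action": "approve"}\n' + FENCE, require_action, FALLBACK)
bad_json = decide("Sure! I approve.", require_action, FALLBACK)
missing_field = decide('{"reason": "n/a"}', require_action, FALLBACK)
```

Falling back to a default is only one of the options; the same `except` branch is where a retry or a user-facing error would go.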
Inside the Library
Rules are callables that receive the dict and raise ValidationError on failure:
```python
from typing import Any, Callable


class ValidationError(Exception):
    def __init__(self, field: str, message: str, value: Any = None):
        self.field = field
        self.message = message
        self.value = value
        super().__init__(f"[{field}] {message}")


class Rule:
    @staticmethod
    def required_fields(fields: list[str]) -> Callable:
        def check(data: dict) -> None:
            for f in fields:
                if f not in data:
                    raise ValidationError(f, f"Required field '{f}' is missing")
        return check

    @staticmethod
    def field_type(field: str, expected_type: type) -> Callable:
        def check(data: dict) -> None:
            if field not in data:
                return  # Let required_fields handle missing
            if not isinstance(data[field], expected_type):
                raise ValidationError(
                    field,
                    f"Expected {expected_type.__name__}, got {type(data[field]).__name__}",
                    data[field],
                )
        return check

    @staticmethod
    def field_range(field: str, min: float | None = None, max: float | None = None) -> Callable:
        def check(data: dict) -> None:
            if field not in data:
                return
            v = data[field]
            if min is not None and v < min:
                raise ValidationError(field, f"Value {v} is below minimum {min}", v)
            if max is not None and v > max:
                raise ValidationError(field, f"Value {v} exceeds maximum {max}", v)
        return check

    @staticmethod
    def field_in(field: str, allowed: list) -> Callable:
        def check(data: dict) -> None:
            if field not in data:
                return
            if data[field] not in allowed:
                raise ValidationError(
                    field,
                    f"Value '{data[field]}' not in allowed values: {allowed}",
                    data[field],
                )
        return check


class OutputValidator:
    def __init__(self, rules: list[Callable]):
        self._rules = rules

    def validate(self, data: dict) -> None:
        for rule in self._rules:
            rule(data)

    def is_valid(self, data: dict) -> bool:
        try:
            self.validate(data)
            return True
        except ValidationError:
            return False
```
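The examples in this post also call `Rule.max_length`, which the excerpt above does not show. Here is a rough sketch of how such a rule could look, assuming it follows the same factory pattern as the other rules. The standalone `max_length` function, its error wording, and the local `ValidationError` stand-in are assumptions for illustration, not the library's actual code.

```python
from typing import Any, Callable


class ValidationError(Exception):
    """Local stand-in mirroring the class shown above, so this runs on its own."""

    def __init__(self, field: str, message: str, value: Any = None):
        self.field, self.message, self.value = field, message, value
        super().__init__(f"[{field}] {message}")


def max_length(field: str, limit: int) -> Callable[[dict], None]:
    """Build a rule that rejects values longer than `limit` (assumed behavior)."""
    def check(data: dict) -> None:
        if field not in data:
            return  # Missing fields are left to required_fields
        value = data[field]
        # Assumes the value supports len(), e.g. a str or a list
        if len(value) > limit:
            raise ValidationError(
                field, f"Length {len(value)} exceeds maximum {limit}", value
            )
    return check


check_reason = max_length("reason", 5)
check_reason({"reason": "ok"})  # within the limit: returns None
check_reason({"other": "field absent, so nothing to check"})

try:
    check_reason({"reason": "far too long"})
    error = None
except ValidationError as e:
    error = e
```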
The rules are composable. Each rule is a callable that takes a dict and either returns None (valid) or raises ValidationError. You can add custom rules as plain functions. A lambda works too, but a lambda cannot contain a raise statement, so a named function is usually clearer:
```python
# Custom rule: single-field check
def check_reason_length(data: dict) -> None:
    if len(data.get("reason", "")) <= 10:
        raise ValidationError("reason", "Too short")


# Custom rule: cross-field check
def check_action_reason_consistency(data: dict) -> None:
    if data.get("action") == "reject" and not data.get("rejection_code"):
        raise ValidationError("rejection_code", "Required when action is 'reject'")


validator = OutputValidator([
    Rule.required_fields(["action", "reason"]),
    check_reason_length,
    check_action_reason_consistency,
])
```
When to Use It
Use it when your agent produces structured output (JSON dicts) that downstream code depends on. If missing or invalid fields cause crashes or silent bad behavior, validation before propagation is essential.
Use it for multi-step pipelines where step N's output is step N+1's input. Catch validation failures at the boundary between steps rather than letting bad data propagate to where the failure is harder to diagnose.
Use it with llm-structured-retry for automatic retry-on-failure. When validation fails, llm-structured-retry injects the error message as a follow-up user message and retries the call. The validator and the retry logic are separate concerns.
Skip it for free-form text outputs where you cannot define structural rules. If the model returns prose, rule-based validation does not apply.
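For the multi-step case described above, here is one way validating at each boundary could look. This is a sketch, not a library feature: `run_pipeline`, `StepValidationError`, and the toy `needs` check are hypothetical names, and in practice each `validate` slot would hold an `OutputValidator`'s `validate` method.

```python
from typing import Callable

# (step name, step function, validator for that step's output)
Step = tuple[str, Callable[[dict], dict], Callable[[dict], None]]


class StepValidationError(Exception):
    """Records which step produced the invalid output."""

    def __init__(self, step: str, cause: Exception):
        self.step = step
        self.cause = cause
        super().__init__(f"step '{step}' produced invalid output: {cause}")


def run_pipeline(steps: list[Step], data: dict) -> dict:
    for name, run, validate in steps:
        data = run(data)
        try:
            validate(data)  # Check at the boundary, before the next step sees it
        except Exception as exc:
            raise StepValidationError(name, exc) from exc
    return data


def needs(field: str) -> Callable[[dict], None]:
    # Toy validator: only checks that one field is present
    def check(data: dict) -> None:
        if field not in data:
            raise ValueError(f"Required field '{field}' is missing")
    return check


steps = [
    ("classify", lambda d: {**d, "label": "spam"}, needs("label")),
    ("score", lambda d: {"label": d["label"]}, needs("score")),  # forgets "score"
]

try:
    run_pipeline(steps, {"text": "hello"})
    failed_step = None
except StepValidationError as e:
    failed_step = e.step
```

The failure names the step that produced the bad data, instead of surfacing later as a KeyError somewhere downstream.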
Install
```shell
pip install git+https://github.com/MukundaKatta/llm-output-validator
# Or from PyPI
pip install llm-output-validator
```
A complete example, combined with llm-structured-retry:
```python
import json
import logging

from llm_output_validator import OutputValidator, Rule, ValidationError
from llm_structured_retry import StructuredRetry

logger = logging.getLogger(__name__)

validator = OutputValidator([
    Rule.required_fields(["decision", "confidence", "rationale"]),
    Rule.field_type("confidence", (int, float)),
    Rule.field_range("confidence", min=0, max=100),
    Rule.field_in("decision", ["APPROVE", "DENY", "MANUAL_REVIEW"]),
    Rule.max_length("rationale", 1000),
])

retry = StructuredRetry(validator=validator, max_attempts=3)


def evaluate_application(application: dict) -> dict:
    # format_evaluation_prompt and call_llm are your own application functions
    prompt = format_evaluation_prompt(application)
    try:
        return retry.call(
            fn=call_llm,
            prompt=prompt,
            parse_fn=json.loads,
        )
    except ValidationError as e:
        # All retries exhausted
        logger.error("validation_failed field=%s message=%s", e.field, e.message)
        return {"decision": "MANUAL_REVIEW", "confidence": 0, "rationale": "Automated evaluation failed"}
```
Sibling Libraries
| Library | What it solves |
|---|---|
| llm-structured-retry | Retry with error injected as follow-up message |
| tool-result-validator | Validate tool output against schema |
| agent-guard-rails | Composable output filters for free-form text |
| prompt-eval-rubric | Score free-form responses against weighted criteria |
| llm-json-repair | Repair malformed JSON before validation |
The output quality stack: llm-json-repair for parsing, llm-output-validator for structural validation, llm-structured-retry for automatic retry, agent-guard-rails for content filters.
What's Next
Nested field validation: Rule.nested("settings", sub_rules=[...]) that validates fields inside a nested object. Currently only top-level field validation is supported.
Validation report: instead of raising on the first error, validator.validate_all(data) that returns a list of all ValidationError instances. Useful when you want to present all problems at once rather than iterating through them one at a time.
Schema-driven rules: OutputValidator.from_json_schema(schema) that auto-generates rules from a JSON Schema object. Reads required, properties.type, properties.enum, properties.minimum, properties.maximum, properties.maxLength. Reduces rule-writing to "provide the schema."
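The last two ideas fit together naturally, so here is a speculative sketch of both: building one check per schema constraint, then running all of them and collecting the failures. None of this is the library's planned implementation. `checks_from_schema` and the standalone `validate_all` are hypothetical names, only the schema keywords listed above are handled, and `ValidationError` is a local stand-in so the snippet runs on its own.

```python
from typing import Any, Callable

Check = Callable[[dict], None]


class ValidationError(Exception):
    """Local stand-in for the library's exception."""

    def __init__(self, field: str, message: str, value: Any = None):
        self.field, self.message, self.value = field, message, value
        super().__init__(f"[{field}] {message}")


# Subset of JSON Schema type names mapped to Python types (an assumption of this sketch).
# bool is a subclass of int in Python, so True passes "integer" and "number" here.
JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def checks_from_schema(schema: dict) -> list[Check]:
    """Turn required/type/enum/minimum/maximum/maxLength into one check each."""
    checks: list[Check] = []

    def add(field: str, is_ok: Callable[[Any], bool], message: str) -> None:
        def check(data: dict) -> None:
            # Absent fields are skipped; presence belongs to the "required" checks
            if field in data and not is_ok(data[field]):
                raise ValidationError(field, message, data[field])
        checks.append(check)

    for name in schema.get("required", []):
        def check_present(data: dict, name: str = name) -> None:
            if name not in data:
                raise ValidationError(name, f"Required field '{name}' is missing")
        checks.append(check_present)

    for field, spec in schema.get("properties", {}).items():
        type_name = spec.get("type")
        if isinstance(type_name, str) and type_name in JSON_TYPES:
            expected = JSON_TYPES[type_name]
            add(field, lambda v, t=expected: isinstance(v, t), f"Expected {type_name}")
        if "enum" in spec:
            add(field, lambda v, allowed=spec["enum"]: v in allowed,
                f"Not in allowed values: {spec['enum']}")
        # As in JSON Schema, numeric keywords ignore non-numbers, maxLength non-strings
        if "minimum" in spec:
            add(field, lambda v, lo=spec["minimum"]: not isinstance(v, (int, float)) or v >= lo,
                f"Below minimum {spec['minimum']}")
        if "maximum" in spec:
            add(field, lambda v, hi=spec["maximum"]: not isinstance(v, (int, float)) or v <= hi,
                f"Exceeds maximum {spec['maximum']}")
        if "maxLength" in spec:
            add(field, lambda v, n=spec["maxLength"]: not isinstance(v, str) or len(v) <= n,
                f"Longer than {spec['maxLength']} characters")
    return checks


def validate_all(checks: list[Check], data: dict) -> list[ValidationError]:
    """Run every check and collect the failures instead of stopping at the first."""
    errors: list[ValidationError] = []
    for check in checks:
        try:
            check(data)
        except ValidationError as e:
            errors.append(e)
    return errors


schema = {
    "required": ["decision", "confidence"],
    "properties": {
        "decision": {"type": "string", "enum": ["APPROVE", "DENY"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 100},
    },
}
checks = checks_from_schema(schema)
errors = validate_all(checks, {"decision": "MAYBE", "confidence": 150})
failed_fields = [e.field for e in errors]
```

Splitting each constraint into its own check is what lets the report list several problems for one response; a single combined check per field would stop at that field's first failure.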
Built as part of the agent-stack family: composable Python primitives for production LLM agents.