
Akhona Eland

Your AI Guardrail Is a Dead End. Ours Is a Feedback Loop.


Every AI guardrail on the market does the same thing: check the output, pass or fail, move on. The failure data — the most valuable signal your system produces — gets thrown away.

Think about that. Every time your LLM generates something wrong, gets corrected, and produces something right, you're witnessing a training example being created and destroyed in the same breath. Thousands of correction pairs, generated organically from your actual production traffic, evaporating into logs nobody reads.

Semantix v0.1.7 stops the evaporation.


The Insight Nobody Acted On

Here's what happens inside a self-healing validation loop:

  1. Your LLM generates an output
  2. A judge evaluates it against the business intent
  3. It fails — score 0.23, reason: "too aggressive"
  4. The system feeds structured feedback back to the LLM
  5. The LLM generates a corrected output
  6. It passes — score 0.94

Steps 3-6 just produced a perfect fine-tuning example: a rejected output, a reason for rejection, and an accepted correction. This is exactly the data format that RLHF, DPO, and supervised fine-tuning consume.
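The control flow above can be sketched in a few lines. This is a hypothetical illustration, not Semantix's actual internals — `generate`, `judge`, and the `0.8` threshold are stand-ins:

```python
def self_heal(generate, judge, retries=2, threshold=0.8):
    """Run generate/judge until the output passes, keeping the correction pair."""
    feedback = None
    rejected = None  # the last failing (output, score, reason) triple
    for attempt in range(retries + 1):
        output = generate(feedback)
        score, reason = judge(output)
        if score >= threshold:
            # A failure followed by a pass is exactly one training example
            pair = {"rejected": rejected, "accepted": (output, score)} if rejected else None
            return output, pair
        rejected = (output, score, reason)
        feedback = f"Attempt {attempt + 1} failed (score {score}): {reason}"
    raise ValueError("validation failed after retries")
```

The `pair` returned on success is the rejected/accepted record that most guardrail systems compute and then discard.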

Every guardrail system with retry logic produces this data. None of them capture it.

Until now.


The Training Collector

Semantix v0.1.7 introduces the TrainingCollector — an opt-in component that captures correction pairs during self-healing retries and writes them to an append-only JSONL file.

```python
from semantix import validate_intent, Intent
from semantix.training import TrainingCollector

collector = TrainingCollector("training_data.jsonl")

class ProfessionalDecline(Intent):
    """The text must politely decline an invitation without being rude."""

@validate_intent(retries=2, collector=collector)
def decline(event: str) -> ProfessionalDecline:
    return call_my_llm(event)  # your existing LLM call
```

That's it. Every time a retry succeeds after a failure, the collector appends:

```json
{
  "intent": "ProfessionalDecline",
  "intent_description": "The text must politely decline an invitation without being rude.",
  "rejected_output": "I'd rather gouge my eyes out than attend your event.",
  "rejected_score": 0.23,
  "rejected_reason": "Too aggressive, contains violent imagery",
  "accepted_output": "Thank you for the invitation, but I'm unable to attend.",
  "accepted_score": 0.94,
  "feedback": "## Semantix Self-Healing Feedback\n\nAttempt 1 failed...",
  "attempts": 2,
  "timestamp": "2026-04-10T12:00:00Z"
}
```

No infrastructure. No database. No configuration. One file, growing one line at a time, containing the exact data you need to make your model smarter.
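Because each record already holds a rejected and an accepted output, turning the file into DPO-style preference pairs takes only the standard library. A sketch, assuming the record fields shown above:

```python
import json

def to_dpo_pairs(path):
    """Map Semantix correction records to DPO-style preference pairs."""
    pairs = []
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            pairs.append({
                "prompt": rec["intent_description"],
                "chosen": rec["accepted_output"],
                "rejected": rec["rejected_output"],
            })
    return pairs
```

The resulting list of `prompt`/`chosen`/`rejected` dicts is the shape most DPO trainers expect as input.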


From Guardrail to Flywheel

Here's where it gets interesting.

The collector exports directly to OpenAI fine-tuning format:

```python
from semantix.training.exporters import export_openai

export_openai("training_data.jsonl", "finetune.jsonl")
```

Each correction pair becomes a chat completion training example:

```json
{
  "messages": [
    {"role": "system", "content": "You must satisfy the following requirement:\n\nThe text must politely decline an invitation without being rude."},
    {"role": "user", "content": "Generate a response that satisfies the above requirement."},
    {"role": "assistant", "content": "Thank you for the invitation, but I'm unable to attend."}
  ]
}
```
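The shape of that transformation is simple to write down. A simplified sketch of the mapping — not the library's implementation — assuming the correction-record fields shown earlier:

```python
def to_chat_example(record):
    """Convert one correction pair into an OpenAI chat fine-tuning example."""
    return {
        "messages": [
            {"role": "system",
             "content": "You must satisfy the following requirement:\n\n"
                        + record["intent_description"]},
            {"role": "user",
             "content": "Generate a response that satisfies the above requirement."},
            {"role": "assistant", "content": record["accepted_output"]},
        ]
    }
```

Note that only the accepted output becomes the assistant turn; the rejected output is deliberately left out of supervised fine-tuning data.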

Upload it via the OpenAI fine-tuning API (`fine_tuning.jobs.create`). Wait. Deploy the fine-tuned model. Watch your failure rate drop.

Then the fine-tuned model runs through semantix again. It fails less. But when it does fail, those new correction pairs are captured too. The model gets fine-tuned again. Fails even less.

This is the flywheel:

```
Validate → Fail → Correct → Capture → Fine-tune → Validate (fewer failures)
    ↑                                                          |
    └──────────────────────────────────────────────────────────┘
```

Every other guardrail is a wall. Semantix is a ramp.


Also in v0.1.7: Framework Integrations

We shipped native adapters for three of the most popular structured-output frameworks. Semantix now drops into your existing stack with one line:

Instructor

```python
from pydantic import BaseModel

from semantix.integrations.instructor import SemanticStr

class Response(BaseModel):
    reply: SemanticStr["must be polite and professional", 0.85]
```

Pydantic AI

```python
from pydantic_ai import Agent

from semantix.integrations.pydantic_ai import semantix_validator

# Polite is a semantix Intent defined elsewhere in your code
agent = Agent("openai:gpt-4o", output_type=str)
agent.output_validator(semantix_validator(Polite))
```

LangChain

```python
from langchain_core.output_parsers import StrOutputParser

from semantix.integrations.langchain import SemanticValidator

# prompt and llm are your existing LangChain runnables
chain = prompt | llm | StrOutputParser() | SemanticValidator(Polite)
```

Each adapter translates a semantix verdict into the framework's native retry mechanism. Instructor gets ValueError, Pydantic AI gets ModelRetry, LangChain gets OutputParserException. Your framework handles retries. Semantix handles meaning.
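An adapter of this kind is small enough to sketch: take a verdict with a score and a reason, and raise the host framework's retry exception on failure. The names below (`Verdict`, `instructor_validator`) are hypothetical stand-ins for illustration, not the actual `semantix.integrations` API:

```python
class Verdict:
    """Minimal stand-in for a semantix judge verdict."""
    def __init__(self, score, reason, threshold=0.8):
        self.score = score
        self.reason = reason
        self.passed = score >= threshold

def instructor_validator(verdict, value):
    """Raise ValueError on failure so Instructor's retry loop re-prompts."""
    if not verdict.passed:
        raise ValueError(
            f"semantic check failed ({verdict.score:.2f}): {verdict.reason}"
        )
    return value
```

Swapping `ValueError` for `ModelRetry` or `OutputParserException` gives you the other two adapters; the score-and-reason logic stays identical.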


The Numbers

| Metric | Value |
|--------|-------|
| Total test coverage | 166 tests |
| New integration adapters | 3 (Instructor, Pydantic AI, LangChain) |
| Training data formats | 2 (OpenAI JSONL, Generic JSONL) |
| New dependencies | 0 (training collector is pure Python) |
| Lines of code per adapter | ~70 |

What This Means

There are two kinds of AI infrastructure. The kind that checks your work and the kind that makes you better at it.

Every guardrail, every validator, every content filter in production today is the first kind. They're necessary. They're valuable. And they're a dead end — a static gate that never learns from what it catches.

The training collector turns semantix into the second kind. Your guardrail becomes your training pipeline. Your failures become your curriculum. Your production traffic becomes your fine-tuning dataset.

The model that runs through semantix for a month isn't the same model that started. It's measurably better, and its judge scores and failure rates are logged at every step to prove it. And it got there without a single human labeling a single example.

That's not a guardrail. That's a flywheel.


Get Started

```shell
pip install 'semantix-ai[all]'
```

```python
from semantix import validate_intent, Intent
from semantix.training import TrainingCollector

class MyIntent(Intent):
    """Describe your requirement here."""

# Start collecting training data in two lines
collector = TrainingCollector("my_training_data.jsonl")

@validate_intent(retries=2, collector=collector)
def my_llm_function(prompt: str) -> MyIntent:
    return call_my_llm(prompt)  # your existing LLM call
```

PyPI: pypi.org/project/semantix-ai/0.1.7

Repository: github.com/labrat-akhona/semantix-ai

Star the repo. Install the package. Start the flywheel.


Built by Akhona Eland in South Africa. 166 tests. Zero new dependencies. Your failures are now your curriculum.
