How I built an intent drift detector for LLM agents

sijan gautam — Sat, 06 Jun 2026 12:17:02 +0000

The Problem

AI agents fail silently.

You give an agent a clear instruction:
"Refund user 123, $50 within 7 days"

The agent returns:
"User refunded $500 immediately"

No error. No warning. Just wrong output.

This is semantic drift: the LLM's output
diverges from the original intent.

What I Built

SIP (State Integrity Protocol) is a lightweight
Python SDK that detects and flags drift in
LLM outputs before they cause damage.

How It Works

from sip.middleware import SIPMiddlewarePipeline

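pipeline = SIPMiddlewarePipeline()
# Anchor the original intent that later outputs are checked against
pipeline.anchor("Refund user 123 $50")

# Check an agent output against the anchored intent
result = pipeline.run(
    output="Refund user 123 $500"
)

print(result.status)  # repair_required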

Three checks run automatically:

Semantic drift (TF-IDF + cosine similarity)
Intent alignment (sentence-transformers)
Numeric drift (catches $50 changed to $500)
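
To make two of these checks concrete, here is a minimal standard-library sketch of a TF-IDF cosine similarity and a numeric drift check. It is an illustration of the general techniques, not SIP's actual implementation: the function names (tokenize, tfidf_cosine, extract_numbers, numeric_drift), the smoothed IDF formula, and the number regex are all assumptions made for this example. The intent alignment check is left out because it needs a sentence-transformers model download.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_cosine(anchor, output):
    """Cosine similarity of TF-IDF vectors built from just these two texts.

    Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1, over a
    two-document corpus. A real detector may use a larger corpus.
    """
    docs = [tokenize(anchor), tokenize(output)]
    n_docs = len(docs)
    # Document frequency: in how many of the two texts each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    a, b = vectors

    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty text shares nothing with the other one.
    return dot / (norm_a * norm_b)


def extract_numbers(text):
    """Pull out integers and decimals, allowing thousands separators."""
    matches = re.findall(r"\d+(?:,\d{3})*(?:\.\d+)?", text)
    return [float(m.replace(",", "")) for m in matches]


def numeric_drift(anchor, output):
    """Return True when the two texts do not contain the same numbers."""
    return Counter(extract_numbers(anchor)) != Counter(extract_numbers(output))


if __name__ == "__main__":
    anchor = "Refund user 123 $50"
    output = "Refund user 123 $500"
    print(round(tfidf_cosine(anchor, output), 3))
    print(numeric_drift(anchor, output))
```

The two texts share three of their four tokens, so a purely lexical score does not drop to zero when only the amount changes. That is one reason to run a dedicated numeric comparison alongside the similarity checks.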

Real Test Results

Test	Status
Exact match	accepted
Same meaning, different words	accepted
Wrong output	repair_required
Numbers changed	repair_required
Injection attempt	repair_required
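
One way a caller might act on these statuses is to gate the downstream action on them. The sketch below is hypothetical and not part of SIP: DriftDetected and execute_if_accepted are names invented for this example, and only the two status strings from the table ("accepted" and "repair_required") are taken from the results above.

```python
class DriftDetected(Exception):
    """Hypothetical error: the checked output must not be acted on."""


def execute_if_accepted(status, action):
    """Run `action` only when the drift check reported "accepted".

    `status` is expected to be a status string such as result.status;
    anything other than "accepted" blocks the action.
    """
    if status == "accepted":
        return action()
    raise DriftDetected(f"output rejected with status {status!r}")


if __name__ == "__main__":
    # Stand-in for a real side effect such as issuing a refund.
    print(execute_if_accepted("accepted", lambda: "refund issued"))
    try:
        execute_if_accepted("repair_required", lambda: "refund issued")
    except DriftDetected as err:
        print(f"blocked: {err}")
```

Raising instead of returning a flag keeps a drifted output from silently reaching the side effect, which is the failure mode described at the top of this post.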

Install

pip install state-integrity-protocol

GitHub

github.com/sijan324/state-integrity-protocol

Looking for feedback from anyone building
LLM pipelines or AI agents.

What drift problems have you seen in production?