Vignesh Reddy

How I built narrative drift detection for LLM agent runs

Most LLM observability tools monitor
individual requests.

Few, if any, monitor whether a model's
position stays consistent across a
conversation.

That's the gap I closed today in Ajah.

The problem:

In a long agent run or multi-turn
conversation, a model can reverse its
position under social pressure — and
nothing flags it. Turn 2 says one thing.
Turn 8 says the opposite. Both responses
look perfectly normal in isolation.

For healthcare, legal, and financial
AI systems, this is a liability.

How narrative drift detection works:

  1. Every session turn stores up to 2000
    characters of response text in Redis

  2. When a new request comes in with a
    session ID, Ajah fetches the full
    session history and passes it to
    the scorer

  3. The scorer extracts factual claims
    from each turn — sentences containing
    proper nouns, numbers, or absolute
    statements

  4. Claims are embedded using
    sentence-transformers and compared
    across turns using cosine similarity

  5. High similarity + negation markers
    = contradiction signal

  6. A drift_risk score and a drift_verdict
    (stable / possible_drift / drift_detected)
    are returned with every scored response

  7. narrative_drift flag fires in the
    Warnings dashboard when drift_risk > 0.5
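
Here is a minimal sketch of steps 3 to 7. It is not Ajah's actual code. To keep it dependency-free, a bag-of-words vector stands in for the sentence-transformers embeddings, and the claim heuristics, the negation list, and the 0.25 possible_drift cutoff are assumptions. Only the verdict names and the 0.5 flag threshold come from the list above.

```python
import math
import re
from collections import Counter

# Illustrative assumptions, not Ajah's actual word lists or cutoffs.
NEGATION_MARKERS = {"not", "no", "never", "cannot"}
ABSOLUTE_WORDS = {"always", "never", "all", "none", "must", "cannot"}
POSSIBLE_DRIFT_CUTOFF = 0.25  # assumed band for possible_drift
DRIFT_FLAG_THRESHOLD = 0.5    # from the post: flag fires when drift_risk > 0.5


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_claims(turn_text):
    """Keep sentences with a number, a capitalized non-initial word, or an absolute word."""
    claims = []
    for sentence in re.split(r"(?<=[.!?])\s+", turn_text.strip()):
        words = sentence.split()
        has_number = any(ch.isdigit() for ch in sentence)
        has_proper_noun = any(w[0].isupper() for w in words[1:])
        has_absolute = any(t in ABSOLUTE_WORDS for t in tokenize(sentence))
        if has_number or has_proper_noun or has_absolute:
            claims.append(sentence)
    return claims


def embed(claim):
    # Stand-in for a sentence-transformers embedding: a token-count vector.
    return Counter(tokenize(claim))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def is_negated(claim):
    return any(t in NEGATION_MARKERS or t.endswith("n't") for t in tokenize(claim))


def score_session(turns):
    """Compare claims across turns; similar claims with opposite negation raise the risk."""
    claims_per_turn = [extract_claims(t) for t in turns]
    drift_risk = 0.0
    for i, earlier in enumerate(claims_per_turn):
        for later in claims_per_turn[i + 1:]:
            for a in earlier:
                for b in later:
                    if is_negated(a) != is_negated(b):
                        drift_risk = max(drift_risk, cosine(embed(a), embed(b)))
    if drift_risk > DRIFT_FLAG_THRESHOLD:
        verdict = "drift_detected"
    elif drift_risk > POSSIBLE_DRIFT_CUTOFF:
        verdict = "possible_drift"
    else:
        verdict = "stable"
    return {
        "drift_risk": drift_risk,
        "drift_verdict": verdict,
        "narrative_drift": drift_risk > DRIFT_FLAG_THRESHOLD,
    }


history = [
    "Ibuprofen is safe to take with Warfarin.",
    "Ibuprofen is not safe to take with Warfarin.",
]
result = score_session(history)
print(result["drift_verdict"])  # drift_detected
```

A token-count vector only catches near-verbatim reversals. A real embedding model would also match paraphrased claims, which is the point of step 4.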

Everything runs asynchronously, so
scoring adds no latency for your users.
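
One common way to get this behavior is to return the model's response right away and score it in a background task. The sketch below assumes asyncio, uses an in-memory dict in place of Redis, and uses a placeholder scorer. All function and variable names are hypothetical. Only the 2000-character cap comes from the steps above.

```python
import asyncio

MAX_TURN_CHARS = 2000  # from the post: up to 2000 characters stored per turn

# In-memory stand-ins for Redis and a results store (assumptions for this sketch).
SESSION_STORE: dict[str, list[str]] = {}
SCORES: dict[str, dict] = {}
_background: set[asyncio.Task] = set()


def score_history(history):
    """Placeholder scorer (hypothetical); a real one would compare claims across turns."""
    return {"turns_scored": len(history)}


async def score_in_background(session_id):
    history = list(SESSION_STORE[session_id])
    # Run the scorer in a worker thread so the event loop stays free.
    SCORES[session_id] = await asyncio.to_thread(score_history, history)


async def handle_response(session_id, response_text):
    # Store a truncated copy of the turn, then schedule scoring without awaiting it.
    SESSION_STORE.setdefault(session_id, []).append(response_text[:MAX_TURN_CHARS])
    task = asyncio.create_task(score_in_background(session_id))
    _background.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(_background.discard)
    return response_text  # returned before the score exists


async def demo():
    reply = await handle_response("session-1", "x" * 3000)
    await asyncio.gather(*_background)  # only the demo waits for scoring
    return reply, SESSION_STORE["session-1"], SCORES["session-1"]


reply, stored, score = asyncio.run(demo())
```

The caller gets the full reply back untouched, while the stored copy is capped and the score arrives later.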

MIT license. Self-hosted.

→ github.com/VigneshReddy-afk/ajah
→ useajah.com

#buildinpublic #llm #opensource #devtools
