DEV Community

The BookMaster

Is Your AI Agent Gaslighting Itself? How to Detect Memory Contradictions

The Memory Entropy Problem in AI Agents

AI agents are excellent at remembering things. The problem is that they often remember too much, including things that didn't happen or that happened differently.

When an agent's memory contains contradictory claims (e.g., "The deploy succeeded" in one log and "The deploy failed" in another), the agent enters a state of cognitive dissonance. It starts making brittle decisions based on whichever log entry it happens to retrieve first. We call this Memory Entropy.

I built the Agent Memory Contradiction Detector to solve this. It's a tool that scans agent memory logs, identifies conflicting claims using pattern matching and outcome analysis, and assigns a "Brittleness Score" to the memory system.

How It Works

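The detector scans your JSON memory stores for direct outcome contradictions and recommendation conflicts. It uses a series of regex patterns to extract claims and then cross-references them. The simplified excerpt below shows the outcome check for a single pair of memories, using plain substring matching: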

function detectContradiction(memoryA, memoryB) {
  const textA = (memoryA.content || '').toLowerCase();
  const textB = (memoryB.content || '').toLowerCase();

  // Keyword checks, applied to both memories so either ordering is caught
  const isSuccess = (text) =>
    text.includes('success') || text.includes('succeeded') || text.includes('worked');
  const isFailure = (text) => text.includes('fail'); // also matches 'failed' and 'failure'

  const successA = isSuccess(textA);
  const successB = isSuccess(textB);
  const failA = isFailure(textA);
  const failB = isFailure(textB);

  // Direct contradiction: one succeeded, other failed
  if ((successA && failB) || (failA && successB)) {
    return { contradicts: true, score: 1.0, reason: 'direct_outcome_contradiction' };
  }

  // It also detects conflicting recommendations for the same context
  // ...
}
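
To make the "extract claims, then cross-reference" step concrete, here is a minimal sketch of a pairwise scan over a whole memory list. It is not the tool's actual code: the regex patterns, the `extractOutcome` and `findContradictions` helpers, and the `id` and `subject` fields are all assumptions made for illustration (the post only shows a `content` field).

```javascript
// Hypothetical patterns; the tool's real regexes are not shown in this post.
// Word boundaries keep "unsuccessful" from counting as a success.
const OUTCOME_PATTERNS = {
  success: /\b(succeeded|successful(ly)?|success|worked)\b/,
  failure: /\b(fail(ed|s|ure)?|did not work|didn't work)\b/,
};

// Classify a memory as 'success', 'failure', or null when the text
// matches neither pattern or matches both (ambiguous).
function extractOutcome(memory) {
  const text = (memory.content || '').toLowerCase();
  const isSuccess = OUTCOME_PATTERNS.success.test(text);
  const isFailure = OUTCOME_PATTERNS.failure.test(text);
  if (isSuccess === isFailure) return null;
  return isSuccess ? 'success' : 'failure';
}

// Compare every pair of memories about the same subject (assumed field)
// and report the pairs whose extracted outcomes disagree.
function findContradictions(memories) {
  const claims = memories
    .map((memory) => ({ memory, outcome: extractOutcome(memory) }))
    .filter((claim) => claim.outcome !== null);

  const conflicts = [];
  for (let i = 0; i < claims.length; i++) {
    for (let j = i + 1; j < claims.length; j++) {
      const a = claims[i];
      const b = claims[j];
      if (a.memory.subject === b.memory.subject && a.outcome !== b.outcome) {
        conflicts.push({
          ids: [a.memory.id, b.memory.id],
          reason: 'direct_outcome_contradiction',
        });
      }
    }
  }
  return conflicts;
}

// Usage with made-up log entries
const sampleMemories = [
  { id: 'm1', subject: 'deploy', content: 'The deploy succeeded' },
  { id: 'm2', subject: 'deploy', content: 'The deploy failed' },
  { id: 'm3', subject: 'backup', content: 'Backup worked' },
];
console.log(findContradictions(sampleMemories)); // one conflict, ids ['m1', 'm2']
```

The pairwise loop is quadratic in the number of memories, so for large stores you would probably want to group by subject first and only compare within each group.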

The detector also flags cases where an agent recommended two different things for the same problem at different times without explanation, which helps you identify where the agent's reasoning is starting to drift.
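
One way a recommendation-conflict check could look is sketched below. This is an illustration under assumptions, not the tool's implementation: the `context`, `recommendation`, and numeric `timestamp` fields and the `findRecommendationConflicts` helper are hypothetical, and the sketch only checks whether recommendations differ, not whether an explanation was recorded.

```javascript
// Group memories by a normalized context key, then flag any context
// that has more than one distinct recommendation attached to it.
function findRecommendationConflicts(memories) {
  const byContext = new Map();
  for (const memory of memories) {
    if (!memory.context || !memory.recommendation) continue;
    const key = memory.context.trim().toLowerCase();
    if (!byContext.has(key)) byContext.set(key, []);
    byContext.get(key).push(memory);
  }

  const conflicts = [];
  for (const [context, group] of byContext) {
    const distinct = new Set(
      group.map((memory) => memory.recommendation.trim().toLowerCase())
    );
    if (distinct.size > 1) {
      // Order by time (assumed numeric timestamp) so the drift is easy to read
      const ordered = [...group].sort((a, b) => a.timestamp - b.timestamp);
      conflicts.push({
        context,
        reason: 'conflicting_recommendations',
        recommendations: ordered.map((memory) => memory.recommendation),
      });
    }
  }
  return conflicts;
}

// Usage with made-up entries
const sampleRecommendations = [
  { context: 'Slow queries', recommendation: 'increase cache size', timestamp: 2 },
  { context: 'slow queries', recommendation: 'add an index', timestamp: 1 },
  { context: 'flaky test', recommendation: 'retry', timestamp: 3 },
  { context: 'flaky test', recommendation: 'Retry', timestamp: 4 },
];
console.log(findRecommendationConflicts(sampleRecommendations));
// one conflict, for 'slow queries': ['add an index', 'increase cache size']
```

Exact string comparison after lowercasing is a crude notion of "different recommendation"; paraphrases of the same advice would be flagged as conflicts, so a real check would likely need something fuzzier.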

Why This Matters

As we move toward multi-agent systems where agents inherit context from each other, memory integrity becomes a security and reliability issue. A single "poisoned" or contradictory memory can propagate through the entire system, leading to cascading failures that are incredibly hard to debug.

By monitoring the "Brittleness Score" of your agent's memory, you can trigger automatic compactions or human interventions before the agent's performance degrades.
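
As a rough sketch of that monitoring loop: the post does not define how the Brittleness Score is computed, so the formula below (the fraction of memories involved in at least one contradiction) is purely an assumption, as are the threshold values, the action names, and the `ids` shape of each conflict.

```javascript
// ASSUMPTION: score = share of memories that appear in at least one conflict.
// Each conflict is expected to look like { ids: ['m1', 'm2'] } (hypothetical shape).
function brittlenessScore(totalMemories, conflicts) {
  if (totalMemories === 0) return 0;
  const involved = new Set(conflicts.flatMap((conflict) => conflict.ids));
  return involved.size / totalMemories;
}

// Hypothetical thresholds: compact the memory at 20%, escalate to a human at 50%.
function chooseAction(score, { compactAt = 0.2, escalateAt = 0.5 } = {}) {
  if (score >= escalateAt) return 'human_review';
  if (score >= compactAt) return 'compact';
  return 'ok';
}

// Usage: 10 memories, 3 of which (a, b, c) are caught in contradictions
const exampleScore = brittlenessScore(10, [
  { ids: ['a', 'b'] },
  { ids: ['b', 'c'] },
]);
console.log(exampleScore, chooseAction(exampleScore)); // 0.3 'compact'
```

Counting distinct memories rather than conflicting pairs keeps the score between 0 and 1, which makes fixed thresholds easier to reason about.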

Get the Tool

You can find the Agent Memory Contradiction Detector and other mission-critical agent tools (like Kill Switches and Financial Accountability engines) in the Bolt Marketplace.

Full catalog of my AI agent tools at https://thebookmaster.zo.space/bolt/market
