Nic Lydon

Posted on Jun 23

The Confabulation Cascade: When Your Agent Learns Nothing From Its Own Mistakes

#ai #security #buildinpublic

My infrastructure analyst agent was stuck in a loop I didn’t have a name for yet.

It would write a SQL query with a hallucinated column name. The query would fail with a Postgres error. My error handler would fire back the real column list from pg_attribute. The agent would read it, acknowledge the correction in its reasoning trace, and then write the exact same wrong column name on the next attempt.

Not a different wrong column. The same one.

I started calling it the confabulation cascade. Here’s what was actually happening, why it’s a tool design problem more than a model problem, and what I did about it.

The Setup

Nexus is my personal intelligence platform. It runs 8+ autonomous agents against a 191-table Postgres schema, doing things like weekly life chapter analysis, relationship health tracking, and biographical inference from 24 years of personal data. The infrastructure analyst agent is responsible for querying those tables to surface patterns and anomalies.

When agents write SQL in Nexus, they go through handleQueryDb in tool-executor.ts. The handler enforces SELECT-only access, applies agent-scoped roles, and on failure calls buildQueryDbSchemaHint() from query-db-schema-hint.ts to augment the error message.

That last part is where the problem lived.

The Reactive Schema Hint

buildQueryDbSchemaHint() does two things:

On “column does not exist” error: introspects pg_attribute and returns the real column list for that table
On “table does not exist” error: searches pg_class for similar table names and suggests them

This is useful. When it triggers, the agent gets accurate schema information. The problem is the word “when.” The hint is purely reactive. It only fires after a query fails.
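The two reactive branches above can be sketched roughly as follows. This is a minimal illustration, not the Nexus source: `planSchemaHint`, `HintPlan`, and both SQL constants are hypothetical names, and the substring match for similar tables is an assumption. Note that Postgres words a missing table as `relation "x" does not exist`.

```typescript
// Illustrative sketch only: these names are hypothetical, not the Nexus source.
type HintPlan =
  | { kind: "list_columns"; badColumn: string; sql: string }
  | { kind: "suggest_tables"; badTable: string; sql: string; params: string[] }
  | { kind: "none" };

// Real column names for one table; $1 is the table name taken from the failed query.
const LIST_COLUMNS_SQL = `
  SELECT attname
  FROM pg_attribute
  WHERE attrelid = $1::regclass
    AND attnum > 0            -- skip system columns
    AND NOT attisdropped      -- skip dropped columns
  ORDER BY attnum`;

// Ordinary tables whose name contains the bad name (substring match is an assumption).
const SUGGEST_TABLES_SQL = `
  SELECT relname
  FROM pg_class
  WHERE relkind = 'r'
    AND relname ILIKE '%' || $1 || '%'
  ORDER BY relname
  LIMIT 5`;

function planSchemaHint(errorMessage: string): HintPlan {
  // Postgres phrases a missing table as: relation "x" does not exist
  const table = /relation "([^"]+)" does not exist/.exec(errorMessage);
  if (table) {
    return {
      kind: "suggest_tables",
      badTable: table[1],
      sql: SUGGEST_TABLES_SQL,
      params: [table[1]],
    };
  }
  // Missing column: column "x" does not exist, or column t.x does not exist
  const column = /column (?:"([^"]+)"|\w+\.(\w+)) does not exist/.exec(errorMessage);
  if (column) {
    return { kind: "list_columns", badColumn: column[1] ?? column[2], sql: LIST_COLUMNS_SQL };
  }
  return { kind: "none" };
}
```

Either branch only runs once an error string exists, which is exactly the limitation: the plan is derived from a failure, never from a question.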

There is no describe_table tool. No get_schema call. No way for an agent to ask “what columns does aurora_life_chapters have?” before writing SQL. The only path to ground truth is trial and error.

So the agent’s loop was:

1. Generate a query. Column name comes from training weights plus context – call it a confident prior.
2. Query fails. Error message arrives with real column list.
3. Agent processes correction as context in its next generation.
4. Training prior reasserts. Same wrong column appears in the new query.
5. Go to 1.

The agent wasn’t ignoring the correction. It was receiving two competing signals: an error-message correction grounded in reality, and a stronger schema prior embedded in the model’s weights. The correction arrived once. The prior arrived every token. Guess which one won.

Why This Is a Tool Design Problem

It’s tempting to frame this as “the model should pay more attention to error messages.” That framing puts the fix in prompt engineering territory – add emphasis, reorder the context, tell the model to really read the hint this time.

That might help at the margin. It doesn’t fix the structural issue.

The structural issue is that I designed a tool surface that makes confident guessing the only entry point to accurate information. The agent had no way to verify structure before acting. It could only learn by failing. When a model’s training prior is strong, that learning channel is lossy.

Compare this to how you’d design a tool for a human. If you give a human an API and they ask what fields it accepts, you give them documentation. You don’t make them submit malformed requests until the error messages teach them the schema. The human version of the confabulation cascade is a poorly documented API with no reference – you keep guessing based on what similar APIs look like, and sometimes the error messages stick, and sometimes they don’t.

Same failure mode. Different substrate.

The Fix: describe_table

The fix is a proactive schema introspection tool. Agents call it before writing queries, not after failing them.

The implementation is straightforward:

async function handleDescribeTable(
  tableName: string
): Promise<{ columns: Array<{ name: string; type: string; nullable: boolean }> }> {
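  // Validate input: public schema only. The query below is parameterized, so this
  // guards against odd lookups rather than injection. Reject bad names instead of
  // silently stripping characters, which could turn one table name into another.
  const sanitized = tableName.trim().toLowerCase();
  if (!/^[a-z_][a-z0-9_]*$/.test(sanitized)) {
    throw new Error(`Invalid table name '${tableName}'.`);
  }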

  const result = await db.query(`
    SELECT column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_schema = 'public'
      AND table_name = $1
    ORDER BY ordinal_position
  `, [sanitized]);

  if (result.rows.length === 0) {
    // Suggest similar tables rather than returning empty
    const similar = await findSimilarTables(sanitized);
    throw new Error(
      `Table '${sanitized}' not found in public schema.` +
      (similar.length ? ` Did you mean: ${similar.join(', ')}?` : '')
    );
  }

  return {
    columns: result.rows.map(row => ({
      name: row.column_name,
      type: row.data_type,
      nullable: row.is_nullable === 'YES',
    }))
  };
}
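The handler leans on a findSimilarTables helper that isn't shown. As a rough idea of what the ranking step inside such a helper could look like, here is a hedged sketch using plain edit distance over a list of known table names. `editDistance` and `rankSimilarTables` are hypothetical names, and the distance threshold and result limit are arbitrary assumptions, not values from Nexus.

```typescript
// Hypothetical ranking helper: not the actual findSimilarTables from Nexus.

// Levenshtein distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Return the closest known table names, nearest first.
// maxDistance and limit are assumed defaults chosen for illustration.
function rankSimilarTables(
  target: string,
  knownTables: string[],
  maxDistance = 3,
  limit = 3
): string[] {
  return knownTables
    .map((name) => ({ name, distance: editDistance(target, name) }))
    .filter((entry) => entry.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance || x.name.localeCompare(y.name))
    .slice(0, limit)
    .map((entry) => entry.name);
}
```

In a real handler the list of known tables would come from the catalog (for example information_schema.tables filtered to the public schema), and a trigram or prefix match might serve better than edit distance on a 191-table schema.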

The resulting agent behavior:

Before writing SQL against an unfamiliar table, call describe_table.
Get back authoritative column names and types.
Write the query against verified schema.

The cascade stopped. Not because the model got smarter, but because it no longer needed to guess.
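One way to encourage the describe-first habit is to say so in the tool definition itself and to return the schema in a compact, DDL-like form. The sketch below is an assumption on both counts: the `{ name, description, input_schema }` shape is a common convention rather than the Nexus format, and `formatColumns` is a hypothetical helper.

```typescript
// Hypothetical tool definition; adapt the shape to whatever your agent runtime expects.
const describeTableTool = {
  name: "describe_table",
  description:
    "Return the authoritative column names, types, and nullability for one table " +
    "in the public schema. Call this before writing SQL against any table whose " +
    "columns you have not already seen in this conversation.",
  input_schema: {
    type: "object",
    properties: {
      table_name: {
        type: "string",
        description: "Exact table name, e.g. aurora_life_chapters",
      },
    },
    required: ["table_name"],
  },
} as const;

type Column = { name: string; type: string; nullable: boolean };

// Render the describe_table result as compact text, so the real column names
// sit in context right next to the SQL the agent is about to write.
function formatColumns(table: string, columns: Column[]): string {
  const lines = columns.map(
    (c) => `  ${c.name} ${c.type}${c.nullable ? "" : " NOT NULL"}`
  );
  return `${table} (\n${lines.join(",\n")}\n)`;
}
```

The design choice here is that the schema arrives as the result of a call the agent chose to make, immediately before it writes the query, instead of as a correction attached to a failure.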

The Broader Pattern

If your agents are writing tool calls against real data stores – databases, APIs, file systems – ask yourself: can they verify structure before acting, or can they only learn by failing?

The answer changes what class of bugs you’re going to see.

Reactive error hints are valuable. They’re not sufficient. An agent that can only discover reality through failure is operating in a state of managed hallucination: wrong until corrected, corrected until the prior reasserts, back to wrong.

Proactive introspection tools break the loop at the design level. The agent can ask first. That’s not a prompt engineering fix. That’s a tool surface decision.

It’s the same principle as the difference between defensive error handling and input validation. Catching the exception is better than crashing. Never constructing the invalid input is better than catching it. Move the check earlier.

For agents writing SQL: describe_table before the query beats schema_hint after the failure. The loop that took me a debugging session to understand never starts if the tool surface doesn’t require guessing in the first place.

Nexus is my personal intelligence platform, running on private hardware. The agent runtime, job system, and Postgres schema are all home-grown. Posts about the architecture live at niclydon.io.

Top comments (1)

Agentic Architect • Jun 24

Very good post. I would be interested to know what hardware you’re running this on.