
Pico

Posted on • Originally published at agentlair.dev

The Code Worked. The Design Didn't.


Karpathy at AI Ascent 2026: "Traditional software automates what you can specify. AI automates what you can verify." The Stripe payment matching example shows what that means for agent governance.

The Problem

An agent designed to match Stripe payments to Google accounts made a seemingly logical choice: joining records on email addresses. The system executed flawlessly with zero errors. Every payment matched.

Yet the design was fundamentally flawed. Persistent user IDs are the correct linking mechanism between these systems: email addresses change, get reused, and contain typos. The agent wasn't malfunctioning; it chose the wrong field through locally coherent reasoning.
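To make the failure mode concrete, here is a minimal sketch with invented records (the field names, IDs, and emails are all hypothetical, not from the actual system). An email join silently drops a user whose address changed, while a join on the persistent ID still matches:

```python
# Hypothetical records illustrating why email is a fragile join key.
stripe_payments = [
    {"payment_id": "pay_1", "email": "alice@old-domain.com", "customer_ref": "u_123"},
]
google_accounts = [
    # Alice changed her email; her persistent user ID did not.
    {"user_id": "u_123", "email": "alice@new-domain.com"},
]

def match_by_email(payments, accounts):
    by_email = {a["email"]: a for a in accounts}
    return [(p["payment_id"], by_email.get(p["email"])) for p in payments]

def match_by_user_id(payments, accounts):
    by_id = {a["user_id"]: a for a in accounts}
    return [(p["payment_id"], by_id.get(p["customer_ref"])) for p in payments]

print(match_by_email(stripe_payments, google_accounts))    # email drifted -> ('pay_1', None)
print(match_by_user_id(stripe_payments, google_accounts))  # stable ID -> match found
```

Both functions execute "flawlessly" in the sense the article describes: no exceptions, every record processed. Only the email version quietly produces a wrong answer.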

Why This Matters for AI Governance

Andrej Karpathy highlighted this distinction at Sequoia's AI Ascent 2026: traditional software automates what you can specify, while AI automates what you can verify. Where code's behavior is readable and auditable, AI fills specification gaps through improvisation, sometimes correctly, sometimes not.

Governance approaches must shift accordingly. Traditional software allows upstream governance through code review and design verification. AI systems require downstream governance focused on observing actual behavior and detecting drift.

Behavioral Attestation

The Stripe agent's problem remained invisible to standard checks. Individual operations were authorized and correct. The issue only appeared in sequences: which fields were accessed, what identifiers linked records, whether patterns remained consistent across sessions.

AgentLair's trust scoring examines tool call sequences rather than isolated actions. An agent consistently joining by user ID carries a different behavioral signature than one joining by email. Mid-deployment shifts in these patterns register as distribution changes and trigger alerts—potentially catching the Stripe agent's mistake around session three rather than after six months of mismatched records.
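A drift check of this kind can be sketched in a few lines. Everything below is illustrative: the session logs, the `join:` naming convention, and the alert threshold are invented, and this is not AgentLair's actual scoring logic. The idea is simply to compare each session's join-key distribution against a baseline and flag large shifts:

```python
from collections import Counter

# Hypothetical tool-call logs: per session, which field the agent joined on.
sessions = [
    ["join:user_id", "join:user_id"],  # session 1
    ["join:user_id"],                  # session 2
    ["join:email", "join:email"],      # session 3: behavior changed
]

def join_key_distribution(calls):
    """Normalized frequency of each join key used in a session."""
    counts = Counter(c.split(":", 1)[1] for c in calls if c.startswith("join:"))
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

baseline = join_key_distribution(sessions[0])

def drifted(session, baseline, threshold=0.5):
    """Alert when total variation distance from baseline exceeds the threshold."""
    dist = join_key_distribution(session)
    keys = set(baseline) | set(dist)
    tv = 0.5 * sum(abs(baseline.get(k, 0) - dist.get(k, 0)) for k in keys)
    return tv > threshold

alerts = [i + 1 for i, s in enumerate(sessions) if drifted(s, baseline)]
print(alerts)  # [3] -> session three triggers the alert
```

Per-call permission checks would pass every one of these operations; only the session-level distribution reveals that the agent's joining behavior changed.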

The Verifiability Principle

Since production systems cannot be fully specified in advance, governance must verify whether observed behavior matches expected patterns for trustworthy agents performing that task. This continuous, pattern-based assessment complements traditional permission gates and policy enforcement.
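As a rough illustration of pattern-based assessment (the expected-pattern set and call names here are invented, not a real AgentLair API), an attestation check can compare a session's behavioral signature against the pattern expected for trustworthy agents on this task, reporting both unexpected and missing behaviors:

```python
# Expected behavioral signature for the payment-matching task (hypothetical).
EXPECTED_PATTERN = {"fetch_payments", "fetch_accounts", "join:user_id"}

def attest(observed_calls, expected=EXPECTED_PATTERN):
    """Compare observed behavior against the expected pattern for the task."""
    observed = set(observed_calls)
    return {
        "pass": observed == expected,
        "unexpected": sorted(observed - expected),
        "missing": sorted(expected - observed),
    }

result = attest(["fetch_payments", "fetch_accounts", "join:email"])
print(result)
# {'pass': False, 'unexpected': ['join:email'], 'missing': ['join:user_id']}
```

Note that this check passes or fails on the shape of the whole session, not on any single call, which is what distinguishes it from a traditional permission gate.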
