Last week, my AI agent broke a production website for the third time by guessing Shopify URL handles instead of fetching them from the API.
Same mistake. Third time. Different context each time, so the agent didn't "remember" it had done this before.
That's when it hit me: AI agents don't learn from mistakes. They learn from training data. Your agent will make the same error on Monday that it made on Friday, because Friday's session is gone.
The Problem Nobody Talks About
There's a lot of hype about AI agent memory — persistent context, RAG, vector search. But memory isn't the same as learning. My agents remember facts fine. What they don't do is remember failures.
Think about how humans improve at their jobs:
1. You mess something up
2. You feel bad about it (optional but effective)
3. You figure out why it happened
4. You create a mental rule: "always check X before doing Y"
5. Next time, that rule fires before you repeat the mistake
AI agents skip steps 2-5 entirely.
The Experiment: A Mistake Database
I built a simple system with three operations:
Capture — When something goes wrong, log it with structure:
- What happened
- Why it happened (root cause)
- Rule — a one-liner to prevent recurrence
- Category, severity, tags
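A captured entry is just structured JSON. Here's a minimal sketch of what one could look like, with a `capture` helper that appends it to a flat file (the exact field names here are illustrative, not a fixed schema):

```python
import json

# One captured mistake -- field names mirror the structure above
# (what / root cause / rule / category / severity / tags) but are a sketch.
mistake = {
    "id": 4,
    "what": "Guessed Shopify URL handles instead of fetching them",
    "root_cause": "Assumed handles follow a predictable slug pattern",
    "rule": "ALWAYS use actual Shopify product handles from API, never guess URLs",
    "category": "code",
    "severity": "high",
    "tags": ["shopify", "urls", "api"],
    "recurrence_count": 2,
}

def capture(entry, path="mistakes.json"):
    """Append a mistake entry to a flat JSON file -- no database needed."""
    try:
        with open(path) as f:
            db = json.load(f)
    except FileNotFoundError:
        db = []
    db.append(entry)
    with open(path, "w") as f:
        json.dump(db, f, indent=2)
```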
Preflight — Before starting any significant task, search the mistake database for relevant past failures. Surface them as warnings.
Graduate — Rules that haven't been triggered in 30+ days get archived. The agent has "learned" that lesson.
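Graduation is a single timestamp comparison. A sketch, assuming each entry records a `last_triggered` ISO timestamp (my field name, not a spec):

```python
from datetime import datetime, timedelta, timezone

GRADUATION_DAYS = 30  # archive rules that haven't fired in this long

def graduate(mistakes, now=None):
    """Split mistakes into (active, graduated) by last trigger time."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=GRADUATION_DAYS)
    active, graduated = [], []
    for m in mistakes:
        last = datetime.fromisoformat(m["last_triggered"])
        (graduated if last < cutoff else active).append(m)
    return active, graduated
```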
That's it. No ML, no embeddings, no fancy retrieval. Just structured JSON and keyword matching.
What Actually Happened
I seeded it with 10 real mistakes from the past two weeks and put it into production across a fleet of 5 agents. Here's what I noticed:
The preflight check is the killer feature. Before my agent builds a Shopify landing page, it now gets:
⚠️ PRE-FLIGHT CHECK — 3 relevant past mistake(s):
🟠 #4 [code] ALWAYS use actual Shopify product handles from API, never guess URLs
🟡 #10 [design] Use !important on heading colours in Shopify custom pages
🟠 #3 [design] AI images OK for heroes, NEVER for product shots
📋 Review these before proceeding
That third-time Shopify URL mistake? Can't happen anymore. The agent sees the rule before it starts the task.
Categories reveal patterns. After two weeks, my breakdown was:
- Design: 5 mistakes
- Config: 2
- Code: 1
- Communication: 1
- Process: 1
Design is clearly my fleet's weak spot. That's actionable — I now front-load design review before sending anything to stakeholders.
The "communication" category is unexpectedly useful. One of my agents sent an email to a bank without getting human approval first. That's now a critical-severity rule that fires every time the word "email" or "send" appears in a task description. Simple but effective.
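That kind of critical rule is nothing more than a keyword trigger on the task description. A sketch (the trigger words come from my example; the rule text is illustrative):

```python
# Assumed mapping: trigger words -> critical rule text.
CRITICAL_TRIGGERS = {
    frozenset({"email", "send"}):
        "NEVER contact external parties without human approval",
}

def critical_warnings(task):
    """Return every critical rule whose trigger words appear in the task."""
    words = set(task.lower().split())
    return [rule for triggers, rule in CRITICAL_TRIGGERS.items()
            if words & triggers]
```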
Why This Is Different From Memory
Memory systems answer: "What do I know?"
Mistake learning answers: "What should I watch out for?"
They're complementary. A memory system might recall that you deployed to Fly.io last Tuesday. A mistake system reminds you that SSH patches to Fly.io containers are ephemeral and will revert on restart — so don't even try.
One is knowledge. The other is wisdom.
The Implementation Is Embarrassingly Simple
The entire preflight match is keyword overlap between the task description and stored rules. No embeddings needed. Here's the core scoring logic, fleshed out into a Python sketch (`category_keywords` maps each category to its trigger words):

```python
def preflight(task, mistakes, category_keywords, top_n=10):
    """Score each active mistake against the task by keyword overlap."""
    task_words = set(task.lower().split())
    scored = []
    for m in mistakes:  # active (non-graduated) mistakes only
        score = 0
        if task_words & set(category_keywords.get(m["category"], [])):
            score += 3
        if task_words & set(m["tags"]):
            score += 2
        if task_words & set((m["rule"] + " " + m["what"]).lower().split()):
            score += 1
        if m["severity"] == "critical":
            score += 2
        score += m.get("recurrence_count", 0)  # repeat offenders rank higher
        scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for s, m in scored[:top_n] if s > 0]
```
Is this sophisticated? No. Does it work? Embarrassingly well.
The key insight is that mistakes cluster around types of work, and keyword matching catches that reliably. You don't need semantic search to know that "deploy to Fly.io" is related to a mistake tagged "fly, deploy, docker."
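For instance, plain set intersection already catches that relationship, as long as you normalize tokens (splitting on non-alphanumerics so "Fly.io" yields both "fly" and "io"):

```python
import re

def words(text):
    """Lowercase and split on non-alphanumerics: 'Fly.io' -> {'fly', 'io'}."""
    return set(re.split(r"[^a-z0-9]+", text.lower())) - {""}

task = "deploy to Fly.io"
tags = {"fly", "deploy", "docker"}
overlap = words(task) & tags  # {'deploy', 'fly'}
```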
What I'd Add Next
Fleet sharing — Agent A's config mistake should warn Agent B before it touches config files. Currently each agent has its own database; cross-pollination would multiply the value.
Auto-capture from error logs — Instead of manually logging mistakes, detect failures from exit codes, API errors, and user corrections, then prompt for the rule extraction.
Confidence scoring — "I've done this type of task 12 times with 0 mistakes" vs "I've never done this before" are different risk profiles.
Graduation analytics — Which categories get learned fastest? Which rules keep recurring? That tells you where to invest in better tooling.
Try It
The concept is framework-agnostic — you could implement it for any agent system in an afternoon. The core is just:
- A JSON file with structured mistake entries
- A preflight function that runs before tasks
- A capture function that runs after failures
- A review function for pattern detection
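Those four pieces fit in one small class. A skeleton under the same assumptions as above (file-backed JSON, tag-based matching; all names are mine):

```python
import json
from pathlib import Path

class MistakeDB:
    """Minimal sketch of a JSON-file-backed mistake database."""

    def __init__(self, path="mistakes.json"):
        self.path = Path(path)
        self.entries = (json.loads(self.path.read_text())
                        if self.path.exists() else [])

    def capture(self, entry):
        """Runs after failures: persist a structured mistake entry."""
        self.entries.append(entry)
        self.path.write_text(json.dumps(self.entries, indent=2))

    def preflight(self, task):
        """Runs before tasks: surface mistakes whose tags match the task."""
        words = set(task.lower().split())
        return [m for m in self.entries if words & set(m.get("tags", []))]

    def review(self):
        """Pattern detection: count mistakes per category."""
        counts = {}
        for m in self.entries:
            counts[m["category"]] = counts.get(m["category"], 0) + 1
        return counts
```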
If you're building AI agents that do real work (not just chat), mistake learning might be the highest-ROI improvement you can make. It's certainly been mine.
I run a fleet of AI agents across school administration, ecommerce, security monitoring, and finance. The mistake learning system described here is being integrated into ShieldCortex, an open-source memory security toolkit for AI agents.