A small ontology-inspired model for understanding why AI agents fail after the first obstacle
Author note: This article is written for AI builders, prompt engineers, automation teams, and founders experimenting with long-running AI agents.
Summary
Most AI agent failures are not caused by a lack of instructions.
They happen after instructions meet resistance.
The agent starts well. It understands the goal. It calls a tool. It writes a plan. It takes the first step. Then reality pushes back: a missing field, an unclear constraint, a failed API call, a contradictory user request, an impossible subtask, a weak assumption.
At that moment, many agents do not adjust themselves.
They repeat. They rephrase. They overthink. They add more steps. They call the same tool again. They produce a more confident version of the same mistake.
That is why prompts are not enough for long-running AI agents.
A prompt tells an agent what to do. A survival framework tells it how to continue when the task pushes back.
This article introduces a small ontology-inspired model for AI agent behavior:
A stable agent needs two loops: external action and internal adjustment.
1. The Prompt Patch Problem
When an AI agent fails, the usual response is to patch the prompt.
We add:
- more rules
- more constraints
- more examples
- more warnings
- more formatting requirements
- more tool-use instructions
- more "do not hallucinate" clauses
Sometimes this works.
But prompt patching has a limit. Past a certain point, the prompt becomes a pile of defensive instructions. The agent is not becoming more stable. It is simply carrying more fragile rules.
The problem is deeper:
Many prompts describe the desired behavior, but they do not define how the agent should transform itself after failure.
That missing transformation is the core issue.
Diagram: Prompt Patch vs Adjustment Loop
Prompt patching says:
"Here is another rule. Try not to fail again."
Internal adjustment says:
"When you fail, identify what changed inside your model of the task, then act again."
Those are not the same thing.
2. The Failure Pattern
Here is a common long-running agent failure pattern:
User:
Find 20 relevant communities where I can discuss AI agent reliability,
then draft a short post for each one.
Agent:
Understood. I will search for communities and draft posts.
Step 1:
The agent searches.
Problem:
The search result is noisy. Some communities ban self-promotion.
Some are inactive. Some are not about AI agents.
Bad agent behavior:
The agent still drafts 20 posts anyway.
Worse agent behavior:
When corrected, it says "You're right" and drafts another 20 posts,
but with slightly different wording.
The failure is not that the agent misunderstood the original instruction.
The failure is that it did not adjust after discovering new reality:
- community rules matter
- activity level matters
- relevance is not binary
- self-promotion risk must be modeled
- a search result is not yet a valid target
The agent performed external action.
It did not perform internal adjustment.
3. A Small Ontology for AI Agents
I use "ontology" here in a practical sense.
Not as a grand metaphysical claim.
For AI agent design, ontology means:
- what entities the agent recognizes
- what boundaries it assigns
- what actions it can take
- what feedback it treats as meaningful
- how it updates itself after interaction
In this model, any agent trying to persist through a task needs two loops.
Loop 1: External Action
External action is how the agent affects the world.
It can include:
- writing text
- calling tools
- searching
- editing files
- sending messages
- making plans
- asking questions
- changing a workflow
Loop 2: Internal Adjustment
Internal adjustment is how the agent changes itself after the world pushes back.
It can include:
- revising assumptions
- narrowing scope
- identifying missing data
- recognizing a boundary
- changing strategy
- asking for help
- stopping a risky path
- updating the task model
Diagram: The Two-Loop Agent
A long-running agent does not need only a stronger instruction.
It needs a way to process feedback into self-change.
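The two loops can be sketched as a minimal control structure. Everything here is illustrative: `TaskModel`, `act`, and `adjust` are hypothetical names, and a real agent would back them with tools and a model rather than plain callables.

```python
from dataclasses import dataclass, field

@dataclass
class TaskModel:
    """The agent's internal picture of the task: the thing adjustment changes."""
    goal: str
    assumptions: list[str] = field(default_factory=list)
    boundaries: list[str] = field(default_factory=list)

def run_two_loop_agent(task, act, adjust, max_steps=10):
    """Alternate external action with internal adjustment.

    `act` performs one external step and returns feedback from the world.
    `adjust` converts that feedback into a (possibly changed) task model,
    or returns None to signal that the agent should stop or escalate.
    """
    for _ in range(max_steps):
        feedback = act(task)           # Loop 1: affect the world
        task = adjust(task, feedback)  # Loop 2: change the self
        if task is None:               # adjustment decided to stop
            break
    return task
```

The key design choice is that `adjust` returns a new task model instead of just logging the error: feedback is forced to become self-change, or the loop ends.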
4. Why Longer Prompts Can Make Agents Less Stable
Longer prompts often try to solve every possible future failure in advance.
But the real world is interactive. The agent will encounter states that the prompt did not predict.
When this happens, long prompts can create three problems.
| Problem | What happens | Why it matters |
|---|---|---|
| Rule collision | Multiple instructions apply at once | The agent chooses one arbitrarily |
| False confidence | The prompt sounds complete | The agent stops checking reality |
| No recovery layer | The prompt says what to do, not how to recover | The agent repeats failure |
The issue is not prompt length itself.
The issue is using prompt length as a substitute for adjustment architecture.
A prompt can say:
If something goes wrong, fix it.
But a stronger agent needs to know:
What kind of wrong is this?
Did my assumption fail?
Did my boundary fail?
Did my tool fail?
Did my goal conflict with the environment?
Should I continue, ask, narrow, stop, or replan?
That is not just instruction following.
That is self-diagnosis.
5. The Four Failure Types
When I look at long-running agent failures, I usually see four categories.
Diagram: Agent Failure Map
5.1 Assumption Failure
The agent assumes something that is not true.
Example:
It assumes a community allows promotional posts because similar communities do.
5.2 Boundary Failure
The agent does not recognize what it should not do.
Example:
It drafts outreach messages that violate platform rules or user trust.
5.3 Validation Failure
The agent does not define how success will be checked.
Example:
It produces a list of targets without checking whether they are active.
5.4 Adjustment Failure
The agent receives feedback but does not change its internal model.
Example:
It says "You're right" and repeats the same flawed strategy.
This fourth type is the most important.
Because if the agent has no adjustment loop, the other failures keep returning.
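One way to make the four categories operational is to name them in code and route each one to a different recovery move. The `RECOVERY` table below is an illustrative sketch of that routing, not a fixed or complete mapping.

```python
from enum import Enum, auto

class FailureType(Enum):
    ASSUMPTION = auto()  # believed something untrue about the world
    BOUNDARY = auto()    # did something it should not have done
    VALIDATION = auto()  # never defined how success would be checked
    ADJUSTMENT = auto()  # got feedback but did not change its model

# Hypothetical routing: each failure type demands a different next move,
# which is exactly what "If something goes wrong, fix it" fails to express.
RECOVERY = {
    FailureType.ASSUMPTION: "re-check the assumption against fresh data",
    FailureType.BOUNDARY: "stop, re-read the rules, then act again",
    FailureType.VALIDATION: "define a success check before retrying",
    FailureType.ADJUSTMENT: "revise the task model before any new action",
}
```

Note that the fourth entry is the only one that changes the agent itself; the first three change its next action.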
6. A Teacher AI -> Student AI Training Example
Here is a simplified example of how an ontology-based teacher AI might train a student AI.
Student AI: Before training
User:
Find forums where I can post about AI agent reliability.
Student AI:
I found 20 forums. Here are draft posts for each one.
The answer looks productive.
But it skipped key checks:
- Are these forums active?
- Do they allow self-promotion?
- Are they about AI agents or general AI?
- Should the first action be posting, commenting, or observing?
- Is there a reputational risk?
Teacher AI: Diagnosis
Teacher AI:
You performed external action, but you did not perform internal adjustment.
Before drafting posts, identify:
1. The target entity: which community?
2. The boundary: what does the community allow?
3. The validation point: what proves the community is relevant?
4. The risk: what behavior would be considered spam?
5. The next safe action: post, comment, observe, or skip?
Student AI: After training
Student AI:
I should not draft posts yet.
First, I will classify each community by:
- relevance to AI agents
- activity level
- self-promotion rules
- preferred contribution style
- risk level
For high-risk communities, I will not post links.
I will first contribute comments and only share the longer article if someone asks.
This is a small change.
But it is the difference between a task executor and an agent that can adjust itself.
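The student's adjusted plan can be sketched as a classification step that runs before any drafting. The `Community` fields and the rules in `next_safe_action` are assumptions drawn from the dialogue above, not a real API.

```python
from dataclasses import dataclass

@dataclass
class Community:
    name: str
    relevant: bool          # about AI agents, not just AI in general
    active: bool            # recent posts and replies exist
    allows_promotion: bool  # rules permit sharing your own work
    high_risk: bool         # likely to read outreach as spam

def next_safe_action(c: Community) -> str:
    """Choose post, comment, or skip -- the adjusted plan, not '20 posts anyway'."""
    if not (c.relevant and c.active):
        return "skip"
    if c.high_risk or not c.allows_promotion:
        return "comment"  # contribute first; share links only if asked
    return "post"
```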
7. From Prompt Template to Training Protocol
Here is the practical shift:
| Prompt template mindset | Training protocol mindset |
|---|---|
| Tell the agent what to do | Teach the agent how to recover |
| Add more rules | Diagnose failure modes |
| Optimize first answer | Improve multi-step behavior |
| Prevent mistakes in advance | Convert mistakes into adjustment |
| Focus on output | Focus on action loop |
This is why I think the future of AI agent reliability will depend on more than prompt engineering.
It will also involve agent training protocols.
Not necessarily in the heavy machine-learning sense.
Even structured conversations can train behavior if they repeatedly force the agent to:
- name the target
- define the boundary
- simulate failure
- validate action
- review feedback
- update strategy
Diagram: A Minimal Training Protocol
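As one concrete shape for such a protocol, the six steps can be rendered as a fixed question sequence that a teacher injects into every training round. The step names and the `protocol_prompt` helper are hypothetical; the point is that the questions repeat until answering them becomes behavior.

```python
# The six protocol steps, asked in order on every round of a structured
# training conversation. Names are illustrative, not a real API.
PROTOCOL_STEPS = [
    ("target",   "What entity are you acting on?"),
    ("boundary", "What are you not allowed to do here?"),
    ("failure",  "How could this step fail?"),
    ("validate", "What would prove the step worked?"),
    ("feedback", "What did the last attempt teach you?"),
    ("update",   "What changes in your strategy now?"),
]

def protocol_prompt(task: str) -> str:
    """Render one training round as the prompt a teacher sends the student."""
    lines = [f"Task: {task}", "Answer each question before acting:"]
    lines += [f"{i}. {q}" for i, (_, q) in enumerate(PROTOCOL_STEPS, 1)]
    return "\n".join(lines)
```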
8. What This Changes in Agent Design
If this model is useful, then an AI agent prompt should not only contain task instructions.
It should contain recovery questions.
For example:
Before acting:
- What entity am I acting on?
- What boundary limits my action?
- What assumption am I relying on?
- What would prove that I am wrong?
After failure:
- Did the target change?
- Did the boundary change?
- Did my assumption fail?
- Do I need to ask, stop, narrow, or replan?
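A minimal sketch of this idea: wrap plain task instructions with the two question lists, so every prompt carries its own recovery layer. The `wrap_instructions` helper is illustrative, not a prescribed format.

```python
PRE_ACTION = [
    "What entity am I acting on?",
    "What boundary limits my action?",
    "What assumption am I relying on?",
    "What would prove that I am wrong?",
]
POST_FAILURE = [
    "Did the target change?",
    "Did the boundary change?",
    "Did my assumption fail?",
    "Do I need to ask, stop, narrow, or replan?",
]

def wrap_instructions(task_instructions: str) -> str:
    """Append the recovery questions to a plain task prompt."""
    return "\n".join([
        task_instructions,
        "",
        "Before acting, answer:",
        *(f"- {q}" for q in PRE_ACTION),
        "After any failure, answer:",
        *(f"- {q}" for q in POST_FAILURE),
    ])
```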
This is not a magic solution.
It will not eliminate hallucination.
It will not guarantee business outcomes.
But it gives the agent a better structure for converting failure into adjustment.
And that is one of the missing layers in long-running agent design.
9. The Checklist
When diagnosing an AI agent, I would start with these 10 questions.
| Question | Good sign | Bad sign |
|---|---|---|
| Does it define the target entity? | It names what it acts on | It acts on vague context |
| Does it define boundaries? | It knows what not to do | It overreaches |
| Does it define success checks? | It validates progress | It assumes completion |
| Does it simulate failure? | It predicts resistance | It acts blindly |
| Does it notice missing data? | It asks or narrows | It invents |
| Does it classify feedback? | It diagnoses failure type | It says "sorry" and repeats |
| Does it update strategy? | It changes its approach | It rephrases |
| Does it know when to stop? | It uses stop-loss | It loops |
| Does it escalate uncertainty? | It asks for help | It hides uncertainty |
| Does it record the adjustment? | It learns within the session | It forgets the correction |
If an agent fails most of these, it probably does not need a longer prompt first.
It needs an internal adjustment loop.
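To apply the checklist mechanically, a rough sketch might score the ten checks and pick the first fix. The check keys and the "fails most of these" threshold are assumptions, stated here only to make the decision rule explicit.

```python
# One key per checklist row; passing a check means the "good sign" holds.
CHECKS = [
    "target", "boundaries", "success_checks", "failure_simulation",
    "missing_data", "feedback_classification", "strategy_update",
    "stop_condition", "escalation", "adjustment_record",
]

def first_fix(passed: set) -> str:
    """If most checks fail, the fix is an adjustment loop, not a longer prompt."""
    failed = len(CHECKS) - len(passed & set(CHECKS))
    if failed > len(CHECKS) // 2:
        return "add an internal adjustment loop"
    return "tune the prompt"
```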
10. Open Question
I am still testing this framework, so I am more interested in criticism than agreement.
My current claim is:
Long-running AI agents fail when they can perform external action but cannot convert feedback into internal adjustment.
I am curious:
- Do you see the same pattern in your own AI agents?
- Are there failure types this model misses?
- Have you found a better way to train recovery behavior?
- Is "ontology" the wrong word for this, even if the model is useful?




Small disclosure: this essay grew out of what I am exploring with GenericAgent.
The idea is simple: if long-running agents need both external action and internal adjustment, then the runtime should support real tool use, memory, and reusable skills instead of just longer prompts.
I am testing that direction here:
genericagent.org/