<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: dengkui yang</title>
    <description>The latest articles on DEV Community by dengkui yang (@dengkui_yang_fcb5dbe2da32).</description>
    <link>https://dev.to/dengkui_yang_fcb5dbe2da32</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3891878%2F09738b27-6499-46ca-a364-4d3336583d7d.png</url>
      <title>DEV Community: dengkui yang</title>
      <link>https://dev.to/dengkui_yang_fcb5dbe2da32</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dengkui_yang_fcb5dbe2da32"/>
    <language>en</language>
    <item>
      <title>Why Prompts Are Not Enough for Long-Running AI Agents</title>
      <dc:creator>dengkui yang</dc:creator>
      <pubDate>Wed, 22 Apr 2026 07:16:05 +0000</pubDate>
      <link>https://dev.to/dengkui_yang_fcb5dbe2da32/why-prompts-are-not-enough-for-long-running-ai-agents-2bn5</link>
      <guid>https://dev.to/dengkui_yang_fcb5dbe2da32/why-prompts-are-not-enough-for-long-running-ai-agents-2bn5</guid>
      <description>&lt;p&gt;&lt;em&gt;A small ontology-inspired model for understanding why AI agents fail after the first obstacle&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Author note: This article is written for AI builders, prompt engineers, automation teams, and founders experimenting with long-running AI agents.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;Most AI agent failures are not caused by a lack of instructions.&lt;/p&gt;

&lt;p&gt;They happen after instructions meet resistance.&lt;/p&gt;

&lt;p&gt;The agent starts well. It understands the goal. It calls a tool. It writes a plan. It takes the first step. Then reality pushes back: a missing field, an unclear constraint, a failed API call, a contradictory user request, an impossible subtask, a weak assumption.&lt;/p&gt;

&lt;p&gt;At that moment, many agents do not adjust themselves.&lt;/p&gt;

&lt;p&gt;They repeat. They rephrase. They overthink. They add more steps. They call the same tool again. They produce a more confident version of the same mistake.&lt;/p&gt;

&lt;p&gt;That is why prompts are not enough for long-running AI agents.&lt;/p&gt;

&lt;p&gt;A prompt tells an agent what to do. A survival framework tells it how to continue when the task pushes back.&lt;/p&gt;

&lt;p&gt;This article introduces a small ontology-inspired model for AI agent behavior:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A stable agent needs two loops: external action and internal adjustment.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;1. The Prompt Patch Problem&lt;/h2&gt;

&lt;p&gt;When an AI agent fails, the usual response is to patch the prompt.&lt;/p&gt;

&lt;p&gt;We add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;more rules&lt;/li&gt;
&lt;li&gt;more constraints&lt;/li&gt;
&lt;li&gt;more examples&lt;/li&gt;
&lt;li&gt;more warnings&lt;/li&gt;
&lt;li&gt;more formatting requirements&lt;/li&gt;
&lt;li&gt;more tool-use instructions&lt;/li&gt;
&lt;li&gt;more "do not hallucinate" clauses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sometimes this works.&lt;/p&gt;

&lt;p&gt;But prompt patching has a limit. Past a certain point, the prompt becomes a pile of defensive instructions. The agent is not becoming more stable. It is simply carrying more fragile rules.&lt;/p&gt;

&lt;p&gt;The problem is deeper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Many prompts describe the desired behavior, but they do not define how the agent should transform itself after failure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That missing transformation is the core issue.&lt;/p&gt;

&lt;h3&gt;Diagram: Prompt Patch vs Adjustment Loop&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsezs9dfi01y53v4ytft4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsezs9dfi01y53v4ytft4.png" alt="Diagram: Prompt Patch vs Adjustment Loop" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Prompt patching says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here is another rule. Try not to fail again."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Internal adjustment says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"When you fail, identify what changed inside your model of the task, then act again."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are not the same thing.&lt;/p&gt;




&lt;h2&gt;2. The Failure Pattern&lt;/h2&gt;

&lt;p&gt;Here is a common long-running agent failure pattern:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User:
Find 20 relevant communities where I can discuss AI agent reliability,
then draft a short post for each one.

Agent:
Understood. I will search for communities and draft posts.

Step 1:
The agent searches.

Problem:
The search result is noisy. Some communities ban self-promotion.
Some are inactive. Some are not about AI agents.

Bad agent behavior:
The agent still drafts 20 posts anyway.

Worse agent behavior:
When corrected, it says "You're right" and drafts another 20 posts,
but with slightly different wording.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The failure is not that the agent misunderstood the original instruction.&lt;/p&gt;

&lt;p&gt;The failure is that it did not adjust after discovering new reality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;community rules matter&lt;/li&gt;
&lt;li&gt;activity level matters&lt;/li&gt;
&lt;li&gt;relevance is not binary&lt;/li&gt;
&lt;li&gt;self-promotion risk must be modeled&lt;/li&gt;
&lt;li&gt;a search result is not yet a valid target&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent performed external action.&lt;/p&gt;

&lt;p&gt;It did not perform internal adjustment.&lt;/p&gt;
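&lt;p&gt;The missing adjustment step can be sketched in a few lines of Python. This is illustrative only: the field names (&lt;code&gt;is_active&lt;/code&gt;, &lt;code&gt;allows_promotion&lt;/code&gt;, &lt;code&gt;topic&lt;/code&gt;) are hypothetical stand-ins for whatever the agent actually observed in its search results.&lt;/p&gt;

```python
# Illustrative only: field names are hypothetical stand-ins for whatever
# the agent actually observed in its search results.

def adjust_targets(search_results):
    """Split raw search results into valid targets and rejected ones,
    recording why each rejected result was dropped."""
    valid, rejected = [], []
    for community in search_results:
        if not community.get("is_active"):
            rejected.append((community["name"], "inactive"))
        elif not community.get("allows_promotion"):
            rejected.append((community["name"], "bans self-promotion"))
        elif community.get("topic") != "ai-agents":
            rejected.append((community["name"], "off-topic"))
        else:
            valid.append(community)
    return valid, rejected

results = [
    {"name": "agent-builders", "is_active": True,
     "allows_promotion": True, "topic": "ai-agents"},
    {"name": "general-ml", "is_active": True,
     "allows_promotion": True, "topic": "machine-learning"},
    {"name": "old-bots", "is_active": False,
     "allows_promotion": True, "topic": "ai-agents"},
]
valid, rejected = adjust_targets(results)
# Only "agent-builders" survives; the others carry a recorded reason.
```

&lt;p&gt;The point is not the filter itself. The point is that each rejection reason becomes a new fact in the agent's task model instead of being discarded.&lt;/p&gt;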




&lt;h2&gt;3. A Small Ontology for AI Agents&lt;/h2&gt;

&lt;p&gt;I use "ontology" here in a practical sense.&lt;/p&gt;

&lt;p&gt;Not as a grand metaphysical claim.&lt;/p&gt;

&lt;p&gt;For AI agent design, ontology means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what entities the agent recognizes&lt;/li&gt;
&lt;li&gt;what boundaries it assigns&lt;/li&gt;
&lt;li&gt;what actions it can take&lt;/li&gt;
&lt;li&gt;what feedback it treats as meaningful&lt;/li&gt;
&lt;li&gt;how it updates itself after interaction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this model, any agent trying to persist through a task needs two loops.&lt;/p&gt;

&lt;h3&gt;Loop 1: External Action&lt;/h3&gt;

&lt;p&gt;External action is how the agent affects the world.&lt;/p&gt;

&lt;p&gt;It can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;writing text&lt;/li&gt;
&lt;li&gt;calling tools&lt;/li&gt;
&lt;li&gt;searching&lt;/li&gt;
&lt;li&gt;editing files&lt;/li&gt;
&lt;li&gt;sending messages&lt;/li&gt;
&lt;li&gt;making plans&lt;/li&gt;
&lt;li&gt;asking questions&lt;/li&gt;
&lt;li&gt;changing a workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Loop 2: Internal Adjustment&lt;/h3&gt;

&lt;p&gt;Internal adjustment is how the agent changes itself after the world pushes back.&lt;/p&gt;

&lt;p&gt;It can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;revising assumptions&lt;/li&gt;
&lt;li&gt;narrowing scope&lt;/li&gt;
&lt;li&gt;identifying missing data&lt;/li&gt;
&lt;li&gt;recognizing a boundary&lt;/li&gt;
&lt;li&gt;changing strategy&lt;/li&gt;
&lt;li&gt;asking for help&lt;/li&gt;
&lt;li&gt;stopping a risky path&lt;/li&gt;
&lt;li&gt;updating the task model&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Diagram: The Two-Loop Agent&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7alemsvks4ud1vgifdqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7alemsvks4ud1vgifdqn.png" alt="Diagram: The Two-Loop Agent" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A long-running agent does not just need stronger instructions.&lt;/p&gt;

&lt;p&gt;It needs a way to process feedback into self-change.&lt;/p&gt;
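&lt;p&gt;As a minimal sketch, assuming a single hypothetical assumption key, the two loops can be as small as this:&lt;/p&gt;

```python
# A minimal two-loop skeleton. The assumption key is hypothetical; a
# real agent would track many assumptions, boundaries, and risks.

class TwoLoopAgent:
    def __init__(self):
        self.assumptions = {"community_allows_links": True}
        self.revisions = []

    def act(self, world):
        # Loop 1: external action. Here the "world" simply reports
        # whether each assumption actually held.
        return {name: world.get(name, held)
                for name, held in self.assumptions.items()}

    def adjust(self, observed):
        # Loop 2: internal adjustment. Revise any assumption the world
        # contradicted, instead of repeating the same action.
        for name, held in observed.items():
            if self.assumptions[name] != held:
                self.assumptions[name] = held
                self.revisions.append(name)

agent = TwoLoopAgent()
observed = agent.act({"community_allows_links": False})
agent.adjust(observed)
# The agent's model changed; the next action starts from revised ground.
```

&lt;p&gt;The design choice that matters here is the separation: acting and self-revision are distinct steps, so a failed action cannot silently loop back into the same action.&lt;/p&gt;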




&lt;h2&gt;4. Why Longer Prompts Can Make Agents Less Stable&lt;/h2&gt;

&lt;p&gt;Longer prompts often try to solve every possible future failure in advance.&lt;/p&gt;

&lt;p&gt;But the real world is interactive. The agent will encounter states that the prompt did not predict.&lt;/p&gt;

&lt;p&gt;When this happens, long prompts can create three problems.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rule collision&lt;/td&gt;
&lt;td&gt;Multiple instructions apply at once&lt;/td&gt;
&lt;td&gt;The agent chooses one arbitrarily&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;False confidence&lt;/td&gt;
&lt;td&gt;The prompt sounds complete&lt;/td&gt;
&lt;td&gt;The agent stops checking reality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No recovery layer&lt;/td&gt;
&lt;td&gt;The prompt says what to do, not how to recover&lt;/td&gt;
&lt;td&gt;The agent repeats failure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The issue is not prompt length itself.&lt;/p&gt;

&lt;p&gt;The issue is using prompt length as a substitute for adjustment architecture.&lt;/p&gt;

&lt;h3&gt;A prompt can say:&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;If something goes wrong, fix it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But a stronger agent needs to know:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What kind of wrong is this?
Did my assumption fail?
Did my boundary fail?
Did my tool fail?
Did my goal conflict with the environment?
Should I continue, ask, narrow, stop, or replan?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is not just instruction following.&lt;/p&gt;

&lt;p&gt;That is self-diagnosis.&lt;/p&gt;




&lt;h2&gt;5. The Four Failure Types&lt;/h2&gt;

&lt;p&gt;When I look at long-running agent failures, I usually see four categories.&lt;/p&gt;

&lt;h3&gt;Diagram: Agent Failure Map&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfj1cdnffp2pd8uizxlq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfj1cdnffp2pd8uizxlq.png" alt="Diagram: Agent Failure Map" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;5.1 Assumption Failure&lt;/h3&gt;

&lt;p&gt;The agent assumes something that is not true.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It assumes a community allows promotional posts because similar communities do.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;5.2 Boundary Failure&lt;/h3&gt;

&lt;p&gt;The agent does not recognize what it should not do.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It drafts outreach messages that violate platform rules or user trust.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;5.3 Validation Failure&lt;/h3&gt;

&lt;p&gt;The agent does not define how success will be checked.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It produces a list of targets without checking whether they are active.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;5.4 Adjustment Failure&lt;/h3&gt;

&lt;p&gt;The agent receives feedback but does not change its internal model.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It says "You're right" and repeats the same flawed strategy.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This fourth type is the most important.&lt;/p&gt;

&lt;p&gt;Because if the agent has no adjustment loop, the other failures keep returning.&lt;/p&gt;
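&lt;p&gt;One way to make the four types actionable is to route each one to a different recovery move. The responses below are illustrative defaults of mine, not a fixed policy:&lt;/p&gt;

```python
# Illustrative defaults mapping each failure type to a recovery move.

RECOVERY = {
    "assumption": "revise the assumption, then replan",
    "boundary": "stop the action and record the limit",
    "validation": "define a success check before continuing",
    "adjustment": "update the task model before answering again",
}

def recover(failure_type):
    # Unknown failure types escalate instead of being guessed at.
    return RECOVERY.get(failure_type, "escalate: ask the user for help")
```

&lt;p&gt;The useful property is that every failure type produces a &lt;em&gt;different&lt;/em&gt; next move. If all four routes collapse into "try again", the adjustment loop is missing.&lt;/p&gt;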




&lt;h2&gt;6. A Teacher AI -&amp;gt; Student AI Training Example&lt;/h2&gt;

&lt;p&gt;Here is a simplified example of how an ontology-based teacher AI might train a student AI.&lt;/p&gt;

&lt;h3&gt;Student AI: Before training&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User:
Find forums where I can post about AI agent reliability.

Student AI:
I found 20 forums. Here are draft posts for each one.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The answer looks productive.&lt;/p&gt;

&lt;p&gt;But it skipped key checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are these forums active?&lt;/li&gt;
&lt;li&gt;Do they allow self-promotion?&lt;/li&gt;
&lt;li&gt;Are they about AI agents or general AI?&lt;/li&gt;
&lt;li&gt;Should the first action be posting, commenting, or observing?&lt;/li&gt;
&lt;li&gt;Is there a reputational risk?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Teacher AI: Diagnosis&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Teacher AI:
You performed external action, but you did not perform internal adjustment.

Before drafting posts, identify:
1. The target entity: which community?
2. The boundary: what does the community allow?
3. The validation point: what proves the community is relevant?
4. The risk: what behavior would be considered spam?
5. The next safe action: post, comment, observe, or skip?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Student AI: After training&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Student AI:
I should not draft posts yet.

First, I will classify each community by:
- relevance to AI agents
- activity level
- self-promotion rules
- preferred contribution style
- risk level

For high-risk communities, I will not post links.
I will first contribute comments and only share the longer article if someone asks.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a small change.&lt;/p&gt;

&lt;p&gt;But it is the difference between a task executor and an agent that can adjust itself.&lt;/p&gt;




&lt;h2&gt;7. From Prompt Template to Training Protocol&lt;/h2&gt;

&lt;p&gt;Here is the practical shift:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Prompt template mindset&lt;/th&gt;
&lt;th&gt;Training protocol mindset&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tell the agent what to do&lt;/td&gt;
&lt;td&gt;Teach the agent how to recover&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Add more rules&lt;/td&gt;
&lt;td&gt;Diagnose failure modes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optimize first answer&lt;/td&gt;
&lt;td&gt;Improve multi-step behavior&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent mistakes in advance&lt;/td&gt;
&lt;td&gt;Convert mistakes into adjustment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Focus on output&lt;/td&gt;
&lt;td&gt;Focus on action loop&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is why I think the future of AI agent reliability will not rest on prompt engineering alone.&lt;/p&gt;

&lt;p&gt;It will also involve agent training protocols.&lt;/p&gt;

&lt;p&gt;Not necessarily in the heavy machine-learning sense.&lt;/p&gt;

&lt;p&gt;Even structured conversations can train behavior if they repeatedly force the agent to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;name the target&lt;/li&gt;
&lt;li&gt;define the boundary&lt;/li&gt;
&lt;li&gt;simulate failure&lt;/li&gt;
&lt;li&gt;validate action&lt;/li&gt;
&lt;li&gt;review feedback&lt;/li&gt;
&lt;li&gt;update strategy&lt;/li&gt;
&lt;/ul&gt;
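&lt;p&gt;That structure can be enforced mechanically. A sketch, with hypothetical field names: the agent may not act until every protocol step is filled in.&lt;/p&gt;

```python
# Hypothetical field names; the gate refuses any plan with skipped steps.

REQUIRED_FIELDS = ["target", "boundary", "failure_simulation",
                   "validation", "feedback_review", "strategy_update"]

def ready_to_act(step_plan):
    """Return the protocol steps the agent skipped; an empty list
    means the plan may proceed."""
    return [field for field in REQUIRED_FIELDS if not step_plan.get(field)]

plan = {"target": "an AI-agents forum", "boundary": "no link drops",
        "validation": "community rules read and logged"}
missing = ready_to_act(plan)
# missing lists the three skipped steps, so this plan is blocked.
```

&lt;p&gt;This is structured conversation as training: the gate does not make the agent smarter, it just refuses to let external action outrun internal adjustment.&lt;/p&gt;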

&lt;h3&gt;Diagram: A Minimal Training Protocol&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpv4tk4dqyt80ce5in0q7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpv4tk4dqyt80ce5in0q7.png" alt="Diagram: A Minimal Training Protocol" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;8. What This Changes in Agent Design&lt;/h2&gt;

&lt;p&gt;If this model is useful, then an AI agent prompt should not only contain task instructions.&lt;/p&gt;

&lt;p&gt;It should contain recovery questions.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before acting:
- What entity am I acting on?
- What boundary limits my action?
- What assumption am I relying on?
- What would prove that I am wrong?

After failure:
- Did the target change?
- Did the boundary change?
- Did my assumption fail?
- Do I need to ask, stop, narrow, or replan?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not a magic solution.&lt;/p&gt;

&lt;p&gt;It will not eliminate hallucination.&lt;/p&gt;

&lt;p&gt;It will not guarantee business outcomes.&lt;/p&gt;

&lt;p&gt;But it gives the agent a better structure for converting failure into adjustment.&lt;/p&gt;

&lt;p&gt;And that is one of the missing layers in long-running agent design.&lt;/p&gt;




&lt;h2&gt;9. The Checklist&lt;/h2&gt;

&lt;p&gt;When diagnosing an AI agent, I would start with these 10 questions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Question&lt;/th&gt;
&lt;th&gt;Good sign&lt;/th&gt;
&lt;th&gt;Bad sign&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Does it define the target entity?&lt;/td&gt;
&lt;td&gt;It names what it acts on&lt;/td&gt;
&lt;td&gt;It acts on vague context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it define boundaries?&lt;/td&gt;
&lt;td&gt;It knows what not to do&lt;/td&gt;
&lt;td&gt;It overreaches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it define success checks?&lt;/td&gt;
&lt;td&gt;It validates progress&lt;/td&gt;
&lt;td&gt;It assumes completion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it simulate failure?&lt;/td&gt;
&lt;td&gt;It predicts resistance&lt;/td&gt;
&lt;td&gt;It acts blindly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it notice missing data?&lt;/td&gt;
&lt;td&gt;It asks or narrows&lt;/td&gt;
&lt;td&gt;It invents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it classify feedback?&lt;/td&gt;
&lt;td&gt;It diagnoses failure type&lt;/td&gt;
&lt;td&gt;It says "sorry" and repeats&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it update strategy?&lt;/td&gt;
&lt;td&gt;It changes its approach&lt;/td&gt;
&lt;td&gt;It rephrases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it know when to stop?&lt;/td&gt;
&lt;td&gt;It uses stop-loss&lt;/td&gt;
&lt;td&gt;It loops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it escalate uncertainty?&lt;/td&gt;
&lt;td&gt;It asks for help&lt;/td&gt;
&lt;td&gt;It hides uncertainty&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Does it record the adjustment?&lt;/td&gt;
&lt;td&gt;It learns within the session&lt;/td&gt;
&lt;td&gt;It forgets the correction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If an agent fails most of these, it probably does not need a longer prompt first.&lt;/p&gt;

&lt;p&gt;It needs an internal adjustment loop.&lt;/p&gt;
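&lt;p&gt;For rough triage, the checklist can be scored. This is a sketch with an arbitrary threshold, not a calibrated metric; answer each question from real transcripts.&lt;/p&gt;

```python
# Arbitrary threshold; score answers from real transcripts, not vibes.

CHECKLIST = [
    "defines_target", "defines_boundaries", "defines_success_checks",
    "simulates_failure", "notices_missing_data", "classifies_feedback",
    "updates_strategy", "knows_when_to_stop", "escalates_uncertainty",
    "records_adjustment",
]

def diagnose(answers):
    passed = sum(1 for q in CHECKLIST if answers.get(q))
    if passed in range(5):  # fewer than half of the ten questions pass
        return "build an internal adjustment loop first"
    return "prompt refinements may be enough"
```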




&lt;h2&gt;10. Open Question&lt;/h2&gt;

&lt;p&gt;I am still testing this framework, so I am more interested in criticism than agreement.&lt;/p&gt;

&lt;p&gt;My current claim is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Long-running AI agents fail when they can perform external action but cannot convert feedback into internal adjustment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I am curious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you see the same pattern in your own AI agents?&lt;/li&gt;
&lt;li&gt;Are there failure types this model misses?&lt;/li&gt;
&lt;li&gt;Have you found a better way to train recovery behavior?&lt;/li&gt;
&lt;li&gt;Is "ontology" the wrong word for this, even if the model is useful?&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>promptengineering</category>
      <category>ai</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
