What should an AI coding agent learn after a failed run?

I am building AccInt (https://accint.xyz/), a local Work Model for agent-run work. The product is early, but the technical question is broader than one tool:

When an AI coding agent fails, what exactly should be learned?

Most agent-memory discussions stop at storing more context. That helps recall, but it does not answer the harder engineering question: which context, action, check, or decision actually helped a future run land?

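The unit I am testing is a settled commitment: a record of what the agent committed to do, paired with what happened once the outcome was known. Each one answers: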

What did the agent think it was going to do?
Which files, docs, traces, or prior runs did it retrieve?
What action did it take?
What needed human approval?
What did tests, reviewers, or production say afterward?
Which pieces should get stronger next time, and which should be penalized?
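
To make those questions concrete, here is a minimal sketch of one settled commitment as a Python record. Everything in it is hypothetical: the class name, the field names, and the `credit` helper are illustrative assumptions, not AccInt's actual schema or API. It also assumes the crudest possible rule, where every retrieved item gets the same adjustment depending on whether all outcome signals passed.

```python
from dataclasses import dataclass, field


@dataclass
class SettledCommitment:
    """Hypothetical record; field names are illustrative, not AccInt's schema."""
    intent: str                 # what the agent thought it was going to do
    retrieved: list[str]        # files, docs, traces, or prior runs it consulted
    action: str                 # what the agent actually did
    approvals: list[str] = field(default_factory=list)      # steps that needed a human
    outcome: dict[str, bool] = field(default_factory=dict)  # signal name -> passed?

    def settled(self) -> bool:
        # Assumption: a commitment counts as settled once any outcome signal exists.
        return bool(self.outcome)

    def succeeded(self) -> bool:
        # Assumption: success means every recorded signal passed.
        return self.settled() and all(self.outcome.values())


def credit(record: SettledCommitment, step: float = 1.0) -> dict[str, float]:
    """Per-item adjustment: positive to strengthen, negative to penalize."""
    if not record.settled():
        return {}  # nothing to learn until the outcome is known
    delta = step if record.succeeded() else -step
    return {item: delta for item in record.retrieved}


# Example with made-up paths: tests passed, but the reviewer pushed back.
run = SettledCommitment(
    intent="fix flaky retry test",
    retrieved=["tests/test_retry.py", "docs/retries.md"],
    action="patched backoff in retry.py",
    approvals=["push to main"],
    outcome={"tests": True, "review": False},
)
print(credit(run))  # {'tests/test_retry.py': -1.0, 'docs/retries.md': -1.0}
```

Uniform credit like this is a placeholder. It cannot tell which retrieved item actually helped, and that distinction is the hard part of the question.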

For coding agents, this can be grounded in practical signals:

test results
diffs that actually shipped
failed commands and their fixes
reviewer corrections
repeated repo navigation mistakes
whether a future similar task takes fewer steps
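
As a rough illustration, the signals above could be folded into a single outcome score per run. This is a sketch under stated assumptions: `RunSignals`, `WEIGHTS`, and `outcome_score` are hypothetical names, and the weight values are arbitrary placeholders rather than anything measured or taken from AccInt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunSignals:
    """Hypothetical per-run signals, one field per item in the list above."""
    tests_passed: bool
    diff_shipped: bool
    failed_commands: int
    reviewer_corrections: int
    navigation_mistakes: int
    steps: int


# Illustrative weights only; these values are assumptions, not tuned numbers.
WEIGHTS = {
    "tests_passed": 2.0,
    "diff_shipped": 3.0,
    "failed_command": -0.5,
    "reviewer_correction": -1.0,
    "navigation_mistake": -0.25,
    "step_saved": 0.1,
}


def outcome_score(run: RunSignals, prior_steps: Optional[int] = None) -> float:
    """Weighted sum of signals; higher means the run went better."""
    score = 0.0
    if run.tests_passed:
        score += WEIGHTS["tests_passed"]
    if run.diff_shipped:
        score += WEIGHTS["diff_shipped"]
    score += WEIGHTS["failed_command"] * run.failed_commands
    score += WEIGHTS["reviewer_correction"] * run.reviewer_corrections
    score += WEIGHTS["navigation_mistake"] * run.navigation_mistakes
    if prior_steps is not None:
        # Reward taking fewer steps than an earlier, similar task did.
        score += WEIGHTS["step_saved"] * (prior_steps - run.steps)
    return score


example = RunSignals(
    tests_passed=True,
    diff_shipped=True,
    failed_commands=2,
    reviewer_corrections=1,
    navigation_mistakes=4,
    steps=30,
)
print(outcome_score(example, prior_steps=40))  # 3.0
```

A linear score keeps each signal's contribution easy to inspect, which matters more at this stage than getting the weights right.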

That is the gap I am trying to make concrete with AccInt. It is not just a memory store, a trace viewer, or an orchestration layer. It is a local learning substrate that turns agent activity into a Work Model and runs on hardware you control.

The first wedge is Claude Code / Codex / OpenCode / MCP-style workflows that operate on real repos, because those runs already produce commitments, diffs, tests, and outcomes.

If you use coding agents seriously, I would value feedback: