Daniel Feldman

Posted on May 18

AI agents fail because they forget

#ai #opensource #programming #discuss

If you work with agents every day, you already know the pattern.

The first 20 minutes feel magical.

Then:

context starts drifting
requirements get forgotten
architecture decisions disappear
agents repeat mistakes
long runs slowly lose coherence

The problem is not model quality anymore.

The problem is memory architecture.

Most agent systems today are still built around temporary conversations instead of operational memory.

At OnBuzz we've been exploring what happens when you treat agents more like long-running workers inside a system instead of isolated chats.

Memory needs to behave more like human memory

One thing became obvious very quickly:

Not all memory should behave the same.

So we built layered memory systems:

short-term memory for active execution
long-term memory for persistent knowledge
event-based memory connected to actions, tasks, conversations, and decisions

That changes retrieval quality dramatically.

Instead of dumping giant context windows into prompts, agents can retrieve relevant operational context when needed.
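To make the layering concrete, here is a minimal sketch in Python. It is not OnBuzz's actual implementation: the `LayeredMemory` class, its method names, and the naive word-overlap scoring are all assumptions made for illustration. A real system would likely use embeddings or a proper index for retrieval.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str  # hypothetical labels such as "decision", "task", "action"
    text: str


@dataclass
class LayeredMemory:
    # Short-term: a small rolling window for the step being executed.
    short_term: deque = field(default_factory=lambda: deque(maxlen=5))
    # Long-term: persistent facts, keyed by topic.
    long_term: dict = field(default_factory=dict)
    # Event-based: a log tied to actions, tasks, conversations, decisions.
    events: list = field(default_factory=list)

    def observe(self, text: str) -> None:
        self.short_term.append(text)

    def remember(self, key: str, fact: str) -> None:
        self.long_term[key] = fact

    def log_event(self, kind: str, text: str) -> None:
        self.events.append(Event(kind, text))

    def retrieve(self, query: str, limit: int = 3) -> list:
        # Naive relevance: count words shared between query and candidate.
        words = set(query.lower().split())
        candidates = list(self.long_term.values()) + [e.text for e in self.events]
        scored = [(len(words & set(c.lower().split())), c) for c in candidates]
        ranked = sorted(scored, key=lambda pair: -pair[0])
        return [text for score, text in ranked if score > 0][:limit]

    def build_context(self, query: str) -> list:
        # Prompt context = active window plus only the relevant stored items.
        return list(self.short_term) + self.retrieve(query)


memory = LayeredMemory()
memory.remember("db", "the service uses postgres for storage")
memory.log_event("decision", "REST was chosen over gRPC for public endpoints")
memory.observe("currently editing the storage layer")
print(memory.build_context("which storage does the service use"))
```

The point of the sketch is the shape, not the scoring: the prompt only receives the active window plus whatever the query actually matches, rather than everything the agent has ever seen.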

Recap loops became critical

Long-running agents drift.

Even the best models do.

So we implemented internal recap mechanisms that continuously reconnect agents to:

the original objective
active constraints
completed work
unresolved blockers
execution history

This massively improves stability during long conversations and autonomous runs.
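A recap mechanism along these lines could be sketched as follows. This is an illustration under assumptions, not the OnBuzz code: `RunState`, `build_recap`, and the fixed `recap_every` interval are hypothetical names and choices, and the "model call" is replaced by simply collecting the prompts that would be sent.

```python
from dataclasses import dataclass, field


@dataclass
class RunState:
    objective: str
    constraints: list = field(default_factory=list)
    completed: list = field(default_factory=list)
    blockers: list = field(default_factory=list)
    history: list = field(default_factory=list)


def build_recap(state: RunState, history_tail: int = 3) -> str:
    """Render the five recap fields as a compact text block."""
    def section(title, items):
        body = "\n".join(f"- {item}" for item in items) or "- (none)"
        return f"{title}:\n{body}"

    return "\n".join([
        f"Objective: {state.objective}",
        section("Active constraints", state.constraints),
        section("Completed work", state.completed),
        section("Unresolved blockers", state.blockers),
        section("Recent history", state.history[-history_tail:]),
    ])


def run(state: RunState, steps: list, recap_every: int = 2) -> list:
    """Return the prompts a run would send, with a recap every N steps."""
    prompts = []
    for i, step in enumerate(steps, start=1):
        prefix = build_recap(state) + "\n\n" if i % recap_every == 0 else ""
        prompts.append(prefix + step)
        state.history.append(step)
    return prompts


state = RunState("ship the importer", constraints=["no new dependencies"])
for prompt in run(state, ["parse input", "validate rows", "write output"]):
    print(prompt, end="\n---\n")
```

A fixed interval is the simplest trigger. Recapping on token count or on detected drift would be reasonable alternatives.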

Full local interaction history

Another thing we wanted badly as developers:

Full access to interaction history locally.

Not just temporary chats.

Agents can retrieve context dynamically based on operational needs instead of relying purely on bloated prompts.

That changes debugging and long-running workflows completely.
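As a rough idea of what local, queryable history can look like, here is a sketch using Python's built-in `sqlite3` module. The table layout and the `open_history`, `record`, and `search` helpers are assumptions for illustration, not the schema OnBuzz ships.

```python
import sqlite3


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a local history store; pass a file path to persist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS interactions ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " agent TEXT NOT NULL,"
        " role TEXT NOT NULL,"
        " content TEXT NOT NULL)"
    )
    return conn


def record(conn, agent: str, role: str, content: str) -> None:
    # "with conn" commits on success and rolls back on error.
    with conn:
        conn.execute(
            "INSERT INTO interactions (agent, role, content) VALUES (?, ?, ?)",
            (agent, role, content),
        )


def search(conn, term: str, limit: int = 5) -> list:
    """Return the most recent interactions whose content mentions the term."""
    rows = conn.execute(
        "SELECT agent, role, content FROM interactions"
        " WHERE content LIKE ? ORDER BY id DESC LIMIT ?",
        (f"%{term}%", limit),
    )
    return rows.fetchall()


conn = open_history()
record(conn, "architect", "assistant", "Decided to cache results in Redis")
record(conn, "implementer", "assistant", "Added the cache layer behind a flag")
print(search(conn, "cache"))
```

Because the history is an ordinary local file, the same store serves the agent at retrieval time and the developer at debugging time.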

Real multi-agent orchestration

Not "multiple tabs."

Actual orchestration:

teams
roles
task delegation
context sharing
work transfer between agents

Different agents handling:

architecture
implementation
validation
reviews
research
execution

In parallel.
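In code, the core of this idea is small: tasks carry a role, each role maps to an agent, and all agents read the same shared context while running concurrently. The sketch below uses the standard library's `ThreadPoolExecutor`. The `Task` type, `make_agent`, and `orchestrate` are hypothetical names, and each "agent" is a stub function standing in for a real model-backed worker.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Task:
    role: str
    description: str


def make_agent(role: str):
    """Build a stub agent for a role; a real one would call a model."""
    def agent(task: Task, shared_context: dict) -> str:
        return f"[{role}] {task.description} (goal: {shared_context['goal']})"
    return agent


def orchestrate(tasks: list, shared_context: dict) -> list:
    """Delegate each task to the agent for its role and run them in parallel."""
    agents = {task.role: make_agent(task.role) for task in tasks}
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = [pool.submit(agents[t.role], t, shared_context) for t in tasks]
        # Collect in submission order so results line up with the task list.
        return [future.result() for future in futures]


results = orchestrate(
    [
        Task("architecture", "propose module boundaries"),
        Task("implementation", "write the parser"),
        Task("validation", "run the test suite"),
    ],
    {"goal": "ship the importer"},
)
for line in results:
    print(line)
```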

Goal-driven autonomous execution

Agents can operate autonomously toward objectives with continuity between runs.

Not just one-shot prompts.

Combined with scheduling, agents can:

create recurring workflows
schedule tasks for themselves
schedule tasks for other agents
maintain operational continuity over time

(fully configurable, not enabled by default)
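Scheduling of this kind can be pictured as a priority queue of jobs, where each job names the agent it targets. The sketch below is an assumption-laden illustration rather than the real scheduler: `Job`, `Scheduler`, and the integer "logical clock" are hypothetical, chosen so the example runs instantly without waiting on wall-clock time.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Job:
    due: int                      # logical time at which the job should run
    seq: int                      # tie-breaker so equal due times stay FIFO
    agent: str = field(compare=False)
    task: str = field(compare=False)
    every: Optional[int] = field(default=None, compare=False)


class Scheduler:
    def __init__(self):
        self._queue = []
        self._seq = itertools.count()

    def schedule(self, agent: str, task: str, due: int, every: Optional[int] = None):
        """Queue a task for any agent; 'every' makes it recurring."""
        if every is not None and every <= 0:
            raise ValueError("every must be a positive interval")
        heapq.heappush(self._queue, Job(due, next(self._seq), agent, task, every))

    def run_until(self, now: int) -> list:
        """Pop every job due at or before 'now', re-queuing recurring ones."""
        executed = []
        while self._queue and self._queue[0].due <= now:
            job = heapq.heappop(self._queue)
            executed.append((job.due, job.agent, job.task))
            if job.every is not None:
                self.schedule(job.agent, job.task, job.due + job.every, job.every)
        return executed


scheduler = Scheduler()
scheduler.schedule("researcher", "scan dependency changelogs", due=0, every=10)
scheduler.schedule("reviewer", "review open pull requests", due=5)
for entry in scheduler.run_until(20):
    print(entry)
```

Since the target agent is just a field on the job, "schedule for myself" and "schedule for another agent" are the same operation with a different name filled in.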

Manager agents and dynamic agent creation

One of the more interesting things we're exploring right now:

Manager agents that supervise other agents.

They can:

coordinate execution
monitor progress
delegate work
create specialized agents dynamically during runtime based on goals/tasks
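A stripped-down way to picture a manager agent: it keeps a registry of workers, creates a new specialist the first time a task needs one, delegates, and tracks progress. Everything named here (`Worker`, `Manager`, `delegate`, `progress`) is hypothetical and only meant to show the control flow, with string formatting standing in for real agent work.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    specialty: str
    done: list = field(default_factory=list)

    def handle(self, task: str) -> str:
        # Stand-in for real agent execution.
        self.done.append(task)
        return f"{self.specialty}: finished '{task}'"


@dataclass
class Manager:
    workers: dict = field(default_factory=dict)

    def delegate(self, specialty: str, task: str) -> str:
        # Create a specialized worker at runtime if none exists yet.
        if specialty not in self.workers:
            self.workers[specialty] = Worker(specialty)
        return self.workers[specialty].handle(task)

    def progress(self) -> dict:
        """Report how many tasks each worker has completed."""
        return {name: len(worker.done) for name, worker in self.workers.items()}


manager = Manager()
print(manager.delegate("sql-migrations", "add index on orders.created_at"))
print(manager.delegate("sql-migrations", "backfill the new column"))
print(manager.delegate("docs", "update the README"))
print(manager.progress())
```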

We’re also experimenting with reusable skills that agents can generate and import into the system dynamically.

The workflow starts feeling much closer to coordinating engineering systems than manually operating chats.

CLI-first + local-first

We wanted the system to feel native to developers.

CLI-first.
Local-first.
Full control.

Your data stays with you.

Why this matters

The biggest productivity gain is not:
"AI writes code faster."

It's reducing operational overhead.

Less:

rebuilding context
repeating instructions
manually coordinating workflows
re-checking obvious things
fighting context collapse

More:

architectural thinking
execution leverage
parallel workflows
shipping systems faster

The highest-leverage developers are usually not the fastest typists.

They're the ones who can coordinate complexity efficiently.

That's the workflow shift we're interested in exploring.

If this space interests you, we'd genuinely love feedback from people building with agents daily.

⭐ on the repo helps a lot:
https://github.com/Loxia-ai/onbuzz-community

We're also opening a contributor core with bounties and cloud credits for the full platform.
