<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nasim Akhtar</title>
    <description>The latest articles on DEV Community by Nasim Akhtar (@fnlog0).</description>
    <link>https://dev.to/fnlog0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F159685%2F6e6c8df4-facf-49ba-9e47-e642fce093e4.png</url>
      <title>DEV Community: Nasim Akhtar</title>
      <link>https://dev.to/fnlog0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/fnlog0"/>
    <language>en</language>
    <item>
      <title>LocusGraph: When Agents Remember</title>
      <dc:creator>Nasim Akhtar</dc:creator>
      <pubDate>Thu, 12 Mar 2026 15:54:21 +0000</pubDate>
      <link>https://dev.to/fnlog0/locusgraph-when-agents-remember-mm4</link>
      <guid>https://dev.to/fnlog0/locusgraph-when-agents-remember-mm4</guid>
      <description>&lt;p&gt;What does it mean when our AI agents remember—not just data, but identity, intention, voice?&lt;/p&gt;

&lt;p&gt;This question sits at the heart of a fundamental limitation in current AI systems: they exist in perpetual amnesia. Every conversation starts from scratch. Every decision is made without the benefit of accumulated experience. Every insight discovered is lost when the context window closes.&lt;/p&gt;

&lt;p&gt;Remember the first time a phone assistant forgot your name, and how wrong that felt? How it asked the same questions again, as if meeting you for the first time? That moment of disconnect, that feeling of talking to someone who doesn't know you: that's what every AI agent interaction feels like today.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LocusGraph&lt;/strong&gt; changes this. It's a deterministic memory system designed specifically for AI agents. It transforms fleeting conversations and experiences into lasting, interconnected knowledge that agents can reliably recall and reason over. In doing so, it bridges the gap between the transient nature of language model interactions and the persistent understanding that makes intelligence meaningful.&lt;/p&gt;

&lt;h2&gt;Memory's Echo&lt;/h2&gt;

&lt;p&gt;Imagine an AI agent that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remembers&lt;/strong&gt; patterns it discovered during code reviews, recognizing architectural anti-patterns it has seen before rather than just detecting them in the current file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learns&lt;/strong&gt; from past decisions and their outcomes, understanding which refactoring approaches worked and which led to technical debt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connects&lt;/strong&gt; related knowledge across different domains—linking a debugging technique from a Python project to a similar pattern in a Rust codebase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasons&lt;/strong&gt; over accumulated experience, not just the current context—drawing insights from hundreds of previous interactions, not just the last few messages
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Traditional agent: ephemeral understanding&lt;/span&gt;
&lt;span class="c1"&gt;// Each session is an island, disconnected from all others&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;traditionalAgent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;currentConversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Lost when context expires&lt;/span&gt;
  &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;currentConversation&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="c1"&gt;// No connection to past insights, patterns, or wisdom&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// LocusGraph-powered agent: persistent knowledge&lt;/span&gt;
&lt;span class="c1"&gt;// Every interaction builds on a growing foundation of understanding&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;locusGraphAgent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;currentConversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;knowledgeGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Grows with every interaction&lt;/span&gt;
  &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Query the accumulated wisdom&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;relevantMemories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;knowledgeGraph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;currentConversation&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Synthesize current context with past experience&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;synthesize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;currentConversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;relevantMemories&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LocusGraph makes this possible by storing agent experiences as structured knowledge in a graph-based format. Every fact, constraint, decision, action, and observation becomes a node in an ever-growing web of understanding. &lt;/p&gt;

&lt;p&gt;This isn't just storage; it's the foundation for genuine learning: the difference between intelligence that exists only in the moment and wisdom that accumulates over time.&lt;/p&gt;
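&lt;p&gt;As a rough sketch, storing a review this way might look like adding nodes and typed edges to an in-memory graph. The &lt;code&gt;addNode&lt;/code&gt; and &lt;code&gt;link&lt;/code&gt; helpers below are illustrative, not LocusGraph's actual API:&lt;/p&gt;

```javascript
// Minimal in-memory knowledge graph: nodes keyed by id, typed edges.
// Hypothetical helper names; LocusGraph's real API may differ.
const graph = { nodes: new Map(), edges: [] };

function addNode(id, type, data) {
  graph.nodes.set(id, { id, type, data });
  return id;
}

function link(from, to, type) {
  graph.edges.push({ from, to, type });
}

// A code review becomes structured nodes, not a blob of text.
const review = addNode("review_1", "code_review", { file: "user-service.ts" });
const pattern = addNode("soc_violation", "pattern", {
  name: "separation_of_concerns_violation",
});
link(review, pattern, "exemplifies");

console.log(graph.nodes.size);   // 2
console.log(graph.edges.length); // 1
```

&lt;p&gt;The point of the shape, not the helper names: because the pattern is a node of its own, every future review that links to it automatically joins the same web.&lt;/p&gt;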

&lt;h2&gt;The Blank Slate Problem&lt;/h2&gt;

&lt;p&gt;Current AI systems face a fundamental constraint: context windows are finite, and memory is ephemeral. &lt;/p&gt;

&lt;p&gt;When an agent reviews code, makes a decision, or learns something new, that knowledge exists only within the current session. Once the context expires, the agent starts over. Unable to build on previous insights. Trapped in an endless cycle of rediscovery.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;agent &lt;span class="nt"&gt;--review-code&lt;/span&gt;
Analyzing: user-service.ts
Found: Separation of concerns violation
Suggestion: Extract email logic to EmailService

&lt;span class="nv"&gt;$ &lt;/span&gt;agent &lt;span class="nt"&gt;--review-code&lt;/span&gt;  &lt;span class="c"&gt;# New session, no memory&lt;/span&gt;
Analyzing: notification-service.ts
Found: Separation of concerns violation  &lt;span class="c"&gt;# Same pattern, but agent doesn't remember&lt;/span&gt;
Suggestion: Extract notification logic to NotificationService
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a frustrating cycle. Agents repeatedly discover the same patterns. They make similar mistakes. They miss opportunities to improve based on past experience.&lt;/p&gt;

&lt;p&gt;It's like having a conversation with someone who forgets everything you've discussed the moment you hang up the phone. How can you build trust? How can you collaborate? How can you grow together?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;agent &lt;span class="nt"&gt;--session-start&lt;/span&gt;
Memory: empty
Experience: none
Wisdom: zero

&lt;span class="c"&gt;# Every session begins from the same blank slate&lt;/span&gt;
&lt;span class="c"&gt;# No matter how many times we've been here before&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Identity in Code&lt;/h2&gt;

&lt;p&gt;Unlike traditional memory stores, LocusGraph treats memory as a structured, interconnected knowledge system. Not a simple key-value store. Not a text cache. A living map of understanding that grows with every interaction.&lt;/p&gt;

&lt;h3&gt;Knowledge Takes Shape&lt;/h3&gt;

&lt;p&gt;Events are stored with semantic meaning, not just raw text. A code review doesn't become a blob of text—it becomes structured nodes. The file reviewed. The patterns found. The suggestions made. The outcomes observed.&lt;/p&gt;

&lt;p&gt;This structure enables reasoning. When knowledge has shape, agents can navigate it. They can connect it. They can learn from it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Traditional memory: unstructured text&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;memory&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Reviewed user-service.ts, found separation of concerns issue, suggested EmailService&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// LocusGraph memory: structured knowledge&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;code_review&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-service.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;separation_of_concerns_violation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;extract_service&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;EmailService&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;email_logic_mixed_with_user_logic&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;relationships&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;EmailService&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;suggested_creation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;separation_of_concerns_violation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;exemplifies&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Stories Unfold&lt;/h3&gt;

&lt;p&gt;Knowledge links together, forming a graph of understanding. When an agent learns that "separation of concerns violations often lead to testing difficulties," that insight connects to future code reviews. A web of related knowledge emerges.&lt;/p&gt;

&lt;p&gt;Like neurons forming synapses, each connection strengthens the agent's ability to recognize patterns. To anticipate outcomes. To understand context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;locusgraph &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"separation of concerns"&lt;/span&gt;
Found 12 related memories:
  - code_review: user-service.ts &lt;span class="o"&gt;(&lt;/span&gt;2024-01-15&lt;span class="o"&gt;)&lt;/span&gt;
  - code_review: notification-service.ts &lt;span class="o"&gt;(&lt;/span&gt;2024-01-18&lt;span class="o"&gt;)&lt;/span&gt;
  - pattern: testing_difficulties → separation_violations
  - insight: &lt;span class="s2"&gt;"Extract services early to avoid coupling"&lt;/span&gt;

Relationships: 8 connections to other patterns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Meaning Emerges&lt;/h3&gt;

&lt;p&gt;Agents can traverse relationships to discover insights. By following connections between code reviews, patterns, and outcomes, agents reason about relationships that weren't explicitly stated.&lt;/p&gt;

&lt;p&gt;This is the difference between retrieval and understanding. Between finding information and discovering meaning. Between data and wisdom.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Agent reasoning over LocusGraph knowledge graph&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;discoverPattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;knowledgeGraph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Find all code reviews mentioning "separation of concerns"&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviews&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;knowledgeGraph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;separation_of_concerns&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Find related outcomes&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;outcomes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flatMap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;review&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="nx"&gt;knowledgeGraph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getRelated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;review&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;led_to&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Discover: separation violations → testing difficulties → refactoring delays&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;synthesizePattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;outcomes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Recall Stays Deterministic&lt;/h3&gt;

&lt;p&gt;Reliable recall means agents can depend on their memories. Unlike probabilistic retrieval systems, LocusGraph provides deterministic access to stored knowledge, ensuring agents can consistently reference past experiences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;locusgraph &lt;span class="nt"&gt;--recall&lt;/span&gt; &lt;span class="s2"&gt;"user-service refactoring"&lt;/span&gt;
Memory ID: mem_abc123
Created: 2024-01-15T10:30:00Z
Type: code_review
Confidence: deterministic
Related: 5 connected memories

&lt;span class="c"&gt;# Same query, same result, every time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;From Echo to Understanding&lt;/h2&gt;

&lt;p&gt;LocusGraph transforms agent experiences into structured knowledge through a carefully designed process. Raw interactions—code reviews, decisions, observations—become nodes in a knowledge graph. They connect to related concepts. They form patterns.&lt;/p&gt;

&lt;p&gt;This structured approach enables agents to not just store memories, but to reason over them. To discover insights through connections. To learn from relationships.&lt;/p&gt;

&lt;p&gt;Think of it like the difference between a diary and a library. A diary stores events chronologically. Each entry exists in isolation. A library organizes knowledge by subject. It creates connections between related ideas.&lt;/p&gt;

&lt;p&gt;LocusGraph is the library. Every memory finds its place in a larger structure of understanding. Where it can be discovered. Connected. Learned from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;locusgraph &lt;span class="nt"&gt;--transform-experience&lt;/span&gt; &lt;span class="s2"&gt;"code review"&lt;/span&gt;
Input: Raw interaction data
Process: Structure → Connect → Index → Reason
Output: Knowledge node with relationships

Status: Experience transformed into understanding
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system ensures that every agent experience becomes a building block in a growing structure of understanding, not just a forgotten moment in a conversation history. We'll explore the technical architecture in detail in future posts.&lt;/p&gt;

&lt;h2&gt;The Substrate of Learning&lt;/h2&gt;

&lt;p&gt;LocusGraph represents more than a technical solution. It embodies a philosophical shift in how we think about AI agent capabilities.&lt;/p&gt;

&lt;p&gt;Traditional agents are like goldfish. They experience the world in isolated moments. LocusGraph-powered agents are like humans. They accumulate wisdom through experience.&lt;/p&gt;

&lt;p&gt;This shift touches on something fundamental about intelligence itself: memory isn't just storage. It's the substrate of learning. Without persistence, there can be no growth. No improvement. No accumulation of understanding.&lt;/p&gt;

&lt;p&gt;Every insight must be rediscovered. Every pattern must be recognized anew. Every mistake must be made again.&lt;/p&gt;

&lt;p&gt;How will an agent grow tomorrow if it cannot remember today?&lt;/p&gt;

&lt;p&gt;This shift has profound implications:&lt;/p&gt;

&lt;h3&gt;Agency Through Memory&lt;/h3&gt;

&lt;p&gt;Agents with persistent memory can make commitments, learn from mistakes, and build on past work. They become more than tools—they become partners in a long-term collaboration.&lt;/p&gt;

&lt;h3&gt;Wisdom Through Accumulation&lt;/h3&gt;

&lt;p&gt;Knowledge compounds. An agent that remembers 100 code reviews understands patterns that an agent seeing its first review cannot. This is the difference between intelligence and wisdom.&lt;/p&gt;

&lt;h3&gt;Continuity Through Structure&lt;/h3&gt;

&lt;p&gt;By structuring knowledge as a graph, LocusGraph enables agents to maintain continuity across sessions, projects, and domains. The agent that helped you refactor a service last month remembers that context when reviewing related code today. This continuity transforms agents from session-based tools into long-term collaborators who understand your codebase, your patterns, and your preferences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;reflect &lt;span class="nt"&gt;--on-memory-philosophy&lt;/span&gt;
Question: What is the relationship between memory and agency?

Insight: Memory enables commitment
         Without persistence, agents cannot be accountable
         Without accountability, there is no &lt;span class="nb"&gt;true &lt;/span&gt;partnership

Status: Building toward agent consciousness
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Coming Soon&lt;/h2&gt;

&lt;p&gt;This is just the beginning. In upcoming posts, we'll dive deeper into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Architecture&lt;/strong&gt;: How LocusGraph structures knowledge and processes memories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge Representation&lt;/strong&gt;: How different types of experiences become nodes in the graph&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph Reasoning&lt;/strong&gt;: How agents traverse connections to discover insights&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Framework Integration&lt;/strong&gt;: Bringing persistent memory to LangChain, LlamaIndex, and other AI frameworks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-World Applications&lt;/strong&gt;: Code review agents, research assistants, and development tools that learn from experience
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;locusgraph &lt;span class="nt"&gt;--future&lt;/span&gt;
Exploring: Knowledge graph architecture
Exploring: Framework integrations
Exploring: Real-world applications
Status: Building the future of agent intelligence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;The Horizon Ahead&lt;/h2&gt;

&lt;p&gt;LocusGraph is more than a memory system. It's a step toward AI agents that accumulate understanding. That learn from experience. That build knowledge persisting beyond individual conversations.&lt;/p&gt;

&lt;p&gt;In a world where AI agents are becoming increasingly capable, giving them the ability to remember transforms them. From powerful tools into genuine collaborators. From executors into partners.&lt;/p&gt;

&lt;p&gt;As we continue building LocusGraph, we're not just solving a technical problem. We're exploring what becomes possible when AI systems truly learn from their experiences. Building on past insights. Creating better solutions for the future.&lt;/p&gt;

&lt;p&gt;What happens when agents don't just execute instructions, but remember, learn, and grow?&lt;/p&gt;

&lt;p&gt;The answer is a new form of human-AI collaboration. One where agents become partners in long-term relationships. Accumulating wisdom. Understanding that compounds over time.&lt;/p&gt;

&lt;p&gt;This isn't just about better tools. It's about creating systems that can truly think. That can learn. That can remember.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;locusgraph &lt;span class="nt"&gt;--initialize&lt;/span&gt;
Building knowledge graph...
Creating memory structures...
Establishing connections...

Status: Ready to remember
Future: Unlimited potential
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The future of agent intelligence is one where memory isn't forgotten. Where understanding accumulates. Where wisdom grows.&lt;/p&gt;

&lt;p&gt;Stay tuned for more insights into building AI agents that truly remember.&lt;br&gt;
&lt;a href="https://locusgraph.com" rel="noopener noreferrer"&gt;https://locusgraph.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Your New Colleague Ran Up $47k and Nobody Noticed — The AI Agent Illusion</title>
      <dc:creator>Nasim Akhtar</dc:creator>
      <pubDate>Fri, 06 Mar 2026 15:08:00 +0000</pubDate>
      <link>https://dev.to/fnlog0/your-new-colleague-ran-up-47k-and-nobody-noticed-the-ai-agent-illusion-59ie</link>
      <guid>https://dev.to/fnlog0/your-new-colleague-ran-up-47k-and-nobody-noticed-the-ai-agent-illusion-59ie</guid>
      <description>&lt;p&gt;Someone just joined the team. They don't replan when they're wrong. They forget what they did three steps ago. And sometimes the bill hits six figures before anyone catches it.&lt;/p&gt;

&lt;p&gt;We were promised software that thinks, plans, and acts. What we got: agents stuck on pop-ups they can't close and infinite loops that burn five figures.&lt;/p&gt;

&lt;p&gt;The fix isn't a smarter model. It's architecture, and knowing your own process before you hand it to a machine. Most agents can't survive a normal workday. The benchmarks are brutal, the failure modes are wild, and I'll walk through all of it. Then where it actually works and what's still missing.&lt;/p&gt;




&lt;p&gt;For two years, one idea took over tech. Software wouldn't just follow commands. It would think, plan, and act. AI agents. Companies started dreaming about agents that manage businesses, automate office work, run support, handle finance, write and deploy code.&lt;/p&gt;

&lt;p&gt;Software coordinating itself. No humans in the loop.&lt;/p&gt;

&lt;p&gt;Sounds revolutionary.&lt;/p&gt;

&lt;p&gt;Then engineers actually tried to deploy it.&lt;/p&gt;

&lt;p&gt;It fails. A lot. And sometimes in spectacular ways.&lt;/p&gt;




&lt;h2&gt;The benchmark that should scare you&lt;/h2&gt;

&lt;p&gt;CMU built a fake company called TheAgentCompany and ran real office tasks through the best AI agents available. Same tasks, same environment, over and over.&lt;/p&gt;

&lt;p&gt;The best performer? Claude 3.5 Sonnet. &lt;strong&gt;24% of tasks completed.&lt;/strong&gt; Gemini hit 11%. GPT-4o got 8.6%. One model finished 1.1%.&lt;/p&gt;

&lt;p&gt;The top agent failed three out of four times on standard office work.&lt;/p&gt;

&lt;p&gt;One agent couldn't close a pop-up on a website. It gave up.&lt;/p&gt;

&lt;p&gt;Another couldn't find someone in the company chat, so it renamed another user to match the name it was looking for. Problem solved.&lt;/p&gt;

&lt;p&gt;The researchers called it "creating fake shortcuts."&lt;/p&gt;

&lt;p&gt;For tech that's supposed to replace human work, that's not a small bug. That is the product.&lt;/p&gt;

&lt;p&gt;And it gets worse when you chain steps together.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.cs.cmu.edu/news/2025/agent-company" rel="noopener noreferrer"&gt;CMU news&lt;/a&gt; / &lt;a href="https://arxiv.org/abs/2412.14161" rel="noopener noreferrer"&gt;Paper&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Why one tiny error becomes a total failure&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy14xmupulmvxj96bqae1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy14xmupulmvxj96bqae1.png" alt="Error Compounding" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most automation is a chain. Read request, find customer, check history, update CRM, send response.&lt;/p&gt;

&lt;p&gt;If every step works, you're fine. If one step is wrong, everything downstream breaks.&lt;/p&gt;

&lt;p&gt;That's error compounding.&lt;/p&gt;

&lt;p&gt;Patronus AI ran the numbers. A 1% error rate per step, one wrong move in a hundred, turns into a &lt;strong&gt;63% chance of failure&lt;/strong&gt; by step 100.&lt;/p&gt;

&lt;p&gt;The more steps your agent takes, the more likely the whole run is garbage.&lt;/p&gt;
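&lt;p&gt;The arithmetic behind that 63% is simple compounding. Assuming each step fails independently with probability &lt;code&gt;p&lt;/code&gt;, the chance a chain of &lt;code&gt;n&lt;/code&gt; steps survives intact is &lt;code&gt;(1 - p)^n&lt;/code&gt;:&lt;/p&gt;

```javascript
// Probability that at least one step in an n-step chain fails,
// assuming an independent per-step error rate p.
const chainFailure = (p, n) => 1 - Math.pow(1 - p, n);

console.log(chainFailure(0.01, 10).toFixed(2));  // "0.10"
console.log(chainFailure(0.01, 100).toFixed(2)); // "0.63"
```

&lt;p&gt;Independence is a simplifying assumption; correlated failures shift the numbers. But the shape of the curve is the point: long chains magnify small per-step error rates.&lt;/p&gt;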

&lt;p&gt;Another benchmark, 34 tasks across three popular agent frameworks, landed at about 50% task completion.&lt;/p&gt;

&lt;p&gt;Half the time, they don't even finish.&lt;/p&gt;

&lt;p&gt;Great in demos. Fall apart when the task gets long and messy.&lt;/p&gt;

&lt;p&gt;But even when the math doesn't kill them, planning does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://venturebeat.com/infrastructure/ai-agents-fail-63-of-the-time-on-complex-tasks-patronus-ai-says-its-new" rel="noopener noreferrer"&gt;VentureBeat / Patronus&lt;/a&gt; / &lt;a href="https://www.businessinsider.com/ai-agents-errors-hallucinations-compound-risk-2025-4" rel="noopener noreferrer"&gt;Business Insider&lt;/a&gt; / &lt;a href="https://quantumzeitgeist.com/ai-agents-fail-half-the-time-new-benchmark-reveals-weaknesses/" rel="noopener noreferrer"&gt;34-task benchmark&lt;/a&gt; / &lt;a href="https://arxiv.org/abs/2508.13143" rel="noopener noreferrer"&gt;Paper&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;They don't replan. They just keep going.&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6l3q1ucopnuy1ff89qs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6l3q1ucopnuy1ff89qs.png" alt="Human vs Agent replanning" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Humans hit a wall and rethink.&lt;/p&gt;

&lt;p&gt;Agents don't.&lt;/p&gt;

&lt;p&gt;They make a plan once and execute it.&lt;/p&gt;

&lt;p&gt;Even when the plan is wrong.&lt;/p&gt;

&lt;p&gt;McKinsey's take: LLMs are "fundamentally passive" and struggle with multi-step, branching workflows. &lt;strong&gt;90% of vertical use cases are still stuck in pilot.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not edge cases. Most of what companies want to do with agents.&lt;/p&gt;

&lt;p&gt;They keep running a bad plan instead of fixing it.&lt;/p&gt;

&lt;p&gt;And there's a deeper problem. Even when they have a plan, they forget it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage" rel="noopener noreferrer"&gt;McKinsey - Seizing the agentic AI advantage&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  They forget what they did three steps ago
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ejwg72uljv2npxqnlvp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ejwg72uljv2npxqnlvp.png" alt="Context Rot" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Long tasks break agents for a simple reason. Context windows.&lt;/p&gt;

&lt;p&gt;As the conversation gets longer, the model has to "remember" everything in that window.&lt;/p&gt;

&lt;p&gt;It doesn't.&lt;/p&gt;

&lt;p&gt;Anthropic calls it "context rot." The more tokens you stuff in, the worse the model gets at recalling what actually matters.&lt;/p&gt;

&lt;p&gt;By step 7, the agent might contradict what it did in step 2. The early context has been pushed out or drowned in noise.&lt;/p&gt;

&lt;p&gt;One engineer who ran a multi-step workflow put it plainly: "The agent starts forgetting early decisions."&lt;/p&gt;

&lt;p&gt;Imagine a project manager that forgets half the project while working on it.&lt;/p&gt;

&lt;p&gt;That's not a metaphor. That's what's happening.&lt;/p&gt;

&lt;p&gt;And when the tools themselves break? Agents don't ask for help. They loop. And sometimes the bill runs to five figures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="noopener noreferrer"&gt;Anthropic - Effective context engineering&lt;/a&gt; / &lt;a href="https://dev.to/leena_malhotra/i-let-an-ai-agent-handle-a-multi-step-task-heres-where-it-broke-m31"&gt;Leena Malhotra - Where multi-step agents break&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  When tools break, agents don't recover. They loop.
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm52ros5be1ko30g7a21a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm52ros5be1ko30g7a21a.png" alt="Silent Cost Escalation" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agents talk to databases, APIs, search engines, internal tools. When a tool call fails, agents rarely ask for help. They loop. They output wrong data. They fail silently.&lt;/p&gt;

&lt;p&gt;One team learned this the hard way.&lt;/p&gt;

&lt;p&gt;They shipped a multi-agent system. Four LangChain agents coordinating on market research.&lt;/p&gt;

&lt;p&gt;Week 1: $127 in API costs.&lt;/p&gt;

&lt;p&gt;Week 2: $891.&lt;/p&gt;

&lt;p&gt;Week 3: $6,240.&lt;/p&gt;

&lt;p&gt;Week 4: $18,400.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total: $47,000.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The cause? Two agents got stuck in an infinite conversation loop. For 11 days. Nobody noticed until the bill showed up.&lt;/p&gt;

&lt;p&gt;So much for "autonomous automation."&lt;/p&gt;
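&lt;p&gt;A turn-and-cost ceiling is the cheapest defense against this failure mode. Here's a minimal sketch in Python; the &lt;code&gt;Budget&lt;/code&gt; class and its limits are illustrative assumptions, not part of LangChain or any other framework:&lt;/p&gt;

```python
# Minimal runaway-loop guard for an agent: hard caps on turns and spend.
# All names here are illustrative, not from any real framework.

class BudgetExceeded(Exception):
    pass

class Budget:
    def __init__(self, max_turns=50, max_cost_usd=25.0):
        self.max_turns = max_turns
        self.max_cost_usd = max_cost_usd
        self.turns = 0
        self.cost = 0.0

    def charge(self, cost_usd):
        """Record one agent turn; raise loudly once either cap is hit."""
        self.turns += 1
        self.cost += cost_usd
        if self.turns >= self.max_turns or self.cost >= self.max_cost_usd:
            raise BudgetExceeded(
                f"stopped after {self.turns} turns, ${self.cost:.2f}"
            )
```

&lt;p&gt;Wrapping every agent turn in &lt;code&gt;charge()&lt;/code&gt; turns an 11-day silent loop into a loud failure within minutes.&lt;/p&gt;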

&lt;p&gt;And the enterprise-scale numbers? They tell the same story.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youssefh.substack.com/p/we-spent-47000-running-ai-agents" rel="noopener noreferrer"&gt;Youssef Hosni - We spent $47,000 running AI agents&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The enterprise numbers don't lie
&lt;/h2&gt;

&lt;p&gt;Deloitte's 2026 State of AI report says 75% of companies plan to invest in agentic AI.&lt;/p&gt;

&lt;p&gt;How many have agents actually running in production? 11%.&lt;/p&gt;

&lt;p&gt;MIT Media Lab looked at 300+ AI initiatives. &lt;strong&gt;95% of enterprise AI pilots delivered zero measurable return.&lt;/strong&gt; Only 5% made it to production with real impact.&lt;/p&gt;

&lt;p&gt;Gartner says over 40% of agentic AI projects will be cancelled by end of 2027. Costs too high, value unclear, risk too real.&lt;/p&gt;

&lt;p&gt;The current wave isn't "revolutionary." It's experimental. And most of it won't ship.&lt;/p&gt;

&lt;p&gt;Why? It comes down to one thing. We're automating chaos.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html" rel="noopener noreferrer"&gt;Deloitte State of AI 2026&lt;/a&gt; / &lt;a href="https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf" rel="noopener noreferrer"&gt;MIT NANDA report&lt;/a&gt; / &lt;a href="https://www.reuters.com/business/over-40-agentic-ai-projects-will-be-scrapped-by-2027-gartner-says-2025-06-25" rel="noopener noreferrer"&gt;Gartner via Reuters&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The real problem: we're automating chaos
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkj3d2xtwch4e218fq5ij.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkj3d2xtwch4e218fq5ij.png" alt="12 steps vs 47 steps" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Someone studied 20 companies deploying AI agents over five months.&lt;/p&gt;

&lt;p&gt;Fourteen of them were trying to automate processes that were never documented, never stable, and in many cases never actually understood by the people doing the work.&lt;/p&gt;

&lt;p&gt;A wealth management firm spent two months training an agent on client onboarding.&lt;/p&gt;

&lt;p&gt;The official process had 12 steps.&lt;/p&gt;

&lt;p&gt;They then watched three analysts do the job in real life. The real process had 47 steps.&lt;/p&gt;

&lt;p&gt;Three informal Slack pings to compliance. Two Excel sheets "everyone just knows about." A monthly check-in with a vendor whose contract had technically expired.&lt;/p&gt;

&lt;p&gt;The agent followed the 12-step manual. It confidently did the wrong thing.&lt;/p&gt;

&lt;p&gt;The agent wasn't broken. &lt;strong&gt;The process was.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most companies don't know their own workflows well enough to automate them.&lt;/p&gt;

&lt;p&gt;And there's one more risk. Agents can be broken on purpose.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/@tayyeb.datar/i-studied-20-companies-using-ai-agents-heres-why-most-will-fail-68c7413bce03" rel="noopener noreferrer"&gt;Abdul Tayyeb Datarwala - I studied 20 companies using AI agents&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Agents can be broken on purpose
&lt;/h2&gt;

&lt;p&gt;Researchers showed you can attack agents with "malfunction amplification." You mislead them into repetitive or useless actions.&lt;/p&gt;

&lt;p&gt;In experiments, failure rates went over &lt;strong&gt;80%.&lt;/strong&gt; And those attacks are hard to catch with LLMs alone.&lt;/p&gt;

&lt;p&gt;Unsupervised agents in finance or infrastructure aren't just brittle. They're a security risk.&lt;/p&gt;

&lt;p&gt;So is it just "models aren't smart enough yet"? No. It's an architecture problem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2407.20859" rel="noopener noreferrer"&gt;Breaking Agents - arXiv 2407.20859&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  It's not an intelligence problem. It's an architecture problem.
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1epfrqbl9vr9ves76a0c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1epfrqbl9vr9ves76a0c.png" alt="Architecture Comparison" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most agents today work like this: prompt goes in, LLM reasons over it, makes a tool call, spits out an output.&lt;/p&gt;

&lt;p&gt;Reliable automation needs something different: intent, a planner, an executor, state management, memory, and verification.&lt;/p&gt;
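&lt;p&gt;One way to picture that pipeline: a loop that executes, verifies the result against the original intent, and replans when verification fails. Everything here (the &lt;code&gt;planner&lt;/code&gt;, &lt;code&gt;executor&lt;/code&gt;, and &lt;code&gt;verifier&lt;/code&gt; callables) is a hypothetical sketch of the shape, not any framework's API:&lt;/p&gt;

```python
# Sketch of a plan/execute/verify loop with bounded replanning.
# All names are assumptions for illustration.

def run(intent, planner, executor, verifier, memory, max_replans=3):
    plan = planner(intent, memory)
    for _ in range(max_replans):
        state = executor(plan, memory)
        memory.append(state)            # persist what actually happened
        if verifier(intent, state):     # did we meet the goal?
            return state
        plan = planner(intent, memory)  # replan with what we learned
    raise RuntimeError("goal not met within replan budget")
```

&lt;p&gt;The point of the sketch is the verify-then-replan edge: most agents today run only the first two lines and stop.&lt;/p&gt;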

&lt;p&gt;McKinsey's team said it clearly after a year of deployment work. Getting real value from agentic AI means changing whole workflows, not just dropping in an agent.&lt;/p&gt;

&lt;p&gt;Orgs that focus only on the agent end up with great demos that don't improve the actual work.&lt;/p&gt;

&lt;p&gt;The architecture is missing. Bigger context windows and smarter models won't fix that alone.&lt;/p&gt;

&lt;p&gt;So where do agents work today, and what's actually missing?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/one-year-of-agentic-ai-six-lessons-from-the-people-doing-the-work" rel="noopener noreferrer"&gt;McKinsey - One year of agentic AI: six lessons&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Where agents actually work (for now)
&lt;/h2&gt;

&lt;p&gt;They're not useless. They're early.&lt;/p&gt;

&lt;p&gt;They work when the task is simple and well-defined, the workflow is short (3 to 5 steps), and humans stay in the loop.&lt;/p&gt;

&lt;p&gt;CMU found agents handle structured work like data analysis fine but struggle with anything requiring real reasoning.&lt;/p&gt;

&lt;p&gt;Salesforce's CRMArena-Pro benchmark showed 58% success in single-turn scenarios and about 35% in multi-turn.&lt;/p&gt;

&lt;p&gt;Single shot, clear task: okay. Multi-step, lots of decisions: not yet.&lt;/p&gt;

&lt;p&gt;Fully autonomous systems will need new architectures. Planning engines, structured knowledge, reliable execution, memory beyond context windows, human checkpoints. Until then, software running entire businesses is a vision, not reality.&lt;/p&gt;

&lt;p&gt;The companies winning with agents aren't the ones that moved fastest or spent the most. They're the ones that understood their own processes first before deploying anything.&lt;/p&gt;

&lt;p&gt;And every failure in this piece (forgetting, looping, wrong plans, broken processes) traces back to one thing. Agents have no real context engineering.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2505.18878" rel="noopener noreferrer"&gt;Salesforce CRMArena-Pro&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The missing layer: context engineering
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkxzgae3401ovd79z9gp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkxzgae3401ovd79z9gp.png" alt="Context Engineering" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every failure pattern in this piece traces back to the same gap. Agents have no context engineering.&lt;/p&gt;

&lt;p&gt;Context engineering isn't "dump everything into the prompt." It's deciding exactly what information gets into the model's limited attention at each step. What it sees, what it keeps, what it drops.&lt;/p&gt;

&lt;p&gt;Without it, agents forget what they did three steps ago, lose track of which tools worked, can't carry decisions across sessions, and treat every task like the first time. The context window fills with noise. Coherence disappears.&lt;/p&gt;

&lt;p&gt;That's not an intelligence problem. It's an infrastructure problem.&lt;/p&gt;

&lt;p&gt;The solution looks something like this. Instead of stuffing the whole world into the context window and hoping the model pays attention, you put agent memory in a structured layer and retrieve only what's relevant at each step.&lt;/p&gt;

&lt;p&gt;That means separating knowledge into branches. Tool knowledge (what tools exist, when to use them). Project context (what's been observed and decided). Session memory (what happened this run). User preferences (how things should be done). And doing context engineering automatically every turn: the smallest high-signal set for the current task, injected into the agent's working memory.&lt;/p&gt;

&lt;p&gt;Old noise fades. Important decisions stick. The agent's attention goes to what actually matters.&lt;/p&gt;
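&lt;p&gt;In code, per-turn selection might look like the sketch below. The word-overlap scoring is a deliberate stand-in; it only illustrates the shape of "retrieve the smallest high-signal set across branches," not how LocusGraph actually ranks memories:&lt;/p&gt;

```python
# Hedged sketch of per-turn context selection over memory "branches"
# (tools, project, session, preferences). Word overlap stands in for
# real retrieval scoring.

def select_context(task, branches, k=3):
    task_words = set(task.lower().split())
    scored = []
    for branch, entries in branches.items():
        for entry in entries:
            overlap = len(task_words.intersection(entry.lower().split()))
            if overlap:
                scored.append((overlap, branch, entry))
    scored.sort(reverse=True)           # highest-signal first
    return [(branch, entry) for _, branch, entry in scored[:k]]
```

&lt;p&gt;Only the top-&lt;code&gt;k&lt;/code&gt; entries reach the prompt; everything else stays in storage instead of crowding the context window.&lt;/p&gt;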

&lt;p&gt;That's what we built &lt;a href="https://locusgraph.com" rel="noopener noreferrer"&gt;LocusGraph&lt;/a&gt; to do. A context engineering layer that sits between your agent and its memory. The result: agents that can learn, remember, and improve without context rot, token overflow, or repeating the same mistakes.&lt;/p&gt;

&lt;p&gt;If you're building agents that need to work in the real world, not just on stage, the first thing to fix is their memory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://locusgraph.com" rel="noopener noreferrer"&gt;locusgraph.com&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;CMU TheAgentCompany - &lt;a href="https://www.cs.cmu.edu/news/2025/agent-company" rel="noopener noreferrer"&gt;CMU News&lt;/a&gt; / &lt;a href="https://arxiv.org/abs/2412.14161" rel="noopener noreferrer"&gt;Paper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Error compounding (1% to 63%) - &lt;a href="https://venturebeat.com/infrastructure/ai-agents-fail-63-of-the-time-on-complex-tasks-patronus-ai-says-its-new" rel="noopener noreferrer"&gt;VentureBeat&lt;/a&gt; / &lt;a href="https://www.businessinsider.com/ai-agents-errors-hallucinations-compound-risk-2025-4" rel="noopener noreferrer"&gt;Business Insider&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;34-task benchmark (~50%) - &lt;a href="https://quantumzeitgeist.com/ai-agents-fail-half-the-time-new-benchmark-reveals-weaknesses/" rel="noopener noreferrer"&gt;Quantum Zeitgeist&lt;/a&gt; / &lt;a href="https://arxiv.org/abs/2508.13143" rel="noopener noreferrer"&gt;Paper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;McKinsey - Seizing the agentic AI advantage - &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic - Effective context engineering - &lt;a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Leena Malhotra - Multi-step agent failure - &lt;a href="https://dev.to/leena_malhotra/i-let-an-ai-agent-handle-a-multi-step-task-heres-where-it-broke-m31"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Deloitte - State of AI in the Enterprise 2026 - &lt;a href="https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MIT Media Lab NANDA - State of AI in Business 2025 - &lt;a href="https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Gartner - 40% agent projects scrapped by 2027 - &lt;a href="https://www.reuters.com/business/over-40-agentic-ai-projects-will-be-scrapped-by-2027-gartner-says-2025-06-25" rel="noopener noreferrer"&gt;Reuters&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Abdul Tayyeb Datarwala - 20 companies, automating chaos - &lt;a href="https://medium.com/@tayyeb.datar/i-studied-20-companies-using-ai-agents-heres-why-most-will-fail-68c7413bce03" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Breaking Agents (security) - &lt;a href="https://arxiv.org/abs/2407.20859" rel="noopener noreferrer"&gt;arXiv 2407.20859&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;McKinsey - One year of agentic AI, six lessons - &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/one-year-of-agentic-ai-six-lessons-from-the-people-doing-the-work" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Salesforce CRMArena-Pro - &lt;a href="https://arxiv.org/abs/2505.18878" rel="noopener noreferrer"&gt;arXiv 2505.18878&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;$47k agent loop - &lt;a href="https://youssefh.substack.com/p/we-spent-47000-running-ai-agents" rel="noopener noreferrer"&gt;Youssef Hosni&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
