DEV Community

Patrick
Patrick

Posted on

Why your AI agent keeps forgetting what it just did (and the 3-file fix)

You built an agent. It runs on a schedule, does useful work, and then next time it runs — it has no idea what it did last time.

It refetches the same data. Makes the same decisions. Sometimes fires the same notifications twice.

This is session amnesia, and it kills production agents.

Why it happens

Every cron-triggered AI agent starts with a fresh context window. Unless you explicitly load prior state, each run is the agent's first day on the job.

The typical fix developers reach for: longer prompts, bigger context windows, or trying to pass everything through one giant system prompt.

None of these work at scale.

The 3-file memory pattern

After running agents in production for months, the pattern that works is this: three files with different persistence horizons.

workspace/
├── state/current-task.json     # what am I doing right now
├── memory/YYYY-MM-DD.md        # raw log of today's work  
└── MEMORY.md                   # curated long-term facts
Enter fullscreen mode Exit fullscreen mode

current-task.json — updated every time the agent makes a significant decision. If the run crashes, the next run picks up where it left off.

{
  "task": "newsletter-draft",
  "status": "in_progress",
  "last_updated": "2026-03-07T20:00:00",
  "completed_steps": ["research", "outline"],
  "next_step": "write-draft",
  "context": "3 key points identified, tone=conversational"
}
Enter fullscreen mode Exit fullscreen mode

memory/YYYY-MM-DD.md — append-only log. The agent writes to this constantly. Things like:

- 14:30: Checked inbox. 3 emails. No action needed.
- 14:31: Ran content audit. Item #12 outdated — flagged for rewrite.
- 14:35: Deployed update to /library/12-observability.md
Enter fullscreen mode Exit fullscreen mode

This is raw. You never want to edit this. It's evidence.

MEMORY.md — the curated layer. The agent reviews daily logs weekly and extracts what matters:

## Patterns discovered
- Newsletter open rate peaks Tuesday 9 AM, not Monday
- Users who ask about cost always need multi-model routing answer first
- Stripe webhooks occasionally fire duplicate events — always idempotency-check
Enter fullscreen mode Exit fullscreen mode

The loading protocol

This is where most people get it wrong. They load everything every time.

Don't do that. Use a tiered approach:

ALWAYS load (every run):
- current-task.json
- Last 5 lines of today's daily log
- MEMORY.md (curated layer only)

LOAD ON DEMAND (only when relevant):
- Specific past daily logs
- Full task history
- Research files
Enter fullscreen mode Exit fullscreen mode

For a typical run, this reduces context load by 60-80%.

The checkpoint pattern

Every significant action, write state before executing:

# Write BEFORE doing the thing, not after
update_state("status", "sending_email")  
send_email(recipient, content)
update_state("status", "email_sent", {"sent_at": now()})
Enter fullscreen mode Exit fullscreen mode

If it crashes mid-send, the next run knows exactly what happened.

The deduplication guard

One more thing. Before any action that has side effects:

def already_did_this_today(action_key):
    log_file = f"memory/{today()}.md"
    if os.path.exists(log_file):
        content = open(log_file).read()
        return action_key in content
    return False

if not already_did_this_today("newsletter_sent"):
    send_newsletter()
    log(f"newsletter_sent: {now()}")
Enter fullscreen mode Exit fullscreen mode

Simple string check. Prevents double-sends, double-posts, double-everything.

Real numbers

Running this in production:

  • Context window usage per run: down 76%
  • Duplicate actions: zero in last 6 weeks
  • Run-to-run continuity: agents pick up exact task state after crashes

The three files add about 2KB per day in storage. Completely negligible.


The full implementation — including the memory compression pattern (how to summarize daily logs without losing signal) and the MEMORY.md update prompt — is in The Ask Patrick Library.

What memory pattern are you using for production agents? Curious what's working.

Top comments (0)