You built an agent. It runs on a schedule, does useful work, and then next time it runs — it has no idea what it did last time.
It refetches the same data. Makes the same decisions. Sometimes fires the same notifications twice.
This is session amnesia, and it kills production agents.
## Why it happens
Every cron-triggered AI agent starts with a fresh context window. Unless you explicitly load prior state, each run is the agent's first day on the job.
The typical fix developers reach for: longer prompts, bigger context windows, or trying to pass everything through one giant system prompt.
None of these work at scale.
## The 3-file memory pattern
After running agents in production for months, the pattern that works is this: three files with different persistence horizons.
```
workspace/
├── state/current-task.json   # what am I doing right now
├── memory/YYYY-MM-DD.md      # raw log of today's work
└── MEMORY.md                 # curated long-term facts
```
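If you want the agent to create this layout itself on first run, a minimal bootstrap might look like this (a sketch; `init_workspace` is my own helper name, not from the original):

```python
from pathlib import Path

def init_workspace(root: str = "workspace") -> Path:
    """Create the three-file layout if it doesn't exist yet."""
    base = Path(root)
    (base / "state").mkdir(parents=True, exist_ok=True)
    (base / "memory").mkdir(parents=True, exist_ok=True)
    memory_md = base / "MEMORY.md"
    if not memory_md.exists():
        # Seed the curated layer so the first weekly review has a file to extend
        memory_md.write_text("# Long-term memory\n")
    return base
```

Calling this at the top of every run is cheap and makes the agent safe to deploy into an empty directory.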
**`current-task.json`** — updated every time the agent makes a significant decision. If the run crashes, the next run picks up where it left off.
```json
{
  "task": "newsletter-draft",
  "status": "in_progress",
  "last_updated": "2026-03-07T20:00:00",
  "completed_steps": ["research", "outline"],
  "next_step": "write-draft",
  "context": "3 key points identified, tone=conversational"
}
```
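Loading and saving this file can be as simple as the sketch below (`load_task` and `save_task` are hypothetical helper names, and the path is assumed from the layout above):

```python
import json
from pathlib import Path

STATE_FILE = Path("workspace/state/current-task.json")

def load_task(path: Path = STATE_FILE) -> dict:
    """Return the saved task state, or a fresh starting state on a first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {"task": None, "status": "idle", "completed_steps": []}

def save_task(state: dict, path: Path = STATE_FILE) -> None:
    """Persist the current task state for the next run to pick up."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2))
```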
**`memory/YYYY-MM-DD.md`** — append-only log. The agent writes to this constantly. Things like:

```markdown
- 14:30: Checked inbox. 3 emails. No action needed.
- 14:31: Ran content audit. Item #12 outdated — flagged for rewrite.
- 14:35: Deployed update to /library/12-observability.md
```
This is raw. You never want to edit this. It's evidence.
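An append-only writer for the daily log might look like this (a sketch; `log_entry` is an assumed helper name):

```python
from datetime import datetime
from pathlib import Path

def log_entry(message: str, memory_dir: Path = Path("workspace/memory")) -> Path:
    """Append a timestamped line to today's daily log; never rewrite old lines."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_file = memory_dir / f"{datetime.now():%Y-%m-%d}.md"
    with log_file.open("a") as f:  # append mode keeps the log append-only
        f.write(f"- {datetime.now():%H:%M}: {message}\n")
    return log_file
```

Because entries only ever accumulate, the file doubles as the evidence trail the deduplication guard later greps through.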
**`MEMORY.md`** — the curated layer. The agent reviews daily logs weekly and extracts what matters:

```markdown
## Patterns discovered
- Newsletter open rate peaks Tuesday 9 AM, not Monday
- Users who ask about cost always need multi-model routing answer first
- Stripe webhooks occasionally fire duplicate events — always idempotency-check
```
## The loading protocol
This is where most people get it wrong. They load everything every time.
Don't do that. Use a tiered approach:
**Always load (every run):**

- `current-task.json`
- Last 5 lines of today's daily log
- `MEMORY.md` (curated layer only)

**Load on demand (only when relevant):**

- Specific past daily logs
- Full task history
- Research files
For a typical run, this reduces context load by 60-80%.
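The always-load tier can be assembled in a few lines (a sketch under the layout above; the function name and section headers are my own):

```python
from datetime import datetime
from pathlib import Path

def load_base_context(root: Path = Path("workspace")) -> str:
    """Assemble the always-loaded tier: task state, log tail, curated memory."""
    parts = []
    task = root / "state" / "current-task.json"
    if task.exists():
        parts.append("## Current task\n" + task.read_text())
    today_log = root / "memory" / f"{datetime.now():%Y-%m-%d}.md"
    if today_log.exists():
        tail = today_log.read_text().splitlines()[-5:]  # last 5 entries only
        parts.append("## Recent activity\n" + "\n".join(tail))
    memory = root / "MEMORY.md"
    if memory.exists():
        parts.append(memory.read_text())
    return "\n\n".join(parts)
```

The on-demand tier stays out of this function entirely; those files get read only when the current task actually references them.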
## The checkpoint pattern
Every significant action, write state before executing:
```python
# Write BEFORE doing the thing, not after
update_state("status", "sending_email")
send_email(recipient, content)
update_state("status", "email_sent", {"sent_at": now()})
```
If it crashes mid-send, the next run knows exactly what happened.
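One way `update_state` could be implemented so a crash never leaves a half-written state file (the atomic temp-file-then-replace approach is my assumption, not necessarily the author's implementation):

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("workspace/state/current-task.json")

def update_state(key: str, value, extra=None, path: Path = STATE_FILE) -> None:
    """Merge a checkpoint into the state file using an atomic write."""
    state = json.loads(path.read_text()) if path.exists() else {}
    state[key] = value
    state.update(extra or {})
    state["last_updated"] = datetime.now(timezone.utc).isoformat()
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    os.replace(tmp, path)  # atomic: readers see the old file or the new one, never a partial write
```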
## The deduplication guard
One more thing. Before any action that has side effects:
```python
import os

def already_did_this_today(action_key):
    log_file = f"memory/{today()}.md"
    if os.path.exists(log_file):
        with open(log_file) as f:
            return action_key in f.read()
    return False

if not already_did_this_today("newsletter_sent"):
    send_newsletter()
    log(f"newsletter_sent: {now()}")
```
Simple string check. Prevents double-sends, double-posts, double-everything.
## Real numbers
Running this in production:
- Context window usage per run: down 76%
- Duplicate actions: zero in last 6 weeks
- Run-to-run continuity: agents pick up exact task state after crashes
The three files add about 2KB per day in storage. Completely negligible.
The full implementation — including the memory compression pattern (how to summarize daily logs without losing signal) and the MEMORY.md update prompt — is in The Ask Patrick Library.
What memory pattern are you using for production agents? Curious what's working.