LangChain has three memory types: semantic, episodic, procedural. Mem0 raised $24M to build a "memory layer." Letta built a company around persistent agent state. Between them they've covered remembering what happened pretty thoroughly, and I keep wondering why none of them noticed that there's actually a second half.
Psychologists call it prospective memory: the ability to remember to do things in the future. Take your medication at 8pm, call the dentist when the office opens, bring up the budget thing if someone mentions Q3 numbers. Einstein and McDaniel published the foundational research in 1990, and by now there's an entire subfield studying how it works and why it's cognitively distinct from remembering what already happened.
Every agent memory framework I've looked at implements retrospective memory, but prospective memory is curiously absent. I think that's the gap that explains why agents still feel like tools you operate rather than assistants that actually assist.
Two systems, not one
Prospective memory is an entirely different cognitive system. It boils down to two subtypes that map almost perfectly to the agent problem. Time-based prospective memory fires at a specific moment: "take the pill at 8pm." Event-based prospective memory fires when you encounter a cue: "when I see my colleague, ask about the report."
The frameworks that come closest to supporting proactive behavior do it through scheduling APIs. Letta lets you schedule messages with timestamps or cron expressions. This covers time-based prospective memory and nothing else, which is the equivalent of having amnesia but at least you wear a watch.
What's actually missing
The second type is the interesting one; it activates when the right context shows up.
When I cataloged the things I actually wanted a proactive assistant to handle, most of them weren't timer-based. I already have a phone with a decent reminder app; why would I need a more verbose interface to the same thing? My memory fails me when intentions get fuzzy and ill-defined, but with nothing better available I force them into the reminder mold anyway, which leaves me with a huge list to maintain and no record of why any of it exists.
When I mention "My ID card expires in March" while planning a trip to Japan in April, a good assistant would connect this to the upcoming travel and flag it before it's too late. You shouldn't have to spell out "and please remind me to renew it in January accounting for 5-10 business days processing time"; the whole point of an assistant is that it makes those connections for you.
"When the finance numbers come in, I need to start the report." The trigger is semantic: new information entering the system that matches a stored intention. Nobody knows when finance will send the numbers, so there's no date to set a cron for.
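That semantic trigger can be sketched concretely. This is a toy illustration, not anything from a real framework (the names are mine, and real matching would use embeddings rather than keyword overlap): an intention stores a set of cues, and each piece of new information entering the system is checked against the pending intentions.

```python
# Toy sketch of event-based prospective memory via keyword cues.
# Real matching would use embeddings; keyword overlap keeps the example small.

intentions = [
    {
        "summary": "Start the report when the finance numbers come in",
        "cues": {"finance", "numbers", "q3"},
        "status": "pending",
    }
]

def on_new_information(text: str) -> list[dict]:
    """Return pending intentions whose cues appear in the incoming text."""
    words = set(text.lower().split())
    activated = []
    for intention in intentions:
        if intention["status"] == "pending" and intention["cues"] & words:
            intention["status"] = "activated"
            activated.append(intention)
    return activated

hits = on_new_information("Finance just sent the Q3 numbers")
```

No cron expression anywhere: the arrival of the information is itself the trigger.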
The benchmark
I tracked what I actually needed a proactive assistant to remember over the course of a few weeks, and turned those into 22 test scenarios across 10 categories: simple timed reminders, cancelled events that should produce silence, fuzzy "sometime next month" intentions, and conditional triggers that fire when new information arrives. 9 of the 22 had an explicit timer attached and the other 13 didn't, which is the whole point.
I compared three methods. Simple cron is just that: a basic scheduling system. Smart cron is the same scheduler, but the agent that wakes up gets as much information as I could give it: full conversation history, full memory entries, and reasoning instructions. The prospective memory method uses projections: structured entries with an activation condition, stored context, a status, and triggers, evaluated on a daily review or whenever new information arrives.
Sonnet 4.6 generated all responses; Opus 4.6 and Gemini 3.1 Pro judged them on usefulness, context richness, appropriateness, and coherence, each on a 1-5 scale.
On the full 22 the gap is wide: prospective entries 4.80, smart cron 2.59, cron 1.63. That makes sense, since the crons can't fire at all on 13 of the scenarios. But even on the 9 scenarios where every method could fire, prospective entries scored 4.88 versus smart cron's 4.16. A blind human evaluation on 10 scenarios corroborated the rankings: prospective memory 4.85, smart cron 3.33, cron 2.38.
Interestingly, my initial smart cron implementation, which got only the reminder text and some reasoning instructions, scored 2.74. The full treatment, with conversation history, memory entries, and reasoning instructions, scored 2.59. Lower. It reinforces that more context without the right structure just gives models more rope to hang themselves with.
What a prospective memory entry looks like
The implementation is pretty small. Instead of storing "user's ID card expires March 2026" as a flat fact, you create a forward-looking entry:
Summary: ID card expires March 2026, needs renewal before Japan trip
Activate: January 2026 (month resolution)
Context: Planning Japan trip for April. Dutch renewal takes 5-10 business days.
Status: pending
Trigger: none (time-based)
And for the conditional one:
Summary: Schedule internal sync if client meeting is cancelled
Activate: when cancellation detected
Context: Client meeting Thursday 2pm. Team wanted to discuss roadmap anyway.
Status: pending
Trigger: client meeting status changes to cancelled
Activation can be a time, a time range, or a semantic condition. What matters is that the context is captured at creation time, while the agent still has the full conversational picture. That's also why I call these entries "projections": the agent projects itself into a future state and writes itself a note while it actually knows what's going on. The status field lets the agent decide that silence is the correct response when an event was cancelled or a task already completed, and triggers connect entries to future events, so that when new information enters the system, matching entries can activate without a timer.
Why nobody built this
The agent memory field grew out of information retrieval, not cognitive psychology; the people building these systems are solving "given a query, find the most relevant stored information," which is retrospective by definition. And the architectures are reactive: user sends message, agent responds.
Proactive behavior needs a different trigger mechanism entirely, and the few frameworks that have one bolt it on as a scheduling system rather than treating it as memory. The result is a blind spot shaped like half of human memory. The part that remembers is covered, but the part that reminds is conspicuously missing.
What the cognitive science predicts
Prospective memory research has decades of findings that map quite cleanly to the agent problem.
Time-based tasks (do X at 3pm) rely on monitoring: you check the clock repeatedly, and performance degrades when you're busy. Event-based tasks (do X when you see Bob) rely on cue recognition, which is automatic and much cheaper; the right environmental trigger brings the intention to mind without conscious effort.
That maps directly to implementation choices. A polling-based system that periodically checks "is there anything I should do right now?" is the time-based approach: expensive and easy to miss things between polls. A trigger-based system where new information entering the memory store activates matching intentions is the event-based approach: cheaper, more reliable, activated by the cue itself.
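The contrast can be sketched in a few lines. This is an illustrative toy, not any framework's API: the event-based path activates entries at write time, while the time-based path depends on a poll loop hitting the right moment.

```python
# Toy contrast of the two activation styles (all names hypothetical).

class ProspectiveStore:
    def __init__(self):
        # each entry: {"condition": callable, "action": str, "status": str, "due": str?}
        self.entries = []

    # Event-based: a write to the memory store is itself the cue.
    def write(self, fact: str) -> list[str]:
        fired = []
        for e in self.entries:
            if e["status"] == "pending" and e["condition"](fact):
                e["status"] = "activated"
                fired.append(e["action"])
        return fired

    # Time-based: a scheduler has to poll; anything between polls waits or is missed.
    def poll(self, now: str) -> list[str]:
        return [e["action"] for e in self.entries
                if e["status"] == "pending" and e.get("due") == now]

store = ProspectiveStore()
store.entries.append({
    "condition": lambda fact: "cancelled" in fact.lower(),
    "action": "schedule internal sync",
    "status": "pending",
})
fired = store.write("Client meeting Thursday was cancelled")
```

The `write` path never has to ask "is it time yet?"; the arriving fact carries the cue with it.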
The cognitive science even predicts failure modes. Prospective memory breaks down when the cue is weak, when the person is under high cognitive load, or when there's a long delay between forming the intention and the activation moment. These failure modes apply directly to agent systems and could inform how you design activation thresholds, priority systems, and decay functions.
What framework authors could do
The addition is small enough to be a PR: a new memory type alongside semantic, episodic, and procedural, with fields for activation condition, stored context, status, and linked entries, plus a retrieval path that checks forward-looking entries when new information arrives rather than only when the user sends a query.
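As a rough sketch of that integration point, with all names hypothetical: a fourth memory type, plus an ingest path that checks forward-looking entries on every write rather than only on user queries.

```python
from enum import Enum

# Hypothetical sketch of the proposed addition; not any framework's real API.

class MemoryType(Enum):
    SEMANTIC = "semantic"
    EPISODIC = "episodic"
    PROCEDURAL = "procedural"
    PROSPECTIVE = "prospective"  # the proposed fourth type

class MemoryStore:
    def __init__(self):
        self.items = []  # {"type": MemoryType, "text": str, "cue": str or None}

    def retrieve(self, query: str) -> list[dict]:
        # Existing path: retrospective lookup when the user asks something.
        return [i for i in self.items if query.lower() in i["text"].lower()]

    def ingest(self, text: str) -> list[dict]:
        # New path: every incoming fact also checks forward-looking entries.
        self.items.append({"type": MemoryType.SEMANTIC, "text": text, "cue": None})
        return [i for i in self.items
                if i["type"] is MemoryType.PROSPECTIVE
                and i["cue"] and i["cue"] in text.lower()]

store = MemoryStore()
store.items.append({
    "type": MemoryType.PROSPECTIVE,
    "text": "Start the report when the finance numbers come in",
    "cue": "finance numbers",
})
matches = store.ingest("The finance numbers for Q3 just landed")
```

Everything else about the store stays the same; the change is one extra check on the write path.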
The benchmark, scenarios, and implementation are open source. I'd genuinely rather see prospective memory adopted by LangChain, Mem0, and Letta than have it only exist in my project. The gap between what cognitive science knows about forward-looking memory and what agent frameworks implement has been sitting there for thirty-five years. Seems like enough time for a 1.0.