The Man Who Studied the Hippocampus Is Telling You What's Missing

#ai #aimemory #neuroscience #deepmind

In a recent conversation with Y Combinator's Garry Tan, Google DeepMind CEO Demis Hassabis said something that should get more attention than it's getting.

"We're kind of using duct tape right now. Shove it all in the context window. This seems a bit unsatisfying."

That's the head of one of the best-funded AI labs on earth describing the current state of AI memory. Duct tape.

What makes this more than a throwaway line is who's saying it. Before Hassabis co-founded DeepMind, before AlphaGo, before AlphaFold, before the Nobel Prize, he earned a PhD in cognitive neuroscience at University College London. His research was specifically about how the hippocampus works. How it integrates new knowledge into existing knowledge. How it replays experiences during sleep to consolidate them into long-term memory. How patients with hippocampal damage struggle not only to remember old experiences but also to imagine new ones.

He's not just an AI guy with an opinion about memory. He's a neuroscientist who published foundational work on exactly this problem.

And he's telling us the current approach doesn't work.

Full conversation: How to Build the Future: Demis Hassabis (YC Startup Podcast)

Context windows aren't memory

Hassabis laid out the problem precisely. Humans have a working memory of about seven items. AI models have context windows of a million tokens or more. That sounds like an advantage, but we're using it wrong. We're treating working memory as if it were all of memory, dumping everything in there regardless of importance or relevance.

"Things that aren't important, things that are wrong. It's pretty brute force."

Even if you could scale context windows to ten million tokens, you still have a retrieval problem. "There's still a cost to looking it up and finding the right thing that's actually relevant for the specific decision you've got to make right now," Hassabis said. "And that's non-trivial."

A million tokens sounds like a lot until you try to process live video. Then it's about twenty minutes. If you want an AI that understands what's been happening in your life for a month, a million tokens is nothing.
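The arithmetic is worth running once. The sketch below uses only the two figures above (a million tokens, about twenty minutes of video) and assumes, purely for illustration, that "a month" means thirty days of continuous footage at the same token rate. The function names are made up for this example.

```python
# Back-of-the-envelope token budget, using only the article's two figures.
CONTEXT_TOKENS = 1_000_000  # context window size quoted in the article
VIDEO_MINUTES = 20          # minutes of live video that window holds (article's figure)


def implied_tokens_per_second(context_tokens: int, minutes: float) -> float:
    """Token rate implied by 'this many tokens lasts this many minutes'."""
    return context_tokens / (minutes * 60)


def tokens_needed(days: float, tokens_per_second: float) -> float:
    """Tokens needed to hold `days` of continuous video at that rate.

    Assumption: footage runs 24 hours a day with no compression or filtering.
    """
    return days * 24 * 60 * 60 * tokens_per_second


rate = implied_tokens_per_second(CONTEXT_TOKENS, VIDEO_MINUTES)
month_tokens = tokens_needed(30, rate)
windows = month_tokens / CONTEXT_TOKENS

print(f"{rate:.0f} tokens/s, {month_tokens:.2e} tokens for 30 days, "
      f"{windows:.0f} context windows")
```

Under those assumptions, a month of footage fills roughly 2,160 million-token windows, which is the scale of the gap the paragraph above is pointing at.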

The brain solves this differently. It doesn't store everything in working memory and hope for the best. It has separate systems: a hippocampus for encoding and consolidating experiences, a neocortex for long-term storage and pattern recognition, and specific mechanisms for deciding what matters enough to keep. Sleep isn't downtime. It's when the hippocampus replays important episodes so they can be integrated into permanent knowledge.

Hassabis knows this better than almost anyone alive. DeepMind's very first breakthrough, the DQN system that learned to play Atari games in 2013, used "experience replay," a technique inspired by how the hippocampus replays experiences during sleep. The system stored its past experiences and replayed samples of them many times during training. That technique helped launch the modern era of deep reinforcement learning.
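For readers who haven't seen experience replay, here is a minimal sketch of the idea: a fixed-size buffer of past transitions, sampled uniformly at random for training. This is an illustration under simple assumptions, not DeepMind's code; the class and field names are invented for the example, and the actual learning step is left out.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple


class Transition(NamedTuple):
    """One step of experience (illustrative fields, integer states for brevity)."""
    state: int
    action: int
    reward: float
    next_state: int
    done: bool


class ReplayBuffer:
    """Fixed-size store of past transitions with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # deque(maxlen=...) drops the oldest transition once the buffer is full.
        self._items: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement. Mixing old and new steps
        # breaks up the correlation between consecutive experiences.
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(Transition(state=step, action=0, reward=1.0,
                           next_state=step + 1, done=(step == 4)))

# Only the three most recent transitions survive; a batch is drawn from them.
batch = buffer.sample(batch_size=2)
print(len(buffer), sorted(t.state for t in batch))
```

The point of the design is that each experience can be learned from more than once, long after it happened, instead of being used a single time and discarded.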

The neuroscience was right then. It's still right now.

The architecture question

Here's where it gets interesting. Hassabis clearly understands that memory needs a fundamentally different approach than what we have now. His own research proved it. His own lab's first success was built on it. He explicitly says that not having continual learning is "one of the things holding back agents from doing full tasks."

Google could build proper memory integration into Gemini. Hassabis, given his background, might be exactly the person to make that happen. But even the best memory system built into one platform is still one platform's memory system. The technical question of how to architect memory is separate from the ownership question of who controls it.

Hassabis articulated the design principle clearly when talking about scientific tools: the goal is "really good general purpose tool usage models" that call specialized external systems, not one giant brain with everything crammed in. He pointed out that putting protein folding knowledge into Gemini would degrade its language skills. The specialization has to stay separate.
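In code, that principle looks something like the sketch below: a general model that keeps a registry of external tools and hands specialized tasks to them. Everything here is hypothetical, including the class name, the task labels, and the placeholder "folding" function, which only stands in for a real specialized system.

```python
from typing import Callable, Dict

# A tool is any function that takes a text payload and returns a text result.
Tool = Callable[[str], str]


class GeneralModel:
    """Stand-in for a general model that delegates to external tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, task: str, tool: Tool) -> None:
        """Attach a specialized external system under a task label."""
        self._tools[task] = tool

    def handle(self, task: str, payload: str) -> str:
        tool = self._tools.get(task)
        if tool is None:
            # No specialist registered: fall back to the general model itself.
            return f"general answer for: {payload}"
        return tool(payload)


def fold_protein(sequence: str) -> str:
    # Hypothetical placeholder; a real system would call a structure predictor.
    return f"predicted structure for {len(sequence)} residues"


model = GeneralModel()
model.register("fold", fold_protein)

print(model.handle("fold", "MKT"))
print(model.handle("chat", "hello"))
```

The specialist can be retrained, replaced, or removed without touching the general model, which is the property the protein folding example is about.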

He was talking about AlphaFold, not memory. But the logic extends naturally. If protein folding is too specialized to live inside a general model, what about the even more personal and idiosyncratic domain of an individual's accumulated knowledge, corrections, and reasoning history?

That's the architecture the brain uses. General cortical processing, with specialized subsystems connected to it. The hippocampus doesn't live inside the prefrontal cortex. It's a separate structure with its own circuitry, connected to the cortex but distinct from it.

If you follow the logic of Hassabis's own research, memory should be architecturally distinct from the reasoning engine. Whether that means a dedicated system inside Google's stack or an independent layer that works across platforms is a design choice, not a neuroscience question. The neuroscience just tells you that cramming everything into one undifferentiated context window isn't how memory works.

We've written separately about what an AI memory system should actually look like: typed memories, a real knowledge graph with structured relationships, hybrid search, personality persistence across platforms. Many of the design principles aren't speculative. They follow directly from the same neuroscience Hassabis published.
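As a rough illustration of what "typed memories," structured relationships, and hybrid search can mean in practice, here is a toy sketch. It is not a description of any existing product. The memory types, the relation name, the hand-written two-number embeddings, and the equal keyword/vector weighting are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Sequence, Set, Tuple


class MemoryType(Enum):
    # Hypothetical categories chosen for illustration.
    FACT = "fact"
    PREFERENCE = "preference"
    CORRECTION = "correction"


@dataclass(frozen=True)
class Memory:
    id: str
    type: MemoryType
    text: str
    embedding: Tuple[float, ...]  # toy vector; a real system would use a model


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query words that also appear in the memory text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


@dataclass
class MemoryStore:
    memories: Dict[str, Memory] = field(default_factory=dict)
    # Graph edges stored as (source_id, relation, target_id) triples.
    edges: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add(self, memory: Memory) -> None:
        self.memories[memory.id] = memory

    def relate(self, source: str, relation: str, target: str) -> None:
        self.edges.add((source, relation, target))

    def related(self, memory_id: str) -> List[Tuple[str, str]]:
        return sorted((rel, dst) for src, rel, dst in self.edges
                      if src == memory_id)

    def search(self, query: str, query_embedding: Sequence[float],
               alpha: float = 0.5, k: int = 3) -> List[Memory]:
        """Hybrid search: weighted sum of keyword and vector similarity."""
        def score(m: Memory) -> float:
            return (alpha * keyword_overlap(query, m.text)
                    + (1 - alpha) * cosine(query_embedding, m.embedding))
        return sorted(self.memories.values(), key=score, reverse=True)[:k]


store = MemoryStore()
store.add(Memory("m1", MemoryType.PREFERENCE, "prefers metric units", (1.0, 0.0)))
store.add(Memory("m2", MemoryType.FACT, "project deadline is friday", (0.0, 1.0)))
store.add(Memory("m3", MemoryType.CORRECTION, "deadline moved to monday", (0.1, 0.9)))
store.relate("m3", "supersedes", "m2")

top = store.search("metric units", (1.0, 0.0), k=1)
print(top[0].id, store.related("m3"))
```

Note that the store is a separate object from any model: it can be queried for the handful of memories relevant to the current decision, and the graph edge records that a correction replaced an earlier fact instead of leaving both in one undifferentiated pile.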

The question we'd add, which Hassabis didn't address, is this: should your memory belong to the platform or to you? The pattern from social media is instructive: the platform changes, but the game stays the same.

The data that matters most

Hassabis also discussed virtual cells, AlphaFold, and the bottlenecks in scientific discovery. One of his key observations: data acquisition is more constraining than compute. We have enough processing power. What we lack is the right data at the right resolution.

A 2025 survey on brain emulation by Freeman, Zanichelli, Schons and collaborators reached a related conclusion about computation: the main bottleneck is "memory walls" rather than raw processing speed. The constraint isn't how fast you can think. It's what you have available to think about.

This applies directly to AI assistants. The most valuable data isn't on the internet. It's the corrections you've made. The preferences you've expressed. The reasoning chains you've worked through. The connections between your ideas that only emerge over months of use. That data doesn't exist in any training set. It's generated through use, one conversation at a time, and right now it evaporates when the session ends or the platform changes its terms.

If data acquisition is the bottleneck for scientific discovery, it's also the bottleneck for personal AI. The constraint on how useful your AI can be isn't the model's reasoning ability. It's whether it has access to everything you've already figured out together.

What's actually missing

The most honest part of the conversation was about what current systems still can't do. Hassabis described watching foundation models play chess and seeing them consider a move, recognize it as a blunder, fail to find anything better, and play the blunder anyway. "You just shouldn't be seeing that in a precise reasoning system."

He proposed an "Einstein test": train a system with the knowledge available in 1901 and see if it independently arrives at what Einstein discovered in 1905. Not just pattern matching. Not just extrapolation. Genuine novel insight. He admitted current systems can't do this and that something might still be missing.

He also pointed out that nobody has yet produced a genuinely creative output using AI agents that justifies the hype. No hit game that was vibe-coded. No scientific discovery that AI made on its own. He's not saying it won't happen. He's saying it hasn't happened yet, and that honesty is rare from someone in his position.

The gap between architecture and ownership

Demis Hassabis knows more about the neuroscience of memory than almost anyone running an AI lab. He studied the hippocampus. He built his first breakthrough on hippocampal replay. He can articulate exactly why context windows are insufficient. And unlike most people at his level, he's demonstrated through AlphaFold and his open research commitments that he's not just serving commercial interests. He gave away the solution to a fifty-year grand challenge in biology. That matters.

The gap isn't between what Hassabis knows and what he builds. It's between what any single platform can offer its users and what users actually need.

Even if Google builds the best memory system in the world inside Gemini, that doesn't help you when you're using Claude. Or when the next model comes along that's better than both. Or when any platform changes its terms. The value of your accumulated knowledge shouldn't be locked to whichever model you happened to be talking to when you figured something out.

The man who studied the hippocampus is telling you the hippocampus is missing. He's right. The remaining question is whether memory gets built as another feature inside another platform, or as infrastructure that belongs to the people who generate it.

Hassabis's foundational neuroscience research: "Patients with hippocampal amnesia cannot imagine new experiences" (PNAS, 2007) and "Using Imagination to Understand the Neural Basis of Episodic Memory" (Journal of Neuroscience, 2007).