Crazy experiment by me, author: @hejhdiss.
Note: The codebase in the repository was originally written by Claude Sonnet, but I edited and tested it as needed. This is an experimental project, but it works!
For years, the quest for truly intelligent AI has been hampered by a fundamental bottleneck: memory. While powerful models like Transformers dominate the scene, their reliance on external, "bolted-on" memory systems creates inefficiencies, limits long-term understanding, and demands vast computational resources.
Today, we introduce a paradigm shift: the MEMORY-NATIVE-NEURAL-NETWORK (MNNN) family. This isn't just another incremental improvement; it's a foundational reimagining of how neural networks handle information, moving from passive data storage to active, intrinsic memory that evolves with every interaction.
The Core Philosophy: Memory Baked In, Not Bolted On
The guiding principle of the MNNN family is simple yet profound: memory should be an inherent property of the neural network's architecture, not an afterthought. Imagine a human brain that requires an external hard drive to remember a conversation; it's inefficient and unnatural. MNNNs aim to mimic the biological brain's ability to integrate past experiences directly into its dynamic state, fostering genuine long-term understanding and adaptable intelligence.
This stands in stark contrast to prevailing models:
- Transformers manage context through attention mechanisms and Key-Value caches, which are essentially external look-up tables that grow linearly with sequence length; once the context window is exhausted, older information is simply dropped.
- Recurrent Neural Networks (RNNs) and LSTMs carry internal states, but often struggle with vanishing gradients, leading to a rapid decay of long-term memory.
MNNNs overcome these limitations by embedding memory directly into the differential equations and neuronal dynamics, ensuring that past information isn't merely referenced but becomes a part of the network's very identity.
The Inaugural Members of the MNNN Family
The initial MNNN family comprises three pioneering architectures, each building on the "memory-native" principle to offer unique capabilities:
1. AMRC: The Foundational Cell of Persistent Understanding
The Adaptive Memory Recurrent Cell (AMRC) serves as the cornerstone of the MNNN family. It introduces the fundamental concept of memory preservation at the individual neuron level. Unlike traditional neurons that merely activate and pass on a signal, AMRC neurons possess an inherent ability to retain aspects of their past activations.
This is achieved through:
- Memory-Preserving Activation ($\beta$): A mechanism that allows the neuron to selectively preserve elements of its previous output.
- Stateful Neurons ($\alpha$): Internal parameters that dynamically evolve, ensuring that the neuron's "identity" is a continuous function of its history.
The AMRC demonstrates that a simple yet robust recurrent cell can maintain significant context without external memory structures, offering exceptional efficiency for scenarios that demand compact yet intelligent processing.
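To make the idea concrete, here is a minimal sketch of an AMRC-style update in plain NumPy. This is not the repository's implementation: the exact update equation, the sigmoid squashing applied to $\alpha$ and $\beta$, and all shapes and initializations are assumptions made purely for illustration. The only point being shown is that the memory lives inside the neuron's own state update rather than in an external structure.

```python
import numpy as np

class AMRCSketch:
    """Illustrative AMRC-style cell (NOT the repository's code).

    Assumed update: h_t = sigmoid(alpha) * h_{t-1} + sigmoid(beta) * tanh(W x_t + U h_{t-1}),
    where alpha and beta are per-neuron parameters that blend the previous
    state with the new activation, keeping memory inside the cell itself.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (hidden_size, input_size))
        self.U = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.alpha = np.zeros(hidden_size)   # stateful retention parameter (assumed form)
        self.beta = np.zeros(hidden_size)    # memory-preserving write strength (assumed form)
        self.h = np.zeros(hidden_size)       # persistent internal state

    def step(self, x):
        keep = 1.0 / (1.0 + np.exp(-self.alpha))    # how much of the past to retain
        write = 1.0 / (1.0 + np.exp(-self.beta))    # how strongly the new activation is written
        candidate = np.tanh(self.W @ x + self.U @ self.h)
        self.h = keep * self.h + write * candidate  # memory lives in the neuron state itself
        return self.h


cell = AMRCSketch(input_size=4, hidden_size=8)
for t in range(5):
    out = cell.step(np.ones(4) * t)
print(out.shape)  # (8,)
```

Here $\alpha$ plays the role of the stateful retention parameter and $\beta$ the memory-preserving write strength described above; both are part of the cell's arithmetic, not a separate cache.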
2. PMRC: The Intelligent Curator of Experience
Building on the AMRC's foundation, the Persistent Memory Recurrent Cell (PMRC) introduces a critical layer of intelligence: learnable memory gates. Where AMRC neurons simply preserve, PMRC neurons decide what to preserve and how strongly.
This mirrors how biological memory selectively filters and prioritizes information. The PMRC can dynamically adapt its memory retention strategies based on the incoming data, allowing it to:
- Focus on critical facts in a conversation.
- Discard irrelevant "filler" words.
- Develop a personalized understanding of a user's communication style over time.
PMRC is ideal for applications requiring adaptive personalization and lifelong learning, where the network needs to intelligently curate its internal memory based on ongoing interactions.
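The sketch below adds the one ingredient this description attributes to PMRC on top of the AMRC sketch: a learnable, data-dependent memory gate. Again, the gate equation, parameter names, and shapes are my own illustrative assumptions, not the repository's code.

```python
import numpy as np

class PMRCSketch:
    """Illustrative PMRC-style cell with a learnable memory gate (NOT the repository's code).

    Assumption: the gate is computed from the current input and the previous state,
    so the cell decides at every step how much of its memory to keep versus rewrite.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (hidden_size, input_size))
        self.U = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.Wg = rng.normal(0, 0.1, (hidden_size, input_size))   # gate weights (learnable)
        self.Ug = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.bg = np.zeros(hidden_size)
        self.h = np.zeros(hidden_size)                            # persistent internal state

    def step(self, x):
        # Data-dependent gate: values near 1 preserve memory, values near 0 overwrite it.
        g = 1.0 / (1.0 + np.exp(-(self.Wg @ x + self.Ug @ self.h + self.bg)))
        candidate = np.tanh(self.W @ x + self.U @ self.h)
        self.h = g * self.h + (1.0 - g) * candidate
        return self.h


cell = PMRCSketch(input_size=4, hidden_size=8)
print(cell.step(np.ones(4)).shape)  # (8,)
```

The design difference from the AMRC sketch is that retention is no longer a fixed per-neuron trait: it is recomputed from the data at every step, which is what lets a PMRC-style cell keep facts and drop filler.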
3. AMN: The Architect of Deep, Human-like Cognition
The Adaptive Memory Network (AMN) is the flagship model of the MNNN family, representing its most advanced and biologically inspired member. It integrates three cutting-edge mechanisms to achieve a profound level of memory and contextual understanding:
- Liquid Constant (LC) Neurons: Borrowing inspiration from biological neurons, these neurons possess adaptive time constants. This means the "speed of thought" within the network isn't fixed; it can dynamically adjust how long it attends to or remembers a piece of information based on its perceived importance or urgency.
- Linear Recurrent Units (LRU): Providing efficient and stable processing of sequential data, LRUs ensure that information flows through the network without the vanishing or exploding gradient problems often plaguing traditional RNNs.
- Associative Memory Manifold (AMM): This is the game-changer. The AMM acts as a high-dimensional, global "whiteboard" for the entire network. Instead of merely storing sequential data, it maps and organizes relationships between concepts and experiences into a persistent, evolving mental landscape. This allows the AMN to understand and recall the "gist" or "vibe" of long conversations, rather than just raw snippets.
The AMN is designed for the most demanding cognitive tasks, from complex, long-form dialogue to sophisticated pattern recognition where understanding the overarching context is paramount.
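A toy step function can show how two of these pieces might fit together: an input-dependent ("liquid") time constant and a Hebbian, outer-product associative store standing in for the AMM. The LRU component is omitted here, and every equation, constant, and name below is an assumption made for illustration rather than the project's actual implementation.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

class AMNSketch:
    """Toy AMN-style step (NOT the repository's code): a liquid-time-constant
    state update plus a Hebbian associative memory acting as a global store.
    All shapes, constants, and equations here are illustrative assumptions.
    """

    def __init__(self, input_size, hidden_size, dt=0.1, decay=0.99, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (hidden_size, input_size))
        self.U = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.Wtau = rng.normal(0, 0.1, (hidden_size, input_size))
        self.h = np.zeros(hidden_size)
        self.M = np.zeros((hidden_size, hidden_size))  # stand-in for the associative memory manifold
        self.dt = dt
        self.decay = decay

    def step(self, x):
        # Liquid constant: the effective time constant adapts to the current input.
        tau = 0.5 + softplus(self.Wtau @ x)
        drive = np.tanh(self.W @ x + self.U @ self.h + self.M @ self.h)  # read from the memory store
        self.h = self.h + (self.dt / tau) * (-self.h + drive)            # leaky, time-constant-scaled update
        # Hebbian write: slowly imprint the current state onto the store.
        self.M = self.decay * self.M + 0.01 * np.outer(self.h, self.h)
        return self.h


net = AMNSketch(input_size=4, hidden_size=8)
for t in range(3):
    state = net.step(np.ones(4))
print(state.shape)  # (8,)
```

The intent of the sketch is only to show the shape of the idea: the time constant is computed from the input (so "speed of thought" varies), and the memory matrix is both read from and written to inside the same step, making it part of the network's dynamics rather than an external database.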
Becoming a Member of the MNNN Family: The Memory-Native Condition
While anyone can build upon these ideas, any new architecture claiming membership in the MEMORY-NATIVE-NEURAL-NETWORK (MNNN) family must adhere to a crucial condition: its memory must be fundamentally baked into its computational core, not added as an external module.
This means:
- Intrinsic State: The model must maintain a persistent, evolving internal state that directly influences its future computations.
- Architectural Integration: Memory mechanisms (like $\alpha$, $\beta$, learnable gates, or manifold projections) must be inseparable from the neuron's mathematical definition, not a separate data structure accessed by the network.
- Adaptive Dynamics: Ideally, the memory behavior should be dynamic, allowing the network to adapt its retention and recall based on the input stream itself.
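One way to read these conditions as a code-level contract is sketched below. This is my own interpretation, not an official specification from the repository; the earlier AMRC, PMRC, and AMN sketches satisfy it because their step() methods mutate the cell's own state rather than consulting an external memory object.

```python
from typing import Protocol
import numpy as np

class MemoryNativeCell(Protocol):
    """A possible reading of the MNNN membership condition as a structural contract.

    The cell must carry its own persistent state and update it inside step();
    it must not depend on an externally managed cache or memory module.
    """

    h: np.ndarray  # intrinsic, persistent state that influences future computation

    def step(self, x: np.ndarray) -> np.ndarray:
        """Advance one timestep; the state update is part of the cell's own math."""
        ...
```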
Explore the Code and Join the Movement
The foundational code for the MNNN family, including the implementations of AMRC, PMRC, and AMN, is openly available for exploration and collaboration. The repository provides clear examples and a powerful C backend for efficient execution.
Repository Link: https://github.com/hejhdiss/MEMORY-NATIVE-NEURAL_NETWORK
Within the repository, you'll find:
- api.py: Demonstrates how to interact with these models, showcasing their core functionalities.
- test_*.py: Comprehensive test suites validating their unique memory properties.
- LICENSE: The code is open-source under the GPL v3, encouraging broad adoption and innovation.
Remember: This is an experimental project by me (@hejhdiss), but it works! The original code was written by Claude Sonnet, and I adapted it as needed for testing and improvements.