The Problem with Today’s AI Agents
Large Language Models (LLMs) are powerful, but they don’t truly learn from experience.
Each interaction is isolated — no memory of past attempts, no cumulative knowledge.
Fine-tuning can help, but it’s:
- Expensive
- Rigid
- Slow to iterate
If we want truly adaptive agents, we need a better paradigm.
The Idea: Memory-Augmented Reinforcement Learning
Instead of retraining the model itself, Memora introduces memory into the loop through three mechanisms:
- Episodic Memory: stores past experiences, both successes and failures.
- Case Retrieval: surfaces the most relevant past examples for a new task.
- Memory Rewriting: updates stored knowledge dynamically as feedback arrives.
This shifts the agent's learning from parameter updates to retrieval and reasoning.
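To make that concrete, here is a minimal Python sketch of the three mechanisms. Everything in it is illustrative: the `EpisodicMemory` class and its `write`/`retrieve`/`rewrite` methods are my own stand-ins, not Memora's actual API, and the keyword-overlap scoring is a toy substitute for whatever similarity search the real system uses.

```python
from dataclasses import dataclass

@dataclass
class Case:
    task: str      # what the agent was asked to do
    outcome: str   # what happened
    success: bool  # failures are stored too; they are useful precedents

class EpisodicMemory:
    """Illustrative store; not Memora's actual API."""

    def __init__(self) -> None:
        self.cases: list[Case] = []

    def write(self, case: Case) -> None:
        # Episodic Memory: append every experience, success or failure.
        self.cases.append(case)

    def retrieve(self, task: str, k: int = 3) -> list[Case]:
        # Case Retrieval: rank stored cases by crude keyword overlap
        # with the new task (a stand-in for real similarity search).
        words = set(task.lower().split())
        ranked = sorted(
            self.cases,
            key=lambda c: len(words & set(c.task.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def rewrite(self, task: str, outcome: str, success: bool) -> None:
        # Memory Rewriting: overwrite a case once feedback contradicts it,
        # or store it fresh if the task was never seen before.
        for c in self.cases:
            if c.task == task:
                c.outcome, c.success = outcome, success
                return
        self.write(Case(task, outcome, success))
```

The key design point survives even in this toy version: learning happens in the data structure, not in the model weights.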
How Memora Works
The architecture follows a Planner–Executor cycle:
- Meta-Planner (System 2):
  - Strategically breaks down complex problems.
  - Leverages memory for analogical reasoning.
- Executor (System 1):
  - Executes steps sequentially.
  - Writes results back to memory.
This means the agent improves with experience without touching the base model weights.
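Continuing the sketch above, the cycle itself might look like this. `plan_steps` and `execute_step` are hypothetical stand-ins for the Meta-Planner's and Executor's LLM calls; what matters is the shape of the loop: retrieve before planning, write back after executing.

```python
def plan_steps(task: str, precedents: list[Case]) -> list[str]:
    # Hypothetical stand-in for the Meta-Planner's LLM call: decompose the
    # task, conditioning on retrieved precedents for analogical reasoning.
    context = "; ".join(c.outcome for c in precedents if c.success) or "none"
    return [f"research {task} [precedents: {context}]", f"answer {task}"]

def execute_step(step: str) -> str:
    # Hypothetical stand-in for the Executor's tool/LLM call.
    return f"completed: {step}"

def solve(task: str, memory: EpisodicMemory) -> list[str]:
    precedents = memory.retrieve(task)         # System 2 consults memory first
    results = []
    for step in plan_steps(task, precedents):  # strategic decomposition
        result = execute_step(step)            # System 1 does the work
        memory.write(Case(step, result, True)) # experience flows back to memory
        results.append(result)                 # (success hard-coded for brevity)
    return results

mem = EpisodicMemory()
solve("summarize paper A", mem)
solve("summarize paper B", mem)  # second run retrieves the first as a precedent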
Key Results
- GAIA benchmark: 87.88% on the validation set, outperforming the GPT-4 baseline.
- DeepResearcher benchmark: +4.7–9.6% gains on out-of-domain tasks.
- Local LLMs (Qwen2.5-14B, LLaMA): near-GPT-4 performance on a consumer MacBook (M4).
Why This Matters
- Continual learning without retraining.
- Cost efficiency: runs on everyday hardware.
- Interpretability: every decision can be traced back to a memory entry.
- Scalability: agents adapt in real time.
Try It Yourself
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a local model
ollama pull qwen2.5:14b

# Clone the repo and install dependencies
git clone https://github.com/Agent-on-the-Fly/Memora
cd Memora && pip install -r requirements.txt

# Run the agent
python client/agent.py
```
Final Thoughts
The future of AI isn’t just about building bigger models — it’s about building smarter agents with memory.
Memora shows that experience > parameters.
And this shift may redefine how we build and deploy intelligent systems.
Cross-posted from my Hashnode blog