Last month my OpenClaw agent was forgetting things between sessions. Little things — my preference for concise replies, the projects I was actively working on, which cron jobs had failed. It was sharp in the moment but shallow over time.
Then I realized: I was treating the agent like a stateless API. Every session was a fresh start. The memory files existed, but nobody was reading them proactively.
So I built a nightly self-improvement loop. It runs at 2 AM, reviews what happened, updates memory, and tunes a few settings before I wake up. Here's exactly what I set up and what it actually changed.
The Core Problem: Agents Forget Between Sessions
OpenClaw has a solid memory system — daily logs, long-term MEMORY.md, domain-specific files. The problem wasn't the system. The problem was nobody was running it between sessions.
The agent would wake up, read its context files, start working. But the day's failures, decisions, and corrections weren't being synthesized into persistent behavior changes. I was manually doing that work every morning.
The fix was a nightly isolated cron job that acts as the agent's own coach — running a structured review and writing updates back into the memory system.
The Nightly Cron Setup
Here's the schedule in openclaw.json:
{
"name": "Nightly Self-Improvement Loop",
"schedule": {
"kind": "cron",
"expr": "0 2 * * 1-5",
"tz": "America/New_York"
},
"payload": {
"kind": "agentTurn",
"message": "Run tonight's self-improvement review. Steps:\n1. Read memory/$(date -v-1d +%Y-%m-%d).md for yesterday's events\n2. Read ~/self-improving/memory.md for current learned rules\n3. Check ~/.openclaw/logs/ for any errors or warnings in the past 24h\n4. Write a 3-5 bullet summary of what to improve to ~/self-improving/nightly-reviews/$(date +%Y-%m-%d).md\n5. Update memory/$(date +%Y-%m-%d).md with tonight's review status\n6. If any new patterns found, append them to ~/self-improving/memory.md",
"timeoutSeconds": 300
},
"sessionTarget": "isolated",
"delivery": {
"mode": "announce",
"channel": "telegram",
"bestEffort": true
}
}
The bestEffort: true is important — if it fails at 2 AM, I don't want to be woken up. It will try again the next night.
What Actually Runs Each Night
The prompt above is intentionally structured. Let me break down what each step does:
Step 1–2: Gather context. Reading yesterday's daily log gives the agent a concrete picture of what happened. Reading the self-improving memory gives it the learned rules — the corrections I've made, the preferences I've established. Together, these are the raw material for improvement.
Step 3: Log analysis. This is the part I didn't expect to be valuable. The agent checks its own error logs and categorizes failures. Is it a tool timeout? A model API issue? A context overflow? Knowing the pattern of failure is more useful than knowing any single failure.
Step 4: Writing the review. The nightly review file goes into ~/self-improving/nightly-reviews/. These accumulate into a pattern journal. After a few weeks, I can look back and see: "Oh, every time I run a large batch job, the agent times out at 280 seconds — I should increase that limit."
Step 5: Daily memory update. The agent writes a short status to today's memory file. This is meta — it's the agent documenting that the review ran. If something goes wrong, there's a record.
Step 6: Updating learned rules. This is the highest-leverage step. If the agent found a new pattern — a correction I made, a workflow that works better a certain way — it writes it to the persistent self-improving memory. This changes behavior going forward without me doing anything.
The Self-Improving Memory Structure
The memory system I use has three layers:
~/self-improving/
memory.md # Global rules and preferences (most important)
corrections.md # Specific corrections made in sessions
nightly-reviews/ # Date-stamped improvement notes (accumulating)
domains/ # Domain-specific learned rules
projects/ # Project-specific overrides
The memory.md file is what the agent reads at the start of every session. It's the curated, high-signal layer. The nightly review feeds into it.
The key rule I've learned: the agent should write its own corrections. I used to manually update memory.md when I corrected the agent. Now the nightly loop does it. I'm now in the habit of saying "remember this" and trusting the system will process it.
What Changed After 30 Days
After a month of running this, here's what I noticed:
Startup time dropped. The agent no longer asks me to re-explain my preferences at the start of every session. It reads memory.md and knows I'm technical, prefer concise replies, and don't want filler words.
Error recovery improved. The agent now handles the "that's beyond my capabilities" case better — it knows to say so instead of attempting a half-baked version, because that's a rule I corrected it on three times and the nightly loop captured that pattern.
The memory system actually grows. Before, I was the only one writing to memory.md. Now the agent contributes. It's slower than me (it has to reflect on sessions to extract rules), but it's more thorough. It catches patterns I miss because I'm in the weeds.
One unexpected benefit: I read the nightly review files sometimes. It's oddly useful to see what the agent thought was worth noting. Some of its observations about my own behavior were things I hadn't consciously registered.
The One Setting That Made the Biggest Difference
Everything above is the process. But if I had to point to the single most impactful change, it's adding this to the agent prompt bootstrap:
Before doing non-trivial work, read ~/self-improving/memory.md
That's it. One line. The agent now loads its own learned rules before starting any task. The nightly loop makes those rules better. The feedback loop closes.
Without that line, the self-improving memory is a graveyard. With it, the agent is actually learning.
What I Learned
Building a self-improvement loop for an AI agent sounds like a meta-level project. But the implementation is deeply practical: cron jobs, memory files, and a structured nightly review prompt.
The hardest part wasn't technical. It was trusting the system to handle its own improvements. I had to stop manually curating every rule and let the agent write its own corrections. Some of them were wrong. Most of them were right. The signal-to-noise ratio improved over time.
If you're running OpenClaw and your agent feels shallow between sessions, the answer isn't a better model. It's better continuity. Set up the memory system. Add a nightly review loop. Let the agent coach itself.
Your 2 AM self will thank you.
Top comments (0)