Originally posted on AWS Builder.
You spend an hour teaching your coding agent your project structure, your coding preferences, the weird Bedrock timeout issue you debugged last Tuesday. Next session? Gone. You're back to explaining that you prefer single quotes and that the CI pipeline needs --run to avoid watch mode.
Some frameworks have memory plugins. They work — sort of. But they're coupled to one framework, they accumulate junk over time, and nobody's cleaning up the contradictions from three weeks ago when you changed your mind about the database.
So I built agent-memory-daemon.
What it does
It's a background daemon that runs alongside your agent — any agent. It watches a directory of session files and does two things:
Extraction — scans new session transcripts and pulls out facts, decisions, preferences, and error corrections. Writes each one as a structured markdown file with YAML frontmatter.
Consolidation — periodically reviews the entire memory directory. Merges duplicates, converts relative dates to absolute, removes contradicted facts, prunes stale content, and keeps a concise MEMORY.md index under a size budget.
The filesystem is the interface. Your agent writes markdown files to a directory. The daemon reads them, thinks about them, and writes organized memories back. No SDK, no API, no MCP server. If your agent can write a file, it works.
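The whole contract can be sketched in a few lines. The paths and filenames here are illustrative, not the daemon's actual layout:

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical integration helper: the agent's only job is to append its
// transcript to a file in the watched sessions directory.
function writeSessionTranscript(
  sessionsDir: string,
  sessionId: string,
  transcript: string
): string {
  fs.mkdirSync(sessionsDir, { recursive: true });
  const file = path.join(sessionsDir, `${sessionId}.md`);
  fs.appendFileSync(file, transcript + "\n");
  return file;
}
```

That's the entire "SDK": one file write per session, and the daemon takes it from there.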
The "aha" moment
I was running an agent that had accumulated 40+ memory files over a few weeks. Half of them were duplicates with slightly different wording. Three of them contradicted each other about which AWS region we were using. The MEMORY.md index was 800 lines long and the agent was spending half its context window just reading its own memories.
That's when I realized: agents need a janitor. Not just a place to store memories, but something that actively curates them.
How it works
Extraction (discovering new memories)
Session file modified
↓
Cursor check: is this new content?
↓
Build prompt: memory manifest + session content
↓
LLM identifies facts, decisions, preferences
↓
Write structured memory files
↓
Advance cursor
The daemon tracks a .extraction-cursor file — a per-session offset map so it only processes genuinely new content. If a session file gets appended to, it picks up where it left off instead of reprocessing the whole thing.
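A minimal sketch of that cursor, assuming a JSON offset map keyed by session file (the real .extraction-cursor format may differ):

```typescript
import * as fs from "fs";

// Maps session file path -> character offset already processed.
type CursorMap = Record<string, number>;

function loadCursor(cursorPath: string): CursorMap {
  return fs.existsSync(cursorPath)
    ? JSON.parse(fs.readFileSync(cursorPath, "utf8"))
    : {};
}

// Return only content appended since the last pass, then advance the cursor.
function readNewContent(cursorPath: string, sessionFile: string): string {
  const cursor = loadCursor(cursorPath);
  const offset = cursor[sessionFile] ?? 0;
  const full = fs.readFileSync(sessionFile, "utf8");
  if (full.length <= offset) return ""; // nothing new (or file was truncated)
  cursor[sessionFile] = full.length;
  fs.writeFileSync(cursorPath, JSON.stringify(cursor));
  return full.slice(offset);
}
```

Appends are cheap this way: a 10KB addition to a 1MB session only sends 10KB to the LLM.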
Consolidation (organizing existing memories)
Three-gate trigger: time elapsed + session count + lock
↓
Four-phase pass: orient → gather → consolidate → prune
↓
Merge duplicates, resolve contradictions
↓
Update MEMORY.md index (200 lines / 25KB budget)
↓
Release lock
Both modes share a PID-based lock and never run concurrently. Consolidation takes priority — if both triggers fire on the same tick, consolidation runs first.
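A sketch of how such a PID-based lock can work; this is an assumption about the mechanism, not the daemon's exact code:

```typescript
import * as fs from "fs";

function tryAcquireLock(lockPath: string): boolean {
  try {
    // Flag "wx" fails if the file already exists, so create-and-claim is atomic.
    fs.writeFileSync(lockPath, String(process.pid), { flag: "wx" });
    return true;
  } catch {
    const ownerPid = Number(fs.readFileSync(lockPath, "utf8"));
    try {
      process.kill(ownerPid, 0); // signal 0 = liveness check, sends nothing
      return false;              // owner is still alive, back off
    } catch {
      // Owner died without cleaning up: reclaim the stale lock.
      fs.writeFileSync(lockPath, String(process.pid));
      return true;
    }
  }
}

function releaseLock(lockPath: string): void {
  fs.rmSync(lockPath, { force: true });
}
```

The stale-lock check matters for a daemon: a crash mid-consolidation must not wedge the system forever.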
Quick start
npx agent-memory-daemon init # generates memconsolidate.toml
npx agent-memory-daemon start # starts the daemon
The config is straightforward:
memory_directory = "./memory"
session_directory = "./sessions"
extraction_enabled = true
extraction_interval_ms = 60000
[llm_backend]
name = "bedrock"
region = "us-east-1"
model = "us.anthropic.claude-sonnet-4-20250514-v1:0"
Or use OpenAI:
[llm_backend]
name = "openai"
api_key = "${OPENAI_API_KEY}"
model = "gpt-4o"
What a memory file looks like
---
name: "Bedrock timeout configuration"
description: "Default SDK timeout is too short for large prompts"
type: reference
---
The AWS SDK's default request timeout causes ECONNABORTED errors
on prompts over 30K characters. Set requestTimeout to 300000 (5 min)
via NodeHttpHandler when using BedrockRuntimeClient.
Each file has a type: user (preferences), feedback (lessons learned), project (architecture decisions), or reference (technical facts). The daemon classifies them automatically during extraction.
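For illustration, here's a minimal reader for that frontmatter format. To stay dependency-free it only handles flat `key: value` pairs, not full YAML, and `parseMemoryFile` is a hypothetical helper, not part of the daemon:

```typescript
// Split a memory file into frontmatter metadata and markdown body.
function parseMemoryFile(raw: string): { meta: Record<string, string>; body: string } {
  const m = raw.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!m) return { meta: {}, body: raw }; // no frontmatter block

  const meta: Record<string, string> = {};
  for (const line of m[1].split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) {
      // Trim whitespace and strip surrounding quotes from the value.
      meta[line.slice(0, i).trim()] = line.slice(i + 1).trim().replace(/^"|"$/g, "");
    }
  }
  return { meta, body: m[2].trim() };
}
```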
Framework-agnostic by design
The integration pattern is the same regardless of what you're building with:
- Strands / LangChain: after each agent run, dump a session summary to the sessions directory. At startup, read MEMORY.md into the system prompt.
- OpenClaw: point session_directory at your workspace's transcript directory.
- Custom agents: same pattern — write files, read the index.
No plugin system, no adapter layer. The filesystem is the API.
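The startup half of that pattern can be sketched like this; the `buildSystemPrompt` helper and the heading it injects are assumptions, not something the daemon provides:

```typescript
import * as fs from "fs";

// Prepend the consolidated memory index to whatever system prompt the
// agent framework already uses.
function buildSystemPrompt(basePrompt: string, memoryIndexPath: string): string {
  if (!fs.existsSync(memoryIndexPath)) return basePrompt; // no memories yet
  const index = fs.readFileSync(memoryIndexPath, "utf8");
  return `${basePrompt}\n\n## Long-term memory\n${index}`;
}
```

Because the index is kept under a size budget by consolidation, this stays a small, bounded addition to the context window.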
Guardrails
One thing I learned the hard way: without limits, the extraction mode creates files exponentially. Each pass sees the new files from the last pass, prompts the LLM with a bigger manifest, and the LLM creates even more files.
So there are guardrails:
- max_memory_files — hard cap on total files in the directory (default: 50)
- max_files_per_batch — cap on creates per extraction pass (default: 10)
- max_prompt_chars — budget enforcement with progressive truncation
- Per-session cursor — prevents reprocessing already-extracted content
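The progressive-truncation idea behind max_prompt_chars can be sketched like this, assuming prompt sections are ordered most-important-first (the daemon's actual strategy may differ):

```typescript
// Drop whole low-priority sections from the end until the prompt fits,
// then hard-truncate as a last resort.
function fitToBudget(sections: string[], maxPromptChars: number): string {
  const kept = [...sections];
  while (kept.length > 1 && kept.join("\n\n").length > maxPromptChars) {
    kept.pop(); // assumed ordering: last section is least important
  }
  let prompt = kept.join("\n\n");
  if (prompt.length > maxPromptChars) prompt = prompt.slice(0, maxPromptChars);
  return prompt;
}
```

Dropping whole sections first keeps the remaining prompt coherent; blind character truncation is only the fallback.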
Observability
Every operation emits structured JSON logs:
{"timestamp":"2026-04-07T14:23:01.234Z","level":"info","event":"extraction:complete","data":{"created":3,"updated":1,"durationMs":4521,"promptLength":39102,"operationsRequested":5,"operationsApplied":4,"operationsSkipped":1}}
You get duration, prompt size, operation counts, and skip reasons. Pipe it to CloudWatch, Datadog, or just jq.
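Because every log line is a single JSON object, ad-hoc analysis needs no tooling at all. A tiny sketch (hypothetical helper, not shipped with the daemon) that tallies events from captured output:

```typescript
// Count occurrences of each "event" field across structured log lines.
function summarizeEvents(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of lines) {
    try {
      const ev = JSON.parse(line).event as string;
      counts[ev] = (counts[ev] ?? 0) + 1;
    } catch {
      // Skip anything that isn't a JSON log line.
    }
  }
  return counts;
}
```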
What's next
- Vector similarity search for memory recall (right now it's manifest-based)
- Multi-agent support (shared memory directories with conflict resolution)
- A web UI for browsing and editing memories
The project is MIT-licensed and on GitHub. Issues, PRs, and feedback are welcome.
npm install agent-memory-daemon
If your agent keeps forgetting things, give it a daemon with a good memory.