Originally posted on AWS Builder.
I work with an AI every day. It's smart. It writes decent code. And every single morning, it forgets who I am.
I open kiro-cli chat, and the first 10 minutes are the same tax I paid yesterday:
Yes, we use pnpm. No, not npm. Yes, Vitest. No, not Jest. The main entry is src/cli.ts. We already decided to use Result<T, E> at the CLI boundary. You told me that last week. I told you that last week. We had this exact conversation.
My teammate calls it the project re-discovery tax. Every session, you pay it. Every. Session.
I got tired of paying it.
Why the obvious fixes didn't work
I tried the obvious things first.
"Just use steering files." Steering files are great for what is this project. They're static markdown you maintain by hand. They don't capture what the AI figured out during a session. The whole point of working with an AI is that it learns things with you. Steering files can't capture that.
"Tell the agent to call a remember() tool when it learns something." I tried this. Claude is inconsistent about when to call it. GPT is inconsistent. Kiro is inconsistent. Every model is inconsistent, because memory management is a side-quest to whatever task you're actually doing. The agent forgets to remember. Turtles all the way down.
"Use a SQLite knowledge graph MCP server." Same problem. Fancier storage, same failure mode. The agent still has to decide when to store.
"Wait for Kiro to ship it." There's a proposal floating around for .kiro/tasks/*.md with auto-read/auto-write. No ETA. I had work to do this week.
The insight that actually fixed it
Here's what clicked for me, and I'll give credit where it's due — it came from a design doc by a coworker; I just productized it:
The agent should be a reader of memory, not a writer.
Writing memory is a different job from using memory. They should not share a context window. The writer can be slow, deliberate, even expensive. The reader needs to be fast, cheap, and running on every session start.
So I split them:
┌──────────────┐    MCP/stdio    ┌──────────────────┐     filesystem      ┌─────────────────────┐
│   Kiro CLI   │ ◄─────────────► │ mcp-agent-memory │ ◄─────────────────► │ agent-memory-daemon │
│ (the reader) │                 │   (MCP server)   │  ~/.agent-memory/   │    (the writer)     │
└──────────────┘                 └──────────────────┘                     └─────────────────────┘
Kiro reads. The MCP server gives it three tools: memory_read, memory_append_session, memory_search. That's it. Nothing fancy.
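To make that concrete, here's a minimal sketch of what a server like this looks like with the MCP TypeScript SDK. The three tool names match the real ones; the file layout and return shapes are my assumptions for illustration, not the package's actual source:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { readdir, readFile, writeFile } from "node:fs/promises";
import path from "node:path";
import os from "node:os";

const MEMORY = path.join(os.homedir(), ".agent-memory", "memory");
const SESSIONS = path.join(os.homedir(), ".agent-memory", "sessions");

const server = new McpServer({ name: "agent-memory", version: "0.1.0" });

// memory_read: hand the agent its memory index at session start.
server.tool("memory_read", "Load the memory index", {}, async () => ({
  content: [{ type: "text", text: await readFile(path.join(MEMORY, "MEMORY.md"), "utf8") }],
}));

// memory_append_session: drop a summary where the daemon will find it.
server.tool(
  "memory_append_session",
  "Record a session summary for later consolidation",
  { summary: z.string() },
  async ({ summary }) => {
    await writeFile(path.join(SESSIONS, `${Date.now()}.md`), summary);
    return { content: [{ type: "text", text: "Session summary saved." }] };
  },
);

// memory_search: plain substring scan across the markdown files. No embeddings.
server.tool(
  "memory_search",
  "Search memory files for a string",
  { query: z.string() },
  async ({ query }) => {
    const hits: string[] = [];
    for (const file of await readdir(MEMORY)) {
      const text = await readFile(path.join(MEMORY, file), "utf8");
      for (const line of text.split("\n")) {
        if (line.toLowerCase().includes(query.toLowerCase())) hits.push(`${file}: ${line}`);
      }
    }
    return { content: [{ type: "text", text: hits.join("\n") || "No matches." }] };
  },
);

server.connect(new StdioServerTransport()).catch(console.error);
```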
A background daemon writes. It watches the sessions directory, reads session summaries on a cadence, runs them through an LLM to extract durable facts, and updates markdown files in ~/.agent-memory/memory/.
They never talk directly. The filesystem is the contract. ~/.agent-memory/ is all they share.
Kiro burns zero tokens on memory management. The heavy lifting happens async, outside the chat.
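And a sketch of the writer's core loop, with the same caveat: names and file layout are illustrative, and extractFacts stands in for whichever LLM backend you've configured.

```typescript
import { readdir, readFile, writeFile, rename, mkdir } from "node:fs/promises";
import path from "node:path";
import os from "node:os";

const ROOT = path.join(os.homedir(), ".agent-memory");

// Stand-in for the configured backend (Bedrock, OpenAI, or the Kiro shell-out below).
declare function extractFacts(prompt: string): Promise<string>;

// One consolidation pass: fold any new session summaries into the memory files.
async function consolidate() {
  const sessionsDir = path.join(ROOT, "sessions");
  const memoryFile = path.join(ROOT, "memory", "project-preferences.md");
  const processed = path.join(ROOT, "processed");
  await mkdir(processed, { recursive: true });

  for (const name of await readdir(sessionsDir)) {
    const sessionPath = path.join(sessionsDir, name);
    const session = await readFile(sessionPath, "utf8");
    const existing = await readFile(memoryFile, "utf8").catch(() => "");

    // The LLM does the judgment work: which facts are durable, which are duplicates.
    const updated = await extractFacts(
      `Existing memory:\n${existing}\n\nNew session notes:\n${session}\n\n` +
        "Return the updated memory file: keep durable facts, merge duplicates, drop noise.",
    );
    await writeFile(memoryFile, updated);

    // Move the session aside so the next pass doesn't re-process it.
    await rename(sessionPath, path.join(processed, name));
  }
}

// Run on a cadence, not per-write, so a burst of sessions consolidates in one pass.
setInterval(() => consolidate().catch(console.error), 15 * 60 * 1000);
```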
What it looks like now
Monday:
$ kiro-cli chat
> We use pnpm. Never suggest npm. Vitest not Jest. Main entry is src/cli.ts.
I prefer explicit return types.
[Kiro does work for 20 minutes]
> Great, call memory_append_session with a summary of what we agreed on.
Terminal closes. Life moves on.
Tuesday:
$ kiro-cli chat
[Kiro automatically calls memory_read per my steering rule]
> I see we use pnpm, Vitest (not Jest), src/cli.ts as the main entry, and
you prefer explicit return types. What are we working on today?
No re-explaining. No pasted summary. The AI just remembers.
Between sessions, the daemon woke up, read Monday's session file, extracted the durable facts, deduplicated them against what it already knew, and updated ~/.agent-memory/memory/project-preferences.md. I didn't lift a finger.
The part I'm most proud of: it costs almost nothing
The daemon runs an LLM to do the extraction. LLMs cost money. I didn't want this tool to quietly drain my Bedrock bill.
So I added a Kiro backend. Instead of calling Bedrock or OpenAI, the daemon shells out to kiro-cli itself, running on your existing Kiro credits. Paired with a lean consolidation agent config (ships with the package), each extraction pass costs about 0.01 Kiro credits; the default agent would have been ~0.07. That 7× savings is the difference between "nice-to-have" and "forgot it was running."
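The backend itself is little more than a child-process call. A sketch, with a loud caveat: the kiro-cli flags and the agent name below are assumptions for illustration, not the package's actual invocation; check your installed version.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Kiro backend: delegate fact extraction to kiro-cli instead of a paid API.
// Flag and agent names here are illustrative, not the package's real invocation.
async function kiroExtract(prompt: string): Promise<string> {
  const { stdout } = await run("kiro-cli", [
    "chat",
    "--no-interactive",               // assumed: answer one prompt and exit
    "--agent", "memory-consolidator", // assumed name for the lean agent config
    prompt,
  ]);
  return stdout.trim();
}
```

Because it's just a subprocess, swapping in a different backend is a one-function change.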
You can still pick Bedrock or OpenAI if that's your stack.
Try it
npm install -g mcp-agent-memory
mcp-agent-memory --setup
The wizard walks you through picking a backend, registering with Kiro (and Claude Desktop and Cursor if you want), and installing the daemon as a LaunchAgent on macOS.
Add this short steering rule at ~/.kiro/steering/memory.md:
At the start of every session, call memory_read (no arguments) to load my
memory index. When you learn something durable about me, my projects, or
my preferences, call memory_append_session with a concise markdown summary.
Restart kiro-cli. That's it.
Reading the memory yourself
The memory isn't a black box. It's just markdown files in ~/.agent-memory/memory/:
$ ls ~/.agent-memory/memory/
MEMORY.md cli-architecture.md project-preferences.md team-processes.md
$ cat ~/.agent-memory/memory/project-preferences.md
# Project Preferences
- Package manager: pnpm (never npm)
- Testing framework: Vitest (not Jest)
- Main entry: src/cli.ts
- Return types: explicit, not inferred
...
$ grep -r "Vitest" ~/.agent-memory/memory/
project-preferences.md:- Testing framework: Vitest (not Jest)
cat works. grep works. git works. If you hate what it stored, delete the file.
What this isn't
- Not a knowledge graph with vector search. If you want that, totalrecallai does it beautifully — SQLite, embeddings, web dashboard, the works.
- Not AgentCore Memory. That's a managed Bedrock service. This runs on your laptop.
- Not a replacement for steering files. Steering is "what is this project." Memory is "what have we learned together." Use both.
Why this flavor exists
If I were going to pay for a heavyweight memory system, I probably wouldn't have built this. But:
- I wanted memory as plain markdown I could read, grep, and version-control.
- I wanted Kiro CLI support specifically (totalrecallai doesn't list it — targets Claude Code, Cursor, Windsurf, Cline).
- I wanted near-zero ongoing cost via the Kiro backend.
- I wanted the MCP server's surface area to be tiny — 3 tools, no dashboard, no SDK.
If that set of constraints sounds right for you, this is your tool. If you want the database-backed semantic-search dashboard experience, try totalrecallai — it's genuinely great at what it does.
Links
- npm: mcp-agent-memory
- GitHub: tverney/mcp-agent-memory
- The async daemon: previous post
- MCP spec: Model Context Protocol
If you try it and something breaks, file an issue. If you've got a pattern for what should be memorable vs. forgettable, drop it in the comments — that's the next hard problem I don't have a great answer for yet. Right now the daemon errs on the side of capturing too much, and I prune by hand with rm.
Tomorrow morning, Kiro will remember who I am. I'm not a stranger to it anymore.