I've been working on a Mac-native agent framework for about a year. One of the hardest problems: making the agent actually remember context across sessions in a way that's useful, not just "here's your last 10 messages."
What I ended up with is a knowledge graph — entities (people, projects, tools, decisions) with typed relations, stored locally, updated automatically as you interact.
When you say "draft the Q3 report like last time," it knows:
- What "last time" looks like (the template, the tone, who received it)
- Who the stakeholders are
- Which files are relevant
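A minimal sketch of the idea, using Python's built-in `sqlite3`. This is not the actual Kocoro schema: the table names, column names, relation types, and sample entities are all made up for illustration.

```python
import sqlite3

# Hypothetical schema: entities plus typed, directed relations between them.
SCHEMA = """
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,            -- 'person', 'project', 'tool', 'decision', ...
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE relation (
    src      INTEGER NOT NULL REFERENCES entity(id),
    rel_type TEXT    NOT NULL,     -- e.g. 'sent_to', 'uses_template'
    dst      INTEGER NOT NULL REFERENCES entity(id),
    PRIMARY KEY (src, rel_type, dst)
);
"""

def open_graph(path=":memory:"):
    """Open (or create) a local graph database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def add_entity(conn, kind, name):
    """Insert an entity if it is new and return its id."""
    conn.execute("INSERT OR IGNORE INTO entity (kind, name) VALUES (?, ?)", (kind, name))
    return conn.execute("SELECT id FROM entity WHERE name = ?", (name,)).fetchone()[0]

def relate(conn, src, rel_type, dst):
    """Record a typed edge src -[rel_type]-> dst (idempotent)."""
    conn.execute(
        "INSERT OR IGNORE INTO relation (src, rel_type, dst) VALUES (?, ?, ?)",
        (src, rel_type, dst),
    )

def neighbors(conn, name, rel_type):
    """Names of entities reachable from `name` over one edge of `rel_type`."""
    rows = conn.execute(
        """SELECT d.name FROM relation r
           JOIN entity s ON s.id = r.src
           JOIN entity d ON d.id = r.dst
           WHERE s.name = ? AND r.rel_type = ?
           ORDER BY d.name""",
        (name, rel_type),
    )
    return [row[0] for row in rows]

conn = open_graph()
report = add_entity(conn, "project", "Q2 report")
relate(conn, report, "uses_template", add_entity(conn, "file", "status_template.md"))
relate(conn, report, "sent_to", add_entity(conn, "person", "Alice"))
relate(conn, report, "sent_to", add_entity(conn, "person", "Bob"))
print(neighbors(conn, "Q2 report", "sent_to"))  # ['Alice', 'Bob']
```

With edges like these, "who received it last time" becomes a one-hop query instead of something the model has to guess from chat history.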
The rough architecture
- Agent loop runs locally on macOS (Swift daemon + Go backend)
- LLM-agnostic: Claude, GPT-4o, Gemini, Ollama all work
- Memory tier 1: SQLite graph with embedding-based fuzzy retrieval
- Memory tier 2: Standard MEMORY.md files per project
- Memory tier 3: Nightly-trained personal memory model, updated while you sleep
- Tools: file ops, browser (Playwright), calendar, terminal, screen
- IM bridge: agent can push updates to Slack/LINE/Feishu while you're away
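To make "embedding-based fuzzy retrieval" in tier 1 concrete, here is a rough Python sketch. It assumes a toy embedding (hashed character trigrams) as a stand-in for a real embedding model, so it only captures spelling similarity, not meaning. The function names and sample entity names are hypothetical.

```python
import hashlib
import math

DIM = 256  # size of the toy embedding vector

def embed(text):
    """Toy embedding: count character trigrams, hashed into DIM buckets."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        # md5 keeps bucketing stable across runs (built-in hash() is salted).
        bucket = int(hashlib.md5(gram.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def fuzzy_lookup(query, names, top_k=3):
    """Return the top_k (score, name) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(name)), name) for name in names]
    return sorted(scored, reverse=True)[:top_k]

names = ["Q3 report", "weekly status report", "Playwright browser tool", "Slack bridge"]
for score, name in fuzzy_lookup("q3 reprot", names, top_k=2):
    print(f"{score:.2f}  {name}")
```

In a real setup the vectors would come from an embedding model and be stored next to the entity rows, but the ranking step looks the same: embed the query, score every candidate, keep the best few.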
The part that surprised me most
The memory graph changed how I use the agent more than any LLM upgrade did.
When an agent has persistent structured context, you stop re-explaining everything and start delegating actual work. The shift from "AI as search box" to "AI as coworker" isn't about the model — it's about memory architecture.
What broke along the way
Auto-approval for filesystem ops — lost a config file on day 3. Everything destructive now requires a human confirm step.
Trying to be a general assistant from day one — it works way better when you give it a specific workflow to own first. I started with "write and send my weekly status report."
Ignoring the LLM choice — for agentic tasks with tool chains, Claude Sonnet/Opus vs GPT-4o makes a meaningful difference in error recovery rate.
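The human confirm step from the first item can be sketched in a few lines of Python. This is an illustration of the pattern only: the action names, the `run_file_op` function, and the prompt wording are assumptions, not Kocoro's API.

```python
import os
import tempfile

DESTRUCTIVE = {"delete"}  # hypothetical set of actions that need approval

def run_file_op(action, path, confirm=input):
    """Run a file operation, asking a human first if it is destructive.

    `confirm` receives a prompt string and returns the human's answer;
    it defaults to input() but can be swapped for an IM or GUI prompt.
    """
    if action in DESTRUCTIVE:
        answer = confirm(f"Agent wants to {action} {path!r}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return "denied"
    if action == "delete":
        os.remove(path)
        return "done"
    if action == "read":
        with open(path) as f:
            return f.read()
    raise ValueError(f"unsupported action: {action}")

# Demo with scripted answers instead of a live prompt.
fd, path = tempfile.mkstemp()
os.close(fd)
print(run_file_op("delete", path, confirm=lambda prompt: "n"))  # denied
print(os.path.exists(path))                                     # True
print(run_file_op("delete", path, confirm=lambda prompt: "y"))  # done
```

The default answer is "no": anything other than an explicit `y` leaves the file alone, so a missed or garbled reply can't delete anything.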
The project
I turned this into Kocoro (open-sourced the core runtime: github.com/Kocoro-lab/Kocoro). It's a Mac-native AI agent — local-first, memory graph built in, IM notifications, LLM-agnostic.
Just opened a closed beta — drop a comment if you want an invite.
What are others using for persistent agent memory on local setups?
Top comments (1)
Happy to share closed beta invites — just drop a comment here or DM me.
Also open to questions about the memory graph architecture. The part that got the most iteration was the SQLite + embedding retrieval layer — figuring out when to do exact graph traversal vs. fuzzy semantic lookup took a few weeks to get right.
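One simple way to frame the exact-vs-fuzzy decision is "try an exact name match first, fall back to a similarity search only on a miss." The Python sketch below shows that routing under loud assumptions: the graph is a plain dict, `difflib` string similarity stands in for the embedding lookup, and the 0.6 cutoff is arbitrary. It is not the logic Kocoro actually ships.

```python
import difflib

# Hypothetical in-memory graph: entity name -> {relation type: [entity names]}
GRAPH = {
    "Q3 report": {"sent_to": ["Alice", "Bob"], "uses_template": ["status_template.md"]},
    "weekly status report": {"sent_to": ["Alice"]},
}

def resolve(mention, cutoff=0.6):
    """Map a user mention to an entity name.

    Returns (name, how), where how is 'exact', 'fuzzy', or 'miss'.
    """
    by_lower = {name.lower(): name for name in GRAPH}
    key = mention.strip().lower()
    if key in by_lower:                      # cheap, precise path
        return by_lower[key], "exact"
    # Stand-in for the embedding lookup: best string-similarity match.
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    if close:
        return by_lower[close[0]], "fuzzy"
    return None, "miss"

def related(mention, rel_type):
    """Resolve the mention, then do an exact one-hop traversal from it."""
    name, how = resolve(mention)
    if name is None:
        return [], how
    return GRAPH[name].get(rel_type, []), how

print(related("Q3 Report", "uses_template"))
print(related("q3 reprot", "sent_to"))
print(related("zzz", "sent_to"))
```

The fuzzy step only decides which node to start from; once a node is chosen, the traversal itself stays exact, which keeps the results explainable.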
If you're building something similar (local agent memory, persistent context, etc.), would love to compare notes.