I built a persistent memory graph for my Mac AI agent — here's the architecture

WAYLAND ZHANG — Wed, 03 Jun 2026 00:41:12 +0000

I've been working on a Mac-native agent framework for about a year. One of the hardest problems: making the agent actually remember context across sessions in a way that's useful, not just "here's your last 10 messages."

What I ended up with is a knowledge graph — entities (people, projects, tools, decisions) with typed relations, stored locally, updated automatically as you interact.
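To make the idea concrete, here is a minimal sketch of entities with typed relations in a local SQLite file. It is written in Python for brevity and is not Kocoro's actual schema: the table names, column names, and helper functions (`open_graph`, `upsert_entity`, `relate`, `neighbors`) are all assumptions made for illustration.

```python
import sqlite3

def open_graph(path=":memory:"):
    """Create two hypothetical tables: entities and typed relations."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS entity (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,          -- person, project, tool, decision
            name TEXT NOT NULL,
            UNIQUE (kind, name)
        );
        CREATE TABLE IF NOT EXISTS relation (
            src      INTEGER NOT NULL REFERENCES entity(id),
            rel_type TEXT NOT NULL,      -- e.g. received_by (made-up label)
            dst      INTEGER NOT NULL REFERENCES entity(id),
            UNIQUE (src, rel_type, dst)
        );
    """)
    return conn

def upsert_entity(conn, kind, name):
    """Insert the entity if it is new and return its id either way."""
    conn.execute(
        "INSERT OR IGNORE INTO entity (kind, name) VALUES (?, ?)", (kind, name)
    )
    row = conn.execute(
        "SELECT id FROM entity WHERE kind = ? AND name = ?", (kind, name)
    ).fetchone()
    return row[0]

def relate(conn, src, rel_type, dst):
    """Add a typed edge; repeating the same edge is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO relation (src, rel_type, dst) VALUES (?, ?, ?)",
        (src, rel_type, dst),
    )

def neighbors(conn, src, rel_type):
    """Return (kind, name) for every entity reachable over one typed edge."""
    rows = conn.execute(
        "SELECT e.kind, e.name FROM relation r "
        "JOIN entity e ON e.id = r.dst "
        "WHERE r.src = ? AND r.rel_type = ? ORDER BY e.name",
        (src, rel_type),
    )
    return rows.fetchall()

conn = open_graph()
report = upsert_entity(conn, "project", "Q3 report")
alice = upsert_entity(conn, "person", "Alice")
relate(conn, report, "received_by", alice)
relate(conn, report, "received_by", alice)  # duplicate edge is ignored
recipients = neighbors(conn, report, "received_by")
```

The `UNIQUE` constraints are what make automatic updates safe to repeat in this sketch: seeing the same person or the same relation twice does not create duplicates.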

When you say "draft the Q3 report like last time," it knows:

What "last time" looks like (the template, the tone, who received it)
Who the stakeholders are
Which files are relevant
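One way to picture how "like last time" could be resolved is a short walk over typed edges: find the previous instance of the task, then collect what hangs off it. The sketch below is an assumption about the shape of that lookup, not the project's real code. The relation names (`previous_instance`, `uses_template`, `has_tone`, `received_by`, `uses_file`) and all the sample data are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    src: str
    rel: str
    dst: str

# Hypothetical graph contents, for illustration only.
EDGES = [
    Edge("Q3 report", "previous_instance", "Q2 report"),
    Edge("Q2 report", "uses_template", "quarterly-template.md"),
    Edge("Q2 report", "has_tone", "concise, numbers first"),
    Edge("Q2 report", "received_by", "Alice"),
    Edge("Q2 report", "received_by", "Bob"),
    Edge("Q2 report", "uses_file", "q2-metrics.csv"),
]

WANTED = ("uses_template", "has_tone", "received_by", "uses_file")

def context_for(edges, task, wanted=WANTED):
    """Collect the context attached to the previous instance of `task`."""
    previous = next(
        (e.dst for e in edges if e.src == task and e.rel == "previous_instance"),
        None,
    )
    if previous is None:
        return {}  # nothing to imitate; the agent would have to ask
    context = {"previous": previous}
    for e in edges:
        if e.src == previous and e.rel in wanted:
            context.setdefault(e.rel, []).append(e.dst)
    return context

ctx = context_for(EDGES, "Q3 report")
```

The resulting dictionary is the kind of structured context that could be placed in the prompt instead of replaying raw chat history.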

The rough architecture

Agent loop runs locally on macOS (Swift daemon + Go backend)
LLM-agnostic: Claude, GPT-4o, Gemini, Ollama all work
Memory tier 1: SQLite graph with embedding-based fuzzy retrieval
Memory tier 2: Standard MEMORY.md files per project
Memory tier 3: Nightly-trained personal memory model, updated while you sleep
Tools: file ops, browser (Playwright), calendar, terminal, screen
IM bridge: agent can push updates to Slack/LINE/Feishu while you're away
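The "embedding-based fuzzy retrieval" in tier 1 can be sketched as a cosine-similarity ranking over entity names. This example swaps in a toy character-trigram embedding so that it runs with the standard library alone; a real setup would presumably call an embedding model instead. `toy_embed`, `fuzzy_lookup`, and the sample names are hypothetical.

```python
import math
import zlib

DIM = 256  # size of the toy vector; an arbitrary choice for this sketch

def toy_embed(text):
    """Stand-in for a real embedding: hashed character-trigram counts."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # crc32 is stable across runs, unlike Python's built-in hash()
        vec[zlib.crc32(trigram.encode("utf-8")) % DIM] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def fuzzy_lookup(query, names, top_k=1):
    """Return the top_k entity names most similar to the query."""
    q = toy_embed(query)
    scored = sorted(
        ((cosine(q, toy_embed(name)), name) for name in names),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]

names = ["Q3 report", "Calendar sync", "Playwright browser"]
best = fuzzy_lookup("q3 reprot", names)  # misspelled on purpose
```

The point of the fuzzy step is that a misspelled or paraphrased mention can still land on the right graph node, after which exact typed relations take over.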

The part that surprised me most

The memory graph changed how I use the agent more than any LLM upgrade did.

When an agent has persistent structured context, you stop re-explaining everything and start delegating actual work. The shift from "AI as search box" to "AI as coworker" isn't about the model — it's about memory architecture.

What broke along the way

Auto-approval for filesystem ops — lost a config file on day 3. Everything destructive now requires a human confirm step.
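A human confirm step like this can be sketched as a small gate in front of every file operation. The version below is an illustration under assumptions, not the project's implementation: the operation names, the `SAFE_OPS` allowlist, and `guarded_file_op` are made up. It defaults to asking, so any operation not explicitly marked safe needs approval.

```python
SAFE_OPS = {"read", "list"}  # hypothetical allowlist; everything else asks

def guarded_file_op(op, path, execute, confirm):
    """Run `execute(op, path)` only if the op is safe or a human approves.

    `confirm` is any callable that takes a prompt string and returns a bool,
    for example a wrapper around input() or an IM approval button.
    """
    if op not in SAFE_OPS and not confirm(f"Allow {op} on {path}?"):
        return "skipped"
    execute(op, path)
    return "done"

performed = []

def record(op, path):
    """Stand-in executor that only records what would have happened."""
    performed.append((op, path))

def deny(prompt):
    return False

def allow(prompt):
    return True

r1 = guarded_file_op("read", "config.yaml", record, deny)
r2 = guarded_file_op("delete", "config.yaml", record, deny)
r3 = guarded_file_op("delete", "old.log", record, allow)
```

Using an allowlist rather than a blocklist means a new, unclassified operation is treated as destructive until someone decides otherwise.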

Trying to be a general assistant from day one — it works way better when you give it a specific workflow to own first. I started with "write and send my weekly status report."

Ignoring the LLM choice: for agentic tasks with tool chains, the choice between Claude Sonnet/Opus and GPT-4o makes a meaningful difference in how often the agent recovers from errors.

The project

I turned this into Kocoro (open-sourced the core runtime: github.com/Kocoro-lab/Kocoro). It's a Mac-native AI agent — local-first, memory graph built in, IM notifications, LLM-agnostic.

Just opened a closed beta — drop a comment if you want an invite.

What are others using for persistent agent memory on local setups?

DEV Community: WAYLAND ZHANG
