DEV Community

WAYLAND ZHANG

I built a persistent memory graph for my Mac AI agent — here's the architecture

I've been working on a Mac-native agent framework for about a year. One of the hardest problems: making the agent actually remember context across sessions in a way that's useful, not just "here's your last 10 messages."

What I ended up with is a knowledge graph — entities (people, projects, tools, decisions) with typed relations, stored locally, updated automatically as you interact.
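Here is a minimal sketch of what such a graph can look like in SQLite, written in Python for brevity. The table names, column names, relation types, and sample rows are illustrative assumptions, not Kocoro's actual schema.

```python
import sqlite3

# Hypothetical schema: one table for entities, one for typed edges between them.
SCHEMA = """
CREATE TABLE entities (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,            -- e.g. person, project, tool, decision
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE relations (
    src  INTEGER NOT NULL REFERENCES entities(id),
    type TEXT NOT NULL,            -- e.g. received_by, uses_template
    dst  INTEGER NOT NULL REFERENCES entities(id),
    UNIQUE (src, type, dst)
);
"""

def open_graph(path=":memory:"):
    """Open (or create) the graph database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def upsert_entity(conn, kind, name):
    """Insert the entity if it is new and return its id either way."""
    conn.execute(
        "INSERT OR IGNORE INTO entities (kind, name) VALUES (?, ?)", (kind, name)
    )
    row = conn.execute("SELECT id FROM entities WHERE name = ?", (name,)).fetchone()
    return row[0]

def relate(conn, src_id, rel_type, dst_id):
    """Record a typed edge; repeating the same edge is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO relations (src, type, dst) VALUES (?, ?, ?)",
        (src_id, rel_type, dst_id),
    )

def neighbors(conn, name):
    """One-hop traversal: every (relation type, kind, name) reachable from `name`."""
    return conn.execute(
        """
        SELECT r.type, e2.kind, e2.name
        FROM relations r
        JOIN entities e1 ON e1.id = r.src
        JOIN entities e2 ON e2.id = r.dst
        WHERE e1.name = ?
        ORDER BY r.type, e2.name
        """,
        (name,),
    ).fetchall()
```

With made-up data, usage could look like `report = upsert_entity(conn, "project", "Q3 report")` followed by `relate(conn, report, "received_by", alice)`, after which `neighbors(conn, "Q3 report")` returns the stakeholders and files attached to that report.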

When you say "draft the Q3 report like last time," it knows:

  • What "last time" looks like (the template, the tone, who received it)
  • Who the stakeholders are
  • Which files are relevant

The rough architecture

  • Agent loop runs locally on macOS (Swift daemon + Go backend)
  • LLM-agnostic: Claude, GPT-4o, Gemini, Ollama all work
  • Memory tier 1: SQLite graph with embedding-based fuzzy retrieval
  • Memory tier 2: Standard MEMORY.md files per project
  • Memory tier 3: Nightly-trained personal memory model, updated while you sleep
  • Tools: file ops, browser (Playwright), calendar, terminal, screen
  • IM bridge: agent can push updates to Slack/LINE/Feishu while you're away
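To make the "embedding-based fuzzy retrieval" in tier 1 concrete, here is a rough Python sketch of similarity ranking over entity names. The `embed` function is a toy bag-of-words stand-in that I'm using only so the example runs without a model; a real setup would call an embedding model and store the vectors next to the graph rows. The function names, `top_k`, and `min_score` are all assumptions for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def fuzzy_lookup(query, entity_names, top_k=3, min_score=0.1):
    """Rank entity names by similarity to the query, dropping weak matches."""
    q = embed(query)
    scored = [(cosine(q, embed(name)), name) for name in entity_names]
    scored = [(score, name) for score, name in scored if score >= min_score]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]
```

Calling `fuzzy_lookup("draft the Q3 report", names)` on a list of entity names ranks the ones sharing the most vocabulary with the request first. Swapping `embed` for a real model changes the quality of the ranking, not the shape of the code.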

The part that surprised me most

The memory graph changed how I use the agent more than any LLM upgrade did.

When an agent has persistent structured context, you stop re-explaining everything and start delegating actual work. The shift from "AI as search box" to "AI as coworker" isn't about the model — it's about memory architecture.

What broke along the way

Auto-approval for filesystem ops: I lost a config file on day 3. Every destructive operation now requires a human confirmation step.
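A confirmation gate like that can be sketched in a few lines of Python. This is an assumption-laden illustration, not the code Kocoro runs: `guarded_file_op`, the two op names, and the "anything that touches an existing file is destructive" rule are all hypothetical choices.

```python
import tempfile
from pathlib import Path

def guarded_file_op(op, path, data="", confirm=input):
    """Run a file op, asking a human first if it would erase existing bytes.

    `confirm` is any callable that takes a prompt and returns the answer,
    so a UI dialog or an IM message can replace the terminal prompt.
    """
    path = Path(path)
    if op not in ("write", "delete"):
        raise ValueError(f"unknown op: {op}")
    # Assumed rule: overwriting or deleting an existing file is destructive.
    if path.exists():
        answer = confirm(f"Agent wants to {op} {path}. Type 'yes' to allow: ")
        if answer.strip().lower() != "yes":
            return "denied"
    if op == "delete":
        path.unlink(missing_ok=True)
    else:
        path.write_text(data)
    return "done"

def demo():
    """Exercise the gate in a throwaway directory with scripted answers."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config.toml"
        created = guarded_file_op("write", cfg, "theme = 'dark'\n")  # new file, no prompt
        denied = guarded_file_op("delete", cfg, confirm=lambda _: "no")
        survived = cfg.exists()
        allowed = guarded_file_op("delete", cfg, confirm=lambda _: "yes")
        return created, denied, survived, allowed, cfg.exists()
```

Passing `confirm` as a parameter keeps the policy testable and leaves the approval channel open.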

Trying to be a general assistant from day one — it works way better when you give it a specific workflow to own first. I started with "write and send my weekly status report."

Ignoring the LLM choice: for agentic tasks with tool chains, the choice between Claude Sonnet/Opus and GPT-4o makes a meaningful difference in error recovery rate.

The project

I turned this into Kocoro (open-sourced the core runtime: github.com/Kocoro-lab/Kocoro). It's a Mac-native AI agent — local-first, memory graph built in, IM notifications, LLM-agnostic.

Just opened a closed beta — drop a comment if you want an invite.

What are others using for persistent agent memory on local setups?

Top comments (1)

WAYLAND ZHANG

Happy to share closed beta invites — just drop a comment here or DM me.

Also open to questions about the memory graph architecture. The part that got the most iteration was the SQLite + embedding retrieval layer — figuring out when to do exact graph traversal vs. fuzzy semantic lookup took a few weeks to get right.
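One simple way to frame the exact-vs-fuzzy decision is "exact first, fuzzy only as a fallback, and give up below a threshold." The Python sketch below shows that shape under loud assumptions: `resolve` and the `0.6` threshold are hypothetical, and `difflib` string similarity stands in for a semantic embedding score purely to keep the example dependency-free.

```python
from difflib import SequenceMatcher

def resolve(mention, known_names, threshold=0.6):
    """Map a user mention to a graph entity name.

    Returns ("exact", name), ("fuzzy", name), or ("none", None).
    """
    # Step 1: exact match (case-insensitive), which is cheap and unambiguous.
    by_key = {name.casefold(): name for name in known_names}
    hit = by_key.get(mention.casefold())
    if hit is not None:
        return ("exact", hit)

    # Step 2: fuzzy fallback. String similarity is a stand-in here;
    # a real system would compare embedding vectors instead.
    best_name, best_score = None, 0.0
    for name in known_names:
        score = SequenceMatcher(None, mention.casefold(), name.casefold()).ratio()
        if score > best_score:
            best_name, best_score = name, score

    # Step 3: refuse to guess when nothing is close enough.
    if best_score >= threshold:
        return ("fuzzy", best_name)
    return ("none", None)
```

An "exact" result can go straight to graph traversal, while a "fuzzy" result is a candidate the agent may want to confirm before acting on it.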

If you're building something similar (local agent memory, persistent context, etc.), would love to compare notes.