DockSky

Posted on Jun 26

How I Gave My AI Persistent Memory: From Markdown Hacks to MCP

#ai #mcp #llm #productivity

June 2026 - In 2025 I hacked together a markdown-file solution to avoid re-explaining my context every session. It worked. Up to a point. Here is why I ended up building something different.

The problem, still very real

You know the scenario.

Session 1:

You: "I'm working on a FastAPI API with MySQL, deployed on a VPS..."
The AI: understands everything, codes perfectly.

Session 2, the next day:

You: "Continue yesterday's work."
The AI: "I don't have context on your project. Could you describe..."
You: 😤

This is not a bug. It is the nature of LLMs: zero memory between sessions. Every conversation starts from scratch.

With a complex project (API, frontend, Docker, VPS, multiple repos), re-explaining the context takes 10 to 15 minutes per session. Over a week, at one session per working day: about an hour lost. Over a month: four to five hours, more than half a working day gone.

And the worst part: it is not the code that evaporates. It is the why. The decisions, the traps you avoided, the paths you dropped. Exactly what separates "it compiles" from "it is maintainable."

My first answer: the `/AI/memory/` folder (2025)

A DIY solution. One folder per project:

my-project/
├── AI/
│   └── memory/
│       ├── project_context.md
│       ├── tech_stack.md
│       └── code_registry.json

At session start:

"Read the files in /AI/memory/ to load the context."

Thirty seconds. The AI knows the project, the conventions, the architectural decisions. It worked.
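To illustrate what "read the files in /AI/memory/" amounts to, here is a minimal sketch that concatenates the folder into one context block. The function name `load_memory` and the `## filename` separators are my own assumptions for illustration, not part of the original setup, where the AI simply read the files itself.

```python
from pathlib import Path


def load_memory(memory_dir: str) -> str:
    """Concatenate every file in the memory folder into one context block."""
    parts = []
    # Sort by name so the context is assembled in a stable order.
    for path in sorted(Path(memory_dir).iterdir()):
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            parts.append(f"## {path.name}\n{text}")
    return "\n\n".join(parts)
```

The result can be pasted at the top of a session. Nothing here keeps the files up to date, which is exactly the weakness described next.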

For months I ran like this. Multiple repos. Multiple folders. Markdown everywhere. Then the cracks showed up.

The limits: why DIY is not enough

Manual maintenance

Every new endpoint, every architecture decision: you have to remember to update the files. When you juggle sessions, especially with a brain that jumps from topic to topic, "do not forget to update" is precisely the problem you started with.

Drift

A few weeks later, the markdown describes an architecture that no longer exists. The AI works from stale information. You do not always notice until the prod bug or the incoherent PR.

Copy-paste persists

You ask the AI to read local files. Or you copy the JSON yourself. It is not fluid. It is not reliable. And it does not scale when you go from one repo to four.

Fragmentation

docksky-api, docksky-web, tda-assistant-ui, infra... Each repo has its own /AI/memory/ folder. No overview. Impossible to switch context without reloading manually.

Verdict: the method treated the symptom, not the cause. Context lived in dead files, updated when I thought of it. That is, not often enough.

MCP: what changes the game

In late 2024, Anthropic published the Model Context Protocol (MCP): an open standard for an AI agent to interact with external data sources in a structured way.

Before:

"The AI reads files I prepared."

After:

"The AI calls tools and interacts directly with my knowledge base."

It reads context. It writes a note. It creates a journal entry. Without my intervention. No copy-paste. No asking whether I remembered to update the markdown.

The difference is not just technical. It is structural. Context is no longer a static file. It is a living system.
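To make "the AI calls tools" concrete: MCP messages are JSON-RPC 2.0, and a client discovers tools with `tools/list` and invokes them with `tools/call`. The sketch below is a deliberately simplified, standard-library stand-in for that exchange, not the official SDK and not DockSky's server. The tool names `add_note` and `read_notes` are hypothetical, and a real server also publishes an input schema for each tool.

```python
import json

# Hypothetical in-memory knowledge base; a real server would persist this.
NOTES: list[str] = []


def add_note(text: str) -> str:
    NOTES.append(text)
    return f"stored note #{len(NOTES)}"


def read_notes() -> str:
    return "\n".join(NOTES) or "(no notes yet)"


# Tool registry: name -> (description, handler). Names are illustrative only.
TOOLS = {
    "add_note": ("Store a short note", add_note),
    "read_notes": ("Return all stored notes", read_notes),
}


def handle(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request in the shape MCP uses for tools."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": name, "description": desc}
                            for name, (desc, _) in TOOLS.items()]}
    elif req["method"] == "tools/call":
        name = req["params"]["name"]
        args = req["params"].get("arguments", {})
        text = TOOLS[name][1](**args)
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the standard JSON-RPC "method not found" error code.
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})
```

The point of the shape: the agent decides when to call `add_note`, so writing to the knowledge base no longer depends on me remembering to do it.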

What I built: DockSky

I answered the problem by building DockSky. Not another chatbot, but a shared external brain between me and my AI.

A desktop app plus an MCP server. When Cursor or Claude connects via MCP, the agent gets native tools to:

list my projects
load a specific project's context
read my facets (decisions, traps, patterns, technical info)
create a journal entry
update an action or a step

Two complementary layers:

| | Facets | Journal |
|---|---|---|
| What | What is worth keeping | What I did, day by day |
| When | Thematic, long term | Chronological |
| Example | `TRAP: missing UpdateSourceTrigger on this binding` | "Fixed login bug, tested on staging" |

The base is persistent. It is the same one I use daily in the app. When the AI notes a decision, it writes where I will find it tomorrow, not in a forgotten markdown file at the bottom of a repo.
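The two layers can be pictured as two small record types: one read by theme, one read by date. This is a sketch under my own assumptions; the class and field names (`FacetEntry`, `JournalEntry`, `ProjectMemory`, `theme`, `kind`) are hypothetical and do not describe DockSky's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FacetEntry:
    theme: str  # long-lived topic, e.g. "UI bindings"
    kind: str   # e.g. "TRAP" or "DECISION"
    text: str


@dataclass
class JournalEntry:
    day: date
    text: str


@dataclass
class ProjectMemory:
    facets: list[FacetEntry] = field(default_factory=list)
    journal: list[JournalEntry] = field(default_factory=list)

    def by_theme(self, theme: str) -> list[FacetEntry]:
        """Facets are read thematically: everything known about one topic."""
        return [f for f in self.facets if f.theme == theme]

    def timeline(self) -> list[JournalEntry]:
        """The journal is read chronologically, oldest entry first."""
        return sorted(self.journal, key=lambda e: e.day)
```

Keeping the two types separate means a trap recorded months ago stays one lookup away, while the journal can grow day by day without burying it.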

Concrete results (dogfooding, several months)

Starting a session on an active project: under 30 seconds
The AI no longer asks "remind me of the architecture"
Decisions from one session are available in the next
Zero markdown files to maintain by hand
Git commits feed the journal automatically (post-commit hook)
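For the post-commit hook, the general mechanism looks roughly like the sketch below: read the commit that was just created and append one line to a journal. The assumptions are mine: the journal is a plain text file here as a stand-in for the real store, and the function names and line format are hypothetical.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def read_last_commit() -> tuple[str, str]:
    """Return (short hash, subject) of HEAD; must run inside a git repository."""
    out = subprocess.run(
        ["git", "log", "-1", "--pretty=format:%h%n%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    short_hash, subject = out.split("\n", 1)
    return short_hash, subject


def journal_line(short_hash: str, subject: str, when: datetime) -> str:
    """Format one chronological journal line for a commit."""
    return f"{when:%Y-%m-%d %H:%M} commit {short_hash}: {subject}"


def append_to_journal(journal_path: str, line: str) -> None:
    """Append the line to the journal file, creating it if needed."""
    with Path(journal_path).open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

An executable `.git/hooks/post-commit` script would call `read_last_commit()`, pass the result to `journal_line()` with the current time, and hand the line to `append_to_journal()`.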

It is not magic. It is discipline reinforced by the tool: one point done, one trace in DockSky, move to the next. Tech does not replace the habit. It makes it cheaper.

What I learned

AI memory is not in the model — it must be structured and reloadable.
Local markdown = good prototype, bad product — human maintenance is a guaranteed failure point.
MCP = the right level of abstraction — standardized tools, not artisanal copy-paste.
Intent over code — what matters is capturing the why as you go, not archiving 400 KB of transcript.

A recent dev.to post put it better than I could: "Agents write code, but they don't remember." Exactly. Generation is solved. Context persistence is not.

The minimal workflow (no DockSky required)

1 specific point → done → 2 lines of trace → next

A journal. A decisions file. Even a plain DECISIONS.md next to the README. What matters: do not let intent evaporate.
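A minimal sketch of the "2 lines of trace" step, assuming a plain DECISIONS.md file. The function name `log_trace` and the DONE / WHY layout are illustrative choices, not a prescribed format.

```python
from datetime import date
from pathlib import Path


def log_trace(path: str, done: str, why: str, day: date) -> None:
    """Append a two-line trace: what was done, and why."""
    entry = f"- {day.isoformat()} DONE: {done}\n  WHY: {why}\n"
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(entry)
```

For example, `log_trace("DECISIONS.md", "switched auth to JWT", "sessions broke behind the proxy", date.today())` records the point and its reason in one call, before moving to the next point.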

DockSky is that workflow turned into a system.

Does this problem sound familiar?

DockSky is a project I am building because I needed it: a brain that jumps, multiple repos, AI sessions that restart from zero every time.

I am not looking for customers. I am looking for people who recognize this story and want to co-build, not just use a tool.

What I am offering

The crew (Équipage): a private space (Discord plus Pro access while we build) where we talk about product direction. Sort ideas. Make design calls. Say what works or blocks in real usage. Not a support channel. A kitchen where founders think before anything goes public.

You do not need to be an expert at everything. What I care about:

you live the same context / AI memory problem;
you want to contribute: field feedback, code, docs, UX ideas;
you accept we are building, not shipping a finished product.

What this is not

Not a "download for free" pitch
Not an obligation to test X hours per week
Not open enrollment. The crew opens by invitation, after I read your application

How to reach me

If this article resonated, get in touch. Comment here, or apply at docksky.fr/equipage.

Tell me what hooked you and what you would want to bring to the table. No marketing funnel. I read every message myself.

No commitment. Just see if we speak the same language.

How do you handle AI context between sessions today: markdown, rules files, something else? I would genuinely like to hear in the comments.

Top comments (6)

Tae Kim • Jun 26

One architectural split that helped me: separating episodic memory (conversation turns, what was tried in what order) from semantic memory (project facts, preferences, reference context). For the episodic side I use LangGraph's Postgres checkpointer attached to the graph state -- it means the agent resumes exactly where it left off without loading all prior turns into the MCP call at session start. The MCP tool layer handles semantic retrieval on demand. The risk of putting everything into a single knowledge base is that episodic context comes back without temporal ordering guarantees, and that ordering matters when the agent needs to understand what was attempted in what sequence.
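As an illustration of the split described in this comment, here is a plain-Python sketch. It is not LangGraph and not a checkpointer, and the class names are hypothetical. It only shows the two access patterns: an ordered, append-only episodic log versus a keyed semantic store.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EpisodicLog:
    """Append-only record of attempts; list order is the temporal order."""
    steps: list[str] = field(default_factory=list)

    def record(self, step: str) -> int:
        self.steps.append(step)
        return len(self.steps)  # 1-based sequence number of this step

    def since(self, seq: int) -> list[str]:
        """Steps recorded after sequence number `seq`, oldest first."""
        return self.steps[seq:]


@dataclass
class SemanticStore:
    """Unordered project facts, fetched by key on demand."""
    facts: dict[str, str] = field(default_factory=dict)

    def get(self, key: str) -> Optional[str]:
        return self.facts.get(key)
```

Resuming then means asking the log for `since(last_seen)` rather than replaying every turn, while facts are pulled by key only when needed.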

DockSky • Jun 26

Thanks for this comment, it landed at exactly the right time.

I'm on this thread to learn, not to pitch a finished solution. Every exchange like yours helps me mature DockSky and fill the gaps. This one in particular clicked: the episodic / semantic split is something I was already living without naming it this clearly.

What we already have:

Semantic → Facets (decisions, traps, what works): durable memory by theme
Episodic → Journal (chronological, git commits, session traces): what happened, in order
Structure → roadmap: Project → Step → Action

What was missing (and what your message clarifies): a middle layer, not the full conversation history, not frozen facts either, but "where we are right now on this project", with the ordering that matters when picking back up without reloading everything.

We actually have a field for that on the project side (current_status, a Markdown "where we are" note). It's in the app, but not wired enough into the MCP layer yet: the AI loads facets well at session start, but not yet this per-project episodic anchor. That's a gap I intend to close.

Next building block:

Include current_status in the project context loaded via MCP
End-of-point discipline: update that block (in progress / blocked / next action) before moving on
Keep the Journal for chronological trace, Facets for what deserves to be retained, without mixing everything into a single knowledge base

We're not reproducing your LangGraph checkpointer as-is (DockSky serves an external MCP client, not an integrated agent graph), but the intent is the same: resume in the right place without re-injecting the full history at every session.

Does this approach sound coherent from your side? And I'm curious: do you keep episodic checkpoints indefinitely, or prune once you've extracted the semantic layer?

Tae Kim • Jun 27

The current_status field is the right shape for the working context layer -- the main thing to get right is keeping it agent-maintained rather than user-maintained. If it requires manual updates, it decays immediately because people update it at the start of sessions but not at the end when things actually changed. What helped us: having the agent write a brief "where we are" update to that field at the end of each session as a closing step, rather than relying on the human to do it. The field stays fresh because the agent is the one who knows what just happened.

On the MCP wiring: one thing worth doing when you connect current_status is loading it as a high-priority context segment before loading any semantic facets. Order matters -- you want the agent grounded in "where are we right now" before it starts retrieving "what do we know about this topic" so the semantic retrieval is relevant to the current state rather than the full topic breadth.
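The two suggestions in this comment, an agent-written closing step and status-first loading, can be sketched as below. Only the `current_status` field name comes from the thread. The function names, the three-line status layout, and the naive substring match used to pick relevant facets are assumptions for illustration.

```python
def close_session(project: dict, in_progress: str, blocked: str, next_action: str) -> None:
    """Closing step the agent runs itself: overwrite the 'where we are' note."""
    project["current_status"] = (
        f"In progress: {in_progress}\nBlocked: {blocked}\nNext action: {next_action}"
    )


def build_context(current_status: str, facets: dict[str, str]) -> str:
    """Put the working state first, then only the facets it mentions."""
    status = current_status.lower()
    sections = [f"# Current status\n{current_status}"]
    for name, body in facets.items():
        # Naive relevance check: keep a facet only if the status names it.
        if name.lower() in status:
            sections.append(f"# Facet: {name}\n{body}")
    return "\n\n".join(sections)
```

Because the status block is emitted first, anything retrieved afterwards is already scoped to the current state rather than to the full breadth of the topic.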

DockSky • Jun 28

Thanks. I really appreciate you taking the time to share this, especially coming back a second time. It helps a lot.

This matches exactly what we're seeing in practice. We have a project-level current_status field (Markdown "where we are"), exposed in the app and via MCP. Your point about decay when it's user-maintained rings true: people update at session start, rarely at the end when the real delta happened.

The agent-maintained closing step is the pattern we're pushing too: the agent knows what just changed, not the human closing the IDE.

On MCP ordering: good reminder. Working context before semantic facets, otherwise retrieval pulls the whole topic breadth instead of grounding on current state. We're tightening the wiring so current_status loads as the first context segment before facet retrieval.

Your "what helped us" note on the closing step landed. That's the kind of practical detail that's hard to find elsewhere. Grateful for the input.

Mind Questor • Jun 27

For AurX, I use a hybrid memory architecture instead of relying only on conversation history.

Each authenticated user has an isolated memory stored in Firestore. Guests also receive a persistent anonymous ID through cookies, so their memory survives across sessions on the same device. Conversations are stored separately from long-term memory. An extraction step decides what information is worth remembering (preferences, identity, long-term facts) instead of saving entire conversations. When a new request arrives, I rebuild the prompt using: the system prompt, recent conversation history, structured long-term memories, user preferences.

Old conversation messages are cleaned up after a retention period, while long-term memories remain available until updated or removed.
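A rough sketch of the two mechanisms described here, prompt rebuilding and retention cleanup, in plain Python. This is not AurX's code and does not use Firestore. The function names, the message dictionary shape (`role`, `text`, `at`), and the section labels are all hypothetical.

```python
from datetime import datetime, timedelta


def prune_messages(messages: list[dict], now: datetime, retention: timedelta) -> list[dict]:
    """Drop conversation messages older than the retention period."""
    return [m for m in messages if now - m["at"] <= retention]


def build_prompt(system_prompt: str, recent: list[dict],
                 memories: list[str], preferences: dict[str, str]) -> str:
    """Rebuild the prompt from its four sources, skipping empty sections."""
    parts = [system_prompt]
    if recent:
        parts.append("Recent conversation:\n" +
                     "\n".join(f"{m['role']}: {m['text']}" for m in recent))
    if memories:
        parts.append("Long-term memories:\n" + "\n".join(f"- {m}" for m in memories))
    if preferences:
        parts.append("User preferences:\n" +
                     "\n".join(f"- {k}: {v}" for k, v in preferences.items()))
    return "\n\n".join(parts)
```

Long-term memories are passed in separately and never pruned by age, which matches the description: conversations expire, extracted memories stay until updated or removed.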

The idea is to make memory selective rather than exhaustive.

I'm still experimenting with this architecture, but it has been much more scalable than simply appending previous conversations to every request.

I'm curious how others are solving long-term memory. Are you using vector databases, structured memory, Markdown files, or another approach?

DockSky • Jun 27

Great approach — selective over exhaustive is exactly the right instinct.

I'm coming at it from a different angle though. DockSky isn't a chat product, so I don't have conversation history to mine. The memory layer is structured upfront: projects, thematic facets (decisions, traps, patterns, todos), and a journal. Typed entries in a relational DB, not embeddings.

Selectivity works differently:

At write time: you (or the AI via MCP) decide what's worth keeping (a DECISION, a TRAP, a NOTE), instead of extracting it later from chat logs
At read time: the AI loads only what's relevant (one facet, a context group, or a project summary via MCP tools), not the full history

I started with markdown files in /AI/memory/ (described in the article). Same problem you solved with extraction: stuff went stale, or I forgot to update it. Moving to a persistent structured store + MCP fixed the sync issue. The AI writes directly into the same DB I use daily.

No vector DB on my side yet. For dev/project context, structured beats semantic search: I know where a decision lives (facet "Backend", type DECISION) rather than hoping cosine similarity finds it.
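To show what "I know where a decision lives" means in practice, here is a small SQLite sketch: entries are typed at write time and fetched by exact facet and type at read time, with no similarity scoring. The table and column names are hypothetical, not DockSky's schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one table of typed entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry ("
        " id INTEGER PRIMARY KEY,"
        " facet TEXT NOT NULL,"
        " kind TEXT NOT NULL CHECK (kind IN ('DECISION', 'TRAP', 'NOTE')),"
        " body TEXT NOT NULL)"
    )
    return conn


def write_entry(conn: sqlite3.Connection, facet: str, kind: str, body: str) -> None:
    """Write time: the caller must choose a type; unknown types are rejected."""
    conn.execute("INSERT INTO entry (facet, kind, body) VALUES (?, ?, ?)",
                 (facet, kind, body))


def read_entries(conn: sqlite3.Connection, facet: str, kind: str) -> list[str]:
    """Read time: exact lookup by facet and type, in insertion order."""
    rows = conn.execute(
        "SELECT body FROM entry WHERE facet = ? AND kind = ? ORDER BY id",
        (facet, kind),
    )
    return [body for (body,) in rows]
```

The lookup `read_entries(conn, "Backend", "DECISION")` either returns the decision or returns nothing, which is the trade-off being claimed over semantic search: predictable retrieval at the cost of having to file entries correctly up front.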

Your hybrid model makes sense when you own the conversation. My bet is that more people will keep chatting in Claude/Cursor and need an external brain that plugs in, hence MCP instead of building another chat UI.

Curious if you've tried structured memory alongside extraction, or if Firestore + vectors covers everything for AurX?