Over the past few months I've been building with Cursor, Claude Code, and Codex - same tools a lot of builders are running right now. The pace is genuinely wild. I've shipped MVPs for friends where we went from a rough PRD to a working demo in a single day: auth wired up, database in place, deployed, link sent before midnight. Some of it fully vibe-coded. Some of it structured. All of it fast.
You're seeing this everywhere now. Builders spinning up entire products over a weekend, sometimes over an afternoon, because the tools finally keep up with the idea in your head.
And here's the thing people undersell: AI coding agents are actually pretty good at following engineering principles - when they have something to follow. Hand them a PRD, point them at your conventions, set a few non-negotiable rules, and they mostly stick to it. They'll reuse your components instead of inventing new ones. They'll write tests if tests are part of the standard. They won't skip the boring obvious stuff if the boring obvious stuff is written down somewhere they load every session.
The gap I kept hitting wasn't the coding. It was what happened to the decisions along the way.
When you move this fast, you make dozens of small calls in chat: which auth flow, which database, what "done" means for this feature, whether this module owns that boundary. The code commits to the repo. The reasoning usually doesn't. It stays in a thread that gets compacted, or a session that ends, or your head - which is a rough place to store architecture when you're juggling three builds at once.
I didn't want to slow down. I wanted the speed and a durable layer the agent could reload every session: accepted decisions, conventions, quality gates, evidence that work is actually finished.
So I built Persist OS - a local CLI that writes that layer into plain repo files and enforces it with a deterministic persist doctor gate. No network. No telemetry. No AI calls.
npx persist-os@latest init
The actual failure mode
Most agent setups treat chat history like it's source of truth. It isn't - it's the thing that evaporates first.
Persist OS flips the priority order. Repository memory wins over MCP context, external notes, and chat history. Accepted ADRs and engineering standards outrank whatever the model guessed in the last message. If they conflict, the agent is instructed to stop and report - not quietly pick a side.
That's the problem it solves: not "help me remember," but make the repo authoritative so the next session doesn't start from zero.
What persist init puts in your repo
One command scaffolds the whole memory structure under docs/ plus .persist/config.json:
docs/
00-product/ PRD, vision, roadmap
10-architecture/ stack, boundaries, decisions
20-security/ threat model, security rules
30-modules/ ownership per module
40-features/ PRD + acceptance + test plan per feature
50-quality/ quality gates
60-engineering/ standards, agent rules
adrs/ proposed → accepted decision records
It also generates the files your agent actually loads:
-
AGENTS.md- routing rules + Persist commands the agent runs itself -
CLAUDE.md- short pointer for Claude Code ("readAGENTS.md, don't trust chat") -
.cursor/rules/persist-memory.mdc- always-apply rule in Cursor -
.persist/hooks/pre-commit- runspersist doctorbefore commits land
Pick a stack preset if you want proposed ADRs for real forks (nextjs, python-fastapi, laravel-react, etc.). Every preset decision stays Proposed until you explicitly accept it - the schema won't let a preset silently become truth.
persist init --preset nextjs --ai-tools cursor,claude
git config core.hooksPath .persist/hooks # enable the doctor hook once per clone
How I run a one-day MVP with it
Say I'm building a checkout flow for a friend's shop. Fast build, but I still want the decisions to stick.
1. Scaffold memory before the agent touches code
persist feature create checkout
That creates docs/40-features/F-###-checkout/ with PRD.md, ACCEPTANCE.md, TEST_PLAN.md, TASKS.md, and the rest - not empty philosophy docs, but the exact checklist the project's delivery workflow expects. The agent isn't supposed to jump straight into src/. Module work follows the same pattern: feature docs first, module memory under docs/30-modules/, tasks after acceptance criteria and test plan exist.
2. Record decisions as proposals, accept what I mean
persist adr create payment-provider # writes docs/adrs/proposed/ADR-PROPOSED-payment-provider.md
persist adr accept payment-provider # promotes to docs/adrs/ADR-####-payment-provider.md
Nothing becomes accepted memory by accident. persist adopt on an existing repo and persist mcp add figma work the same way - inspect, propose, human accepts. The agent can capture MCP context offline into docs/ai/mcp/<server>.md, but Persist never calls the MCP server itself.
3. Let the agent load rules every session
Generated AGENTS.md tells the agent what to read and what to run:
Repository rules override model preferences. If instructions conflict, stop and report the conflict.
- `persist doctor` - validate repository memory; run before claiming any work is complete.
- `persist feature create <name>` - scaffold feature memory before non-trivial work.
- `persist adr supersede <old> <new-title>` - record a changed decision (never overwrite an accepted ADR).
In Claude Code, a SessionStart hook injects a live map of accepted ADRs and modules on top of that. Cursor gets the always-apply rule. Codex reads AGENTS.md directly. Different mechanisms, same floor: the agent reads the map, follows pointers into detail, doesn't need you to re-paste the architecture every time.
4. Gate "done" with evidence, not vibes
Before the agent can honestly say a feature is finished, the completion loop is:
pnpm test:run && pnpm typecheck && persist doctor
persist doctor is deterministic - no model judging your architecture. Exit codes: 0 healthy, 1 warnings, 2 errors. Same repo state, same result every time.
Concrete things it catches that chat never would:
| Check | What breaks |
|---|---|
| Completion evidence | Feature marked complete but REVIEW.md still pending or no test results recorded |
| ADR drift | Module doc cites ADR-0003-auth but that ADR was never accepted - error |
| Superseded decisions | Memory still points at an ADR you've since superseded - warning |
| Staleness |
src/ changed six commits ago, memory hasn't been touched since - warning |
| Context budget |
AGENTS.md + Cursor rule + CLAUDE.md bloated into a wall of text - warning |
So when you're moving fast, "done" means persist doctor exits 0 - not "the agent said it looks good." You can add persist guard --source src to fail commits where source changed without test files staged.
What changes when a decision changes
Fast builds change their mind. The failure mode there is silent contradiction - code on MySQL, ADR still says PostgreSQL, everything "passes."
Persist handles that with persist adr supersede <old> <new-title>: old ADR marked superseded, new one links back, Doctor flags anything still citing the old decision. The generated agent rules include the supersede command inline so the model uses it instead of editing history.
What it doesn't do: automatically detect that your accepted ADR and your current code quietly disagree. That needs reading, not grep. There's a generated architecture-drift-review skill for that. The gate stays dumb; the agent does the interpretation.
Not a vector memory engine (and that's intentional)
Tools like supermemory or mem0 solve retrieval - "find me what we said about Auth0." Useful, different problem.
I kept losing accepted decisions with review trails: where it was decided, whether a human accepted it, whether it's still the decision. That's a files-and-PR-review problem. Plain markdown ADRs, explicit accept/supersede, git diffs. Vector stores can index these files later; they don't replace them.
When it's worth the overhead
Use it when someone will inherit the repo - teammate, future you, fresh agent next week. That's when losing the why costs real time.
Skip it for a throwaway spike you'll delete Friday. No second session, no payoff.
The name is the test: persist what should persist. Prototype turns into product → run persist init. Stay proportional - small fix, just ship it; real feature or architectural fork, write it down.
Try it
MIT licensed. Runs entirely on your machine.
npx persist-os@latest init
- Repo: github.com/Karthick-Ramachandran/persist-os
- Why I built it: PHILOSOPHY.md
-
Example outputs per preset:
examples/generated-nextjs/,examples/generated-python-fastapi/, etc. in the repo
If you're shipping MVPs in a day and want the agent to keep following the same rules on day seven - this is what I wished existed. Tell me what breaks when you run it.
Top comments (0)