I gave my notes a cache hierarchy: phone L1, vault L2

#obsidian #productivity #pkm #discuss

I am not going to tell you which note app I run. The app is the least load-bearing part of this, and if I name it you will argue about the app instead of the decision. So I will do the more useful thing and show you how I decided, one axis at a time, including the option I threw out and the reason I threw it out.

The decision was not "which app." It was a layout question: should capturing a thought and keeping a thought happen in the same place, or in two places tuned for two different jobs? For years I assumed one place, because one place looks tidy on a whiteboard. I now run two, and the thing that moved me was a diagram I half-remembered from an undergraduate architecture course, sketched on a napkin at a coffee shop in February.

The shape of the problem

Here is the diagram. A CPU does not have one kind of memory. It has a hierarchy. Registers at the top, then L1 cache, then L2, then L3, then main memory, then disk. Each level down is larger and cheaper per byte and slower to reach. L1 is on the order of tens of kilobytes and answers in about a nanosecond. Main memory is gigabytes and answers in around a hundred. No single technology is fast and large and cheap at once, so the machine refuses to choose. It builds levels and moves data between them.

I stared at that napkin and realized my notes, the thing people now call a second brain, had the identical problem, and I had been pretending they did not. The place where I capture a thought needs to be instant, always within reach, and forgiving. The place where I keep a thought for two years needs to be large, searchable, durable, and portable. Those are not the same set of requirements. They are barely compatible. I had been shopping for one memory technology that was fast and large and cheap, and that product does not exist, for silicon or for notes.

The options on the table

I wrote down three honestly, not one plus two strawmen.

One unified store. A single vault I both type into on my phone and organize on my laptop. One folder of files, captured and curated in the same surface.
One database app. A hosted, Notion-style workspace where capture, structure, and storage all live in the same database, reachable from every device.
Two layers. A small, fast capture surface on the phone, feeding a larger, slower curation-and-storage surface on the laptop, with a defined path between them.

The axes I scored them on

I did not score on features. I scored on the same properties you would use to judge a level of memory.

Write latency at capture: how long from "I have a thought" to "the thought is recorded and I can let go of it." For me this is the dominant cost, and I will explain why in a moment.

Cost of a miss: what happens when the system is not ready in time. In a CPU, a cache miss costs you latency; you go fetch the line from a lower level. For a person capturing a thought, a miss costs you the thought.

Retrieval latency: how long from "I need that note" to "I am reading it." This matters, but it matters at my desk, where I have minutes, not on a sidewalk where I have a second and a half.

Durability and portability: will this note survive a vendor shutdown, an export, a decade. Plain markdown files on disk score high here. A proprietary database scores low, no matter how good the app feels today.

Merge cost: what it takes to reconcile the same note touched in two places. The more places a note can be edited, the higher this climbs.

Why capture latency won

The honest reason one axis dominated all the others: for a working memory system, the cost of a capture miss is not a slowdown. It is a total loss with no backing store.

This is exactly where the cache analogy bends, and the bend is the most important part, so I want to be precise about it. In real silicon, a cache miss is recoverable. The data still exists one level down; you just pay more nanoseconds to go get it. A cache is an optimization layered on top of a guarantee. Human attention has no such guarantee. A thought I fail to record in the first second or two is not slower to retrieve later. It is gone, and usually I do not even know it is gone, which is worse. There is no L2 for the idea I had in the shower and lost.

So the requirement on my top level is harsher than the requirement on a CPU's L1. L1 only has to be fast. My capture layer has to be fast enough that catching the thought beats losing it every single time: on a bad day, half asleep, one-handed, with a stranger talking at me. That pushed write-latency-at-capture and cost-of-a-miss to the top of the weighting and shoved everything else down. Retrieval can be slow. Storage can live elsewhere. Capture cannot.

The decision

Two layers. The phone is L1. It does one job: take a typed line and get it somewhere safe in about one tap, with no decision attached. No folder prompt, no tag picker, no "which notebook." A line goes in and I keep walking.

The laptop is L2. That is where the curating happens: where a captured line becomes a paragraph, gets filed, gets linked to a few older notes, or gets deleted because it was a 2 a.m. thought that did not survive contact with morning. My L2 is an Obsidian vault, plain markdown files synced through iCloud Drive, because durability and portability were the axes the storage level had to win, and plain text on disk wins them by default.

The path between the levels is the part people skip, and it is the part that makes a hierarchy a hierarchy instead of two disconnected apps. At the end of the day, the lines I captured on the phone are already in the vault as timestamped bullets under that day's note: - 14:22 the cache metaphor only works if eviction is automatic. I did not build a ritual where I sit down and copy things over, because a flush that depends on my discipline is a flush that will not happen, and I do not trust mine. The eviction is automatic. The line lands in the markdown file the same minute it lands in my inbox.

If you want to be pedantic about the analogy, what I described is closer to a write-through cache than a write-back one: the line is mirrored down to L2 the instant it hits L1, not held in the fast layer and flushed later. I chose that on purpose. Write-back would leave a window where a captured line exists only on the phone, and that window is exactly when a dropped device or a force-quit becomes a lost thought. The expensive work, deciding what the line means and where it belongs, is the part I defer. The mirroring is immediate; the curation is a later compaction pass over data that is already safe.

The rule that holds it together

One rule, and it is the whole system: a level never does the other level's job. The phone never curates. The vault never captures.

I learned this one the expensive way. For about three weeks last spring I tried to curate on the phone, because it felt efficient to tidy a note while standing in line for coffee. Within days my capture latency had crept from roughly a second to several, and I could feel the hesitation coming back. The reason was mechanical, not moral. The moment the capture screen also offers organizing affordances, capture stops being a reflex and becomes a small decision, and a small decision at the top of the hierarchy is the one thing the top of the hierarchy cannot afford. I tore it out and made the rule absolute. The phone got dumber on purpose, and capture got fast again.

What I rejected, and why

The unified store lost on exactly this rule. If the place I capture into is also the place I keep and organize, then every capture happens inside a structure, and structure asks questions. Which folder. Which existing note. Which tag. Each question is a few hundred milliseconds of deliberation at the precise moment I can least afford it. A single store forces a schema decision at write time. The whole point of a hierarchy is to let the fast level stay schema-free and push all the structure down to a level that has time for it.

The database app lost on durability and merge cost. A hosted database is a fine retrieval layer and a poor foundation, because the storage level is the one place I am least willing to rent. If the bottom of my hierarchy can be switched off by someone else's pricing decision, it is not storage, it is a long-term loan. Markdown files I can still read with cat in 2035 are the opposite of a loan. I will trade a nicer query interface for that every time, because I can always build a better reader on top of durable files, and I can never retrofit durability onto a format I do not control.

The review I scheduled for myself

I do not trust a decision I cannot re-examine, so I dropped a recurring note in the Obsidian vault to audit this one in 90 days. The questions I will put to future me:

Is capture still sub-second on a bad day, or has friction crept back in?
Is eviction actually automatic, or have I quietly started hand-copying lines again?
Did any line die between the layers this quarter? If so, where, and was it the path or me?
Am I curating on the phone again without noticing? (Architecture drift is silent.)
Is the bottom level still plain files I fully control, or did something convenient creep underneath it?

If every answer holds, the layout stays. If two or more have rotted, I redraw the napkin.

That is the entire method: stop hunting for one memory that is fast and large and durable, accept that it does not exist, and build the levels instead. The cache hierarchy is not a productivity hack I am selling you. It is just the shape you arrive at when you take capture latency and durability seriously at the same time and refuse to let either one lose.

If you think the cache metaphor breaks somewhere I did not catch, that is exactly the comment I want.

I build Simple Memo by myself — a one-tap iOS app that drops whatever I just typed into my email. On a curating day the same line is also sitting in a markdown vault in Obsidian. I write something up here every few days about the parts of working solo that I am still getting wrong.