TL;DR — I ran a personal wiki inside Claude Code for a month and shipped it as Hypomnema (npm install -g hypomnema). It's MIT, plain markdown + git, no vector DB, no API keys. The core is 14 lifecycle hooks that wire the wiki into Claude Code so it maintains itself — auto-commit, auto-push, session-state injection, lookup signals, compact-time guards. The most acute thing a month of dogfooding surfaced — AI behavior corrections rotting across three hand-synced storage layers — gets a fix that ships with this first drop. This is the field report.
(There's a Korean retrospective on velog if you read Korean — same author, longer narrative.)
The thing every note tool fails at
I've cycled through every personal-knowledge tool you've heard of. They all break in roughly the same way:
| The pain | |
|---|---|
| Note vaults (Obsidian, Notion) | You write fine. You never re-read. 100 notes = 100 islands. |
| RAG / vector DBs | Chunks grow forever. Knowledge doesn't compound — it accumulates. |
| AI notebooks (Mem, Reflect, etc.) | Closed format. No git. I don't trust them in five years. |
| Code wikis (auto-generated from a repo) | Code only. Can't capture decisions, research, or AI behavior corrections. |
The thing every one of them fails at is synthesis. They store. They retrieve. They never update what you already know with what you just read.
That's exactly the gap Karpathy pointed at in his Gist this April. RAG (Retrieval-Augmented Generation) re-reads chunks of your sources on every query and pastes them into the prompt; a wiki persists synthesis between queries:
"RAG re-reads sources every query. A wiki persists synthesis. The bottleneck was always bookkeeping, and LLMs reduce that cost to zero."
So I built one — small at first, then aggressive — and ran it daily for a month inside Claude Code before opening it up. The result is Hypomnema.
What it is, in one paragraph
Hypomnema is a personal wiki that lives in ~/hypomnema/ as plain markdown + git. You feed it sources with one slash command. Claude reads them, synthesizes structured pages, and updates existing pages instead of creating new islands. Fourteen lifecycle hooks wire it into Claude Code so the wiki maintains itself: auto-commit, auto-push, session-state injection, lookup signals, compact-time guards. There's no vector DB. There's no API key. The whole stack is "Node.js script + markdown + git + Claude."
npm install -g hypomnema
# inside Claude Code:
/hypo:init
That's the entire setup.
Six walls, six fixes
Wall 1 — "Every new session forgets where I was"
Coming back to a paused project after a week, Claude had no memory of it. I had to retype context for ten minutes.
The fix took three attempts.
-
Print
hot.mdto stderr from aSessionStarthook. Doesn't work — Claude Code's TUI captures hook stderr during init. You see nothing. -
Inject via
additionalContextfromSessionStart. Also doesn't work in a multi-hook setup — when more than oneSessionStarthook runs, Claude Code keeps only the last hook'sadditionalContext. Earlier hooks get silently overwritten. -
The pattern that finally stuck:
SessionStartwrites a marker file (os.tmpdir()/wiki-session-marker.json); the firstUserPromptSubmitreads the marker and injects via itsadditionalContext(which is reliable). 10-minute TTL, single-shot consumption.
SessionStart
↳ hypo-session-start.mjs
↳ git pull --ff-only
↳ write tmpdir marker {hot, session-state, ts}
UserPromptSubmit (first one only, within 10 min)
↳ hypo-first-prompt.mjs
↳ read marker → inject hot.md + session-state.md as additionalContext
↳ delete marker
Resume cost is now effectively O(1). The takeaway: you can't rely on prompts; you also can't rely on a single hook channel — you have to triangulate around platform constraints. I lost a day to "why isn't stderr printing" before I understood that.
Wall 2 — "/compact deletes my work-in-progress learnings"
Claude Code's /compact summarizes your conversation. It also blows away anything you didn't write down. I lost two productive afternoons before I understood that.
Fix: a PreCompact hook that refuses to compact if the session-close checklist is unfinished or the wiki lints fail.
PreCompact
↳ hypo-personal-check.mjs
↳ if lint errors → block + show what's broken
↳ if session-close incomplete → block + show next step
The same machinery covers /clear: a Stop-chain auto-minimal-crystallize step offers /hypo:crystallize --apply-session-close --minimal on non-trivial session ends, and a SessionEnd marker + SessionStart source=clear recovery makes the /clear-then-restart round-trip clean (ADR — Architecture Decision Record — 0022 Layer 3 in the repo).
Wall 3 — "The 3-mode privacy matrix didn't survive contact with reality"
An early prototype had personal / shared / public modes per page. Sounded fine on paper. In practice every privacy decision was about which paths to exclude, not about a mode label.
Fix: the public release deletes the mode concept entirely. There's a single .hypoignore file with glob patterns. One file, one source of truth. Code shrunk by half. Mental model doubled in clarity.
Wall 4 — "BM25 libraries can't find Korean substrings"
Most BM25 (Best Matching 25 — the standard term-frequency ranking function for full-text search) implementations split on whitespace. Korean ("장애 보고서") doesn't always have whitespace where you'd expect, so a search for "장애" missed the page that contained "장애보고서".
Fix: a hand-rolled bi-gram BM25 tokenizer for CJK. Search suddenly worked.
Wall 5 — "I corrected the AI three times, and it still made the same mistake tomorrow"
Every AI tool today accepts corrections for the current chat. None of them persist.
I built /hypo:feedback "stop doing X because Y" — stores the rule in pages/feedback/<slug>.md, which is inside the wiki, so it auto-syncs via the Stop hook to every machine. Good first step.
But: Claude Code itself has two more places where behavior rules live — ~/.claude/projects/<id>/memory/MEMORY.md (per-project, machine-local) and ~/.claude/CLAUDE.md <learned_behaviors> (global, machine-local). Both live outside the wiki. Both are read on every prompt. Hand-syncing three layers rots fast.
Wall 6 — "Three storage layers, hand-synced, rotting fast"
A month of dogfooding compressed this into a sharp shape: I had 14 hand-written entries in ~/.claude/CLAUDE.md <learned_behaviors>. Nine of them had the same rule in pages/feedback/ already. I'd edited the wiki page and forgotten the global rules file. I'd added a global rule and forgotten to write the page. The drift was invisible until I counted.
The response that became the headline: feedback is the single source of truth, <learned_behaviors> and MEMORY.md are derived.
- One place to edit:
pages/feedback/<slug>.md. - One command to project:
hypomnema feedback-sync --check(dry-run) /--write(apply). -
<learned_behaviors>andMEMORY.mdget filled inside marker-fenced managed blocks (HYPO:FEEDBACK-SYNC:START / END). Hand-edits inside those blocks surface asCONFLICT_MANUAL_EDIT. - Strict gate for the global rules layer: a feedback page must declare
scope: global+tier: L1+targets: claude-learned+promote_to_global: true+sensitivity ∈ {public, sanitized}. Five gates. Plus a hard cap of 10 entries so global rules can't quietly bloat the token budget. - One-way projection. The wiki is the SoT (single source of truth, SoT); the two downstreams are derived. No back-flow.
-
sensitivity: privateis forbidden — the wiki is git-pushed, so private data must stay outside the wiki entirely.
# Author the SoT page
/hypo:feedback "<rule>"
# Project to downstreams
hypomnema feedback-sync --check # dry-run
hypomnema feedback-sync --write # apply (managed blocks only)
hypomnema feedback-sync --bootstrap # scaffold drafts from existing state
This is the spine of the release and the reason it's the first public one: any earlier and the projection design would have shipped with the assumption that hand-sync works, which it doesn't.
SCHEMA — why we refused to auto-stub
The new type: feedback page requires nine fields unconditionally: status, scope, tier, targets, sensitivity, priority, memory_summary, reason, source. If targets includes claude-learned, add global_summary + promote_to_global.
When you run hypomnema upgrade --apply, the upgrade writes a fresh backfill checklist into your wiki root. It deliberately does not auto-stub.
The reason: scope / tier / targets / sensitivity / reason / source are meaning decisions, not formatting decisions. Auto-stubbing them with defaults like scope: project:? would silently project wrong behavior into two downstream surfaces (MEMORY.md and <learned_behaviors>). We chose loud manual work over silent wrong work.
SCHEMA.md itself stays byte-equal across upgrades (Option C preservation). Migration report tag stays [schema] — the only token historically valid across every shipped Meta vocabulary. One honest caveat: lint regex ^project:[a-z0-9][a-z0-9-]*$ and the default cwd-derived project-id format are incompatible — to use scope: project:* you must pass --project-id=<slug>. Full reconciliation comes in the next minor.
A related cross-project leak in the projection filter is also closed: a feedback page scoped to project:other no longer projects into the current project's MEMORY.md. Only scope: global or exact scope: project:${projectId} entries pass through.
The rest in one section
Extensions companion sync. The wiki ships extensions/{agents, commands, hooks, skills}/. init scaffolds; upgrade mirrors into ~/.claude/ (and with --codex, the hooks + commands subset into ~/.codex/). hypomnema doctor extensions audits orphans, matcher drift, non-registrable orphans. This means your personal agents / skills / commands ride along with the wiki and stay in sync across machines. What's left for a dotfiles tool like chezmoi is ~/.claude/CLAUDE.md body + settings.json deltas.
Auto-project on cwd match. Open a repo with a project marker (package.json, Cargo.toml, go.mod, pyproject.toml, …) but no wiki project? SessionStart offers to create one. "Y" scaffolds from templates/projects/_template/. "N" gets recorded; 5-minute per-cwd cooldown.
PostToolUse WebFetch / WebSearch auto-ingest signal. Claude fetches a URL or runs WebSearch → PostToolUse hook injects a nudge so Claude considers /hypo:ingest. Privacy: URL query / hash / userinfo are stripped before injection on WebFetch URLs (WebSearch isn't URL-redacted because there's no URL being called).
Update notifier + --codex core hook mirror + W8 lint stale design-history + code comment cleanup. Update banner at SessionStart (npm channel + Claude Code plugin channel; pick the one you're on). upgrade --codex now mirrors core hooks (not just extensions). Lint emits W8 for stale design-history.md. A comment-only cleanup pass landed across 13 files: rot-prone refs and codex verdict shorthand stripped while contract / spec / Layer anchors stay. New policy: time-bound cross-references belong in PR descriptions, not in code comments. A second cleanup pass is queued for the next track.
What you actually get
8 slash commands
| Command | Purpose |
|---|---|
/hypo:ingest <url-or-path> |
Save raw to sources/, synthesize a page in pages/. Updates existing page if topic exists.
|
/hypo:query "..." |
BM25 retrieval + LLM synthesis. Answers cite [[wikilink]]s. |
/hypo:crystallize |
End-of-session: 11-step checklist that locks today's learnings into the wiki. |
/hypo:resume |
Reload the most recent state of an active project. |
/hypo:feedback "..." |
Capture an AI behavior correction — writes the SoT page directly. |
/hypo:verify |
Audit pages whose verify_by date has passed. |
/hypo:lint |
Frontmatter / wikilink / schema validation. |
/hypo:graph |
Generate a wikilink dependency graph. |
5 CLI subcommands
hypomnema init # bootstrap a wiki
hypomnema upgrade [--apply] [--codex] # hooks / SCHEMA / extensions / migration report
hypomnema doctor [extensions] # integrity audit
hypomnema uninstall # remove companion files
hypomnema feedback-sync --check|--write|--bootstrap
14 lifecycle hooks
| Event | Hook | What it does |
|---|---|---|
| SessionStart | hypo-session-start |
Inject hot.md / session-state.md, git pull --ff-only, offer auto-project, update notifier |
| UserPromptSubmit | hypo-lookup |
BM25 top-3 HIT inject / MISS → closest-slug signal |
| UserPromptSubmit | hypo-compact-guard |
Detect /compact → enforce checklist |
| UserPromptSubmit | hypo-first-prompt |
First-prompt forced resume summary (marker-driven, 10-min TTL) |
| CwdChanged | hypo-cwd-change |
Inject the matching project's hot.md
|
| FileChanged | hypo-file-watch |
Notify on wiki-file changes (honors .hypoignore) |
| PostToolUse(Write/Edit) | hypo-auto-stage |
Auto git add
|
| PostToolUse(WebFetch/WebSearch) | hypo-web-fetch-ingest |
Inject /hypo:ingest nudge (URL redacted) |
| Stop | hypo-auto-commit |
Auto commit + pull --rebase + push |
| Stop | hypo-hot-rebuild |
Rebuild hot.md from the latest session activity |
| Stop | hypo-session-record |
Record session metadata for the observability score |
| Stop | hypo-auto-minimal-crystallize |
Offer minimal crystallize on non-trivial session end |
| SessionEnd | hypo-session-end |
Stash a /clear-survivable marker so the next SessionStart can recover |
| PreCompact | hypo-personal-check |
Block compact on lint failures or unfinished session-close |
The synthesis-heavy commands also ship as Claude Agent Skills, so they auto-trigger by description match — no slash required.
Directory shape
~/hypomnema/
├── hypo-config.md ← root marker
├── index.md ← page catalog
├── hot.md ← active project pointers
├── log.md ← append-only activity log
├── SCHEMA.md ← type system (v2.0, user-owned)
├── MIGRATION-*.md ← created on schema bumps with a backfill checklist
├── .hypoignore ← glob patterns to exclude
├── pages/
│ └── feedback/ ← AI behavior corrections (the SoT)
├── projects/<name>/
│ ├── hot.md
│ ├── session-state.md
│ └── session-log/
├── journal/{daily,weekly,monthly}/
├── extensions/{agents,commands,hooks,skills}/ ← mirrored to ~/.claude/
└── sources/ ← raw ingested sources, never edited
How it stacks up against the other LLM-wiki OSS
After Karpathy's Gist, ten-plus implementations appeared in a couple of weeks. I read all of them. Short version:
| Project | Strongest dimension | Where Hypomnema differs |
|---|---|---|
nvk/llm-wiki |
--mode thesis (parallel for/against agents) |
We don't have thesis mode yet — on the roadmap |
SamurAIGPT/llm-wiki-agent |
Multi-format ingest (PDF/Word/PPT) + contradiction detection | We do contradictions at lint time, not ingest |
swarmclawai/swarmvault |
Graph RAG with Louvain clustering | We use Obsidian's graph view; no custom RAG |
nashsu/llm_wiki (6.6k ⭐) |
Electron desktop GUI, multi-language | We're CLI + Obsidian, no GUI |
Tencent/WeKnora |
Enterprise RAG + ReAct + WeChat Mini | Different category; we're personal-scale |
lucasastorian/llmwiki |
Hosted web app (SQLite FTS5, MCP) | We're local-first, file-based |
OmegaWiki |
9-entity typed graph | We have lighter typed relations |
Where Hypomnema is alone:
-
Session lifecycle automation — no other project hooks
SessionStart,PreCompact,Stopto keep the wiki maintained. -
Source isolation —
sources/is immutable. Other projects blur source and synthesis. -
AI feedback as a single source of truth with one-way projections —
/hypo:feedbackwrites a typed page;hypomnema feedback-syncprojects into Claude Code's memory and global rules layers. -
Companion sync for
~/.claude/{agents,commands,hooks,skills}/— your personal Claude Code companion files ride along with the wiki. - Bi-gram BM25 for Korean — small thing, shipped by default.
The summary I keep coming back to: other projects are about how to structure wiki content. Hypomnema is about automating the loop of working with an AI assistant over months. Different problem.
Things deliberately left out
- No vector DB. BM25 + LLM synthesis covers personal-scale corpora. Vector DBs are a future failure mode (they get bought, deprecated, or leak credentials).
- No API keys. Claude Code is the only AI dependency, and you already have it.
- No GUI. Obsidian already exists and is excellent.
-
No mode matrix. A single
.hypoignorebeatspersonal / shared / public. - No auto-stub on SCHEMA bumps. Wrong defaults silently project wrong behavior. Loud manual work beats silent wrong work.
The whole stack is intentionally boring. The interesting part is what Claude does on top of it.
Dogfooding the SoT engine on ourselves
The day the release was ready to publish, I ran hypomnema feedback-sync --check against my own machine for the first real time. Three things broke at once:
- The engine reported claude-target candidates = 11, cap = 10,
overCap=true,dirty=true. - My
~/.claude/CLAUDE.md<learned_behaviors>had 14 hand-written entries. - Of those 14, 9 had a backing page already in the wiki (i.e., duplicate hand-syncs that should have been managed blocks). 5 had no backing page (hand-written-only rules I'd never page-promoted).
So "14 vs cap 10" wasn't an over-cap problem. It was three problems compressed: (a) one demote needed to land under the cap, (b) nine duplicates needed cleanup, (c) five page-less rules needed a separate promotion track.
I ran codex through three candidates for the demote and picked the lowest-operating-loss one. I edited its frontmatter (targets removed claude-learned, promote_to_global: false). I ran --check again — green. I ran --write — <learned_behaviors> now had a HYPO:FEEDBACK-SYNC:START / END fenced block with 10 managed entries. I removed the 9 duplicate hand-writtens. End state: 5 hand-written + 10 managed = 15 entries. --check clean. Vault commit 18dd4e8. Engine transitioned from dormant to active.
What's worth saying out loud about that moment: a self-operating system surfaces its operator's accumulated drift on first run. The engine was fine. The data was the failure mode — a month of hand-sync I hadn't kept clean. That's the shape of dogfood that actually catches something — not "does the binary run," but "what does the binary surface about the operator."
The five hand-written-only rules become next-minor work (page them, decide demote, run --write). The nine cleanup wasn't a one-off; it's evidence the cap + gate enforcement was right.
Ready to try?
npm install -g hypomnema
# in Claude Code:
/hypo:init
That's it. The hooks register themselves. First commit happens automatically. If you set a remote, Stop will keep every machine in sync.
- GitHub: https://github.com/sk-lim19f/Hypomnema
- npm: https://www.npmjs.com/package/hypomnema
- License: MIT
If you've been frustrated by the "I write notes I never re-read" loop, give it a week. The compounding shows up around day 10 — the moment a third article on the same topic updates an existing page instead of creating a new one. That's when the model in your head clicks.
The next-minor contribution magnet: a chezmoi bridge for ~/.claude/CLAUDE.md and settings.json so the feedback-sync projection works across machines without manual sync. The Extensions companion already covers agents / skills / commands / hooks, but the CLAUDE.md body itself is still machine-local. Solving it has impact on the whole LLM-wiki OSS ecosystem — no project there has done it yet.
Feedback, issues, PRs welcome. If you build something on top of it, drop a comment — I'd love to see it.
Top comments (0)