Context
I've been building KIOKU — a memory / second-brain OSS for Claude Code and Claude Desktop. The v0.4 post was a zero-new-features release with three "working but not correctly" stories.
This post covers v0.5, which shipped two releases on the same day (2026-04-23):
- v0.5.0: unified ingestion router for PDF / Markdown / EPUB / DOCX
- v0.5.1: hot cache + PostCompact hook for cross-compaction short-term memory
The arc is "ingest external knowledge → persist it across sessions." Along the way I ran into an external-schema drift that took four release-day iterations to fully handle, and measured my way out of building a feature I had on the roadmap. Both are in this post.
https://github.com/megaphone-tokyo/kioku
v0.5.0: unified ingestion
What changed
Before v0.5.0, KIOKU had separate MCP tools for PDF, Markdown, and URL ingestion. The caller (Claude Code / Desktop) had to pick the right tool based on extension. That stops scaling the moment you want to add EPUB, DOCX, RTF, ODT, etc.
v0.5.0 introduces a single entry point:
```
kioku_ingest_document("paper.pdf")  → PDF handler
kioku_ingest_document("note.md")    → same PDF extractor (plain-text chunking)
kioku_ingest_document("book.epub")  → EPUB handler (new, yauzl-based)
kioku_ingest_document("spec.docx")  → DOCX handler (new, mammoth + yauzl)
```
The old kioku_ingest_pdf is kept as a deprecation alias in the v0.5–v0.7 window, removed in v0.8. New integrations should use kioku_ingest_document.
EPUB: 8-layer ZIP defense
EPUB is a ZIP container. Before handing XHTML to Readability, KIOKU runs it through eight serialized defenses (via yauzl):
| Layer | Blocks |
|---|---|
| 1 | zip-slip (entries with ../) |
| 2 | symlink entries pointing outside the vault |
| 3 | cumulative size cap (decompression bomb, 500 MB max) |
| 4 | entry count cap (10,000 max) |
| 5 | NFKC filename normalization (Unicode normalization attacks) |
| 6 | nested ZIP skip |
| 7 | XXE pre-scan on XHTML (reject <!DOCTYPE) |
| 8 | XHTML <script> and on*= sanitize |
XHTML that passes all eight layers is converted to Markdown via Readability + Turndown, saved in spine order to .cache/extracted/epub-<subdir>--<stem>-ch<NNN>.md. For multi-chapter books an -index.md is also emitted for the downstream auto-ingest cron to use as a table of contents when generating summaries.
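To make the layering concrete, here is a hedged sketch of a few of the eight layers, written as a pure check over yauzl-style entry metadata (`{ fileName, uncompressedSize }`). This is my reconstruction from the table above, not KIOKU's actual implementation:

```javascript
// Sketch of layers 1, 3, 4, 5, and 6 as a pure function over ZIP entry
// metadata. Error messages and structure are assumptions.
const MAX_TOTAL_BYTES = 500 * 1024 * 1024; // layer 3: decompression-bomb cap
const MAX_ENTRIES = 10000;                 // layer 4: entry-count cap

function filterEntries(entries) {
  if (entries.length > MAX_ENTRIES) throw new Error('too many ZIP entries'); // layer 4
  let total = 0;
  const accepted = [];
  for (const e of entries) {
    const name = e.fileName.normalize('NFKC'); // layer 5: Unicode normalization
    if (name.split('/').includes('..') || name.startsWith('/')) {
      throw new Error(`zip-slip entry: ${e.fileName}`); // layer 1
    }
    total += e.uncompressedSize;
    if (total > MAX_TOTAL_BYTES) throw new Error('decompression bomb'); // layer 3
    if (name.toLowerCase().endsWith('.zip')) continue; // layer 6: skip nested ZIPs
    accepted.push(name);
  }
  return accepted;
}
```

The symlink check (layer 2) and the XHTML-level scans (layers 7–8) need filesystem and content access respectively, so they live outside a metadata-only pass like this one.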
DOCX: yauzl + mammoth two-stage
DOCX is also ZIP + XML. mammoth handles the Markdown conversion cleanly, but internally uses jszip, which is permissive on XXE and zip-slip. So KIOKU's DOCX handler opens the ZIP with yauzl first:
- Open ZIP with yauzl, run the 8-layer defense (shared with EPUB)
- `assertNoDoctype()` XXE pre-scan on `word/document.xml` and `docProps/core.xml`
- If all checks pass, close the ZIP and hand the pre-validated `Buffer` to mammoth
- Metadata goes into a `--- DOCX METADATA ---` fence with an untrusted annotation
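The pre-scan itself can be tiny. A sketch of what `assertNoDoctype()` might look like (the name comes from the post; the actual signature and error type are my guess):

```javascript
// Hypothetical sketch: reject any XML that declares a DOCTYPE before it
// ever reaches an XML parser. A DOCTYPE declaration is the vehicle for
// external-entity (XXE) payloads, so refusing it outright is the cheapest
// possible defense.
function assertNoDoctype(xml, entryName) {
  if (/<!DOCTYPE/i.test(xml)) {
    throw new Error(`XXE pre-scan failed: DOCTYPE found in ${entryName}`);
  }
}
```

Running this on the raw bytes before mammoth (and its permissive jszip) ever parses them is what makes the two-stage split worthwhile.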
Images (VULN-D004/D007) and OLE embeds (VULN-D006) are deferred for MVP. Images can follow the EPUB pattern later; OLE has a wide enough threat surface to warrant a separate PR.
Unicode filenames
EPUB / DOCX extractors are invoked from cron (auto-ingest.sh) with absolute paths. The argv parser splits raw-sources/<subdir>/<stem>.<ext> with a regex. \w is ASCII-only in JavaScript, which doesn't match non-Latin filenames, so the regex uses Unicode property escapes:
```javascript
const m = argv[2].match(/raw-sources\/([\p{L}\p{N}_-]+)\/([\p{L}\p{N}_-]+)\.(epub|docx)$/u);
//                                     ^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^               ^
//                                     Unicode-aware       Unicode-aware       unicode flag
```
\p{L} matches any letter in any script (Latin, Cyrillic, CJK, Arabic, Devanagari, etc.); \p{N} matches any numeric character; the /u flag is required. So 論文.epub, 日本語メモ.docx, and paper-01.epub all match via the cron path.
v0.5.1: hot cache for cross-compaction memory
Shipped the same day: wiki/hot.md — a short (≤500 words, hard cap 4000 chars) recent-context memo injected at SessionStart and re-injected after PostCompact (Claude Code's post-compaction hook event).
KIOKU was already injecting wiki/index.md at SessionStart, but that's "catalog of everything I've ever learned." Hot cache is the complementary layer: "what am I in the middle of right now."
Hot cache format
```markdown
---
type: hot-cache
updated: 2026-04-23
---

## Recent Context

- What I'm currently working on
- Design decisions from the previous session
- Things to remember into the next session
```
Injection strategy
| Event | Injected | Why |
|---|---|---|
| SessionStart | hot.md + index.md | Fresh session needs full context |
| PostCompact (new) | hot.md only | index.md is already in context; skip to save tokens |
Stop hook automation is opt-in (off by default)
An opt-in prompt at Stop (asking the LLM to consider updating hot.md) is available via KIOKU_HOT_AUTO_PROMPT=1. Default is off.
The reasoning is about security boundaries: session-logs/ is machine-local (.gitignore-excluded) and never pushed, but hot.md lives under wiki/ and syncs to the private GitHub repo. An auto-written hot.md is a path for secrets that slipped past masking to land in commit history, so the automation is gated behind explicit opt-in.
Security layering around hot.md
- `applyMasks()` runs on the content before injection (shared with session-logger, from `mcp/lib/masking.mjs`)
- `scan-secrets.sh` now includes `wiki/hot.md` in its scan target list
- `realpath` check rejects symlink escape (paths resolving outside the vault)
- 4000-char cap with truncate + WARN log
Four release-time iterations for Claude Code v2's per-event schema
Implementation was short. What took longer was matching Claude Code v2's hook output schema — which turns out to differ per event. Four iterations landed the same day, all before the v0.5.1 tag.
In v1, every event accepted the same flat shape:
```json
{ "additionalContext": "hot.md content..." }
```
In v2:
| Event | Required schema |
|---|---|
| PreToolUse / UserPromptSubmit / PostToolUse | `hookSpecificOutput` wrapper |
| SessionStart | `hookSpecificOutput` (lenient) |
| PostCompact | top-level `systemMessage` |
| Stop | top-level `systemMessage` |
And v2 silently ignores v1 flat output. No validation error, no log line, just no injection happens.
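One way to stop rediscovering this per event is to centralize the shape in a single function. A sketch under the table above (the inner field names beyond `hookSpecificOutput` / `systemMessage` are my assumptions about the v2 schema, so verify against the current Claude Code hooks docs):

```javascript
// Sketch: one function owns the per-event output shape, so a schema change
// means one edit instead of a grep hunt. Field names are assumptions.
function hookOutput(event, text) {
  switch (event) {
    case 'SessionStart':
    case 'PreToolUse':
    case 'UserPromptSubmit':
    case 'PostToolUse':
      // v2 wrapper shape; v1's flat { additionalContext } is silently ignored.
      return { hookSpecificOutput: { hookEventName: event, additionalContext: text } };
    case 'PostCompact':
    case 'Stop':
      // These two take a top-level systemMessage instead.
      return { systemMessage: text };
    default:
      throw new Error(`Unknown hook event: ${event}`);
  }
}
```

Had something like this existed, round 4 could not have happened: the Stop codepath would have been forced through the same switch as everything else.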
The iteration chain:
- Round 1 — wrapped the SessionStart injector in `hookSpecificOutput`
- Round 2 — put hot.md before index.md in the injected prompt; added a `MAX_INDEX_CHARS=10KB` cap
- Round 3 — PostCompact was silently no-op'd; moved it to a top-level `systemMessage`
- Round 4 — noticed Stop's opt-in prompt (a separate codepath in `session-logger.mjs:369`) was still emitting the v1 flat shape; unified it to a top-level `systemMessage`
Round 4 is the one that stings. When round 3 made me realize "oh, PostCompact needs a different schema," the correct move was to grep every other site that emits that schema and audit them together. I didn't. I fixed the site I was staring at. The session-logger.mjs:369 Stop opt-in path — which I don't personally exercise because I don't set KIOKU_HOT_AUTO_PROMPT=1 — stayed on v1 flat through round 3. A final PM review caught it before the v0.5.1 tag. Without that review, the opt-in path would have shipped silently disabled.
The rule I extracted: grep + negative-assertions for external schemas
I promoted this to an internal rule (LEARN#9):
Whenever an external API / CLI schema changes, audit every site that emits or consumes that schema in the same PR. Grep for the emission pattern, list every site, migrate them together, and pin the new shape with negative assertions.
Concretely:
```shell
# Fresh grep every time an external schema changes
grep -rn "stdout.write.*JSON.stringify" hooks/ scripts/
```
Plus negative assertions in tests:
```javascript
assert.ok(!parsed.additionalContext, 'Should not emit v1 flat schema');
```
Negative assertions are underrated for schema migrations. Positive tests ("expected shape is present") miss partial migrations — old sites that still emit the old shape still pass their own tests. "Old shape is absent" catches those explicitly and keeps reverts from sneaking back in.
The feature I decided not to build
The roadmap had v0.5.2 penciled in for defuddle — a skill that strips web ad / boilerplate HTML to save 40-60% of tokens before Readability sees it. LLM-wiki-adjacent OSS in this space markets it as a headline feature. I had 6-10h budgeted.
Before starting, I ran a 30-minute probe against my own data:
| Measurement | Result |
|---|---|
| Body length across 10 real articles in `raw-sources/articles/` | 1,640 – 170,204 bytes, median ~10 KB — all substantial |
| grep across 38 `session-logs/` for `short-content` / `readability.*fail` / `extracted.*empty` / `failed-extraction` | 0 matches |
| Failed-extraction cache entries in `.cache/extracted/url-*` | 0 |
@mozilla/readability, the extractor I already use, was already doing enough boilerplate stripping that defuddle's headline number had no measured room to operate in my pipeline.
So I skipped it.
Skipping, with conditions
"Don't build it" decays into "we never revisited." To prevent that, I logged the decision in handoff/open-issues.md §19 with explicit reopening triggers:
- If `raw-sources/articles/` starts producing substantively empty (<500 byte) extractions with multiple occurrences
- If session logs start showing actual failures on iframe / embed / JavaScript-heavy SPAs
- If a comparable OSS demonstrates a clear UX win specifically attributable to defuddle
None of those fire today. If any of them do, I'll revisit. That's not "never," it's "not now, and here's the signal that would change my mind."
Why "measure before building" isn't the default
For an individual OSS, default incentives push toward shipping — the roadmap said so, adjacent projects ship it, it's technically interesting, people expect velocity. Skipping a feature your roadmap advertised feels like failure in the moment.
But a feature you didn't need still costs you long-term maintenance. Look at the 8-layer EPUB defense above — every new feature adds threat surface and test suite weight. 30 minutes of measurement to avoid 6-10h of implementation and the accumulated maintenance tax is a straightforward ROI win.
The second-order benefit: writing reopening conditions forces you to articulate what the feature is actually supposed to achieve. If you can't state "I'd build this when X happens," maybe you don't know why you were going to build it.
What's next (v0.6.0)
v0.5 closes the ingest → persist loop. v0.6.0 moves outward toward external users:
- Agent Skills cross-platform: symlinking skills into Cursor / Windsurf / Gemini CLI / Codex
- Claude Code plugin marketplace: one-line install via `claude plugin install kioku@megaphone-tokyo`
- Delta tracking: `.raw/.manifest.json` source-level sha256 so repeated ingests of the same PDF skip
- Obsidian Bases dashboard: dynamic views over `wiki/` metadata (ingest history, orphans, recent summaries)
- LP β + Discord soft launch: finally moving from solo dogfooding to external feedback
The shift I'm most nervous about is the external-feedback one. Solo dogfooding has been productive — most of KIOKU's hardening has come from me hitting my own footguns. But there's a whole class of issues I can't see because I'm one user with one usage pattern. v0.6.0 is the cycle that tests whether KIOKU survives contact with actual other humans using it.
Summary
- v0.5.0 — unified ingest router (`kioku_ingest_document`), new EPUB handler with 8-layer ZIP defense, new DOCX handler with yauzl + mammoth two-stage validation, Unicode property escapes for non-Latin filenames
- v0.5.1 — hot cache (`wiki/hot.md`) + PostCompact hook; the Stop-hook auto-prompt is opt-in because `hot.md` syncs to Git (a different security boundary than session-logs)
- v0.5.1's four release-time iterations — Claude Code v2's hook schema is per-event; skipping "fix one site, audit all sites" cost me three extra rounds of release-day iteration. Now enshrined as LEARN#9
- v0.5.2 defuddle skipped — a 30-minute probe on real data showed Readability already handles boilerplate well enough; skipped with explicit reopening conditions
- MIT licensed, feedback very welcome
https://github.com/megaphone-tokyo/kioku
Read alongside the first, second, third, and fourth (v0.4) posts for the full KIOKU arc so far.
Questions I'd love feedback on:
- For external-schema migrations in general (not just Claude Code): do you have a lightweight process for "audit every emitting/consuming site in the same PR"? Beyond grep + a PR-description table, what's worked for you?
- For "measure before building": have you ever skipped a roadmap feature based on your own usage data? What triggered the measurement — discipline, or some specific regret from building-before-measuring?
Other projects
hello from the seasons.
A gallery of seasonal photos I take, with a small twist: you can upload your own image and compose yourself into one of the season shots using AI. Cherry blossoms, autumn leaves, wherever. Built it for fun — photography is a long-running hobby, and mixing AI into the workflow felt right.
Built by @megaphone_tokyo — building things with code and AI. Freelance engineer, 10 years in. Tokyo, Japan.