DEV Community

just_an_electron
just_an_electron

Posted on

A self-updating knowledge base for my terminal AI assistant (Claude Code hooks)

I spend most of my day in the terminal with an AI coding assistant. Every session I would solve something worth remembering: a tricky fix, a config gotcha, a small runbook. Then I would lose it. It lived in a scrollback buffer that vanished when I closed the tab. A month later I would re-solve the same thing.

The idea

Make the assistant maintain its own notes. Three moving parts:

  1. On every prompt, search a small Markdown knowledge base (KB) and inject the relevant entries, so the assistant answers with prior context instead of re-deriving it.
  2. When a session ends, capture anything worth keeping back into the KB.
  3. On session start, load the index so the assistant knows what exists.

Claude Code has hooks: shell commands that fire on lifecycle events. Three of them cover the whole loop.

{
  "hooks": {
    "SessionStart":     [{ "hooks": [{ "type": "command", "command": "/path/kb-load.sh" }] }],
    "UserPromptSubmit": [{ "hooks": [{ "type": "command", "command": "/path/kb-search.sh" }] }],
    "SessionEnd":       [{ "hooks": [{ "type": "command", "command": "/path/kb-enqueue.sh" }] }]
  }
}
Enter fullscreen mode Exit fullscreen mode

Everything below is the useful stuff I learned wiring these up.

Retrieval: UserPromptSubmit runs on every message

UserPromptSubmit fires for each prompt you send, receives the prompt text on stdin, and can inject context before the model runs. That is the hook that makes the KB get checked automatically, instead of leaving it to the model to decide whether to look.

Three design rules: it must be cheap (grep, no LLM), lean (inject only the top matches, never dump whole files), and it must never block the prompt.

#!/usr/bin/env bash
# NOTE: no `set -e`. A UserPromptSubmit hook that exits non-zero BLOCKS the prompt.
# Every failure path must fall through to `exit 0` with no output.
set -uo pipefail

PROMPT="$(cat | python3 -c 'import sys,json;print(json.load(sys.stdin).get("prompt",""))')"
[ -n "$PROMPT" ] || exit 0

python3 - "$KB" "$PROMPT" <<'PY' || true
import sys, os, re, glob, json
kb, prompt = sys.argv[1], sys.argv[2]

# tokenize the prompt, drop stopwords/short tokens, keep ticket-ish IDs like abc-123
terms = [t for t in re.findall(r"[a-z][a-z0-9-]{2,}", prompt.lower()) if t not in STOP]
if not terms: sys.exit(0)

# rank every KB file by (distinct terms matched, then total matches)
scored = []
for f in glob.glob(os.path.join(kb, "**/*.md"), recursive=True):
    text = open(f, errors="ignore").read().lower()
    hits = sum(1 for t in terms if t in text)
    if hits: scored.append((hits, sum(text.count(t) for t in terms), f))
if not scored: sys.exit(0)
scored.sort(reverse=True)

# inject ONLY the top few: title + path. The model opens the file if it needs detail.
lines = ["# Relevant KB entries (consult before answering; open the file for detail):", ""]
for _, _, f in scored[:5]:
    lines.append(f"- {title_of(f)} [{os.path.relpath(f, kb)}]")
print(json.dumps({"hookSpecificOutput":
    {"hookEventName": "UserPromptSubmit", "additionalContext": "\n".join(lines)}}))
PY
exit 0
Enter fullscreen mode Exit fullscreen mode

The point is not that grep is clever. It is that the hook does the searching once, ranks the results, and hands the model the specific files to read. The model stops guessing which files or tools to check, because the relevant entry is already in front of it. Scanning a few dozen small Markdown files per prompt takes under 100ms, and the win is injecting only the top five, not the whole KB.

Capture: the mistake, and the fix

My first version captured on the Stop hook, which fires every time the assistant finishes a reply. The capture spawns a headless AI call (claude -p ...) to read the transcript and write KB entries. It worked, and then the assistant got painfully slow. Every turn ended with "running stop hook" for minutes.

Here is why. Stop runs synchronously, once per turn. A 30-turn session fired the headless capture about 30 times, each one re-reading a bigger transcript. I was re-capturing the same conversation over and over.

The unit I actually wanted was the session (start until /clear), captured once. That is SessionEnd, which fires on /clear and on exit. So I moved the capture there, and hit the real lesson:

SessionEnd cannot run a long or backgrounded job. It is documented as non-blocking (side effects only), and a process you background from it is not guaranteed to survive. The session is terminating, and its child processes can be killed with it.

So a multi-minute headless capture launched from SessionEnd gets cut off partway.

The fix is to split it across two hooks.

SessionEnd does something trivial and instant: it enqueues the ended transcript's path.

#!/usr/bin/env bash
set -euo pipefail
[ -n "${KB_CAPTURE:-}" ] && exit 0          # recursion guard (see below)
T="$(cat | python3 -c 'import sys,json;print(json.load(sys.stdin).get("transcript_path",""))')"
{ [ -n "$T" ] && [ -f "$T" ]; } && echo "$T" >> "$KB/tools/queue.txt"
exit 0
Enter fullscreen mode Exit fullscreen mode

SessionStart (of the next session) drains the queue and runs the capture in the background. That is safe here because this session stays alive.

QUEUE="$KB/tools/queue.txt"; LOCK="$KB/tools/.lock"
if [ -s "$QUEUE" ] && mkdir "$LOCK" 2>/dev/null; then   # mkdir is an atomic single-drain lock
  mv "$QUEUE" "$QUEUE.wip"                               # claim atomically so new enqueues are not lost
  ( sort -u "$QUEUE.wip" | while read -r t; do
        [ -f "$t" ] && KB_BG=1 KB_TRANSCRIPT="$t" "$KB/tools/kb-capture.sh" >>"$KB/log" 2>&1
      done
      rm -f "$QUEUE.wip"; rmdir "$LOCK"
  ) >/dev/null 2>&1 &        # backgrounded off a LIVE session, so it survives
  disown 2>/dev/null || true
fi
Enter fullscreen mode Exit fullscreen mode

Net effect: one capture per session, off the critical path. The only trade-off is that the write lands at the start of the next session (a few seconds later) rather than the instant you /clear, which is the one place a long job is guaranteed to survive.

Two smaller notes. macOS has no setsid, so detach with nohup ... & plus disown (a bash builtin) rather than setsid. And read stdin before backgrounding, because the detached copy has none.

Three things that made it cheap and safe

1. A cheaper model for the capture. It is a summarize-and-write task, so it does not need your most expensive model. Pin it.

KB_CAPTURE=1 claude -p "$(cat capture-prompt.md)" --model <cheap-fast-model> \
  --allowedTools "Read,Edit,Write,Grep,Glob"
Enter fullscreen mode Exit fullscreen mode

Roughly 5x cheaper, and it finishes fast, so queued captures clear quickly.

2. A pre-filter before spending a token. Most sessions are not worth capturing. A quick grep gates the AI call.

grep -qiE 'root cause|next step|blocked on|draft|fix|<ticket-pattern>' "$TRANSCRIPT" || exit 0
Enter fullscreen mode Exit fullscreen mode

3. A recursion guard, and set it first. The headless claude you spawn inherits your environment, including the hook registration. So it fires its own hooks, which spawn another headless claude, and so on without end. One env var breaks the loop: put [ -n "${KB_CAPTURE:-}" ] && exit 0 at the top of every hook, and set KB_CAPTURE=1 when you spawn the headless call. Without it, one real session fans out into an unbounded tree of AI calls.

The economics

Per prompt: a sub-100ms grep plus a small context injection of about five entries, and only when there is a match. Per session: one cheap capture, and only if the session actually learned something (the pre-filter plus a "did anything change" check before writing).

That is the trade. You pay a tiny, bounded cost to cache knowledge, so future sessions skip the expensive part: re-reading a large codebase, re-querying tools, re-investigating a problem you already solved. One avoided re-derivation pays for a lot of captures.

The lessons, condensed

  • UserPromptSubmit is where you inject retrieval. It is the only hook that runs on every prompt and can add context before the model.
  • Anything on Stop runs on the critical path of every turn. Keep it instant.
  • SessionEnd cannot run long or background work. Enqueue there, and do the heavy lifting on the next SessionStart, where the process survives.
  • Guard against recursion before you ever run a hook that spawns the assistant.

Top comments (0)