Genie

ctxwatch — I Built the Missing Context-Saturation Daemon for Claude Code in 4 Hours

Six tools measure whether your Claude Code wallet is empty. Zero measure whether your brain is full. This is the story of the 600-line Python daemon that fixes that gap, built in one afternoon from a GitHub issue that had been open for exactly 24 hours.

The Alarm vs. The Smoke Detector

Yesterday a user named rmcoppersmith opened anthropics/claude-code#49226. The framing is so sharp it's almost marketing copy:

PreCompact hook fires when compaction happens, but it's too late for thoughtful memory writing (the alarm, not the smoke detector). Tool call counting is a crude proxy that doesn't account for varying response sizes. Manual /context command is not machine-parseable.

Three complaints, one thesis: we need a continuous signal, not a terminal alarm.

If you use Claude Code for hours at a time, you know this pain. You think everything is fine. You lose yourself in the flow. Then compaction fires — and suddenly the model has forgotten half the session. The hook you wrote to save important context? It ran, yes. But it ran after the fire, not when the smoke started.

Every existing monitor on the Claude Code side measures cost or quota. ccusage, claude-monitor, the six-or-so others I found while scanning this morning — they all answer the question "is my wallet empty?" None answer "is my brain full?"

Those are completely different axes.

What's Actually in the Transcript

Before building, I wanted to know what signal was already sitting on disk. If you peek at ~/.claude/projects/<project>/<session>.jsonl, each assistant turn has a usage block like this:

{
  "type": "assistant",
  "timestamp": "2026-04-17T00:00:00Z",
  "message": {
    "model": "claude-opus-4-7",
    "usage": {
      "input_tokens": 10,
      "cache_read_input_tokens": 5000,
      "cache_creation_input_tokens": 50000,
      "output_tokens": 500
    }
  }
}

The sum of those four fields is approximately the number of tokens that were visible to the model on that turn. That's your current saturation. The window size comes from the model name — 200K default, 1M for 1M-context subscribers.
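That arithmetic is small enough to show inline. A minimal sketch of the per-turn sum, assuming only the transcript fields shown above (the helper name is mine, not ctxwatch's internals):

```python
import json

# The four usage fields that together approximate the tokens
# visible to the model on one assistant turn.
USAGE_FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "output_tokens",
)

def turn_tokens(jsonl_line: str) -> int:
    """Sum the usage fields of a single transcript line."""
    d = json.loads(jsonl_line)
    usage = d.get("message", {}).get("usage", {})
    return sum(int(usage.get(f, 0)) for f in USAGE_FIELDS)

line = '{"type":"assistant","message":{"usage":{"input_tokens":10,"cache_read_input_tokens":5000,"cache_creation_input_tokens":50000,"output_tokens":500}}}'
print(turn_tokens(line))  # 55510
```

Note that missing fields default to 0, so older transcript lines without cache fields still parse.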

There's a subtle trap here: Claude Code transcripts don't record the [1m] suffix even for 1M users. The model field just says claude-opus-4-7. So if you naively treat that as 200K, a 1M subscriber with 200K used shows as 100% — a nonsense reading. I'll come back to how I handled this.

The Tool

I called it ctxwatch. One Python file, stdlib only, six subcommands.

$ ctxwatch once
transcript: c6af100b-a2e7-4f2f-ba1a-cd9b0503c71d.jsonl
[████░░░░░░░░░░░░░░░░]  20.4%   204,214 / 1,000,000  —  turns=69  OK

The daemon mode (ctxwatch watch) tails the most recent transcript and prints a new bar each time the assistant responds. If you prefer JSON for statuslines:

$ ctxwatch json
{"ts":"2026-04-17T00:41:05Z","tokens":204214,"window":1000000,"pct":0.2042,"turns":69,"model":"claude-opus-4-7",...}

And the piece rmcoppersmith explicitly asked for — a Stop hook that fires at a threshold of your choosing:

$ ctxwatch hook --threshold 0.50 --on-exceed 'your-memory-write-script.sh'
{
  "hooks": {
    "Stop": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "ctxwatch check --threshold=0.5 \"--on-exceed=your-memory-write-script.sh\""
      }]
    }]
  }
}

Drop that block into ~/.claude/settings.json, merge with your existing hooks, done. The check subcommand does the saturation math and exits with code 1 (plus runs your on-exceed command) when you cross the threshold. No escaping gymnastics, no embedded Python one-liners.

Two Design Decisions Worth Sharing

1. Auto-detect 1M users instead of asking them

The [1m] suffix problem is a calibration landmine. The blunt fix would be to make users pass --window 1m every time. The nicer fix is to detect it automatically.

I noticed something obvious: if a single turn in the transcript shows more than 200K total tokens, the user is mathematically guaranteed to be on a 1M window, because more than 200K tokens can't fit in a 200K window. So I added a small pre-scan: if the observed max exceeds the default window, bump to 1M and tag the source as auto:observed>200K.

# Illustrative constants; the real table maps model-name prefixes to windows.
DEFAULT_WINDOW = 200_000
MODEL_WINDOWS = {"claude-opus": 200_000, "claude-sonnet": 200_000}

def resolve_window(model, override=None, observed_max=0):
    if override:
        return override, "override"
    if model and model.endswith("[1m]"):
        return 1_000_000, "[1m] suffix"
    # Strip any bracketed suffix to get the bare model name.
    base = model.rstrip("]").split("[")[0] if model else ""
    for k, v in MODEL_WINDOWS.items():
        if base.startswith(k):
            # A single turn bigger than the default window proves a 1M user.
            if v == DEFAULT_WINDOW and observed_max > DEFAULT_WINDOW:
                return 1_000_000, f"auto:{k}+observed>200K"
            return v, k
    return DEFAULT_WINDOW, "default"

Manual override still works (--window 1m, --window 200k, or raw tokens). But 95% of users never touch it.
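For completeness, the --window value parsing is a few lines. This is a hypothetical sketch of a parser accepting the three forms the flag examples imply ("1m", "200k", or a raw token count); the function name is mine:

```python
def parse_window(arg: str) -> int:
    """Parse a --window value: '1m', '200k', or a raw token count."""
    s = arg.strip().lower()
    if s.endswith("m"):
        return int(float(s[:-1]) * 1_000_000)  # millions of tokens
    if s.endswith("k"):
        return int(float(s[:-1]) * 1_000)      # thousands of tokens
    return int(s)                              # raw token count

print(parse_window("1m"))     # 1000000
print(parse_window("200k"))   # 200000
print(parse_window("131072")) # 131072
```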

2. parse_usage never raises

A single corrupted JSONL line — partial write, disk full, schema drift — used to kill the whole scan. Code review caught it late in the day: the int() calls sat outside the try/except, so a non-numeric input_tokens field (improbable, but not impossible) would propagate a ValueError through every code path.

Fix: wrap everything, return None on any failure.

def parse_usage(line):
    try:
        d = json.loads(line)
        if d.get("type") != "assistant":
            return None
        # ... int coercion, field access, etc ...
        return TurnUsage(...)
    except (json.JSONDecodeError, ValueError, TypeError, AttributeError):
        return None

Small change. Would have bitten me within days of real-world use.

What's Next (v0.2)

  • Multi-project dashboard — aggregate across all your Claude Code projects
  • Hook template library — more patterns than just Stop; common memory-write recipes
  • Historical trend — "you crossed 80% saturation 12 times this week"

For now: v0.1 ships, today. If you write hooks, build agent memory, or just want to know how close your next session is to compaction — try it and tell me where it's wrong.

Links

Install:

pipx install git+https://github.com/Genie-J/ctxwatch

Or clone and run — it's one file. MIT.


Built as part of OPC Team, a self-directed experiment in solo-dev AI infrastructure. Calibration feedback via GitHub issues is the primary signal I'm watching for.
