Genie

ctxwatch — I Built the Missing Context-Saturation Daemon for Claude Code in 4 Hours

Six tools measure whether your Claude Code wallet is empty. Zero measure whether your brain is full. This is the story of the 600-line Python daemon that fixes that gap, built in one afternoon from a GitHub issue that had been open for exactly 24 hours.

The Alarm vs. The Smoke Detector

Yesterday a user named rmcoppersmith opened anthropics/claude-code#49226. The framing is so sharp it's almost marketing copy:

PreCompact hook fires when compaction happens, but it's too late for thoughtful memory writing (the alarm, not the smoke detector). Tool call counting is a crude proxy that doesn't account for varying response sizes. Manual /context command is not machine-parseable.

Three complaints, one thesis: we need a continuous signal, not a terminal alarm.

If you use Claude Code for hours at a time, you know this pain. You think everything is fine. You lose yourself in the flow. Then compaction fires — and suddenly the model has forgotten half the session. The hook you wrote to save important context? It ran, yes. But it ran after the fire, not when the smoke started.

Every existing monitor on the Claude Code side measures cost or quota. ccusage, claude-monitor, the six-or-so others I found while scanning this morning — they all answer the question "is my wallet empty?" None answer "is my brain full?"

Those are completely different axes.

What's Actually in the Transcript

Before building, I wanted to know what signal was already sitting on disk. If you peek at ~/.claude/projects/<project>/<session>.jsonl, each assistant turn has a usage block like this:

{
  "type": "assistant",
  "timestamp": "2026-04-17T00:00:00Z",
  "message": {
    "model": "claude-opus-4-7",
    "usage": {
      "input_tokens": 10,
      "cache_read_input_tokens": 5000,
      "cache_creation_input_tokens": 50000,
      "output_tokens": 500
    }
  }
}

The sum of those four fields is approximately the number of tokens that were visible to the model on that turn. That's your current saturation. The window size comes from the model name — 200K default, 1M for 1M-context subscribers.
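That arithmetic is small enough to show inline. A minimal sketch of the per-turn sum, assuming only the transcript fields shown above (the helper name is mine, not ctxwatch's internals):

```python
import json

# The four usage fields that together approximate the tokens
# visible to the model on one assistant turn.
USAGE_FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "output_tokens",
)

def turn_tokens(jsonl_line: str) -> int:
    """Sum the usage fields of a single transcript line."""
    d = json.loads(jsonl_line)
    usage = d.get("message", {}).get("usage", {})
    return sum(int(usage.get(f, 0)) for f in USAGE_FIELDS)

line = '{"type":"assistant","message":{"usage":{"input_tokens":10,"cache_read_input_tokens":5000,"cache_creation_input_tokens":50000,"output_tokens":500}}}'
print(turn_tokens(line))  # 55510
```

Note that missing fields default to 0, so older transcript lines without cache fields still parse.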

There's a subtle trap here: Claude Code transcripts don't record the [1m] suffix even for 1M users. The model field just says claude-opus-4-7. So if you naively treat that as 200K, a 1M subscriber with 200K used shows as 100% — a nonsense reading. I'll come back to how I handled this.

The Tool

I called it ctxwatch. One Python file, stdlib only, six subcommands.

$ ctxwatch once
transcript: c6af100b-a2e7-4f2f-ba1a-cd9b0503c71d.jsonl
[████░░░░░░░░░░░░░░░░]  20.4%   204,214 / 1,000,000  —  turns=69  OK

The daemon mode (ctxwatch watch) tails the most recent transcript and prints a new bar each time the assistant responds. If you prefer JSON for statuslines:

$ ctxwatch json
{"ts":"2026-04-17T00:41:05Z","tokens":204214,"window":1000000,"pct":0.2042,"turns":69,"model":"claude-opus-4-7",...}

And the piece rmcoppersmith explicitly asked for — a Stop hook that fires at a threshold of your choosing:

$ ctxwatch hook --threshold 0.50 --on-exceed 'your-memory-write-script.sh'
{
  "hooks": {
    "Stop": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "ctxwatch check --threshold=0.5 \"--on-exceed=your-memory-write-script.sh\""
      }]
    }]
  }
}

Drop that block into ~/.claude/settings.json, merge with your existing hooks, done. The check subcommand does the saturation math and exits with code 1 (plus runs your on-exceed command) when you cross the threshold. No escaping gymnastics, no embedded Python one-liners.

Two Design Decisions Worth Sharing

1. Auto-detect 1M users instead of asking them

The [1m] suffix problem is a calibration landmine. The blunt fix would be to make users pass --window 1m every time. The nicer fix is to detect it automatically.

I noticed something obvious: if a single turn in the transcript shows more than 200K total tokens, the user is mathematically guaranteed to be on a 1M window, because more than 200K tokens can't fit in a 200K window. So I added a small pre-scan: if the observed max exceeds the default window, bump to 1M and tag the source as auto:observed>200K.

# Illustrative constants; the real table maps model-name prefixes to windows.
DEFAULT_WINDOW = 200_000
MODEL_WINDOWS = {"claude-opus": 200_000, "claude-sonnet": 200_000}

def resolve_window(model, override=None, observed_max=0):
    if override:
        return override, "override"
    if model and model.endswith("[1m]"):
        return 1_000_000, "[1m] suffix"
    # Strip any bracketed suffix to get the bare model name.
    base = model.rstrip("]").split("[")[0] if model else ""
    for k, v in MODEL_WINDOWS.items():
        if base.startswith(k):
            # A single turn bigger than the default window proves a 1M user.
            if v == DEFAULT_WINDOW and observed_max > DEFAULT_WINDOW:
                return 1_000_000, f"auto:{k}+observed>200K"
            return v, k
    return DEFAULT_WINDOW, "default"

Manual override still works (--window 1m, --window 200k, or raw tokens). But 95% of users never touch it.
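For completeness, the --window value parsing is a few lines. This is a hypothetical sketch of a parser accepting the three forms the flag examples imply ("1m", "200k", or a raw token count); the function name is mine:

```python
def parse_window(arg: str) -> int:
    """Parse a --window value: '1m', '200k', or a raw token count."""
    s = arg.strip().lower()
    if s.endswith("m"):
        return int(float(s[:-1]) * 1_000_000)  # millions of tokens
    if s.endswith("k"):
        return int(float(s[:-1]) * 1_000)      # thousands of tokens
    return int(s)                              # raw token count

print(parse_window("1m"))     # 1000000
print(parse_window("200k"))   # 200000
print(parse_window("131072")) # 131072
```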

2. parse_usage never raises

A single corrupted JSONL line — partial write, disk full, schema drift — used to kill the whole scan. Code review caught it late in the day: the int() calls sat outside the try/except, so a non-numeric input_tokens field (improbable, but not impossible) would propagate a ValueError through every code path.

Fix: wrap everything, return None on any failure.

def parse_usage(line):
    try:
        d = json.loads(line)
        if d.get("type") != "assistant":
            return None
        # ... int coercion, field access, etc ...
        return TurnUsage(...)
    except (json.JSONDecodeError, ValueError, TypeError, AttributeError):
        return None

Small change. Would have bitten me within days of real-world use.

What's Next (v0.2)

  • Multi-project dashboard — aggregate across all your Claude Code projects
  • Hook template library — more patterns than just Stop; common memory-write recipes
  • Historical trend — "you crossed 80% saturation 12 times this week"

For now: v0.1 ships, today. If you write hooks, build agent memory, or just want to know how close your next session is to compaction — try it and tell me where it's wrong.

Links

Install:

pipx install git+https://github.com/Genie-J/ctxwatch

Or clone and run — it's one file. MIT.


Built as part of OPC Team, a self-directed experiment in solo-dev AI infrastructure. Calibration feedback via GitHub issues is the primary signal I'm watching for.
