shabalingv-rgb

Posted on Jun 27

How I built an exact Claude Pro usage monitor — and found an undocumented OAuth endpoint

#macos #claude #productivity #python

If you use Claude Pro or Max heavily, you've probably hit the 5-hour session limit at the worst possible moment — mid-task, no warning. Claude.ai shows you a percentage, but there's no official API to read it programmatically.

This is the story of building a macOS menu bar plugin that shows the exact usage percentage — and accidentally discovering how Claude.ai gets that number in the first place.

The problem

I wanted a live indicator in my menu bar: current %, time until reset, weekly usage. Simple enough. I reached for ccusage (a popular npm tool that parses Claude's local JSONL logs) and built a SwiftBar plugin around it.

It worked — but two problems showed up quickly:

Problem 1: 44 seconds to open the menu. The plugin called ccusage daily --json --breakdown on every one-minute refresh, and that command scans all of your conversation logs and aggregates them each time. Opening the dropdown felt like waiting for a build.

Problem 2: Wrong percentage. My plugin showed 43%, Claude.ai showed 27%. A 16 percentage point difference is not a rounding error.

The root cause of problem 2: ccusage computes a non-CR cost (total cost minus the cost of cache-read tokens) and compares it to an assumed $10 limit. But it was applying the first model's cache-read price to all cache reads in the session. In a mixed Sonnet+Haiku session with 9.9M cache-read tokens:

ccusage used: Haiku price ($0.10/MTok) × 9.9M = $0.99 subtracted
Correct:      Sonnet price ($0.30/MTok) × 9.9M = $2.97 subtracted

That difference of about $2 ($2.97 - $0.99 = $1.98) inflated the "used" cost and pushed the displayed percentage well above the real one.
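The arithmetic can be reproduced in a few lines. This is only a sketch of the two subtractions quoted above, using the per-million-token cache-read prices from this post; the function name is mine, not something from ccusage.

```python
# Cache-read prices in USD per million tokens, taken from the figures above.
CACHE_READ_PRICE = {"sonnet": 0.30, "haiku": 0.10}

cache_read_tokens = 9_900_000


def cache_read_cost(tokens, family):
    """Cost in USD of `tokens` cache-read tokens at the given family's price."""
    return tokens * CACHE_READ_PRICE[family] / 1_000_000


subtracted_by_ccusage = cache_read_cost(cache_read_tokens, "haiku")
subtracted_correctly = cache_read_cost(cache_read_tokens, "sonnet")
overstated = subtracted_correctly - subtracted_by_ccusage

print(f"subtracted at the Haiku price:  ${subtracted_by_ccusage:.2f}")
print(f"subtracted at the Sonnet price: ${subtracted_correctly:.2f}")
print(f"non-CR cost overstated by:      ${overstated:.2f}")
```

Whatever is not subtracted stays in the "used" cost, so the too-small Haiku subtraction shows up directly as extra usage.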

Attempt 1: Pure JSONL parsing

I replaced ccusage with direct Python parsing of ~/.claude/projects/*/*.jsonl. This solved the pricing problem — each message is priced by its actual model, so cache-read costs are subtracted correctly.

# Prices in USD per million tokens: input, output, 5-minute cache write, cache read.
PRICING = {
    "opus":   {"in": 5.0, "out": 25.0, "cw5": 6.25, "cr": 0.50},
    "sonnet": {"in": 3.0, "out": 15.0, "cw5": 3.75, "cr": 0.30},
    "haiku":  {"in": 1.0, "out":  5.0, "cw5": 1.25, "cr": 0.10},
}

def cost_for_message(usage, model_family):
    """Cost in USD of one message, excluding cache reads."""
    p = PRICING[model_family]
    i  = usage.get("input_tokens", 0)
    o  = usage.get("output_tokens", 0)
    cw = usage.get("cache_creation_input_tokens", 0)
    # cache_read_input_tokens is intentionally excluded: it's not counted against the limit
    return (i*p["in"] + o*p["out"] + cw*p["cw5"]) / 1_000_000

Result: 28% vs claude.ai's 27%. One percentage point off. The speed went from "wait for it" to 0.55 seconds total.
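For illustration, here is roughly how that per-message pricing can be applied to raw log lines. It is a sketch under assumptions, not the plugin's actual code: it assumes each JSONL line is an object with a "message" field holding "model" and "usage", and the model IDs in the sample are placeholders. The prices are copied from the PRICING table in this post.

```python
import json

# Prices in USD per million tokens, copied from the PRICING table in the post.
PRICING = {
    "opus":   {"in": 5.0, "out": 25.0, "cw5": 6.25, "cr": 0.50},
    "sonnet": {"in": 3.0, "out": 15.0, "cw5": 3.75, "cr": 0.30},
    "haiku":  {"in": 1.0, "out":  5.0, "cw5": 1.25, "cr": 0.10},
}


def model_family(model_name):
    """Map a full model ID to a PRICING key, or None if it matches nothing."""
    lowered = model_name.lower()
    for family in PRICING:
        if family in lowered:
            return family
    return None


def cost_of_lines(lines):
    """Sum the non-cache-read cost in USD per model family over JSONL lines."""
    totals = {}
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank, truncated, or partially written lines
        # Assumed layout: {"message": {"model": ..., "usage": {...}}}
        message = entry.get("message") if isinstance(entry, dict) else None
        if not isinstance(message, dict):
            continue
        usage = message.get("usage")
        family = model_family(str(message.get("model", "")))
        if not isinstance(usage, dict) or family is None:
            continue
        p = PRICING[family]
        cost = (
            usage.get("input_tokens", 0) * p["in"]
            + usage.get("output_tokens", 0) * p["out"]
            + usage.get("cache_creation_input_tokens", 0) * p["cw5"]
        ) / 1_000_000  # cache reads are left out on purpose
        totals[family] = totals.get(family, 0.0) + cost
    return totals


# Placeholder model IDs; real logs carry full model names.
sample_lines = [
    '{"message": {"model": "claude-sonnet-x", "usage": {"input_tokens": 1000, "output_tokens": 2000, "cache_read_input_tokens": 500000}}}',
    '{"message": {"model": "claude-haiku-x", "usage": {"input_tokens": 4000, "output_tokens": 1000}}}',
    "not json",
]
totals = cost_of_lines(sample_lines)
print(totals)
```

Because each line is priced by its own model, a mixed Sonnet+Haiku session no longer inherits one model's prices for everything.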

But I kept wondering — how does Claude.ai get its number? There must be a server-side source.

The discovery: undocumented OAuth endpoint

While researching alternatives, I found mentions of an endpoint used by third-party native apps to query usage directly from Anthropic. After some digging:

curl -s https://api.anthropic.com/api/oauth/usage \
  -H "Authorization: Bearer sk-ant-oat01-..." \
  -H "anthropic-beta: oauth-2025-04-20" \
  -H "User-Agent: claude-code/2.0.37"

Response:

{
  "five_hour": {
    "utilization": 15.0,
    "resets_at": "2026-06-28T06:59:59.957829+00:00"
  },
  "seven_day": {
    "utilization": 13.0,
    "resets_at": "2026-07-02T23:59:59.957848+00:00"
  }
}

utilization: 15.0 — that's 15%. Exact. Same number as claude.ai. No calculation, no approximation.
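The same request can be made from Python with only the standard library. This is a sketch that mirrors the curl command and the response shape shown above; the function names are mine, and since the endpoint is undocumented, the headers and fields may change.

```python
import json
import urllib.request

USAGE_URL = "https://api.anthropic.com/api/oauth/usage"


def build_usage_request(token):
    """Build the same GET request as the curl command above."""
    return urllib.request.Request(
        USAGE_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "anthropic-beta": "oauth-2025-04-20",
            "User-Agent": "claude-code/2.0.37",
        },
    )


def parse_usage(body):
    """Return {window: (utilization, resets_at)} for the windows present."""
    data = json.loads(body)
    result = {}
    for window in ("five_hour", "seven_day"):
        block = data.get(window)
        if isinstance(block, dict) and "utilization" in block:
            result[window] = (float(block["utilization"]), block.get("resets_at"))
    return result


def fetch_usage(token, timeout=10):
    """Call the endpoint and parse the reply; raises on network or HTTP errors."""
    request = build_usage_request(token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_usage(response.read().decode("utf-8"))
```

Keeping parse_usage separate from the network call makes it easy to test against a saved response, and a missing window is simply absent from the result instead of raising.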

The token (sk-ant-oat01-...) is already on your machine if you use Claude Code CLI — it stores it in macOS Keychain under "Claude Code-credentials":

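import json
import subprocess

# Read the Claude Code credentials JSON from the macOS Keychain.
# check=True raises CalledProcessError if the item is missing or access is denied.
raw = subprocess.run(
    ["security", "find-generic-password", "-s", "Claude Code-credentials", "-w"],
    capture_output=True, text=True, check=True
).stdout.strip()
token = json.loads(raw)["claudeAiOauth"]["accessToken"]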

No login, no setup — the token is already there.

The final architecture (v5.0)

With the OAuth endpoint available, the design became clean:

OAuth call → exact utilization % + resets_at timestamp for both 5h and 7-day limits
JSONL scan → model breakdown (Opus/Sonnet/Haiku) + hourly burn rate
Both run in parallel via threading.Thread, so total time = max(HTTP, disk_IO) ≈ 1.1s

import threading

oauth_data = {}
jsonl_msgs = []

def fetch_oauth():
    # Keychain → HTTP → oauth_data
    ...

def fetch_jsonl():
    # glob ~/.claude/projects/*/*.jsonl → jsonl_msgs
    ...

t1 = threading.Thread(target=fetch_oauth, daemon=True)
t2 = threading.Thread(target=fetch_jsonl, daemon=True)
t1.start(); t2.start()
t1.join(); t2.join()

The OAuth result provides:

five_hour.resets_at → block start = resets_at - 5h, exact countdown
seven_day.resets_at → weekly reset day/time (on my account this works out to Friday 02:00 in my local time zone)
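Deriving the block start and the countdown from resets_at takes only datetime arithmetic. A minimal sketch, assuming the timestamp format shown in the response above; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def five_hour_window(resets_at, now=None):
    """Return (block_start, remaining) for a five_hour.resets_at string."""
    reset = datetime.fromisoformat(resets_at)  # timezone-aware, offset included
    now = now or datetime.now(timezone.utc)
    remaining = max(reset - now, timedelta(0))  # never negative after the reset
    return reset - timedelta(hours=5), remaining


def format_remaining(delta):
    """Format a timedelta as e.g. '4h 24m'."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h {minutes % 60}m"


start, remaining = five_hour_window(
    "2026-06-28T06:59:59.957829+00:00",
    now=datetime(2026, 6, 28, 2, 35, 59, 957829, tzinfo=timezone.utc),
)
print(start.isoformat(), format_remaining(remaining))
```

For the weekly reset, calling .astimezone() on the parsed seven_day.resets_at converts it to the machine's local time zone for display.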

JSONL is only used for the breakdown section — which model consumed what, and how fast you're burning through the limit right now.

If OAuth fails (offline, token expired), the plugin falls back to JSONL-only calculation automatically.
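One way to wire the fallback into the threaded design is to let each worker swallow its own failure and decide afterwards which source to trust. This is a sketch, not the plugin's code: here the fetchers are passed in as arguments and return their results, and the JSONL fetcher is assumed to return an estimated percentage.

```python
import threading


def run_parallel(fetch_oauth, fetch_jsonl):
    """Run both fetchers in threads; a fetcher that raises yields None."""
    results = {"oauth": None, "jsonl": None}

    def worker(key, fn):
        try:
            results[key] = fn()
        except Exception:
            results[key] = None  # treated as "source unavailable"

    threads = [
        threading.Thread(target=worker, args=("oauth", fetch_oauth), daemon=True),
        threading.Thread(target=worker, args=("jsonl", fetch_jsonl), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def five_hour_percent(results):
    """Prefer the server-side number; fall back to the local estimate."""
    oauth = results.get("oauth") or {}
    block = oauth.get("five_hour") or {}
    if "utilization" in block:
        return float(block["utilization"]), "oauth"
    if results.get("jsonl") is not None:
        return float(results["jsonl"]), "jsonl"
    return None, "unavailable"


def offline():
    raise OSError("network unreachable")


print(five_hour_percent(run_parallel(offline, lambda: 28.0)))
```

Returning the source alongside the number lets the menu mark an estimated value differently from an exact one.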

What it shows

Menu bar: 🟢 23% · 4h 24m — current usage + exact time until reset
Dropdown: model breakdown, burn rate, forecast ("at this pace, limit in ~4h 12m")
Weekly section: 7-day utilization % + projected % by end of week + exact reset timestamp
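The forecast lines can be produced by simple linear extrapolation. The sketch below assumes a constant burn rate, measured in percentage points per hour from the JSONL data; the real plugin may compute this differently, and the function names are mine.

```python
def minutes_until_limit(current_pct, pct_per_hour):
    """Minutes until 100% at a constant burn rate, or None if nothing is burning."""
    if pct_per_hour <= 0:
        return None
    return max(0.0, (100.0 - current_pct) * 60.0 / pct_per_hour)


def projected_pct(current_pct, pct_per_hour, hours_left):
    """Utilization expected after `hours_left` more hours at the same rate."""
    return current_pct + pct_per_hour * hours_left


def format_minutes(minutes):
    """Format a minute count as e.g. '3h 51m'."""
    whole = int(minutes)
    return f"{whole // 60}h {whole % 60}m"


eta = minutes_until_limit(current_pct=23.0, pct_per_hour=20.0)
print(f"at this pace, limit in ~{format_minutes(eta)}")
print(f"projected in 2 hours: {projected_pct(23.0, 20.0, 2.0):.0f}%")
```

The projection is deliberately not capped at 100%, so a value above 100 signals that the limit will be hit before the window ends.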

A note on the endpoint

api.anthropic.com/api/oauth/usage is not covered by Anthropic's public API documentation. It appears to be an internal endpoint used by Claude Code and, presumably, Claude.ai, and it accepts the OAuth token that Claude Code stores in your Keychain.

This means Anthropic could change or remove it without notice. But given that Claude Code itself depends on it, it's unlikely to disappear soon.

Try it

The plugin works with SwiftBar (free) — drop the script in your plugins folder, make it executable, done.
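For anyone new to SwiftBar: a plugin is just an executable that prints text. Lines before the first "---" become the menu bar title, and lines after it become dropdown items. The refresh interval comes from the filename, so a hypothetical claude-usage.1m.py would run every minute. Here is a minimal sketch of that output format; the color thresholds are my own assumption, not the plugin's.

```python
def render(pct, remaining, dropdown_lines):
    """Build SwiftBar output: title line, '---' separator, then dropdown items."""
    # Thresholds are illustrative only.
    icon = "🟢" if pct < 50 else "🟡" if pct < 80 else "🔴"
    lines = [f"{icon} {pct:.0f}% · {remaining}", "---"]
    lines.extend(dropdown_lines)
    return "\n".join(lines)


output = render(23, "4h 24m", ["Sonnet: 70%", "Haiku: 30%"])
print(output)
```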

Requirements: Python 3 (on recent macOS versions it comes with the Xcode Command Line Tools), Claude Code CLI installed and logged in.

No Node.js. No npm packages. No brew dependencies.

→ github.com/shabalingv-rgb/swiftbar-claude-usage

If this saved you some frustration, the repo has a crypto donation address in the README.
