How to Monitor Claude Code
You spent an hour with Claude Code. The terminal scrolled by, the feature shipped, you typed /cost: $18.
- Do you know where that money went?
- How many times did the agent re-read the same file?
- Which tool call got stuck in a retry loop and ate half your hour?
- What did the subagents actually do?
- Did the context window fill up and trigger a silent compaction?
The honest answer is usually "no idea." By default, Claude Code is a black box. The terminal scrolls, the work gets done, the bill arrives — and everything that happened in between is invisible.
Argus is the open-source VS Code extension that opens that black box. It watches every Claude Code session live, shows you exactly where your tokens go, and flags retry loops while they are happening — not after the invoice arrives.
The 5 Real Problems with Running Claude Code Blind
If you have used Claude Code seriously for more than a week, you have hit at least three of these:
1. Bill shock
Your monthly invoice is 3x what you expected. /cost only shows you the total. Which session, which prompt, which model burned the tokens? You have no breakdown — and therefore no way to actually fix anything.
2. Silent retry loops
The agent runs the same Bash command 12 times, fails 12 times, and burns tokens on every attempt. You only notice because "it took a while." To find the loop you would have to open a 10,000-line JSONL transcript and read it by hand.
3. Duplicate reads draining your token budget
The agent pulls the same file into context 7 times across the session. Your cache hit ratio plummets, your input tokens explode. There is no native way to surface this — you would have to grep the transcript yourself.
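If you did want to grep the transcript yourself, the check fits in a short script. Here is a minimal Python sketch; the field names (message.content, tool_use, Read, input.file_path) are assumptions about the JSONL schema, not a documented format, so adjust them to what your transcripts actually contain:

```python
import json
from collections import Counter
from pathlib import Path

def duplicate_reads(transcript_path):
    """Count how often each file was pulled into context via a Read tool call.

    Schema assumption: assistant turns carry tool_use blocks under
    message.content, each with a name and an input dict.
    """
    counts = Counter()
    for line in Path(transcript_path).read_text().splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        for block in event.get("message", {}).get("content", []) or []:
            if (isinstance(block, dict)
                    and block.get("type") == "tool_use"
                    and block.get("name") == "Read"):
                path = block.get("input", {}).get("file_path")
                if path:
                    counts[path] += 1
    # Keep only files read more than once; each repeat is avoidable token burn.
    return {path: n for path, n in counts.items() if n > 1}
```

Argus runs this kind of check for you, but the sketch shows why the problem is invisible by default: nothing in the terminal output ever tells you the count.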
4. The subagent blind spot
You spawned a Task and it returned with a result. What did it actually do? Which tools did it call? How much did it cost on its own? You typically only see the final message — the inner trace is buried.
5. Compaction quietly eats your context
The window filled up, Claude Code auto-compacted earlier history — but exactly when, and exactly what did it drop? This is the answer to "why did it forget that detail later?", and you have no native way to see it.
These five problems are the invisible tax on every serious Claude Code workflow. The fix is the same in all five cases: make your sessions observable.
What is Argus?
Argus is a free, open-source VS Code extension built specifically for Claude Code observability. Its only job is to:
- Auto-discover every session under ~/.claude/projects/,
- Parse it into a panel a human can actually read,
- Update live while Claude Code is running,
- Automatically flag wasted spend, retry loops, duplicate reads, and other anti-patterns.
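The discovery step is simple enough to sketch. A rough Python equivalent of what "auto-discover" means here, assuming a layout of one .jsonl transcript per session inside per-project folders (an assumption about the on-disk layout, not Argus's actual code):

```python
from pathlib import Path

def discover_sessions(root=Path.home() / ".claude" / "projects"):
    """List every session transcript under the projects directory.

    Returns (project_name, transcript_path) pairs, newest first.
    Layout assumption: <root>/<project>/<session>.jsonl
    """
    root = Path(root)
    if not root.is_dir():
        return []  # No sessions yet: the same empty state Argus shows.
    sessions = [(p.parent.name, p) for p in root.rglob("*.jsonl")]
    return sorted(sessions, key=lambda s: s[1].stat().st_mtime, reverse=True)
```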
Named after Argos, the hundred-eyed watcher of Greek mythology. The name is the job description — built to watch your agents.
Free. Open source (MIT). No login. No data ever leaves your machine — Argus only reads the JSONL transcripts Claude Code is already writing locally.
30-Second Install
Open VS Code, click the Extensions icon in the left sidebar (or press Ctrl+Shift+X / Cmd+Shift+X). Type "Argus" in the search box, find Argus — Claude Code Observability in the results, and hit Install.
The moment install finishes, an eye icon appears in the left Activity Bar. Click it and your Claude Code sessions are already listed. Zero configuration.
If you have never run Claude Code, Argus will show an empty list. Run a single Claude Code session and it will appear automatically.
What You See When You Open a Session
Click any session in the sidebar and a tab opens on the right. The tab is split into multiple sub-panels — each one answers one of the problems listed above:
Steps — "What did the agent do?"
Full list of every tool call. Which file it read, which command it ran, which edit it made. Search, filter by tool type, filter by success/error.
When to use it: "Just summarize what happened in this session."
Analysis — "Where did it go wrong?"
Argus runs a rule engine in the background and automatically flags the problems above, plus a few more:
- Duplicate reads — same file pulled into context multiple times (free token burn)
- Retry loops — same command failing over and over
- Failed tools — tool calls that errored out
- Unused operations — files that were read but never used
- Context pressure — window approaching its limit
- Compaction events — the moments Claude Code dropped earlier history
Every finding has a "jump to step" link, so you can land on the exact moment it happened.
When to use it: Whenever a session feels slow. Or after a bill shock.
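To make the retry-loop rule concrete, here is a toy version in Python. This is not Argus's actual rule engine, and the step shape (tool / input / is_error) is an assumed simplification of the parsed transcript:

```python
from collections import Counter

def find_retry_loops(steps, threshold=3):
    """Flag Bash commands that failed the same way over and over.

    `steps` is a simplified view of the transcript: one dict per tool call
    with its name, its input, and whether it errored (schema assumption).
    Returns (command, failure_count) pairs at or over the threshold.
    """
    failures = Counter()
    for step in steps:
        if step.get("tool") == "Bash" and step.get("is_error"):
            failures[step["input"]] += 1
    return [(cmd, n) for cmd, n in failures.items() if n >= threshold]
```

The real value is not the rule itself, which is trivial, but that it runs on every session automatically instead of waiting for you to go digging.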
Cost — "Where did my money go?"
Per-step token and dollar breakdown. Input / output / cache read / cache write split out separately. Which model (Opus / Sonnet / Haiku) cost what. Your cache hit ratio.
Once you open this tab a few times, your prompting habits change. You start managing context more deliberately because you can see, in real dollars, what re-reading the same file 7 times costs.
When to use it: End of every month. And the first time the bill is bigger than expected.
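To see how a per-step breakdown like this comes together, here is a toy calculation in Python. The rates below are placeholders, not real Anthropic pricing; plug in the current numbers for whichever model you run:

```python
# Illustrative per-million-token rates. Placeholders, NOT real pricing.
RATES = {
    "input": 3.00,
    "output": 15.00,
    "cache_read": 0.30,
    "cache_write": 3.75,
}

def step_cost(usage, rates=RATES):
    """Turn one step's token usage into dollars, split the way the Cost tab splits it."""
    return {kind: usage.get(kind, 0) / 1_000_000 * rate for kind, rate in rates.items()}

def cache_hit_ratio(usage):
    """Share of input-side tokens served from cache.

    This is the number that plummets when the agent re-reads the same file cold.
    """
    cached = usage.get("cache_read", 0)
    total = cached + usage.get("input", 0) + usage.get("cache_write", 0)
    return cached / total if total else 0.0
```

Run it on a single step's token counts and the asymmetry jumps out: cache reads cost a fraction of fresh input, which is exactly why duplicate cold reads hurt.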
Performance — "How much was waste?"
Efficiency score plus a wasted-cost estimate: dollars lost to duplicate reads, retry loops, and unnecessary tool calls.
The first view is shocking. "I am burning $4 an hour on nothing" — that kind of moment.
Flow — "How are these files connected?"
An interactive graph. Nodes are files, edges are Read / Write / Edit relationships. The fastest way to see, in a complex refactor, exactly which files the agent touched and in what order.
Context & Insights — "How could I have done it better?"
The Context tab shows your token budget and cache performance for the session. The Insights tab generates pattern-based recommendations — things like "you read 3 files 5+ times in this session, pin them in your prompt."
Live Monitoring: Intervene Before the Invoice
This is the single most valuable thing Argus does:
Open the Argus panel, then start Claude Code. Every tool call updates the panel in real time. The token counter advances, new steps appear, the analyzer raises new findings as they happen.
This is not post-mortem log reading anymore — it is live monitoring. You see the agent enter a retry loop as it enters one, not after the invoice. You get a chance to intervene.
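Under the hood, "live" can be as simple as following the append-only JSONL file. A minimal Python sketch of the idea (simple polling; an assumption about how it could work, not Argus's actual file watcher):

```python
import json
import time

def follow_transcript(path, poll_seconds=0.5):
    """Yield transcript events as Claude Code appends them.

    Assumes the session transcript is append-only JSONL: read what is
    there, then keep polling the file for new lines.
    """
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                time.sleep(poll_seconds)  # Nothing new yet; wait for the agent.
                continue
            if line.strip():
                yield json.loads(line)
```

Consume the generator in a loop and you have the skeleton of a live monitor: each yielded event is one tool call or message, arriving as it happens.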
Using the Sidebar Effectively
Once you have a lot of sessions, the sidebar matters:
- Search by message content or project name
- Model filter (great for comparing Opus / Sonnet / Haiku)
- Date presets (1 hour / 24 hours / 7 days / 30 days) or custom range from a calendar
- Group by project or group by model — whichever is more useful for you
My recommendation: once a week, set the filter to "this week" and skim the Cost and Analysis tabs. Five minutes. That habit alone will visibly reduce your token spend.
Typical Use Cases
"Why did this session take 25 minutes?"
→ Open Steps, sort by duration. Inspect the top 3 longest steps. Almost always a Bash command or a retry loop.
"My bill is higher than expected — why?"
→ Cost tab, with a date filter. Open the most expensive sessions, then check Performance for wasted cost.
"Was this refactor sequencing actually sensible?"
→ Flow tab. The dependency graph shows you the real order the agent touched files in.
"Who on my team is using which model?"
→ (Per machine, since all data stays local.) Group by model. The people running Opus on trivial tasks become obvious immediately.
"Is the agent making the same mistakes everyone else does?"
→ The Duplicate Reads and Retry Loop findings under Analysis. Turn the patterns into internal training material if needed.
Who Is It For?
Solo developers: Lower your token spend with real data. Your prompts get sharper.
Teams: Cross-project AI usage and cost auditing. Identify and spread good practices.
Researchers: Inspect LLM-driven development patterns at trace level. Compare models head-to-head.
FAQ
Does any of my data go to Anthropic or anywhere else?
No. Argus runs locally and only reads the JSONL files already on your machine. Zero network requests.
Does Argus work when Claude Code is not running?
Yes. You can always open and inspect any past session.
Is it Mac-only?
No. macOS, Linux, and Windows are all supported.
Does it work in Cursor / Windsurf and other VS Code forks?
It works in most VS Code-compatible forks. Tested in Cursor. The most polished experience is still in stock VS Code.
Do I need to configure anything?
No. There are two optional settings — scan depth and language (tr / en) — and both work fine on their defaults.
Install Again (Bookmark This)
VS Code → Extensions (Ctrl+Shift+X) → search "Argus" → Install → eye icon in the Activity Bar → done.
Repo: github.com/yessGlory17/argus
Conclusion
Claude Code is a remarkable tool. But without observability, it is an expensive one. Like every other serious development workflow, it needs monitoring.
After installing Argus and watching a few sessions, going back is hard. Seeing what the agent actually does makes you a better prompter, which means fewer tokens, faster iteration, lower bills.
If you found this post by searching for "how to monitor Claude Code", "Claude Code cost tracking", or "Claude Code observability", the answer is short: install it, try it, and you will not want to go back.
If Argus saves you tokens, time, or sanity, drop a star on the repo. Issues and ideas are always welcome.
Happy (and observable) hacking.