<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Avinash Sangle</title>
    <description>The latest articles on DEV Community by Avinash Sangle (@aavisangle).</description>
    <link>https://dev.to/aavisangle</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3878490%2F16544dab-61bc-4ca8-823e-58734c16fcd0.png</url>
      <title>DEV Community: Avinash Sangle</title>
      <link>https://dev.to/aavisangle</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aavisangle"/>
    <language>en</language>
    <item>
      <title>Claude Code Cost Tracking: Monitor and Cut Your Spending</title>
      <dc:creator>Avinash Sangle</dc:creator>
      <pubDate>Fri, 17 Apr 2026 05:04:30 +0000</pubDate>
      <link>https://dev.to/aavisangle/claude-code-cost-tracking-monitor-and-cut-your-spending-4cge</link>
      <guid>https://dev.to/aavisangle/claude-code-cost-tracking-monitor-and-cut-your-spending-4cge</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;a href="https://avinashsangle.com/blog/claude-code-cost-tracking" rel="noopener noreferrer"&gt;avinashsangle.com&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How Much Does Claude Code Actually Cost?
&lt;/h2&gt;

&lt;p&gt;The pricing structure is straightforward. Claude Code Pro runs $20 per month (or $17 annually). The Max plan comes in two tiers: $100/month for 5x the Pro usage allowance, and $200/month for 20x. If you are on the API, you pay per token - Sonnet 4.6 at $3/$15 per million input/output tokens, and Opus 4.6 at $15/$75.&lt;/p&gt;

&lt;p&gt;Across enterprise deployments, the average lands between $150 and $250 per developer per month, according to Anthropic's published benchmarks. Ninety percent of users stay under $12 per day. But that top 10% can burn through tokens fast, especially with extended thinking enabled and Opus as the default model.&lt;/p&gt;

&lt;p&gt;The real issue? Tracking is scattered. Subscription users can't see dollar costs in the Console. API users get billing data but not per-session breakdowns. And everyone has local JSONL files sitting on their machine that most people don't even know exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Track costs with built-in commands:&lt;/strong&gt; &lt;code&gt;/cost&lt;/code&gt; for API users, &lt;code&gt;/stats&lt;/code&gt; for subscribers, &lt;code&gt;/usage&lt;/code&gt; for rate limit status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find your hidden usage data:&lt;/strong&gt; Claude Code logs every session to &lt;code&gt;~/.claude/projects/&lt;/code&gt; as JSONL files with full token counts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use third-party tools for real visibility:&lt;/strong&gt; ccusage (4.8k GitHub stars) gives you daily, monthly, and per-session cost reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cut costs by 50% with 7 practical changes:&lt;/strong&gt; default to Sonnet, cap thinking tokens, clear context between tasks, and write specific prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Built-In Cost Tracking Commands You Should Know
&lt;/h2&gt;

&lt;p&gt;Claude Code ships with three commands for checking usage. Which one you should use depends on whether you are paying through the API or a subscription plan.&lt;/p&gt;

&lt;h3&gt;
  
  
  /cost - Session API Spend
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;/cost&lt;/code&gt; command shows your current session's token usage and estimated dollar cost. Designed for API users. Subscription users still see token counts, which is useful for understanding consumption patterns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total cost:            $0.55
Total duration (API):  6m 19.7s
Total duration (wall): 6h 33m 10.2s
Total code changes:    127 lines added, 43 lines removed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  /stats - Subscriber Usage Dashboard
&lt;/h3&gt;

&lt;p&gt;If you are on Pro or Max, &lt;code&gt;/stats&lt;/code&gt; opens a dashboard with a usage heatmap, session counts, token totals by model, and activity streaks. No dollar costs (flat-rate plan), but you see exactly how much of your allowance you are burning.&lt;/p&gt;

&lt;h3&gt;
  
  
  /usage - Rate Limit Status
&lt;/h3&gt;

&lt;p&gt;Shows your plan limits and current rate limit status. Check this when Claude Code feels slow or you suspect throttling. Shows both 5-hour and 1-week usage windows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Status Line Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Show cost in the status line (API users)&lt;/span&gt;
claude config &lt;span class="nb"&gt;set &lt;/span&gt;status_line.show_cost &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Show token count in the status line&lt;/span&gt;
claude config &lt;span class="nb"&gt;set &lt;/span&gt;status_line.show_tokens &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;When to use which:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API users:&lt;/strong&gt; Use &lt;code&gt;/cost&lt;/code&gt; for dollar amounts and &lt;code&gt;/usage&lt;/code&gt; for rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pro/Max subscribers:&lt;/strong&gt; Use &lt;code&gt;/stats&lt;/code&gt; for usage patterns and &lt;code&gt;/usage&lt;/code&gt; for rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Everyone:&lt;/strong&gt; Configure the status line for passive monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where Claude Code Stores Your Usage Data
&lt;/h2&gt;

&lt;p&gt;Every session gets logged to your local filesystem as JSONL files. These contain detailed token counts for every API call - input tokens, output tokens, cache creation tokens, cache read tokens, and the model used. This is the same data third-party tools read to build their dashboards.&lt;/p&gt;

&lt;h3&gt;
  
  
  Session Logs
&lt;/h3&gt;

&lt;p&gt;Claude Code writes one JSONL file per session to &lt;code&gt;~/.claude/projects/&lt;/code&gt;. If you are on a subscription plan, these local logs are the only way to get granular cost data since the Console doesn't expose it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find your session logs&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.claude/projects/

&lt;span class="c"&gt;# Look at the most recent session&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-lt&lt;/span&gt; ~/.claude/projects/ | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-5&lt;/span&gt;

&lt;span class="c"&gt;# Count tokens in a session with jq&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; ~/.claude/projects/&amp;lt;session-file&amp;gt;.jsonl | &lt;span class="se"&gt;\&lt;/span&gt;
  jq &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s1"&gt;'[.[].message.usage // empty] |
    { total_input: (map(.input_tokens) | add),
      total_output: (map(.output_tokens) | add),
      cache_read: (map(.cache_read_input_tokens // 0) | add),
      cache_creation: (map(.cache_creation_input_tokens // 0) | add) }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Status Line Snapshots
&lt;/h3&gt;

&lt;p&gt;Second file most people miss: &lt;code&gt;~/.claude/statusline.jsonl&lt;/code&gt;. Contains periodic snapshots with server-reported cumulative cost and your 5-hour and 1-week rate-limit usage percentages. This data is only in this local file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View recent status line snapshots&lt;/span&gt;
&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-5&lt;/span&gt; ~/.claude/statusline.jsonl | jq &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Extract cost progression over time&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; ~/.claude/statusline.jsonl | &lt;span class="se"&gt;\&lt;/span&gt;
  jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'[.timestamp, .cost_usd] | @csv'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Third-Party Tools for Claude Code Usage Analytics
&lt;/h2&gt;

&lt;p&gt;The built-in commands give a snapshot. For real visibility into trends, per-project breakdowns, and forecasting, you need more.&lt;/p&gt;

&lt;h3&gt;
  
  
  ccusage - The Most Popular Option
&lt;/h3&gt;

&lt;p&gt;4,800+ GitHub stars. CLI that reads your local JSONL files and produces clean tables with daily, monthly, or per-session cost breakdowns. Tracks cache tokens separately, supports billing window analysis, works offline with cached pricing data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install and run - no setup needed&lt;/span&gt;
npx ccusage              &lt;span class="c"&gt;# Daily report (default)&lt;/span&gt;
npx ccusage daily        &lt;span class="c"&gt;# Detailed daily breakdown&lt;/span&gt;
npx ccusage monthly      &lt;span class="c"&gt;# Monthly aggregated totals&lt;/span&gt;
npx ccusage session      &lt;span class="c"&gt;# Cost per conversation session&lt;/span&gt;
npx ccusage blocks       &lt;span class="c"&gt;# 5-hour billing window analysis&lt;/span&gt;

&lt;span class="c"&gt;# Filter by project&lt;/span&gt;
npx ccusage &lt;span class="nt"&gt;--instances&lt;/span&gt;  &lt;span class="c"&gt;# Group usage by project&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  claude-usage - Local Web Dashboard
&lt;/h3&gt;

&lt;p&gt;Reads the same local log files but renders them as charts with cost estimates, session timelines, and model breakdowns. Pro and Max subscribers get a progress bar for their allowance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude-Code-Usage-Monitor - Real-Time Alerts
&lt;/h3&gt;

&lt;p&gt;Real-time chart of token consumption with predictions about when you will hit your limits. Good for Max plan users who want early warnings before getting throttled.&lt;/p&gt;

&lt;h3&gt;
  
  
  ccost - Per-Request Granularity
&lt;/h3&gt;

&lt;p&gt;Analyzes per-request JSONL logs with detailed token counts using LiteLLM pricing data. Use when you want to know exactly which requests were the most expensive.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Interface&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;GitHub Stars&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ccusage&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;td&gt;Daily/monthly reports, billing windows&lt;/td&gt;
&lt;td&gt;4,800+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-usage&lt;/td&gt;
&lt;td&gt;Web dashboard&lt;/td&gt;
&lt;td&gt;Visual charts, subscriber progress&lt;/td&gt;
&lt;td&gt;1,200+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Usage-Monitor&lt;/td&gt;
&lt;td&gt;CLI (real-time)&lt;/td&gt;
&lt;td&gt;Limit predictions, early warnings&lt;/td&gt;
&lt;td&gt;500+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ccost&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;td&gt;Per-request cost analysis&lt;/td&gt;
&lt;td&gt;200+&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  How to Set a Budget Limit for Claude Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Per-Command Budget Cap
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;--max-budget-usd&lt;/code&gt; flag caps the maximum dollar amount for a single print-mode command. Useful in CI/CD pipelines or automated scripts where a runaway agent could burn through tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Cap a single command at $5&lt;/span&gt;
claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--max-budget-usd&lt;/span&gt; 5.00 &lt;span class="s2"&gt;"Refactor the auth module"&lt;/span&gt;

&lt;span class="c"&gt;# Combine with max-turns for double protection&lt;/span&gt;
claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--max-budget-usd&lt;/span&gt; 10.00 &lt;span class="nt"&gt;--max-turns&lt;/span&gt; 5 &lt;span class="s2"&gt;"Fix failing tests in src/"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Workspace Rate Limits for Teams
&lt;/h3&gt;

&lt;p&gt;Claude Code creates a workspace called "Claude Code" when you first authenticate with Console. Set rate limits on this workspace in the Console's Limits page to cap Claude Code's share of your API allocation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent SDK Cost Tracking
&lt;/h3&gt;

&lt;p&gt;If you are building on the Claude Agent SDK, every result message includes a &lt;code&gt;total_cost_usd&lt;/code&gt; field.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/claude-agent-sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;totalSpend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Read the files in src/ and summarize the architecture&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;List all exported functions in src/auth.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;totalSpend&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;total_cost_usd&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`This call: $&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;total_cost_usd&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Total spend: $&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;totalSpend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toFixed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7 Ways to Cut Claude Code Costs by 50%
&lt;/h2&gt;

&lt;p&gt;After tracking my spending for a few weeks, I identified the patterns that were burning tokens fastest. These seven changes brought my daily average from ~$12 down to $5-6, with zero quality loss.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Default to Sonnet, Switch to Opus Only When Needed
&lt;/h3&gt;

&lt;p&gt;Sonnet 4.6 costs $3/$15 per million input/output tokens. Opus 4.6 costs $15/$75. That's 5x more expensive. For most coding tasks, Sonnet produces results that are just as good.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Switch models on the fly&lt;/span&gt;
/model sonnet    &lt;span class="c"&gt;# For everyday tasks&lt;/span&gt;
/model opus      &lt;span class="c"&gt;# For complex reasoning only&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Set MAX_THINKING_TOKENS to 10,000
&lt;/h3&gt;

&lt;p&gt;Extended thinking is the single biggest cost lever. Uncapped thinking tokens can generate tens of thousands of tokens per request. A 10,000 limit still gives Claude enough room to reason.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set thinking token limit&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MAX_THINKING_TOKENS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10000

&lt;span class="c"&gt;# Or lower the effort level for simple tasks&lt;/span&gt;
/effort low       &lt;span class="c"&gt;# Significant token savings&lt;/span&gt;
/effort medium    &lt;span class="c"&gt;# Balance of cost and quality&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Use /clear Between Tasks
&lt;/h3&gt;

&lt;p&gt;Stale context is a silent cost multiplier. Every message includes the full conversation history as input tokens. Run &lt;code&gt;/clear&lt;/code&gt; when you switch to unrelated work. Use &lt;code&gt;/rename&lt;/code&gt; first if you want to come back to the session later with &lt;code&gt;/resume&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Use /compact When Context Grows
&lt;/h3&gt;

&lt;p&gt;If you are mid-task and can't clear, use &lt;code&gt;/compact&lt;/code&gt; to summarize the conversation history. Reduces token count while preserving important context.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Write Specific Prompts
&lt;/h3&gt;

&lt;p&gt;Vague prompts are expensive. "Make this better" forces Claude to spend tokens figuring out what you want. "Extract the hardcoded strings in src/auth.js into constants" gets the job done in one pass.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Use Plan Mode Before Expensive Operations
&lt;/h3&gt;

&lt;p&gt;Press Shift+Tab twice to enter plan mode before starting a big task. Claude outlines its approach before writing code. Costs a few hundred tokens for the plan but saves thousands by preventing costly rework.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Break Work Into Scoped Sessions
&lt;/h3&gt;

&lt;p&gt;One session for everything is the most expensive way to use Claude Code. Context accumulates, cache misses increase, and irrelevant history gets sent with every request. Work in task-scoped sessions: one for fixing the login bug, another for adding the new API endpoint, a third for writing tests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code API vs Subscription: Which Costs Less?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Usage Profile&lt;/th&gt;
&lt;th&gt;API Cost/Month&lt;/th&gt;
&lt;th&gt;Best Plan&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Light (1-2 hrs/day)&lt;/td&gt;
&lt;td&gt;$30-50/mo&lt;/td&gt;
&lt;td&gt;API or Pro ($20)&lt;/td&gt;
&lt;td&gt;Pay-per-use wins&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Moderate (3-5 hrs/day)&lt;/td&gt;
&lt;td&gt;$100-180/mo&lt;/td&gt;
&lt;td&gt;Max 5x ($100)&lt;/td&gt;
&lt;td&gt;Up to 44% savings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Heavy (6+ hrs/day)&lt;/td&gt;
&lt;td&gt;$200-400/mo&lt;/td&gt;
&lt;td&gt;Max 20x ($200)&lt;/td&gt;
&lt;td&gt;Up to 50% savings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The API makes more sense with sporadic usage or when you need fine-grained budget controls like &lt;code&gt;--max-budget-usd&lt;/code&gt;. It's also the only option for per-project cost allocation when billing clients. The subscription wins on predictability.&lt;/p&gt;

&lt;p&gt;My approach: Max 5x plan for day-to-day, API key configured for automated scripts and CI pipelines where I want hard budget caps. Hybrid setup gives predictable costs for interactive work and strict controls for automation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How do I check my Claude Code costs?
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;/cost&lt;/code&gt; in any session for API spend totals with token counts and dollar estimates. Subscribers should use &lt;code&gt;/stats&lt;/code&gt; for a usage dashboard with heatmaps and model breakdowns, or &lt;code&gt;/usage&lt;/code&gt; for rate limit status. You can also configure the status line to show costs continuously.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where does Claude Code store usage data locally?
&lt;/h3&gt;

&lt;p&gt;Claude Code writes one JSONL file per session to &lt;code&gt;~/.claude/projects/&lt;/code&gt; with full token counts for every API call. It also writes periodic snapshots to &lt;code&gt;~/.claude/statusline.jsonl&lt;/code&gt; containing cumulative cost and rate-limit usage percentages. Third-party tools like ccusage read these files.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is ccusage and how do I use it?
&lt;/h3&gt;

&lt;p&gt;ccusage is an open-source CLI tool with 4,800+ GitHub stars that analyzes Claude Code usage from local JSONL files. Run &lt;code&gt;npx ccusage&lt;/code&gt; for a daily report, &lt;code&gt;npx ccusage monthly&lt;/code&gt; for monthly totals, or &lt;code&gt;npx ccusage session&lt;/code&gt; to see costs per conversation. Works offline with cached pricing data.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much does Claude Code cost per day on average?
&lt;/h3&gt;

&lt;p&gt;Anthropic reports the average at about $6 per developer per day, with 90% of users under $12 per day. Enterprise deployments average $150 to $250 per developer per month. Heavy Opus sessions with extended thinking can spike past $20 in a single day.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I set a budget limit for Claude Code API usage?
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;--max-budget-usd&lt;/code&gt; in print mode to cap spending per command: &lt;code&gt;claude -p --max-budget-usd 5.00 "your prompt"&lt;/code&gt;. For team-wide limits, set workspace rate limits in the Claude Console. You can also use &lt;code&gt;--max-turns&lt;/code&gt; to indirectly limit costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Claude Code Max plan worth it vs API pricing?
&lt;/h3&gt;

&lt;p&gt;If your API equivalent spend exceeds $100/month, the Max 5x plan at $100/month saves money. If you spend over $200/month on API, Max 20x is the better deal. For sporadic usage under $50/month, pay-per-token API pricing usually costs less overall.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does prompt caching reduce Claude Code costs?
&lt;/h3&gt;

&lt;p&gt;Claude Code automatically caches repeated content like system prompts and CLAUDE.md files. Cached tokens cost 90% less than fresh input tokens. The cache has a 5-minute TTL, so keeping sessions under 5 minutes apart maximizes savings. Track cache hit rates in local JSONL logs.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Read the full version&lt;/strong&gt; (with extra examples and updates) on the original post: &lt;a href="https://avinashsangle.com/blog/claude-code-cost-tracking" rel="noopener noreferrer"&gt;Claude Code Cost Tracking on avinashsangle.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>devops</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Claude Managed Agents vs Agent SDK: Which Should You Use?</title>
      <dc:creator>Avinash Sangle</dc:creator>
      <pubDate>Tue, 14 Apr 2026 11:38:39 +0000</pubDate>
      <link>https://dev.to/aavisangle/claude-managed-agents-vs-agent-sdk-which-should-you-use-4112</link>
      <guid>https://dev.to/aavisangle/claude-managed-agents-vs-agent-sdk-which-should-you-use-4112</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;a href="https://avinashsangle.com/blog/claude-managed-agents" rel="noopener noreferrer"&gt;avinashsangle.com&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Anthropic launched &lt;strong&gt;Claude Managed Agents&lt;/strong&gt; in beta on April 8, 2026. It's a hosted service that runs long-horizon Claude agents in Anthropic's infrastructure - sandboxed, persistent, and integrated with MCP servers out of the box.&lt;/p&gt;

&lt;p&gt;If you're choosing between Managed Agents and the Agent SDK, the short answer is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pick &lt;strong&gt;Managed Agents&lt;/strong&gt; for multi-hour production workloads&lt;/li&gt;
&lt;li&gt;Pick the &lt;strong&gt;Agent SDK&lt;/strong&gt; when you need full control over the runtime&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's the breakdown after digging through the docs and API.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Managed Agents&lt;/strong&gt; = Anthropic runs the agent harness, sandbox, and runtime for you (hosted, beta)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent SDK&lt;/strong&gt; = you run the same engine yourself, with full control over infrastructure&lt;/li&gt;
&lt;li&gt;Pricing: standard token rates + &lt;strong&gt;$0.08 per session-hour&lt;/strong&gt; of active runtime + $10 per 1,000 web searches&lt;/li&gt;
&lt;li&gt;Early adopters: Notion, Rakuten, Asana - focused on long-running enterprise workflows&lt;/li&gt;
&lt;li&gt;Beta header required: &lt;code&gt;managed-agents-2026-04-01&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Core Difference
&lt;/h2&gt;

&lt;p&gt;Think of it like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Managed Agents = Vercel&lt;/strong&gt; (hosted, opinionated, pay-per-use)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent SDK = self-hosted Next.js&lt;/strong&gt; (you run it on your infra)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same underlying engine. Different operational trade-offs.&lt;/p&gt;

&lt;p&gt;Managed Agents handles the agent loop, sandboxed code execution, file system access, web browsing, persistent sessions, and checkpointing for you. You send a prompt, connect your MCP servers, and the agent runs - even for hours - without you maintaining any of that runtime infrastructure.&lt;/p&gt;

&lt;p&gt;The Agent SDK exposes the same engine for self-hosted runtimes. You get local file access, private network connectivity, custom tool execution, and full runtime control. No session-hour charges - just token costs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Managed Agents pricing on top of standard Claude API rates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$0.08 per session-hour&lt;/strong&gt; of active runtime&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;$10 per 1,000 web searches&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idle time is free&lt;/strong&gt; - sessions can wait for input without billing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a 2-hour research task, you're looking at roughly &lt;strong&gt;$0.16 in compute&lt;/strong&gt; plus token costs. For zero infrastructure management, that's a strong tradeoff for production workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Pick Which
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pick Managed Agents when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have multi-hour production workloads (research, batch processing, monitoring)&lt;/li&gt;
&lt;li&gt;You need sandboxed code execution out of the box&lt;/li&gt;
&lt;li&gt;Web browsing + MCP integrations matter&lt;/li&gt;
&lt;li&gt;You don't want to build or maintain agent infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pick Agent SDK when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need local file access (working against repos)&lt;/li&gt;
&lt;li&gt;Private network access required&lt;/li&gt;
&lt;li&gt;Custom tool execution logic&lt;/li&gt;
&lt;li&gt;You want predictable token-only costs without session-hour pricing&lt;/li&gt;
&lt;li&gt;Development and debugging - the SDK lets you inspect everything&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What You Get Out of the Box with Managed Agents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Sandboxed containers with code execution, file system, and web access&lt;/li&gt;
&lt;li&gt;Sessions can run for hours with &lt;strong&gt;checkpointing&lt;/strong&gt; for fault tolerance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server support&lt;/strong&gt; - any MCP server you've built for Claude Desktop or Claude Code can be configured for a Managed Agent session&lt;/li&gt;
&lt;li&gt;Built-in web browsing and search&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Beta Status
&lt;/h2&gt;

&lt;p&gt;Managed Agents is currently in beta. All endpoints require the beta header:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;anthropic-beta: managed-agents-2026-04-01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The official Anthropic SDKs set this automatically when you use the beta namespace. Some features like multi-agent orchestration remain in limited research preview.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Notion, Rakuten, and Asana are early adopters - all using Managed Agents for enterprise workflows where the agent needs to run for extended periods, integrate with internal tools via MCP, and survive infrastructure failures.&lt;/p&gt;

&lt;p&gt;This is Anthropic moving up the value chain: instead of just selling the model, they're selling the complete runtime that wraps it. For teams without dedicated AI infrastructure, that's a meaningful shift.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Read the full deep-dive&lt;/strong&gt; with code examples, pricing math, and a decision flowchart on the original post: &lt;a href="https://avinashsangle.com/blog/claude-managed-agents" rel="noopener noreferrer"&gt;Claude Managed Agents vs Agent SDK on avinashsangle.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
