<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sangmin Lee</title>
    <description>The latest articles on DEV Community by Sangmin Lee (@claudeguide).</description>
    <link>https://dev.to/claudeguide</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3946361%2F45852601-611d-4e7b-a381-c122ca373b5a.jpg</url>
      <title>DEV Community: Sangmin Lee</title>
      <link>https://dev.to/claudeguide</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/claudeguide"/>
    <language>en</language>
    <item>
      <title>Claude API Error Handling: Rate Limits, Retries, Patterns</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Sat, 23 May 2026 01:40:34 +0000</pubDate>
      <link>https://dev.to/claudeguide/claude-api-error-handling-rate-limits-retries-patterns-4nbc</link>
      <guid>https://dev.to/claudeguide/claude-api-error-handling-rate-limits-retries-patterns-4nbc</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-api-error-handling?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-api-error-handling" rel="noopener noreferrer"&gt;claudeguide.io/claude-api-error-handling&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude API Error Handling: Rate Limits, Retries, and Production Patterns
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;The Anthropic API returns structured errors with specific HTTP status codes. Knowing which errors to retry, which to log and surface to users, and which indicate bugs in your code is the difference between a production-ready integration and one that silently fails. For general Claude API concepts, see the &lt;a href="https://dev.to/claude-agent-sdk-guide"&gt;Claude Agent SDK Guide&lt;/a&gt; in 2026.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Error code reference
&lt;/h2&gt;

&lt;p&gt;Each row links to a dedicated troubleshooting page with Python + TypeScript code examples (Korean):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;HTTP Status&lt;/th&gt;
&lt;th&gt;Error type&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-400"&gt;400&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;invalid_request_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Malformed request — bad JSON, unsupported parameters, exceeded context window&lt;/td&gt;
&lt;td&gt;Fix the request — do not retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-401"&gt;401&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authentication_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Invalid API key&lt;/td&gt;
&lt;td&gt;Check key validity — do not retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-403"&gt;403&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;permission_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Valid key but insufficient permissions (e.g. model not enabled)&lt;/td&gt;
&lt;td&gt;Check account permissions — do not retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-404"&gt;404&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;not_found_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Endpoint or model doesn't exist&lt;/td&gt;
&lt;td&gt;Fix model name or endpoint — do not retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-413"&gt;413&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;request_too_large&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Request body exceeds 32MB limit&lt;/td&gt;
&lt;td&gt;Use Files API for large attachments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;422&lt;/td&gt;
&lt;td&gt;&lt;code&gt;unprocessable_entity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Request valid but semantically wrong (e.g. invalid tool schema)&lt;/td&gt;
&lt;td&gt;Fix the schema — do not retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-429"&gt;429&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rate_limit_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Too many requests or tokens per minute&lt;/td&gt;
&lt;td&gt;Retry with exponential backoff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-500"&gt;500&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;api_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Internal server error&lt;/td&gt;
&lt;td&gt;Retry with backoff, max 3 attempts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-529"&gt;529&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;overloaded_error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;API overloaded&lt;/td&gt;
&lt;td&gt;Retry with longer backoff&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Additional HTTP status codes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Quick fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-502"&gt;502&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;bad_gateway&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retry [3, 10, 30, 60, 120s]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-503"&gt;503&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service_unavailable&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check status.anthropic.com + backoff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/claude-api-error-504"&gt;504&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;gateway_timeout&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch to streaming for long outputs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Error subtype deep-dives (한국어, code samples)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-context-length-exceeded"&gt;&lt;code&gt;context_length_exceeded&lt;/code&gt;&lt;/a&gt; — 컨텍스트 창 초과 시 트리밍&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-invalid-api-key"&gt;&lt;code&gt;invalid_api_key&lt;/code&gt;&lt;/a&gt; — key 형식 검증 + 환경변수 trim&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-max-tokens"&gt;&lt;code&gt;max_tokens&lt;/code&gt;&lt;/a&gt; — 모델별 8192 한도 cap&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-model-not-found"&gt;&lt;code&gt;model_not_found&lt;/code&gt;&lt;/a&gt; — 최신 모델 식별자&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-prompt-too-long"&gt;&lt;code&gt;prompt_too_long&lt;/code&gt;&lt;/a&gt; — 누적 conversation 자동 trim&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-streaming-error"&gt;&lt;code&gt;streaming_error&lt;/code&gt;&lt;/a&gt; — SSE 끊김 시 resume 패턴&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-tool-use-error"&gt;&lt;code&gt;tool_use_error&lt;/code&gt;&lt;/a&gt; — tool_use ↔ tool_result pairing 검증&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-vision-error"&gt;&lt;code&gt;vision_error&lt;/code&gt;&lt;/a&gt; — 이미지 포맷/크기 자동 정규화&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-file-upload-error"&gt;&lt;code&gt;file_upload_error&lt;/code&gt;&lt;/a&gt; — Files API + beta 헤더&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-batch-error"&gt;&lt;code&gt;batch_error&lt;/code&gt;&lt;/a&gt; — Batch 10K/250MB 한도 검증&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-cache-error"&gt;&lt;code&gt;cache_error&lt;/code&gt;&lt;/a&gt; — Prompt Caching cache_control 위치&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/claude-api-error-billing-error"&gt;&lt;code&gt;billing_error&lt;/code&gt;&lt;/a&gt; — 결제/크레딧 부족 alert&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The critical distinction&lt;/strong&gt;: 4xx errors (except 429) indicate a problem with your request and should not be retried. 429 and 5xx errors are transient and should be retried. To reduce 400-class errors from oversized contexts, see &lt;a href="https://dev.to/claude-1m-context-window"&gt;Claude 1M Context Window&lt;/a&gt; for truncation and caching strategies.&lt;/p&gt;




&lt;h2&gt;
  
  
  Rate limit errors (429)
&lt;/h2&gt;

&lt;p&gt;The most common production error. Rate limits are enforced on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Requests per minute (RPM)&lt;/strong&gt;: number of API calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input tokens per minute (ITPM)&lt;/strong&gt;: total input tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output tokens per minute (OTPM)&lt;/strong&gt;: total output tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;Retry-After&lt;/code&gt; header in the 429 response tells you exactly how many seconds to wait.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(
    messages: list,
    model: str = "claude-sonnet-4-6",
    max_retries: int = 5,
    base_delay: float = 1.0,
) -

PDF guide + Excel cost calculator.

[→ Get Cost Optimization Masterclass — $59](https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-api-error-handling)

*30-day money-back guarantee. Instant download.*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>retries</category>
      <category>production</category>
    </item>
    <item>
      <title>From $800 to $120/month: A Claude API Cost Optimization Case Study</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Sat, 23 May 2026 01:35:21 +0000</pubDate>
      <link>https://dev.to/claudeguide/from-800-to-120month-a-claude-api-cost-optimization-case-study-34a5</link>
      <guid>https://dev.to/claudeguide/from-800-to-120month-a-claude-api-cost-optimization-case-study-34a5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-api-cost-case-study?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-api-cost-case-study" rel="noopener noreferrer"&gt;claudeguide.io/claude-api-cost-case-study&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  From $800 to $120/month: A Claude API Cost Optimization Case Study
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;This is the story of a 3-person SaaS team that cut their Claude API bill from $800/month to $120/month over 6 weeks — an 85% reduction with zero quality loss.&lt;/strong&gt; The product is a B2B document analysis tool — users upload contracts, the app extracts key clauses, generates summaries, and answers questions about the document.&lt;/p&gt;

&lt;p&gt;PDF guide + 6-sheet Excel cost calculator. Example scenario: $2,100 → $187/month on a customer support agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-api-cost-case-study" rel="noopener noreferrer"&gt;→ Get Cost Optimization Masterclass — $59&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;30-day money-back guarantee. Instant download.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>haiku</category>
    </item>
    <item>
      <title>Claude Agent SDK Quickstart: Build Your First Agent in 15 Minutes</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Sat, 23 May 2026 01:35:18 +0000</pubDate>
      <link>https://dev.to/claudeguide/claude-agent-sdk-quickstart-build-your-first-agent-in-15-minutes-5d43</link>
      <guid>https://dev.to/claudeguide/claude-agent-sdk-quickstart-build-your-first-agent-in-15-minutes-5d43</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-agent-sdk-quickstart?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-agent-sdk-quickstart" rel="noopener noreferrer"&gt;claudeguide.io/claude-agent-sdk-quickstart&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude Agent SDK Quickstart: Build Your First Agent in 15 Minutes
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;An agent is a Claude model that can call tools — functions you define — in a loop until it completes a task. This guide walks from zero to a working agent with two tools (web search and unit converter) in Python or TypeScript in 2026.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;: Anthropic API key, Python 3.11+ or Node.js 18+.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you're building
&lt;/h2&gt;

&lt;p&gt;A research assistant that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Accepts a question&lt;/li&gt;
&lt;li&gt;Decides whether to search the web or convert a unit&lt;/li&gt;
&lt;li&gt;Calls the tool&lt;/li&gt;
&lt;li&gt;Uses the result to answer (or calls another tool)&lt;/li&gt;
&lt;li&gt;Returns a final answer&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Python version
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install the SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;anthropic
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-ant-your-key-here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Define your tools
&lt;/h3&gt;

&lt;p&gt;Tools are Python functions with JSON Schema descriptions. Claude reads the description to decide when to call each tool.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
import anthropic
import json

client = anthropic.Anthropic()

TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web for current information. Use when the question requires recent facts or data.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "unit_converter",
        "description": "Convert between common units. Supports: km/miles, kg/lbs, celsius/fahrenheit, usd/krw.",
        "input_schema": {
            "type": "object",
            "properties": {
                "value": {"type": "number", "description": "The numeric value to convert"},
                "from_unit": {"type": "string", "description": "Source unit"},
                "to_unit": {"type": "string", "description": "Target unit"}
            },
            "required": ["value", "from_unit", "to_unit"]
        }
    }
]

def web_search(query: str) -

Complete, runnable Python and TypeScript code throughout.

[→ Get Agent SDK Cookbook — $49](https://shoutfirst.gumroad.com/l/ogxhmy?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-agent-sdk-quickstart)

*30-day money-back guarantee. Instant download.*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>quickstart</category>
      <category>python</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Claude 1M Context Window: What It Can Do and What It Costs</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Sat, 23 May 2026 01:30:04 +0000</pubDate>
      <link>https://dev.to/claudeguide/claude-1m-context-window-what-it-can-do-and-what-it-costs-1308</link>
      <guid>https://dev.to/claudeguide/claude-1m-context-window-what-it-can-do-and-what-it-costs-1308</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-1m-context-window?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-1m-context-window" rel="noopener noreferrer"&gt;claudeguide.io/claude-1m-context-window&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude 1M Context Window: What It Can Do and What It Costs
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Claude Opus 4.7 and Claude Sonnet 4.6 support a 1 million token context window — roughly 750,000 words, or the equivalent of 10 average novels. This guide explains what that actually means for your use case, what it costs, and when the extended context is worth it. For guidance on picking the right model tier, see &lt;a href="https://dev.to/claude-haiku-sonnet-opus-which-model"&gt;Haiku vs Sonnet vs Opus: Which Model?&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What 1M tokens looks like in practice
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Content type&lt;/th&gt;
&lt;th&gt;Fits in 1M tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Words (English prose)&lt;/td&gt;
&lt;td&gt;~750,000 words&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pages (standard 250 words/page)&lt;/td&gt;
&lt;td&gt;~3,000 pages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code (Python, ~100 tokens/KB)&lt;/td&gt;
&lt;td&gt;~10 MB of source code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub repo (median size)&lt;/td&gt;
&lt;td&gt;~3-5 repos in full&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legal documents&lt;/td&gt;
&lt;td&gt;~500 standard contracts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Emails&lt;/td&gt;
&lt;td&gt;~5,000 average emails&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slack messages&lt;/td&gt;
&lt;td&gt;~20,000 messages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF pages (no images)&lt;/td&gt;
&lt;td&gt;~2,500 pages&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Practical upper bound&lt;/strong&gt;: 1M tokens is the technical limit. In practice, Anthropic recommends staying under 800K for reliable output quality. The model's attention degrades at the very edges of a very long context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing for extended context
&lt;/h2&gt;

&lt;p&gt;Standard context (0-200K tokens) is billed at the normal rate. Beyond 200K, the per-token rate doubles.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;0-200K input&lt;/th&gt;
&lt;th&gt;200K-1M input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;$3.00/1M&lt;/td&gt;
&lt;td&gt;$6.00/1M&lt;/td&gt;
&lt;td&gt;$15.00/1M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;$5.00/1M&lt;/td&gt;
&lt;td&gt;$10.00/1M&lt;/td&gt;
&lt;td&gt;$25.00/1M&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Real cost example — 800K token request on Opus:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First 200K: 200,000 tokens × $5/1M = $1.00&lt;/li&gt;
&lt;li&gt;Remaining 600K: 600,000 tokens × $10/1M = $6.00&lt;/li&gt;
&lt;li&gt;Total input: &lt;strong&gt;$7.00 per request&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Plus output: if the response is 2,000 tokens → $0.05&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Single request total: ~$7.05&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At 100 requests/month: &lt;strong&gt;$705/month&lt;/strong&gt; on input alone. This is the context where selective context matters enormously.&lt;/p&gt;

&lt;h2&gt;
  
  
  When 1M context is worth it
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Whole-codebase analysis
&lt;/h3&gt;

&lt;p&gt;When you need Claude to reason across an entire codebase — not just find a file, but understand how components interact — you need the whole thing in context at once.&lt;/p&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Security audit: finding vulnerability chains across modules&lt;/li&gt;
&lt;li&gt;Architecture review: identifying circular dependencies, anti-patterns&lt;/li&gt;
&lt;li&gt;Refactoring plan: understanding all callers before changing a shared function&lt;/li&gt;
&lt;li&gt;Onboarding doc generation: summarizing the entire codebase for new hires&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Alternative to consider first&lt;/strong&gt;: Claude Code's built-in file navigation (Read, Glob, Grep) lets it explore code without putting everything in context. For 80% of coding tasks, targeted file reading is faster and cheaper.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Multi-document synthesis
&lt;/h3&gt;

&lt;p&gt;Legal due diligence, medical record review, financial document analysis, research literature synthesis — tasks where the answer depends on relationships across hundreds of documents.&lt;/p&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summarizing 200 earnings calls to find recurring themes&lt;/li&gt;
&lt;li&gt;Finding discrepancies across 50 supplier contracts&lt;/li&gt;
&lt;li&gt;Synthesizing 100 research papers into a literature review&lt;/li&gt;
&lt;li&gt;Analyzing a complete audit trail (logs, tickets, emails) for an incident investigation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Long conversation history
&lt;/h3&gt;

&lt;p&gt;Agents that run for many turns can use the full history as context for decision-making. A research agent that has made 50 tool calls, read 30 documents, and produced intermediate results can load the entire history for a final synthesis step.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Large structured data
&lt;/h3&gt;

&lt;p&gt;When you need Claude to reason over a large dataset — a 100K-row export in CSV form is ~500K tokens — and the reasoning requires seeing all the data rather than a sample. (Note: for data analysis at scale, a database + targeted query is almost always better than loading raw data into context.)&lt;/p&gt;

&lt;h2&gt;
  
  
  When NOT to use 1M context
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. You don't actually need it
&lt;/h3&gt;

&lt;p&gt;The most common misuse is sending the full codebase when the task only requires 2-3 files. Use targeted file reads first. Save the full-context approach for tasks where the answer genuinely requires reading everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test&lt;/strong&gt;: can you find the relevant files with Grep/Glob and read just those? If yes, do that.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Speed matters
&lt;/h3&gt;

&lt;p&gt;1M token requests have measurably higher latency. Time to first token is longer. If you need a fast response for a user-facing workflow, consider whether you can reduce the context or use a retrieval step.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The cost doesn't justify the use case
&lt;/h3&gt;

&lt;p&gt;At $7+ per request, 1M context requests are expensive. For a use case running 1,000 times/month, that is $7,000+ in input alone. The quality premium must be real and measurable.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The task is repetitive over sub-documents
&lt;/h3&gt;

&lt;p&gt;If you are summarizing 1,000 individual documents and do not need cross-document reasoning, process them one at a time (or in batches via Batch API). You do not need 1M context to summarize a single 5-page contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use the 1M context window
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Via the API
&lt;/h3&gt;

&lt;p&gt;1M context requires requesting access via the Anthropic Console for some accounts. Once enabled, you use it by simply sending a larger messages array — no special flag required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Read all your documents
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;large_document.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# or claude-sonnet-4-6
&lt;/span&gt;    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze this document and find all clauses that could represent liability:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Checking your context usage
&lt;/h3&gt;

&lt;p&gt;The response object includes &lt;code&gt;usage.input_tokens&lt;/code&gt;. Check this to know exactly what you sent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Output tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cache read tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache_read_input_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Combining with prompt caching
&lt;/h3&gt;

&lt;p&gt;For repeated analysis over the same large document (e.g., answering multiple questions about the same contract), use prompt caching to avoid re-billing the input tokens on each call. See the &lt;a href="https://dev.to/claude-prompt-caching-guide"&gt;Claude Prompt Caching Guide&lt;/a&gt; for a full breakdown of cache pricing and implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;large_document_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;# Cache the document
&lt;/span&gt;        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the termination clauses?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Second call reuses cached document — 90% cheaper on the input
&lt;/span&gt;&lt;span class="n"&gt;response2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;large_document_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the payment terms?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a 700K-token document on Sonnet 4.6:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Without caching: $3/call for first 200K + $6/call for remaining 500K = $4.80 per question&lt;/li&gt;
&lt;li&gt;With caching (after first write): $0.30/1M on cached tokens = $0.21 for 700K tokens per question&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Savings: 96% on repeated queries over the same document&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Claude actually does with a million tokens
&lt;/h2&gt;

&lt;p&gt;This is the question that matters most for deciding whether to use it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What works well:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Finding specific information anywhere in the context ("does this contract mention force majeure?")&lt;/li&gt;
&lt;li&gt;Cross-referencing across documents ("does the pricing in the email match the contract?")&lt;/li&gt;
&lt;li&gt;Summarizing the whole into a structured output&lt;/li&gt;
&lt;li&gt;Finding patterns that only emerge from seeing many instances&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What degrades at very long context:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Precise recall of specific facts from the middle of a 1M token context (the "lost in the middle" problem — performance is best at the beginning and end)&lt;/li&gt;
&lt;li&gt;Maintaining a single coherent thread over very long outputs&lt;/li&gt;
&lt;li&gt;Complex multi-step reasoning when the relevant context is scattered across the full 1M&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mitigation&lt;/strong&gt;: structure your context so the most important information appears at the beginning and end of the messages array. If you have critical instructions or key documents, place them first.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is 1M context available on Haiku?&lt;/strong&gt;&lt;br&gt;
No. Haiku 4.5 supports up to 200K tokens. Only Sonnet 4.6 and Opus 4.7 support 1M context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does context length affect output quality?&lt;/strong&gt;&lt;br&gt;
For tasks within the first 200K tokens of context, quality is equivalent to shorter contexts. For very long contexts, attention degrades slightly in the middle. Plan your context layout accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use 1M context with the Batch API?&lt;/strong&gt;&lt;br&gt;
Yes. Batch API supports up to 1M context. Pricing is 50% off standard rates, so extended context on Batch API: Sonnet at $3.00/1M for extended tokens (vs. $6.00 standard).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I estimate whether I need 1M context?&lt;/strong&gt;&lt;br&gt;
Count your actual tokens with the &lt;code&gt;countTokens&lt;/code&gt; endpoint before building. Many tasks that seem to require full context can be handled with targeted retrieval. Build the retrieval version first; upgrade to full context only if quality is insufficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the maximum output token length?&lt;/strong&gt;&lt;br&gt;
Independent of input context length: 8,192 tokens for most models, 16,000 for Opus 4.7. Input context affects what the model knows, not how much it can generate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://docs.anthropic.com/models" rel="noopener noreferrer"&gt;Anthropic models documentation&lt;/a&gt; — April 2026&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://anthropic.com/pricing" rel="noopener noreferrer"&gt;Claude API pricing&lt;/a&gt; — April 2026&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.anthropic.com/guides/long-context" rel="noopener noreferrer"&gt;Long context best practices&lt;/a&gt; — April 2026&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How much does a 1M token request cost on Claude?
&lt;/h3&gt;

&lt;p&gt;On Claude Opus 4.7, a single 800K-token request costs approximately $7.05 in input alone: the first 200K tokens at $5/1M = $1.00, and the remaining 600K at $10/1M = $6.00, plus output. On Sonnet 4.6, the same request costs about $4.80. Use prompt caching on repeated queries over the same document to reduce costs by up to 96%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which Claude models support the 1M context window?
&lt;/h3&gt;

&lt;p&gt;Only Claude Sonnet 4.6 and Claude Opus 4.7 support 1M token context. Claude Haiku 4.5 is limited to 200K tokens. The 1M context mode may require enabling via the Anthropic Console for some accounts.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the best use cases for Claude's 1M context window?
&lt;/h3&gt;

&lt;p&gt;The highest-value use cases are whole-codebase security audits and architecture reviews, multi-document synthesis (e.g., 200 contracts, 100 research papers), long agent conversation histories requiring full-context synthesis, and large structured data reasoning. Avoid using 1M context when targeted file reads via Grep/Glob can answer the question — it is 4–14x more expensive than standard context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does the "lost in the middle" problem affect Claude's 1M context window?
&lt;/h3&gt;

&lt;p&gt;Yes. Performance is strongest at the beginning and end of the context and degrades slightly in the middle for very long inputs. For critical instructions or key documents, place them at the start of your messages array. Anthropic recommends staying under 800K tokens for reliable output quality even when the technical limit is 1M.&lt;/p&gt;




&lt;h2&gt;
  
  
  Take It Further
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-1m-context-window" rel="noopener noreferrer"&gt;Claude API Cost Optimization Masterclass&lt;/a&gt;&lt;/strong&gt; — The practical guide to cutting Claude API costs by 60–90% in production. Model tiering, prompt caching, Batch API, and token compression — with real numbers from 12 optimization scenarios.&lt;/p&gt;

&lt;p&gt;PDF guide + Excel cost calculator.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-1m-context-window" rel="noopener noreferrer"&gt;→ Get Cost Optimization Masterclass — $59&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;30-day money-back guarantee. Instant download.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>opus</category>
      <category>sonnet</category>
    </item>
    <item>
      <title>Running Claude Code across multiple repos without losing context</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Sat, 23 May 2026 01:30:02 +0000</pubDate>
      <link>https://dev.to/claudeguide/running-claude-code-across-multiple-repos-without-losing-context-2k83</link>
      <guid>https://dev.to/claudeguide/running-claude-code-across-multiple-repos-without-losing-context-2k83</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-code-workflow-multi-repo?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-code-workflow-multi-repo" rel="noopener noreferrer"&gt;claudeguide.io/claude-code-workflow-multi-repo&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Running Claude Code across multiple repos without losing context
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;If you work on more than one codebase at a time — an API, a dashboard, a shared library — the honest problem with Claude Code is not the tool, it's you forgetting which conversation is which. This post documents the workflow that actually works on a Mac mini M4 with 32GB RAM after six weeks of daily use. For a full overview of what Claude Code can do, see the &lt;a href="https://dev.to/claude-code-complete-guide"&gt;Claude Code Complete Guide&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;One Claude Code session = one repository. Do not mix.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;project-scoped &lt;code&gt;CLAUDE.md&lt;/code&gt;&lt;/strong&gt; at the root of each repo to pin context.&lt;/li&gt;
&lt;li&gt;Put persistent cross-repo facts in &lt;strong&gt;&lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; (user-global).&lt;/li&gt;
&lt;li&gt;For cross-repo refactors, use a third "orchestrator" session that spawns Explore subagents into each repo.&lt;/li&gt;
&lt;li&gt;Checkpoint before every context-heavy operation with &lt;code&gt;/remember&lt;/code&gt; or &lt;code&gt;/checkpoint&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why the naive approach breaks
&lt;/h2&gt;

&lt;p&gt;The first instinct is to open one Claude Code window and &lt;code&gt;cd&lt;/code&gt; between projects. This fails for three concrete reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;File cache collisions.&lt;/strong&gt; Claude Code tracks which files you've opened. Switching directories mid-session causes stale path assumptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System prompt dilution.&lt;/strong&gt; Each repo's &lt;code&gt;CLAUDE.md&lt;/code&gt; only gets loaded at startup. Switching afterwards means the guidance doesn't reattach.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conversation contamination.&lt;/strong&gt; Decisions made for Repo A leak into Repo B's implementation when a single conversation carries both.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We measured the impact over a 2-week A/B split on a 3-repo project:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow&lt;/th&gt;
&lt;th&gt;Avg tokens / task&lt;/th&gt;
&lt;th&gt;Rework rate&lt;/th&gt;
&lt;th&gt;Subjective frustration (1-5)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single session, &lt;code&gt;cd&lt;/code&gt; between repos&lt;/td&gt;
&lt;td&gt;42,800&lt;/td&gt;
&lt;td&gt;31%&lt;/td&gt;
&lt;td&gt;4.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;One session per repo, user-global &lt;code&gt;CLAUDE.md&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;18,600&lt;/td&gt;
&lt;td&gt;8%&lt;/td&gt;
&lt;td&gt;1.8&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The one-session-per-repo workflow used 57% fewer tokens and reduced rework by 4x.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup, step by step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Write a tight &lt;code&gt;CLAUDE.md&lt;/code&gt; in each repo
&lt;/h3&gt;

&lt;p&gt;Keep it under 200 lines. It should answer: what is this repo, what stack, where does the code live, what tests exist, what's the deploy path. No aspirational content — only what's true today.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# CLAUDE.md — api-service&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Node 20, TypeScript 5.6, Fastify 5
&lt;span class="p"&gt;-&lt;/span&gt; Routes: src/routes/&lt;span class="err"&gt;*&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Tests: vitest run, one file per route under tests/
&lt;span class="p"&gt;-&lt;/span&gt; Deploy: Fly.io via &lt;span class="sb"&gt;`fly deploy`&lt;/span&gt; (staging auto on main push)
&lt;span class="p"&gt;-&lt;/span&gt; Secrets: .env.local (dev) / Fly secrets (prod)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Put cross-repo facts in &lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is your shared preamble. Useful entries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your preferred commit message style&lt;/li&gt;
&lt;li&gt;Tools you have globally (tsx, pnpm, bun)&lt;/li&gt;
&lt;li&gt;Platform oddities (Mac mini M4, Apple Silicon specifics)&lt;/li&gt;
&lt;li&gt;Recurring project names and what they mean&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Open one session per repo
&lt;/h3&gt;

&lt;p&gt;Use terminal tabs, iTerm split panes, or Warp workflows — one Claude Code process per repo. Expect 1-3 concurrent at any time. On a Mac mini M4 32GB, three sessions with full context hover around 6-8GB resident.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. For cross-repo work, spawn an orchestrator
&lt;/h3&gt;

&lt;p&gt;When you have a change that spans repos (say, "rename this API endpoint and update all callers across three frontends"), open a &lt;strong&gt;fourth&lt;/strong&gt; session in a neutral directory and use the Agent tool to dispatch Explore subagents into each repo. Collect findings, then hand off to the per-repo sessions for implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does Claude Code share memory across sessions?
&lt;/h3&gt;

&lt;p&gt;No. Each session has its own conversation. The &lt;code&gt;.remember/&lt;/code&gt; folder in your project directory persists across sessions &lt;em&gt;within that project&lt;/em&gt;, but two separate sessions do not see each other's live context.&lt;/p&gt;

&lt;h3&gt;
  
  
  How big should CLAUDE.md be?
&lt;/h3&gt;

&lt;p&gt;Under 200 lines in the repo; under 100 lines globally. Anything longer will either get ignored or crowd out working memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use Claude Desktop and Claude Code simultaneously?
&lt;/h3&gt;

&lt;p&gt;Yes. They do not interfere. Claude Desktop is better for ideation and writing; Claude Code for anything touching the filesystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  What about git worktrees?
&lt;/h3&gt;

&lt;p&gt;Worktrees work well for the "spawn orchestrator" pattern. You can have one main checkout for active development and a worktree for an agent to explore safely without conflicting. See &lt;a href="https://dev.to/claude-code-worktree-parallel"&gt;Worktree Isolation in Claude Code&lt;/a&gt; for a step-by-step setup guide.&lt;/p&gt;

&lt;h3&gt;
  
  
  How many concurrent sessions is too many?
&lt;/h3&gt;

&lt;p&gt;On a Mac mini M4 with 32GB RAM, three full sessions hover at 6–8 GB resident. Four to five starts competing for memory with your other tools. In practice, keep concurrent sessions to three — one per active repo. Open a fourth only for short orchestrator tasks, then close it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we got wrong at first
&lt;/h2&gt;

&lt;p&gt;Our first two weeks used a single session with symlinks instead of separate sessions. Token usage was 2.3x higher and we kept getting file path errors because Claude's cached path assumptions outlived our &lt;code&gt;cd&lt;/code&gt; changes. Separate sessions eliminated both problems in a single afternoon of setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Source data
&lt;/h2&gt;

&lt;p&gt;All measurements in this post come from logs on a Mac mini M4 32GB running macOS 15.4, April 4-18 2026. The repositories were an API (Fastify), a dashboard (Next.js 15), and a shared library (TypeScript). Raw log samples are available on request.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the Claude Code workflow series on claudeguide.io. Disclosure: This site is part of the Biz AI self-investment project. The SaaS we build (claudecosts.app) is linked in our product index but not promoted in this post.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Take It Further
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://shoutfirst.gumroad.com/l/agfda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-code-workflow-multi-repo" rel="noopener noreferrer"&gt;Claude Code Power Prompts 300&lt;/a&gt;&lt;/strong&gt; — 300 battle-tested prompts for Claude Code, organized by use case. Copy, paste, ship.&lt;/p&gt;

&lt;p&gt;40 slash command templates. Token-optimized variants. JSONL file for direct import. Tested in production sessions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://shoutfirst.gumroad.com/l/agfda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-code-workflow-multi-repo" rel="noopener noreferrer"&gt;→ Get Claude Code Power Prompts 300 — $29&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;30-day money-back guarantee. Instant download.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>workflow</category>
    </item>
    <item>
      <title>Claude Prompt Caching: When It Pays Off (2026 Break-Even)</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Fri, 22 May 2026 15:36:38 +0000</pubDate>
      <link>https://dev.to/claudeguide/claude-prompt-caching-when-it-pays-off-2026-break-even-2034</link>
      <guid>https://dev.to/claudeguide/claude-prompt-caching-when-it-pays-off-2026-break-even-2034</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-api-cost-prompt-caching-break-even?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-api-cost-prompt-caching-break-even" rel="noopener noreferrer"&gt;claudeguide.io/claude-api-cost-prompt-caching-break-even&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude prompt caching: when it pays off and when it doesn't (2026 numbers)
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Claude prompt caching breaks even at 1.28 reuses for the 5-minute cache and 4 reuses for the 1-hour cache — below those thresholds, you pay 25% more than not caching. Above them, you save up to 90% on input tokens.&lt;/strong&gt; This post derives the break-even math from 2026 pricing and walks through six real workloads to show where caching wins, breaks even, and loses.&lt;/p&gt;

&lt;p&gt;For the complete pricing table this analysis is based on, see &lt;a href="https://dev.to/claude-api-pricing-2026"&gt;Claude API pricing 2026&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pricing (April 2026)
&lt;/h2&gt;

&lt;p&gt;Per 1M tokens, in USD:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Cache write 5m&lt;/th&gt;
&lt;th&gt;Cache write 1h&lt;/th&gt;
&lt;th&gt;Cache read&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;td&gt;$25&lt;/td&gt;
&lt;td&gt;$6.25&lt;/td&gt;
&lt;td&gt;$10&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;$3&lt;/td&gt;
&lt;td&gt;$15&lt;/td&gt;
&lt;td&gt;$3.75&lt;/td&gt;
&lt;td&gt;$6&lt;/td&gt;
&lt;td&gt;$0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;$1&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;td&gt;$1.25&lt;/td&gt;
&lt;td&gt;$2&lt;/td&gt;
&lt;td&gt;$0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Cache write 5m = 1.25x input price. Cache write 1h = 2x input price. Cache read = 0.1x input price.&lt;/p&gt;

&lt;h2&gt;
  
  
  The break-even formula
&lt;/h2&gt;

&lt;p&gt;For a prefix of size P tokens reused N times:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Without cache&lt;/strong&gt;: &lt;code&gt;N * P * input_price&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With cache&lt;/strong&gt;: &lt;code&gt;1 * P * cache_write_price + N * P * cache_read_price&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Caching is cheaper when:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

N * P * input 

PDF guide + 6-sheet Excel cost calculator. Example scenario: $2,100 → $187/month on a customer support agent.

[→ Get Cost Optimization Masterclass — $59](https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&amp;amp;utm_medium=article&amp;amp;utm_campaign=claude-api-cost-prompt-caching-break-even)

*30-day money-back guarantee. Instant download.*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
    </item>
    <item>
      <title>Claude Code Skills Explained: What They Are &amp; When to Use Them (2026)</title>
      <dc:creator>Sangmin Lee</dc:creator>
      <pubDate>Fri, 22 May 2026 15:35:51 +0000</pubDate>
      <link>https://dev.to/claudeguide/claude-code-skills-explained-what-they-are-when-to-use-them-2026-2370</link>
      <guid>https://dev.to/claudeguide/claude-code-skills-explained-what-they-are-when-to-use-them-2026-2370</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://claudeguide.io/claude-code-skills-overview?utm_source=devto&amp;amp;utm_medium=syndication&amp;amp;utm_campaign=claude-code-skills-overview" rel="noopener noreferrer"&gt;claudeguide.io/claude-code-skills-overview&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude Code Skills Explained: What They Are &amp;amp; When to Use Them (2026)
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Claude Code Skills are reusable, AI-callable workflows defined in SKILL.md files in your &lt;code&gt;~/.claude/skills/&lt;/code&gt; directory. Each skill is auto-discovered by Claude Code, can be invoked via the Skill tool, and replaces what would otherwise be one-off prompts copy-pasted between sessions — turning 30-line instructions into a single &lt;code&gt;/skill-name&lt;/code&gt; call.&lt;/strong&gt; Skills are how power users compress repetitive workflows (code review, deployment, content generation, audit) into reusable building blocks. If you find yourself pasting the same instructions into Claude Code multiple times per week, a skill replaces them.&lt;/p&gt;

&lt;p&gt;This guide covers the model: what counts as a skill, how Claude discovers them, when to use a skill vs a slash command vs a CLAUDE.md instruction, and what's already in the public skill library.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is a Skill, exactly?
&lt;/h2&gt;

&lt;p&gt;A skill is a markdown file (&lt;code&gt;SKILL.md&lt;/code&gt;) with frontmatter that Claude Code can invoke as a tool. It contains:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;YAML frontmatter&lt;/strong&gt; — name, description, when to invoke&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Markdown body&lt;/strong&gt; — step-by-step instructions Claude follows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional helper files&lt;/strong&gt; — scripts, templates, references&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example skill at &lt;code&gt;~/.claude/skills/deploy/SKILL.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ship the current branch to production. Use when the user says "ship it", "deploy", "go live", or after PR merge confirmation.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Deploy current branch&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Run &lt;span class="sb"&gt;`bun run build`&lt;/span&gt; — abort if exit code != 0
&lt;span class="p"&gt;2.&lt;/span&gt; Run &lt;span class="sb"&gt;`vercel deploy --prod --yes`&lt;/span&gt;
&lt;span class="p"&gt;3.&lt;/span&gt; Wait for "Aliased: ..." line, confirm HTTP 200
&lt;span class="p"&gt;4.&lt;/span&gt; Submit IndexNow with any new content slugs
&lt;span class="p"&gt;5.&lt;/span&gt; Report deployment URL + git SHA + line count of changed files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now in any Claude Code session, you can say "ship it" and Claude invokes this skill instead of asking you to clarify the steps.&lt;/p&gt;




&lt;h2&gt;
  
  
  Skills vs Slash Commands vs CLAUDE.md
&lt;/h2&gt;

&lt;p&gt;These three serve overlapping purposes. Picking right is half the battle:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Repeatable multi-step workflow ("deploy", "audit", "review")&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Skill&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;One-shot UI command (Claude built-in like &lt;code&gt;/clear&lt;/code&gt;, &lt;code&gt;/exit&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Slash command&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project-specific context (stack, conventions, commands)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CLAUDE.md&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb&lt;/strong&gt;: if you'd write "follow these steps every time" in CLAUDE.md, make it a skill instead. Skills are invoked on demand; CLAUDE.md loads every session.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to use a Skill
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Good fits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deployment workflows&lt;/strong&gt; — &lt;code&gt;deploy&lt;/code&gt;, &lt;code&gt;rollback&lt;/code&gt;, &lt;code&gt;canary&lt;/code&gt; (see &lt;a href="https://dev.to/claude-code-hooks-deep-dive"&gt;Claude Code Hooks Deep Dive&lt;/a&gt; for hook patterns)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit/review&lt;/strong&gt; — &lt;code&gt;aeo-audit&lt;/code&gt;, &lt;code&gt;quality-gate&lt;/code&gt;, &lt;code&gt;security-scan&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content generation&lt;/strong&gt; — &lt;code&gt;write-blog-post&lt;/code&gt;, &lt;code&gt;generate-changelog&lt;/code&gt;, &lt;code&gt;summarize-pr&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Investigation&lt;/strong&gt; — &lt;code&gt;investigate-bug&lt;/code&gt;, &lt;code&gt;find-related-tests&lt;/code&gt;, &lt;code&gt;trace-deployment&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup&lt;/strong&gt; — &lt;code&gt;init-project&lt;/code&gt;, &lt;code&gt;setup-deploy&lt;/code&gt;, &lt;code&gt;configure-monitoring&lt;/code&gt; (pairs well with &lt;a href="https://dev.to/claude-code-memory-system"&gt;Claude Code memory system&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Bad fits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trivial one-liners&lt;/strong&gt; — "format this file" doesn't need a skill, just say it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Highly variable workflows&lt;/strong&gt; — if every invocation needs different parameters, a skill becomes brittle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project-specific commands&lt;/strong&gt; — those belong in CLAUDE.md, not a global skill&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Public Skill Library
&lt;/h2&gt;

&lt;p&gt;Anthropic and the community maintain skills at &lt;code&gt;~/.claude/skills/&lt;/code&gt;. Notable examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gstack&lt;/strong&gt; — full development stack (review, ship, qa, investigate)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;superpowers&lt;/strong&gt; — TDD, debugging, code review patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;engineering&lt;/strong&gt; — debug, system-design, deploy-checklist&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;design&lt;/strong&gt; — design-review, accessibility-check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;finance&lt;/strong&gt; — reconciliation, journal-entry-prep&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can browse installed skills with &lt;code&gt;ls ~/.claude/skills/&lt;/code&gt;. Each directory contains a &lt;code&gt;SKILL.md&lt;/code&gt; defining when Claude should invoke that skill.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Claude Discovers Skills
&lt;/h2&gt;

&lt;p&gt;At session start, Claude Code lists all available skills with their descriptions. When you say "ship it" or "deploy this", Claude matches your intent against skill descriptions and invokes the best match.&lt;/p&gt;

&lt;p&gt;If you say something ambiguous ("update the docs"), Claude either picks the best match silently or asks "which skill should I use?". Specific phrasing in the user prompt triggers faster matching.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.claude/skills/
&lt;span class="c"&gt;# deploy/  audit/  review/  investigate/  setup-deploy/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Building Your First Skill
&lt;/h2&gt;

&lt;p&gt;The minimum viable skill is 5 lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bun-test&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run bun test on the current package. Use when user says "test it", "run tests", or after editing test files.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

Run &lt;span class="sb"&gt;`bun test`&lt;/span&gt; and report results. If any tests fail, show the failure output and stop.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save as &lt;code&gt;~/.claude/skills/bun-test/SKILL.md&lt;/code&gt;. Restart Claude Code. Done.&lt;/p&gt;

&lt;p&gt;For a deeper guide on building skills with arguments, helper scripts, and conditional logic, see &lt;a href="https://dev.to/how-to-build-custom-claude-code-skill"&gt;How to Build a Custom Claude Code Skill&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Skills with Arguments
&lt;/h2&gt;

&lt;p&gt;Skills can accept arguments via &lt;code&gt;$ARGUMENTS&lt;/code&gt;:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
markdown
---
name: trace
description: Trace a deployment by its SHA. Usage: /trace &amp;lt;sha
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>skills</category>
      <category>automation</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
