Originally published at claudeguide.io/claude-api-pricing-2026
Claude API Pricing 2026: Complete Breakdown with Calculators
Anthropic's Claude API uses a per-token pricing model. You pay for tokens consumed — input (what you send) and output (what the model generates). This guide covers every pricing tier, feature, and real-world cost example as of April 2026.
Current pricing table (April 2026)
Standard API
| Model | Input per 1M tokens | Output per 1M tokens |
|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Opus 4.7 | $5.00 | $25.00 |
Prompt caching
| Model | Cache write per 1M | Cache read per 1M |
|---|---|---|
| Claude Haiku 4.5 | $1.25 | $0.10 |
| Claude Sonnet 4.6 | $3.75 | $0.30 |
| Claude Opus 4.7 | $6.25 | $0.50 |
Cache read prices are 10% of standard input prices. Cache writes are 125% of standard input prices.
Batch API (50% off all standard rates)
| Model | Input per 1M tokens | Output per 1M tokens |
|---|---|---|
| Claude Haiku 4.5 | $0.50 | $2.50 |
| Claude Sonnet 4.6 | $1.50 | $7.50 |
| Claude Opus 4.7 | $2.50 | $12.50 |
Batch API processes requests asynchronously within 24 hours. No streaming. Ideal for non-time-sensitive bulk workloads.
1M context window (extended context)
For Sonnet 4.6 and Opus 4.7, input tokens beyond 200K are billed at higher rates. Haiku 4.5 does not support 1M context.
| Context range | Sonnet 4.6 input | Opus 4.7 input |
|---|---|---|
| 0 – 200K tokens | $3.00/1M | $5.00/1M |
| 200K – 1M tokens | $6.00/1M | $10.00/1M |
Output pricing is unchanged regardless of context length.
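The tiered billing described above can be sketched as a small function. This is a minimal sketch that assumes, as the table states, that only the tokens beyond 200K are billed at the higher rate. `LONG_CONTEXT_RATES`, `THRESHOLD`, and `tiered_input_cost` are names invented for this example, and the dictionary keys are shorthand labels rather than API model IDs.

```python
# USD per 1M input tokens: (rate up to 200K, rate beyond 200K), from the table above.
LONG_CONTEXT_RATES = {
    "sonnet-4.6": (3.00, 6.00),
    "opus-4.7": (5.00, 10.00),
}
THRESHOLD = 200_000  # tokens per request billed at the base rate


def tiered_input_cost(model: str, input_tokens: int) -> float:
    """Input cost in USD for one request, assuming only the overflow is billed higher."""
    base_rate, extended_rate = LONG_CONTEXT_RATES[model]
    base_tokens = min(input_tokens, THRESHOLD)
    extended_tokens = max(input_tokens - THRESHOLD, 0)
    return (base_tokens * base_rate + extended_tokens * extended_rate) / 1_000_000


print(tiered_input_cost("opus-4.7", 400_000))    # 200K at $5/1M plus 200K at $10/1M
print(tiered_input_cost("sonnet-4.6", 150_000))  # entirely inside the base tier
```

Multiply the per-request figure by your monthly request count, then add output cost at the standard rate.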
Three ratios to memorize
1. Output is 5x more expensive than input (for all models). A 1K-token output costs the same as a 5K-token input. Every prompt engineering choice that reduces output length saves 5x more than the same reduction in input.
2. Opus is 5x more expensive than Haiku. A Haiku workload costing $100/month costs $500/month on Opus. Use the cheapest model that clears your quality bar. For a practical guide to matching tasks to models, see Haiku vs Sonnet vs Opus: which model to use.
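3. Cache reads are 10% of input price. If the same system prompt is reused across calls, every cache hit saves 90% on that input slice. A cache write costs only 25% more than standard input, so the premium is recovered after about 0.28 cache hits per write (0.25 / 0.90, or roughly 1.28 requests counting the write itself). In practice, caching pays off from the first hit. See the prompt caching break-even guide for the full calculation with worked examples.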
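The break-even in ratio 3 follows from the two cache multipliers alone. Here is a sketch of the arithmetic, with cost expressed in units of the standard input price for the cached slice. It assumes the cache entry stays alive between requests; `uncached_cost` and `cached_cost` are illustrative helper names, and the multipliers are the 125% and 10% figures from the caching table.

```python
WRITE_MULTIPLIER = 1.25  # cache write = 125% of standard input price
READ_MULTIPLIER = 0.10   # cache read = 10% of standard input price


def uncached_cost(requests: int) -> float:
    """Cost of sending the same prefix `requests` times without caching."""
    return float(requests)


def cached_cost(requests: int) -> float:
    """One cache write on the first request, then cache reads on the rest."""
    return WRITE_MULTIPLIER + READ_MULTIPLIER * (requests - 1)


# Reads needed after the write to recover the 25% write premium,
# given that each read saves 90% of the standard price.
break_even_reads = (WRITE_MULTIPLIER - 1) / (1 - READ_MULTIPLIER)
print(f"break-even after {break_even_reads:.2f} cache reads")

for n in (1, 2, 10):
    print(n, uncached_cost(n), cached_cost(n))
```

Under these assumptions a single request costs 25% more with caching, and the second request already puts the cached path ahead. Counting the write itself, break-even falls at about 1.28 requests.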
Worked cost examples
Example 1: High-volume classification
- Task: classify user messages into 12 categories
- Input: 500 tokens (message + system prompt)
- Output: 10 tokens (one label + confidence)
- Volume: 200,000 requests/month
- Model: Haiku 4.5
Calculation:
- Input: 200,000 × 500 tokens = 100M tokens → $100
- Output: 200,000 × 10 tokens = 2M tokens → $10
- Total: $110/month
If you used Opus: $500 input + $50 output = $550/month. That is $440/month wasted if Haiku clears your quality bar.
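To compare models for a workload like this one, the same arithmetic can be run against every row of the standard pricing table. This is a minimal sketch: `STANDARD_PRICES` and `monthly_cost` are illustrative names, and the dictionary keys are shorthand labels rather than API model IDs.

```python
# USD per 1M tokens (input, output), copied from the standard pricing table.
STANDARD_PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.7": (5.00, 25.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int, requests: int) -> float:
    """Monthly USD cost for a fixed per-request token profile, without caching or batch."""
    input_rate, output_rate = STANDARD_PRICES[model]
    per_request = input_tokens * input_rate + output_tokens * output_rate
    return per_request * requests / 1_000_000


# Example 1 profile: 500 input tokens, 10 output tokens, 200,000 requests/month.
for model in STANDARD_PRICES:
    print(f"{model}: ${monthly_cost(model, 500, 10, 200_000):,.2f}/month")
```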
Example 2: Customer support drafts
- Task: generate reply drafts for support tickets
- Input: 2,000 tokens (ticket + system prompt + few-shot examples)
- Output: 300 tokens (draft reply)
- Volume: 30,000 requests/month
- Model: Sonnet 4.6
- Caching: system prompt (1,200 tokens) cached across all requests
Without caching:
- Input: 30,000 × 2,000 = 60M tokens → $180
- Output: 30,000 × 300 = 9M tokens → $135
- Total: $315/month
With prompt caching:
- Cache write: 1,200 tokens × 1 write = 1,200 tokens → $0.005 (negligible)
- Cache reads: 1,200 tokens × 30,000 = 36M tokens → $10.80
- Non-cached input: 800 tokens × 30,000 = 24M tokens → $72
- Output: unchanged → $135
- Total with caching: $217.80/month (31% savings)
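The caching comparison above can be reproduced line by line. This sketch hard-codes the Sonnet 4.6 rates from the tables and follows the example in assuming a single cache write for the month. Cache entries do not live forever, so the number of writes is left as a parameter. All function and constant names are illustrative.

```python
# Sonnet 4.6 rates in USD per 1M tokens, from the tables above.
INPUT, OUTPUT, CACHE_WRITE, CACHE_READ = 3.00, 15.00, 3.75, 0.30


def usd(tokens: int, rate_per_million: float) -> float:
    """Convert a token count to USD at a per-1M-token rate."""
    return tokens * rate_per_million / 1_000_000


def support_draft_costs(requests: int, prompt_tokens: int, cached_tokens: int,
                        output_tokens: int, cache_writes: int = 1) -> tuple[float, float]:
    """Return (cost without caching, cost with caching) in USD per month."""
    output_cost = usd(requests * output_tokens, OUTPUT)
    without = usd(requests * prompt_tokens, INPUT) + output_cost
    with_cache = (
        usd(cache_writes * cached_tokens, CACHE_WRITE)            # writing the prefix
        + usd(requests * cached_tokens, CACHE_READ)               # reading it back
        + usd(requests * (prompt_tokens - cached_tokens), INPUT)  # uncached remainder
        + output_cost
    )
    return without, with_cache


without, with_cache = support_draft_costs(30_000, 2_000, 1_200, 300)
print(f"without caching: ${without:.2f}")
print(f"with caching:    ${with_cache:.2f} ({1 - with_cache / without:.0%} saved)")
```

Raising `cache_writes` shows how little the total moves even if the prefix has to be rewritten many times a day.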
Example 3: Document summarization (1M context)
- Task: summarize 400K-token legal contracts
- Input: 400,000 tokens per request
- Output: 800 tokens per summary
- Volume: 200 requests/month
- Model: Opus 4.7
Calculation:
- First 200K tokens: 200,000 × 200 = 40M tokens → $200
- Extended (200K-400K): 200,000 × 200 = 40M tokens at $10/1M → $400
- Output: 200 × 800 = 160,000 tokens → $4
- Total: $604/month
Note: the same workload on Sonnet 4.6 would cost $120 + $240 = $360 input + $2.40 output = $362.40/month, saving $241.60/month with minimal quality loss in most summarization tasks. Test before assuming Opus is required.
Example 4: Batch API for nightly data enrichment
- Task: enrich 50,000 product records with descriptions
- Input: 300 tokens per record
- Output: 200 tokens per record
- Model: Sonnet 4.6, Batch API
Without batch (standard):
- Input: 50,000 × 300 = 15M tokens → $45
- Output: 50,000 × 200 = 10M tokens → $150
- Total: $195/run
With Batch API:
- Input: 15M tokens at $1.50/1M → $22.50
- Output: 10M tokens at $7.50/1M → $75
- Total: $97.50/run (50% savings)
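At two runs per week, that is $390/week at standard rates versus $195/week with the Batch API: $195/week saved, or about $845/month (at 52/12 ≈ 4.33 weeks per month).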
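Since the batch discount is a flat 50% on both input and output, the per-run saving is easy to script. This is a sketch using the Sonnet 4.6 rates from the tables; `run_cost` and `BATCH_DISCOUNT` are illustrative names.

```python
BATCH_DISCOUNT = 0.50                      # Batch API: 50% off standard rates
SONNET_INPUT, SONNET_OUTPUT = 3.00, 15.00  # USD per 1M tokens, standard API


def run_cost(records: int, input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """USD cost of one enrichment run over `records` records."""
    per_record = input_tokens * SONNET_INPUT + output_tokens * SONNET_OUTPUT
    standard = records * per_record / 1_000_000
    return standard * (1 - BATCH_DISCOUNT) if batch else standard


standard = run_cost(50_000, 300, 200)
batched = run_cost(50_000, 300, 200, batch=True)
print(f"standard: ${standard:.2f}, batch: ${batched:.2f}, saved per run: ${standard - batched:.2f}")
```

Multiply the per-run saving by the number of runs in your billing period to get the monthly figure.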
How to calculate your own costs
Step 1: Estimate token volumes
Use the token counting endpoint (`client.messages.count_tokens` in the Python SDK) to measure actual token counts for your prompts rather than estimating:
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.count_tokens(
    model="claude-sonnet-4-6",
    system="Your system prompt here",
    messages=[{"role": "user", "content": "Sample user message"}],
)
print(f"Input tokens: {response.input_tokens}")
```
Step 2: Calculate cost
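```python
def estimate_monthly_cost(
    model: str,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    requests_per_month: int,
    cached_tokens_per_request: int = 0,
) -> float:
    # Minimal sketch: cached tokens bill at the cache-read rate; cache writes are ignored.
    # (input, output, cache read) in USD per 1M tokens, from the tables above.
    # Keys are shorthand labels chosen for this sketch, not official model IDs.
    prices = {"haiku": (1, 5, 0.10), "sonnet": (3, 15, 0.30), "opus": (5, 25, 0.50)}
    inp, out, cache = prices[model]
    fresh = input_tokens_per_request - cached_tokens_per_request
    per_request = fresh * inp + cached_tokens_per_request * cache + output_tokens_per_request * out
    return requests_per_month * per_request / 1_000_000
```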