All information verified from Anthropic's official documentation as of May 2026.
Models
| Model | ID | Context | Best For |
|---|---|---|---|
| Claude Opus 4.7 | claude-opus-4-7 | 1M | Complex reasoning, hardest tasks |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 1M | Most production workloads |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | 200K | Fast, simple tasks |
💡 Opus 4.7 and Sonnet 4.6 both support a 1M-token context window at the standard per-token rate, with no long-context surcharge.
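To make the table concrete, here is a minimal sketch of a context-window check. The `MODELS` dict and `fits_context` helper are hypothetical names, not part of any SDK. It assumes "1M" means 1,000,000 tokens and "200K" means 200,000, that input and output tokens share the same window, and that you already have a token count from somewhere else.

```python
# Context windows from the table above, in tokens.
# Assumption: "1M" = 1_000_000 and "200K" = 200_000.
MODELS = {
    "claude-opus-4-7": 1_000_000,
    "claude-sonnet-4-6": 1_000_000,
    "claude-haiku-4-5-20251001": 200_000,
}


def fits_context(model_id: str, input_tokens: int, max_tokens: int) -> bool:
    """Return True if the input plus the requested output fits the window."""
    window = MODELS[model_id]
    return input_tokens + max_tokens <= window


# A 250K-token prompt is too large for Haiku 4.5 but fits Sonnet 4.6.
print(fits_context("claude-haiku-4-5-20251001", 250_000, 1024))  # False
print(fits_context("claude-sonnet-4-6", 250_000, 1024))          # True
```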
API Pricing (per million tokens)
| Model | Input | Output | Batch Input | Batch Output |
|---|---|---|---|---|
| Opus 4.7 | $5.00 | $25.00 | $2.50 | $12.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $1.50 | $7.50 |
| Haiku 4.5 | $1.00 | $5.00 | $0.50 | $2.50 |
⚠️ Batch API = 50% discount, but requests run asynchronously and can take up to 24 hours to complete.
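If you want to turn the pricing table into a number for your own workload, a rough estimator might look like the sketch below. The `PRICES` keys and the `estimate_cost` function are made-up names for illustration. The rates are copied from the table above, and the sketch ignores caching entirely.

```python
# USD per million tokens, copied from the pricing table above.
PRICES = {
    "opus-4.7": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}
BATCH_DISCOUNT = 0.5  # Batch API is billed at 50% of the standard rate


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Estimate the USD cost for a given number of input and output tokens."""
    rate = PRICES[model]
    cost = (input_tokens * rate["input"]
            + output_tokens * rate["output"]) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


# 2M input tokens and 500K output tokens on Sonnet 4.6
print(estimate_cost("sonnet-4.6", 2_000_000, 500_000))              # 13.5
print(estimate_cost("sonnet-4.6", 2_000_000, 500_000, batch=True))  # 6.75
```

The batch figure is simply half the standard one, which matches the Batch Input and Batch Output columns.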
Prompt Caching
| Cache Type | Cost |
|---|---|
| Cache write (5 min TTL) | 1.25x input rate |
| Cache write (1 hour TTL) | 2x input rate |
| Cache hit | ~0.1x input rate (up to 90% savings) |
💡 Best for: system prompts, repeated context, long documents.
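To see when caching pays off, here is a small arithmetic sketch using the multipliers from the table. `prefix_cost` is a hypothetical helper. It makes a simplifying assumption: the first call pays the cache write and every later call lands inside the TTL as a cache hit. It only prices the shared prefix, not the rest of each request.

```python
from typing import Optional

CACHE_WRITE_5M = 1.25  # multiplier on the input rate, 5 minute TTL
CACHE_WRITE_1H = 2.0   # multiplier on the input rate, 1 hour TTL
CACHE_HIT = 0.1        # approximate multiplier for a cache read


def prefix_cost(prefix_tokens: int, calls: int, input_rate: float,
                write_multiplier: Optional[float] = None) -> float:
    """USD cost of sending the same prefix `calls` times (calls >= 1).

    With write_multiplier=None the prefix is billed at the full input rate
    on every call. Otherwise the first call pays the cache write and the
    remaining calls are assumed to be cache hits.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    per_token = input_rate / 1_000_000  # input_rate is USD per million tokens
    if write_multiplier is None:
        return prefix_tokens * calls * per_token
    write = prefix_tokens * write_multiplier * per_token
    hits = prefix_tokens * (calls - 1) * CACHE_HIT * per_token
    return write + hits


# A 10,000-token system prompt sent 100 times on Sonnet 4.6 ($3.00 input)
print(f"${prefix_cost(10_000, 100, 3.00):.4f}")                  # $3.0000
print(f"${prefix_cost(10_000, 100, 3.00, CACHE_WRITE_5M):.4f}")  # $0.3345
```

Under these assumptions the 5-minute cache is already cheaper on the second call (1.25 + 0.1 < 2), while the 1-hour cache needs a third call to break even (2 + 0.1 + 0.1 < 3).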
Subscription Plans (not API)
| Plan | Monthly | Annual | Notes |
|---|---|---|---|
| Free | $0 | $0 | Sonnet 4.6, rolling 5hr limit |
| Pro | $20/mo | $17/mo | All models + Claude Code |
| Max 5x | $100/mo | — | 5x Pro usage |
| Max 20x | $200/mo | — | 20x Pro usage |
| Team Standard | $25/seat | $20/seat | Min 5 seats |
| Team Premium | $125/seat | $100/seat | Includes Claude Code & Cowork |
⚠️ Subscriptions ≠ API access. API is always billed separately per token.
Basic API Call
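import anthropic

# Pass the key explicitly, or omit api_key to read ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic(api_key="your_key")

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,  # upper bound on generated output tokens
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# The response content is a list of blocks; the first one holds the text here.
print(message.content[0].text)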
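One caveat about the last line: `message.content` is a list of content blocks, and indexing `[0].text` assumes the first block is a text block. A slightly more defensive version might look like this sketch. `extract_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a real response so the example runs without an API key.

```python
from types import SimpleNamespace


def extract_text(message) -> str:
    """Join the text of every text block in a Messages API response."""
    return "".join(
        block.text for block in message.content
        if getattr(block, "type", None) == "text"
    )


# Stand-in for a real response object, so this sketch runs offline.
fake = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Hello"),
    SimpleNamespace(type="tool_use", name="lookup", input={}),
    SimpleNamespace(type="text", text=" there!"),
])
print(extract_text(fake))  # Hello there!
```

With a real client you would call `extract_text(message)` instead of `message.content[0].text`.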
Batch API
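import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Each request needs a unique custom_id so results can be matched back later.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "request-1",
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Hello!"}],
            },
        }
    ]
)

print(batch.id)  # keep this ID to check status and fetch results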
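Creating the batch only gives you an ID, so you still need to wait for it and collect the output. The sketch below shows one way to poll. `wait_for_batch` is a hypothetical helper, and the `batches.retrieve` and `batches.results` calls and the `processing_status` and `result.type` fields reflect my understanding of the Python SDK, so check them against the current API reference. The client is passed in as an argument, and the sleep function is injectable so the loop can be tested without waiting.

```python
import time


def wait_for_batch(client, batch_id: str, poll_seconds: float = 60.0,
                   sleep=time.sleep) -> dict:
    """Poll until the batch ends, then return {custom_id: text or None}."""
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":
            break
        sleep(poll_seconds)  # still in progress, wait before asking again

    outputs = {}
    for entry in client.messages.batches.results(batch_id):
        if entry.result.type == "succeeded":
            # Assumes the first content block of each message is text.
            outputs[entry.custom_id] = entry.result.message.content[0].text
        else:
            # Errored, canceled, or expired requests carry no message.
            outputs[entry.custom_id] = None
    return outputs
```

Usage would be something like `results = wait_for_batch(client, batch.id)`. The loop has no timeout of its own, so for real workloads you may want to cap the total wait at the 24-hour processing window.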
Key Tips
- Haiku → simple classification, summaries, high-volume tasks
- Sonnet → most production use cases, best price/performance
- Opus → complex reasoning only (about 1.7x Sonnet's price and 5x Haiku's)
- Use Batch API for non-realtime workloads (50% cheaper)
- Use prompt caching for repeated system prompts (up to 90% cheaper)
- Opus 4.7 uses a new tokenizer and may consume up to 35% more tokens than Opus 4.6 for the same input. Update your `max_tokens` accordingly.
- Opus 4.7 rejects non-default `temperature`, `top_p`, and `top_k` (returns a 400 error), so omit these parameters entirely.
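The two Opus 4.7 tips can be folded into one small pre-flight step. The sketch below is only an illustration of the notes above. `adapt_for_opus_4_7` is a hypothetical helper, the 1.35 headroom factor is the worst case quoted in this post rather than a measured value, and the function drops the sampling parameters unconditionally instead of checking whether they are at their defaults.

```python
import math

SAMPLING_PARAMS = ("temperature", "top_p", "top_k")
TOKENIZER_HEADROOM = 1.35  # "up to 35% more tokens", per the tip above


def adapt_for_opus_4_7(params: dict) -> dict:
    """Return a copy of request params adjusted for Opus 4.7.

    Params for any other model are returned as an unchanged copy.
    """
    adapted = dict(params)
    if not adapted.get("model", "").startswith("claude-opus-4-7"):
        return adapted
    for name in SAMPLING_PARAMS:
        adapted.pop(name, None)  # omit sampling params entirely
    if "max_tokens" in adapted:
        # Scale a budget that was tuned for the older tokenizer.
        adapted["max_tokens"] = math.ceil(adapted["max_tokens"] * TOKENIZER_HEADROOM)
    return adapted


request = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "temperature": 0.2,
    "messages": [{"role": "user", "content": "Hello!"}],
}
adapted = adapt_for_opus_4_7(request)
print(sorted(adapted))        # ['max_tokens', 'messages', 'model']
print(adapted["max_tokens"])  # 1383
```

You could then pass the result to the client with `client.messages.create(**adapted)`. The original dict is left untouched, so the same request can still be sent to Sonnet or Haiku as written.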
If this was useful, a ❤️ helps more than you'd think.
HiyokoBar:https://hiyokoko.gumroad.com/l/hiyokobar
X: @hiyoyok