hiyoyo

Claude API Cheatsheet 2026 — Models, Pricing, Limits in One Place

All information verified against Anthropic's official documentation as of May 2026.


Models

| Model | ID | Context | Best For |
| --- | --- | --- | --- |
| Claude Opus 4.7 | claude-opus-4-7 | 1M | Complex reasoning, hardest tasks |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 1M | Most production workloads |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | 200K | Fast, simple tasks |

💡 Opus 4.7 and Sonnet 4.6 both support a 1M-token context window at the flat rate, with no long-context surcharge.


API Pricing (per million tokens)

| Model | Input | Output | Batch Input | Batch Output |
| --- | --- | --- | --- | --- |
| Opus 4.7 | $5.00 | $25.00 | $2.50 | $12.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $1.50 | $7.50 |
| Haiku 4.5 | $1.00 | $5.00 | $0.50 | $2.50 |

⚠️ Batch API = 50% discount, but requests are processed asynchronously, with results delivered within 24 hours rather than in real time.
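To make the rates above concrete, here is a minimal sketch of a cost estimator. The `PRICES` keys and the `estimate_cost` function are illustrative names made up for this example, not part of any SDK; the numbers are copied from the pricing table, and the batch option simply applies the 50% discount.

```python
# Rates in USD per million tokens, copied from the pricing table above.
# The short keys ("opus-4.7", ...) are illustrative labels, not API model IDs.
PRICES = {
    "opus-4.7": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}

BATCH_DISCOUNT = 0.5  # Batch API is billed at 50% of the standard rate


def estimate_cost(model: str, input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Return the estimated cost in USD for one request (caching not included)."""
    rates = PRICES[model]
    cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


if __name__ == "__main__":
    # 200K input tokens and 10K output tokens on Sonnet 4.6
    print(f"standard: ${estimate_cost('sonnet-4.6', 200_000, 10_000):.4f}")
    print(f"batch:    ${estimate_cost('sonnet-4.6', 200_000, 10_000, batch=True):.4f}")
```

In a real application the token counts would come from the `usage` field of each API response rather than being typed in by hand.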


Prompt Caching

| Cache Type | Cost |
| --- | --- |
| Cache write (5 min TTL) | 1.25x input rate |
| Cache write (1 hour TTL) | 2x input rate |
| Cache hit | ~0.1x input rate (up to 90% savings) |

💡 Best for: system prompts, repeated context, long documents.
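The multipliers above are easier to judge with numbers. The following is a rough sketch, not an official formula: `cached_input_cost` is a hypothetical helper, and it assumes the first call writes the cache and every later call arrives before the TTL expires and counts as a hit. It covers only the shared prefix, not the per-request input or the output tokens.

```python
def cached_input_cost(
    prefix_tokens: int,
    calls: int,
    input_rate: float,
    write_multiplier: float = 1.25,  # 5-minute TTL; use 2.0 for the 1-hour TTL
    hit_multiplier: float = 0.1,     # approximate cache-hit rate from the table
) -> tuple[float, float]:
    """Return (cost_without_cache, cost_with_cache) in USD for the shared prefix.

    input_rate is the model's input price in USD per million tokens.
    Assumes one cache write followed by (calls - 1) cache hits.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    per_token = input_rate / 1_000_000
    uncached = prefix_tokens * per_token * calls
    cached = prefix_tokens * per_token * (write_multiplier + hit_multiplier * (calls - 1))
    return uncached, cached


if __name__ == "__main__":
    # A 50K-token system prompt reused across 20 calls at Sonnet 4.6's $3.00 input rate
    without, with_cache = cached_input_cost(50_000, 20, 3.00)
    print(f"without cache: ${without:.4f}, with cache: ${with_cache:.4f}")
```

Under these assumptions a single call costs more with caching (1.25x) than without, so caching starts to pay off from the second call that reuses the same prefix.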


Subscription Plans (not API)

| Plan | Monthly | Annual | Notes |
| --- | --- | --- | --- |
| Free | $0 | $0 | Sonnet 4.6, rolling 5-hour limit |
| Pro | $20/mo | $17/mo | All models + Claude Code |
| Max 5x | $100/mo | | 5x Pro usage |
| Max 20x | $200/mo | | 20x Pro usage |
| Team Standard | $25/seat | $20/seat | Min 5 seats |
| Team Premium | $125/seat | $100/seat | Includes Claude Code & Cowork |

⚠️ Subscriptions ≠ API access. API is always billed separately per token.


Basic API Call

import anthropic

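# Replace "your_key" with a real key. Calling anthropic.Anthropic() with no
# arguments reads ANTHROPIC_API_KEY from the environment instead, which avoids
# hard-coding a secret in source.
client = anthropic.Anthropic(api_key="your_key")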

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(message.content[0].text)
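`message.content` is a list of content blocks, and `content[0].text` assumes the first block is a text block. That holds for the simple call above, but it may not once other block types (for example tool use) appear in a response. A slightly more defensive sketch follows; `collect_text` is a hypothetical helper, and the `SimpleNamespace` objects only mimic the shape of response blocks so the example runs without an API key.

```python
from types import SimpleNamespace


def collect_text(content) -> str:
    """Join the text of every block whose type is "text", skipping other block types."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )


if __name__ == "__main__":
    # Stand-in objects that mimic message.content; these are not real SDK types.
    fake_content = [
        SimpleNamespace(type="text", text="Hello"),
        SimpleNamespace(type="tool_use", name="lookup", input={}),
        SimpleNamespace(type="text", text=", world"),
    ]
    print(collect_text(fake_content))
```

With a real response this would be called as `collect_text(message.content)`.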

Batch API

import anthropic

client = anthropic.Anthropic()

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "request-1",
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Hello!"}]
            }
        }
    ]
)
print(batch.id)
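The call above only submits the batch; the results have to be fetched later. Below is a sketch of the full round trip under a few assumptions: `build_batch_requests` and `run_batch` are hypothetical helper names, the 60-second polling interval is arbitrary, and only succeeded entries whose first content block is text are collected. The SDK is imported inside `run_batch` so the request builder can be used and tested without it installed.

```python
import time


def build_batch_requests(prompts, model="claude-sonnet-4-6", max_tokens=1024):
    """Turn a list of prompt strings into Batch API request entries with unique custom_ids."""
    return [
        {
            "custom_id": f"request-{i}",
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for i, prompt in enumerate(prompts, start=1)
    ]


def run_batch(prompts, poll_seconds=60):
    """Submit a batch, wait until it ends, and return {custom_id: text} for succeeded entries."""
    import anthropic  # imported here so build_batch_requests works without the SDK

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    batch = client.messages.batches.create(requests=build_batch_requests(prompts))

    # Poll until processing has ended; this can take a long time for large batches.
    while client.messages.batches.retrieve(batch.id).processing_status != "ended":
        time.sleep(poll_seconds)

    outputs = {}
    for entry in client.messages.batches.results(batch.id):
        # Entries that errored, expired, or were canceled are skipped in this sketch.
        if entry.result.type == "succeeded":
            outputs[entry.custom_id] = entry.result.message.content[0].text
    return outputs
```

The `custom_id` is what ties each result back to its request, since results are not guaranteed to come back in submission order.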

Key Tips

  • Haiku → simple classification, summaries, high-volume tasks
  • Sonnet → most production use cases, best price/performance
  • Opus → complex reasoning only (about 1.7x the price of Sonnet and 5x the price of Haiku)
  • Use Batch API for non-realtime workloads (50% cheaper)
  • Use prompt caching for repeated system prompts (up to 90% cheaper)
  • Opus 4.7 uses a new tokenizer that may consume up to 35% more tokens than Opus 4.6 for the same input. Raise your max_tokens accordingly.
  • Opus 4.7 rejects non-default temperature, top_p, and top_k with a 400 error, so omit these parameters entirely.
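The two Opus 4.7 tips above can be applied in one place before a request is sent. This is only a sketch based on the behavior described in this post: `prepare_params` is a hypothetical helper, the model check is a simple prefix match on the ID from the models table, and the 35% headroom is the worst-case figure quoted above, not a measured value.

```python
SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def prepare_params(params: dict, headroom_percent: int = 35) -> dict:
    """Return a copy of params adjusted for Opus 4.7; other models pass through unchanged.

    - Drops temperature, top_p, and top_k.
    - Raises max_tokens by headroom_percent, rounded up to a whole token.
    """
    if not params.get("model", "").startswith("claude-opus-4-7"):
        return dict(params)

    adjusted = {k: v for k, v in params.items() if k not in SAMPLING_PARAMS}
    if "max_tokens" in adjusted:
        # Integer ceiling division avoids floating-point rounding surprises.
        scaled = adjusted["max_tokens"] * (100 + headroom_percent)
        adjusted["max_tokens"] = (scaled + 99) // 100
    return adjusted


if __name__ == "__main__":
    request = {
        "model": "claude-opus-4-7",
        "max_tokens": 1024,
        "temperature": 0.2,
        "messages": [{"role": "user", "content": "Hello!"}],
    }
    print(prepare_params(request))
```

The result can be passed on as `client.messages.create(**prepare_params(request))`. Whether to scale max_tokens at all depends on how tight the original limit was, so treat the headroom as a tunable default.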

If this was useful, a ❤️ helps more than you'd think.

HiyokoBar: https://hiyokoko.gumroad.com/l/hiyokobar
X: @hiyoyok
