Originally published at claudeguide.io/anthropic-batch-api-guide
Anthropic Message Batches API: 50% Cost Reduction for Bulk Processing
The Anthropic Message Batches API processes large volumes of requests asynchronously at 50% of standard pricing. Instead of sending requests one by one and paying full price, you batch up to 10,000 requests, submit them together, and retrieve results within 24 hours (typically 1–4 hours). The trade-off is latency: you cannot use batches for real-time user interactions. Use batches for document processing, data enrichment, content generation at scale, and any task where you can tolerate multi-hour turnaround.
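A dataset larger than the per-batch limit quoted above has to be split across several batches before submission. The following is a minimal sketch of that split, not part of the Anthropic SDK: the helper name `chunked` and the constant `MAX_REQUESTS_PER_BATCH` are hypothetical, and the limit is taken from this article rather than checked against current documentation.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Per-batch request limit as quoted in this article (an assumption; confirm it
# against the current API documentation before relying on it).
MAX_REQUESTS_PER_BATCH = 10_000


def chunked(items: Iterable[T], size: int = MAX_REQUESTS_PER_BATCH) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # islice pulls the next `size` items; an empty list ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


if __name__ == "__main__":
    sizes = [len(chunk) for chunk in chunked(range(25_000))]
    print(sizes)  # [10000, 10000, 5000]
```

Each yielded chunk could then be submitted as its own batch, with the returned batch IDs stored for later retrieval.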
When to use the Batches API
Use batches when:
- You're processing a large dataset offline (document analysis, data extraction)
- The task can tolerate hours of delay (not user-facing)
- Cost reduction matters more than immediate results
- You need to process 100+ similar requests
Use the real-time API when:
- Users are waiting for results
- Latency under 30 seconds is required
- Request count is under 50
Cost comparison (Sonnet 4 as of April 2026):
| | Standard | Batch |
|---|---|---|
| Input | $3/M tokens | $1.50/M tokens |
| Output | $15/M tokens | $7.50/M tokens |
| Latency | 1–30 seconds | 1–24 hours |
At 1 million input tokens and 1 million output tokens per day, batches save $9 per day, or about $270 over a 30-day month.
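The saving depends on how the daily volume splits between input and output tokens, because output tokens cost five times as much. A minimal sketch of the arithmetic, using only the prices from the table above; the function name `monthly_savings` and the 30-day month are assumptions made for illustration.

```python
# Prices in USD per million tokens, copied from the table above.
STANDARD = {"input": 3.00, "output": 15.00}
BATCH = {"input": 1.50, "output": 7.50}


def monthly_savings(input_tokens_per_day: float,
                    output_tokens_per_day: float,
                    days: int = 30) -> float:
    """Return the USD saved over `days` days by using batch pricing."""

    def daily_cost(prices: dict) -> float:
        return (input_tokens_per_day * prices["input"]
                + output_tokens_per_day * prices["output"]) / 1_000_000

    return (daily_cost(STANDARD) - daily_cost(BATCH)) * days


if __name__ == "__main__":
    print(monthly_savings(1_000_000, 0))          # 45.0  (input tokens only)
    print(monthly_savings(0, 1_000_000))          # 225.0 (output tokens only)
    print(monthly_savings(1_000_000, 1_000_000))  # 270.0 (1M of each per day)
```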
Creating a batch
```python
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

# Placeholder input; replace with your own list of document strings
documents = ["First document text", "Second document text"]

# Prepare your requests (up to 10,000 per batch)
requests_data = [
    {
        "custom_id": f"extract-{i}",  # Your unique ID for tracking
        "params": {
            "model": "claude-sonnet-4-5",
            "max_tokens": 1024,
            "system": "Extract the key facts from this text as a JSON object.",
            "messages": [
                {"role": "user", "content": f"Extract from: {document}"}
            ],
        },
    }
    for i, document in enumerate(documents)
]

# Create the batch
batch = client.messages.batches.create(requests=requests_data)
print(f"Batch created: {batch.id}")
print(f"Status: {batch.processing_status}")  # "in_progress" right after creation
```
The `custom_id` is your identifier for each request. Use it to match results to inputs, because results are not guaranteed to come back in submission order. It must be unique within the batch and at most 64 characters long.
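One way to use the identifier is to keep a lookup from each `custom_id` to its source document and join results against it. The sketch below is illustrative only: `build_id_map`, `match_results`, and `fake_entry` are hypothetical helpers, the 64-character limit is the one stated in this article, and the result shape (`.custom_id`, `.result.type`, `.result.message.content[0].text`) is an assumption about what the SDK returns, simulated here with stand-in objects so the example runs offline.

```python
from types import SimpleNamespace
from typing import Dict, Iterable, List, Tuple

MAX_CUSTOM_ID_LENGTH = 64  # limit as stated in this article


def build_id_map(documents: List[str], prefix: str = "extract") -> Dict[str, str]:
    """Map a unique custom_id to each document, enforcing the length limit."""
    id_to_document: Dict[str, str] = {}
    for i, document in enumerate(documents):
        custom_id = f"{prefix}-{i}"  # unique because the index is unique
        if len(custom_id) > MAX_CUSTOM_ID_LENGTH:
            raise ValueError(f"custom_id too long: {custom_id!r}")
        id_to_document[custom_id] = document
    return id_to_document


def match_results(results: Iterable, id_to_document: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each succeeded result's text with its source document.

    Assumes every entry exposes `.custom_id` and `.result.type`, and that
    succeeded entries expose `.result.message.content[0].text`.
    """
    pairs = []
    for entry in results:
        if entry.result.type != "succeeded":
            continue  # failed or unfinished requests need separate handling
        text = entry.result.message.content[0].text
        pairs.append((id_to_document[entry.custom_id], text))
    return pairs


def fake_entry(custom_id: str, text: str = "", type_: str = "succeeded") -> SimpleNamespace:
    """Build a stand-in result object with the assumed shape (for the demo only)."""
    message = SimpleNamespace(content=[SimpleNamespace(text=text)])
    return SimpleNamespace(custom_id=custom_id,
                           result=SimpleNamespace(type=type_, message=message))


if __name__ == "__main__":
    id_map = build_id_map(["doc A", "doc B"])
    # Results deliberately arrive out of submission order.
    fake = [fake_entry("extract-1", "facts B"), fake_entry("extract-0", "facts A")]
    print(match_results(fake, id_map))
    # [('doc B', 'facts B'), ('doc A', 'facts A')]
```

With the official Python SDK, the iterable would presumably come from `client.messages.batches.results(batch_id)` once the batch has ended, provided its entries match the shape assumed here.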
Monitoring batch status
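```python
import time

def wait_for_batch(batch_id: str, poll_interval: int = 60):
    # Poll with the `client` created above until processing has ended.
    while (batch := client.messages.batches.retrieve(batch_id)).processing_status != "ended":
        time.sleep(poll_interval)
    return batch
```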