
Sangmin Lee
Posted on • Originally published at claudeguide.io

Anthropic Message Batches API: 50% Cost Reduction for Bulk Processing

Originally published at claudeguide.io/anthropic-batch-api-guide

The Anthropic Message Batches API processes large volumes of requests asynchronously at 50% of standard pricing. Instead of sending requests one by one at full price, you submit up to 100,000 requests (or 256 MB, whichever limit is reached first) in a single batch and retrieve the results within 24 hours (typically 1–4 hours). The trade-off is latency: batches are unsuitable for real-time user interactions. Use them for document processing, data enrichment, content generation at scale, and any other task that can tolerate a multi-hour turnaround.


When to use the Batches API

Use batches when:

  • You're processing a large dataset offline (document analysis, data extraction)
  • The task can tolerate hours of delay (not user-facing)
  • Cost reduction matters more than immediate results
  • You need to process 100+ similar requests

Use the real-time API when:

  • Users are waiting for results
  • Latency under 30 seconds is required
  • Request count is under 50

Cost comparison (Sonnet 4 as of April 2026):

| | Standard | Batch |
|---|---|---|
| Input | $3/M tokens | $1.50/M tokens |
| Output | $15/M tokens | $7.50/M tokens |
| Latency | 1–30 seconds | 1–24 hours |

At 1 million input tokens and 1 million output tokens per day, batches save $9 per day, or about $270 per month.
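To estimate savings for your own workload, the arithmetic can be sketched as below. The prices come from the comparison above; the function name `monthly_savings`, the 30-day month, and the example workload of 1 million input and 1 million output tokens per day are assumptions chosen for illustration.

```python
# Prices in USD per million tokens, copied from the comparison above.
STANDARD = {"input": 3.00, "output": 15.00}
BATCH = {"input": 1.50, "output": 7.50}


def monthly_savings(input_tokens_per_day: float,
                    output_tokens_per_day: float,
                    days: int = 30) -> float:
    """Return the USD saved per month by paying batch prices instead of standard prices."""

    def daily_cost(prices: dict) -> float:
        return ((input_tokens_per_day / 1e6) * prices["input"]
                + (output_tokens_per_day / 1e6) * prices["output"])

    return (daily_cost(STANDARD) - daily_cost(BATCH)) * days


# Illustrative workload: 1M input + 1M output tokens per day.
print(monthly_savings(1_000_000, 1_000_000))  # 270.0
```

Output tokens dominate the result because they cost five times as much as input tokens, so a workload with long responses saves more than one with long prompts.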


Creating a batch

```python
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

# Placeholder input; replace with your own texts
documents = ["First document text...", "Second document text..."]

# Prepare your requests, one per document
requests_data = [
    {
        "custom_id": f"extract-{i}",  # Your unique ID for tracking
        "params": {
            "model": "claude-sonnet-4-5",
            "max_tokens": 1024,
            "system": "Extract the key facts from this text as a JSON object.",
            "messages": [
                {"role": "user", "content": f"Extract from: {document}"}
            ]
        }
    }
    for i, document in enumerate(documents)
]

# Create the batch
batch = client.messages.batches.create(requests=requests_data)
print(f"Batch created: {batch.id}")
print(f"Status: {batch.processing_status}")
# Status: in_progress
```

The custom_id is your identifier for each request. Use it to match results to inputs. It must be unique within the batch and at most 64 characters long.
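Matching results back to inputs by custom_id could look roughly like the sketch below. The helper `match_results` and the `SimpleNamespace` stand-in are hypothetical and only mimic the shape of a result entry (`custom_id`, `result.type`, `result.message`); in real use you would pass in the entries returned by `client.messages.batches.results(batch.id)`. The sketch also assumes the first content block of a successful message is text.

```python
from types import SimpleNamespace


def match_results(inputs_by_id: dict, results) -> dict:
    """Pair each batch result with its original input via custom_id."""
    matched = {}
    for entry in results:
        if entry.result.type == "succeeded":
            # Assumption: the first content block of the message is text.
            output = entry.result.message.content[0].text
        else:
            output = None  # errored, canceled, or expired
        matched[entry.custom_id] = {
            "input": inputs_by_id[entry.custom_id],
            "status": entry.result.type,
            "output": output,
        }
    return matched


# Stand-in for one result entry, used here so the sketch runs without an API call.
fake_entry = SimpleNamespace(
    custom_id="extract-0",
    result=SimpleNamespace(
        type="succeeded",
        message=SimpleNamespace(content=[SimpleNamespace(text='{"facts": []}')]),
    ),
)

print(match_results({"extract-0": "First document text..."}, [fake_entry]))
```

Keying on custom_id rather than on position matters because you should not rely on results coming back in the order the requests were submitted.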


Monitoring batch status


```python
import time

def wait_for_batch(batch_id: str, poll_interval: int = 60) -
```
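The listing above breaks off at the function signature. A minimal sketch of what such a polling helper might look like follows; it is not the author's implementation. It assumes the batch is finished once `processing_status` equals `"ended"`, and it adds two parameters that are not in the original signature: an explicit `client` argument (so the function does not depend on a global) and a hypothetical `max_polls` cap.

```python
import time
from typing import Optional


def wait_for_batch(client, batch_id: str, poll_interval: float = 60,
                   max_polls: Optional[int] = None):
    """Poll until the batch's processing_status is "ended", then return the batch."""
    polls = 0
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":
            return batch
        polls += 1
        # max_polls is an illustrative safety cap, not part of the API.
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(
                f"Batch {batch_id} still {batch.processing_status} after {polls} polls"
            )
        time.sleep(poll_interval)
```

With the client from the previous section, a call would be `final = wait_for_batch(client, batch.id)`. A 60-second interval is a reasonable default given that batches take minutes to hours; polling faster only adds API calls.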
