Anthropic shipped Claude Fable 5 on June 9, 2026. If you build software with Claude, the practical change is the model name: claude-fable-5. It uses the same Messages API shape as earlier Claude models, so most migrations are a string swap plus normal validation around streaming, tool use, errors, and costs. If you have already integrated Claude, this will feel similar to moving to the Claude Opus 4.8 API.
TL;DR
- Create an API key in the Anthropic Console.
- Export it as ANTHROPIC_API_KEY.
- Send a POST request to the Messages API.
- Use model: "claude-fable-5".
- Stream long responses.
- Read usage from every response to calculate cost.
Pricing is $10 per million input tokens and $50 per million output tokens.
Before you start
You need four things before making your first request.
1. An Anthropic account
   Sign up at console.anthropic.com. Use the Console to manage API keys, usage, and billing.
2. An API key
   Create one under API Keys. Copy it immediately and store it securely. Treat it like a password.
3. Billing or an Enterprise plan
   Fable 5 is available on the standard Claude API and on consumption-based Enterprise plans. Confirm billing before sending production traffic. If you are still evaluating the model, read the overview of what Claude Fable 5 is.
4. An SDK or HTTP client
   The official Anthropic SDKs are the easiest path, but raw HTTP works too.
Set your API key as an environment variable:
export ANTHROPIC_API_KEY="sk-ant-..."
Both the Python and TypeScript SDKs read ANTHROPIC_API_KEY automatically. Do not hard-code keys in source code. If a key leaks, rotate it in the Console immediately.
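If you want a missing key to fail at startup instead of at the first request, a small check like the one below can help. This is a minimal sketch: load_api_key is a hypothetical helper name, not part of the Anthropic SDK, and the error message is only an example.

```python
import os


def load_api_key(env=os.environ):
    """Return the Anthropic API key, or raise a clear error if it is missing.

    `env` is any mapping; it defaults to the process environment.
    """
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before starting the app."
        )
    return key
```

Calling load_api_key() once during startup surfaces a configuration problem before any traffic is sent. The SDK clients still read the variable on their own, so this check is optional.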
One behavior to know: Fable 5 includes safeguards that route a small slice of sensitive requests, such as cybersecurity, biology and chemistry, and model distillation attempts, to Claude Opus 4.8. This happens in under 5% of sessions. Your request still succeeds, but the returned model field may show a different model.
Make your first Claude Fable 5 API call
Start with curl so you can see the raw request.
Endpoint:
POST https://api.anthropic.com/v1/messages
Example:
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{
"model": "claude-fable-5",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": "Summarize what makes a good REST API in 3 bullet points."
}
]
}'
The endpoint is documented in the Anthropic Messages API reference.
Required headers:
| Header | Purpose |
|---|---|
| x-api-key | Your Anthropic API key |
| anthropic-version | API version, usually 2023-06-01 |
| content-type | Must be application/json |
Required body fields:
| Field | Purpose |
|---|---|
| model | Use claude-fable-5 |
| max_tokens | Maximum output tokens |
| messages | Conversation messages |
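A quick local check can catch a missing field before the request leaves your machine. The sketch below assumes the three required fields listed above; missing_fields and REQUIRED_FIELDS are hypothetical names, and the API itself remains the source of truth for validation.

```python
REQUIRED_FIELDS = ("model", "max_tokens", "messages")


def missing_fields(payload: dict) -> list:
    """Return the required Messages API fields that are absent from payload."""
    return [field for field in REQUIRED_FIELDS if field not in payload]


payload = {
    "model": "claude-fable-5",
    "messages": [{"role": "user", "content": "Hello"}],
}
problems = missing_fields(payload)  # max_tokens was left out on purpose
```

An empty list means the payload has the minimum shape. It does not prove that the values are valid, so a 400 response is still possible.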
Example response:
{
"id": "msg_01ABC...",
"type": "message",
"role": "assistant",
"model": "claude-fable-5",
"content": [
{
"type": "text",
"text": "- Predictable, resource-oriented URLs..."
}
],
"stop_reason": "end_turn",
"usage": {
"input_tokens": 18,
"output_tokens": 96
}
}
Important details:
- content is a list, not a string.
- A response can include text blocks, tool-use blocks, and thinking blocks.
- Always check each block’s type.
- stop_reason tells you why generation stopped.
- usage gives exact token counts for cost calculation.
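If you work with the raw JSON instead of an SDK object, the points above might translate into something like this. It is a minimal sketch: collect_text and was_truncated are hypothetical helper names, and the sample dictionary mirrors the example response shown earlier.

```python
def collect_text(response: dict) -> str:
    """Join the text of every text block, skipping other block types."""
    parts = []
    for block in response.get("content", []):
        if block.get("type") == "text":
            parts.append(block["text"])
    return "".join(parts)


def was_truncated(response: dict) -> bool:
    """True when generation stopped because max_tokens was reached."""
    return response.get("stop_reason") == "max_tokens"


sample = {
    "content": [
        {"type": "text", "text": "- Predictable, resource-oriented URLs..."}
    ],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 18, "output_tokens": 96},
}
text = collect_text(sample)
```

Checking the block type first means a tool-use or thinking block in the list will not raise a KeyError on a missing text field.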
Call Fable 5 from Python
Install the official SDK:
pip install anthropic
Basic request:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Summarize what makes a good REST API."
        }
    ],
)

# content is a list of blocks; print only the text ones
for block in response.content:
    if block.type == "text":
        print(block.text)
The SDK uses the same request shape as curl: model, max_tokens, and messages.
Add a system prompt
Use system for role, style, and output constraints:
response = client.messages.create(
    model="claude-fable-5",
    max_tokens=2048,
    system="You are a senior backend engineer. Be concise and use code examples.",
    messages=[
        {
            "role": "user",
            "content": "Write a Flask route that validates a JSON body."
        }
    ],
)

for block in response.content:
    if block.type == "text":
        print(block.text)
Use the system prompt for stable instructions that apply across the whole conversation. Avoid changing it on every request if you plan to use prompt caching later.
Stream long outputs
Use streaming for long responses. It improves perceived latency and helps avoid request timeouts.
with client.messages.stream(
    model="claude-fable-5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Explain idempotency keys for payment APIs."
        }
    ],
) as stream:
    # Print each text chunk as soon as it arrives
    for text in stream.text_stream:
        print(text, end="", flush=True)

    final = stream.get_final_message()
    print(f"\n\nTokens: {final.usage.output_tokens}")
stream.text_stream yields text chunks as they arrive. After the stream ends, stream.get_final_message() returns the complete final message, including usage.
Call Fable 5 from TypeScript / Node
Install the SDK:
npm install @anthropic-ai/sdk
Basic request:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic(); // reads ANTHROPIC_API_KEY
const msg = await client.messages.create({
model: "claude-fable-5",
max_tokens: 1024,
messages: [
{
role: "user",
content: "List 3 common API security mistakes.",
},
],
});
console.log(msg.content);
Extract only text blocks:
const text = msg.content
  // The type guard narrows each block so block.text type-checks
  .filter((block): block is Anthropic.TextBlock => block.type === "text")
  .map((block) => block.text)
  .join("");
console.log(text);
For frontend chat apps, stream from your backend route and forward chunks to the browser. Before wiring this into an app, test the request contract manually. The same workflow is covered in testing the ChatGPT API with Apidog.
Use tools / function calling with Fable 5
Tool use lets the model request calls to functions that you define. You provide a tool schema, the model decides whether to call it, your code runs the function, and then you send the result back.
Define a tool:
tools = [
{
"name": "get_order_status",
"description": "Look up the status of a customer order by ID.",
"input_schema": {
"type": "object",
"properties": {
"order_id": {
"type": "string"
}
},
"required": ["order_id"],
},
}
]
Send the tool definition with the request:
messages = [
{
"role": "user",
"content": "What's the status of order A1855?"
}
]
response = client.messages.create(
model="claude-fable-5",
max_tokens=1024,
tools=tools,
messages=messages,
)
If the model wants to use a tool, the response has:
response.stop_reason == "tool_use"
Then find the tool_use block, run your function, and send the result back:
if response.stop_reason == "tool_use":
    tool_use = next(
        block for block in response.content
        if block.type == "tool_use"
    )

    # Run your real function with the model's chosen input
    result = lookup_order(tool_use.input["order_id"])  # your code

    # Echo the assistant turn back, then attach the tool result
    messages.append({
        "role": "assistant",
        "content": response.content,
    })
    messages.append({
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result,
            }
        ],
    })

    followup = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
The critical field is tool_use_id. The tool_result must reference the exact id from the original tool_use block.
For multi-step agents, wrap this in a loop:
while True:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    # Stop on end_turn, max_tokens, or any other non-tool stop reason.
    # Breaking only on end_turn would repeat the same request forever.
    if response.stop_reason != "tool_use":
        break

    # Echo the assistant turn back before sending results
    messages.append({
        "role": "assistant",
        "content": response.content,
    })

    # A single turn can contain several tool_use blocks; answer each one
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": run_tool(block.name, block.input),
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({
        "role": "user",
        "content": tool_results,
    })
The Python SDK also includes a tool runner, but implementing the loop manually makes it easier to add logging, approvals, retries, or audit trails.
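The loop calls run_tool without defining it. One possible shape is a small registry that maps tool names to handlers, sketched below. Everything here is illustrative: TOOL_REGISTRY, the in-memory order table, and the JSON result format are assumptions, and lookup_order stands in for your real implementation.

```python
import json


def lookup_order(order_id: str) -> str:
    """Hypothetical stand-in for a real order lookup."""
    fake_orders = {"A1855": "shipped"}
    return fake_orders.get(order_id, "unknown")


# Map each tool name from the tools list to the function that serves it.
TOOL_REGISTRY = {
    "get_order_status": lambda args: lookup_order(args["order_id"]),
}


def run_tool(name: str, tool_input: dict) -> str:
    """Dispatch a tool call and always return a JSON string."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps({"result": handler(tool_input)})
    except Exception as exc:  # report the failure instead of crashing the loop
        return json.dumps({"error": str(exc)})
```

Returning an error string instead of raising keeps the conversation going, so the model can see what went wrong and try again. A dispatcher like this is also a natural place for the logging and approval steps mentioned above.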
Use adaptive thinking and effort
Fable 5 supports adaptive thinking. When enabled, the model decides when and how deeply to reason before answering.
Example:
response = client.messages.create(
model="claude-fable-5",
max_tokens=4096,
thinking={"type": "adaptive"},
output_config={"effort": "high"}, # low | medium | high
messages=[
{
"role": "user",
"content": "Design a retry strategy for a flaky webhook receiver."
}
],
)
Use this selectively.
Good use cases:
- multi-step design problems
- debugging plans
- architecture tradeoffs
- long-horizon agent tasks
Avoid it for:
- short lookups
- simple transformations
- one-line summaries
- low-cost high-volume routes
effort controls depth and token usage. Higher effort can produce more thorough answers, but costs more.
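One way to keep this selective is to build the request in one place and opt in per call. The sketch below reuses the thinking and output_config parameters exactly as they appear in the example above; build_request and its deep flag are hypothetical names, and the token limits are arbitrary.

```python
def build_request(prompt: str, deep: bool = False) -> dict:
    """Build keyword arguments for client.messages.create().

    Adaptive thinking and high effort are added only when deep is True.
    """
    params = {
        "model": "claude-fable-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    if deep:
        params["max_tokens"] = 4096  # leave room for a longer answer
        params["thinking"] = {"type": "adaptive"}
        params["output_config"] = {"effort": "high"}
    return params


cheap = build_request("Convert 'Hello World' to snake_case.")
thorough = build_request("Design a retry strategy for a flaky webhook receiver.", deep=True)
```

You would pass the result on as client.messages.create(**thorough). High-volume routes then stay on the cheaper default unless a caller asks for more depth.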
Handle errors and fallback behavior
Production integrations should catch typed SDK exceptions instead of parsing error strings.
import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Explain CORS preflight requests."
            }
        ],
    )
except anthropic.AuthenticationError:
    # 401: bad or missing API key
    print("Invalid API key. Rotate it in the Console and re-export.")
except anthropic.RateLimitError as e:
    # 429: too many requests
    retry_after = e.response.headers.get("retry-after", "60")
    print(f"Rate limited. Retry after {retry_after}s.")
except anthropic.BadRequestError as e:
    # 400: malformed request
    print(f"Bad request: {e.message}")
Common errors:
| Error | HTTP status | Cause | Fix |
|---|---|---|---|
| AuthenticationError | 401 | Missing, malformed, or revoked API key | Check ANTHROPIC_API_KEY and rotate if needed |
| RateLimitError | 429 | Too many requests or tokens | Back off and retry |
| BadRequestError | 400 | Invalid request shape | Check messages, max_tokens, roles, and JSON structure |
The SDK retries 429 and 5xx errors with exponential backoff by default. If you add custom retry logic, read the retry-after header.
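If you do write your own retry logic, the delay calculation could look roughly like this. It is a sketch, not the SDK's internal algorithm: retry_delay is a hypothetical helper, the base and cap values are arbitrary, and it assumes retry-after carries a number of seconds, falling back to jittered exponential backoff when the header is absent or unparseable.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a plain number of seconds
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not numeric (for example an HTTP date); use backoff instead
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a random delay up to the exponential ceiling
    return random.uniform(0, ceiling)
```

In a RateLimitError handler you would pass e.response.headers.get("retry-after") as the second argument and sleep for the returned value. The jitter spreads retries out so that many clients do not hit the API again at the same instant.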
Do not hard-fail on fallback model names
Fable 5 may route some sensitive requests to Claude Opus 4.8. This is not an error. The request succeeds, but response.model may not equal claude-fable-5.
Avoid this:
assert response.model == "claude-fable-5"
Prefer this:
actual_model = response.model
print(f"Answered by: {actual_model}")
If your application needs to know which model answered, read it from the returned response object.
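For logging or metrics, you could record the routing decision instead of asserting on it. The sketch below makes one assumption you should verify against real responses: a returned name that starts with the requested name (for example a dated snapshot) is treated as the same model. describe_routing is a hypothetical helper, and the model IDs in the usage lines are placeholders.

```python
def describe_routing(requested: str, actual: str) -> dict:
    """Summarize which model answered and whether a fallback occurred."""
    # Assumption: a suffix such as a snapshot date still counts as the same model.
    same_model = actual == requested or actual.startswith(requested + "-")
    return {
        "requested": requested,
        "actual": actual,
        "fallback": not same_model,
    }


direct = describe_routing("claude-fable-5", "claude-fable-5")
rerouted = describe_routing("claude-fable-5", "claude-opus-4-8")  # placeholder ID
```

With the SDK you would call describe_routing("claude-fable-5", response.model) and send the result to your logs. Tracking the fallback rate over time shows whether your traffic matches the small share described above.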
Estimate cost per request
Fable 5 pricing:
- $10 per million input tokens
- $50 per million output tokens
Every response includes exact usage:
response = client.messages.create(
model="claude-fable-5",
max_tokens=1024,
messages=[
{
"role": "user",
"content": "Write a SQL query to find duplicate emails."
}
],
)
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens
input_cost = input_tokens / 1_000_000 * 10
output_cost = output_tokens / 1_000_000 * 50
total = input_cost + output_cost
print(f"Input: {input_tokens} tokens = ${input_cost:.6f}")
print(f"Output: {output_tokens} tokens = ${output_cost:.6f}")
print(f"Total: ${total:.6f}")
Example:
2,000 input tokens * $10 / 1M = $0.020
500 output tokens * $50 / 1M = $0.025
Total = $0.045
Output tokens cost five times more than input tokens, so control output length first:
- set a reasonable max_tokens
- ask for concise answers in the system prompt
- avoid verbose formats unless needed
- stream long responses instead of increasing timeouts
The cost math is similar to the one used for Claude Opus 4.8 pricing, but with Fable 5’s pricing numbers.
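To watch spend across many requests instead of one, a running total is often enough. This sketch hard-codes the prices quoted in this article; CostTracker is a hypothetical class, and cached or batch pricing is out of scope.

```python
from dataclasses import dataclass

INPUT_USD_PER_MTOK = 10.0   # price per million input tokens, from this article
OUTPUT_USD_PER_MTOK = 50.0  # price per million output tokens, from this article


@dataclass
class CostTracker:
    """Accumulate token usage over several responses."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total_usd(self) -> float:
        return (
            self.input_tokens * INPUT_USD_PER_MTOK
            + self.output_tokens * OUTPUT_USD_PER_MTOK
        ) / 1_000_000


tracker = CostTracker()
tracker.add(2_000, 500)  # the worked example above: about $0.045
```

After each call you would run tracker.add(response.usage.input_tokens, response.usage.output_tokens) and compare tracker.total_usd against a budget.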
Test and debug the Claude Fable 5 API with Apidog
Before writing client code, send a few requests manually and inspect the exact response shape. Apidog works well for this because you can create the request, store secrets as environment variables, inspect streamed responses, and save the request for your team.
1. Create the request
Create a new HTTP request in Apidog.
Set the method:
POST
Set the URL:
https://api.anthropic.com/v1/messages
2. Store your API key
Create an environment variable:
anthropic_api_key
Store your Anthropic key as a secret value. This keeps it out of saved requests and shared exports.
3. Add headers
Add these headers:
x-api-key: {{anthropic_api_key}}
anthropic-version: 2023-06-01
content-type: application/json
4. Add the JSON body
Use the minimal payload:
{
"model": "claude-fable-5",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": "Explain idempotency keys for payment APIs."
}
]
}
Send the request and inspect:
- content
- stop_reason
- usage
- model
5. Test streaming
Add stream: true:
{
"model": "claude-fable-5",
"max_tokens": 1024,
"stream": true,
"messages": [
{
"role": "user",
"content": "Explain idempotency keys for payment APIs."
}
]
}
Apidog renders server-sent events as they arrive, so you can verify the streaming response before implementing it in your app.
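When you later parse the same stream in code without the SDK, the text arrives in content_block_delta events. The sketch below reassembles it from the data lines of a captured stream. text_from_sse is a hypothetical helper, the sample lines are abbreviated, and a real stream also carries event types that this function skips.

```python
import json


def text_from_sse(lines) -> str:
    """Concatenate text deltas from the data lines of a Messages API stream."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta.get("text", ""))
    return "".join(parts)


captured = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Idempotency "}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "keys"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]
streamed_text = text_from_sse(captured)
```

Pasting a few events copied from Apidog into a list like captured gives you a fixture for unit tests that needs no network access.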
6. Save the request and generate code
Save the working request into a collection. Then generate a starter snippet in Python, JavaScript, curl, or another language.
This gives you a known-good request to compare against your application code when debugging. When you are ready, download Apidog and start with the minimal request body above.


