Hassann

Posted on Jun 10 • Originally published at apidog.com

How to Use the Claude Fable 5 API

Anthropic shipped Claude Fable 5 on June 9, 2026. If you build software with Claude, the practical change is the model name: claude-fable-5. It uses the same Messages API shape as earlier Claude models, so most migrations are a string swap plus normal validation around streaming, tool use, errors, and costs. If you have already integrated Claude, this will feel similar to moving to the Claude Opus 4.8 API.


TL;DR

Create an API key in the Anthropic Console.
Export it as ANTHROPIC_API_KEY.
Send a POST request to the Messages API.
Use model: "claude-fable-5".
Stream long responses.
Read usage from every response to calculate cost.

Pricing is $10 per million input tokens and $50 per million output tokens.

Before you start

You need four things before making your first request.

An Anthropic account. Sign up at console.anthropic.com. Use the Console to manage API keys, usage, and billing.
An API key. Create one under API Keys. Copy it immediately and store it securely. Treat it like a password.
Billing or an Enterprise plan. Fable 5 is available on the standard Claude API and on consumption-based Enterprise plans. Confirm billing before sending production traffic. If you are still evaluating the model, read the overview of what Claude Fable 5 is.
An SDK or HTTP client. The official Anthropic SDKs are the easiest path, but raw HTTP works too.

Set your API key as an environment variable:

export ANTHROPIC_API_KEY="sk-ant-..."

Both the Python and TypeScript SDKs read ANTHROPIC_API_KEY automatically. Do not hard-code keys in source code. If a key leaks, rotate it in the Console immediately.
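A fail-fast check at startup can make a missing key obvious before the first request goes out. The sketch below is one way to do that; `require_api_key` is a hypothetical helper, not part of the Anthropic SDK, and it only checks that the variable is present and non-empty.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment or raise a clear error."""
    # Accept an explicit mapping so the check is easy to test without
    # touching the real process environment.
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Export it before starting the app."
        )
    return key
```

Calling `require_api_key()` once during startup surfaces a configuration problem immediately instead of as a 401 on the first user request.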

One behavior to know: Fable 5 includes safeguards that route a small slice of sensitive requests to Claude Opus 4.8. Examples include requests involving cybersecurity, biology and chemistry, or model distillation attempts. This happens in under 5% of sessions. Your request still succeeds, but the returned model field may show a different model.

Make your first Claude Fable 5 API call

Start with curl so you can see the raw request.

Endpoint:

POST https://api.anthropic.com/v1/messages

Example:

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Summarize what makes a good REST API in 3 bullet points."
      }
    ]
  }'

The endpoint is documented in the Anthropic Messages API reference.

Required headers:

Header	Purpose
`x-api-key`	Your Anthropic API key
`anthropic-version`	API version, usually `2023-06-01`
`content-type`	Must be `application/json`

Required body fields:

Field	Purpose
`model`	Use `claude-fable-5`
`max_tokens`	Maximum output tokens
`messages`	Conversation messages

Example response:

{
  "id": "msg_01ABC...",
  "type": "message",
  "role": "assistant",
  "model": "claude-fable-5",
  "content": [
    {
      "type": "text",
      "text": "- Predictable, resource-oriented URLs..."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 18,
    "output_tokens": 96
  }
}

Important details:

content is a list, not a string.
A response can include text blocks, tool-use blocks, and thinking blocks.
Always check each block’s type.
stop_reason tells you why generation stopped.
usage gives exact token counts for cost calculation.
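If you work with the raw JSON rather than an SDK object, the points above can be sketched as a small helper. `text_from_message` is a hypothetical name, and the sample dictionary is a trimmed copy of the example response with an illustrative `tool_use` block added (its `id` is made up).

```python
from typing import Any, Dict, List


def text_from_message(message: Dict[str, Any]) -> str:
    """Join the text of every text block, skipping other block types."""
    blocks: List[Dict[str, Any]] = message.get("content", [])
    return "".join(
        block.get("text", "") for block in blocks if block.get("type") == "text"
    )


# Trimmed sample: one non-text block followed by one text block.
sample = {
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_example",  # illustrative id, not a real one
            "name": "get_order_status",
            "input": {"order_id": "A1855"},
        },
        {"type": "text", "text": "- Predictable, resource-oriented URLs"},
    ],
    "stop_reason": "end_turn",
}

print(text_from_message(sample))
```

Filtering on `type` keeps the helper safe when a response mixes text with tool-use or thinking blocks.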

Call Fable 5 from Python

Install the official SDK:

pip install anthropic

Basic request:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Summarize what makes a good REST API."
        }
    ],
)

for block in response.content:
    if block.type == "text":
        print(block.text)

The SDK uses the same request shape as curl: model, max_tokens, and messages.

Add a system prompt

Use system for role, style, and output constraints:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=2048,
    system="You are a senior backend engineer. Be concise and use code examples.",
    messages=[
        {
            "role": "user",
            "content": "Write a Flask route that validates a JSON body."
        }
    ],
)

for block in response.content:
    if block.type == "text":
        print(block.text)

Use the system prompt for stable instructions that apply across the whole conversation. Avoid changing it on every request if you plan to use prompt caching later.

Stream long outputs

Use streaming for long responses. It improves perceived latency and helps avoid request timeouts.

with client.messages.stream(
    model="claude-fable-5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Explain idempotency keys for payment APIs."
        }
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    final = stream.get_final_message()

print(f"\n\nTokens: {final.usage.output_tokens}")

stream.text_stream yields text chunks as they arrive. After the stream ends, stream.get_final_message() returns the complete final message, including usage.

Call Fable 5 from TypeScript / Node

Install the SDK:

npm install @anthropic-ai/sdk

Basic request:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY

const msg = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "List 3 common API security mistakes.",
    },
  ],
});

console.log(msg.content);

Extract only text blocks:

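const text = msg.content
  // The type guard narrows each block to a text block for older TypeScript versions.
  .filter((block): block is Anthropic.TextBlock => block.type === "text")
  .map((block) => block.text)
  .join("");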

console.log(text);

For frontend chat apps, stream from your backend route and forward chunks to the browser. Before wiring this into an app, test the request contract manually. The same workflow is covered in testing the ChatGPT API with Apidog.

Use tools / function calling with Fable 5

Tool use lets the model request calls to functions that you define. You provide a tool schema, the model decides whether to call it, your code runs the function, and then you send the result back.

Define a tool:

tools = [
    {
        "name": "get_order_status",
        "description": "Look up the status of a customer order by ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string"
                }
            },
            "required": ["order_id"],
        },
    }
]

Send the tool definition with the request:

messages = [
    {
        "role": "user",
        "content": "What's the status of order A1855?"
    }
]

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

If the model wants to use a tool, the response has:

response.stop_reason == "tool_use"

Then find the tool_use block, run your function, and send the result back:

if response.stop_reason == "tool_use":
    tool_use = next(
        block for block in response.content
        if block.type == "tool_use"
    )

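    # Run your real function with the model's chosen input.
    # lookup_order is your code. Return a string (for example, JSON text),
    # because tool_result content must be a string or a list of content blocks.
    result = lookup_order(tool_use.input["order_id"])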

    messages.append({
        "role": "assistant",
        "content": response.content,
    })

    messages.append({
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result,
            }
        ],
    })

    followup = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

The critical field is tool_use_id. The tool_result must reference the exact id from the original tool_use block.

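For multi-step agents, wrap this in a loop. Exit on any stop reason other than tool_use, and return one tool_result for every tool_use block in the turn:

while True:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    # Stop on end_turn, max_tokens, or any other non-tool stop reason.
    # Checking only for end_turn would loop forever on the others.
    if response.stop_reason != "tool_use":
        break

    # Keep the assistant turn, including every tool_use block, in the history.
    messages.append({
        "role": "assistant",
        "content": response.content,
    })

    # A single turn can request several tools; answer each one by its id.
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": run_tool(block.name, block.input),
        }
        for block in response.content
        if block.type == "tool_use"
    ]

    messages.append({
        "role": "user",
        "content": tool_results,
    })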

The Python SDK also includes a tool runner, but implementing the loop manually makes it easier to add logging, approvals, retries, or audit trails.
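The loop calls `run_tool`, which is left to your code. A minimal sketch of such a dispatcher might look like the following. The registry, the stand-in `lookup_order`, and the choice to return errors as JSON text are all assumptions for illustration, not part of the SDK.

```python
import json
from typing import Any, Callable, Dict


def lookup_order(order_id: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a real order lookup."""
    return {"order_id": order_id, "status": "shipped"}


# Map tool names from the schema to the Python callables that implement them.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_order_status": lookup_order,
}


def run_tool(name: str, tool_input: Dict[str, Any]) -> str:
    """Run a registered tool and return its result as a JSON string."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(handler(**tool_input))
    except Exception as exc:
        # Report the failure back to the model instead of crashing the loop.
        return json.dumps({"error": str(exc)})
```

Returning a string keeps the result directly usable as `tool_result` content, and a single dispatch point is a convenient place for the logging or approval checks mentioned above.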

Use adaptive thinking and effort

Fable 5 supports adaptive thinking. When enabled, the model decides when and how deeply to reason before answering.

Example:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # low | medium | high
    messages=[
        {
            "role": "user",
            "content": "Design a retry strategy for a flaky webhook receiver."
        }
    ],
)

Use this selectively.

Good use cases:

multi-step design problems
debugging plans
architecture tradeoffs
long-horizon agent tasks

Avoid it for:

short lookups
simple transformations
one-line summaries
low-cost high-volume routes

effort controls depth and token usage. Higher effort can produce more thorough answers, but costs more.
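One way to apply this selectively is to decide per route whether thinking is worth the tokens. The sketch below builds keyword arguments for `client.messages.create()`. The task labels and the `EFFORT_BY_TASK` table are hypothetical, and the `thinking` and `output_config` values are copied from the example above.

```python
from typing import Any, Dict

# Hypothetical routing table: task kinds that get adaptive thinking, and at what effort.
EFFORT_BY_TASK = {
    "design": "high",
    "debugging": "medium",
}


def build_request(prompt: str, task: str) -> Dict[str, Any]:
    """Build keyword arguments for client.messages.create()."""
    params: Dict[str, Any] = {
        "model": "claude-fable-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    effort = EFFORT_BY_TASK.get(task)
    if effort is not None:
        # Only reasoning-heavy tasks pay for thinking and a larger output budget.
        params["thinking"] = {"type": "adaptive"}
        params["output_config"] = {"effort": effort}
        params["max_tokens"] = 4096
    return params
```

A call such as `client.messages.create(**build_request(prompt, "design"))` would then enable thinking, while an unlisted task like a short lookup goes out as a plain request.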

Handle errors and fallback behavior

Production integrations should catch typed SDK exceptions instead of parsing error strings.

import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Explain CORS preflight requests."
            }
        ],
    )

except anthropic.AuthenticationError:
    # 401: bad or missing API key
    print("Invalid API key. Rotate it in the Console and re-export.")

except anthropic.RateLimitError as e:
    # 429: too many requests
    retry_after = e.response.headers.get("retry-after", "60")
    print(f"Rate limited. Retry after {retry_after}s.")

except anthropic.BadRequestError as e:
    # 400: malformed request
    print(f"Bad request: {e.message}")

Common errors:

Error	HTTP status	Cause	Fix
`AuthenticationError`	401	Missing, malformed, or revoked API key	Check `ANTHROPIC_API_KEY` and rotate if needed
`RateLimitError`	429	Too many requests or tokens	Back off and retry
`BadRequestError`	400	Invalid request shape	Check `messages`, `max_tokens`, roles, and JSON structure

The SDK retries 429 and 5xx errors with exponential backoff by default (two retries unless you change the client's max_retries option). If you add custom retry logic, read the retry-after header.
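If you do add custom retry logic, the delay calculation can be kept separate from the request code. The following is a sketch under stated assumptions: `retry_delay` is a hypothetical helper, the header is treated as a plain number of seconds, and the fallback is exponential backoff with full jitter.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it parses as seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Exponential backoff capped at `cap`, with full jitter to spread retries.
    backoff = min(cap, base * (2 ** attempt))
    return random.uniform(0, backoff)
```

In the `RateLimitError` handler above, this could be used as `time.sleep(retry_delay(attempt, e.response.headers.get("retry-after")))` inside a bounded retry loop.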

Do not hard-fail on fallback model names

Fable 5 may route some sensitive requests to Claude Opus 4.8. This is not an error. The request succeeds, but response.model may not equal claude-fable-5.

Avoid this:

assert response.model == "claude-fable-5"

Prefer this:

actual_model = response.model
print(f"Answered by: {actual_model}")

If your application needs to know which model answered, read it from the returned response object.
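For logging or analytics, the comparison can be recorded rather than asserted. This is a minimal sketch: `describe_routing` is a hypothetical helper, and the second model string in the usage line is a placeholder rather than a real model ID.

```python
from typing import Any, Dict

REQUESTED_MODEL = "claude-fable-5"


def describe_routing(
    returned_model: str, requested_model: str = REQUESTED_MODEL
) -> Dict[str, Any]:
    """Summarize which model answered without treating a reroute as an error."""
    return {
        "requested": requested_model,
        "served_by": returned_model,
        "rerouted": returned_model != requested_model,
    }


# Placeholder value standing in for whatever response.model contains.
print(describe_routing("some-other-model"))
```

Passing `response.model` to this helper and logging the result lets you track how often rerouting happens without breaking requests when it does.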

Estimate cost per request

Fable 5 pricing:

$10 per million input tokens
$50 per million output tokens

Every response includes exact usage:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Write a SQL query to find duplicate emails."
        }
    ],
)

input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens

input_cost = input_tokens / 1_000_000 * 10
output_cost = output_tokens / 1_000_000 * 50
total = input_cost + output_cost

print(f"Input:  {input_tokens} tokens  = ${input_cost:.6f}")
print(f"Output: {output_tokens} tokens = ${output_cost:.6f}")
print(f"Total:  ${total:.6f}")

Example:

2,000 input tokens  * $10 / 1M = $0.020
500 output tokens   * $50 / 1M = $0.025
Total                         = $0.045

Output tokens cost five times as much as input tokens, so control output length first:

set a reasonable max_tokens
ask for concise answers in the system prompt
avoid verbose formats unless needed
stream long responses instead of increasing timeouts

The calculation follows the same approach used for Claude Opus 4.8 pricing, with Fable 5’s rates substituted.
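The arithmetic above can be wrapped in reusable functions, including an upper bound that is useful when choosing `max_tokens`. The function names are hypothetical; the rates are the ones listed in this section.

```python
INPUT_USD_PER_MTOK = 10.0   # $10 per million input tokens
OUTPUT_USD_PER_MTOK = 50.0  # $50 per million output tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, from the usage counts in the response."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def worst_case_cost(input_tokens: int, max_tokens: int) -> float:
    """Upper bound for one request, assuming the output fills max_tokens."""
    return request_cost(input_tokens, max_tokens)


print(f"${request_cost(2_000, 500):.3f}")       # the worked example: $0.045
print(f"${worst_case_cost(2_000, 1_024):.4f}")  # bound with max_tokens=1024: $0.0712
```

Because output is the expensive side, the worst-case figure moves almost linearly with `max_tokens`, which is why trimming that limit is the first lever to pull.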

Test and debug the Claude Fable 5 API with Apidog

Before writing client code, send a few requests manually and inspect the exact response shape. Apidog works well for this because you can create the request, store secrets as environment variables, inspect streamed responses, and save the request for your team.

1. Create the request

Create a new HTTP request in Apidog.

Set the method:

POST

Set the URL:

https://api.anthropic.com/v1/messages

2. Store your API key

Create an environment variable:

anthropic_api_key

Store your Anthropic key as a secret value. This keeps it out of saved requests and shared exports.

3. Add headers

Add these headers:

x-api-key: {{anthropic_api_key}}
anthropic-version: 2023-06-01
content-type: application/json

4. Add the JSON body

Use the minimal payload:

{
  "model": "claude-fable-5",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "Explain idempotency keys for payment APIs."
    }
  ]
}

Send the request and inspect:

content
stop_reason
usage
model

5. Test streaming

Add stream: true:

{
  "model": "claude-fable-5",
  "max_tokens": 1024,
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Explain idempotency keys for payment APIs."
    }
  ]
}

Apidog renders server-sent events as they arrive, so you can verify the streaming response before implementing it in your app.
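When you later consume the stream without an SDK, the same events can be parsed by hand. The sketch below assumes the event layout described in Anthropic's streaming documentation, where text arrives in `content_block_delta` events carrying a `text_delta`. Confirm the event names against the reference before relying on them. `iter_text_deltas` and the sample lines are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from the `data:` lines of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines and blank separators
        payload = json.loads(line[len("data:"):])
        if payload.get("type") != "content_block_delta":
            continue
        delta = payload.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


# Hand-written sample lines in the assumed event format.
sample_stream = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Idempotency "}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "keys"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]

print("".join(iter_text_deltas(sample_stream)))
```

Comparing the events Apidog shows against what a parser like this expects is a quick way to catch format mismatches before they reach application code.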

6. Save the request and generate code

Save the working request into a collection. Then generate a starter snippet in Python, JavaScript, curl, or another language.

This gives you a known-good request to compare against your application code when debugging. When you are ready, download Apidog and start with the minimal request body above.
