Hassann

Originally published at apidog.com

How to Use the Claude Fable 5 API

Anthropic shipped Claude Fable 5 on June 9, 2026. If you build software with Claude, the practical change is the model name: claude-fable-5. It uses the same Messages API shape as earlier Claude models, so most migrations are a string swap plus normal validation around streaming, tool use, errors, and costs. If you have already integrated Claude, this will feel similar to moving to the Claude Opus 4.8 API.


TL;DR

  1. Create an API key in the Anthropic Console.
  2. Export it as ANTHROPIC_API_KEY.
  3. Send a POST request to the Messages API.
  4. Use model: "claude-fable-5".
  5. Stream long responses.
  6. Read usage from every response to calculate cost.

Pricing is $10 per million input tokens and $50 per million output tokens.


Before you start

You need four things before making your first request.

  1. An Anthropic account

    Sign up at console.anthropic.com. Use the Console to manage API keys, usage, and billing.

  2. An API key

    Create one under API Keys. Copy it immediately and store it securely. Treat it like a password.

  3. Billing or an Enterprise plan

    Fable 5 is available on the standard Claude API and on consumption-based Enterprise plans. Confirm billing before sending production traffic. If you are still evaluating the model, read the overview of what Claude Fable 5 is.

  4. An SDK or HTTP client

    The official Anthropic SDKs are the easiest path, but raw HTTP works too.


Set your API key as an environment variable:

export ANTHROPIC_API_KEY="sk-ant-..."

Both the Python and TypeScript SDKs read ANTHROPIC_API_KEY automatically. Do not hard-code keys in source code. If a key leaks, rotate it in the Console immediately.
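
As a minimal sketch of the "no hard-coded keys" rule, the helper below reads the key from the environment and fails fast with a clear message when it is missing. The name load_api_key is a hypothetical helper for illustration, not part of the SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Anthropic API key from the environment or raise early.

    load_api_key is a hypothetical helper, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Export it before starting the app."
        )
    return key
```

Calling this once at startup surfaces a missing key before the first request, rather than as a 401 in the middle of a user flow.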

One behavior to know: Fable 5 includes safeguards that route a small slice of sensitive requests, such as cybersecurity, biology and chemistry, and model distillation attempts, to Claude Opus 4.8. This happens in under 5% of sessions. Your request still succeeds, but the returned model field may show a different model.

Make your first Claude Fable 5 API call

Start with curl so you can see the raw request.

Endpoint:

POST https://api.anthropic.com/v1/messages

Example:

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Summarize what makes a good REST API in 3 bullet points."
      }
    ]
  }'

The endpoint is documented in the Anthropic Messages API reference.

Required headers:

Header | Purpose
x-api-key | Your Anthropic API key
anthropic-version | API version, usually 2023-06-01
content-type | Must be application/json

Required body fields:

Field | Purpose
model | Use claude-fable-5
max_tokens | Maximum output tokens
messages | Conversation messages
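
If you prefer raw HTTP from Python over curl, the same headers and body fields can be assembled with the standard library. This is a sketch only: build_request is a hypothetical helper, and it constructs the request without sending it so you can inspect it first.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key: str, prompt: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) a Messages API request with the required parts."""
    body = {
        "model": "claude-fable-5",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; keeping construction separate makes the request easy to log or compare against a known-good one.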

Example response:

{
  "id": "msg_01ABC...",
  "type": "message",
  "role": "assistant",
  "model": "claude-fable-5",
  "content": [
    {
      "type": "text",
      "text": "- Predictable, resource-oriented URLs..."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 18,
    "output_tokens": 96
  }
}

Important details:

  • content is a list, not a string.
  • A response can include text blocks, tool-use blocks, and thinking blocks.
  • Always check each block’s type.
  • stop_reason tells you why generation stopped.
  • usage gives exact token counts for cost calculation.
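
The points above can be sketched against the raw JSON response. The helper names extract_text and was_truncated are illustrative, and the sample dict is a trimmed, made-up response in the same shape as the example.

```python
from typing import Any, Dict


def extract_text(message: Dict[str, Any]) -> str:
    """Join the text of every text block, skipping tool_use and thinking blocks."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


def was_truncated(message: Dict[str, Any]) -> bool:
    """True when generation stopped because max_tokens was reached."""
    return message.get("stop_reason") == "max_tokens"


# Made-up sample in the documented response shape
sample = {
    "content": [
        {"type": "thinking", "thinking": "..."},
        {"type": "text", "text": "- Predictable URLs"},
    ],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 18, "output_tokens": 96},
}
print(extract_text(sample))
```

Checking the block type first means new block types in a response are skipped rather than crashing the parser.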

Call Fable 5 from Python

Install the official SDK:

pip install anthropic

Basic request:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Summarize what makes a good REST API."
        }
    ],
)

for block in response.content:
    if block.type == "text":
        print(block.text)

The SDK uses the same request shape as curl: model, max_tokens, and messages.

Add a system prompt

Use system for role, style, and output constraints:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=2048,
    system="You are a senior backend engineer. Be concise and use code examples.",
    messages=[
        {
            "role": "user",
            "content": "Write a Flask route that validates a JSON body."
        }
    ],
)

for block in response.content:
    if block.type == "text":
        print(block.text)

Use the system prompt for stable instructions that apply across the whole conversation. Avoid changing it on every request if you plan to use prompt caching later.
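
One way to keep the system prompt stable is to define it once and build every request from it. The sketch below assumes a hypothetical build_params helper; it only assembles keyword arguments and makes no API call.

```python
from typing import Any, Dict, List, Optional

# Defined once so every request sends byte-identical system text
SYSTEM_PROMPT = "You are a senior backend engineer. Be concise and use code examples."


def build_params(user_text: str, history: Optional[List[Dict[str, Any]]] = None) -> Dict[str, Any]:
    """Return kwargs for client.messages.create with a fixed system prompt."""
    messages = list(history or [])  # copy so the caller's list is not mutated
    messages.append({"role": "user", "content": user_text})
    return {
        "model": "claude-fable-5",
        "max_tokens": 2048,
        "system": SYSTEM_PROMPT,
        "messages": messages,
    }
```

You would then call client.messages.create(**build_params("Write a Flask route that validates a JSON body.")), putting per-request details in the user message rather than in system.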

Stream long outputs

Use streaming for long responses. It improves perceived latency and helps avoid request timeouts.

with client.messages.stream(
    model="claude-fable-5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Explain idempotency keys for payment APIs."
        }
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    final = stream.get_final_message()

print(f"\n\nOutput tokens: {final.usage.output_tokens}")

stream.text_stream yields text chunks as they arrive. After the stream ends, stream.get_final_message() returns the complete final message, including usage.
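
If you need both the live chunks and the full text, for example to forward chunks to a client and store the final answer, a small consumer like the one below may help. It is a sketch: consume_text_stream is a hypothetical helper, and fake_text_stream stands in for stream.text_stream so the example runs without a network call.

```python
from typing import Callable, Iterable, Iterator


def consume_text_stream(chunks: Iterable[str], on_chunk: Callable[[str], None]) -> str:
    """Forward each chunk as it arrives and return the full text at the end."""
    parts = []
    for chunk in chunks:
        on_chunk(chunk)  # e.g. write to stdout or a websocket
        parts.append(chunk)
    return "".join(parts)


def fake_text_stream() -> Iterator[str]:
    """Stand-in for stream.text_stream; yields a few made-up chunks."""
    yield from ["Idempotency ", "keys ", "prevent ", "double charges."]


seen = []
full_text = consume_text_stream(fake_text_stream(), seen.append)
```

Inside the with block you would pass stream.text_stream as the first argument instead of the fake generator.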

Call Fable 5 from TypeScript / Node

Install the SDK:

npm install @anthropic-ai/sdk

Basic request:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY

const msg = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "List 3 common API security mistakes.",
    },
  ],
});

console.log(msg.content);

Extract only text blocks:

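const text = msg.content
  // The type predicate narrows the block union so block.text type-checks
  .filter((block): block is Anthropic.TextBlock => block.type === "text")
  .map((block) => block.text)
  .join("");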

console.log(text);

For frontend chat apps, stream from your backend route and forward chunks to the browser. Before wiring this into an app, test the request contract manually. The same workflow is covered in testing the ChatGPT API with Apidog.

Use tools / function calling with Fable 5

Tool use lets the model request calls to functions that you define. You provide a tool schema, the model decides whether to call it, your code runs the function, and then you send the result back.

Define a tool:

tools = [
    {
        "name": "get_order_status",
        "description": "Look up the status of a customer order by ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string"
                }
            },
            "required": ["order_id"],
        },
    }
]

Send the tool definition with the request:

messages = [
    {
        "role": "user",
        "content": "What's the status of order A1855?"
    }
]

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

If the model wants to use a tool, the response has:

response.stop_reason == "tool_use"

Then find the tool_use block, run your function, and send the result back:

if response.stop_reason == "tool_use":
    tool_use = next(
        block for block in response.content
        if block.type == "tool_use"
    )

    # Run your real function with the model's chosen input
    result = lookup_order(tool_use.input["order_id"])  # your code

    messages.append({
        "role": "assistant",
        "content": response.content,
    })

    messages.append({
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result,
            }
        ],
    })

    followup = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

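The critical field is tool_use_id. The tool_result must reference the exact id from the original tool_use block. If a response contains several tool_use blocks, return one tool_result for each in the same user message. The content value should be a string (or a list of content blocks), so serialize structured results first.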
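
A small builder can keep the id pairing and serialization in one place. This is a sketch under stated assumptions: make_tool_result is a hypothetical helper, and it uses the is_error flag on tool_result blocks to report a failed tool call.

```python
import json
from typing import Any, Dict


def make_tool_result(tool_use_id: str, result: Any, is_error: bool = False) -> Dict[str, Any]:
    """Build a tool_result block that points back at the tool_use block's id."""
    # Serialize non-string results so the content is always plain text
    content = result if isinstance(result, str) else json.dumps(result)
    block: Dict[str, Any] = {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
    }
    if is_error:
        block["is_error"] = True  # signals to the model that the tool call failed
    return block
```

In the snippet above, the inner dict could then be replaced with make_tool_result(tool_use.id, result).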

For multi-step agents, wrap this in a loop:

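while True:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    # Stop on anything other than a tool request (end_turn, max_tokens, ...)
    # so the loop cannot spin forever on an unhandled stop_reason.
    if response.stop_reason != "tool_use":
        break

    # Keep the assistant turn, including its tool_use blocks, in the history
    messages.append({
        "role": "assistant",
        "content": response.content,
    })

    # One response can contain several tool_use blocks; answer each one
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = run_tool(block.name, block.input)  # your dispatcher
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result),
            })

    messages.append({
        "role": "user",
        "content": tool_results,
    })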

The Python SDK also includes a tool runner, but implementing the loop manually makes it easier to add logging, approvals, retries, or audit trails.
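
The loop calls run_tool, which is left for you to write. One possible shape is a name-to-function registry, sketched below. Everything here is hypothetical: TOOL_REGISTRY, the fake lookup_order data, and the choice to return errors as text instead of raising.

```python
from typing import Any, Callable, Dict


def lookup_order(order_id: str) -> str:
    """Hypothetical stand-in for a real order lookup."""
    fake_orders = {"A1855": "shipped"}
    return fake_orders.get(order_id, "not found")


# Map tool names from the schema to the callables that implement them
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "get_order_status": lookup_order,
}


def run_tool(name: str, tool_input: Dict[str, Any]) -> str:
    """Dispatch a tool call by name; return an error string instead of raising."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:  # report failures back to the model as text
        return f"Error: {type(exc).__name__}: {exc}"
```

Returning the error as text lets the model see what went wrong and try again, and the dispatcher is a natural place to add the logging or approval checks mentioned above.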

Use adaptive thinking and effort

Fable 5 supports adaptive thinking. When enabled, the model decides when and how deeply to reason before answering.

Example:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # low | medium | high
    messages=[
        {
            "role": "user",
            "content": "Design a retry strategy for a flaky webhook receiver."
        }
    ],
)

Use this selectively.

Good use cases:

  • multi-step design problems
  • debugging plans
  • architecture tradeoffs
  • long-horizon agent tasks

Avoid it for:

  • short lookups
  • simple transformations
  • one-line summaries
  • low-cost high-volume routes

effort controls depth and token usage. Higher effort can produce more thorough answers, but costs more.
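
To apply thinking selectively, you could decide per route whether to send the thinking parameters at all. The mapping below is a sketch: the route labels and the thinking_params helper are hypothetical, and the parameter names are the ones from the example above.

```python
from typing import Any, Dict, Optional

# Hypothetical route labels mapped to an effort level, or None to skip thinking
EFFORT_BY_ROUTE: Dict[str, Optional[str]] = {
    "lookup": None,
    "summary": None,
    "debugging": "medium",
    "architecture": "high",
    "agent": "high",
}


def thinking_params(route: str) -> Dict[str, Any]:
    """Return extra request kwargs for a route, or {} to leave thinking off."""
    effort = EFFORT_BY_ROUTE.get(route)
    if effort is None:
        return {}  # unknown and cheap routes default to no thinking
    return {
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
    }
```

The result can be unpacked into the call, for example client.messages.create(model="claude-fable-5", max_tokens=4096, messages=messages, **thinking_params("architecture")).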

Handle errors and fallback behavior

Production integrations should catch typed SDK exceptions instead of parsing error strings.

import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Explain CORS preflight requests."
            }
        ],
    )

except anthropic.AuthenticationError:
    # 401: bad or missing API key
    print("Invalid API key. Rotate it in the Console and re-export.")

except anthropic.RateLimitError as e:
    # 429: too many requests
    retry_after = e.response.headers.get("retry-after", "60")
    print(f"Rate limited. Retry after {retry_after}s.")

except anthropic.BadRequestError as e:
    # 400: malformed request
    print(f"Bad request: {e.message}")

Common errors:

Error | HTTP status | Cause | Fix
AuthenticationError | 401 | Missing, malformed, or revoked API key | Check ANTHROPIC_API_KEY and rotate if needed
RateLimitError | 429 | Too many requests or tokens | Back off and retry
BadRequestError | 400 | Invalid request shape | Check messages, max_tokens, roles, and JSON structure

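By default, the SDK retries connection errors and 408, 409, 429, and 5xx responses twice, with exponential backoff; set max_retries on the client to change this. If you add custom retry logic, honor the retry-after header when it is present.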
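
For custom retry logic, the delay calculation might look like the sketch below: prefer retry-after when it parses as seconds, otherwise fall back to capped exponential backoff with jitter. The retry_delay helper and its default base and cap values are assumptions for illustration, not SDK behavior.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: bool = True,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; use backoff instead
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Full jitter spreads out clients that were rate limited together
        delay = random.uniform(0, delay)
    return delay
```

In the RateLimitError handler above, you could sleep for retry_delay(attempt, e.response.headers.get("retry-after")) before the next attempt.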

Do not hard-fail on fallback model names

Fable 5 may route some sensitive requests to Claude Opus 4.8. This is not an error. The request succeeds, but response.model may not equal claude-fable-5.

Avoid this:

assert response.model == "claude-fable-5"

Prefer this:

actual_model = response.model
print(f"Answered by: {actual_model}")

If your application needs to know which model answered, read it from the returned response object.
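
If you want fallbacks to show up in logs or metrics without failing the request, a helper along these lines could work. It is a sketch: describe_model is hypothetical, and the prefix match is an assumption meant to tolerate a returned id that carries a suffix after the requested name.

```python
from typing import Dict

REQUESTED_MODEL = "claude-fable-5"


def describe_model(actual_model: str, requested: str = REQUESTED_MODEL) -> Dict[str, object]:
    """Summarize which model answered without treating a fallback as a failure."""
    return {
        "requested": requested,
        "actual": actual_model,
        # Assumption: an id that starts with the requested name is the same model
        "fell_back": not actual_model.startswith(requested),
    }
```

Logging describe_model(response.model) per request gives you a fallback rate you can compare with the share of sessions mentioned earlier.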

Estimate cost per request

Fable 5 pricing:

  • $10 per million input tokens
  • $50 per million output tokens

Every response includes exact usage:

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Write a SQL query to find duplicate emails."
        }
    ],
)

input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens

input_cost = input_tokens / 1_000_000 * 10
output_cost = output_tokens / 1_000_000 * 50
total = input_cost + output_cost

print(f"Input:  {input_tokens} tokens  = ${input_cost:.6f}")
print(f"Output: {output_tokens} tokens = ${output_cost:.6f}")
print(f"Total:  ${total:.6f}")

Example:

2,000 input tokens  * $10 / 1M = $0.020
500 output tokens   * $50 / 1M = $0.025
Total                         = $0.045

Output tokens cost five times as much as input tokens, so control output length first:

  • set a reasonable max_tokens
  • ask for concise answers in the system prompt
  • avoid verbose formats unless needed
  • stream long responses instead of increasing timeouts

The cost math is the same as for Claude Opus 4.8 pricing, with Fable 5’s rates substituted.
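
To track spend across many requests rather than one, the per-request math can be folded into a running total. The CostTracker class below is a hypothetical helper; the rates are the Fable 5 prices quoted in this article.

```python
from dataclasses import dataclass

INPUT_USD_PER_MTOK = 10.0   # Fable 5 input price from this article
OUTPUT_USD_PER_MTOK = 50.0  # Fable 5 output price from this article


@dataclass
class CostTracker:
    """Accumulates token usage across responses (hypothetical helper)."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        """Record the usage numbers from one response."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total_usd(self) -> float:
        """Total cost so far in US dollars."""
        return (
            self.input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
            + self.output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
        )


tracker = CostTracker()
tracker.add(2_000, 500)  # the worked example above
tracker.add(18, 96)      # the sample response from the first curl call
print(f"Running total: ${tracker.total_usd:.6f}")
```

In an application you would call tracker.add(response.usage.input_tokens, response.usage.output_tokens) after each request.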

Test and debug the Claude Fable 5 API with Apidog

Before writing client code, send a few requests manually and inspect the exact response shape. Apidog works well for this because you can create the request, store secrets as environment variables, inspect streamed responses, and save the request for your team.


1. Create the request

Create a new HTTP request in Apidog.

Set the method:

POST

Set the URL:

https://api.anthropic.com/v1/messages

2. Store your API key

Create an environment variable:

anthropic_api_key

Store your Anthropic key as a secret value. This keeps it out of saved requests and shared exports.

3. Add headers

Add these headers:

x-api-key: {{anthropic_api_key}}
anthropic-version: 2023-06-01
content-type: application/json

4. Add the JSON body

Use the minimal payload:

{
  "model": "claude-fable-5",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "Explain idempotency keys for payment APIs."
    }
  ]
}

Send the request and inspect:

  • content
  • stop_reason
  • usage
  • model

5. Test streaming

Add stream: true:

{
  "model": "claude-fable-5",
  "max_tokens": 1024,
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Explain idempotency keys for payment APIs."
    }
  ]
}

Apidog renders server-sent events as they arrive, so you can verify the streaming response before implementing it in your app.
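
When you later handle the raw stream yourself, the events you saw in the viewer can be parsed line by line. The sketch below assumes Fable 5 emits the same event shapes as the documented Messages streaming format (content_block_delta events carrying text_delta payloads); iter_text_deltas and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text from content_block_delta events in a raw SSE line stream."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = json.loads(line[len("data:"):].strip())
        if payload.get("type") != "content_block_delta":
            continue
        delta = payload.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


# Made-up sample of what the raw stream lines might look like
raw = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Idem"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "potency"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]
print("".join(iter_text_deltas(raw)))
```

The official SDKs already do this parsing for you, so a hand-rolled parser is mainly useful for debugging or for clients without SDK support.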

6. Save the request and generate code

Save the working request into a collection. Then generate a starter snippet in Python, JavaScript, curl, or another language.

This gives you a known-good request to compare against your application code when debugging. When you are ready, download Apidog and start with the minimal request body above.
