Jon Ibañez del campo
How to Use the Claude API with Python

You have a Python script. You want it to think.

That’s the whole premise. This tutorial shows you how to connect your code to Claude — Anthropic’s AI model — so it can read, reason, and respond inside your own projects.

I wrote this after spending an afternoon figuring it out myself. No prior AI experience needed. If you’ve written a Python function before, you can follow this.


Before You Start

Two things are worth knowing up front.

First, the API costs money. Not much — $5 gets you weeks of normal usage — but it’s not free like the Claude.ai chat interface. You’ll need to add credits at console.anthropic.com.

Second, you’ll need Python 3.9 or later. Check yours:

python --version

If it’s older than 3.9, update it at python.org. During installation on Windows, check "Add Python to PATH" — it’s easy to miss, and skipping it breaks everything.
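If you'd rather check from inside Python itself, a minimal sketch:

```python
import sys

# Abort early with a clear message if the interpreter is too old for the SDK.
if sys.version_info < (3, 9):
    raise SystemExit(f"Python 3.9+ required, found {sys.version.split()[0]}")
print("Python version OK:", sys.version.split()[0])
```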


Setup

Create a folder, set up a virtual environment, and install the SDK.

mkdir claude-project
cd claude-project
python -m venv venv

Activate it:

# Mac/Linux
source venv/bin/activate

# Windows
venv\Scripts\activate

Your terminal prompt should now start with (venv). That’s how you know it’s active. If you install without it, your packages end up in the global Python instead of your project.
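If you're ever unsure whether the environment is active, Python can tell you. Inside a venv, sys.prefix points at the venv directory instead of the system installation:

```python
import sys

def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base system installation.
    return sys.prefix != sys.base_prefix

print("venv active:", in_virtualenv())
```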

pip install anthropic python-dotenv

Your API Key

Go to console.anthropic.com, create an account, and generate a key under API Keys.

Store it in a .env file in your project folder:

ANTHROPIC_API_KEY=your-key-here

Keep it out of GitHub (>> appends, so an existing .gitignore survives):

echo .env >> .gitignore

This matters. A public API key gets found, used, and charged to you within hours.
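A small fail-fast check saves debugging later. This is a sketch, not part of the SDK — the helper name require_api_key is mine, and it assumes load_dotenv() has already run (or the variable is exported in your shell):

```python
import os

def require_api_key() -> str:
    # Assumes load_dotenv() already ran, or the variable is exported.
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; check your .env file")
    return key
```

Call it once at startup and you get a readable error instead of a confusing authentication failure deep inside the first API call.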


The First Call

Here’s what talking to Claude from Python looks like:

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What is a REST API?"
        }
    ]
)

print(message.content[0].text)

Run it. Claude answers in your terminal.

Three things to understand about this call:

model is which version of Claude you’re using. claude-sonnet-4-6 is the default for most use cases — fast and capable.

max_tokens is the maximum length of Claude’s response. Set it too low and the response gets cut off mid-sentence. 1024 is a safe starting point.

messages is a list of turns in the conversation. Each turn has a role — user for your messages, assistant for Claude’s.


What Comes Back

The response object holds more than just text:

print(message.content[0].text)       # Claude's response
print(message.stop_reason)           # Why it stopped — usually "end_turn"
print(message.usage.input_tokens)    # Tokens in your message
print(message.usage.output_tokens)   # Tokens in Claude's reply

Tokens are chunks of text — roughly four characters of English on average, so a token is a bit less than a word. Watching them matters because that’s what you’re paying for.
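For budgeting before you send a request, a crude estimate is enough. This heuristic is an assumption (the four-characters-per-token average), not how the API actually tokenizes — trust the usage fields on the response for real counts:

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic only: English averages about 4 characters per token.
    # For exact numbers, read message.usage on the API response instead.
    return max(1, len(text) // 4)

print(approx_tokens("What is a REST API?"))  # a ballpark, not an exact count
```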


Giving Claude a Role

By default, Claude is a general assistant. A system prompt changes that.

Think of it as a briefing you give before the conversation starts:

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a Python code reviewer. Be direct. Point out issues first, then explain why.",
    messages=[
        {"role": "user", "content": "Review this: for i in range(len(my_list)): print(my_list[i])"}
    ]
)

print(message.content[0].text)

Same model. Completely different behavior. The system prompt is where most of the real control lives.


Conversations

The API has no memory. Every call starts fresh unless you pass the history yourself.

This is what a real conversation looks like:

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

history = []

def chat(message: str) -> str:
    history.append({"role": "user", "content": message})

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful programming assistant.",
        messages=history
    )

    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})

    return reply

print(chat("What is a decorator in Python?"))
print(chat("Show me a real example."))
print(chat("How would that work in Flask?"))

Each call passes the full history. Claude reads it, understands the context, and continues the thread.

The mistake most people make: appending the user message but forgetting to append Claude’s reply. Do that and the next message arrives without context — Claude answers as if the conversation never happened.
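One more thing to watch: the history grows with every turn, and you pay for all of it on every call. A minimal trimming sketch (the helper name and the 20-turn default are my choices, not SDK features):

```python
def trim_history(history: list, max_turns: int = 20) -> list:
    # Keep only the most recent turns so long chats don't grow the
    # request (and the bill) without bound. One turn is a user message
    # plus the assistant reply, hence the factor of two.
    return history[-max_turns * 2:]
```

Call it on the history right before each messages.create() and old turns silently fall away.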


Streaming

Waiting for a full response before printing anything works fine for scripts. For anything user-facing, streaming feels much better.

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion simply."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

print()

Instead of waiting, words appear as Claude generates them — the same experience you get in the Claude.ai interface.
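If you stream and also keep conversation history, you need the complete reply after the loop finishes. The accumulate-while-printing pattern looks like this — shown here with a stand-in list of chunks so it runs on its own; with the real SDK you'd pass stream.text_stream instead:

```python
def stream_and_collect(chunks) -> str:
    # Print each chunk as it arrives while accumulating the full text,
    # e.g. so the complete reply can be appended to a history list.
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)

# Stand-in for stream.text_stream:
reply = stream_and_collect(iter(["Recursion ", "is a function ", "calling itself."]))
```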


A Real Use Case

Here’s a function worth keeping. It summarizes any text you pass to it:

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

def summarize(text: str, sentences: int = 3) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=f"Summarize the following text in {sentences} sentences. Return only the summary.",
        messages=[{"role": "user", "content": text}]
    )
    return response.content[0].text

article = """
The James Webb Space Telescope has captured the deepest infrared image
of the universe ever taken. The image covers a patch of sky approximately
the size of a grain of sand held at arm's length. It contains thousands
of galaxies, some of which formed less than a billion years after the
Big Bang. Scientists believe this data will reshape our understanding
of how the earliest galaxies formed and evolved.
"""

print(summarize(article, sentences=2))

Change the system prompt and it becomes a translator, a classifier, a data extractor. The pattern is always the same.
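That pattern can be factored out. A sketch of a small factory — make_task is my name, and I've made the client an explicit argument rather than a global so the helpers are easy to test:

```python
def make_task(client, system_prompt: str,
              model: str = "claude-sonnet-4-6", max_tokens: int = 512):
    # Freeze the model, token limit, and system prompt into a reusable
    # single-purpose function, mirroring the summarize() call above.
    def run(text: str) -> str:
        response = client.messages.create(
            model=model,
            max_tokens=max_tokens,
            system=system_prompt,
            messages=[{"role": "user", "content": text}],
        )
        return response.content[0].text
    return run

# translate = make_task(client, "Translate into Spanish. Return only the translation.")
# classify  = make_task(client, "Label the text positive, negative, or neutral. One word.")
```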


Handling Errors

Networks fail. Rate limits happen. Wrap your calls:

from dotenv import load_dotenv
from anthropic import Anthropic, APIStatusError, RateLimitError, APIConnectionError

load_dotenv()
client = Anthropic()

def ask(question: str) -> str:
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            messages=[{"role": "user", "content": question}]
        )
        return response.content[0].text

    except RateLimitError:
        return "Rate limit reached. Wait a moment and try again."

    except APIConnectionError:
        return "Connection failed. Check your internet."

    except APIStatusError as e:
        return f"API error {e.status_code}."
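Returning a message is fine for a demo; for anything automated you usually want to retry transient failures instead. A generic backoff sketch — the helper is mine, and in real code you'd catch RateLimitError and APIConnectionError specifically rather than every exception (the SDK can also retry some failures itself via its max_retries setting):

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    # Exponential backoff: wait 1s, then 2s, then 4s between attempts.
    # Re-raise if the final attempt still fails.
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# answer = with_retries(lambda: ask("What is a REST API?"))
```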

Choosing a Model

Three options, three tradeoffs:

Model                        Use when
claude-sonnet-4-6            Most things. Fast, capable, cost-effective
claude-opus-4-6              Hard problems that need deep reasoning
claude-haiku-4-5-20251001    High volume, simple tasks, lowest cost

Start with Sonnet. Switch if you have a reason.


Things That Will Catch You

A few things I ran into that aren’t obvious from the docs:

The API costs money. Claude.ai doesn’t. It’s easy to assume they work the same way — they don’t. Add credits before you start.

load_dotenv() doesn’t call itself. If your key isn’t loading, this is probably why.

max_tokens being too low cuts responses mid-thought. If answers feel incomplete, raise it.

The conversation history needs both sides: user messages and Claude’s replies. Miss one and the context breaks.

On Mac and Linux, python might point to Python 2. Use python3 if things aren’t working as expected.


What's Next

The foundation is here. Where it goes depends on what you’re building.

Tool use lets Claude call your own Python functions — useful when you want it to interact with real data or external services.

Vision lets you send images alongside text, so Claude can read screenshots, diagrams, or documents.

Async support via AsyncAnthropic is worth exploring if you’re handling multiple requests at once.

The full documentation is at platform.claude.com/docs.


The Whole Thing in Ten Lines

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Your message here."}]
)

print(response.content[0].text)
