Karim Khalil

Model Invocation: Amazon Bedrock

🖥️ What is Amazon Bedrock?

Amazon Bedrock is an AWS service that makes it easy to build and scale generative AI applications using pre-trained foundation models (FMs). Key benefits include:

  • No infrastructure management
  • Unified API access to multiple model providers
  • Seamless integration with other AWS services (like Lambda, API Gateway, etc.)

🔑 Prerequisites

To follow along, ensure you have:

  • Python 3.7+
python --version
  • pip installed
  • An AWS account with access to Amazon Bedrock

⚙️ Step 1: Set Up Your Environment

Install the AWS SDK for Python:

pip install boto3

Configure your AWS credentials:

aws configure

🧠 Step 2: Choose your foundational model

Amazon Bedrock gives you access to a variety of foundation models from leading AI model providers. For example:

  • Anthropic Claude 3.5/3.7 Sonnet – great for complex reasoning tasks.
  • Stability AI Stable Diffusion – image generation from text prompts.

For a complete list of supported FMs, see the Amazon Bedrock documentation.
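You can also enumerate models programmatically via the (non-runtime) `bedrock` control-plane client's `list_foundation_models` call. A minimal sketch of filtering such a response by provider — the sample response below is illustrative, not real output:

```python
def models_by_provider(response, provider):
    """Return the IDs of foundation models from a given provider.

    `response` is expected to have the shape returned by
    boto3.client("bedrock").list_foundation_models().
    """
    return [
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        if summary.get("providerName") == provider
    ]

# Illustrative sample in the documented response shape:
sample = {
    "modelSummaries": [
        {"modelId": "anthropic.claude-3-7-sonnet-20250219-v1:0", "providerName": "Anthropic"},
        {"modelId": "stability.stable-diffusion-xl-v1", "providerName": "Stability AI"},
    ]
}

print(models_by_provider(sample, "Anthropic"))
```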

📤Step 3: Invoke the Model

For this demo, we will use Claude 3.7 Sonnet by Anthropic.

❗ Make sure you have been granted access to the model in Amazon Bedrock (under Model access in the console).

Here’s a basic example to call the Claude model using Bedrock:

📦 Cell 1 – Import libraries

import boto3
import json

🌐 Cell 2 – Create Bedrock client

bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")

📝 Cell 3 – Prepare the request body

# Define the request payload
body = json.dumps({
    "max_tokens": 256,  # Maximum number of tokens in the response
    "messages": [
        {"role": "user", "content": "Hello, world"}  # User message to the AI model
    ],
    "anthropic_version": "bedrock-2023-05-31"  # Required version for Claude 3 models
})

NOTE: Each model has its own required input structure (body format).

➡️ You can find the correct request format for each model in the Amazon Bedrock Model Catalog.
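As a sketch of what that means in practice, here is a small helper that builds the body for two different model families. The Anthropic Messages format matches the payload used in this post; the Amazon Titan branch is an assumption based on its published schema, so verify it against the model's catalog entry before relying on it:

```python
import json

def build_body(model_id, prompt, max_tokens=256):
    """Build an invoke_model body for a couple of model families.

    The Anthropic branch matches the payload used in this post; the
    Titan branch is illustrative — check the Model Catalog for the
    authoritative schema.
    """
    if model_id.startswith(("anthropic.", "us.anthropic.")):
        return json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        })
    if model_id.startswith("amazon.titan"):
        return json.dumps({
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        })
    raise ValueError(f"No body template for model {model_id}")
```

You could then pass the result straight to `bedrock.invoke_model(body=..., modelId=...)`.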

🤖 Cell 4 – Invoke the model

# Invoke the Claude 3 Sonnet model with the given input
response = bedrock.invoke_model(
    body=body,
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0"  # Model ID for Claude 3.7 Sonnet
)

NOTE: You can find the ID for each model under the Amazon Bedrock Model Catalog.

⚠️ If you get an error like: “Invocation of model ID with on-demand throughput isn’t supported.”
✅ You will need to prepend “us.” to the model ID.
ℹ️ What happens when “us.” is added? It enables cross-region inference, which allows your inference (model invocation) requests to be dynamically routed across multiple AWS Regions. This helps absorb unplanned traffic bursts, improve throughput, and keep your application resilient during peak usage.
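One way to handle this up front is to normalize the model ID before invoking. A minimal sketch, assuming the region-group prefixes used by Bedrock inference profiles (e.g. "us.", "eu.", "apac."):

```python
def to_inference_profile_id(model_id, region_group="us"):
    """Prefix a model ID with a region group (e.g. "us.") so it targets
    the cross-region inference profile instead of the plain model.
    Leaves the ID unchanged if it already carries a known prefix.
    """
    known_groups = ("us.", "eu.", "apac.")
    if model_id.startswith(known_groups):
        return model_id
    return f"{region_group}.{model_id}"

print(to_inference_profile_id("anthropic.claude-3-7-sonnet-20250219-v1:0"))
# us.anthropic.claude-3-7-sonnet-20250219-v1:0
```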

📥 Cell 5 – Parse and display the response

# Parse the response JSON and print the model's reply
response_body = json.loads(response.get("body").read())
print(response_body["content"][0]["text"])  # The reply text sits inside a list of content blocks

Let’s put it all together:

import boto3
import json

bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")

body = json.dumps({
  "max_tokens": 256,
  "messages": [{"role": "user", "content": "Hello, world"}],
  "anthropic_version": "bedrock-2023-05-31"
})

response = bedrock.invoke_model(body=body, modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0")

response_body = json.loads(response.get("body").read())
print(response_body["content"][0]["text"])
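Note that Claude returns `content` as a list of content blocks rather than a plain string, and a longer reply can contain several of them. A small sketch that joins them all, using an illustrative response body in the Messages format:

```python
def extract_text(response_body):
    """Join the text of every text-type content block in a Claude
    Messages response body."""
    return "".join(
        block.get("text", "")
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )

# Illustrative response body in the Messages format:
sample_body = {"content": [{"type": "text", "text": "Hello! How can I help you today?"}]}
print(extract_text(sample_body))
```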

📌 Wrapping Up

Amazon Bedrock + Python is a powerful combo for integrating generative AI into your workflows. With just a few lines of Python, you can tap into best-in-class models from multiple providers and build intelligent applications without ever managing infrastructure.

💡 Bonus: Streaming with invoke_model_with_response_stream

For real-time applications or when you want to show partial model responses as they're generated (like chat apps), you can use invoke_model_with_response_stream instead of invoke_model.

Here’s an example:

# Use the native inference API to send a text message to Anthropic Claude
# and print the response stream.

import boto3
import json

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Set the model ID, e.g., Claude 3.7 Sonnet.
model_id = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# Define the prompt for the model.
prompt = "Give me the longest possible story in English"

# Format the request payload using the model's native structure.
native_request = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "temperature": 0.5,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}],
        }
    ],
}

# Convert the native request to JSON.
request = json.dumps(native_request)

# Invoke the model with the request.
streaming_response = client.invoke_model_with_response_stream(
    modelId=model_id, body=request
)

# Extract and print the response text in real-time.
for event in streaming_response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk["type"] == "content_block_delta":
        print(chunk["delta"].get("text", ""), end="")

🔍 What’s Happening Here?

  • Instead of waiting for the full model response, you get small parts (“chunks”) as soon as they’re available.
  • This is super useful for smoother user experiences in live UIs.

🛠️ Note: The format of streamed responses may vary slightly between models, so refer to the Bedrock model documentation for the exact schema.
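For Anthropic models, the text arrives in `content_block_delta` events, as in the loop above. Here is a sketch of accumulating a full reply from already-decoded chunk dictionaries — the sample events below are illustrative, not captured output:

```python
def accumulate_stream(chunks):
    """Collect the text deltas from a sequence of decoded Anthropic
    streaming chunks into one string."""
    parts = []
    for chunk in chunks:
        if chunk.get("type") == "content_block_delta":
            parts.append(chunk["delta"].get("text", ""))
    return "".join(parts)

# Illustrative decoded events:
events = [
    {"type": "message_start"},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Once upon "}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "a time..."}},
    {"type": "message_stop"},
]
print(accumulate_stream(events))  # Once upon a time...
```

In a real app you would feed this the parsed `event["chunk"]["bytes"]` payloads from `invoke_model_with_response_stream`.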

🙋 Need Help?

If you have any questions, run into issues, or just want to explore more advanced use cases like chaining Bedrock with Lambda, Step Functions, or building chatbots — don’t hesitate to ask!
