Amazon Bedrock can be accessed through the AWS console, and each of the foundation models can be tried out in a playground for testing.
For this post, I will be sharing how I used the Python SDK, Boto3, to interact with Bedrock and generate text.
As a note, you will need to request access to the foundation models before you can invoke them. To do this, go to Bedrock in the AWS console, click on "Model access", click the edit button, select the models you want to use, and save. It can take a while for them to become available.
Once you have requested access to the models, you will need a local aws-cli configuration to work with the bedrock-runtime API. To set one up, run the following command:
aws configure
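Once the CLI is configured, a quick way to confirm that your credentials can reach Bedrock is to list the foundation models available in your region. Note that Bedrock has two clients: bedrock for control-plane calls like listing models, and bedrock-runtime for inference. A minimal sketch:

import boto3

# The 'bedrock' client handles control-plane calls such as listing models;
# inference goes through the separate 'bedrock-runtime' client used below.
bedrock_admin = boto3.client(service_name='bedrock', region_name='us-east-1')

# Print the model IDs visible in this region
for model in bedrock_admin.list_foundation_models()['modelSummaries']:
    print(model['modelId'])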
I would also recommend creating separate, dedicated IAM credentials to use with the Bedrock API.
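For example, if you store those credentials under a named profile (the profile name bedrock-user below is just a placeholder), you can point Boto3 at it with a session instead of relying on the default credentials:

import boto3

# 'bedrock-user' is a hypothetical profile name from ~/.aws/credentials
session = boto3.Session(profile_name='bedrock-user')
bedrock = session.client(service_name='bedrock-runtime', region_name='us-east-1')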
Below is a simple Python script that accesses the Bedrock API using the Cohere Command foundation model.
import boto3
import json

bedrock = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-1'
)

prompt = """
Write a blog post that I can share on Medium or LinkedIn, on how to maximize the range and battery life of an EV, as a new EV owner.
"""

body = json.dumps({
    "prompt": prompt,
    "max_tokens": 2000,  # Maximum number of tokens to generate. Responses are not guaranteed to fill up to the maximum desired length.
    "temperature": 0.75,  # Tunes the degree of randomness in a generation. Higher temperatures produce more varied, creative text; lower temperatures are more deterministic.
    "p": 0.01,  # If set to a float less than 1, only the smallest set of most probable tokens with probabilities that add up to p or higher are kept for generation.
    "k": 0,  # If set to an int > 1, only the top k tokens, sorted in descending order by probability, are kept for generation.
    # "stop_sequences": ["\n\n"],  # Sequences of tokens where generation stops.
})

# Define the model that will be used for the inference
modelId = 'cohere.command-text-v14'
accept = 'application/json'
contentType = 'application/json'

# Call the Bedrock API
response = bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())

print(response_body['generations'][0]['text'])
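One thing to keep in mind: if access to the model has not been granted yet, invoke_model raises an error instead of returning a result. A small sketch of catching that, assuming the same bedrock client and request body as in the script above:

from botocore.exceptions import ClientError

try:
    response = bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
except ClientError as err:
    # AccessDeniedException usually means model access has not been granted yet
    print(f"Bedrock call failed: {err.response['Error']['Code']} - {err.response['Error']['Message']}")
    raise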
This is another, slightly different Python script, where we specify the AWS credentials directly and use the Anthropic Claude model.
# Anthropic Claude v2
# Make sure that you have the latest version of boto3
# pip install --upgrade boto3
# print(boto3.__version__) --> should be at least 1.28

import boto3
import json

# Function to call the Bedrock API
def call_bedrock(prompt, assistant=""):
    # Hard-coded keys are for demo purposes only; prefer a named profile or environment variables
    access_key = 'ABC-YourKeyHere-123'
    access_secret = 'ABC-YourSecretHere'

    bedrock = boto3.client(service_name='bedrock-runtime',
                           region_name='us-east-1',
                           aws_access_key_id=access_key,
                           aws_secret_access_key=access_secret)

    body = json.dumps({
        "prompt": f"\n\nHuman:{prompt}\n\nAssistant:{assistant}",
        "max_tokens_to_sample": 1000,  # Maximum number of tokens to generate. Responses are not guaranteed to fill up to the maximum desired length.
        "temperature": 0.4,  # Tunes the degree of randomness in a generation. Higher temperatures produce more varied text; lower temperatures are more deterministic.
        "top_p": 1,  # If set to a float less than 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
        # "top_k": 250,  # If set to an int > 1, only the top k tokens, sorted in descending order by probability, are kept for generation.
        # "stop_sequences": ["\n\n"],  # Sequences of tokens where generation stops.
    })

    # Define the model that will be used for the inference
    modelId = 'anthropic.claude-v2'
    accept = 'application/json'
    # accept = '*/*'
    contentType = 'application/json'

    # Call the Bedrock API
    response = bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
    response_body = json.loads(response.get('body').read())
    print(response_body.get('completion'))

# Call the call_bedrock function
call_bedrock("Write a blog post that I can share on Medium or LinkedIn, on how to maximize the range and battery life of an EV, as a new EV owner.")
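If you would rather stream the output as it is generated, the runtime client also exposes invoke_model_with_response_stream. Below is a minimal sketch for Claude v2, assuming the same client and request body as in the function above (the chunk layout shown matches the Claude v2 response format):

# Stream the completion chunk by chunk instead of waiting for the full response
response = bedrock.invoke_model_with_response_stream(
    body=body, modelId='anthropic.claude-v2', contentType='application/json'
)

for event in response.get('body'):
    chunk = json.loads(event['chunk']['bytes'])
    # Claude v2 streaming chunks carry the partial text under 'completion'
    print(chunk.get('completion'), end='')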
With either Python script, the code calls the Bedrock bedrock-runtime API to generate text content as output.
Walking through the code quickly: it imports the necessary libraries, boto3 and json. The boto3 library is used to interact with AWS services, and the json library helps us serialize the request body and deserialize the response.
After we save the Python file, we can run it from a terminal, or within Visual Studio Code, which is what I often do.
python bedrock_demo1.py
There are many parameters to experiment with; temperature is a good one to play around with, to see how values above zero affect the content generated by either foundation model.
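As an illustrative sketch (the prompt and values here are just examples), you could re-run the same request at a few temperatures and compare the outputs side by side:

import boto3
import json

bedrock = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')
prompt = "Give a new EV owner three tips for maximizing range."

# Re-run the same prompt at several temperatures to compare the outputs
for temperature in (0.1, 0.5, 0.9):
    body = json.dumps({"prompt": prompt, "max_tokens": 300, "temperature": temperature})
    response = bedrock.invoke_model(body=body, modelId='cohere.command-text-v14',
                                    accept='application/json', contentType='application/json')
    text = json.loads(response.get('body').read())['generations'][0]['text']
    print(f"--- temperature={temperature} ---\n{text}\n")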