Milad Rezaeighale for AWS Community Builders

Fine-Tuning and Deploying Custom AI Models on Amazon Bedrock: A Practical Guide

In the rapidly evolving field of Generative AI, the ability to fine-tune and deploy custom models is a crucial skill that enables businesses to tailor solutions to their unique needs. Amazon Bedrock, a powerful service within the Amazon Web Services (AWS) ecosystem, simplifies this process by offering a robust platform for building, fine-tuning, and deploying large language models (LLMs). Whether you’re looking to enhance a model's performance for a specific task or deploy it at scale, Amazon Bedrock provides the tools and infrastructure to do so efficiently.

If you're new to fine-tuning or want to delve deeper into its mechanics, I highly recommend A Deep Dive into Fine-Tuning, which offers an excellent explanation.

In this article, I will guide you through the process of fine-tuning a language model using Amazon Bedrock. We'll focus on the most critical sections of the code, providing a clear understanding of the key components and steps involved in the fine-tuning process. The goal is to highlight the essential elements so you can grasp how the general workflow is implemented, without diving into every line of code.

For those who want to dive directly into the code or explore it further, the complete implementation is available in my GitHub repository.

Use Case: Summarizing Doctor-Patient Dialogues

For this example, we'll focus on a dataset containing doctor-patient dialogues sourced from the ACI-Bench dataset. Our task is to train the model to summarize these dialogues into structured clinical notes. The foundation model selected for this fine-tuning is Cohere's command-light-text-v14, which excels at generating concise and coherent text summaries.

Objective: In this guide, we will:

  1. Set up the necessary AWS resources.
  2. Prepare and upload the fine-tuning dataset to S3.
  3. Create and submit a fine-tuning job.
  4. Purchase provisioned throughput.
  5. Test our fine-tuned model.
  6. Clean up.

Step 1: Set up the necessary AWS resources

Before we begin, we need to ensure we have the necessary AWS SDK installed and configured. We'll use boto3, the AWS SDK for Python, to interact with various AWS services:

import boto3
import json
import os
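
The fine-tuning job in Step 3 is given a role_arn, an IAM role that Amazon Bedrock assumes to read the training data from S3 and write output back to it. The article does not show how that role is created, so the following is only a minimal sketch of the two policy documents such a role typically needs. The function names, the example bucket name, and the exact set of S3 actions are assumptions made for illustration; adjust them to your own account's requirements.

```python
import json


def bedrock_trust_policy() -> dict:
    """Trust policy that lets the Amazon Bedrock service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket_name: str) -> dict:
    """Permissions policy for the dataset bucket (the action list is an assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Print both documents so they can be reviewed before attaching them to a role
    print(json.dumps(bedrock_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("my-example-bucket"), indent=2))
```

The ARN of a role created with these documents is what would be passed as role_arn when the job is submitted.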

Step 2: Prepare and upload the fine-tuning dataset to S3

In this step, we prepare the dataset by formatting it into the JSON Lines (JSONL) structure required for fine-tuning on Amazon Bedrock. Each line in the JSONL file must include a prompt field and a completion field.

# Define output path for JSONL
output_file_name = 'clinical_notes_fine_tune.jsonl'
output_file_path = os.path.join('dataset', output_file_name)
output_dir = os.path.dirname(output_file_path)
os.makedirs(output_dir, exist_ok=True)  # Create the output directory if it is missing

# Prepare and save the dataset in the fine-tuning JSONL format
# (train_dataset is assumed to be a pandas DataFrame loaded earlier)
with open(output_file_path, 'w', encoding='utf-8') as outfile:
    for _, row in train_dataset.iterrows():
        formatted_entry = {
            "completion": row['note'],  # Replace 'note' with the correct column name
            "prompt": f"Summarize the following conversation.\n\n{row['dialogue']}"  # Replace 'dialogue' as needed
        }
        json.dump(formatted_entry, outfile)
        outfile.write('\n')

print(f"Dataset has been reformatted and saved to {output_file_path}.")

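Each record in the JSONL file has the following structure. It is shown across several lines for readability; in the file itself, each record occupies a single line:

{
    "completion": "<Summarized clinical note>",
    "prompt": "Summarize the following conversation.\n\n<Doctor-patient dialogue>"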
}
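
Before uploading, it can help to check that every line of the file really has this shape. The sketch below is one way to do that with the standard library only; the function name validate_jsonl and the sample records are hypothetical, and the checks cover just the two fields named above, not any size or token limits that Bedrock may also enforce.

```python
import json


def validate_jsonl(lines):
    """Return a list of (line_number, problem) tuples for records that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        # Both fields must be present and contain non-blank text
        for field in ("prompt", "completion"):
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{field}'"))
    return problems


# Hypothetical sample: one good record and two bad ones
sample = [
    json.dumps({"prompt": "Summarize the following conversation.\n\n[doctor] Hello.",
                "completion": "Brief note."}),
    '{"prompt": "No completion here"}',
    "not json",
]
print(validate_jsonl(sample))
```

To check the real dataset, pass the open file object instead of the sample list, for example `with open(output_file_path) as f: print(validate_jsonl(f))`.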

To make the dataset accessible for fine-tuning, it needs to be uploaded to an Amazon S3 bucket. The code ensures that the S3 bucket exists, creating it if necessary. Once the bucket is verified, the fine-tuning dataset, saved in JSON Lines format, is uploaded to the specified bucket. This step is essential, as Amazon Bedrock accesses the dataset from S3 during the fine-tuning process.

# Define the S3 bucket and object key
bucket_name = 'bedrock-finetuning-bucket25112024'
s3_key = output_file_name  # Object key; assumed here to be the JSONL file name

# Specify the region
region = 'us-east-1'  # Change this if needed

# Initialize S3 client with the specified region
s3_client = boto3.client('s3', region_name=region)

try:
    # Check whether the bucket already exists in this account
    existing_buckets = s3_client.list_buckets()
    bucket_exists = any(bucket['Name'] == bucket_name for bucket in existing_buckets['Buckets'])

    if not bucket_exists:
        if region == 'us-east-1':
            # For us-east-1, do not specify LocationConstraint
            s3_client.create_bucket(Bucket=bucket_name)
        else:
            # For other regions, specify the LocationConstraint
            s3_client.create_bucket(
                Bucket=bucket_name,
                CreateBucketConfiguration={'LocationConstraint': region}
            )
        print(f"Bucket {bucket_name} created successfully in {region}.")
    else:
        print(f"Bucket {bucket_name} already exists.")

    # Upload the file to S3
    s3_client.upload_file(output_file_path, bucket_name, s3_key)
    print(f"File uploaded to s3://{bucket_name}/{s3_key}")
except Exception as e:
    # Stop here: the fine-tuning job cannot run without the dataset in S3
    print(f"Error preparing the S3 dataset: {e}")
    raise
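
The region check in the bucket-creation code exists because S3 rejects a LocationConstraint of us-east-1, while every other region requires one. As a small sketch of that rule in isolation, the helper below builds the arguments for create_bucket; the function name create_bucket_kwargs and the example bucket name are assumptions, not part of the original code.

```python
def create_bucket_kwargs(bucket_name: str, region: str) -> dict:
    """Build the keyword arguments for s3_client.create_bucket for a given region."""
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Outside us-east-1, S3 expects the region as a LocationConstraint
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("example-bucket", "us-east-1"))
print(create_bucket_kwargs("example-bucket", "eu-west-1"))
```

With this helper, the two branches collapse into a single call such as `s3_client.create_bucket(**create_bucket_kwargs(bucket_name, region))`.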

Step 3: Create and submit a fine-tuning job

With the dataset uploaded to Amazon S3 and the necessary resources in place, the next step is to create and submit the fine-tuning job. This involves specifying the pre-trained foundation model, the job details, and the fine-tuning parameters.

In this example, we fine-tune the Cohere command-light-text-v14 model to summarize medical conversations. Below is the configuration used to submit the job:

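# Initialize the Bedrock control-plane client
bedrock = boto3.client("bedrock", region_name=region)

# Define the job parameters
# (role_arn must hold the ARN of an IAM role that Bedrock can assume to access the S3 bucket)
base_model_id = "cohere.command-light-text-v14:7:4k"
job_name = "cohere-Summarizer-medical-finetuning-job-v1"
model_name = "cohere-Summarizer-medical-Tuned-v1"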

# Submit the fine-tuning job
bedrock.create_model_customization_job(
    customizationType="FINE_TUNING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role_arn,
    baseModelIdentifier=base_model_id,
    hyperParameters={
        "epochCount": "3",  # Number of passes over the dataset
        "batchSize": "16",  # Number of samples per training step
        "learningRate": "0.00005",  # Learning rate for weight updates
    },
    trainingDataConfig={"s3Uri": f"s3://{bucket_name}/{s3_key}"},
    outputDataConfig={"s3Uri": f"s3://{bucket_name}/finetuned/"}
)

Key Parameters:

  • Base Model: The pre-trained model (cohere.command-light-text-v14) serves as the foundation for customization.
  • Job Name and Model Name: These identifiers help track the fine-tuning job and the resulting fine-tuned model for future deployments.

Hyperparameters:

  • epochCount: Specifies the number of passes over the training dataset. Three epochs are used here for demonstration, but more epochs may yield better results for larger datasets.
  • batchSize: Determines how many samples are processed in each training step. A value of 16 balances memory usage and training efficiency.
  • learningRate: Sets the pace at which the model learns. Lower values ensure stable training but may require more time to converge.

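Training and Output Configuration:

  • trainingDataConfig: Points to the S3 location of the dataset.
  • outputDataConfig: Specifies the S3 location where Bedrock writes the job's output, such as training metrics. The fine-tuned model itself is managed by Bedrock and appears under Custom Models.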

Considerations:
The parameters, especially the hyperparameters, can be adjusted to optimize the fine-tuning process:

  • Smaller datasets may benefit from lower batchSize values.
  • Complex tasks may require more epochs to achieve convergence.
  • Learning rates should be fine-tuned to balance training stability and speed.
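
To make the interaction between batchSize and epochCount concrete, the sketch below estimates how many training steps a job performs. It assumes the usual convention that a final partial batch still counts as one step; Bedrock's internal handling may differ, and the dataset size of 200 examples is purely hypothetical.

```python
import math


def training_steps(num_examples: int, batch_size: int, epoch_count: int) -> int:
    """Approximate number of training steps: batches per epoch times epochs."""
    if num_examples <= 0 or batch_size <= 0 or epoch_count <= 0:
        raise ValueError("all arguments must be positive")
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epoch_count


# Hypothetical dataset of 200 examples with the epochCount used in this article
print(training_steps(200, 16, 3))  # 13 batches per epoch * 3 epochs = 39
print(training_steps(200, 8, 3))   # 25 batches per epoch * 3 epochs = 75
```

Halving the batch size roughly doubles the number of weight updates for the same data, which is one reason a smaller batchSize can suit a small dataset.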

This step officially kicks off the fine-tuning process, allowing Amazon Bedrock to handle the heavy lifting of training your model with the provided data and configuration.

You can also check the status of the fine-tuning job programmatically:

status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]
print(f"Job status: {status}")
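
Because a fine-tuning job can run for a while, a single status check is often wrapped in a polling loop. The following is a minimal sketch of such a loop, not part of the original code: the function name wait_for_job, the polling interval, and the retry budget are assumptions. The status getter is passed in as a callable so the loop can be tried without an AWS account.

```python
import time

# Statuses after which the job will not change any further
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(get_status, poll_seconds=60, max_polls=240, sleep=time.sleep):
    """Call get_status() repeatedly until it returns a terminal status.

    Returns the final status, or raises TimeoutError after max_polls attempts.
    """
    for _ in range(max_polls):
        status = get_status()
        print(f"Job status: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("fine-tuning job did not finish within the polling budget")


# Example wiring with the client and job name used in this article:
# final_status = wait_for_job(
#     lambda: bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]
# )
```

Only continue to the next step if the returned status is Completed; a Failed or Stopped job produces no custom model.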

The status of the fine-tuning job can also be seen in the Bedrock console:

Training job in custom model - Amazon Bedrock

Step 4: Purchase provisioned throughput

To use the model for inference, you need to purchase "Provisioned Throughput." In the Amazon Bedrock sidebar of your AWS console, go to "Custom Models," choose the "Models" tab, select the model you have trained, and then click "Purchase Provisioned Throughput."

Purchase provisioned throughput

Give the provisioned throughput a name, select a commitment term (you can choose "No Commitment" for testing), and then click "Purchase Provisioned Throughput." You will be able to see the estimated price as well. Once this is set up, you'll be able to use the model for inference.

Commitment in Amazon Bedrock

To access your deployed model's endpoint, you'll need its ARN. Go to the "Provisioned Throughput" section under Inference in the sidebar. Select the name of your fine-tuned model, and on the new page, copy the ARN for use in the next step. Keep in mind that provisioning throughput may take a few minutes to complete.

Custom model's ARN
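
The same purchase can also be made from code instead of the console. The sketch below wraps the two Bedrock control-plane calls involved; the helper names and the provisioned throughput name "summarizer-pt" are hypothetical, and the client is passed in as an argument so the helpers stay easy to test. Leaving out commitmentDuration requests the no-commitment option.

```python
def purchase_provisioned_throughput(bedrock_client, custom_model_id, name, model_units=1):
    """Create provisioned throughput for a custom model and return its ARN.

    custom_model_id may be the custom model's name or ARN.
    """
    response = bedrock_client.create_provisioned_model_throughput(
        modelUnits=model_units,
        provisionedModelName=name,
        modelId=custom_model_id,
    )
    return response["provisionedModelArn"]


def provisioned_throughput_status(bedrock_client, provisioned_model_arn):
    """Return the current status, for example 'Creating' or 'InService'."""
    response = bedrock_client.get_provisioned_model_throughput(
        provisionedModelId=provisioned_model_arn
    )
    return response["status"]


# Example wiring (bedrock is the boto3 "bedrock" client, model_name the custom model name):
# arn = purchase_provisioned_throughput(bedrock, model_name, "summarizer-pt")
# print(provisioned_throughput_status(bedrock, arn))
```

The returned ARN is the value to use as the model identifier for inference once the status reaches InService. Billing starts when the provisioned throughput is created, so delete it when you are done testing.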

Step 5: Test our fine-tuned model

In this step, we will make a request to the model for inference. Be sure to replace YOUR_MODEL_ARN with the ARN you copied earlier.

# Initialize Bedrock runtime client
bedrock_runtime = boto3.client(service_name="bedrock-runtime", region_name=region)

# Define a prompt for model inference
prompt = """
[doctor] Good morning, Mr. Smith. How have you been feeling since your last visit?  
[patient] Good morning, doctor. I've been okay overall, but I’ve been struggling with persistent fatigue and some dizziness.  
[doctor] I see. Is the dizziness occurring frequently or only under specific circumstances?  
[patient] It’s mostly when I stand up quickly or after I've been walking for a while.  
[doctor] Have you noticed any changes in your heart rate or shortness of breath during these episodes?  
[patient] No shortness of breath, but I do feel my heart racing sometimes.  

[doctor] How about your medications? Are you taking them as prescribed?  
[patient] Yes, but I missed a few doses of my beta-blocker last week due to travel.  
[doctor] That could explain some of the symptoms. I’ll need to check your blood pressure and do an EKG to assess your heart rhythm.  
[patient] Okay, doctor.  

[doctor] How has your diet been? Are you still following the low-sodium plan we discussed?  
[patient] I’ve been trying, but I’ve slipped up a bit during holidays with family meals.  
[doctor] I understand. We’ll reinforce that, as it’s critical for managing your hypertension.  
[patient] Yes, I’ll make sure to get back on track.  

[doctor] Let’s discuss the results from your last bloodwork. Your cholesterol levels were slightly elevated, and your hemoglobin A1c suggests borderline diabetes.  
[patient] I see. What does that mean for me?  
[doctor] It means we need to focus on dietary changes and consider starting a low-dose statin. I’ll also refer you to a nutritionist for better meal planning.  
[patient] That makes sense. Thank you, doctor.  

[doctor] Lastly, you mentioned experiencing more frequent leg swelling recently. Is that still a concern?  
[patient] Yes, especially after long days at work.  
[doctor] That could be a sign of fluid retention. I’ll adjust your diuretic dose and monitor your progress over the next two weeks.  
[patient] Thank you, doctor.  

[doctor] All right, let’s get those tests done and review everything at our next appointment. Do you have any other concerns?  
[patient] No, I think that’s all for now.  
[doctor] Great. See you in two weeks. 
"""

# Define the inference request body
body = {
    "prompt": prompt,
    "temperature": 0.5,
    "p": 0.9,
    "max_tokens": 80,
}

# Specify the ARN of the custom model
custom_model_arn = "YOUR_MODEL_ARN" #Put your model ARN here

# Invoke the custom model for inference
try:
    response = bedrock_runtime.invoke_model(
        modelId=custom_model_arn,
        body=json.dumps(body)
    )

    # Read and parse the response
    response_body = response['body'].read().decode('utf-8')
    result = json.loads(response_body)

    # Extract the summary from the response
    summary_text = result['generations'][0]['text']
    print("Extracted Summary:", summary_text)
except Exception as e:
    print(f"Error invoking model: {e}")
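
The training records in Step 2 prefix every dialogue with "Summarize the following conversation." A fine-tuned model often responds best when inference prompts follow the same template, so one option is to build the request body with that prefix. This is a sketch under that assumption rather than the author's method; the helper names are hypothetical, and the default parameter values simply mirror the request body above.

```python
import json

# Instruction prefix used when the training prompts were built in Step 2
PROMPT_PREFIX = "Summarize the following conversation.\n\n"


def build_request_body(dialogue: str, temperature: float = 0.5,
                       p: float = 0.9, max_tokens: int = 80) -> str:
    """Return the JSON request body with the dialogue wrapped in the training-time prompt."""
    return json.dumps(
        {
            "prompt": PROMPT_PREFIX + dialogue.strip(),
            "temperature": temperature,
            "p": p,
            "max_tokens": max_tokens,
        }
    )


def extract_summary(response_body: str) -> str:
    """Pull the generated text out of a response shaped like the one parsed above."""
    result = json.loads(response_body)
    return result["generations"][0]["text"].strip()
```

The string returned by build_request_body can be passed as the body argument of invoke_model in place of json.dumps(body). If summaries come back cut off, raising max_tokens is the first setting to try.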

I tested it with the following conversation to evaluate its ability to generate concise and meaningful summaries for medical dialogues. The input conversation is designed to reflect a real-world doctor-patient interaction, emphasizing symptoms, medication adherence, and a follow-up plan:

[doctor] Good morning, Mr. Smith. How have you been feeling since your last visit?

[patient] Good morning, doctor. I've been okay overall, but I’ve been struggling with persistent fatigue and some dizziness.

[doctor] I see. Is the dizziness occurring frequently or only under specific circumstances?

[patient] It’s mostly when I stand up quickly or after I've been walking for a while.

[doctor] Have you noticed any changes in your heart rate or shortness of breath during these episodes?

[patient] No shortness of breath, but I do feel my heart racing sometimes.

[doctor] How about your medications? Are you taking them as prescribed?

[patient] Yes, but I missed a few doses of my beta-blocker last week due to travel.

[doctor] That could explain some of the symptoms. I’ll need to check your blood pressure and do an EKG to assess your heart rhythm.

[patient] Okay, doctor.

[doctor] How has your diet been? Are you still following the low-sodium plan we discussed?

[patient] I’ve been trying, but I’ve slipped up a bit during holidays with family meals.

[doctor] I understand. We’ll reinforce that, as it’s critical for managing your hypertension.

[patient] Yes, I’ll make sure to get back on track.

[doctor] Let’s discuss the results from your last bloodwork. Your cholesterol levels were slightly elevated, and your hemoglobin A1c suggests borderline diabetes.

[patient] I see. What does that mean for me?

[doctor] It means we need to focus on dietary changes and consider starting a low-dose statin. I’ll also refer you to a nutritionist for better meal planning.

[patient] That makes sense. Thank you, doctor.

[doctor] Lastly, you mentioned experiencing more frequent leg swelling recently. Is that still a concern?

[patient] Yes, especially after long days at work.

[doctor] That could be a sign of fluid retention. I’ll adjust your diuretic dose and monitor your progress over the next two weeks.

[patient] Thank you, doctor.

[doctor] All right, let’s get those tests done and review everything at our next appointment. Do you have any other concerns?

[patient] No, I think that’s all for now.

[doctor] Great. See you in two weeks.

You can also test the inference directly from the Playground in the Amazon Bedrock console. To do this, navigate to Chat/Text under the Playground section, select your fine-tuned model, and enter your desired prompt.

Playground in Amazon Bedrock

Input to the model: the same doctor-patient conversation shown above.

Model's Response:

Amazon Bedrock playground

Step 6: Clean up

To avoid incurring additional costs, please ensure that you remove any provisioned throughput. You can remove provisioned throughput by navigating to the Provisioned Throughput section from the sidebar in the Amazon Bedrock console. Select the active provisioned throughput and delete it.
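
The cleanup can also be scripted. The sketch below lists provisioned throughputs and deletes those whose name starts with a given prefix; the function name and the prefix-based selection are assumptions for illustration, and only the first page of results is inspected, which is enough for a small test account.

```python
def delete_provisioned_throughputs(bedrock_client, name_prefix):
    """Delete provisioned throughputs whose name starts with name_prefix.

    Returns the ARNs that were submitted for deletion.
    """
    if not name_prefix:
        # Guard against accidentally deleting everything in the account
        raise ValueError("name_prefix must not be empty")
    response = bedrock_client.list_provisioned_model_throughputs()
    deleted = []
    for summary in response.get("provisionedModelSummaries", []):
        if summary["provisionedModelName"].startswith(name_prefix):
            bedrock_client.delete_provisioned_model_throughput(
                provisionedModelId=summary["provisionedModelArn"]
            )
            deleted.append(summary["provisionedModelArn"])
    return deleted


# Example wiring (bedrock is the boto3 "bedrock" client):
# print(delete_provisioned_throughputs(bedrock, "summarizer-"))
```

After running it, confirm in the console that no provisioned throughput remains active.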

Conclusion

Fine-tuning and deploying custom AI models on Amazon Bedrock unlocks the potential to create tailored solutions for specific use cases, such as summarizing medical dialogues. This guide has walked you through every step of the process, from preparing your dataset and configuring fine-tuning parameters to testing your model and deploying it for real-world inference. By leveraging the robust infrastructure and tools provided by Amazon Bedrock, you can streamline the fine-tuning process and focus on delivering impactful AI-driven solutions.

The steps outlined in this article illustrate how even a relatively small, structured dataset can yield meaningful results with careful preparation and parameter tuning. Whether you're exploring summarization, classification, or other NLP tasks, Amazon Bedrock makes advanced model customization accessible and efficient.

As you begin your fine-tuning journey, remember to experiment with hyperparameters and test your model rigorously to ensure optimal performance. Lastly, always clean up unused resources to avoid unnecessary costs. For further exploration, check out the complete implementation on GitHub: https://github.com/miladrezaei-ai/bedrock-custom-model-finetuning

With Amazon Bedrock, the possibilities for building intelligent, custom AI models are endless—empowering businesses to innovate and thrive in the evolving AI landscape.

Top comments (2)

Jason Dunn [AWS]

Excellent article, especially with all the great examples!

Milad Rezaeighale

Thank you for your feedback, Jason!