Welcome to the Starter's Guide: an end-to-end guide to building with LLMs on SageMaker! This guide is crafted for developers just starting with AWS, SageMaker, or generative AI. In this step-by-step walkthrough, you'll learn how to set up your AWS environment, understand key SageMaker components, and build with large language models (LLMs), gaining the practical know-how and confidence to start your own AI projects.
Prerequisites
- Basic Technical Knowledge: You should have a foundational understanding of Python, cloud computing, and the basics of generative AI.
- AWS Account Setup: Make sure you have an active AWS account. If you haven't set one up yet, please watch this video for a step-by-step guide on creating and activating your AWS account.
Setting Up Your AWS Environment
Before you start using Amazon SageMaker, it’s essential to have a well-configured, secure, and organized AWS environment. This section will guide you through accessing the AWS Management Console, locating SageMaker, and understanding best practices for setting up your environment.
Navigating the AWS Console
Follow these step-by-step instructions to access the AWS Management Console and locate the SageMaker service:
Access the AWS Management Console:
- Step 1: Open your web browser and navigate to the AWS Management Console at https://aws.amazon.com/console/.
- Step 2: Log in using your AWS account credentials. If you don’t have an account, refer to the AWS account setup guide provided earlier. The AWS console is the interface that provides a high-level overview of, and access to, Amazon Web Services (AWS) services.
Locate the AWS Services
- Step 3: Once logged in, you will see the AWS Management Console homepage, which displays a grid or list of AWS services.
- Step 4: In the top search bar, type “SageMaker.” As you type, the console will filter services related to your search.
- Step 5: From the search results, you might see multiple SageMaker-related services (e.g., “Amazon SageMaker” and “Amazon SageMaker Studio”). Click on “Amazon SageMaker” to access the service’s dedicated page.
- Step 6: You are now in the SageMaker console, where you can begin setting up your domain, creating notebooks, and configuring other resources.
Note: As of December 3, 2024, Amazon SageMaker was renamed Amazon SageMaker AI, focusing on building, training, and deploying machine learning models, and the next generation of Amazon SageMaker has become an integrated platform with data, analytics, and AI capabilities.
Environment Best Practices
Setting up your AWS environment securely and in an organized manner is crucial for several reasons:
- IAM Roles & Permissions: Apply the principle of least privilege. Use different roles for different tasks to reduce vulnerability.
- Multi-Factor Authentication: Enable MFA for an extra layer of login security.
- Credential Management: Avoid storing sensitive data like access keys directly in code; use AWS Secrets Manager or environment variables instead.
- Naming Conventions: Use consistent naming to easily identify and manage resources.
- Resource Tagging: Resources can be tagged by project, environment, or cost center to simplify tracking and grouping.
- Environment Isolation: Use separate AWS accounts or VPCs for different projects or stages to prevent cross-access.
- Cost Management: Set billing alerts and monitor usage with tools like AWS Cost Explorer.
- Documentation: Document your procedures to ensure consistency and efficiency.
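The naming and tagging conventions above can be sketched as small helpers. This is just an illustration of one possible convention; the project, environment, and cost-center names are hypothetical, not AWS requirements:

```python
# Sketch of a naming/tagging convention helper. The specific name parts
# (project, environment, cost center) are hypothetical examples.
def resource_name(project: str, env: str, resource: str) -> str:
    """Build a consistent resource name like 'llm-demo-dev-notebook'."""
    return f"{project}-{env}-{resource}".lower()

def standard_tags(project: str, env: str, cost_center: str) -> list[dict]:
    """Tags in the [{'Key': ..., 'Value': ...}] shape most AWS APIs accept."""
    return [
        {"Key": "Project", "Value": project},
        {"Key": "Environment", "Value": env},
        {"Key": "CostCenter", "Value": cost_center},
    ]

print(resource_name("llm-demo", "dev", "notebook"))  # llm-demo-dev-notebook
```

Applying the same helpers to every resource you create makes it easy to filter by tag later in AWS Cost Explorer.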
Configuring Amazon SageMaker
In this section, we walk through configuring Amazon SageMaker so you have a secure, well-structured environment ready for our LLM projects. We will cover setting up a Unified Studio domain, granting access to Amazon Bedrock models, setting up a Virtual Private Cloud (VPC), and reviewing some additional settings. Click ‘Create a Unified Studio domain’ to continue.
Creating a Unified Studio Domain
To get started with development within Amazon SageMaker, we need to create a Unified Studio domain. Think of this domain as a centralized container for organizing and managing users, projects, and resources. The domain manages various settings, configurations, and permissions.
First, you can choose between a quick or manual setup. I recommend the quick setup, as it requires minimal configuration and is best for exploration purposes. Next, head down to the ‘Quick Setup Settings’ to configure our domain. Key steps to set up a domain include:
- Domain Creation: Create a domain with a custom name or use the default (domain-MM-DD-YYYY-HHMMSS, generated at launch). This name is permanent and will be used to identify the workspace.
- Domain Execution Role: This role enables SageMaker Studio to perform operations on your behalf across various AWS services.
- Domain Service Role: This role is used by background services within SageMaker to function properly.
- Data Encryption: By default, your data is automatically encrypted but if you prefer more control, you can switch to using your own encryption key through AWS Key Management Service (KMS).
Tip: If you’re setting up an environment specifically for this project, or need a fresh set of permissions that adhere strictly to the least-privilege principle, create new roles. If you already have roles configured with the appropriate permissions and want to maintain consistency across your projects, choose existing roles.
Access to Amazon Bedrock
Next, we need to grant access to Amazon Bedrock, a service that provides fully managed generative AI foundation models. We will use these model connections for our LLM use case. To grant access:
- Initiate the Process: Within your SageMaker domain setup, look for the option to “Grant Access” to Amazon Bedrock models.
- Enable Models: You will be prompted to either “Enable all models” or select specific models depending on your use case.
- Provider-Specific Steps: Some model providers (e.g., Anthropic) may request additional information regarding your intended use case. Follow the instructions provided on-screen.
- Confirmation: Once access is granted, verify that the models are enabled and appear in your SageMaker domain settings.
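The confirmation step can also be done programmatically. With real credentials you would call boto3's Bedrock client (`boto3.client("bedrock").list_foundation_models()`); the sketch below uses a hard-coded sample response with the same shape, since we can't call AWS here:

```python
# Sketch: check that a Bedrock model appears in the account's model list.
# In a real notebook, fetch the response with:
#   response = boto3.client("bedrock").list_foundation_models()
# Here we use a sample dict with the same shape for illustration.
sample_response = {
    "modelSummaries": [
        {"modelId": "meta.llama3-70b-instruct-v1:0"},
        {"modelId": "amazon.titan-embed-text-v1"},
    ]
}

def model_available(response: dict, model_id: str) -> bool:
    """Return True if model_id is listed in the response."""
    return any(m["modelId"] == model_id for m in response["modelSummaries"])

print(model_available(sample_response, "meta.llama3-70b-instruct-v1:0"))  # True
```

Note that a model can be listed but not yet granted; the console's model-access page remains the authoritative check.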
VPC Configuration
A Virtual Private Cloud (VPC) is your own isolated section of the AWS cloud. It enables you to launch AWS resources in a virtual network that you define, ensuring enhanced security and controlled connectivity. This isolation is essential, as it keeps your data private while working with the foundation models provided by Amazon Bedrock.
Choosing or Creating a VPC
- Create a New VPC: For dedicated, isolated environments, you can create a new VPC with default settings that AWS suggests.
- Using an Existing VPC: If your organization already has a VPC configured for development, you can select it. Ensure that it meets the required security and connectivity standards.
Understanding Subnets
A subnet in Amazon VPC is a smaller section of your cloud network where you can place and organize your resources separately. Distributing your resources across at least three subnets in different Availability Zones enhances fault tolerance and connectivity.
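To make the subnet layout concrete, here is a small sketch that carves a VPC CIDR block into one non-overlapping subnet per Availability Zone using Python's standard ipaddress module. The CIDR and AZ names are illustrative; AWS proposes similar defaults when it creates a VPC for you:

```python
# Sketch: carve a VPC CIDR into one subnet per Availability Zone.
# The 10.0.0.0/16 CIDR and us-east-1 AZ names are illustrative examples.
import ipaddress

def plan_subnets(vpc_cidr: str, azs: list[str], new_prefix: int = 20) -> dict:
    """Pair each AZ with a non-overlapping subnet carved from the VPC CIDR."""
    subnets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return {az: str(next(subnets)) for az in azs}

plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])
print(plan)
# {'us-east-1a': '10.0.0.0/20', 'us-east-1b': '10.0.16.0/20', 'us-east-1c': '10.0.32.0/20'}
```

Spreading the three subnets across three AZs this way is what gives the domain its fault tolerance: if one zone becomes unavailable, resources in the other subnets keep running.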
Additional Settings
Aside from these primary settings, your environment will use the following configuration:
- Provisioning Role: Used to provision and manage resources (like S3 buckets for projects).
- Manage Access Role: Grants permissions to manage access to shared data and projects across your organization.
- Amazon S3 Bucket: SageMaker requires an S3 bucket for project storage of data and artifacts. The system will either use a default bucket or create a new one if necessary.
- VPC and Subnets: As discussed, select the VPC and at least three subnets across different Availability Zones; if you created a new VPC, the respective subnets will be created automatically.
Tip: For exploration or initial setup, stick with the AWS default settings, as they are designed to work for most uses; if your project later has specific requirements around scalability, security, performance, or cost management, the configuration can be revisited.
We are ready to create our domain. Next load the unified studio and familiarize yourself with the interface.
Note: When you launch a SageMaker Studio instance, it essentially spins up an EC2 instance behind the scenes to run the Jupyter notebook environment. SageMaker manages the EC2 instance for you and handles provisioning and scaling of the underlying infrastructure based on your project needs. It is also best practice to terminate (delete) an EC2 instance once the project is complete and the instance is no longer needed, to avoid unnecessary charges and ensure proper resource management.
Building Your LLM Application
In this section, we will develop an LLM application using a JupyterLab notebook instance in SageMaker. We will build a simple, naive retrieval-augmented generation (RAG) pipeline with LlamaIndex. The pipeline sources its external knowledge base from a sample of AWS documentation about SageMaker Partner AI Apps. We then convert the document into embeddings, index the content in a vector store, and execute queries using Bedrock models connected through LlamaIndex.
Launching a Development Instance
Create a New Project:
- With the SageMaker Studio already opened, click on the Projects tab.
- Click Create New Project.
- Enter a project name (e.g., simple-RAG) and a description for the project.
- Click Continue to move to the blueprint parameter customization page.
- The blueprint parameters are underlying configuration you can adjust to meet specific development and operational needs, such as storage limits, log retention, and integration settings. We can go ahead and use the default parameters.
- Review the selections, click ‘Create project’, and wait for your project environment to initialize.
- Within your new project, you can start with the getting_started.ipynb notebook and launch a notebook instance via JupyterLab on the Build tab.
- For exploration purposes, this notebook instance will allow you to iteratively build and test your pipeline.
Developing the Application
The JupyterLab instance environment already comes with some packages pre-installed; you can run pip list in the terminal to see them. One important pre-installed package is boto3, the AWS SDK for Python, which aids interactions with AWS services such as connecting to an S3 bucket.
Before you begin development, install the following packages within your notebook environment. Open the terminal or a notebook cell and run:
pip install llama-index llama-index-llms-bedrock llama-index-embeddings-bedrock llama-index-readers-web
**LlamaIndex** is an open-source data orchestration framework for the context augmentation needed when building LLM applications. It is available in Python and TypeScript. We will use its Amazon Bedrock model connections to build our simple, naive RAG application, with a web page as our external knowledge base. Let’s import the relevant modules:
import logging
import sys
from IPython.display import Markdown, display
import os
import boto3
from llama_index.readers.web import SimpleWebPageReader
from llama_index.core import VectorStoreIndex
from llama_index.core.settings import Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
Next, we need to specify AWS credentials for authentication, authorization, security, and account association. To obtain an AWS access key, navigate to your AWS Management Console, click your Profile name, select Security credentials, navigate to the Access Keys section, click Create New Access Key, then click Show Access Key to reveal it, and finally save and download the key.
Next, we will specify those credentials as follows:
aws_access_key_id="AWS Access Key ID to use"
aws_secret_access_key="AWS Secret Access Key to use"
region_name="AWS Region to use, eg. us-east-1"
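Hard-coding keys in a notebook is risky if the notebook is ever shared or committed to version control. A safer sketch is to read them from environment variables; the variable names below are the standard ones boto3 itself recognizes, and the demo values are placeholders, not real credentials:

```python
# Sketch: read AWS credentials from environment variables instead of
# hard-coding them. setdefault() only supplies the placeholder demo
# values when the variables are not already set in the environment.
import os

os.environ.setdefault("AWS_ACCESS_KEY_ID", "demo-access-key-id")
os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "demo-secret-key")
os.environ.setdefault("AWS_DEFAULT_REGION", "us-east-1")

aws_access_key_id = os.environ["AWS_ACCESS_KEY_ID"]
aws_secret_access_key = os.environ["AWS_SECRET_ACCESS_KEY"]
region_name = os.environ["AWS_DEFAULT_REGION"]
```

The same three variables can then be passed to the Bedrock and BedrockEmbedding constructors below, keeping secrets out of the notebook source.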
Next, we will use Meta’s llama3-70B model as the LLM and Amazon’s Titan-embed-text embedding model. We will also use a sentence splitter as the text chunking strategy. These models and the chunking strategy can be specified and saved within the LlamaIndex Settings configuration as follows:
llm = Bedrock(model='meta.llama3-70b-instruct-v1:0',
context_size=8200,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
region_name=region_name)
embed_model = BedrockEmbedding(model_name='amazon.titan-embed-text-v1',
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
region_name=region_name)
# Configure global settings
Settings.text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=10)
Settings.llm = llm
Settings.embed_model = embed_model
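The chunk_size and chunk_overlap parameters above work like a sliding window over the document. The simplified, character-level sketch below only illustrates how the two parameters interact; LlamaIndex's actual SentenceSplitter works on tokens and respects sentence boundaries:

```python
# Simplified illustration of chunking with overlap. This is NOT how
# LlamaIndex's SentenceSplitter is implemented (it is token-based and
# sentence-aware); it only shows the chunk_size/chunk_overlap mechanics.
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    step = chunk_size - chunk_overlap  # each chunk starts `step` chars later
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

The overlap ensures that a sentence falling on a chunk boundary still appears intact in at least one chunk, which improves retrieval quality at the cost of a slightly larger index.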
Lastly, let's create a query engine that takes a natural language query and returns a rich, context-aware response. We will use Amazon documentation as the external data source to improve the LLM's context via an index and retriever, as follows:
# Load HTML web page as a document
document = SimpleWebPageReader(html_to_text=True).load_data(['https://docs.aws.amazon.com/sagemaker/latest/dg/partner-apps.html'])
# Indexing document using vector store
index = VectorStoreIndex.from_documents(document, show_progress=True)
query_engine = index.as_query_engine()
Let’s process some queries as follows:
response = query_engine.query("What is the AWS AI partnership about?")
display(Markdown(f"<b>{response}</b>"))
response = query_engine.query("What is DeepChecks offering in the partnership?")
display(Markdown(f"<b>{response}</b>"))
Final Notes
This guide demonstrates how to set up a secure and organized AWS environment, configure Amazon SageMaker with a unified Studio domain, and integrate services like Amazon Bedrock to access generative AI models. Building on this robust AWS setup, the guide details how to develop a simple RAG application using open-source tools like LlamaIndex.