<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hassan Sherwani</title>
    <description>The latest articles on DEV Community by Hassan Sherwani (@hassan_sherwani_9dd766c43).</description>
    <link>https://dev.to/hassan_sherwani_9dd766c43</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2253152%2F708e27ca-867a-4799-a05d-5f5e89500c73.jpg</url>
      <title>DEV Community: Hassan Sherwani</title>
      <link>https://dev.to/hassan_sherwani_9dd766c43</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hassan_sherwani_9dd766c43"/>
    <language>en</language>
    <item>
      <title>Building Serverless Agentic Workflows with Amazon Bedrock</title>
      <dc:creator>Hassan Sherwani</dc:creator>
      <pubDate>Sat, 30 Nov 2024 02:15:21 +0000</pubDate>
      <link>https://dev.to/hassan_sherwani_9dd766c43/building-serverless-agentic-workflows-with-amazon-bedrock-1j6h</link>
      <guid>https://dev.to/hassan_sherwani_9dd766c43/building-serverless-agentic-workflows-with-amazon-bedrock-1j6h</guid>
      <description>&lt;p&gt;In the rapidly evolving landscape of AI-driven automation, serverless computing stands as a transformative force, enabling developers to craft scalable and efficient applications without the constraints of managing infrastructure. Amazon Bedrock, a fully managed service from AWS, unlocks the potential of serverless computing&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7owcsx977cm1vyl257t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7owcsx977cm1vyl257t.jpg" alt="Image description" width="300" height="168"&gt;&lt;/a&gt; and advanced AI models, empowering the creation of intelligent workflows led by AI agents. In this blog, we will delve into the course on Serverless Agentic Workflows with Amazon Bedrock and share coding examples to inspire your journey.&lt;/p&gt;
&lt;h1&gt;
  
  
  What is Amazon Bedrock?
&lt;/h1&gt;

&lt;p&gt;Amazon Bedrock is a fully managed service that gives developers access to a diverse range of foundation models (FMs) from top AI providers such as AI21 Labs, Anthropic, and Stability AI, alongside Amazon's own Titan models. Because it is serverless, Bedrock lets you build and scale AI-driven applications without managing infrastructure, so you can focus on your application logic while Bedrock handles the operational complexity.&lt;/p&gt;

&lt;p&gt;On top of this, serverless agentic workflows let you automate intricate tasks with intelligent AI agents. These agents can interact with external systems, make informed decisions, invoke APIs, and pull in data when necessary.&lt;/p&gt;
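&lt;p&gt;Conceptually, an agentic workflow is a loop in which the agent decides which tool to call next and feeds the result back into its reasoning. The following framework-free Python sketch illustrates the idea; the tool name, order data, and keyword routing are invented for demonstration and are not part of any Bedrock API:&lt;/p&gt;

```python
# Illustrative agent loop: the "decide" step is a stand-in for an LLM call.
def get_order_status(order_id):
    # Hypothetical external-system lookup
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"get_order_status": get_order_status}

def run_agent(request):
    # A real agent would ask the model which tool to invoke;
    # here we route on a simple keyword for demonstration.
    if "order" in request.lower():
        result = TOOLS["get_order_status"]("12345")
        return f"Your order {result['order_id']} is {result['status']}."
    return "I can only help with order inquiries right now."

print(run_agent("Where is my order?"))
```

&lt;p&gt;In a real workflow, the keyword check would be replaced by a model call that selects a tool and its arguments.&lt;/p&gt;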
&lt;h1&gt;
  
  
  Key Concepts from the Course
&lt;/h1&gt;

&lt;p&gt;The course "Serverless Agentic Workflows with Amazon Bedrock" focuses on building intelligent workflows using Amazon Bedrock. Here are the main lessons covered in the course:&lt;/p&gt;
&lt;h3&gt;
  
  
  Creating Your First Serverless Agent:
&lt;/h3&gt;

&lt;p&gt;Learn to initialize an AI agent and integrate it with your workflows.&lt;/p&gt;
&lt;h3&gt;
  
  
  Connecting to External Services:
&lt;/h3&gt;

&lt;p&gt;Make the agent interact with real-world systems, such as CRMs or databases.&lt;/p&gt;
&lt;h3&gt;
  
  
  Equipping the Agent with a Code Interpreter:
&lt;/h3&gt;

&lt;p&gt;Enable the agent to perform complex calculations and make data-driven decisions.&lt;/p&gt;
&lt;h3&gt;
  
  
  Implementing Guardrails:
&lt;/h3&gt;

&lt;p&gt;Add security and ethical guardrails to control the agent’s behavior.&lt;/p&gt;
&lt;h3&gt;
  
  
  Connecting to Support Documents:
&lt;/h3&gt;

&lt;p&gt;Use Retrieval-Augmented Generation (RAG) to allow your agent to query documents and resolve issues autonomously.&lt;br&gt;
Let’s break down these lessons with coding examples!&lt;/p&gt;
&lt;h2&gt;
  
  
  1. Creating Your First Serverless Agent
&lt;/h2&gt;

&lt;p&gt;In this section, we’ll initialize an AI agent using Amazon Bedrock. The agent can be used to handle tasks like customer support, product recommendation, or simple Q&amp;amp;A.&lt;/p&gt;

&lt;p&gt;Here’s how you can set up your first serverless agent:&lt;/p&gt;
&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AWS account&lt;/li&gt;
&lt;li&gt;Amazon Bedrock service access&lt;/li&gt;
&lt;li&gt;Basic knowledge of Python&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's initialize an AI agent:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

# Initialize the Amazon Bedrock client
client = boto3.client('bedrock')

# Set up the parameters for the agent
parameters = {
    'modelId': 'AmazonTitan',
    'input': 'Hello, how can I assist you today?'
}

# Call the model to initialize the agent
response = client.invoke_model(**parameters)

# Output the agent's response
print(response['output'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This code sends a simple prompt ("Hello, how can I assist you today?") to an Amazon Titan text model through the invoke_model API. Note that model invocation uses the bedrock-runtime client rather than the control-plane bedrock client, and the response body is a JSON stream that must be read and parsed before the generated text can be printed.&lt;/p&gt;
&lt;h2&gt;
  
  
  2. Connecting to External Services
&lt;/h2&gt;

&lt;p&gt;To make the AI agent more powerful, we can integrate it with external services like a CRM or ticketing system. For example, we can connect to a database to fetch customer records.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fetching Data from a Database
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

# Set up DynamoDB client
dynamodb = boto3.client('dynamodb')

def fetch_customer_info(customer_id):
    response = dynamodb.get_item(
        TableName='CustomerRecords',
        Key={'customer_id': {'S': customer_id}}
    )
    return response.get('Item', {})

# Fetch customer info
customer_id = '12345'
customer_info = fetch_customer_info(customer_id)

print(customer_info)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, the code connects to a DynamoDB table called CustomerRecords and fetches information for a specific customer. This integration allows the agent to retrieve and use real-time customer data to make informed decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Equipping the Agent with a Code Interpreter
&lt;/h2&gt;

&lt;p&gt;The agent can be equipped to execute code or perform calculations, which helps it make data-driven decisions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Initialize the Bedrock client
client = boto3.client('bedrock')

# Code for calculating product prices
def calculate_price(base_price, discount):
    return base_price - (base_price * discount / 100)

# Implement logic with agent
def agent_with_code_interpreter():
    base_price = 100  # Sample base price
    discount = 10     # Sample discount percentage
    final_price = calculate_price(base_price, discount)
    return f"The final price after a {discount}% discount is: ${final_price}"

# Call the agent
response = client.invoke_model(
    modelId='AmazonTitan',
    input=agent_with_code_interpreter()
)

print(response['output'])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, the agent performs a simple price calculation, reducing a base price by a specified discount percentage. This demonstrates how agents can use custom code logic to enhance their decision-making ability.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Implementing Guardrails
&lt;/h2&gt;

&lt;p&gt;Guardrails ensure that the AI agent behaves responsibly. For example, you can add logic to prevent the agent from revealing sensitive information or making unethical decisions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def guardrails(agent_output):
    prohibited_keywords = ['password', 'credit card']
    for keyword in prohibited_keywords:
        if keyword in agent_output:
            return "Sorry, I cannot provide that information."
    return agent_output

# Example agent output
agent_output = "Here is your password: 12345"

# Apply guardrails
safe_output = guardrails(agent_output)
print(safe_output)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code, the agent’s output is filtered for sensitive information like passwords. If any such information is detected, the agent responds with a safe message.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Connecting to Support Documents with Retrieval-Augmented Generation (RAG)
&lt;/h2&gt;

&lt;p&gt;Using RAG, your agent can query a database of support documents to retrieve the most relevant information when a customer asks a question.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Initialize the Bedrock client
client = boto3.client('bedrock')

# Define a function to query support documents
def query_support_documents(query):
    response = client.retrieve_documents(
        query=query,
        document_store='SupportDocsStore'
    )
    return response['documents'][0]  # Return the first relevant document

# Query the support documents
query = "How do I reset my password?"
document = query_support_documents(query)

print(document)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code, the agent queries a knowledge base of support documents (e.g., a collection of FAQ articles) to find the passage that best answers the user's question about resetting a password.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;This blog discusses building serverless agentic workflows with Amazon Bedrock. We highlight key course lessons and provide coding examples showing how AI agents can interact with external services, perform calculations, and retrieve information. By combining serverless architecture with advanced AI models, Amazon Bedrock makes it easier to build scalable applications for customer support and document processing.&lt;/p&gt;

&lt;p&gt;For those wanting to learn more, the Serverless Agentic Workflows with Amazon Bedrock course offers a step-by-step guide on creating and managing these workflows. It covers automating customer interactions, integrating with external services, and establishing ethical guidelines for AI agents.&lt;/p&gt;

&lt;p&gt;References:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serverless Agentic Workflows with Amazon Bedrock course: &lt;a href="https://learn.deeplearning.ai/courses/serverless-agentic-workflows-with-amazon-bedrock/lesson/1/introduction" rel="noopener noreferrer"&gt;https://learn.deeplearning.ai/courses/serverless-agentic-workflows-with-amazon-bedrock/lesson/1/introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Amazon Bedrock Agents: &lt;a href="https://aws.amazon.com/bedrock/agents/" rel="noopener noreferrer"&gt;https://aws.amazon.com/bedrock/agents/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>genai</category>
      <category>agenticai</category>
      <category>aws</category>
      <category>amazonbedrock</category>
    </item>
    <item>
      <title>Harnessing Multi-Agent Systems with CrewAI: Concepts, Coding, and Real-World Applications</title>
      <dc:creator>Hassan Sherwani</dc:creator>
      <pubDate>Sat, 23 Nov 2024 04:16:56 +0000</pubDate>
      <link>https://dev.to/hassan_sherwani_9dd766c43/harnessing-multi-agent-systems-with-crewai-concepts-coding-and-real-world-applications-1kib</link>
      <guid>https://dev.to/hassan_sherwani_9dd766c43/harnessing-multi-agent-systems-with-crewai-concepts-coding-and-real-world-applications-1kib</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;In the exciting world of AI and automation, Multi-Agent Systems (MAS) are making a big impact by spreading tasks among smart agents. These systems offer great benefits like scalability, teamwork, and efficiency, helping us tackle complex problems in various fields. In this blog, we’ll dive into the basics of multi-agent systems, highlight the amazing features of CrewAI, and chat about how these ideas can be used in areas like health insurance, finance, human resources, and real estate. Let’s explore together!&lt;/p&gt;

&lt;h1&gt;
  
  
  The Concept of Multi-Agent Systems
&lt;/h1&gt;

&lt;p&gt;A Multi-Agent System (MAS) consists of multiple autonomous entities called agents, which collaborate or compete to achieve a common goal. These systems are inspired by distributed problem-solving and emulate human teamwork by dividing large tasks into manageable subtasks.&lt;/p&gt;

&lt;h1&gt;
  
  
  Key Characteristics of MAS
&lt;/h1&gt;

&lt;h3&gt;
  
  
  Autonomy:
&lt;/h3&gt;

&lt;p&gt;Each agent operates independently, making decisions without external intervention.&lt;/p&gt;

&lt;h3&gt;
  
  
  Communication:
&lt;/h3&gt;

&lt;p&gt;Agents share data or task progress through defined protocols.&lt;/p&gt;

&lt;h3&gt;
  
  
  Collaboration:
&lt;/h3&gt;

&lt;p&gt;Agents coordinate their actions to fulfill a common objective.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adaptability:
&lt;/h3&gt;

&lt;p&gt;MAS can adjust to dynamic environments and task requirements.&lt;/p&gt;

&lt;h1&gt;
  
  
  Advantages of MAS
&lt;/h1&gt;

&lt;h3&gt;
  
  
  Scalability:
&lt;/h3&gt;

&lt;p&gt;Tasks are distributed across agents, reducing bottlenecks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Modularity:
&lt;/h3&gt;

&lt;p&gt;Agents can be added, removed, or replaced without impacting the system's functionality.&lt;/p&gt;

&lt;h3&gt;
  
  
  Efficiency:
&lt;/h3&gt;

&lt;p&gt;Parallel execution of subtasks optimizes resource usage.&lt;/p&gt;

&lt;h1&gt;
  
  
  Examples in Action
&lt;/h1&gt;

&lt;p&gt;In logistics, MAS can optimize delivery routes by assigning tasks to delivery agents.&lt;br&gt;
In robotics, teams of autonomous robots collaborate to assemble complex machinery.&lt;/p&gt;
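<p>The logistics example can be sketched in plain Python: each delivery is greedily assigned to the nearest agent. This is a deliberately simplified stand-in for a real MAS negotiation protocol, with made-up coordinates:</p>

```python
# Greedy task assignment: each delivery goes to the closest agent.
agents = {"agent_a": (0, 0), "agent_b": (5, 5)}
deliveries = [(1, 1), (6, 4), (0, 2)]

def distance(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])  # Manhattan distance

assignments = {}
for delivery in deliveries:
    # Pick the agent whose position is nearest to the delivery point
    best = min(agents, key=lambda name: distance(agents[name], delivery))
    assignments.setdefault(best, []).append(delivery)

print(assignments)
```

<p>A full MAS would let agents bid or negotiate for tasks; the greedy rule above is just the smallest possible illustration of distributing work.</p>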
&lt;h1&gt;
  
  
  What is CrewAI?
&lt;/h1&gt;

&lt;p&gt;CrewAI is a powerful Python-based framework that streamlines the creation and execution of multi-agent systems. It empowers developers to define agents, assign tasks, and orchestrate their interactions within a crew with ease. By harnessing the capabilities of large language models (LLMs) like GPT-3.5 and GPT-4, CrewAI delivers unparalleled efficiency and effectiveness in managing complex systems.&lt;/p&gt;
&lt;h1&gt;
  
  
  Key Components of CrewAI
&lt;/h1&gt;
&lt;h3&gt;
  
  
  Agents:
&lt;/h3&gt;

&lt;p&gt;Autonomous entities with defined roles, goals, and tools.&lt;/p&gt;
&lt;h3&gt;
  
  
  Tasks:
&lt;/h3&gt;

&lt;p&gt;Descriptions of specific objectives assigned to agents.&lt;/p&gt;
&lt;h3&gt;
  
  
  Crew:
&lt;/h3&gt;

&lt;p&gt;The orchestrator that binds agents and tasks, enabling collaborative execution.&lt;/p&gt;
&lt;h1&gt;
  
  
  Technical Implementation of MAS using CrewAI
&lt;/h1&gt;

&lt;p&gt;Let’s dive into the process of building a CrewAI-powered MAS for tailoring job applications, as demonstrated in the provided notebook.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Define the Agents
&lt;/h3&gt;

&lt;p&gt;Each agent is designed to handle a specific part of the process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from crewai import Agent

# Researcher Agent
researcher = Agent(
    role="Tech Job Researcher",
    goal="Analyze job postings to extract key qualifications and skills.",
    tools=[scrape_tool, search_tool],
    verbose=True,
    backstory="Expert in identifying job requirements."
)

# Profiler Agent
profiler = Agent(
    role="Personal Profiler for Engineers",
    goal="Compile detailed applicant profiles.",
    tools=[scrape_tool, search_tool, read_resume, semantic_search_resume],
    verbose=True,
    backstory="Skilled in creating comprehensive profiles from diverse data."
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Define the Tasks
&lt;/h3&gt;

&lt;p&gt;Tasks specify what each agent must accomplish.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from crewai import Task

# Research Task
research_task = Task(
    description="Analyze job postings to extract required skills.",
    expected_output="A structured list of job requirements.",
    agent=researcher,
    async_execution=True
)

# Profile Task
profile_task = Task(
    description="Create a detailed profile using applicant information.",
    expected_output="A comprehensive professional profile document.",
    agent=profiler,
    async_execution=True
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the Crew
&lt;/h3&gt;

&lt;p&gt;Combine agents and tasks into a cohesive crew.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from crewai import Crew

job_application_crew = Crew(
    agents=[researcher, profiler],
    tasks=[research_task, profile_task],
    verbose=True
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Execute the Crew
&lt;/h3&gt;

&lt;p&gt;Provide inputs and initiate the workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
inputs = {
    'job_posting_url': 'https://example.com/job-posting',
    'github_url': 'https://github.com/user',
    'personal_writeup': "An experienced software engineer with expertise in AI."
}

result = job_application_crew.kickoff(inputs=inputs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Case Studies: Applications in the Real World
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. Health Insurance
&lt;/h2&gt;

&lt;p&gt;Problem: Automating claim approvals with MAS.&lt;/p&gt;

&lt;p&gt;Agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claim Validator: Verifies the authenticity of claims using insurance databases.&lt;/li&gt;
&lt;li&gt;Medical Expert: Assesses the relevance of diagnoses.&lt;/li&gt;
&lt;li&gt;Fraud Detector: Identifies fraudulent claims using anomaly detection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract claim details.&lt;/li&gt;
&lt;li&gt;Validate against policy documents.&lt;/li&gt;
&lt;li&gt;Approve or flag claims for manual review.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Outcome: Faster claim processing and reduced fraud.&lt;/p&gt;
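<p>A heavily simplified version of this claims pipeline can be written as three plain-Python "agents" chained together; the rules, fields, and thresholds below are invented for illustration:</p>

```python
# Each function plays the role of one agent in the claims workflow.
def validate_claim(claim):
    return claim.get("policy_active", False)

def assess_diagnosis(claim):
    covered = {"flu", "fracture"}  # toy coverage list
    return claim.get("diagnosis") in covered

def detect_fraud(claim):
    return claim.get("amount", 0) > 10000  # crude anomaly threshold

def process_claim(claim):
    if not validate_claim(claim):
        return "rejected: inactive policy"
    if not assess_diagnosis(claim):
        return "flagged: diagnosis needs manual review"
    if detect_fraud(claim):
        return "flagged: possible fraud"
    return "approved"

print(process_claim({"policy_active": True, "diagnosis": "flu", "amount": 300}))
```

<p>In a real MAS each stage would be an autonomous agent backed by a model or database rather than a hard-coded rule, but the control flow is the same.</p>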

&lt;h2&gt;
  
  
  2. Finance
&lt;/h2&gt;

&lt;p&gt;Problem: Portfolio optimization for investment clients.&lt;/p&gt;

&lt;p&gt;Agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Market Analyst: Monitors real-time market data.&lt;/li&gt;
&lt;li&gt;Risk Assessor: Evaluates portfolio risks.&lt;/li&gt;
&lt;li&gt;Investment Strategist: Recommends asset allocations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collect client preferences and financial goals.&lt;/li&gt;
&lt;li&gt;Generate optimal portfolios using agent collaboration.&lt;/li&gt;
&lt;li&gt;Provide actionable insights to clients.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Outcome: Improved investment returns and client satisfaction.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Human Resources
&lt;/h2&gt;

&lt;p&gt;Problem: Streamlining candidate selection.&lt;/p&gt;

&lt;p&gt;Agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Job Matcher: Analyzes resumes against job descriptions.&lt;/li&gt;
&lt;li&gt;Interviewer: Generates targeted interview questions.&lt;/li&gt;
&lt;li&gt;Skill Evaluator: Assesses candidate skills through semantic search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parse and analyze candidate resumes.&lt;/li&gt;
&lt;li&gt;Generate evaluation reports.&lt;/li&gt;
&lt;li&gt;Assist interviewers with data-driven insights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Outcome: Enhanced recruitment efficiency.&lt;/p&gt;
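<p>The Job Matcher agent's core operation, comparing a resume against a job description, can be approximated with a simple keyword-overlap score; this is a crude stand-in for the semantic search a real agent would use, with made-up sample texts:</p>

```python
# Keyword-overlap matching: a crude proxy for semantic resume search.
def match_score(resume_text, job_description):
    resume_words = set(resume_text.lower().split())
    job_words = set(job_description.lower().split())
    overlap = resume_words.intersection(job_words)
    return len(overlap) / len(job_words) if job_words else 0.0

resume = "Senior Python developer with AWS and machine learning experience"
job = "Looking for Python developer with AWS experience"
print(f"match: {match_score(resume, job):.0%}")
```

<p>A production Job Matcher would use embeddings rather than raw word overlap, but the scoring-and-ranking structure is identical.</p>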

&lt;h2&gt;
  
  
  4. Real Estate
&lt;/h2&gt;

&lt;p&gt;Problem: Personalized property recommendations.&lt;/p&gt;

&lt;p&gt;Agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Market Researcher: Gathers property listings.&lt;/li&gt;
&lt;li&gt;Buyer Profiler: Analyzes buyer preferences.&lt;/li&gt;
&lt;li&gt;Price Evaluator: Predicts property prices using historical data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collect buyer requirements.&lt;/li&gt;
&lt;li&gt;Match properties based on preferences.&lt;/li&gt;
&lt;li&gt;Provide a ranked list of suitable properties.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Outcome: Accelerated property discovery.&lt;/p&gt;
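<p>The matching step of this workflow reduces to scoring listings against buyer preferences and returning a ranked list. A toy version, with made-up fields and an ad-hoc two-point score, looks like this:</p>

```python
# Score each listing by how well it matches the buyer's preferences.
listings = [
    {"id": "A", "price": 300000, "bedrooms": 3},
    {"id": "B", "price": 450000, "bedrooms": 4},
    {"id": "C", "price": 280000, "bedrooms": 2},
]
preferences = {"max_price": 350000, "min_bedrooms": 3}

def score(listing):
    points = 0
    if preferences["max_price"] >= listing["price"]:
        points += 1  # within budget
    if listing["bedrooms"] >= preferences["min_bedrooms"]:
        points += 1  # enough bedrooms
    return points

# Rank listings best-match first
ranked = sorted(listings, key=score, reverse=True)
print([listing["id"] for listing in ranked])
```

<p>A real Buyer Profiler agent would learn weights for each preference; the sorted-by-score pattern is the part that carries over.</p>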

&lt;h1&gt;
  
  
  Future of MAS and CrewAI
&lt;/h1&gt;

&lt;p&gt;The flexibility and modularity of Multi-Agent Systems (MAS) provide exciting opportunities for diverse applications across various fields. By utilizing tools like CrewAI, developers can efficiently prototype and deploy intelligent systems, fostering innovation. Looking ahead, the integration of advanced technologies such as reinforcement learning, the Internet of Things (IoT), and blockchain can significantly expand the capabilities of MAS, paving the way for more powerful and effective solutions.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Multi-Agent Systems (MAS), powered by CrewAI, deliver unmatched efficiency in task automation and problem-solving. By mastering the underlying theory and harnessing it creatively, we have the potential to revolutionize industries such as healthcare and real estate. Our compelling case studies prove that MAS is not just an abstract idea; it is an effective, practical solution to real-world challenges.&lt;/p&gt;

&lt;p&gt;CrewAI is just one powerful tool in our arsenal. In upcoming blogs, we will explore other options, including LangGraph, AutoGen, AWS's built-in Bedrock Agents, and more. Stay tuned!&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;CrewAI: &lt;a href="https://learn.crewai.com/" rel="noopener noreferrer"&gt;https://learn.crewai.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;CrewAI Github: &lt;a href="https://github.com/crewAIInc/crewAI" rel="noopener noreferrer"&gt;https://github.com/crewAIInc/crewAI&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agenticai</category>
      <category>multiagentsystems</category>
      <category>crewai</category>
      <category>genai</category>
    </item>
    <item>
      <title>Exploring the Exciting Possibilities of NVIDIA Megatron LM: A Fun and Friendly Code Walkthrough with PyTorch &amp; NVIDIA Apex!</title>
      <dc:creator>Hassan Sherwani</dc:creator>
      <pubDate>Sat, 26 Oct 2024 00:50:53 +0000</pubDate>
      <link>https://dev.to/hassan_sherwani_9dd766c43/exploring-the-exciting-possibilities-of-nvidia-megatron-lm-a-fun-and-friendly-code-walkthrough-with-pytorch-nvidia-apex-n3d</link>
      <guid>https://dev.to/hassan_sherwani_9dd766c43/exploring-the-exciting-possibilities-of-nvidia-megatron-lm-a-fun-and-friendly-code-walkthrough-with-pytorch-nvidia-apex-n3d</guid>
      <description>&lt;p&gt;In the extensive realm of GenAI, large language models (LLMs) have captured remarkable attention for their capacity to execute tasks such as text generation, translation, and even intricate reasoning. NVIDIA's Megatron LM stands out as a superior tool in this domain, specifically crafted to adeptly train massive models with billions of parameters.&lt;br&gt;
This write-up explores NVIDIA Megatron LM: its architecture configuration, its uses in various applications, and a code walkthrough for training your own Megatron LM. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsh76cik639zf7u7wlnq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsh76cik639zf7u7wlnq.jpg" alt="Image description" width="474" height="195"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  A Friendly Intro to NVIDIA Megatron LM
&lt;/h1&gt;

&lt;p&gt;NVIDIA Megatron LM is a framework designed for training large transformer models that are optimized for distributed GPU architectures. It is built to scale across hundreds or thousands of GPUs, allowing efficient handling of models with billions of parameters. This makes it ideal for advanced natural language processing (NLP) tasks.&lt;/p&gt;

&lt;p&gt;One of Megatron's core advantages is its ability to split training across GPUs and nodes, enabling faster training times and the ability to train very large models that would otherwise be computationally infeasible.&lt;/p&gt;
&lt;h1&gt;
  
  
  Key Features of Megatron LM
&lt;/h1&gt;
&lt;h3&gt;
  
  
  1. Scalable Training
&lt;/h3&gt;

&lt;p&gt;Megatron supports data, model, and pipeline parallelism, which allows for efficient training of large models.&lt;/p&gt;
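<p>The core idea behind model (tensor) parallelism can be shown with a tiny NumPy sketch: a linear layer's weight matrix is split column-wise across two "devices", each computes its own shard, and concatenating the shards gives the same result as the unsplit layer. This is a conceptual illustration, not Megatron code:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of activations
w = rng.standard_normal((8, 6))   # full weight matrix of a linear layer

# Column-parallel split: each "device" holds half the output columns.
w0, w1 = np.hsplit(w, 2)
y0 = x @ w0   # computed on device 0
y1 = x @ w1   # computed on device 1

# All-gather: concatenating the shards recovers the full output.
y = np.concatenate([y0, y1], axis=1)
assert np.allclose(y, x @ w)
print(y.shape)
```

<p>Megatron applies this pattern (plus a matching row-parallel split for the next matmul) inside every transformer layer, which is what lets a single layer's weights span multiple GPUs.</p>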
&lt;h3&gt;
  
  
  2. Mixed-Precision Training
&lt;/h3&gt;

&lt;p&gt;Megatron uses automatic mixed precision (via NVIDIA's Apex AMP) to improve training performance, reducing memory usage and accelerating computation.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Optimized for GPUs
&lt;/h3&gt;

&lt;p&gt;Leveraging NVIDIA’s latest GPUs (such as A100 or V100), Megatron is tuned for maximum performance.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Transformer-based Architecture
&lt;/h3&gt;

&lt;p&gt;Like many modern language models (e.g., GPT-3), Megatron is built on the transformer architecture, which has revolutionized natural language processing.&lt;/p&gt;
&lt;h1&gt;
  
  
  Getting Started with NVIDIA Megatron LM
&lt;/h1&gt;

&lt;p&gt;Now that you have a high-level understanding of Megatron LM, let's explore how to use it in practice.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Step 1: Setting Up the Environment&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;To train Megatron models, you will need access to a system with multiple GPUs. The recommended setup is a machine with one or more NVIDIA GPUs with at least 16 GB of GPU memory. You can use cloud providers such as AWS, Azure, or Google Cloud to set up instances with NVIDIA GPUs.&lt;/p&gt;

&lt;p&gt;First, let's install the necessary libraries, which include &lt;code&gt;PyTorch&lt;/code&gt; and NVIDIA's &lt;code&gt;Apex&lt;/code&gt; library for mixed-precision training.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install necessary dependencies
sudo apt update
sudo apt install python3-pip

# Install PyTorch with GPU support
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

# Clone Megatron LM repository
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM

# Install Megatron LM dependencies
pip3 install -r requirements.txt

# Install NVIDIA Apex for mixed-precision training
git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --disable-pip-version-check --no-cache-dir ./
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Step 2: Preprocessing the Data&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Megatron requires tokenized input data in a specific format; datasets can be preprocessed using the provided tokenization scripts. In this example, we'll use a readily available dataset: English Wikipedia.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Download English Wikipedia data
wget https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
bzip2 -d enwiki-latest-pages-articles.xml.bz2

# Run preprocessing
# (note: preprocess_data.py expects loose-JSON input, so in practice the
#  extracted text must first be converted to JSON, e.g. with WikiExtractor)
python tools/preprocess_data.py \
  --input enwiki-latest-pages-articles.xml \
  --output-prefix my-wikipedia-data \
  --vocab-file gpt2-vocab.json \
  --merge-file gpt2-merges.txt \
  --dataset-impl mmap \
  --tokenizer-type GPT2BPETokenizer \
  --workers 4

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This command tokenizes the dataset and converts it into the binary format that Megatron LM expects for training.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Step 3: Configuring the Model&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Megatron LM provides a highly customizable setup. For example, you can adjust the number of transformer layers, hidden size, attention heads, and other parameters. Let's set up a small transformer model for demonstration purposes; as in traditional machine learning workflows, all key hyperparameters are passed explicitly through a configuration, which keeps experiments reproducible.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Configuration of an LLM model
python pretrain_gpt.py \
    --num-layers 12 \
    --hidden-size 768 \
    --num-attention-heads 12 \
    --micro-batch-size 4 \
    --global-batch-size 16 \
    --seq-length 1024 \
    --max-position-embeddings 1024 \
    --train-iters 10000 \
    --lr 0.0001 \
    --min-lr 1e-5 \
    --lr-decay-style cosine \
    --lr-decay-iters 320000 \
    --lr-warmup-fraction 0.01 \
    --adam-beta1 0.9 \
    --adam-beta2 0.95 \
    --adam-eps 1e-08 \
    --weight-decay 1e-2 \
    --clip-grad 1.0 \
    --tokenizer-type GPT2BPETokenizer \
    --vocab-file gpt2-vocab.json \
    --merge-file gpt2-merges.txt \
    --data-path ./my-wikipedia-data \
    --save ./checkpoints \
    --save-interval 1000 \
    --log-interval 100 \
    --fp16 \
    --tensor-model-parallel-size 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;In this configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;num-layers&lt;/em&gt; defines the number of transformer layers.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;hidden-size&lt;/em&gt; sets the size of the hidden layers in each transformer block.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;global-batch-size&lt;/em&gt; specifies the overall batch size across all GPUs.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;lr&lt;/em&gt; and &lt;em&gt;lr-decay-style&lt;/em&gt; define the learning rate and its decay over time.&lt;/li&gt;
&lt;li&gt;The model will checkpoint every 1,000 iterations, allowing you to resume training from the last checkpoint.&lt;/li&gt;
&lt;/ul&gt;
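<p>As a sanity check on such a configuration, you can estimate the parameter count from these flags alone: a standard GPT-style layer holds roughly 4h^2 attention weights plus 8h^2 MLP weights, and the token embeddings add vocab_size x h. This back-of-the-envelope count ignores biases, layer norms, and position embeddings:</p>

```python
# Back-of-the-envelope parameter count for the 12-layer config above.
num_layers = 12
hidden_size = 768
vocab_size = 50257  # GPT-2 BPE vocabulary

per_layer = 4 * hidden_size**2 + 8 * hidden_size**2  # attention + MLP weights
embeddings = vocab_size * hidden_size

total = num_layers * per_layer + embeddings
print(f"~{total / 1e6:.0f}M parameters")
```

<p>At 12 layers and hidden size 768 this comes to roughly 124M parameters, about the size of GPT-2 small, which is a reasonable starting point for a demonstration run.</p>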
&lt;h3&gt;
  
  
  &lt;em&gt;Step 4: Launching the Training Process&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Once the model is set up, you can start training by executing the pretraining script, which is capable of handling both single-node and multi-node GPU setups.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python pretrain_gpt.py \
    --tensor-model-parallel-size 4 \
    --num-layers 24 \
    --hidden-size 1024 \
    --num-attention-heads 16 \
    --micro-batch-size 4 \
    --global-batch-size 32 \
    --seq-length 1024 \
    --train-iters 20000 \
    --lr 0.0001 \
    --data-path ./my-wikipedia-data \
    --save ./checkpoints \
    --fp16

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This setup distributes training across 4 GPUs using tensor model parallelism. The training process may take days or weeks, depending on the model size and GPU power. For more compute, you can move to larger GPUs or add parallel processing with RAPIDS (see my earlier blog):&lt;/p&gt;
&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
      &lt;div class="c-embed__cover"&gt;
        &lt;a href="https://medium.com/@hassan.khsherwani/nvidia-integration-with-databricks-parallel-processing-for-efficient-ml-solutions-013dc1aab0a5" class="c-link s:max-w-50 align-middle" rel="noopener noreferrer"&gt;
          &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afill%3A144%3A144%2F1%2AfjzEmcFKtZQ2GJtverLyJA.jpeg" height="144" class="m-0" width="144"&gt;
        &lt;/a&gt;
      &lt;/div&gt;
    &lt;div class="c-embed__body"&gt;
      &lt;h2 class="fs-xl lh-tight"&gt;
        &lt;a href="https://medium.com/@hassan.khsherwani/nvidia-integration-with-databricks-parallel-processing-for-efficient-ml-solutions-013dc1aab0a5" rel="noopener noreferrer" class="c-link"&gt;
          Nvidia Integration with Databricks: Parallel processing for efficient ML solutions | by Hassan Sherwani | Oct, 2024 | Medium
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;p class="truncate-at-3"&gt;
          In the ever-evolving landscape of artificial intelligence(AI) and data science, speed and scalability are key. As models grow larger and…
        &lt;/p&gt;
      &lt;div class="color-secondary fs-s flex items-center"&gt;
          &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2F5d8de952517e8160e40ef9841c781cdc14a5db313057fa3c3de41c6f5b494b19" width="32" height="32"&gt;
        medium.com
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
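&lt;p&gt;It also helps to see how the batch-size flags above fit together. In Megatron's scheme (ignoring pipeline parallelism), the global batch size equals the micro batch size times the data-parallel size times the number of gradient-accumulation steps, where the data-parallel size is the GPU count divided by the tensor-parallel size. With all 4 GPUs used for tensor parallelism, data parallelism is 1, so the settings above imply 8 accumulation steps. A sketch of that arithmetic (not Megatron's code):&lt;/p&gt;

```python
def grad_accum_steps(global_batch, micro_batch, num_gpus, tensor_parallel):
    """Back out gradient-accumulation steps from Megatron-style batch flags.
    data-parallel size = total GPUs / tensor-parallel size (pipeline
    parallelism ignored in this sketch)."""
    data_parallel = num_gpus // tensor_parallel
    assert global_batch % (micro_batch * data_parallel) == 0, \
        "global batch must be divisible by micro batch * data-parallel size"
    return global_batch // (micro_batch * data_parallel)

# Settings from the pretraining command above: 4 GPUs, TP=4 -> DP=1
print(grad_accum_steps(32, 4, 4, 4))  # 8 accumulation steps
# Doubling the GPUs to 8 with the same TP=4 gives DP=2, halving the steps
print(grad_accum_steps(32, 4, 8, 4))  # 4 accumulation steps
```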



&lt;h3&gt;
  
  
  &lt;em&gt;Step 5: Fine-Tuning the Model&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;After pretraining, you might want to fine-tune the model for specific tasks such as text classification or question answering. Fine-tuning involves loading the pre-trained weights and further training on a smaller, task-specific dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python tools/finetune_gpt.py \
    --pretrained-checkpoint ./checkpoints \
    --task TASK_NAME \
    --data-path ./task-specific-data \
    --num-layers 24 \
    --hidden-size 1024 \
    --num-attention-heads 16 \
    --seq-length 1024 \
    --train-iters 5000 \
    --lr 0.00001 \
    --global-batch-size 16 \
    --fp16

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace TASK_NAME with the name of your target task (e.g., text generation, classification, or a Q&amp;amp;A chatbot), and point &lt;em&gt;data-path&lt;/em&gt; at the corresponding task-specific dataset.&lt;/p&gt;
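&lt;p&gt;Conceptually, fine-tuning just restores the pretrained weights and keeps training at a much smaller learning rate on task data. Here is a toy PyTorch sketch of that loop, with a stand-in linear model rather than Megatron's finetune script; the checkpoint path, shapes, and data are all illustrative:&lt;/p&gt;

```python
import torch
import torch.nn as nn

# Toy sketch of fine-tuning: restore pretrained weights, then continue
# training on task data at a much smaller learning rate (mirroring
# --pretrained-checkpoint and --lr 0.00001 above). Not Megatron's code.
model = nn.Linear(16, 4)                      # stand-in for the pretrained LM
torch.save(model.state_dict(), "ckpt.pt")     # pretend this is the pretrain checkpoint

model.load_state_dict(torch.load("ckpt.pt"))  # load pretrained weights
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # lower LR than pretraining
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 16)                        # stand-in task-specific batch
y = torch.randint(0, 4, (8,))
for _ in range(10):                           # a few fine-tuning steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())
```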

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;NVIDIA Megatron-LM is a powerful tool for training massive language models, offering unparalleled scalability and performance. By following the steps outlined in this blog, you can start building and training your own large language models, fine-tuning them for specific NLP tasks, and leveraging cutting-edge advances in the AI field.&lt;/p&gt;

&lt;p&gt;With frameworks like Megatron-LM, we are entering an era where language models power truly transformative applications, from real-time translation to generating human-like conversational responses. Whether you are a researcher or a developer, experimenting with Megatron can open new possibilities in AI-driven innovation.&lt;/p&gt;

&lt;p&gt;Stay tuned for more!&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Nvidia Debuts Enterprise-Focused 530B Megatron Large Language Model and Framework at Fall GTC21: &lt;a href="https://www.aiwire.net/2021/11/09/nvidia-debuts-enterprise-focused-530b-megatron-large-language-model-and-framework-at-fall-gtc21/" rel="noopener noreferrer"&gt;https://www.aiwire.net/2021/11/09/nvidia-debuts-enterprise-focused-530b-megatron-large-language-model-and-framework-at-fall-gtc21/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Nvidia Megatron core: &lt;a href="https://docs.nvidia.com/megatron-core/index.html" rel="noopener noreferrer"&gt;https://docs.nvidia.com/megatron-core/index.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism: &lt;a href="https://arxiv.org/abs/1909.08053" rel="noopener noreferrer"&gt;https://arxiv.org/abs/1909.08053&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hugging Face documentation: &lt;a href="https://huggingface.co/docs/accelerate/en/usage_guides/megatron_lm" rel="noopener noreferrer"&gt;https://huggingface.co/docs/accelerate/en/usage_guides/megatron_lm&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Apex Documentation: &lt;a href="https://github.com/NVIDIA/apex" rel="noopener noreferrer"&gt;https://github.com/NVIDIA/apex&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>nvidia</category>
      <category>megatronlm</category>
      <category>llm</category>
      <category>genai</category>
    </item>
  </channel>
</rss>
