<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Benito Martin</title>
    <description>The latest articles on DEV Community by Benito Martin (@benitomartin).</description>
    <link>https://dev.to/benitomartin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2936204%2F6de59e30-f07e-4c46-bd0c-1ea12b96f675.jpg</url>
      <title>DEV Community: Benito Martin</title>
      <link>https://dev.to/benitomartin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/benitomartin"/>
    <language>en</language>
    <item>
      <title>Automate Vector Database Update with AWS and CircleCI</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Tue, 15 Jul 2025 09:07:55 +0000</pubDate>
      <link>https://dev.to/benitomartin/automate-vector-database-update-with-aws-and-circleci-27dd</link>
      <guid>https://dev.to/benitomartin/automate-vector-database-update-with-aws-and-circleci-27dd</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Managing vector databases efficiently is crucial for applications that rely on similarity search, AI-powered recommendations, and large-scale data retrieval. As data sources grow, keeping vector databases updated with fresh embeddings becomes a challenge. Manually updating embeddings for new documents is inefficient and error-prone, making automation essential.&lt;/p&gt;

&lt;p&gt;In this guide, you will explore how to build a fully automated pipeline for processing and updating a vector database using AWS Lambda and CircleCI. The solution involves extracting text from PDFs, generating embeddings with OpenAI, and storing them in &lt;a href="https://cloud.zilliz.com/signup?redirect=/orgs" rel="noopener noreferrer"&gt;Zilliz Cloud&lt;/a&gt;, a managed vector database. You will also set up &lt;a href="https://aws.amazon.com/" rel="noopener noreferrer"&gt;AWS infrastructure&lt;/a&gt; (S3, ECR, and Lambda) and implement a CI/CD pipeline with &lt;a href="https://circleci.com/" rel="noopener noreferrer"&gt;CircleCI&lt;/a&gt; to automate deployment and updates.&lt;/p&gt;

&lt;p&gt;What You Will Learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;How to manage vector databases and automate embeddings creation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Building an &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt; function to process and update embeddings&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using Docker to containerize the AWS Lambda function for efficient execution&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Setting up CircleCI to automate testing and deployment&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Implementing best practices for AWS IAM roles and security&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end of this tutorial, you will have a fully automated workflow to process and update vector embeddings seamlessly.&lt;/p&gt;

&lt;p&gt;This tutorial assumes some familiarity with Python, AWS, and Docker. You can check out the complete &lt;a href="https://github.com/benitomartin/embeddings-aws-circleci" rel="noopener noreferrer"&gt;source code&lt;/a&gt; on GitHub, but this guide will walk you through the process step by step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, ensure that you have the following requirements in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS Account&lt;/strong&gt;: &lt;a href="https://signin.aws.amazon.com/signup?request_type=register" rel="noopener noreferrer"&gt;Sign up&lt;/a&gt; for an AWS account if you do not already have one. You will use AWS Lambda and &lt;a href="https://aws.amazon.com/ecr/" rel="noopener noreferrer"&gt;Elastic Container Registry (ECR)&lt;/a&gt; for deployment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS CLI Installed and Configured&lt;/strong&gt;: Install the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html" rel="noopener noreferrer"&gt;AWS Command Line Interface (CLI)&lt;/a&gt; and configure it with your AWS credentials. You can follow the AWS CLI setup guide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Basic Knowledge of LangChain or Vector Databases&lt;/strong&gt;:Understanding the fundamentals of LangChain and Vector Databases will help you design the architecture of the pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Familiarity with AWS Lambda and Docker&lt;/strong&gt;: You should know the basics of AWS Lambda and Docker, as you will use them to package and deploy the application.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitHub and CircleCI Accounts&lt;/strong&gt;: Create accounts on GitHub and CircleCI to manage the version control and automate the &lt;a href="https://circleci.com/ci-cd/" rel="noopener noreferrer"&gt;CI/CD pipeline&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenAI API Key&lt;/strong&gt;: To access OpenAI’s GPT models, you will need an API key. You can sign up for an API key on the &lt;a href="https://platform.openai.com/signup" rel="noopener noreferrer"&gt;OpenAI website&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Zilliz Cloud Account&lt;/strong&gt;: Sign up for a &lt;a href="https://cloud.zilliz.com/signup?redirect=/orgs" rel="noopener noreferrer"&gt;Zilliz Cloud&lt;/a&gt; account to host your vector database and get a free cluster that provides the URI endpoint and Token to interact with it.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you have these prerequisites in place, you will be ready to set up the automated pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up the Project Structure
&lt;/h2&gt;

&lt;p&gt;Before diving into implementation, you need to structure your project efficiently. A well-organized project makes development, testing, and deployment smoother, especially when dealing with cloud services and CI/CD automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Project Organization and Key Components
&lt;/h3&gt;

&lt;p&gt;Your project will include the following key components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── .circleci/
│   └── config.yml
├── data/
│   └── 1706.03762v7.pdf
├── src/
│   ├── create_collection.py
│   ├── drop_collection.py
│   ├── insert_documents.py
│   └── __init__.py
├── aws_lambda/
|   ├── __init__.py
│   └── lambda_function.py
├── scripts/
│   ├── build_deploy.sh
│   ├── create_roles.sh
│   ├── create_image.sh
│   └── create_lambda.sh
├── tests/
│   ├── test_collection_exists.py
│   ├── test_lambda_function.py
│   └── test_collection_mock.py
├── Dockerfile
└── pyproject.toml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Installing Dependencies with UV Package Manager
&lt;/h3&gt;

&lt;p&gt;First, clone the repository containing the project code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/benitomartin/embeddings-aws-circleci
&lt;span class="nb"&gt;cd &lt;/span&gt;embeddings-aws-circleci
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Note: The repository you just cloned already contains all the necessary code snippets referenced throughout this tutorial. There's no need to recreate files from scratch. Simply verify that the contents match as you follow along. Feel free to adapt the structure and implementation to suit your own project requirements.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Next, install the dependencies using the UV Package Manager. If you do not have it installed, you can follow the &lt;a href="https://docs.astral.sh/uv/getting-started/installation/" rel="noopener noreferrer"&gt;installation guide&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--all-extras&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These commands will install all the necessary dependencies for the project that are listed in the &lt;code&gt;pyproject.toml&lt;/code&gt; file and activate the virtual environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Environment Configuration
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file in the root directory of your project and add the following environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-zilliz-uri
&lt;span class="nv"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-zilliz-token
&lt;span class="nv"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-collection-name
&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-bucket-name
&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-openai-key
&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-aws-region
&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-access-key
&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-secret-key
&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-account-id
&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-ecr-repo-name
&lt;span class="nv"&gt;LAMBDA_IMAGE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-image-name
&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-lambda-name
&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-role-name
&lt;span class="nv"&gt;ROLE_POLICY_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-policy-name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace the placeholders with your actual values.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating the Vector Database Infrastructure
&lt;/h2&gt;

&lt;p&gt;To efficiently store and retrieve embeddings, you need to set up a vector database. This section will guide you through configuring Zilliz Cloud (Milvus), defining a schema, and optimizing the database for fast vector searches.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up Zilliz Cloud Collection
&lt;/h3&gt;

&lt;p&gt;Zilliz Cloud is a managed version of Milvus, a high-performance vector database. You will create a collection to store extracted text and corresponding vector embeddings.&lt;/p&gt;

&lt;p&gt;In order to create a collection, you need to follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Sign up and create a free Cluster in Zilliz Cloud.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Get the connection details:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;URI&lt;/strong&gt;: Found in the cluster settings (public endpoint).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Token&lt;/strong&gt;: Required for authentication.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Set environment variables in your &lt;code&gt;.env&lt;/code&gt; file and provide a collection name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-zilliz-uri
&lt;span class="nv"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-zilliz-token
&lt;span class="nv"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-collection-name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga1zq0ki32z7plrjkzk0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga1zq0ki32z7plrjkzk0.png" alt="Zilliz cluster" width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Collection
&lt;/h3&gt;

&lt;p&gt;Once you have the connection details, you can create a collection in Zilliz Cloud. The collection will store the extracted text and corresponding vector embeddings.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;src&lt;/code&gt; folder, you can create a &lt;code&gt;create_collection.py&lt;/code&gt; script, with several functions to define the schema and create the collection:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;create_schema&lt;/code&gt;: Defines the schema (&lt;code&gt;create_schema&lt;/code&gt;), which includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;id&lt;/code&gt;: Auto-generated primary key (&lt;code&gt;INT64&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pdf_text&lt;/code&gt;: Extracted text stored as a &lt;code&gt;VARCHAR&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;my_vector&lt;/code&gt;: Vector embeddings stored as &lt;code&gt;FLOAT_VECTOR&lt;/code&gt; (default dimension: &lt;code&gt;1536&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;code&gt;create_collection&lt;/code&gt;Creates the collection in Zilliz Cloud, with the defined schema. It optimizes vector search by setting up an AUTOINDEX with COSINE similarity, ensuring efficient retrieval.&lt;br&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pymilvus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Define the schema for the Milvus collection.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;auto_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;enable_dynamic_field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datatype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DataType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INT64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;is_primary&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datatype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DataType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;65535&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datatype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DataType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FLOAT_VECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a new Milvus collection with the specified parameters.

    Args:
        collection_name (str, optional): Name of the collection. Defaults to env var COLLECTION_NAME.
        uri (str, optional): Zilliz Cloud URI. Defaults to env var ZILLIZ_CLOUD_URI.
        token (str, optional): Zilliz token. Defaults to env var ZILLIZ_TOKEN.
        dimension (int, optional): Vector dimension. Defaults to 1536.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Use environment variables as fallback
&lt;/span&gt;    &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing required parameters: collection_name, uri, or token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Connect to Zilliz Cloud (Milvus)
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Create schema
&lt;/span&gt;    &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Prepare index parameters
&lt;/span&gt;    &lt;span class="n"&gt;index_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prepare_index_params&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;index_params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AUTOINDEX&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COSINE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Create collection
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;index_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Create collection
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Creating collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Collection created successfully.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once your Zilliz Cloud cluster is ready and &lt;code&gt;.env&lt;/code&gt; is configured, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run src/create_collection.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a collection in your Zilliz Cloud cluster. In case you need to delete the collection, you can create a &lt;code&gt;drop_collection.py&lt;/code&gt; script in the &lt;code&gt;src&lt;/code&gt; folder to drop the collection and recreate it again with the previous script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pymilvus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;drop_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Drop a Milvus collection.

    Args:
        collection_name (str, optional): Name of the collection. Defaults to env var COLLECTION_NAME.
        uri (str, optional): Zilliz Cloud URI. Defaults to env var ZILLIZ_CLOUD_URI.
        token (str, optional): Zilliz token. Defaults to env var ZILLIZ_TOKEN.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Use environment variables as fallback
&lt;/span&gt;    &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing required parameters: collection_name, uri, or token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Connect to Zilliz Cloud (Milvus)
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Drop the collection
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Drop collection
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dropping collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;drop_collection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Collection dropped successfully.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To drop the collection, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run src/drop_collection.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Implementing the PDF Processing Pipeline
&lt;/h2&gt;

&lt;p&gt;To store and search text efficiently, you need to process PDFs, extract the text, convert it into embeddings, and store them in Zilliz Cloud for fast retrieval.&lt;/p&gt;

&lt;p&gt;Make sure to set the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; environment variable in your &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Then, create a &lt;code&gt;insert_documents.py&lt;/code&gt; script in the &lt;code&gt;src&lt;/code&gt; folder. This script will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Load the text from PDFs using &lt;code&gt;PyPDFLoader&lt;/code&gt; from LangChain to get a Document object&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Split the text into manageable chunks to ensure accurate embeddings using &lt;code&gt;CharacterTextSplitter&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Generate vector embeddings using OpenAI&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Store the text and embeddings in Zilliz Cloud using &lt;code&gt;MilvusClient&lt;/code&gt; for efficient similarity searches.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PyPDFLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pymilvus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Process a PDF file and generate embeddings for its content.

    Args:
        pdf_path (str): Path to the PDF file.
        chunk_size (int, optional): Size of text chunks. Defaults to 512.
        chunk_overlap (int, optional): Overlap between chunks. Defaults to 100.

    Returns:
        List[dict]: List of dictionaries containing text and embeddings.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDF file not found at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Load and process PDF
&lt;/span&gt;    &lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PyPDFLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Split text
&lt;/span&gt;    &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate embeddings
&lt;/span&gt;    &lt;span class="n"&gt;openai_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Prepare data for insertion
&lt;/span&gt;    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;
        &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_documents&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;insert_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Insert documents from a PDF file into a Milvus collection.

    Args:
        pdf_path (str): Path to the PDF file.
        collection_name (str, optional): Name of the collection. Defaults to env var COLLECTION_NAME.
        uri (str, optional): Zilliz Cloud URI. Defaults to env var ZILLIZ_CLOUD_URI.
        token (str, optional): Zilliz token. Defaults to env var ZILLIZ_TOKEN.
        chunk_size (int, optional): Size of text chunks. Defaults to 512.
        chunk_overlap (int, optional): Overlap between chunks. Defaults to 100.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Use environment variables as fallback
&lt;/span&gt;    &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing required parameters: collection_name, uri, or token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Connect to Zilliz Cloud (Milvus)
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Process PDF and get data
&lt;/span&gt;    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;process_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Insert data
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Verify collection load state
&lt;/span&gt;    &lt;span class="n"&gt;load_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_load_state&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Collection load state: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;load_state&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Insert documents
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Inserting documents...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;insert_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/1706.03762v7.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Documents inserted successfully.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the script, use the following command. You can find a sample PDF file in the &lt;code&gt;data&lt;/code&gt; folder but feel free to use your own.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run src/insert_documents.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script will process the PDF, generate embeddings, and store them in your Zilliz Cloud cluster collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzskssf09tmbz92ziffx2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzskssf09tmbz92ziffx2.png" alt="Zilliz embeddings collection" width="800" height="354"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating IAM Roles and Policies
&lt;/h2&gt;

&lt;p&gt;Now that you have a working pipeline, you need to set up AWS Lambda to trigger the pipeline when a new PDF is uploaded to an S3 bucket.&lt;/p&gt;

&lt;p&gt;To deploy AWS Lambda functions, you need first to create specific &lt;a href="https://aws.amazon.com/iam/" rel="noopener noreferrer"&gt;IAM roles&lt;/a&gt; and permissions. You can create the following &lt;code&gt;create_roles.sh&lt;/code&gt; script under the &lt;code&gt;scripts&lt;/code&gt; folder. This script automates the process of creating an IAM role with the necessary policy &lt;code&gt;AWSLambdaExecute&lt;/code&gt; for AWS Lambda to execute the function and access S3.&lt;/p&gt;

&lt;p&gt;Before running the script,make sure to set the &lt;code&gt;ROLE_NAME&lt;/code&gt; and &lt;code&gt;AWS_REGION&lt;/code&gt; environment variables in your &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;AWS Lambda will assume this role when executing the function, which allows it to access the S3 bucket, as defined in the &lt;code&gt;AWSLambdaExecute&lt;/code&gt; policy. It will also have access to CloudWatch Logs for logging purposes, which will help you monitor and debug the function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Exit immediately if a command exits with a non-zero status&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="c"&gt;# Load environment variables from .env file&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; allexport
&lt;span class="nb"&gt;source&lt;/span&gt; .env
&lt;span class="nb"&gt;set&lt;/span&gt; +o allexport

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Environment variables loaded."&lt;/span&gt;


&lt;span class="c"&gt;# Create a new IAM role with Lambda and S3 full access&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Checking IAM role..."&lt;/span&gt;

&lt;span class="c"&gt;# Check if the role exists&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; aws iam get-role &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Creating new IAM role for Lambda with S3 access..."&lt;/span&gt;

    &lt;span class="c"&gt;# Fix: Remove space after = and use proper JSON formatting&lt;/span&gt;
    &lt;span class="nv"&gt;ASSUME_ROLE_POLICY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }]
    }'&lt;/span&gt;

    &lt;span class="c"&gt;# Create the IAM role&lt;/span&gt;
    aws iam create-role &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--assume-role-policy-document&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ASSUME_ROLE_POLICY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;


    &lt;span class="c"&gt;# Add Lambda execution policy. Provides Put, Get access to S3 and full access to CloudWatch Logs.&lt;/span&gt;
    aws iam attach-role-policy &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::aws:policy/AWSLambdaExecute  &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"IAM role created and policy attached."&lt;/span&gt;

    &lt;span class="c"&gt;# Wait for role to propagate&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Waiting for role to propagate..."&lt;/span&gt;
    &lt;span class="nb"&gt;sleep &lt;/span&gt;20

&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"IAM role &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; already exists. Skipping role creation."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To execute the script, use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run scripts/create_roles.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building the AWS Lambda Function
&lt;/h2&gt;

&lt;p&gt;The AWS Lambda function is the core component that automates the entire process of handling PDF uploads, generating embeddings, and storing them in Zilliz Cloud. The function is triggered by an S3 event, processes the uploaded PDF, and stores the resulting data in your Milvus collection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda Handler Implementation
&lt;/h3&gt;

&lt;p&gt;Now you can create the &lt;code&gt;lambda_function.py&lt;/code&gt; file below and save it in the &lt;code&gt;aws_lambda&lt;/code&gt; folder. This file contains the implementation of the AWS Lambda function. In this case, the AWS Lambda function is triggered by an S3 event whenever a new PDF file is uploaded in an S3 bucket. It processes the event, extracts the file, generates embeddings, and inserts the data into the Zilliz Cloud collection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PyPDFLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pymilvus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;

&lt;span class="c1"&gt;# Global variables for reuse across invocations
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;openai_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;init_clients&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Initialize global clients if not already initialized&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;openai_embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Initializing Milvus client...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;openai_embeddings&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Initializing OpenAI embeddings...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;openai_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Initializing text splitter...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received event: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Initialize clients
&lt;/span&gt;        &lt;span class="nf"&gt;init_clients&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Validate event structure
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No records found in event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No records found in event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;

        &lt;span class="c1"&gt;# Get bucket and file info from S3 event
&lt;/span&gt;        &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing file &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; from bucket &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Verify bucket
&lt;/span&gt;        &lt;span class="n"&gt;expected_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;expected_bucket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid bucket. Expected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expected_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, got &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid bucket. Expected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expected_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, got &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;# Download PDF
&lt;/span&gt;        &lt;span class="n"&gt;local_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/tmp/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Downloading file to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Process PDF
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Loading and splitting PDF...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PyPDFLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Split PDF into &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; chunks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Prepare and insert data
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generating embeddings and preparing data...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;openai_embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_documents&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Inserting &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; records into collection &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Cleanup
&lt;/span&gt;        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing completed successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing document: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;traceback&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Traceback: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;traceback&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_exc&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Main Features of the Lambda Function:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;S3 Event Processing&lt;/strong&gt;: The AWS Lambda function is triggered by an S3 event when a new PDF is uploaded to the designated bucket.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Client Initialization&lt;/strong&gt;: The function initializes the Milvus client for storing embeddings, the OpenAI embeddings client, and the text splitter for chunking the PDF text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Text Processing&lt;/strong&gt;: The PDF text is extracted using PyPDFLoader, then split into smaller chunks to ensure proper embedding generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generating and Storing Embeddings&lt;/strong&gt;: The OpenAI embeddings are generated for each chunk of text, and the resulting data is inserted into the specified Milvus collection in Zilliz Cloud.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Error Handling&lt;/strong&gt;: The function includes error handling to catch and log any exceptions that occur during the processing of the PDF.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AWS Lambda Containerization with Docker
&lt;/h3&gt;

&lt;p&gt;Once the AWS Lambda function is ready, it needs to be containerized using Docker. As AWS Lambda works better with &lt;code&gt;requirements.txt&lt;/code&gt; instead of &lt;code&gt;pyproject.toml&lt;/code&gt;, you need to create a &lt;code&gt;requirements.txt&lt;/code&gt; file from your &lt;code&gt;pyproject.toml&lt;/code&gt; file in the root directory of your project with the following dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;langchain-community
langchain_milvus
boto3
langchain-openai
pypdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Dockerfile below sets up the environment for the AWS Lambda function, including the necessary dependencies and the function code. You can save this file in the root directory of your project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/python:3.12.2025.04.01.18&lt;/span&gt;

&lt;span class="c"&gt;# Set the working directory to /var/task&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${LAMBDA_TASK_ROOT}&lt;/span&gt;

&lt;span class="c"&gt;# Copy requirements first to leverage Docker cache&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt ./&lt;/span&gt;

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Copy source code&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; aws_lambda/lambda_function.py ./lambda_function.py&lt;/span&gt;

&lt;span class="c"&gt;# Command to run the Lambda handler function&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; [ "lambda_function.lambda_handler" ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly to the creation of the IAM Role, the creation of the ECR repository and the Docker image can be automated using a shell script. Make sure the coresponding environment variables are set in the &lt;code&gt;.env&lt;/code&gt; file. Save the script below in the &lt;code&gt;scripts&lt;/code&gt; folder.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Exit immediately if a command exits with a non-zero status&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="c"&gt;# Load environment variables from .env file&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; allexport
&lt;span class="nb"&gt;source&lt;/span&gt; .env
&lt;span class="nb"&gt;set&lt;/span&gt; +o allexport

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Environment variables loaded."&lt;/span&gt;

&lt;span class="c"&gt;# Check if the ECR repository exists, create it if it does not&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; aws ecr describe-repositories &lt;span class="nt"&gt;--repository-names&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Repository &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; does not exist. Creating..."&lt;/span&gt;
    aws ecr create-repository &lt;span class="nt"&gt;--repository-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Repository &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; created."&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Repository &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; already exists."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Build Docker image&lt;/span&gt;
&lt;span class="c"&gt;# To make your image compatible with Lambda, you must use the --provenance=false option.&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Building Docker image &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_IMAGE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
docker buildx build &lt;span class="nt"&gt;--platform&lt;/span&gt; linux/amd64 &lt;span class="nt"&gt;--provenance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_IMAGE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Authenticate Docker to your Amazon ECR registry&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Authenticating Docker to ECR..."&lt;/span&gt;
aws ecr get-login-password &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; | docker login &lt;span class="nt"&gt;--username&lt;/span&gt; AWS &lt;span class="nt"&gt;--password-stdin&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.dkr.ecr.&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.amazonaws.com

&lt;span class="c"&gt;# Tag the Docker image&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Tagging Docker image..."&lt;/span&gt;
docker tag &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_IMAGE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.dkr.ecr.&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.amazonaws.com/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest

&lt;span class="c"&gt;# Push the Docker image to Amazon ECR&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Pushing Docker image to ECR..."&lt;/span&gt;
docker push &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.dkr.ecr.&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.amazonaws.com/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Docker image pushed to ECR."&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Image created successfully."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can run the script with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run scripts/create_image.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pushing the AWS Lambda Function
&lt;/h3&gt;

&lt;p&gt;Once the Docker image is built and pushed to ECR, you can create the AWS Lambda function. As this function is triggered by an S3 event, you need to create an S3 bucket first to store your PDFs. This can be done through the AWS Management Console or the AWS CLI with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-bucket &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--bucket&lt;/span&gt; embeddings-&lt;span class="si"&gt;$(&lt;/span&gt;uuidgen | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; - | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'[:upper:]'&lt;/span&gt; &lt;span class="s1"&gt;'[:lower:]'&lt;/span&gt; &lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--region&lt;/span&gt; eu-central-1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--create-bucket-configuration&lt;/span&gt; &lt;span class="nv"&gt;LocationConstraint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;eu-central-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a new S3 bucket with a unique name as required by AWS. Make sure to update the &lt;code&gt;PDF_BUCKET_NAME&lt;/code&gt; environment variable in the &lt;code&gt;.env&lt;/code&gt; file with the name of the bucket you just created.&lt;br&gt;
If your default region is us-east-1, do not include the --create-bucket-configuration flag. Instead, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-bucket &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bucket&lt;/span&gt; embeddings-&lt;span class="si"&gt;$(&lt;/span&gt;uuidgen | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; - | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'[:upper:]'&lt;/span&gt; &lt;span class="s1"&gt;'[:lower:]'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that the S3 bucket is created, you can create the AWS Lambda function using the following script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Exit immediately if a command exits with a non-zero status&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="c"&gt;# Load environment variables from .env file&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; allexport
&lt;span class="nb"&gt;source&lt;/span&gt; .env
&lt;span class="nb"&gt;set&lt;/span&gt; +o allexport

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Environment variables loaded."&lt;/span&gt;

&lt;span class="c"&gt;# Check if the Lambda function exists&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; aws lambda get-function &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Lambda function &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; does not exist. Creating..."&lt;/span&gt;
    aws lambda create-function &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--package-type&lt;/span&gt; Image &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--code&lt;/span&gt; &lt;span class="nv"&gt;ImageUri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.dkr.ecr.&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.amazonaws.com/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--role&lt;/span&gt; arn:aws:iam::&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:role/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--timeout&lt;/span&gt; 900 &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--memory-size&lt;/span&gt; 3072 &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--environment&lt;/span&gt; &lt;span class="s2"&gt;"Variables={
            PDF_BUCKET_NAME=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            OPENAI_API_KEY=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            ZILLIZ_CLOUD_URI=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            ZILLIZ_TOKEN=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            COLLECTION_NAME=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
        }"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;

&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Lambda function &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; already exists. Updating..."&lt;/span&gt;
    aws lambda update-function-code &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--image-uri&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.dkr.ecr.&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.amazonaws.com/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_ECR_REPOSITORY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:latest

    &lt;span class="c"&gt;# Wait for role to propagate&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Waiting lambda function to update code..."&lt;/span&gt;
    &lt;span class="nb"&gt;sleep &lt;/span&gt;20

    aws lambda update-function-configuration &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--timeout&lt;/span&gt; 900 &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--memory-size&lt;/span&gt; 3072 &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--environment&lt;/span&gt; &lt;span class="s2"&gt;"Variables={
            PDF_BUCKET_NAME=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            OPENAI_API_KEY=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            ZILLIZ_CLOUD_URI=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            ZILLIZ_TOKEN=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;,
            COLLECTION_NAME=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
        }"&lt;/span&gt;


&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Check and add S3 trigger to Lambda if it doesn't exist&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; aws lambda get-policy &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s2"&gt;"S3InvokeFunction"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Adding S3 trigger permission to Lambda..."&lt;/span&gt;
    aws lambda add-permission &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--function-name&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--statement-id&lt;/span&gt; S3InvokeFunction &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--action&lt;/span&gt; lambda:InvokeFunction &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--principal&lt;/span&gt; s3.amazonaws.com &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--source-arn&lt;/span&gt; arn:aws:s3:::&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Waiting for permission to propagate..."&lt;/span&gt;
    &lt;span class="nb"&gt;sleep &lt;/span&gt;20
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"S3 trigger permission already exists for Lambda. Skipping..."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Check and configure S3 bucket notification if it doesn't exist&lt;/span&gt;
&lt;span class="nv"&gt;CURRENT_NOTIFICATIONS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3api get-bucket-notification-configuration &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CURRENT_NOTIFICATIONS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Configuring S3 bucket notification..."&lt;/span&gt;
    aws s3api put-bucket-notification-configuration &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--notification-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
            "LambdaFunctionConfigurations": [{
                "LambdaFunctionArn": "arn:aws:lambda:'&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;':'&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCOUNT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;':function:'&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;'",
                "Events": ["s3:ObjectCreated:*"]
            }]
        }'&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"S3 bucket notification already configured. Skipping..."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script checks if the AWS Lambda function already exists and creates it if it does not.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If it does not exist, it creates it with the necessary configuration, including the previously created IAM role, environment variables, and the Docker image. Additionally, it adds a permission to the AWS Lambda function to be invoked by the S3 bucket and configures the S3 bucket to trigger the AWS Lambda function when a new object is created.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If it does exist, it updates the function code and configuration.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The environment variables are loaded from the &lt;code&gt;.env&lt;/code&gt; file and stored in the AWS Lambda function's environment variables. This allows the AWS Lambda function to access the necessary resources and configurations.&lt;/p&gt;

&lt;p&gt;Also to be sure the AWS Lambda function is invoked, the S3 bucket is configured to trigger the AWS Lambda function when a new object is created, like uploading a new PDF.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6awp54t9j8cbhwiq801p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6awp54t9j8cbhwiq801p.png" alt="AWS Lambda S3 invocation" width="800" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can run the script with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run scripts/create_lambda.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing and Quality Assurance
&lt;/h2&gt;

&lt;p&gt;Testing and ensuring good code quality are essential steps in any software development pipeline, particularly when deploying to cloud services such as AWS Lambda. It is important to ensure that your code works as expected and is clean, efficient, and type-safe.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unit Testing with Pytest
&lt;/h3&gt;

&lt;p&gt;Unit tests ensure that each part of the code behaves as expected. Under the &lt;code&gt;tests&lt;/code&gt; directory you can create test following tests files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;test_collection_exists.py&lt;/code&gt;: Verifies that the collection exists in Zilliz Cloud before attempting to insert embeddings.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

  &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
  &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pymilvus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MilvusClient&lt;/span&gt;


  &lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;milvus_client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
      &lt;span class="c1"&gt;# Initialize Milvus client with environment variables for URI and token
&lt;/span&gt;      &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MilvusClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
      &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Close the connection after the test
&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_check_collection_existence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;milvus_client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;# Step 1: Get list of all collections in the Milvus instance
&lt;/span&gt;      &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;milvus_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_collections&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

      &lt;span class="c1"&gt;# Step 2: Assert that the collection name exists in the list of collections
&lt;/span&gt;      &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;collection_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Collection &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; does not exist in Milvus.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;test_collection_mock.py&lt;/code&gt;: Uses mocks to test the collection existence and dropping functionality.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
  &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;unittest.mock&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patch&lt;/span&gt;

  &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;


  &lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
      &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pymilvus.MilvusClient&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mock_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="n"&gt;client_instance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="n"&gt;mock_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_instance&lt;/span&gt;
          &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;client_instance&lt;/span&gt;


  &lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
      &lt;span class="n"&gt;env_vars&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_CLOUD_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fake-uri&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test_collection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ZILLIZ_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fake-token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_vars&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
          &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;env_vars&lt;/span&gt;


  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_drop_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.drop_collection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;drop_collection&lt;/span&gt;

      &lt;span class="c1"&gt;# Call drop collection
&lt;/span&gt;      &lt;span class="nf"&gt;drop_collection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

      &lt;span class="c1"&gt;# Verify the drop_collection method was called with correct parameters
&lt;/span&gt;      &lt;span class="n"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop_collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assert_called_once_with&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;


  &lt;span class="nd"&gt;@pytest.mark.parametrize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;collection_exists&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_collection_existence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collection_exists&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="n"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;list_collections&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;collection_exists&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;# Check if collection exists
&lt;/span&gt;      &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mock_milvus_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_collections&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;collection_exists&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
      &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;mock_env_vars&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;test_lambda_function.py&lt;/code&gt;: Tests the AWS Lambda function locally with an actual PDF file stored in the S3 bucket.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can upload a pdf file to your S3 bucket with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;your-file.pdf s3://your-bucket-name/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
  &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aws_lambda.lambda_function&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lambda_handler&lt;/span&gt;

  &lt;span class="c1"&gt;# Set up test event
&lt;/span&gt;  &lt;span class="n"&gt;TEST_BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDF_BUCKET_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;TEST_FILE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1706.03762v7.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

  &lt;span class="n"&gt;test_event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TEST_BUCKET&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TEST_FILE&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
              &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_lambda_handler&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
      &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the lambda_handler function with an actual S3 file.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
      &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unexpected response: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the tests, you can use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run pytest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is set up correctly, you should see the tests passing.&lt;/p&gt;

&lt;p&gt;You will see an output similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;================================================================&lt;/span&gt; &lt;span class="nb"&gt;test &lt;/span&gt;session starts 

tests/test_collection_exists.py::test_check_collection_existence PASSED
tests/test_collection_mock.py::test_drop_collection PASSED
tests/test_collection_mock.py::test_collection_existence[True]  result: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'test_collection'&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
PASSED
tests/test_collection_mock.py::test_collection_existence[False]  result: &lt;span class="o"&gt;[]&lt;/span&gt;
PASSED
tests/test_lambda_function.py::test_lambda_handler Received event: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"Records"&lt;/span&gt;: &lt;span class="o"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;"s3"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"bucket"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"embeddings-8213c13740654398b076090eac96473e"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;, &lt;span class="s2"&gt;"object"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;: &lt;span class="s2"&gt;"1706.03762v7.pdf"&lt;/span&gt;&lt;span class="o"&gt;}}}]}&lt;/span&gt;
Initializing Milvus client...
Initializing OpenAI embeddings...
Initializing text splitter...
Processing file 1706.03762v7.pdf from bucket embeddings-8213c13740654398b076090eac96473e
Downloading file to /tmp/1706.03762v7.pdf
Loading and splitting PDF...
Split PDF into 15 chunks
Generating embeddings and preparing data...
Inserting 15 records into collection pdf_embeddings
Processing completed successfully
PASSED

&lt;span class="o"&gt;================================================================&lt;/span&gt; 5 passed &lt;span class="k"&gt;in &lt;/span&gt;16.73s &lt;span class="o"&gt;=================================================================&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Quality Assurance with Ruff and MyPy
&lt;/h3&gt;

&lt;p&gt;Ruff and MyPy are static analysis tools that help ensure your code is clean, efficient, and type-safe.&lt;/p&gt;

&lt;p&gt;Ruff is a linter that checks for code style and syntax errors. MyPy is a static type checker that ensures your code is type-safe.&lt;/p&gt;

&lt;p&gt;With the following commands, you can run Ruff and MyPy to check your code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run ruff check &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--fix&lt;/span&gt; &lt;span class="nt"&gt;--exit-non-zero-on-fix&lt;/span&gt;
uv run mypy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is set up correctly, you should see no errors or warnings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhc8sbbmqws58le84jg9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhc8sbbmqws58le84jg9.png" alt="Quality assurance with Ruff and MyPy" width="800" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing CI/CD with CircleCI
&lt;/h2&gt;

&lt;p&gt;Continuous Integration and Continuous Deployment (CI/CD) are essential practices for automating the testing, building, and deployment of your applications. CircleCI provides a platform to automate your development workflows, including code testing, Docker image building, and deployment to AWS Lambda.&lt;/p&gt;

&lt;p&gt;To configure your pipeline, you’ll need a &lt;code&gt;.circleci/config.yml&lt;/code&gt; file in a &lt;code&gt;.circleci&lt;/code&gt; directory at the root of your project. This configuration file defines your jobs, workflows, and execution steps for building, testing, and deploying your Lambda function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2.1&lt;/span&gt;

&lt;span class="na"&gt;orbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;aws-cli&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/aws-cli@5.2.0&lt;/span&gt;
  &lt;span class="na"&gt;docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/docker@2.8.2&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cimg/python:3.12&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;checkout&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install UV&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;curl -LsSf https://astral.sh/uv/install.sh | sh&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create venv and install dependencies&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv sync --all-extras&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run ruff&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv run ruff check . --fix --exit-non-zero-on-fix&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run MyPy&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv run mypy&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run tests&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv run pytest&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create .env file&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;echo "ZILLIZ_CLOUD_URI=${ZILLIZ_CLOUD_URI}" &amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "ZILLIZ_TOKEN=${ZILLIZ_TOKEN}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "COLLECTION_NAME=${COLLECTION_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "PDF_BUCKET_NAME=${PDF_BUCKET_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "OPENAI_API_KEY=${OPENAI_API_KEY}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_REGION=${AWS_REGION}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_ACCOUNT_ID=${AWS_ACCOUNT_ID}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "REPOSITORY_NAME=${REPOSITORY_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "IMAGE_NAME=${IMAGE_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "LAMBDA_FUNCTION_NAME=${LAMBDA_FUNCTION_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "ROLE_NAME=${ROLE_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "ROLE_POLICY_NAME=${ROLE_POLICY_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;aws-cli/setup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;profile_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;setup_remote_docker&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy to AWS&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;chmod +x scripts/build_deploy.sh&lt;/span&gt;
            &lt;span class="s"&gt;./scripts/build_deploy.sh&lt;/span&gt;

&lt;span class="na"&gt;workflows&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build-deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The file can be broken down into the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Orbs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;aws-cli&lt;/code&gt;: The AWS CLI orb simplifies the setup of AWS CLI to interact with AWS services.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docker&lt;/code&gt;: The CircleCI Docker orb handles setting up the Docker environment.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Jobs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;build-deploy&lt;/code&gt;: This job is responsible for building and deploying the AWS Lambda function. It includes steps for checking out the code, installing dependencies, running tests, and deploying the function to AWS Lambda. As you need to execute multiple scripts in sequence, you can use a single bash script &lt;code&gt;build_deploy.sh&lt;/code&gt; to do so and save it in the &lt;code&gt;scripts&lt;/code&gt; directory.
&lt;/li&gt;
&lt;/ul&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Exit immediately if a command fails&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="c"&gt;# Define script paths&lt;/span&gt;
&lt;span class="nv"&gt;SCRIPT1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"scripts/create_roles.sh"&lt;/span&gt;
&lt;span class="nv"&gt;SCRIPT2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"scripts/create_image.sh"&lt;/span&gt;
&lt;span class="nv"&gt;SCRIPT3&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"scripts/create_lambda.sh"&lt;/span&gt;

&lt;span class="c"&gt;# Ensure scripts are executable&lt;/span&gt;
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x &lt;span class="nv"&gt;$SCRIPT1&lt;/span&gt; &lt;span class="nv"&gt;$SCRIPT2&lt;/span&gt; &lt;span class="nv"&gt;$SCRIPT3&lt;/span&gt;

&lt;span class="c"&gt;# Run the scripts sequentially&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running Script 1..."&lt;/span&gt;
&lt;span class="nv"&gt;$SCRIPT1&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running Script 2..."&lt;/span&gt;
&lt;span class="nv"&gt;$SCRIPT2&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running Script 3..."&lt;/span&gt;
&lt;span class="nv"&gt;$SCRIPT3&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"All scripts executed successfully!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Workflows&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;deploy&lt;/code&gt; workflow triggers the &lt;code&gt;build-deploy&lt;/code&gt; job when a push is made to the main branch.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Once you have committed the configuration file, &lt;a href="https://circleci.com/blog/pushing-a-project-to-github/" rel="noopener noreferrer"&gt;push it to your GitHub repository&lt;/a&gt;, and visit the &lt;a href="https://app.circleci.com/" rel="noopener noreferrer"&gt;CircleCI dashboard&lt;/a&gt; to set up your project.&lt;/p&gt;

&lt;p&gt;Select your repository and click &lt;strong&gt;Set Up Project&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ecfxru14hdee8v2smdo.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ecfxru14hdee8v2smdo.PNG" alt="CircleCI project setup" width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, choose the appropriate branch to trigger the first pipeline. You can select the branch you want to use for your CI/CD pipeline. In this case, you can choose the &lt;code&gt;main&lt;/code&gt; branch and click &lt;strong&gt;Set Up Project&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4kufsizo2fq0a5g1rme.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4kufsizo2fq0a5g1rme.PNG" alt="CircleCI branch trigger" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this is your first time triggering a build on CircleCI for this project, note that the initial pipeline will fail.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qq6dl5ztxo29zi6qpgu.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qq6dl5ztxo29zi6qpgu.PNG" alt="CircleCI pipeline fail" width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is expected behavior as environment variables are required for the pipeline to run successfully. CircleCI does not allow you to configure them until the project has been initialized by that first triggered build.&lt;/p&gt;

&lt;p&gt;After the initial failure, open the Project Settings, go to the Environment Variables section, and add all the required &lt;a href="https://circleci.com/docs/set-environment-variable/" rel="noopener noreferrer"&gt;environment variables&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hteubbffokkxr3b21p7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hteubbffokkxr3b21p7.png" alt="CircleCI environment variables" width="800" height="722"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the variables are saved, re-run the pipeline. It should now complete successfully and deploy your AWS Lambda function. From this point onward, CircleCI will retain your environment variables, and you won't need to configure then again unless you introduce new ones.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzrb2cgt9iz7srdyar4y.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzrb2cgt9iz7srdyar4y.PNG" alt="CircleCI build and deploy pipeline" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To confirm that the deployment works as expected, upload a PDF to the configured S3 bucket. The Lambda function should be automatically tiggered by the S3 event.&lt;/p&gt;

&lt;p&gt;To upload a PDF to the S3 bucket, you can use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;your-file.pdf s3://your-bucket-name/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To monitor the logs of the AWS Lambda function, you can use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws logs &lt;span class="nb"&gt;tail&lt;/span&gt; /aws/lambda/your-lambda-function &lt;span class="nt"&gt;--follow&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is set up correctly, you should see the logs of the AWS Lambda function and you can check of the PDF was processed correctly by checking your Zilliz Cloud collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x24zfpabrw22dzav1xi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x24zfpabrw22dzav1xi.png" alt="AWS Lambda invocation" width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleaning up
&lt;/h2&gt;

&lt;p&gt;If you do not need the respurces anymore, make sure you delete them to avoid unnecessary charges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog, you have walked through building and automating a serverless PDF processing pipeline using AWS Lambda, Docker, and CircleCI. The automated process involves triggering AWS Lambda functions via S3 events, generating embeddings using OpenAI, and storing them in Milvus on Zilliz Cloud. The CI/CD pipeline powered by CircleCI ensures that the code is automatically tested, built into a Docker image, and deployed to AWS Lambda, streamlining the development and deployment process.&lt;/p&gt;

&lt;p&gt;Using Docker with AWS Lambda provides a consistent environment for your AWS Lambda function, ensuring that dependencies and configurations are maintained across different stages. The CircleCI pipeline automates testing, building, and deployment, reducing manual intervention and enabling fast and reliable updates to the AWS Lambda function. These tools work together to ensure efficiency, scalability, and security.&lt;/p&gt;

&lt;p&gt;Looking ahead, potential improvements could include enhancing the error handling and logging in the AWS Lambda function, adding more comprehensive testing coverage, saving the environment variables in AWS Secrets Manager, and introducing monitoring and alerting to track the performance of the AWS Lambda function. As the pipeline evolves, it can scale to handle more complex workflows and integrations, ensuring continued success and reliability in production.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>cicd</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>End-to-end testing and deployment of a multi-agent AI system with Docker, LangGraph, and CircleCI</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Sat, 31 May 2025 09:38:13 +0000</pubDate>
      <link>https://dev.to/aws-builders/end-to-end-testing-and-deployment-of-a-multi-agent-ai-system-with-docker-langgraph-and-circleci-4gee</link>
      <guid>https://dev.to/aws-builders/end-to-end-testing-and-deployment-of-a-multi-agent-ai-system-with-docker-langgraph-and-circleci-4gee</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7aqrv5oaafdo584ytxu7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7aqrv5oaafdo584ytxu7.png" alt="CircleCI" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Multi-agent AI systems are transforming how intelligent applications are built. By orchestrating multiple specialized agents that collaborate to solve complex tasks, these systems enable more dynamic and efficient workflows. However, deploying such a system reliably and at scale requires a structured approach to testing, packaging, and automation.&lt;/p&gt;

&lt;p&gt;In this tutorial, you will walk through the process of building, testing, and deploying a multi-agent AI system using &lt;a href="https://www.langchain.com/langgraph" rel="noopener noreferrer"&gt;LangGraph&lt;/a&gt;, &lt;a href="https://www.docker.com/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt;, &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, and &lt;a href="https://circleci.com/" rel="noopener noreferrer"&gt;CircleCI&lt;/a&gt;. You will develop a research-driven AI workflow where different agents,such as fact-checking, summarization, and search agents, work together seamlessly. You will package this application into a Docker container, deploy it to AWS Lambda, and automate the entire pipeline using CircleCI.&lt;/p&gt;

&lt;p&gt;By the end of this guide, you will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand how LangGraph enables stateful multi-agent interactions.&lt;/li&gt;
&lt;li&gt;Learn how to containerize the application for scalable cloud deployment.&lt;/li&gt;
&lt;li&gt;Set up end-to-end testing for agent reliability and AWS Lambda functionality.&lt;/li&gt;
&lt;li&gt;Use CircleCI to automate testing and deployment with every Git push.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This tutorial assumes some familiarity with Python, AWS, and Docker. You can check out the complete &lt;a href="https://github.com/benitomartin/multiagent-langgraph-circleci" rel="noopener noreferrer"&gt;source code&lt;/a&gt; on GitHub, but this guide will walk you through the process step by step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, ensure that you have the following requirements in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS Account&lt;/strong&gt;: &lt;a href="https://signin.aws.amazon.com/signup?request_type=register" rel="noopener noreferrer"&gt;Sign up&lt;/a&gt; for an AWS account if you do not already have one. You will use AWS Lambda and &lt;a href="https://aws.amazon.com/ecr/" rel="noopener noreferrer"&gt;Elastic Container Registry (ECR)&lt;/a&gt; for deployment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS CLI Installed and Configured&lt;/strong&gt;: Install the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html" rel="noopener noreferrer"&gt;AWS Command Line Interface (CLI)&lt;/a&gt; and configure it with your AWS credentials. You can follow the AWS CLI setup guide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS Bedrock&lt;/strong&gt;: You will be using &lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;AWS Bedrock&lt;/a&gt; Anthropic models. Therefore, you will need to request access to these models to be able to use them in your application. Specifically you will be using &lt;strong&gt;Claude 3 Haiku&lt;/strong&gt;. Under AWS Bedrock, go to &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html" rel="noopener noreferrer"&gt;model access and request access to the model&lt;/a&gt;. Once you see "Access granted" in green, you can invoke the model.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnci8ni3y8mm8phqwujdl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnci8ni3y8mm8phqwujdl.png" alt="AWS Bedrock Haiku" width="800" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Basic Knowledge of LangChain or LangGraph&lt;/strong&gt;:Understanding the fundamentals of LangChain and LangGraph will help you design the multi-agent workflow efficiently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Familiarity with AWS Lambda and Docker&lt;/strong&gt;: You should know the basics of AWS Lambda and Docker, as you will use them to package and deploy the application.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitHub and CircleCI Accounts&lt;/strong&gt;: Create accounts on GitHub and CircleCI to manage the version control and automate the &lt;a href="https://circleci.com/ci-cd/" rel="noopener noreferrer"&gt;CI/CD pipeline&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenAI API Key&lt;/strong&gt;: To access OpenAI’s GPT models, you will need an API key. You can sign up for an API key on the &lt;a href="https://platform.openai.com/signup" rel="noopener noreferrer"&gt;OpenAI website&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Serper API Key&lt;/strong&gt;: To perform Google Search queries programmatically, obtain a free API key from &lt;a href="https://serper.dev/" rel="noopener noreferrer"&gt;Serper&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;uv&lt;/strong&gt;: Install &lt;a href="https://docs.astral.sh/uv/getting-started/installation/" rel="noopener noreferrer"&gt;&lt;code&gt;uv&lt;/code&gt;&lt;/a&gt; to manage dependencies and virtual environments. Set up instructions can be found in the "Installing Dependencies" section below.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you have these prerequisites in place, you will be ready to set up the multi-agent project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up a Multiagent Project
&lt;/h2&gt;

&lt;p&gt;Before you start building the multi-agent system, you need to set up the project environment, install dependencies, and understand the role of LangGraph in managing multi-agent workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview of LangGraph
&lt;/h3&gt;

&lt;p&gt;LangGraph is a framework built on &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; that enables structured multi-agent interactions. It allows you to define agent workflows, manage stateful decision-making, and create directed graphs where agents collaborate based on predefined logic.&lt;/p&gt;

&lt;p&gt;Key features of LangGraph include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stateful Agent Orchestration&lt;/strong&gt;: Agents can remember past interactions and modify behavior accordingly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph-Based Workflow Execution&lt;/strong&gt;: Define agent interactions using nodes and edges, making execution flow transparent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decision-Making and Conditional Branching&lt;/strong&gt;: Implement custom logic that determines how agents respond to inputs dynamically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this project, you will use LangGraph to design an AI research workflow where agents handle tasks such as searching for information, summarizing content, fact-checking, and generating reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Environment
&lt;/h3&gt;

&lt;p&gt;First, clone the repository containing the project code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/benitomartin/multiagent-langgraph-circleci.git
&lt;span class="nb"&gt;cd &lt;/span&gt;multiagent-langgraph-circleci
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, run the following commands to install dependencies and set up the virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--all-extras&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These command will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install the dependencies defined in &lt;code&gt;pyproject.toml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Automatically create a virtual environment (&lt;code&gt;.venv&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Activates the virtual environment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, creeate &lt;code&gt;.env&lt;/code&gt; file in the root directory of your repository and add the required environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SERPER_API_KEY=your_serper_key_here                
OPENAI_API_KEY=your_openai_key_here
REPOSITORY_NAME=langgraph-ecr-docker-repo
LAMBDA_FUNCTION_NAME=langgraph-lambda-function     
ROLE_NAME=lambda-bedrock-role                      
ROLE_POLICY_NAME=LambdaBedrockPolicy  
IMAGE_NAME=langgraph-lambda-image                  
AWS_REGION=your_aws_region                          
AWS_ACCESS_KEY_ID=your_aws_access_key              
AWS_SECRET_ACCESS_KEY=your_aws_secret_key          
AWS_ACCOUNT_ID=your_aws_account_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These variables will be used for API authentication, AWS service configurations, and deployment settings. During deployment, the AWS ECR repository, the AWS Lambda function, the IAM Role and Policy will be created automatically using a bash script.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Project Structure
&lt;/h3&gt;

&lt;p&gt;The project has the following directory structure. If you cloned the repository, you should already have this structure in place. If you are setting up the project manually, create these directories and files as needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;multiagent-langgraph-circleci/
├── .env                          
├── .gitignore                     
├── README.md                      
├── pyproject.toml                
├── uv.lock                     
├── build_deploy.sh                
├── Dockerfile                    
├── requirements.txt               
├── lambda_function/
│   ├── lambda_handler.py          
│   └── requirements.txt           
└── src/
    ├── agents/                    
    │   ├── __init__.py
    │   ├── fact_checking_agent.py
    │   ├── report_generation_agent.py
    │   ├── search_agent.py
    │   ├── stop_workflow_agent.py
    │   └── summarization_agent.py
    ├── graph/
    │   └── research_graph.py      
    ├── models/
    │   ├── __init__.py           
    │   └── schemas.py             
    └── utils/                      
        ├── chain_builder.py
        ├── error_handler.py
        └── prompt_templates.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure ensures modularity and maintainability, allowing for seamless multi-agent interactions within the LangGraph framework.&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining and Creating Agents
&lt;/h2&gt;

&lt;p&gt;In this section, you will define the agents responsible for handling different tasks in the multi-agent workflow, and understand the role of Pydantic models in structuring data for inter-agent communication. The agents will be designed to interact with each other, share data, and perform specific operations, such as search, summarization, fact-checking, and report generation.&lt;/p&gt;

&lt;p&gt;The logic of this multi-agentworkflow is shown in the following figure:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjriwdgll2o798nrtedbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjriwdgll2o798nrtedbr.png" alt="langgraph workflow" width="289" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Schemas for Agent Communication
&lt;/h3&gt;

&lt;p&gt;Before defining the agents themselves, you will need to set up Pydantic models to define the data structure for agent communication. These models will ensure that data exchanged between agents is validated and formatted correctly.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;schemas.py&lt;/code&gt; file contains the definitions of several models that will be used by the agents to manage the state and data during the process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="n"&gt;summarized_content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;fact_checked_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;final_report&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;fact_check_attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;summarization_attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;search_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SearchResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The title of the search result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The URL of the search result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A brief excerpt or summary of the search result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;main_points&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List of key points from the search results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;benefits&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List of specific benefits of the search results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conclusion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A concise conclusion about the search results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FactCheckResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;is_accurate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Whether the summary is factually accurate based on the search results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List of inaccuracies or inconsistencies found in the summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;corrected_facts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List of corrections for any identified issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Confidence score from 0.0 to 1.0 indicating reliability of the fact check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinalReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The final research report generated from the summary and fact-check results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ResearchState&lt;/code&gt; model tracks the overall state of the research workflow. It includes information such as the search query, results, summary, fact-check status, and errors. The other models (&lt;code&gt;SearchResult&lt;/code&gt;, &lt;code&gt;Summary&lt;/code&gt;, &lt;code&gt;FactCheckResult&lt;/code&gt;, and &lt;code&gt;FinalReport&lt;/code&gt;) define the data structures for the results that each agent will produce or consume during the workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prompt Templates for Agents
&lt;/h3&gt;

&lt;p&gt;To guide the agents' actions, you will use &lt;a href="https://python.langchain.com/docs/how_to/#prompt-templates" rel="noopener noreferrer"&gt;prompt templates&lt;/a&gt;. These templates define the instructions that will be passed to the large language model (LLM) to guide its response.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;prompt_templates.py&lt;/code&gt; file, you can define the prompts used by the summarization, fact-checking, and report generation agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PromptTemplates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Centralized class for all prompt templates used in the research workflow.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;summarization_prompt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a summarization agent. Summarize the following search results:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{results}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Provide a structured summary of the key information about the benefits &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;and main points.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fact_checking_prompt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a fact-checking agent. Review the following summary and verify it &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;against the original search results:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summary: {summary}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Original results: {original_results}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Identify any inaccuracies or inconsistencies. Provide a confidence score &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;indicating how reliable your fact check is.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;report_generation_prompt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a report generation agent. Create a comprehensive research report &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;based on the following information:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Original query: {query}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content summary: {summary}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Format the report with markdown, including appropriate headings, bullet &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;points, and sections. The report should be informative, well-structured, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;and directly address the original query.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each prompt template corresponds to a specific agent task in the workflow. The &lt;code&gt;summarization_prompt&lt;/code&gt; will guide the summarization agent to generate summaries, while the &lt;code&gt;fact_checking_prompt&lt;/code&gt; will check the confidence of the summary and the &lt;code&gt;report_generation_prompt&lt;/code&gt; will guide the report generation agent to create reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building Chains with the Agents
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;chain_builder.py&lt;/code&gt; file is responsible for linking the prompt templates with the LLMs that execute the tasks. The &lt;code&gt;ChainBuilder&lt;/code&gt; class will combine the respective input variables, a prompt template and an LLM to execute the required steps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChainBuilder&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_vars&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;structured_llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_structured_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;input_variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;input_vars&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;structured_llm&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ChainBuilder&lt;/code&gt; helps in &lt;a href="https://python.langchain.com/docs/how_to/structured_output/" rel="noopener noreferrer"&gt;structuring&lt;/a&gt; the agents interaction with the LLM, ensuring that the output is structured according to the Pydantic model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error Handling
&lt;/h3&gt;

&lt;p&gt;To ensure smooth execution of the agents, you can use an error-handling utility in the &lt;code&gt;error_handler.py&lt;/code&gt; file to log and add errors as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ErrorHandler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This utility is especially useful for tracking issues that may arise during agent execution and can be used to update the state with error messages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Search Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;SearchAgent&lt;/code&gt; is responsible for querying Google and retrieving search results. It uses the &lt;strong&gt;GoogleSerperAPIWrapper&lt;/strong&gt; to interface with the Google Serper API. The following code defines the agent behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.utilities&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleSerperAPIWrapper&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SearchResult&lt;/span&gt;  

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SearchAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;serper_api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleSerperAPIWrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serper_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;serper_api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;max_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search agent error: No query provided&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;
            &lt;span class="n"&gt;raw_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;results&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Convert raw search results to instances of the SearchResult model
&lt;/span&gt;            &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="nc"&gt;SearchResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;link&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;organic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;

            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search agent found &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; results with max results equal to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search Results: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

            &lt;span class="c1"&gt;# Return the list of SearchResult objects
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;  
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search agent error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This agent is tasked with performing searches based on the given query, collecting the results, and then formatting them as &lt;code&gt;SearchResult&lt;/code&gt; instances.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Summarize Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;SummarizationAgent&lt;/code&gt; is responsible for summarizing the search results retrieved by the &lt;code&gt;SearchAgent&lt;/code&gt;. It uses the language model to summarize key points, benefits, and conclusions based on the search results provided.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Summary&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.chain_builder&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChainBuilder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.prompt_templates&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplates&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SummarizationAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChainBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;search_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarization agent executing with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; search results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Increment the summarization attempt counter
&lt;/span&gt;        &lt;span class="n"&gt;summarization_attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarization_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarization_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summarization_attempts&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
            &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarization agent error: No search results to summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;summary_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PromptTemplates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summarization_prompt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;input_vars&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Summary&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Format the search results as a string for the prompt
&lt;/span&gt;            &lt;span class="n"&gt;formatted_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;URL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Snippet: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;search_results&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;

            &lt;span class="c1"&gt;# Invoke the chain with the formatted results
&lt;/span&gt;            &lt;span class="n"&gt;summary_obj&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summary_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;formatted_results&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

            &lt;span class="c1"&gt;# Format the summary object into a string
&lt;/span&gt;            &lt;span class="n"&gt;summary_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# Summary&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;summary_str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;## Key Points&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;main_points&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;summary_str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;## Benefits&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;benefits&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="n"&gt;summary_str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;## Conclusion&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary_obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conclusion&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_str&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
            &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarization agent error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;SummarizationAgent&lt;/code&gt; uses the &lt;code&gt;ChainBuilder&lt;/code&gt; to create a chain from the &lt;code&gt;summarization_prompt&lt;/code&gt; template and the LLM. It then formats the search results of the &lt;code&gt;SearchAgent&lt;/code&gt;into a string and passes them to the language model for summarization. The result is structured into key points, benefits, and conclusions.&lt;/p&gt;

&lt;p&gt;Note that the summarization attempts counter is incremented each time the agent is executed. This happen if in the next agent, the &lt;code&gt;FactCheckingAgent&lt;/code&gt;, the confidence score from the fact-checking process is below a certain threshold.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Fact-Checking Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;FactCheckingAgent&lt;/code&gt; is designed to validate the accuracy of a generated summary by comparing it with the original search results. It uses the&lt;code&gt;fact_checking_prompt&lt;/code&gt; template to cross-reference the summary with the search results. If the fact-checking process determines that the confidence score is low, it triggers additional search retries with an increased number of results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FactCheckResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.chain_builder&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChainBuilder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.error_handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ErrorHandler&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.prompt_templates&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplates&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FactCheckingAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add_max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChainBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;  &lt;span class="c1"&gt;# Store the passed confidence threshold
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;                    &lt;span class="c1"&gt;# Store the max_retries value
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_max_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;add_max_results&lt;/span&gt;            &lt;span class="c1"&gt;# Store the add_max_results value
&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;search_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

        &lt;span class="c1"&gt;# Increment fact-checking attempt counter
&lt;/span&gt;        &lt;span class="n"&gt;fact_check_attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_check_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_check_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_attempts&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact-checking agent error: Missing data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;

        &lt;span class="n"&gt;fact_check_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PromptTemplates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fact_checking_prompt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;input_vars&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;original_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FactCheckResult&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;results_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;fact_check_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;original_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact-checking agent completed review. Accurate: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fact_check_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_accurate&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Confidence score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fact_check_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;fact_check_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;retry_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;max_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retry Count:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Max Results:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

                    &lt;span class="c1"&gt;# Only increase max_results if we're NOT about to hit the retry cap
&lt;/span&gt;                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retrying search number &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_max_results&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Increasing max_results to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;max_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_checked_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_results&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrorHandler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact-checking agent error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;FactCheckingAgent&lt;/code&gt;uses the &lt;code&gt;ChainBuilder&lt;/code&gt; to construct the chain with the &lt;code&gt;FactCheckResult&lt;/code&gt; prompt. It checks the accuracy of the summary from the &lt;code&gt;SummarizationAgent&lt;/code&gt; by comparing it with the search results from the &lt;code&gt;SearchAgent&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Generate Report Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;ReportGenerationAgent&lt;/code&gt; is responsible for generating the final research report based on the query and summary, only after it passes the fact checking process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FinalReport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.chain_builder&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChainBuilder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.error_handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ErrorHandler&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.utils.prompt_templates&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplates&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ReportGenerationAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChainBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report generation agent error: Missing required content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;

        &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain_builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PromptTemplates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;report_generation_prompt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;input_vars&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FinalReport&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;final_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="n"&gt;final_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;final_report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

            &lt;span class="c1"&gt;# Print the final report
&lt;/span&gt;            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;======= FINAL REPORT =======&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final_report&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;============================&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;final_report&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrorHandler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report generation agent error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ReportGenerationAgent&lt;/code&gt; generates the final research report using the &lt;code&gt;ChainBuilder&lt;/code&gt; and &lt;code&gt;report_generation_prompt&lt;/code&gt;, combining the search query and the summary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Stop Workflow Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;StopWorkflowAgent&lt;/code&gt; is responsible for stopping the workflow if there is any errors and displaying the final confidence score in case it has been calculated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StopWorkflowAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Stops the workflow and displays the final confidence score&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_checked_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;N/A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Add the message to the errors list
&lt;/span&gt;        &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building the Agentic Graph
&lt;/h2&gt;

&lt;p&gt;In the previous sections, you have defined individual agents and their respective roles. Now, you need to organize them into an &lt;strong&gt;Agentic Graph&lt;/strong&gt; that manages the flow of the entire multi-agent system. This graph allows the agents to interact in a sequence based on their results, making the workflow adaptive and capable of handling real-world scenarios such as retries and failure conditions.&lt;/p&gt;

&lt;p&gt;Here is a step-by-step explanation of each part of the Agentic Graph construction, under the file &lt;code&gt;research_graph.py&lt;/code&gt; and how it integrates with the agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration Settings
&lt;/h3&gt;

&lt;p&gt;The configuration settings, under &lt;code&gt;settings.py&lt;/code&gt; load essential environment variables, such as API keys for different services, such as the Serper API, OpenAI and AWS variables. It also sets the LLM models for the Summarization and Fact-Checking agents, as well as the confidence threshold, number of search results and maximum retries for the Fact-Checking agent. You can select the models that best suit your needs but for this example you will use &lt;code&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/code&gt; from AWS Bedrock for the summarization agent and &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/" rel="noopener noreferrer"&gt;&lt;code&gt;gpt-4o-mini&lt;/code&gt;&lt;/a&gt; for the fact-checking agent. This last one will act as LLM-as-a-judge, to be sure that the summary is accurate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="c1"&gt;# Load environment variables from .env file
&lt;/span&gt;&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# API Keys
&lt;/span&gt;&lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Model Settings
&lt;/span&gt;&lt;span class="n"&gt;FACT_CHECK_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;SUMMARIZATION_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Workflow Settings
&lt;/span&gt;&lt;span class="n"&gt;CONFIDENCE_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;
&lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="c1"&gt;# Validate required environment variables
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SERPER_API_KEY environment variable is not set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY environment variable is not set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Importing Required Libraries and Modules
&lt;/h3&gt;

&lt;p&gt;To construct and run the agentic graph, you need to import several modules together with the defined agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ChatBedrock&lt;/strong&gt; and &lt;strong&gt;ChatOpenAI&lt;/strong&gt;: For working with AWS and OpenAI-based models, respectively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;StateGraph&lt;/strong&gt;: To build and manage the workflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ResearchState&lt;/strong&gt;: This schema keeps track of the research flow and stores state information.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_aws&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatBedrock&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config.settings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;FACT_CHECK_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SUMMARIZATION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CONFIDENCE_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.fact_checking_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FactCheckingAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.report_generation_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ReportGenerationAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.search_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SearchAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.stop_workflow_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StopWorkflowAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.summarization_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SummarizationAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.models.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Initializing Agents
&lt;/h3&gt;

&lt;p&gt;Each agent is initialized based on the configuration and APIs available. This step ensures that each agent has the required models and settings to operate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_research_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serper_api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                         &lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CONFIDENCE_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;add_max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="c1"&gt;# Initialize different models for Summarization and Fact-Checking agents
&lt;/span&gt;    &lt;span class="n"&gt;fact_check_llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FACT_CHECK_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;summarization_llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatBedrock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SUMMARIZATION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Initialize agents
&lt;/span&gt;    &lt;span class="n"&gt;search_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SearchAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serper_api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;summarization_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SummarizationAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summarization_llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fact_checking_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FactCheckingAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fact_check_llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add_max_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;report_generation_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ReportGenerationAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summarization_llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;stop_workflow_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StopWorkflowAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, each agent is initialized with the corresponding models and configurations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SearchAgent&lt;/strong&gt;: Handles the search task using the Serper API.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SummarizationAgent&lt;/strong&gt;: Summarizes results using AWS Bedrock.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FactCheckingAgent&lt;/strong&gt;: Fact-checks results using OpenAI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ReportGenerationAgent&lt;/strong&gt;: Generates the final research report.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;StopWorkflowAgent&lt;/strong&gt;: Ends the workflow if there is any error.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Defining the Graph
&lt;/h3&gt;

&lt;p&gt;After the agents are initialized, you define the &lt;strong&gt;StateGraph&lt;/strong&gt; that represents the flow of the research process. You set the &lt;strong&gt;SearchAgent&lt;/strong&gt; as the entry point of the graph.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;StateGraph&lt;/strong&gt;: This is the heart of the agentic graph, defining how the agents interact and progress.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ResearchState&lt;/strong&gt;: This schema holds all the data and state required by the system.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# Define graph state
&lt;/span&gt;    &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Set entry point
&lt;/span&gt;    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_entry_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Adding Nodes
&lt;/h3&gt;

&lt;p&gt;Next, you add each agent as a node in the state graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# Add nodes
&lt;/span&gt;    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summarization_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fact_checking_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;report_generation_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stop_workflow_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each agent is associated with its corresponding execute method, which will be invoked during the workflow. These nodes represent the tasks or steps in the research process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding Conditional Edges
&lt;/h3&gt;

&lt;p&gt;Conditional edges dictate the flow from one agent to the next. Each function checks the current state and returns the next agent based on conditions. For instance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If there are errors at any point, the workflow stops.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the confidence score from the fact-checking is low, it retries the search with more search results,up to a maximum number of retries.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After these conditions are defined, you add the conditional edges between agents and compile the graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_search_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_summarization_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_fact_check_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;fact_check_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_checked_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
        &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fact_check_result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Maximum retry attempts (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) reached. Stopping workflow.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Go back to search if retries are available
&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Proceed to report generation
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_report_complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_stop_workflow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;

    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_search_complete&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_summarization_complete&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fact Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_fact_check_complete&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_report_complete&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop Workflow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_stop_workflow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running the Research Graph
&lt;/h3&gt;

&lt;p&gt;Finally, the graph is executed by invoking it with the research query. You also need to parse command-line arguments to define the confidence threshold, maximum retries, and additional results per retry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Run research graph with custom parameters&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the benefits of using AWS Cloud Services?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Research query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--confidence-threshold&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CONFIDENCE_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence score threshold (0-1)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--max-retries&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Maximum number of retries&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--add-max-results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Number of additional results per retry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_research_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence_threshold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;add_max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_max_results&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After you set up the graph you can invoke it with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run src/graph/research_graph.py &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"What are the benefits of using CircleCI?"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--confidence-threshold&lt;/span&gt; 0.85 &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--max-retries&lt;/span&gt; 3 &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--add-max-results&lt;/span&gt; 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will output the final report based on the structure defined in the &lt;code&gt;schemas.py&lt;/code&gt; file. If the confidence score is below the threshold, it will retry the search up to the maximum number of retries adding more results to the next search.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2hoyrwkntkvby286t00k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2hoyrwkntkvby286t00k.png" alt="LangGraph local invoke" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the confidence score is below the threshold, it will retry the search up to the maximum number of retries adding more results to the next search and only display the final report if the confidence score is above the threshold.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dockerizing the Application for AWS Lambda
&lt;/h2&gt;

&lt;p&gt;To deploy your multi-agent research application on &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, you need to package it in a way that ensures compatibility, portability, and ease of deployment. Using Docker simplifies this process by allowing you to create a containerized environment that includes all dependencies, ensuring your application runs seamlessly on AWS Lambda.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Lambda Function Handler
&lt;/h3&gt;

&lt;p&gt;The Lambda handler function acts as the entry point for your application when invoked by AWS Lambda. This function processes incoming requests, executes the agentic graph, and returns results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config.settings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CONFIDENCE_SCORE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.graph.research_graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;build_research_graph&lt;/span&gt;

&lt;span class="c1"&gt;# Set up logging
&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basicConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Log environment variables (masked for security)
&lt;/span&gt;    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SERPER_API_KEY present: %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY present: %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_ACCESS_KEY_ID present: %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_SECRET_ACCESS_KEY present: %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_DEFAULT_REGION present: %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="c1"&gt;# Extract parameters from event with defaults
&lt;/span&gt;    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the benefits of using AWS Cloud Services?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CONFIDENCE_SCORE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;add_max_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;add_max_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ADD_MAX_RESULTS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Validate parameters
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;confidence_score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Confidence score must be between 0 and 1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Max retries must be non-negative&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;add_max_results&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Additional max results must be positive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Build the research graph with custom parameters
&lt;/span&gt;    &lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_research_graph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;add_max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;add_max_results&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Run the graph
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_checked_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_check_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarization_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Writing the Dockerfile
&lt;/h3&gt;

&lt;p&gt;Your Dockerfile packages the application with all required dependencies, making it ready for deployment on AWS Lambda. Since AWS Lambda works best with a &lt;code&gt;requirements.txt&lt;/code&gt; file, you need to transfer dependencies from &lt;code&gt;pyproject.toml&lt;/code&gt; to &lt;code&gt;requirements.txt&lt;/code&gt;. You can do this using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv pip freeze &lt;span class="nt"&gt;--exclude-editable&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, create the Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/python:3.12&lt;/span&gt;

&lt;span class="c"&gt;# Set the working directory to /var/task&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${LAMBDA_TASK_ROOT}&lt;/span&gt;

&lt;span class="c"&gt;# Copy requirements first to leverage Docker cache&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt ./&lt;/span&gt;

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Copy source code and config&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; lambda_function/lambda_handler.py ./lambda_handler.py&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; src ./src&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; config ./config&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; .env ./.env&lt;/span&gt;

&lt;span class="c"&gt;# Command to run the Lambda handler function&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; [ "lambda_handler.lambda_handler" ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing the Docker Image Locally
&lt;/h3&gt;

&lt;p&gt;Before deploying your containerized Lambda function to AWS, you should test it locally to ensure it behaves as expected.&lt;/p&gt;

&lt;p&gt;Run the following command to build the image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; langgraph-lambda-function &lt;span class="nb"&gt;.&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use Docker to run the container and expose it on port 9000. Set up AWS credentials as environment variables to simulate execution in AWS Lambda:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 9000:8080 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;YOUR_AWS_ACCESS_KEY_ID&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;YOUR_AWS_SECRET_ACCESS_KEY&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;YOUR_AWS_DEFAULT_REGION&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    langgraph-lambda-function
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;YOUR_AWS_ACCESS_KEY_ID&lt;/code&gt;, &lt;code&gt;YOUR_AWS_SECRET_ACCESS_KEY&lt;/code&gt;, and &lt;code&gt;YOUR_AWS_DEFAULT_REGION&lt;/code&gt; with your AWS credentials before running the above command.&lt;/p&gt;

&lt;p&gt;Now, use &lt;code&gt;curl&lt;/code&gt; to send a test event to the running container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"http://localhost:9000/2015-03-31/functions/function/invocations"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"query": "What are the benefits of using AWS Cloud Services?", "confidence_threshold": 0.9, "max_retries": 2, "add_max_results": 3}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything works as expected, you should see the final report as in the previous section where you run the research graph locally. Now you are ready to deploy your application to AWS Lambda.&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing Tests for the Application
&lt;/h2&gt;

&lt;p&gt;Testing is essential for validating the correctness of your multi-agent workflow and ensuring your AWS Lambda function behaves as expected. To be sure the application works correctly, you can write &lt;a href="https://circleci.com/blog/unit-testing-vs-integration-testing/" rel="noopener noreferrer"&gt;unit tests&lt;/a&gt; for the Lambda handler function under &lt;code&gt;test_lambda_handler.py&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;lambda_function.lambda_handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lambda_handler&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_lambda_handler&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Create a mock event to simulate an AWS Lambda invocation
&lt;/span&gt;    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the capital of France?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Call the lambda_handler function
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Print the response for testing
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Response:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;


    &lt;span class="c1"&gt;# Assertions to validate the response
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Additionally you can also write some agents like the &lt;code&gt;SearchAgent&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agents.search_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SearchAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config.settings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_search_agent&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Tests the SearchAgent with a mock query.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Initialize the agent
&lt;/span&gt;    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SearchAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Test search execution
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is CircleCI?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Assertions
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By running the tests with &lt;a href="https://circleci.com/blog/pytest-python-testing/" rel="noopener noreferrer"&gt;pytest&lt;/a&gt;, you can ensure that your application functions correctly and that the multi-agent workflow is executed as expected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;uv run pytest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deploying to AWS Lambda with CircleCI
&lt;/h2&gt;

&lt;p&gt;Now it is time to automate the process of deploying your Dockerized application to AWS Lambda using CircleCI. CircleCI is a powerful CI/CD tool that can automate testing, building, and deploying your Lambda function. To simplify the process, you can use a shell script (&lt;code&gt;build_deploy.sh&lt;/code&gt;) to handle the deployment steps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Build and Deploy Script
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;build_deploy.sh&lt;/code&gt; script automates the process of building, testing, and deploying your Dockerized Lambda function. It includes all the necessary steps to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build the Docker image.&lt;/li&gt;
&lt;li&gt;Push it to &lt;a href="https://aws.amazon.com/ecr/" rel="noopener noreferrer"&gt;Amazon Elastic Container Registry (ECR)&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Deploy it to AWS Lambda with the required &lt;a href="https://aws.amazon.com/iam/" rel="noopener noreferrer"&gt;IAM&lt;/a&gt; permissions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that you need to create the &lt;code&gt;assume-role.json&lt;/code&gt; file in advance to execute the script. This file is required to define the IAM role assumption policy for Lambda to assume the role with permissions to access the required AWS resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lambda.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To execute the script, you need to make it executable and then run it. You can find the script in the repository &lt;a href="https://github.com/benitomartin/multiagent-langgraph-circleci/blob/main/build_deploy.sh" rel="noopener noreferrer"&gt;here&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x build_deploy.sh
./build_deploy.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the breakdown of the script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Exit on Errors (set -e)&lt;/strong&gt;: The script is configured to exit immediately if any command fails, ensuring that errors are caught early.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Loading Environment Variables&lt;/strong&gt;: Loads the environment variables from a &lt;code&gt;.env&lt;/code&gt; file. These variables contain sensitive information such as AWS keys, region, repository name, etc. The script uses these to configure AWS interactions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ECR Repository Check &amp;amp; Creation&lt;/strong&gt;: Check if the Amazon Elastic Container Registry (ECR) repository exist. If not, it creates a new repository where the Docker image will be pushed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generating &lt;code&gt;requirements.txt&lt;/code&gt;&lt;/strong&gt;: It generates the requirements file containing the dependencies of the Lambda function. This file is essential for ensuring the Lambda function is packaged correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docker Image Build&lt;/strong&gt;: The Docker image for the Lambda function is built using the &lt;code&gt;docker buildx build&lt;/code&gt; command. This creates a platform-specific image compatible with AWS Lambda.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ECR Authentication&lt;/strong&gt;: The script logs into AWS ECR using &lt;code&gt;aws ecr get-login-password&lt;/code&gt;. This is required to push the Docker image to the ECR repository.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tagging Docker Image&lt;/strong&gt;: The image is tagged with the repository URI to ensure it is associated with the correct repository in ECR.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pushing Docker Image to ECR&lt;/strong&gt;: The tagged Docker image is pushed to the ECR repository.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Creating IAM Role (Lambda Execution Role)&lt;/strong&gt;: The script checks if an IAM role for Lambda exists. If it does not, it creates a new role with the necessary permissions for Lambda to execute and access AWS resources like Bedrock.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lambda Function Creation/Update&lt;/strong&gt;: The script checks if the Lambda function already exists. If not, it creates a new Lambda function using the Docker image stored in ECR. If the function already exists, it updates the Lambda function with the new Docker image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Completion&lt;/strong&gt;: Once the deployment is finished, the script prints Deployment complete, indicating that the process was successful.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Testing the Deployment
&lt;/h3&gt;

&lt;p&gt;To test the deployed Lambda function, you can invoke it with the following command and adding your AWS Region:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda invoke &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--function-name&lt;/span&gt; langgraph-lambda-function &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--payload&lt;/span&gt; &lt;span class="s1"&gt;'{"query": "What are the benefits of using AWS Cloud Services?"}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--region&lt;/span&gt; &amp;lt;your_region&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cli-binary-format&lt;/span&gt; raw-in-base64-out &lt;span class="se"&gt;\&lt;/span&gt;
    response.json &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;cat &lt;/span&gt;response.json | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response will contain the Lambda's output, including the generated report in the following structured format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ezrss81ie21wi16zh6w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ezrss81ie21wi16zh6w.png" alt="LangGraph aws lambda invoke" width="800" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  CircleCI Configuration
&lt;/h3&gt;

&lt;p&gt;To automate the deployment pipeline using CircleCI, you need to define a CircleCI config file (&lt;code&gt;config.yml&lt;/code&gt;). This file will automate all the tasks like building the Docker image, running tests, and deploying the image to AWS.&lt;/p&gt;

&lt;p&gt;Before CircleCI can deploy your serverless application to AWS, you need to configure your &lt;a href="https://circleci.com/docs/set-environment-variable/" rel="noopener noreferrer"&gt;environment variables&lt;/a&gt; in the CircleCI project &lt;a href="https://circleci.com/docs/create-project/" rel="noopener noreferrer"&gt;settings&lt;/a&gt;. In your CircleCI account, set up a project and link it to your GitHub repository. Then, under project settings, add the environment variables that you have defined in your &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1n3gz7ja49rfnrjqu26o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1n3gz7ja49rfnrjqu26o.png" alt="environment variables circleci" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the breakdown of the &lt;code&gt;config.yml&lt;/code&gt; file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orbs&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;aws-cli&lt;/code&gt;: The AWS CLI orb simplifies the setup of AWS CLI to interact with AWS services.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docker&lt;/code&gt;: The CircleCI Docker orb handles setting up the Docker environment.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Jobs&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;build-deploy&lt;/code&gt;: This job defines all the steps necessary to deploy your application, from installing all required dependencies, including &lt;code&gt;uv&lt;/code&gt;, AWS CLI, setting up the environment variables, running tests, running docker, till building and deploying the lambda package with all the necessary permissions.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Workflows&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;The deploy workflow ensures that the build-deploy job is executed in sequence whenever changes are pushed to the repository.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2.1&lt;/span&gt;

&lt;span class="na"&gt;orbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;aws-cli&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/aws-cli@5.2.0&lt;/span&gt;
  &lt;span class="na"&gt;docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/docker@2.8.2&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cimg/python:3.12&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;checkout&lt;/span&gt;  

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install UV&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;curl -LsSf https://astral.sh/uv/install.sh | sh&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create venv and install dependencies&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv sync --all-extras&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run ruff&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv run ruff check --fix --unsafe-fixes .&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run tests&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;uv run pytest&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create .env file&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;echo "SERPER_API_KEY=${SERPER_API_KEY}" &amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "OPENAI_API_KEY=${OPENAI_API_KEY}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_REGION=${AWS_REGION}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "AWS_ACCOUNT_ID=${AWS_ACCOUNT_ID}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "REPOSITORY_NAME=${REPOSITORY_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "IMAGE_NAME=${IMAGE_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "LAMBDA_FUNCTION_NAME=${LAMBDA_FUNCTION_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "ROLE_NAME=${ROLE_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;
            &lt;span class="s"&gt;echo "ROLE_POLICY_NAME=${ROLE_POLICY_NAME}" &amp;gt;&amp;gt; .env&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;setup_remote_docker&lt;/span&gt;


      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;aws-cli/setup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;profile_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy to AWS&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;chmod +x build_deploy.sh&lt;/span&gt;
            &lt;span class="s"&gt;./build_deploy.sh&lt;/span&gt;

&lt;span class="na"&gt;workflows&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build-deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the config file and the environment variables are set up, you can commit and push your changes to GitHub. CircleCI will automatically trigger the deployment pipeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Folb4a1082ahplj3eqr8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Folb4a1082ahplj3eqr8a.png" alt="CI/CD build deploy" width="800" height="612"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can invoke the deployed Lamba function again, adding for example the additional parameters, like &lt;code&gt;confidence-threshold&lt;/code&gt;, &lt;code&gt;max-retries&lt;/code&gt;, and &lt;code&gt;add-max-results&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda invoke &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--function-name&lt;/span&gt; langgraph-lambda-function &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--payload&lt;/span&gt; &lt;span class="s1"&gt;'{"query": "What are the benefits of using CircleCI?", "confidence-threshold": 0.85, "max-retries": 3, "add-max-results": 2}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--region&lt;/span&gt; &amp;lt;your_aws_region&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cli-binary-format&lt;/span&gt; raw-in-base64-out &lt;span class="se"&gt;\&lt;/span&gt;
    response.json &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;cat &lt;/span&gt;response.json | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0gaa0v81glu9f755ov6s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0gaa0v81glu9f755ov6s.png" alt="LangGraph aws lambda invoke retry" width="800" height="273"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleaning up
&lt;/h2&gt;

&lt;p&gt;If you do not need the app anymore, make sure you delete the resources to avoid unnecessary charges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Automating AWS Lambda deployments using CircleCI streamlines the entire development pipeline, ensuring efficient, reliable, and repeatable deployments. By leveraging CircleCI’s &lt;a href="(https://circleci.com/ci-cd/)"&gt;Continuous Integration and Continuous Deployment (CI/CD)&lt;/a&gt; capabilities, you can integrate automated testing, dependency management, and security best practices into your workflows. This minimizes human errors and allows you to focus on writing and improving code rather than manually managing deployments.&lt;/p&gt;

&lt;p&gt;Furthermore, incorporating AWS services like Elastic Container Registry (ECR) and Identity and Access Management (IAM) ensures secure and scalable deployments. The use of environment variables and automated role assignments enhances security while maintaining flexibility. By structuring the deployment pipeline in a modular and reusable way, this approach not only simplifies the deployment of Lambda functions but also ensures adaptability for future enhancements.&lt;/p&gt;

&lt;p&gt;Furthermore, integrating LangGraph improves workflow orchestration by enabling structured, multi-agent interactions in AI-driven applications. This allows for better coordination between AI agents, improving decision-making and automation within the system, ultimately leading to more dynamic and intelligent application behavior.&lt;/p&gt;

&lt;p&gt;With this automated pipeline in place, you can deploy updates confidently, knowing that every change undergoes thorough testing before reaching production. This methodology aligns with modern DevOps practices, fostering team collaboration, reducing downtime, and accelerating software delivery.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Safely Delete All AWS S3 Buckets Using a Bash Script</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Mon, 31 Mar 2025 12:22:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/how-to-safely-delete-all-aws-s3-buckets-using-a-bash-script-1og2</link>
      <guid>https://dev.to/aws-builders/how-to-safely-delete-all-aws-s3-buckets-using-a-bash-script-1og2</guid>
      <description>&lt;p&gt;Have you ever faced the situation where you realized you have too many S3 buckets in your account and you want to delete all? But AWS will not allow you to delete all at once in the console, especially if there is still content in them.&lt;/p&gt;

&lt;p&gt;The following Bash script automates and speeds up this process while ensuring that both object versions and delete markers are properly removed before bucket deletion. In this blog post, I will break down how this script works and the precautions it takes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Purpose of the Script
&lt;/h2&gt;

&lt;p&gt;This script performs the following steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;1. Lists all S3 buckets in your AWS account.&lt;/li&gt;
&lt;li&gt;2. Asks for user confirmation before proceeding with deletion.&lt;/li&gt;
&lt;li&gt;3. Iterates over each bucket and deletes all object versions and delete markers.&lt;/li&gt;
&lt;li&gt;Deletes the empty bucket from S3.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq6l2e9rgw4udoy7pjar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq6l2e9rgw4udoy7pjar.png" alt="Image description" width="800" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's go through the script step by step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Script Breakdown
&lt;/h2&gt;

&lt;p&gt;The script starts with the shebang (&lt;code&gt;#!/bin/bash&lt;/code&gt;), which ensures it runs in a Bash shell.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 1: Listing All S3 Buckets
&lt;/h3&gt;

&lt;p&gt;This command lists all S3 buckets using &lt;code&gt;aws s3 ls&lt;/code&gt; and extracts only the bucket names using &lt;code&gt;awk '{print $3}'&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;buckets&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3 &lt;span class="nb"&gt;ls&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $3}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$buckets&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"No S3 buckets found."&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If no buckets exist, the script exits gracefully.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: User Confirmation
&lt;/h3&gt;

&lt;p&gt;The script prompts the user for confirmation to prevent accidental deletions. If the user does not type &lt;code&gt;yes&lt;/code&gt;, the operation is aborted.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"WARNING: This will permanently delete all S3 buckets and their contents."&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Do you want to proceed? (yes/no)"&lt;/span&gt;
&lt;span class="nb"&gt;read &lt;/span&gt;confirmation

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$confirmation&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"yes"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Operation canceled."&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Deleting Object Versions
&lt;/h3&gt;

&lt;p&gt;Each bucket is processed in a loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;bucket &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nv"&gt;$buckets&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deleting all object versions from bucket: &lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Object versions are retrieved using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;versions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3api list-object-versions &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'Versions[*].[Key, VersionId]'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text 2&amp;gt;/dev/null&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If object versions exist, they are deleted in a loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$versions&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; key versionId&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deleting: &lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt; (version: &lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
        aws s3api delete-object &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--key&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--version-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;fi
  done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Deleting Delete Markers
&lt;/h3&gt;

&lt;p&gt;Similarly, delete markers (for versioned buckets) are retrieved and removed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;markers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3api list-object-versions &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'DeleteMarkers[*].[Key, VersionId]'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text 2&amp;gt;/dev/null&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$markers&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; key versionId&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deleting marker: &lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt; (version: &lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
        aws s3api delete-object &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--key&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--version-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$versionId&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;fi
  done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Deleting the Bucket
&lt;/h3&gt;

&lt;p&gt;After clearing all contents, the bucket itself is removed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deleting bucket: &lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
aws s3 rb s3://&lt;span class="nv"&gt;$bucket&lt;/span&gt; &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command forcefully removes the bucket from S3.&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Step: Script Completion
&lt;/h3&gt;

&lt;p&gt;Once all buckets are deleted, a success message is displayed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;done
&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"All S3 buckets deleted successfully."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This script provides a safe and efficient way to delete all S3 buckets while ensuring that all object versions and delete markers are removed. The user confirmation step helps prevent accidental deletions, making it a useful tool for cloud administrators. Additionally, the script can be further customized if you just want to delete specific buckets.&lt;/p&gt;

&lt;p&gt;If you plan to use this script, ensure you have the appropriate AWS permissions and double-check before running it to avoid unintended data loss!&lt;/p&gt;

&lt;p&gt;The script repository can be found &lt;a href="https://github.com/benitomartin/delete-all-aws-s3-buckets" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>s3</category>
      <category>tutorial</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Build and Deploy a Serverless API with AWS SAM, Lambda, Gateway and Docker</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Wed, 19 Mar 2025 21:17:00 +0000</pubDate>
      <link>https://dev.to/aws-builders/build-and-deploy-a-serverless-api-with-aws-sam-lambda-gateway-docker-4gb</link>
      <guid>https://dev.to/aws-builders/build-and-deploy-a-serverless-api-with-aws-sam-lambda-gateway-docker-4gb</guid>
      <description>&lt;p&gt;In this tutorial, you will create a &lt;strong&gt;serverless API&lt;/strong&gt; that generates a random number between two values using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ AWS Lambda &lt;/li&gt;
&lt;li&gt;✅ AWS SAM (Serverless Application Model)&lt;/li&gt;
&lt;li&gt;✅ API Gateway&lt;/li&gt;
&lt;li&gt;✅ Docker (for local testing)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This tutorial is beginner friendly and only requires basic knowledge of AWS Services.&lt;/p&gt;

&lt;p&gt;You can find the whole code repository &lt;a href="https://github.com/benitomartin/sam-lambda-aws-api" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🧱 Project Overview
&lt;/h2&gt;

&lt;p&gt;The app will expose a single &lt;strong&gt;POST endpoint&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/generate-random&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You send a JSON like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"min"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"max"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the API will return a random number between &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"random_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📦 Prerequisites
&lt;/h2&gt;

&lt;p&gt;To get started, make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python installed&lt;/li&gt;
&lt;li&gt;AWS CLI (configured with credentials)&lt;/li&gt;
&lt;li&gt;AWS SAM CLI&lt;/li&gt;
&lt;li&gt;Docker running locally&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📁 Project Structure
&lt;/h2&gt;

&lt;p&gt;Here is the directory structure of the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── lambda/
│   ├── lambda_function.py
├── .gitignore
├── .python-version
├── LICENSE
├── README.md
├── pyproject.toml
├── template.yaml
└── uv.lock
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🧑‍💻 Setup and Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Create the Repository
&lt;/h3&gt;

&lt;p&gt;First, create a new repository (&lt;code&gt;sam-lambda-aws-api&lt;/code&gt;) in your GitHub account and clone it locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/yourusername/sam-lambda-aws-api.git
&lt;span class="nb"&gt;cd &lt;/span&gt;sam-lambda-aws-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Write the Lambda Function
&lt;/h3&gt;

&lt;p&gt;Create a new folder named &lt;code&gt;lambda&lt;/code&gt; in the root of your project. Inside this folder, create the &lt;code&gt;lambda_function.py&lt;/code&gt; file. This file will contain the Python code for the Lambda function that handles generating the random number:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;min_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;min&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;max_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;random_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;random_number&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random_number&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since you are using Python’s built-in &lt;code&gt;json&lt;/code&gt; and &lt;code&gt;random&lt;/code&gt; modules, there are no additional dependencies for this simple example. For more complex project, you can add external dependencies to a &lt;code&gt;requirements.txt&lt;/code&gt; file and save this file in the &lt;code&gt;lambda&lt;/code&gt; folder to package the AWS Lambda function with the required dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Define Your SAM Template
&lt;/h3&gt;

&lt;p&gt;Create the &lt;code&gt;template.yaml&lt;/code&gt; file. This is the SAM template that defines the Lambda function, API Gateway, and other AWS resources needed for the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2010-09-09'&lt;/span&gt;
&lt;span class="na"&gt;Transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless-2016-10-31&lt;/span&gt;
&lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="s"&gt;Random Number Generator API&lt;/span&gt;

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;RandomNumberGeneratorFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Function&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda_function.lambda_handler&lt;/span&gt;
      &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python3.11&lt;/span&gt;
      &lt;span class="na"&gt;CodeUri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda/&lt;/span&gt;
      &lt;span class="na"&gt;Events&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Api&lt;/span&gt;
          &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/generate-random&lt;/span&gt;
            &lt;span class="na"&gt;Method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;post&lt;/span&gt;
            &lt;span class="na"&gt;RestApiId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;ApiGateway&lt;/span&gt;

  &lt;span class="na"&gt;ApiGateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Api&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;StageName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;
      &lt;span class="na"&gt;EndpointConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;REGIONAL&lt;/span&gt;
      &lt;span class="na"&gt;Cors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;AllowMethods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt; &lt;span class="c1"&gt;# Used methods: "'POST,OPTIONS'"&lt;/span&gt;
        &lt;span class="na"&gt;AllowHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'Content-Type,Authorization'"&lt;/span&gt;
        &lt;span class="na"&gt;AllowOrigin&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt;

&lt;span class="na"&gt;Outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ApiEndpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;https://${ApiGateway}.execute-api.${AWS::Region}.amazonaws.com/dev/generate-random&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;API Gateway endpoint URL for the Random Number Generator&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is what the components do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;RandomNumberGeneratorFunction&lt;/code&gt;: This resource represents your Lambda function. It uses the &lt;code&gt;lambda_function.lambda_handler&lt;/code&gt; as the entry point.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;ApiGateway&lt;/code&gt;: The API Gateway is defined to expose the Lambda function as a REST API at the &lt;code&gt;/generate-random&lt;/code&gt; path using the POST method.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cors&lt;/code&gt;: The CORS configuration allows requests from any origin ('*'), with any method, and with the specified headers. This is useful for development, but in production, you should restrict these to only necessary values.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Outputs&lt;/code&gt;: This section outputs the URL of the deployed API, which you will use to make requests.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Build and Validate the SAM Application
&lt;/h3&gt;

&lt;p&gt;Now, you need to build the application and validate the template.&lt;/p&gt;

&lt;p&gt;Run the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam build
sam validate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a &lt;code&gt;.aws-sam&lt;/code&gt; directory with the built artifacts, ensuring your project is correctly packaged.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧪 Local Testing with Docker
&lt;/h2&gt;

&lt;p&gt;You can test the Lambda function locally using Docker (make sure it is running). SAM CLI can emulate AWS Lambda functions on your local machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Invoke Lambda Locally
&lt;/h3&gt;

&lt;p&gt;The following command will create a local AWS Lambda Docker image, invoke the function and return the response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam &lt;span class="nb"&gt;local &lt;/span&gt;invoke RandomNumberGeneratorFunction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"random_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Start the API Locally
&lt;/h3&gt;

&lt;p&gt;Alternatively, you can start the API locally and send requests to it. Here also Docker must be running and again a local AWS Lambda image will be created.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam &lt;span class="nb"&gt;local &lt;/span&gt;start-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will spin up a local API Gateway on your machine. The API will be available at &lt;code&gt;http://127.0.0.1:3000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now, you can test the API by sending a POST request using &lt;code&gt;curl&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3000/generate-random &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"min": 1, "max": 100}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"random_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing Error Handling
&lt;/h3&gt;

&lt;p&gt;Let's also test how our API handles invalid inputs:&lt;/p&gt;

&lt;p&gt;Test with min &amp;gt; max&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3000/generate-random &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"min": 100, "max": 1}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"empty range for randrange() (100, 2, -98)"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test with non-integer values&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3000/generate-random &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"min": "abc", "max": 100}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invalid literal for int() with base 10: 'abc'"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 Deploying to AWS
&lt;/h2&gt;

&lt;p&gt;Once you have tested the app locally, it is time to deploy it to AWS.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy Using SAM
&lt;/h3&gt;

&lt;p&gt;To deploy the app, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SAM CLI will prompt you to configure the deployment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vfb68lsw24nox9284of.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vfb68lsw24nox9284of.jpg" alt="Deployment via sam deploy --guided" width="800" height="297"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;SAM will package, deploy, and set up the necessary resources. After the deployment is complete, you will receive an API Gateway URL, which also contains the StageName &lt;code&gt;dev&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgp6miad8x5yhgpxjzmp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgp6miad8x5yhgpxjzmp.png" alt="Deployed endpoint test" width="800" height="115"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Test the Deployed API
&lt;/h3&gt;

&lt;p&gt;Once deployed, you can test the API by sending a request to the deployed endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://hi8mgf8ssl.execute-api.eu-central-1.amazonaws.com/dev/generate-random &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"min": 1, "max": 100}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"random_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;84&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🧹 Clean Up
&lt;/h2&gt;

&lt;p&gt;To avoid incurring unnecessary costs, remember to delete the resources after you are done:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam delete
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will remove the stack and all associated resources from AWS.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 Conclusion
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you have successfully built a serverless random number generator API using AWS Lambda, AWS SAM, and API Gateway. This serverless architecture offers several key advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost-effectiveness&lt;/strong&gt;: You only pay for the compute time you consume, with no charges when your code is not running.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: The application automatically scales based on demand.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reduced operational overhead&lt;/strong&gt;: No servers to manage means less time spent on infrastructure maintenance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fast deployment&lt;/strong&gt;: The AWS SAM CLI makes it easy to develop, test, and deploy serverless applications.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While this example is simple, it demonstrates the fundamental concepts of serverless architecture that can be applied to more complex applications. The same patterns you used here, defining resources in SAM templates, implementing Lambda functions, and exposing them through API Gateway, can be expanded to build sophisticated serverless applications.&lt;/p&gt;

&lt;p&gt;Happy building your serverless applications! 🚀&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>docker</category>
      <category>apigateway</category>
    </item>
    <item>
      <title>Deep Dive into uv Dockerfiles by Astral: Image Size, Performance &amp; Best Practices</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Tue, 18 Mar 2025 12:25:22 +0000</pubDate>
      <link>https://dev.to/benitomartin/deep-dive-into-uv-dockerfiles-by-astral-image-size-performance-best-practices-202c</link>
      <guid>https://dev.to/benitomartin/deep-dive-into-uv-dockerfiles-by-astral-image-size-performance-best-practices-202c</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjz9a2sha6f0t70c5ok9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjz9a2sha6f0t70c5ok9.png" alt="Docker + uv" width="800" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dockerizing Python applications is still a complex task, despite all the tools we have at our disposal. Dependency management, environment consistency, image size, and build speed can all become pain points, especially as projects grow.&lt;/p&gt;

&lt;p&gt;Enter &lt;code&gt;uv&lt;/code&gt; by Astral: a modern tool that aims to simplify and speed up Python dependency resolution. It introduces a fresh approach to Docker workflows, but it comes with configuration choices that may be unclear if you’re not familiar with how they affect image structure and behavior.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;uv&lt;/code&gt; GitHub &lt;a href="https://github.com/astral-sh/uv-docker-example" rel="noopener noreferrer"&gt;repository &lt;/a&gt; provides three example Dockerfiles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single-Stage&lt;/strong&gt;: Single-stage build that keeps uv in the final image&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standalone&lt;/strong&gt;: Multi-stage build using uv-managed Python, removes &lt;code&gt;uv&lt;/code&gt; in the final stage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Stage&lt;/strong&gt;: Multi-stage build using system Python, removes &lt;code&gt;uv&lt;/code&gt; in the final stage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each approach has trade-offs, some are faster, some leaner, and some easier to understand. In this post, we’ll compare all three, break down their Dockerfile internals, analyze image size and history, and help you decide which is best for production-grade images.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are These Dockerfiles?
&lt;/h2&gt;

&lt;p&gt;Before we dive into pros, cons, and differences, let’s quickly describe each Dockerfile.&lt;/p&gt;

&lt;h3&gt;
  
  
  Single-Stage Dockerfile
&lt;/h3&gt;

&lt;p&gt;An example of using a single stage with uv pre-installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Use a Python image with uv pre-installed&lt;/span&gt;
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

&lt;span class="c"&gt;# Install the project into `/app`&lt;/span&gt;
WORKDIR /app

&lt;span class="c"&gt;# Enable bytecode compilation&lt;/span&gt;
ENV &lt;span class="nv"&gt;UV_COMPILE_BYTECODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1

&lt;span class="c"&gt;# Copy from the cache instead of linking since it's a mounted volume&lt;/span&gt;
ENV &lt;span class="nv"&gt;UV_LINK_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;copy

&lt;span class="c"&gt;# Install the project's dependencies using the lockfile and settings&lt;/span&gt;
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;uv.lock,target&lt;span class="o"&gt;=&lt;/span&gt;uv.lock &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml,target&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-install-project&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;

&lt;span class="c"&gt;# Then, add the rest of the project source code and install it&lt;/span&gt;
ADD &lt;span class="nb"&gt;.&lt;/span&gt; /app
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;

&lt;span class="c"&gt;# Place executables in the environment at the front of the path&lt;/span&gt;
ENV &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/app/.venv/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Reset the entrypoint, don't invoke `uv`&lt;/span&gt;
ENTRYPOINT &lt;span class="o"&gt;[]&lt;/span&gt;

&lt;span class="c"&gt;# Run the FastAPI application by default&lt;/span&gt;
CMD &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"fastapi"&lt;/span&gt;, &lt;span class="s2"&gt;"dev"&lt;/span&gt;, &lt;span class="s2"&gt;"--host"&lt;/span&gt;, &lt;span class="s2"&gt;"0.0.0.0"&lt;/span&gt;, &lt;span class="s2"&gt;"src/uv_docker_example"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Code Explained&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim&lt;/strong&gt;: This line specifies the base image. It pulls a pre-configured image that includes Python 3.12 and the uv tool. The &lt;code&gt;bookworm-slim&lt;/code&gt; tag indicates it is a minimal Debian-based image (Bookworm) optimized for size (slim variant).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;WORKDIR /app&lt;/strong&gt;: Sets the working directory inside the container to /app. All subsequent commands will be executed relative to this directory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_COMPILE_BYTECODE=1&lt;/strong&gt;: Sets an environment variable to enable Python bytecode compilation (creates &lt;code&gt;.pyc&lt;/code&gt; files). This can speed up application startup times by avoiding re-compiling Python files each time the container starts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_LINK_MODE=copy&lt;/strong&gt;: Configures the &lt;code&gt;uv&lt;/code&gt; tool to copy files into the container instead of creating symbolic links (symlinks). This is useful when using mounted volumes, preventing potential issues in production due to symlinks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN — mount=type=cache…&lt;/strong&gt;: This command installs the dependencies for the project using the &lt;code&gt;uv sync&lt;/code&gt; command. The &lt;code&gt;--mount=type=cache&lt;/code&gt; optimizes caching by storing cached files in a location outside the container, and &lt;code&gt;--mount=type=bind&lt;/code&gt; allows the mounting of the &lt;code&gt;uv.lock&lt;/code&gt; and &lt;code&gt;pyproject.toml&lt;/code&gt; files from the host, ensuring they are synchronized.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ADD . /app&lt;/strong&gt;: This copies the project’s source code into the container at /app.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN uv sync — frozen — no-dev&lt;/strong&gt;: This second &lt;code&gt;RUN&lt;/code&gt; command installs the local project.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV PATH=”/app/.venv/bin:$PATH”&lt;/strong&gt;: This adds the &lt;code&gt;.venv&lt;/code&gt; directory, which contains the virtual environment’s executables, to the PATH environment variable. This ensures the container will use the right Python interpreter and dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENTRYPOINT []&lt;/strong&gt;: Resets the entrypoint to an empty array. This means no default executable is set, giving flexibility to specify the command in the &lt;code&gt;CMD&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CMD&lt;/strong&gt; starts the FastAPI development server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Standalone Dockerfile
&lt;/h3&gt;

&lt;p&gt;An example of using standalone Python builds with multi-stage images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First, build the application in the `/app` directory&lt;/span&gt;
FROM ghcr.io/astral-sh/uv:bookworm-slim AS builder
ENV &lt;span class="nv"&gt;UV_COMPILE_BYTECODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;UV_LINK_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;copy

&lt;span class="c"&gt;# Configure the Python directory so it is consistent&lt;/span&gt;
ENV UV_PYTHON_INSTALL_DIR /python

&lt;span class="c"&gt;# Only use the managed Python version&lt;/span&gt;
ENV &lt;span class="nv"&gt;UV_PYTHON_PREFERENCE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;only-managed

&lt;span class="c"&gt;# Install Python before the project for caching&lt;/span&gt;
RUN uv python &lt;span class="nb"&gt;install &lt;/span&gt;3.12

WORKDIR /app
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;uv.lock,target&lt;span class="o"&gt;=&lt;/span&gt;uv.lock &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml,target&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-install-project&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;

ADD &lt;span class="nb"&gt;.&lt;/span&gt; /app
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;

&lt;span class="c"&gt;# Then, use a final image without uv&lt;/span&gt;
FROM debian:bookworm-slim

&lt;span class="c"&gt;# Copy the Python version&lt;/span&gt;
COPY &lt;span class="nt"&gt;--from&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;builder &lt;span class="nt"&gt;--chown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;python:python /python /python

&lt;span class="c"&gt;# Copy the application from the builder&lt;/span&gt;
COPY &lt;span class="nt"&gt;--from&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;builder &lt;span class="nt"&gt;--chown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;app:app /app /app

&lt;span class="c"&gt;# Place executables in the environment at the front of the path&lt;/span&gt;
ENV &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/app/.venv/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Run the FastAPI application by default&lt;/span&gt;
CMD &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"fastapi"&lt;/span&gt;, &lt;span class="s2"&gt;"dev"&lt;/span&gt;, &lt;span class="s2"&gt;"--host"&lt;/span&gt;, &lt;span class="s2"&gt;"0.0.0.0"&lt;/span&gt;, &lt;span class="s2"&gt;"/app/src/uv_docker_example"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Code Explained&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FROM ghcr.io/astral-sh/uv:bookworm-slim AS builder&lt;/strong&gt;: This line specifies the first (builder) stage of the Dockerfile. It uses the same &lt;code&gt;uv&lt;/code&gt; image as the base, but marks this stage as &lt;code&gt;builder&lt;/code&gt;. This approach helps modularize the build process by separating build dependencies from runtime dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_COMPILE_BYTECODE=1&lt;/strong&gt;: These environment variables enable bytecode compilation and configure the &lt;code&gt;uv&lt;/code&gt; tool to copy files, just as in the previous Dockerfile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_LINK_MODE=copy&lt;/strong&gt;: Configures the uv tool to copy files into the container instead of creating symbolic links (symlinks). This is useful when using mounted volumes, preventing potential issues in production due to symlinks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_PYTHON_INSTALL_DIR /python&lt;/strong&gt;: Specifies where Python should be installed within the container. This ensures Python is placed in a consistent directory during the build process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_PYTHON_PREFERENCE=only-managed&lt;/strong&gt;: Instructs &lt;code&gt;uv&lt;/code&gt; to use only the managed Python version, ensuring consistency between builds and avoiding issues with mismatched Python versions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN uv python install 3.12&lt;/strong&gt;: Installs Python 3.12 inside the builder image. This step is necessary because we need Python available for the application in the next stages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;WORKDIR /app&lt;/strong&gt;: Similar to the previous Dockerfile, sets the working directory to &lt;code&gt;/app&lt;/code&gt; for the subsequent commands.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN&lt;/strong&gt; — mount=type=cache…: Installs dependencies using &lt;code&gt;uv sync&lt;/code&gt; in a cached manner, like the previous Dockerfile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ADD . /app&lt;/strong&gt;: Adds source code to the container.&lt;br&gt;
RUN uv sync — frozen — no-dev: This second RUN command installs the project, like in the previous image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FROM debian:bookworm-slim&lt;/strong&gt;: Switches to a much lighter image (&lt;code&gt;debian:bookworm-slim&lt;/code&gt;), which is used as the base for the final runtime image. This reduces the size of the final container since it excludes build tools like uv, reducing the final image size.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;COPY — from=builder /python /python&lt;/strong&gt;: Copies the Python installation from the builder stage into the final image, ensuring that the Python installation is preserved but without the build tools&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;COPY — from=builder /app /app&lt;/strong&gt;: Copies the application code from the builder stage to the runtime stage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV PATH=”/app/.venv/bin:$PATH”&lt;/strong&gt;: Adds the &lt;code&gt;.venv&lt;/code&gt; directory to the PATH environment variable, ensuring the container uses the correct Python and dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CMD&lt;/strong&gt;: Starts the FastAPI development server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Multi-Stage Dockerfile
&lt;/h3&gt;

&lt;p&gt;An example using multi-stage image builds to create a final image without &lt;code&gt;uv&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First, build the application in the `/app` directory.&lt;/span&gt;
&lt;span class="c"&gt;# See `Dockerfile` for details.&lt;/span&gt;
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder
ENV &lt;span class="nv"&gt;UV_COMPILE_BYTECODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;UV_LINK_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;copy

&lt;span class="c"&gt;# Disable Python downloads, because we want to use the system interpreter&lt;/span&gt;
&lt;span class="c"&gt;# across both images. If using a managed Python version, it needs to be&lt;/span&gt;
&lt;span class="c"&gt;# copied from the build image into the final image; see `standalone.Dockerfile`&lt;/span&gt;
&lt;span class="c"&gt;# for an example.&lt;/span&gt;
ENV &lt;span class="nv"&gt;UV_PYTHON_DOWNLOADS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0

WORKDIR /app
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;uv.lock,target&lt;span class="o"&gt;=&lt;/span&gt;uv.lock &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;bind&lt;/span&gt;,source&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml,target&lt;span class="o"&gt;=&lt;/span&gt;pyproject.toml &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-install-project&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;
ADD &lt;span class="nb"&gt;.&lt;/span&gt; /app
RUN &lt;span class="nt"&gt;--mount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cache,target&lt;span class="o"&gt;=&lt;/span&gt;/root/.cache/uv &lt;span class="se"&gt;\&lt;/span&gt;
    uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt; &lt;span class="nt"&gt;--no-dev&lt;/span&gt;


&lt;span class="c"&gt;# Then, use a final image without uv&lt;/span&gt;
FROM python:3.12-slim-bookworm
&lt;span class="c"&gt;# It is important to use the image that matches the builder, as the path to the&lt;/span&gt;
&lt;span class="c"&gt;# Python executable must be the same, e.g., using `python:3.11-slim-bookworm`&lt;/span&gt;
&lt;span class="c"&gt;# will fail.&lt;/span&gt;

&lt;span class="c"&gt;# Copy the application from the builder&lt;/span&gt;
COPY &lt;span class="nt"&gt;--from&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;builder &lt;span class="nt"&gt;--chown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;app:app /app /app

&lt;span class="c"&gt;# Place executables in the environment at the front of the path&lt;/span&gt;
ENV &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/app/.venv/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Run the FastAPI application by default&lt;/span&gt;
CMD &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"fastapi"&lt;/span&gt;, &lt;span class="s2"&gt;"dev"&lt;/span&gt;, &lt;span class="s2"&gt;"--host"&lt;/span&gt;, &lt;span class="s2"&gt;"0.0.0.0"&lt;/span&gt;, &lt;span class="s2"&gt;"/app/src/uv_docker_example"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Code Explained&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder&lt;/strong&gt;: Begins the first stage (builder) using the uv image with Python 3.12 pre-installed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_COMPILE_BYTECODE=1&lt;/strong&gt;: These environment variables enable bytecode compilation and configure the uv tool to copy files, just as in the previous Dockerfile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV UV_PYTHON_DOWNLOADS=0&lt;/strong&gt;: Disables downloading Python from external sources, forcing the use of the pre-installed system Python.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;WORKDIR /app&lt;/strong&gt;: Sets the working directory inside the container to /app.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN — mount=type=cache…&lt;/strong&gt;: Installs dependencies using uv sync in a cached manner, like the previous Dockerfile.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ADD . /app&lt;/strong&gt;: Adds source code to the container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RUN uv sync — frozen — no-dev&lt;/strong&gt;: This second RUN command installs the project, like in the previous image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FROM python:3.12-slim-bookworm&lt;/strong&gt;: Switches to a slim Python 3.12 image for the final runtime image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;COPY — from=builder /app /app&lt;/strong&gt;: Copies the application from the builder stage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ENV PATH=”/app/.venv/bin:$PATH”&lt;/strong&gt;: Adds the .venv directory to the PATH environment variable, ensuring the container uses the correct Python and dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CMD&lt;/strong&gt;: Starts the FastAPI development server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What are the Main Differences Between the Images?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Single-Stage Dockerfile:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: All operations (dependency installation, code addition, environment setup) happen within one stage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: Typically, the largest, since both build and runtime dependencies (e.g., compilers, Python environment, etc.) remain in the final image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: It might take longer since all steps, including installation and code addition, happen within a single stage, requiring a complete rebuild when changes are made to any part of the build process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Modularity&lt;/strong&gt;: Less modular. Every change to the application or de****pendencies requires a rebuild of the entire image.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Standalone Dockerfile&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: This uses two stages: the builder stage (which uses the &lt;code&gt;uv&lt;/code&gt; image to install dependencies) and the final stage (which copies over the built application and Python environment from the builder).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: The final image contains only the runtime environment and the application code, excluding unnecessary build tools. However, it still needs Python to be installed from the builder image, which may result in a somewhat larger final image than the multistage approach.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: Slightly faster than the single-stage image because the standalone uses a dedicated builder stage for dependency installation, which allows better layer caching and separation of concerns.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Modularity&lt;/strong&gt;: More modular. Each stage is responsible for a specific task, allowing you to optimize each step and cache parts of the build.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-Stage Dockerfile&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: This uses two stages: one for building and one for the final application image. The builder stage installs dependencies and compiles the app, while the final stage copies only the necessary parts into a slim runtime image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: The smallest image size because it only includes the final necessary components: application code and runtime environment (no build tools, no redundant Python installation).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: Build times are comparable to the Standalone approach since both use multi-stage builds. However, rebuilds can be more efficient when only application code changes, as the Python environment in the final stage doesn’t need to be copied from the builder.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Modularity&lt;/strong&gt;: The final runtime image is as slim as possible, making it an excellent choice for production.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What are the Main Differences Between the Images?
&lt;/h2&gt;

&lt;p&gt;To validate the differences, a benchmark was run on image size and build time across three Docker strategies: Single-Stage, Standalone, and Multi-Stage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlpitb8lvrnri4gx9p9c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlpitb8lvrnri4gx9p9c.png" alt="Docker Benchmark Comparison" width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2arqczsf6bti0dy7p8f9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2arqczsf6bti0dy7p8f9.png" alt="Docker History" width="800" height="828"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In all three builds, the bulk of the image size (~2.89 GB) comes from the application itself and its installed dependencies, typically residing in &lt;code&gt;.venv/&lt;/code&gt;. This layer is identical across builds, which is why the overall image sizes are relatively close despite the structural differences.&lt;/p&gt;

&lt;p&gt;However, as the application grows (more dependencies, larger codebase, etc.), the differences between the image sizes (and also build time) will become more pronounced. The multi-stage approach will continue to offer the most significant reduction in size, while the single-stage build will likely see a larger overall size increase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single-Stage&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What’s happening?&lt;/strong&gt;&lt;br&gt;
Everything, runtime, build tools, dependencies, gets packed into a single image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: 🟥 &lt;strong&gt;Largest&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: 🟥 &lt;strong&gt;Slowest&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;br&gt;
All dependencies (compilers, wheels, Python runtime, and your app) remain in the final image. There’s no separation between build environment and runtime environment.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Standalone&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What’s happening?&lt;/strong&gt;&lt;br&gt;
Uses two stages: builds with uv in the first stage, then copies both Python runtime and application to a clean final image.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: 🟨 &lt;strong&gt;Medium&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: 🟨 &lt;strong&gt;Slightly faster&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;br&gt;
More lightweight than the single-stage version since it discards build tools. The final image contains just the application, and a full Python installation copied from the builder.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-Stage&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What’s happening?&lt;/strong&gt;&lt;br&gt;
Uses two stages: builds with uv in the first stage, then copies only the application to a slim image with system Python already installed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Image Size&lt;/strong&gt;: 🟩 &lt;strong&gt;Smallest&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build Time&lt;/strong&gt;: 🟩 &lt;strong&gt;Most efficient for production&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Why?&lt;br&gt;
The final image has only what’s necessary to run the app, no build tools, no dev dependencies, just your app and the minimal system Python environment.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd8ruoniveaz7q799ciqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd8ruoniveaz7q799ciqh.png" alt="Docker Type Comparison" width="800" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Which One Should You Use in Production?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Single-Stage&lt;/strong&gt;: Best for quick prototypes or local development of small applications. Not ideal for production due to larger image size and inclusion of unnecessary build tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standalone&lt;/strong&gt;: Excellent choice when you need precise control over Python versions across environments. Provides a good balance of clean runtime environment and consistent Python installation. Recommended for production when specific Python versions or configurations are critical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-Stage&lt;/strong&gt;: The optimal choice for production deployments where minimal image size and security are priorities. Creates the leanest possible container by using system Python in the final image and including only what’s necessary to run your application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6aqm4a2t239e7yqipyt8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6aqm4a2t239e7yqipyt8.png" alt="Docker Production Comparison" width="800" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dockerizing Python applications has always required trade-offs , but with tools like &lt;code&gt;uv&lt;/code&gt;, we’re entering a new era of faster builds, cleaner environments, and more efficient dependency management. What stands out across the three approaches isn’t just their technical differences, but the flexibility they offer depending on your team’s priorities.&lt;/p&gt;

&lt;p&gt;The real takeaway? There’s no one-size-fits-all Dockerfile. Instead, &lt;code&gt;uv&lt;/code&gt; gives you the tools to fine-tune your container strategy, from quick dev spins to production-grade minimalism. Whether you’re chasing performance, reproducibility, or portability, uv-based Dockerfiles let you build with confidence and clarity.&lt;/p&gt;

&lt;p&gt;The whole code repository to reproduce the benchmark can be found &lt;a href="https://github.com/benitomartin/uv-docker-benchmark" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>python</category>
      <category>containers</category>
      <category>devops</category>
    </item>
    <item>
      <title>Building a serverless GenAI API with FastAPI, AWS, and CircleCI</title>
      <dc:creator>Benito Martin</dc:creator>
      <pubDate>Wed, 12 Mar 2025 19:46:10 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-a-serverless-genai-api-with-fastapi-aws-and-circleci-5gl9</link>
      <guid>https://dev.to/aws-builders/building-a-serverless-genai-api-with-fastapi-aws-and-circleci-5gl9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7aqrv5oaafdo584ytxu7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7aqrv5oaafdo584ytxu7.png" alt="CircleCI" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The advancement of AI has empowered businesses to incorporate intelligent automation into their applications. A &lt;a href="https://circleci.com/blog/serverless-vs-containers/" rel="noopener noreferrer"&gt;serverless&lt;/a&gt; Generative AI (GenAI) API enables developers to harness cutting-edge AI models without the burden of infrastructure management. This guide walks you through building a scalable and cost-effective GenAI API using &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;, a high-performance Python framework with built-in async support and seamless &lt;a href="https://aws.amazon.com/?nc2=h_lg" rel="noopener noreferrer"&gt;AWS&lt;/a&gt; integration. By deploying FastAPI on &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt; with &lt;a href="https://aws.amazon.com/api-gateway/" rel="noopener noreferrer"&gt;AWS API Gateway&lt;/a&gt;, you can create a fully managed, pay-per-use architecture that eliminates server maintenance.&lt;/p&gt;

&lt;p&gt;To simplify development and deployment, you will set up a &lt;a href="https://circleci.com/continuous-integration/" rel="noopener noreferrer"&gt;Continuous Integration and Continuous Deployment (CI/CD) pipeline&lt;/a&gt; with CircleCI, automating testing, building, and deployment. With CircleCI’s GitHub integration, you’ll achieve continuous delivery, reducing errors and accelerating development cycles. This combination of FastAPI, AWS Lambda, and CircleCI ensures a robust, scalable, and efficient GenAI API ready for real-world applications.&lt;/p&gt;

&lt;p&gt;You can check out the complete &lt;a href="https://github.com/benitomartin/genai-aws-circleci" rel="noopener noreferrer"&gt;source code&lt;/a&gt; on GitHub, but this tutorial will guide you to build it from scratch.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before diving into the process of building a serverless GenAI API, there are several prerequisites you need to have in place. Here is a breakdown of what you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS Account&lt;/strong&gt;: You will need an active AWS account to deploy the serverless application using AWS services like Lambda and API Gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CLI&lt;/strong&gt;: To install and configure the AWS CLI, follow the instructions in the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html" rel="noopener noreferrer"&gt;AWS CLI documentation&lt;/a&gt;. Once installed, configure it with aws configure and provide your AWS access key, secret key, region, and output format.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Basic Understanding of RESTful APIs, &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;, and &lt;a href="https://en.wikipedia.org/wiki/Generative_artificial_intelligence" rel="noopener noreferrer"&gt;GenAI&lt;/a&gt; Models&lt;/strong&gt;: This project assumes a basic understanding of  &lt;a href="https://circleci.com/blog/what-is-api/#what-is-rest" rel="noopener noreferrer"&gt;RESTful APIs&lt;/a&gt;, FastAPI, and GenAI models. REST APIs enable communication between clients (like web or mobile apps) and servers, while FastAPI is a fast, modern Python framework for building APIs with automatic documentation generation. GenAI models, such as OpenAI’s GPT, generate human-like text and other outputs, and in this project, you will integrate OpenAI into the API to provide responses to user queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub and CircleCI Accounts&lt;/strong&gt;: You will need a GitHub account to host your project’s repository and a &lt;a href="https://circleci.com/" rel="noopener noreferrer"&gt;CircleCI account&lt;/a&gt; to automate testing and deployment through CI/CD.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI API Key&lt;/strong&gt;: To access OpenAI’s GPT models, you will need an API key. You can sign up for an API key on the &lt;a href="https://platform.openai.com/signup" rel="noopener noreferrer"&gt;OpenAI website&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up the FastAPI GenAI Server
&lt;/h2&gt;

&lt;p&gt;FastAPI is a modern, high-performance web framework for building APIs with Python. It is particularly well-suited for LLM-based APIs due to its speed, simplicity, and support for asynchronous operations, which enable handling multiple requests efficiently. For this project, you will integrate OpenAI’s &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/" rel="noopener noreferrer"&gt;GPT-4o-mini&lt;/a&gt; model via its &lt;a href="https://openai.com/index/openai-api/" rel="noopener noreferrer"&gt;API&lt;/a&gt; to generate AI-driven responses with minimal setup.&lt;br&gt;
Installing Dependencies and GenAI Libraries&lt;/p&gt;

&lt;p&gt;First, clone the repository containing the project code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/CIRCLECI-GWP/genai-aws-circleci.git
&lt;span class="nb"&gt;cd &lt;/span&gt;genai-aws-circleci
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, install the package manager &lt;a href="https://docs.astral.sh/uv/getting-started/installation/" rel="noopener noreferrer"&gt;uv from astral&lt;/a&gt;, a fast Python package manager, instead of &lt;code&gt;pip&lt;/code&gt;. Written in Rust, &lt;code&gt;uv&lt;/code&gt; is chosen for its speed, efficient dependency resolution, and built-in support for managing virtual environments. You can install it using the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-LsSf&lt;/span&gt; https://astral.sh/uv/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you have installed &lt;code&gt;uv&lt;/code&gt;, run the following commands to install dependencies and activate the virtual environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv &lt;span class="nb"&gt;sync
source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;uv sync&lt;/code&gt; command will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install the dependencies defined in &lt;code&gt;pyproject.toml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Automatically create a virtual environment (&lt;code&gt;.venv&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, creeate &lt;code&gt;.env&lt;/code&gt; file in the root directory of your respository and add your &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; to the file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-openai-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Define Endpoints
&lt;/h3&gt;

&lt;p&gt;With dependencies installed, you can now define the FastAPI endpoints to interact with the GPT-4o-mini model. This implementation, found in &lt;code&gt;main.py&lt;/code&gt;, does not yet include AWS integration, which you will cover in the next section.&lt;/p&gt;

&lt;p&gt;Code Breakdown:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PromptRequest (Pydantic Model)&lt;/strong&gt;: Defines the expected structure of incoming requests. It ensures that each request contains a &lt;code&gt;prompt&lt;/code&gt; string.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;get_openai_api_key()&lt;/strong&gt;: Retrieves the OpenAI API key from the environment variables file. If the key is missing, it raises an &lt;code&gt;HTTPException&lt;/code&gt; to prevent unauthorized API calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;get_openai_client()&lt;/strong&gt;: Uses &lt;code&gt;get_openai_api_key()&lt;/code&gt; to fetch the API key and initialize the OpenAI client. If initialization fails, an exception is raised.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Root Endpoint (&lt;code&gt;/&lt;/code&gt;)&lt;/strong&gt;: A simple health check that confirms the API is running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate Endpoint (&lt;code&gt;/generate&lt;/code&gt;)&lt;/strong&gt;: Accepts a POST request containing a prompt, passes it to OpenAI’s GPT-4o-mini, and returns the generated response. It depends on &lt;code&gt;get_openai_client()&lt;/code&gt; to ensure a valid API connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI API Call&lt;/strong&gt;: Uses &lt;code&gt;chat.completions.create()&lt;/code&gt; to send the user’s prompt to OpenAI and returns the generated response.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Depends&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="c1"&gt;# Load environment variables
&lt;/span&gt;&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize FastAPI
&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Pydantic model to define expected structure of request
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PromptRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Model for request validation.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
           &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY not found in environment variables&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_openai_client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;HTTPException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to initialize OpenAI client: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;root&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Root endpoint to confirm API is running.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome to the GenAI API&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;PromptRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Depends&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_openai_client&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OpenAI API client not initialized.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No response received from OpenAI API.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Run the app with uvicorn
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uvicorn&lt;/span&gt;
    &lt;span class="n"&gt;uvicorn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main:app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;reload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Running the FastAPI application
&lt;/h3&gt;

&lt;p&gt;To run the FastAPI application, execute the command below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run main.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command will start the FastAPI server locally at &lt;code&gt;http://127.0.0.1:8000&lt;/code&gt;, and the &lt;code&gt;--reload&lt;/code&gt; option allows for hot-reloading during development. You can use a cURL command below to make a POST request to the &lt;code&gt;/generate&lt;/code&gt; endpoint with a prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; &lt;span class="s1"&gt;'POST'&lt;/span&gt; &lt;span class="s1"&gt;'http://127.0.0.1:8000/generate'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt": "Tell me a fun fact about AI"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should receive a response like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"response"&lt;/span&gt;: &lt;span class="s2"&gt;"AI was first introduced as a field in 1956 at a conference at Dartmouth College. It was the birth of modern artificial intelligence!"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That marks the FastAPI-based GenAI API's readiness for local testing. Next, you will integrate AWS Lambda and API Gateway for serverless deployment.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying FastAPI to AWS Lambda
&lt;/h2&gt;

&lt;p&gt;To deploy the FastAPI GenAI server to &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, you will need to set up a few key components namely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://mangum.fastapiexpert.com/" rel="noopener noreferrer"&gt;Mangum&lt;/a&gt; for making FastAPI compatible with AWS Lambda&lt;/li&gt;
&lt;li&gt;The Lambda function handler&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/api-gateway/" rel="noopener noreferrer"&gt;AWS API Gateway&lt;/a&gt; to expose the FastAPI endpoints&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; added into &lt;a href="https://aws.amazon.com/secrets-manager/" rel="noopener noreferrer"&gt;AWS Secrets Manager&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mangum is a Python library that allows ASGI] applications (like FastAPI) to run on AWS Lambda. It acts as an adapter, making FastAPI compatible with AWS Lambda’s event-driven architecture and API Gateway.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Creating an AWS Lambda Function Handler with Magnum
&lt;/h3&gt;

&lt;p&gt;Once your FastAPI application is set up locally, you will need to wrap it in a handler that AWS Lambda can invoke when requests come in via API Gateway. This is where Mangum comes in. Modify your &lt;code&gt;main.py&lt;/code&gt; by importing Mangum and wrapping the FastAPI app. Add the handler right after defining your endpoints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;mangum&lt;/span&gt;  

&lt;span class="c1"&gt;# Create the handler for AWS Lambda
&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mangum&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Mangum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When your app is running in AWS, you need to ensure that the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; is accessed securely. You can add it into AWS Secrets Manager and update the &lt;code&gt;main.py&lt;/code&gt; so that depending on where you run the app, the corresponding &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; is used.&lt;/p&gt;

&lt;p&gt;The command below securely stores the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; in AWS Secrets Manager, ensuring that sensitive credentials are not hardcoded in the application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;create-secret&lt;/code&gt;: Creates a new secret in AWS Secrets Manager.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--name&lt;/code&gt;: Specifies the unique name of the secret.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--description&lt;/code&gt;: Provides a brief description of the secret.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--secret-string&lt;/code&gt;: Stores the actual secret as a JSON object, where &lt;code&gt;YOUR_OPENAI_API_KEY&lt;/code&gt; should be replaced with the actual API key.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws secretsmanager create-secret &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; openai/api_key &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--description&lt;/span&gt; &lt;span class="s2"&gt;"OpenAI API Key for GenAI API"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--secret-string&lt;/span&gt; &lt;span class="s1"&gt;'{"OPENAI_API_KEY":"YOUR_OPENAI_API_KEY"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once stored, the application can retrieve this secret dynamically.&lt;/p&gt;

&lt;p&gt;Then, update the &lt;code&gt;get_openai_api_key&lt;/code&gt; function in the &lt;code&gt;main.py&lt;/code&gt; file to allow retrieval of the key from the &lt;code&gt;.env&lt;/code&gt; file when running locally and from AWS Secrets Manager when running on Lambda.&lt;/p&gt;

&lt;p&gt;Code Breakdown:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;If running on AWS Lambda (detected via &lt;code&gt;AWS_LAMBDA_FUNCTION_NAME&lt;/code&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It fetches the API key securely from AWS Secrets Manager.&lt;/li&gt;
&lt;li&gt;A Secrets Manager client is created, and the stored secret (&lt;code&gt;openai/api_key&lt;/code&gt;) is retrieved and parsed.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;If running locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It loads the API key from the &lt;code&gt;.env&lt;/code&gt; file via environment variables.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="c1"&gt;# Check if running locally or in Lambda
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Running in Lambda, get key from AWS Secrets Manager
&lt;/span&gt;        &lt;span class="n"&gt;secret_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Create a Secrets Manager client
&lt;/span&gt;            &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;secretsmanager&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Get the secret API Key
&lt;/span&gt;            &lt;span class="n"&gt;get_secret_value_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_secret_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SecretId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;secret_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;secret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_secret_value_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SecretString&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;secret_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;secret_dict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;KeyError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY not found in Secrets Manager.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to retrieve API key from Secrets Manager&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Running locally, get key from .env file
&lt;/span&gt;        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
           &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY not found in environment variables&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully retrieved OpenAI API key from .env file.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Testing and Validating the API
&lt;/h3&gt;

&lt;p&gt;Testing and validating the API is crucial to ensure it is functioning correctly before deploying it. Below are several tests using &lt;a href="https://circleci.com/blog/pytest-python-testing/" rel="noopener noreferrer"&gt;pytest&lt;/a&gt; and &lt;a href="https://docs.python.org/3/library/unittest.html" rel="noopener noreferrer"&gt;unittest&lt;/a&gt; packages. The &lt;a href="https://circleci.com/blog/unit-testing-vs-integration-testing/" rel="noopener noreferrer"&gt;unit tests&lt;/a&gt; check if the app runs locally and in AWS Lambda, ensuring that requests work in both setups.&lt;/p&gt;

&lt;p&gt;These tests validate the core functionality of the FastAPI-based GenAI server by covering different scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Basic API Functionality&lt;/strong&gt;: Tests the root (&lt;code&gt;/&lt;/code&gt;) endpoint and the &lt;code&gt;/generate&lt;/code&gt; endpoint with a valid prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input Validation&lt;/strong&gt;: Ensures that invalid input (e.g., missing &lt;code&gt;prompt&lt;/code&gt;) returns appropriate error responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Mocks scenarios such as missing API keys and verifies that the API correctly returns error messages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mocking External Dependencies&lt;/strong&gt;: Uses &lt;code&gt;unittest.mock.patch&lt;/code&gt; to simulate OpenAI API calls and AWS Secrets Manager, ensuring API integration works as expected without relying on actual external services.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi.testclient&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TestClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HTTPException&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;unittest.mock&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MagicMock&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;


&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fixture for FastAPI test client&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;TestClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_root_endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the root endpoint&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome to the GenAI API&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_generate_endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the generate endpoint&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert error status code
&lt;/span&gt;    &lt;span class="n"&gt;response_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response_data&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_generate_invalid_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the generate endpoint with invalid input&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Test with missing prompt field
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert validation error
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;422&lt;/span&gt; &lt;span class="c1"&gt;# Unprocessable Entity
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;detail&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;loc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main.get_openai_api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Patch the get_openai_api_key function in main.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_generate_text_missing_api_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_get_api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the generate endpoint when the API key is missing&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Setup mock to raise an HTTPException
&lt;/span&gt;    &lt;span class="n"&gt;mock_get_api_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;side_effect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API key not found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Test with a sample prompt
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert error status code
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt; &lt;span class="c1"&gt;# Internal Server Error
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API key not found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;detail&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# # Test function to mock OpenAI client behavior
&lt;/span&gt;&lt;span class="nd"&gt;@patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main.get_openai_client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Patch the get_openai_client function in main.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_mock_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_get_client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test the OpenAI client behavior with a simplified mock client&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Set up the mock OpenAI client and the mock response in one go
&lt;/span&gt;    &lt;span class="n"&gt;mock_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mock_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mock response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Directly mock the message and its content
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# When `chat.completions.create()` is called, return the mock response
&lt;/span&gt;    &lt;span class="n"&gt;mock_get_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mock_response&lt;/span&gt;


    &lt;span class="c1"&gt;# Simulate calling the OpenAI client's `chat.completions.create()`
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mock_get_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me a joke&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert the mock response
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;mock_response&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mock response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="nd"&gt;@patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;boto3.session.Session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_get_openai_api_key_aws_environment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Test retrieving API key from AWS Secrets Manager&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Set up environment to simulate AWS Lambda
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test-function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;clear&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

        &lt;span class="c1"&gt;# Create mock for the entire boto3 session and client chain
&lt;/span&gt;        &lt;span class="n"&gt;mock_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MagicMock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;mock_session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mock_client&lt;/span&gt;

        &lt;span class="c1"&gt;# Mock the get_secret_value response
&lt;/span&gt;        &lt;span class="n"&gt;mock_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SecretString&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;mock_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_secret_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mock_response&lt;/span&gt;

        &lt;span class="c1"&gt;# Call the function under test
&lt;/span&gt;        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_openai_api_key&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Assertions
&lt;/span&gt;        &lt;span class="n"&gt;mock_session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assert_called_once&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;mock_session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;return_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assert_called_with&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;secretsmanager&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;mock_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_secret_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assert_called_with&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SecretId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mocking is an essential technique for testing app behavior before deploying it to production environments. It helps simulate API interactions, allowing you to check how the application would respond under various conditions without making real calls to external services.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  API Deployment to AWS with AWS SAM
&lt;/h2&gt;

&lt;p&gt;To expose your FastAPI endpoints using AWS API Gateway, you will use &lt;a href="https://aws.amazon.com/serverless/sam/" rel="noopener noreferrer"&gt;AWS Serverless Application Model (AWS SAM)&lt;/a&gt;. AWS SAM simplifies the process of building and deploying serverless applications on AWS by providing a simplified syntax for defining AWS resources such as Lambda functions, API Gateway, IAM roles, and other related services, all within a &lt;code&gt;template.yaml&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Key components of the &lt;code&gt;template.yaml&lt;/code&gt; file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lambda Function&lt;/strong&gt;: The serverless function that will execute the FastAPI application logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway&lt;/strong&gt;: The API Gateway exposes the FastAPI application as HTTP endpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secrets Manager&lt;/strong&gt;: Stores OpenAI API Key securely, which will be retrieved by Lambda.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policies&lt;/strong&gt;: Defines necessary IAM roles and policies that allow Lambda to interact with other AWS services (e.g., Secrets Manager).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2010-09-09'&lt;/span&gt;
&lt;span class="na"&gt;Transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless-2016-10-31&lt;/span&gt;
&lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GenAI API with FastAPI and Lambda&lt;/span&gt;

&lt;span class="c1"&gt;# Global variables&lt;/span&gt;
&lt;span class="na"&gt;Globals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Function&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Lambda function resources in the template&lt;/span&gt;
    &lt;span class="na"&gt;Timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;MemorySize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;
    &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python3.11&lt;/span&gt;
    &lt;span class="na"&gt;Architectures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;x86_64&lt;/span&gt;
    &lt;span class="na"&gt;Environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;OPENAI_API_KEY_SECRET_ARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:openai/api_key-*'&lt;/span&gt;
  &lt;span class="na"&gt;Api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;EndpointConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;REGIONAL&lt;/span&gt;
    &lt;span class="na"&gt;Cors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AllowMethods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt;
      &lt;span class="na"&gt;AllowHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'Content-Type,Authorization'"&lt;/span&gt;
      &lt;span class="na"&gt;AllowOrigin&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt;

&lt;span class="c1"&gt;# AWS resources that will be created&lt;/span&gt;
&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# API Gateway&lt;/span&gt;
  &lt;span class="na"&gt;GenAIApi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Api&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;StageName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;
      &lt;span class="na"&gt;EndpointConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;REGIONAL&lt;/span&gt;
      &lt;span class="na"&gt;Cors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;AllowMethods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt;
        &lt;span class="na"&gt;AllowHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'Content-Type,Authorization'"&lt;/span&gt;
        &lt;span class="na"&gt;AllowOrigin&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;'*'"&lt;/span&gt;

  &lt;span class="c1"&gt;# Lambda function&lt;/span&gt;
  &lt;span class="na"&gt;GenAIFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Function&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;CodeUri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./app/&lt;/span&gt;
      &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main.handler&lt;/span&gt;
      &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;FastAPI GenAI service using OpenAI API&lt;/span&gt;
      &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;AWSLambdaBasicExecutionRole&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;secretsmanager:GetSecretValue&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:openai/api_key-*'&lt;/span&gt;

      &lt;span class="na"&gt;Environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;OPENAI_API_KEY_SECRET_ARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:openai/api_key-*'&lt;/span&gt;    
      &lt;span class="na"&gt;Events&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;RootPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Api&lt;/span&gt;
          &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;RestApiId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;GenAIApi&lt;/span&gt;
            &lt;span class="na"&gt;Path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;       
            &lt;span class="na"&gt;Method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ANY&lt;/span&gt;   
        &lt;span class="na"&gt;GeneratePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Api&lt;/span&gt;
          &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;RestApiId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;GenAIApi&lt;/span&gt;
            &lt;span class="na"&gt;Path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/generate&lt;/span&gt;   
            &lt;span class="na"&gt;Method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ANY&lt;/span&gt;      

&lt;span class="na"&gt;Outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;GenAIApiEndpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;API Gateway endpoint URL for the GenAI service&lt;/span&gt;
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://${GenAIApi}.execute-api.${AWS::Region}.amazonaws.com/dev/'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Deploying the FastAPI Application
&lt;/h3&gt;

&lt;p&gt;Once the &lt;code&gt;template.yaml&lt;/code&gt; file is ready, the next step is to deploy your application using AWS SAM. Before deploying, you will need to create a Lambda deployment package that includes both the application code, &lt;code&gt;main.py&lt;/code&gt; and necessary dependencies.&lt;/p&gt;

&lt;p&gt;To make this easier, you will use a bash script (&lt;code&gt;build-sam.sh&lt;/code&gt;) to automate the process. This script will create a folder named &lt;strong&gt;app&lt;/strong&gt; where &lt;code&gt;main.py&lt;/code&gt; will be copied and the dependencies from &lt;code&gt;pyproject.toml&lt;/code&gt; will be transferred into a &lt;code&gt;requirements.txt&lt;/code&gt; file, which works seamlessly with AWS Lambda.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"🚀 Building Lambda deployment package..."&lt;/span&gt;

&lt;span class="c"&gt;# Create a temporary build directory&lt;/span&gt;
&lt;span class="nv"&gt;LAMBDA_PACKAGE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;/app"&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAMBDA_PACKAGE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Save dependencies to requirements.txt&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"📦 Saving dependencies to requirements.txt..."&lt;/span&gt;
uv pip freeze &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; requirements.txt  

&lt;span class="c"&gt;# Copy application code and project configuration&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"📋 Copying application code and project configuration..."&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;main.py requirements.txt &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAMBDA_PACKAGE&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"✅ Build complete! Your Lambda package is ready for deployment."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script will ensure that your Lambda deployment package includes both the application code and the necessary dependencies for running your application in AWS Lambda.&lt;/p&gt;

&lt;p&gt;Once your Lambda deployment package is ready, you can use the following AWS SAM commands (from the root directory, where the &lt;code&gt;template.yaml&lt;/code&gt; is located) to build and deploy the application.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam build
sam deploy &lt;span class="nt"&gt;--stack-name&lt;/span&gt; sam-app &lt;span class="nt"&gt;--resolve-s3&lt;/span&gt; &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_IAM &lt;span class="nt"&gt;--region&lt;/span&gt; eu-central-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;sam build&lt;/code&gt; command will build the Lambda package and create a &lt;code&gt;.aws-sam&lt;/code&gt; directory. The &lt;code&gt;sam deploy&lt;/code&gt; command will deploy the app to AWS and create the necessary AWS resources (Lambda, API Gateway, and IAM roles). &lt;/p&gt;

&lt;p&gt;Below is what the flags to the command do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;stack-name sam-app&lt;/code&gt; pecifies the name of the app: &lt;code&gt;sam-app&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;resolve-s3&lt;/code&gt; automatically creates and manages an Amazon S3 bucket for storing deployment artifacts.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;capabilities CAPABILITY_IAM&lt;/code&gt; acknowledges that the deployment may create IAM roles and policies.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;region eu-central-1&lt;/code&gt; specifies the AWS region where the stack should be deployed. You can change this to your region.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the deployment is complete, you will get an API Gateway URL in the output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4r3t48dwgvkcwwntrt4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4r3t48dwgvkcwwntrt4.png" alt="deployment-sam-aws" width="512" height="127"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can test the &lt;code&gt;/generate&lt;/code&gt; endpoint by making a POST request using &lt;code&gt;curl&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://znhxj2t415.execute-api.eu-central-1.amazonaws.com/dev/generate &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt": "Tell me a joke!"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should receive a response like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"response"&lt;/span&gt;:&lt;span class="s2"&gt;"Why did the scarecrow win an award? &lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Because he was outstanding in his field!"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can monitor the logs for both AWS Lambda and API Gateway in AWS CloudWatch to ensure everything is running smoothly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2talenmfalcsrwfdp70.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2talenmfalcsrwfdp70.png" alt="lambda-cloudwatch" width="512" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Automating Deployment with CircleCI
&lt;/h2&gt;

&lt;p&gt;CI/CD is a critical aspect of modern software development that helps to automate the process of testing and deploying code. For serverless applications, such as those deployed using AWS Lambda, APIs, and other cloud services, CI/CD pipelines ensure the following benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automation&lt;/strong&gt;: It reduces human errors by automating testing and deployment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster Delivery&lt;/strong&gt;: Features and updates are delivered faster with automated pipelines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Ensures the same deployment process across environments (dev, staging, production).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Serverless architectures naturally scale, and a CI/CD pipeline ensures deployments can handle this scaling seamlessly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next you will walk you through setting up CircleCI to automate the testing and deployment of your FastAPI serverless application to AWS.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up AWS Credentials and OpenAI API Key in CircleCI
&lt;/h3&gt;

&lt;p&gt;Before CircleCI can deploy your serverless application to AWS, you need to configure your AWS credentials and &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;. In your CircleCI account, &lt;a href="https://circleci.com/docs/create-project/" rel="noopener noreferrer"&gt;set up a project&lt;/a&gt; and link it to your GitHub repository. Then, under project settings, &lt;a href="https://circleci.com/docs/set-environment-variable/" rel="noopener noreferrer"&gt;add the environment variables&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feltwoxop9l5ohawglv63.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feltwoxop9l5ohawglv63.png" alt="env-var-circleci" width="512" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These variables will be used by CircleCI during the CI/CD process to authenticate with AWS and OpenAI.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the &lt;code&gt;config.yml&lt;/code&gt; File for CircleCI
&lt;/h3&gt;

&lt;p&gt;Now that you have your credentials and environment variables set up, you need to create a &lt;code&gt;config.yml&lt;/code&gt; file for CircleCI that will automate the testing and deployment of your serverless application. This configuration file should be placed under the &lt;code&gt;.circleci&lt;/code&gt; directory in the root of your repository. Create the folder and add the YAML file to it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2.1&lt;/span&gt;

&lt;span class="na"&gt;orbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;python&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/python@3.0.0&lt;/span&gt;
  &lt;span class="na"&gt;aws-cli&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;circleci/aws-cli@5.2.0&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cimg/python:3.11&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;checkout&lt;/span&gt;  &lt;span class="c1"&gt;# Checkout the code from the repository&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install uv&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;curl -LsSf https://astral.sh/uv/install.sh | sh&lt;/span&gt;  &lt;span class="c1"&gt;# Install uv package&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install the project&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;uv sync&lt;/span&gt;   &lt;span class="c1"&gt;# Sync the project using uv&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run tests&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;uv run pytest&lt;/span&gt;  

      &lt;span class="c1"&gt;# Using AWS CLI orb for setup instead of manual installation&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;aws-cli/setup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;profile_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install AWS SAM CLI&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;# Download and install AWS SAM CLI for Linux&lt;/span&gt;
            &lt;span class="s"&gt;curl -Lo sam-cli-linux-x86_64.zip https://github.com/aws/aws-sam-cli/releases/latest/download/aws-sam-cli-linux-x86_64.zip&lt;/span&gt;
            &lt;span class="s"&gt;unzip sam-cli-linux-x86_64.zip -d sam-installation  # Unzip the SAM CLI&lt;/span&gt;
            &lt;span class="s"&gt;sudo ./sam-installation/install  # Run the SAM CLI installation script&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build Lambda Deployment Package&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;chmod +x build-sam.sh&lt;/span&gt;
            &lt;span class="s"&gt;./build-sam.sh  # Execute the build script&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build with SAM&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sam build&lt;/span&gt; 

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Verify AWS Credentials&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;aws sts get-caller-identity  # Verify AWS credentials are set up correctly&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy with SAM&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;sam deploy --stack-name sam-app --resolve-s3 --capabilities CAPABILITY_IAM --region eu-central-1  # Deploy with SAM&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Output API Gateway URL&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;API_URL=$(aws cloudformation describe-stacks --stack-name sam-app --region eu-central-1)&lt;/span&gt;
            &lt;span class="s"&gt;echo "GenAI API URL: $API_URL"  # Output the API Gateway URL&lt;/span&gt;

&lt;span class="na"&gt;workflows&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;build-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws&lt;/span&gt;  &lt;span class="c1"&gt;# Add this to use AWS credentials from CircleCI context&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Breakdown of the &lt;code&gt;config.yml&lt;/code&gt; file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Orbs&lt;/strong&gt;: The configuration uses two orbs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;python&lt;/code&gt;: The CircleCI Python orb handles setting up the Python environment.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;aws-cli&lt;/code&gt;: The AWS CLI orb simplifies the setup of AWS CLI to interact with AWS services.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Jobs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;build-deploy&lt;/code&gt;: This job defines all the steps necessary to deploy your application, from installing all required dependencies, including AWS CLI and AWS SAM, verifying credentials, running tests, till building and deploying the lambda package using AWS SAM.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Workflows&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;deploy&lt;/code&gt; workflow ensures that the build-deploy job is executed in sequence whenever changes are pushed to the repository.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;context: aws&lt;/code&gt; ensures that the AWS credentials stored in CircleCI are available during the deployment process.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;After creating the project, the pipeline will be triggered. If everything is set up correctly , you should see a green build. On clicking the pipeline, you will see at the end of the deployment, the API Gateway URL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd2hhqh4cc50u49f1kzu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd2hhqh4cc50u49f1kzu.png" alt="circleci-deployment" width="512" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Similarly to the previous deployments, you can test the &lt;code&gt;/generate&lt;/code&gt; endpoint by making a POST request using &lt;code&gt;curl&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://jz82zg9i21.execute-api.eu-central-1.amazonaws.com/dev/generate &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt": "Tell me a joke!"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; You should receive a response like this:
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"response"&lt;/span&gt;:&lt;span class="s2"&gt;"Why did the scarecrow win an award? &lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Because he was outstanding in his field!"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Clean Up
&lt;/h3&gt;

&lt;p&gt;If you do not need the app anymore, you can delete it with the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam delete
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will be prompted to add the app name (&lt;code&gt;sam-app&lt;/code&gt;) and to confirm the deletion.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Automating the deployment of serverless applications, such as those using FastAPI and AWS Lambda, significantly enhances the efficiency and reliability of the development process. By leveraging CircleCI for Continuous Integration and Continuous Deployment (CI/CD), you ensure that every code change is automatically tested, built, and deployed to AWS. This eliminates manual intervention, reduces the risk of human error, and accelerates the delivery of new features and updates.&lt;/p&gt;

&lt;p&gt;The combination of AWS SAM, CircleCI, and serverless architecture not only streamlines the deployment but also scales seamlessly with demand, allowing you to focus on building and improving your application without worrying about the underlying infrastructure. Additionally, integrating AWS services such as API Gateway and Secrets Manager ensures secure and efficient interactions with external APIs like OpenAI's, while maintaining the best practices for cloud-based applications.&lt;/p&gt;

&lt;p&gt;By following the outlined steps, you can establish a robust and automated pipeline for deploying serverless applications. This approach will empower your team to deliver high-quality code faster, with confidence, and without downtime.&lt;/p&gt;

&lt;p&gt;If you enjoyed reading this content, you can support me by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reacting to the post 💖 💖 💖&lt;/li&gt;
&lt;li&gt;Following me on DEV! 👏 👏 👏&lt;/li&gt;
&lt;li&gt;Sharing my content on LinkedIn! 💯💯💯&lt;/li&gt;
&lt;li&gt;Following my &lt;a href="https://github.com/benitomartin" rel="noopener noreferrer"&gt;Github&lt;/a&gt; 🎶 🎷 🎶&lt;/li&gt;
&lt;li&gt;Hiring &lt;a href="https://martindatasol.com/" rel="noopener noreferrer"&gt;me&lt;/a&gt;! 👨‍💻👨‍💻👨‍💻&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do you have questions or feedback? Feel free to ask below 🚀&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
