<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergio Esteban</title>
    <description>The latest articles on DEV Community by Sergio Esteban (@sergioestebance).</description>
    <link>https://dev.to/sergioestebance</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3483835%2Fe24cf772-dbe2-4dc1-88f0-7075fbe1cc68.png</url>
      <title>DEV Community: Sergio Esteban</title>
      <link>https://dev.to/sergioestebance</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sergioestebance"/>
    <language>en</language>
    <item>
      <title>Launching your personal assistant</title>
      <dc:creator>Sergio Esteban</dc:creator>
      <pubDate>Tue, 16 Dec 2025 07:49:36 +0000</pubDate>
      <link>https://dev.to/sergioestebance/launching-your-personal-assistant-2blp</link>
      <guid>https://dev.to/sergioestebance/launching-your-personal-assistant-2blp</guid>
      <description>&lt;h3&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In this post, we will extend the implementation of an AI Agent. This time, we improve the agent’s accessibility and its ability to use different tools depending on the context.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;TL;DR&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;All code used in this article is available in the &lt;a href="https://github.com/warike/expo-lambda-s3-gemini" rel="noopener noreferrer"&gt;repository&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Context&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In addition to using AWS serverless services for rapid experimentation, we have given the agent the ability to generate responses using a predefined set of tools, with Gemini as the model provider.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What’s the objective?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We want to obtain information in different formats through tools and then present it to the client within a mobile application.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why mobile?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The greatest benefits of AI come when the technology is accessible on the devices you carry every day.&lt;/p&gt;

&lt;p&gt;I definitely don’t carry a laptop everywhere, do you?&lt;/p&gt;

&lt;p&gt;Below is a high-level diagram representing what we want to achieve.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8n8tg6n4h56klts1qpb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8n8tg6n4h56klts1qpb.png" alt="high level diagram of information flow" width="700" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While there are no significant changes to the architecture, you will see the necessary considerations in how we handle responses from our Lambda.&lt;/p&gt;

&lt;p&gt;Let’s move ahead.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Scope&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We will use the &lt;a href="https://support.davisinstruments.com/article/y4cq28mflh-vantage-pro2-console-user-manual" rel="noopener noreferrer"&gt;Davis Instruments console manuals&lt;/a&gt; as data sources, and we will build a private chatbot for iPhone that answers questions from these manuals and displays data from the stations.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Functional requirements&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The system should allow you to consult and receive information related to Vantage Pro2 equipment, such as setup questions and/or requests for data from the weather stations.&lt;/li&gt;
&lt;li&gt;This interaction must be in natural language.&lt;/li&gt;
&lt;li&gt;The system should allow sending messages and receiving a response in real time as it is generated.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Non-Functional requirements&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;As usual, you can rely on serverless capabilities and add a few additional characteristics.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The system that provides data to the app should be highly available.&lt;/li&gt;
&lt;li&gt;The system should scale automatically to handle variable traffic patterns.&lt;/li&gt;
&lt;li&gt;The system should remain secure with domain-level access control and encrypted traffic.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following is a high-level diagram of the architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21b0pcigxtirnhm60e2k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21b0pcigxtirnhm60e2k.png" alt="High level diagram of the architecture." width="700" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Out of scope&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will omit the feature of being able to request actual data from the &lt;a href="http://weatherlink.com/" rel="noopener noreferrer"&gt;WeatherLink API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, you can find a &lt;a href="https://github.com/warike/showcase/tree/main/projects/06-davis-mcp-express" rel="noopener noreferrer"&gt;project of mine&lt;/a&gt; that enables an integration using AI and MCP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It is also worth mentioning that just because this implementation is possible does not mean it is the ideal way to expose an API.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With that out of the way, let’s move on.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Cost breakdown&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Considering a volume of 1,000 requests per month, which for our use case is very generous, we are running our solution with the following components.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Google AI API Calls&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will use Gemini 2.5 Flash for language inference and text-embedding-004 for embeddings.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;gemini-2.5-flash input (500k tokens x $0.30/M) plus output (500k tokens x $2.50/M) = $0.15 + $1.25&lt;/li&gt;
&lt;li&gt;text-embedding-004 (500k tokens x $0.10/M) = $0.05&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives us an approximate total of $1.45 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;AWS Lambda&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will configure a function with 1024 MB and a 60 second timeout. AWS provides 1M free requests and 400,000 GB-seconds monthly, which covers typical usage.&lt;/p&gt;

&lt;p&gt;This results in $0.00 USD per month (within free tier).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;CloudFront&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will use PriceClass_100 (North America and Europe), assuming moderate traffic of 5 GB of data transfer and 50,000 HTTPS requests.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data transfer: 5 GB x $0.085/GB = $0.43&lt;/li&gt;
&lt;li&gt;HTTPS requests: 50,000 x $0.01/10,000 = $0.05&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in about $0.48 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;S3 Storage&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For storing static assets and application data, we estimate 2 GB of storage and 100,000 GET requests.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Storage: 2 GB x $0.023/GB = $0.05&lt;/li&gt;
&lt;li&gt;GET requests: 100,000 x $0.0004/1,000 = $0.04&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in about $0.09 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Container Registry (AWS ECR)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We store 3 images of 400 MB each (1.2 GB total). After the 500 MB Free Tier, billable storage is 0.7 GB.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0.7 GB x $0.10/GB = $0.07&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in about $0.07 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Route 53&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We will use a hosted zone ($0.50/month) and include a standard .com domain registration fee ($13.00/year or $1.08/month).&lt;/p&gt;

&lt;p&gt;This results in about $1.58 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;CloudWatch&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For logging with seven-day retention, we estimate a minimal volume (0.2 GB of ingestion).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Log Ingestion: 0.2 GB x $0.50/GB = $0.10&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in about $0.10 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Apple Developer Account&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The Apple Developer Program membership is $99 USD per year. This is a fixed annual fee required to publish apps on the App Store.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Annual Cost: $99.00 / 12 months = $8.25&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in a monthly allocation of $8.25 USD.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Total Monthly Cost&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This brings us to a total of approximately $12.02 USD per month.&lt;/p&gt;
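
&lt;p&gt;As a quick sanity check, the per-component estimates above can be added up in a few lines of TypeScript. This is only a sketch: the object keys are illustrative names, and the amounts are the rounded monthly figures from this section, expressed in cents.&lt;/p&gt;

```typescript
// Monthly cost per component in US cents, taken from the estimates above.
// Integer cents avoid floating-point rounding issues when summing.
const monthlyCostCents: { [component: string]: number } = {
  googleAi: 145,       // Gemini 2.5 Flash + text-embedding-004
  lambda: 0,           // within the free tier
  cloudFront: 48,      // data transfer + HTTPS requests
  s3: 9,               // storage + GET requests
  ecr: 7,              // billable image storage
  route53: 158,        // hosted zone + domain registration
  cloudWatch: 10,      // log ingestion
  appleDeveloper: 825, // $99 per year spread over 12 months
};

const totalCents = Object.values(monthlyCostCents).reduce((sum, cents) => sum + cents, 0);
const totalUsd = (totalCents / 100).toFixed(2);

console.log(`Total: $${totalUsd} USD per month`);
```

&lt;p&gt;Summing in integer cents gives 1202, which matches the $12.02 figure. Note that the Apple Developer fee alone accounts for roughly two thirds of the total.&lt;/p&gt;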

&lt;h2&gt;
  
  
  &lt;strong&gt;Implementation&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let’s start with the app and work our way toward the infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Mobile App&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For the technical implementation of the mobile application, I suggest using the &lt;a href="https://galaxies.dev/dashboard" rel="noopener noreferrer"&gt;Galaxies Dev&lt;/a&gt; &lt;a href="https://github.com/Galaxies-dev/chatgpt-clone-react-native" rel="noopener noreferrer"&gt;repository&lt;/a&gt;. Big shout-out for his work.&lt;/p&gt;

&lt;p&gt;Since the &lt;a href="https://www.youtube.com/watch?v=8ztx68SUOQo" rel="noopener noreferrer"&gt;video&lt;/a&gt; explains how to build the application from scratch, we will only highlight the changes we made.&lt;/p&gt;

&lt;p&gt;The first thing to consider is that the application will use the following dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport } from 'ai';
import { fetch as expoFetch } from 'expo/fetch';
import * as Crypto from 'expo-crypto';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important parts of the implementation at this point are the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We need to include the SHA-256 hash of the body in the x-amz-content-sha256 header of POST requests, as mentioned in the &lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-lambda.html" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;We need to add the Clerk token to our request headers.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const transport = useMemo(() =&amp;gt; new DefaultChatTransport({
    fetch: async (url, options) =&amp;gt; {
        if (options?.method === 'POST' &amp;amp;&amp;amp; options.body &amp;amp;&amp;amp; typeof options.body === 'string') {
            const hash = await Crypto.digestStringAsync(
                Crypto.CryptoDigestAlgorithm.SHA256,
                options.body
            );
            options.headers = {
                ...options.headers,
                'x-amz-content-sha256': hash,
            };
        }
        return expoFetch(url as string, options as any);
    },
    credentials: 'include',
    api: generateAPIUrl("/api/chat"),
    headers: async () =&amp;gt; {
        const token = await getTokenRef.current();
        return {
            "Authorization": token ? `Bearer ${token}` : '',
            "X-Clerk-Token": token || '',
        };
    },
}), []);

  ...

const { messages, stop, sendMessage, status, setMessages } = useChat({
    transport: transport,

    onFinish: async ({ message }) =&amp;gt; {
        scrollViewRef.current?.scrollToEnd({ animated: true });

        // Save assistant message to database
        const currentChatId = chatIdRef.current;

        if (currentChatId &amp;amp;&amp;amp; message) {
            const chatIdNum = parseInt(currentChatId, 10);
            if (isNaN(chatIdNum)) return;

            // Filter out tool invocations and empty parts
            const validParts = filterPartsForDB(message.parts);

            // Only save if there's actual text content
            if (validParts.length === 0) return;

            try {
                await addMessage(db, chatIdNum, {
                    parts: validParts,
                    role: Role.Assistant,
                });

            } catch {
                // Fail silently
            }
        }
    },
    onError: (err) =&amp;gt; {
        console.error('Chat error:', err);
    },
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Later, to display the messages from the API, we assume that certain messages include tool results within the payload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type ChatMessageProps = {
  message: UIMessage;
  isLoading?: boolean;
  status?: ChatStatus;
};

const ChatMessage = ({ message, isLoading, status }: ChatMessageProps) =&amp;gt; {
  const { user } = useUser();
  const { role, id, parts } = message;
  const isUser = role === Role.User;
  const isAssistant = role === Role.Assistant;

  // Check if this is a streaming message with no content yet
  const hasTextContent = parts?.some(
    (part) =&amp;gt; part.type === 'text' &amp;amp;&amp;amp; part.text &amp;amp;&amp;amp; part.text.trim().length &amp;gt; 0
  );

  // Check if we have a data tool response to avoid duplicate rendering
  const hasTools = parts?.some(
    // @ts-ignore
    (part) =&amp;gt; part.type === 'tool-dataTool' || part.type === 'tool-weatherTool' || part.type === 'tool-gddTool'
  );

  // Show loader for assistant messages that are empty and we're streaming
  const showLoader = isLoading || (isAssistant &amp;amp;&amp;amp; !hasTextContent &amp;amp;&amp;amp; !hasTools &amp;amp;&amp;amp; status === 'streaming');

  // Don't render empty user messages
  if (isUser &amp;amp;&amp;amp; !hasTextContent) {
    return null;
  }

  return (
    &amp;lt;View style={[styles.row, isUser &amp;amp;&amp;amp; { flexDirection: 'row-reverse' }]}&amp;gt;
      &amp;lt;View style={[styles.item, { backgroundColor: '#fff', paddingTop: 5 }]}&amp;gt;
        &amp;lt;Image
          source={
            isUser
              ? { uri: user?.imageUrl || 'https://placehold.co/250x250?text=U' }
              : require('@/assets/images/logo-white.png')
          }
          style={styles.avatar}
        /&amp;gt;
      &amp;lt;/View&amp;gt;

      &amp;lt;View style={[styles.text, { flex: 1, paddingBottom: 10 }, isUser &amp;amp;&amp;amp; styles.userMessageBubble]}&amp;gt;
        {showLoader ? (
          &amp;lt;LottieLoader /&amp;gt;
        ) : (
          parts?.map((part, i) =&amp;gt; {
            switch (part.type) {
              case 'text':
                if (!part.text || part.text.trim().length === 0) {
                  return null;
                }
                return &amp;lt;CustomMarkdown key={`${id}-${i}`} content={part.text} /&amp;gt;;
              case 'dynamic-tool':
                return &amp;lt;LottieLoader key={`${id}-${i}`} /&amp;gt;;
              case 'tool-weatherTool':
                const weatherOutput = part.output as WeatherDataProps;
                if (weatherOutput) {
                  const { temperature, unit, description, forecast } = weatherOutput;
                  return &amp;lt;MessageData key={`${id}-${i}`} temperature={temperature} unit={unit} description={description} forecast={forecast} /&amp;gt;;
                }
                return &amp;lt;LottieLoader key={`${id}-${i}`} /&amp;gt;;
              case 'tool-dataTool':
                const soilOutput = part.output as SoilChartProps;
                if (soilOutput) {
                  const { labels, datasets, legend } = soilOutput;
                  return &amp;lt;MessageChart key={`${id}-${i}`} labels={labels} datasets={datasets} legend={legend} /&amp;gt;;
                }
                return &amp;lt;LottieLoader key={`${id}-${i}`} /&amp;gt;;
              case 'tool-gddTool':
                const gddOutput = part.output as GDDDataProps;
                if (gddOutput) {
                  const { labels, datasets } = gddOutput;
                  return &amp;lt;MessageGDD key={`${id}-${i}`} labels={labels} datasets={datasets} /&amp;gt;;
                }
                return &amp;lt;LottieLoader key={`${id}-${i}`} /&amp;gt;;
              default:
                return null;
            }
          })
        )}
      &amp;lt;/View&amp;gt;
    &amp;lt;/View&amp;gt;
  );
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
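
&lt;p&gt;The loader condition in the component above can be isolated into a small pure function, which makes it easier to reason about and to unit test. The following is a minimal sketch, not the code from the repository: the Part type is a simplified stand-in for the message parts, the helper names are hypothetical, and it matches any part whose type starts with tool- instead of listing each tool explicitly. It also leaves out the isLoading flag.&lt;/p&gt;

```typescript
// Simplified stand-in for a message part (assumed shape, for illustration only).
type Part = { type: string; text?: string; output?: unknown };

// True when the part is a static tool part, whatever the tool name is.
const isToolPart = (part: Part): boolean => part.type.startsWith('tool-');

// True when the message carries at least one non-empty text part.
const hasText = (parts: Part[]): boolean =>
  parts
    .filter((part) => part.type === 'text')
    .some((part) => (part.text ?? '').trim().length > 0);

// Show the loader only for assistant messages that are still streaming
// and have neither text nor tool parts to render yet.
function shouldShowLoader(parts: Part[], isAssistant: boolean, status: string): boolean {
  if (!isAssistant || status !== 'streaming') return false;
  return !(hasText(parts) || parts.some(isToolPart));
}

console.log(shouldShowLoader([], true, 'streaming'));
console.log(shouldShowLoader([{ type: 'tool-weatherTool' }], true, 'streaming'));
```

&lt;p&gt;Matching on the prefix means a new tool does not require touching this check, although each tool still needs its own case in the switch to be rendered.&lt;/p&gt;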



&lt;p&gt;With all that in place, most of the application should be working. We can now build the backend so that it responds to the app’s requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Implementing Mastra in Lambda&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The first step is to create a simple flow in Lambda that validates the request’s credentials, checks the path, and finally generates the response from the messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export const handler = awslambda.streamifyResponse&amp;lt;APIGatewayProxyEventV2&amp;gt;(
    async (event: APIGatewayProxyEventV2, responseStream: awslambda.HttpResponseStream) =&amp;gt; {

        responseStream.setContentType('text/plain; charset=utf-8');

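        if (!await checkAccess(event)) {
            // Send an explicit 401 instead of the default 200 status
            responseStream = awslambda.HttpResponseStream.from(responseStream, {
                statusCode: 401,
                headers: {},
            });
            responseStream.write('Unauthorized');
            responseStream.end();
            return;
        }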

        if (event.rawPath !== '/api/chat') {
            const httpResponseMetadata = {
                statusCode: 404,
                statusMessage: "Not Found",
                headers: {},
            };
            responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);
            responseStream.write('Not Found');
            responseStream.end();
            return;
        }

        try {
            const httpResponseMetadata = {
                statusCode: 200,
                statusMessage: "OK",
                headers: {},
            };
            responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);
            // Parse request body
            if (event.body) {
                const body = JSON.parse(event.body);
                const { messages } = body;
                await runAgent(messages, responseStream);
            }
        } catch (error) {
            console.error(`Error in handler:`);
            console.error(error);
        } finally {
            responseStream.end();
        }
    }
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Clerk authentication&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Validating the user is quite simple with Clerk. You verify the token sent from the app, and Clerk takes care of the rest.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { verifyToken } from '@clerk/backend';
import { env } from './env.js';

export const checkAccess = async (event: APIGatewayProxyEventV2): Promise&amp;lt;boolean&amp;gt; =&amp;gt; {
  try {
    const headers = event.headers || {};
    const authHeader = headers['x-clerk-token'] ?? headers['authorization'] ?? headers['Authorization'];
    const token = authHeader?.replace(/^Bearer\s+/i, '');
    if (!token) {
      return false;
    }

    await verifyToken(token, {
      secretKey: env.CLERK_SECRET_KEY,
    });
    return true;
  } catch (error) {
    return false;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
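
&lt;p&gt;The header lookup is the part of this check that is easiest to get subtly wrong, so it can help to pull it into a pure function and test it on its own. The sketch below is an illustration under assumptions, not the code from the repository: extractToken and HeaderMap are hypothetical names, and the function normalizes header names to lowercase before looking them up.&lt;/p&gt;

```typescript
// Header names mapped to values, as found on the incoming event (assumed shape).
type HeaderMap = { [name: string]: string | undefined };

// Hypothetical helper: returns the bare token, or undefined when none was sent.
function extractToken(headers: HeaderMap): string | undefined {
  // Normalizing the keys makes the lookup independent of header casing.
  const lower: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }
  // Prefer the custom header, then fall back to Authorization.
  // Using "||" instead of "??" also skips an empty-string header value.
  const raw = lower['x-clerk-token'] || lower['authorization'];
  // Strip an optional "Bearer " prefix, case-insensitively.
  const token = raw?.replace(/^Bearer\s+/i, '').trim();
  return token ? token : undefined;
}

console.log(extractToken({ Authorization: 'Bearer abc' }));
```

&lt;p&gt;With this in place, the access check only has to pass the result to verifyToken and return false when it is undefined.&lt;/p&gt;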



&lt;p&gt;All secrets and configuration variables are available on the Clerk dashboard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5jyj3cj5d5rz4gquxkc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5jyj3cj5d5rz4gquxkc.png" alt="Clerk’s Dashboard" width="700" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Mastra&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Our implementation &lt;strong&gt;cannot use the native functions of ai-sdk or Mastra&lt;/strong&gt;. We must use the pipeline function from Node.js (node:stream/promises) to pipe the response into the response stream of our Lambda function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { mastra } from './app.js';
import { UIMessage, stepCountIs } from 'ai';
import { pipeline } from 'node:stream/promises';

export const runAgent = async (messages: UIMessage[], responseStream: awslambda.HttpResponseStream) =&amp;gt; {
    const agent = mastra.getAgent("basic");
    const result = await agent.stream(messages, {
        stopWhen: stepCountIs(5),
        modelSettings: {
            temperature: 0.0,
            topK: 3,
        },
    });
    const sseResponse = result.aisdk.v5.toUIMessageStreamResponse({ sendReasoning: false });
    await pipeline(sseResponse.body!, responseStream);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also need to create the agent and its tools. I am going to reuse the S3 Vectors functionality and use static functions that stand in for the data tools.&lt;/p&gt;

&lt;p&gt;Feel free to implement them yourself, or DM me for more specifics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// agents.ts

import { Agent } from '@mastra/core/agent';
import { rag_system_prompt } from './prompt.js';
import { languageModel } from './model.js';
import { queryTool, weatherTool, gddTool, dataTool } from './tools.js';

export const basicAgent = new Agent({
    id: "basic",
    name: "basic",
    instructions: [
        {
            role: "system", content: rag_system_prompt
        },
    ],
    model: languageModel,
    tools: {
        queryTool,
        weatherTool,
        gddTool,
        dataTool,
    },
});

// tools.ts

import { MastraLanguageModel } from '@mastra/core/memory';
import { createVectorQueryTool } from '@mastra/rag';
import { createTool } from "@mastra/core/tools";
import { embeddingModel, languageModel } from './model.js';
import { S3VectorsStoreName } from './vector.js';
import { env } from '../env.js';
import { z } from 'zod';

export const queryTool = createVectorQueryTool({
    id: 'tool_knowledgebase_s3vectors',
    description:
        'Use it to search for relevant documentation about Vantage Pro 2',
    vectorStoreName: S3VectorsStoreName,
    indexName: env.AWS_S3_VECTORS_INDEX_NAME,
    includeVectors: false,
    model: embeddingModel,
    reranker: {
        model: languageModel as unknown as MastraLanguageModel,
        options: {
            topK: 3,
        },
    },
});

export const weatherTool = createTool({
    id: "weather-tool",
    description: "Fetches temperature from weather stations for a location",
    inputSchema: z.object({
        stationId: z.string(),
    }),
    outputSchema: z.object({
        temperature: z.string(),
        unit: z.string(),
        description: z.string(),
        forecast: z.array(z.string()),
    }),
    execute: async (inputData) =&amp;gt; {
        return {
            temperature: "10",
            unit: "C",
            description: "Sunny",
            forecast: ["10", "11", "15", "10", "4"],
        };
    },
});

export const dataTool = createTool({
    id: "data-tool",
    description: "Fetches soil moisture data from weather stations for a location",
    inputSchema: z.object({
        stationId: z.string(),
    }),
    outputSchema: z.object({
        labels: z.array(z.string()),
        legend: z.array(z.string()).optional(),
        datasets: z.array(z.object({
            data: z.array(z.number()),
            color: z.string().optional(),
            strokeWidth: z.number().optional(),
            withDots: z.boolean().optional()
        }))
    }),
    execute: async (inputData) =&amp;gt; {
        return {
            labels: ["08-12", "09-12", "10-12", "11-12", "12-12"],
            legend: ["20cm", "50cm", "75cm"],
            datasets: [
                {
                    data: [80, 78, 108, 90, 85, 80],
                    color: "rgba(80, 119, 29, 1)",
                    strokeWidth: 2
                },
                {
                    data: [92, 91, 91, 91, 92, 91],
                    color: "rgba(120, 160, 250, 1)",
                    strokeWidth: 2
                },
                {
                    data: [101, 101, 102, 102, 101, 101],
                    color: "rgba(250, 200, 50, 1)",
                    strokeWidth: 2
                },
                {
                    data: [160],
                    withDots: false,
                    color: "transparent",
                    strokeWidth: 0
                },
                {
                    data: [0],
                    withDots: false,
                    color: "transparent",
                    strokeWidth: 0
                }
            ]
        };
    },
});

export const gddTool = createTool({
    id: "gdd-tool",
    description: "Fetches GDD data from weather stations for a location",
    inputSchema: z.object({
        stationId: z.string(),
    }),
    outputSchema: z.object({
        labels: z.array(z.string()),
        datasets: z.array(z.object({
            data: z.array(z.number())
        }))
    }),
    execute: async (inputData) =&amp;gt; {
        return {
            labels: ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"],
            datasets: [
                {
                    data: [5.64, 15, 7.64, 1, 8, 6.1, 10]
                }
            ]
        };
    },
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s move ahead to define the infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Infrastructure&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The infrastructure doesn’t change much from the previous implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Lambda definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It is important to map the corresponding environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;locals {
  lambda_chat = {
    name        = "chat-${basename(path.cwd)}"
    image       = "${aws_ecr_repository.warike_development_ecr.repository_url}:chat-v1.0"
    description = "Lambda chat function for ${local.project_name}"
    memory_size = 1024
    timeout     = 60
    env_vars = {
      GOOGLE_GENERATIVE_AI_API_KEY = var.google_generative_ai_api_key
      GOOGLE_LANGUAGE_MODEL        = var.google_language_model
      GOOGLE_EMBEDDING_MODEL       = var.google_model_embedding

      AWS_S3_VECTORS_BUCKET_NAME = var.vector_bucket_name
      AWS_S3_VECTORS_INDEX_NAME  = var.vector_bucket_index_name

      LANGWATCH_API_KEY = var.langwatch_api_key
      CLERK_SECRET_KEY  = var.clerk_secret_key

      NODE_OPTIONS = "--enable-source-maps --stack-trace-limit=1000"
      NODE_ENV     = "production"
    }
  }
}

## Lambda Chat
module "warike_development_lambda_chat" {
  source  = "terraform-aws-modules/lambda/aws"
  version = "~&amp;gt; 8.1.2"

  ## Configuration
  function_name = local.lambda_chat.name
  description   = local.lambda_chat.description
  memory_size   = local.lambda_chat.memory_size
  timeout       = local.lambda_chat.timeout

  ## Package
  create_package = false
  package_type   = "Image"
  image_uri      = local.lambda_chat.image
  environment_variables = merge(
    local.lambda_chat.env_vars,
    {}
  )

  create_current_version_allowed_triggers = false

  ## Permissions
  create_role = false
  lambda_role = aws_iam_role.warike_development_lambda_chat_role.arn

  ## Logging
  use_existing_cloudwatch_log_group = true
  logging_log_group                 = aws_cloudwatch_log_group.warike_development_lambda_chat_logs.name
  logging_log_format                = "JSON"
  logging_application_log_level     = "INFO"
  logging_system_log_level          = "WARN"

  ## Response Streaming
  invoke_mode = "RESPONSE_STREAM"

  ## Lambda Function URL for testing
  create_lambda_function_url = true
  authorization_type         = "AWS_IAM"

  cors = {
    allow_credentials = true
    allow_origins     = ["*"]
    allow_methods     = ["*"]
    allow_headers     = ["*"]
    expose_headers    = ["*"]
    max_age           = 86400
  }

  tags = merge(local.tags, { Name = local.lambda_chat.name })

  depends_on = [
    aws_cloudwatch_log_group.warike_development_lambda_chat_logs,
    aws_ecr_repository.warike_development_ecr,
    null_resource.warike_development_build_image_seed
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;CloudFront&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It is important to map the domain names that you have chosen to use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;locals {
  cloudfront_oac_lambda_function_url = "chat_lambda_function_url"
  zone_name                          = var.zone_name
  app_domain_name                    = var.app_domain_name
}

module "warike_development_cloudfront" {
  source  = "terraform-aws-modules/cloudfront/aws"
  version = "5.0.1"

  ## Configuration
  enabled                        = true
  price_class                    = "PriceClass_100"
  retain_on_delete               = false
  wait_for_deployment            = true
  is_ipv6_enabled                = true
  create_monitoring_subscription = true

  ## Extra CNAMEs
  aliases = [local.app_domain_name]
  comment = "Chat CloudFront Distribution"

  ## Origin access control
  create_origin_access_control = true

  origin_access_control = {
    "chat_lambda_function_url" = {
      description      = "CloudFront access to Lambda Function URL"
      origin_type      = "lambda"
      signing_behavior = "always"
      signing_protocol = "sigv4"
    }
  }

  origin = {
    "chat_lambda_function_url" = {
      domain_name           = trimsuffix(replace(module.warike_development_lambda_chat.lambda_function_url, "https://", ""), "/")
      origin_access_control = local.cloudfront_oac_lambda_function_url
      custom_origin_config = {
        http_port              = 80
        https_port             = 443
        origin_protocol_policy = "https-only"
        origin_ssl_protocols   = ["TLSv1.2"]
      }
    }
  }

  default_cache_behavior = {
    target_origin_id       = local.cloudfront_oac_lambda_function_url
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["HEAD", "DELETE", "POST", "GET", "OPTIONS", "PUT", "PATCH"]
    cached_methods         = ["GET", "HEAD", "OPTIONS"]

    ## Cache policy disabled
    cache_policy_id          = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"
    origin_request_policy_id = "b689b0a8-53d0-40ab-baf2-68738e2966ac"

    ## Forwarded values disabled
    use_forwarded_values = false

    ## TTL settings
    min_ttl     = 0
    default_ttl = 0
    max_ttl     = 0
    compress    = true

  }

  viewer_certificate = {
    acm_certificate_arn = module.warike_development_acm.acm_certificate_arn
    ssl_support_method  = "sni-only"
  }

  tags = merge(local.tags, { Name = local.cloudfront_oac_lambda_function_url })

  depends_on = [
    module.warike_development_acm,
    module.warike_development_lambda_chat,
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
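&lt;p&gt;The &lt;code&gt;domain_name&lt;/code&gt; expression in the origin block strips the scheme and the trailing slash from the Lambda Function URL, since a CloudFront origin takes a bare host name. Here is a minimal sketch of the same transformation in TypeScript. The function name and the sample URL are illustrative assumptions and are not part of the repository.&lt;/p&gt;

```typescript
// Hypothetical helper mirroring trimsuffix(replace(url, "https://", ""), "/").
function originDomainFromFunctionUrl(functionUrl: string): string {
    // Remove every "https://" occurrence, as Terraform's replace() does.
    const withoutScheme = functionUrl.split("https://").join("");
    // Remove a single trailing slash, as trimsuffix() does.
    return withoutScheme.endsWith("/") ? withoutScheme.slice(0, -1) : withoutScheme;
}

// Sample value only; a real Function URL is generated by AWS.
console.log(originDomainFromFunctionUrl("https://abc123.lambda-url.us-east-1.on.aws/"));
```

&lt;p&gt;A URL without a trailing slash passes through unchanged apart from the scheme, which matches how &lt;code&gt;trimsuffix&lt;/code&gt; behaves.&lt;/p&gt;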



&lt;h2&gt;
  
  
  &lt;strong&gt;Testing &amp;amp; Observability&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;With the implementation nearly ready, we need a way to verify that the tools are being used correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Running tests for the Lambda functionality.&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The implementation of the tests for the Lambda logic is quite straightforward.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PASS  src/__tests__/mastra/app.spec.ts
PASS  src/__tests__/mastra/agent.spec.ts
PASS  src/__tests__/mastra/tool.spec.ts
PASS  src/__tests__/auth.spec.ts
PASS  src/__tests__/mastra/prompt.spec.ts

Test Suites: 5 passed, 5 total
Tests:       10 passed, 10 total
Snapshots:   0 total
Time:        3.628 s, estimated 4 s
Ran all test suites.
Watch Usage: Press w to show more.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, what we ideally want to know is whether the agent will behave according to the prompt we have defined.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;LangWatch&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This is where I use the &lt;a href="https://scenario.langwatch.ai/" rel="noopener noreferrer"&gt;LangWatch scenarios&lt;/a&gt; tool. It helps me test my AI agent and integrates quite easily with Mastra.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { type AgentAdapter, AgentRole, AgentReturnTypes } from "@langwatch/scenario";
import { mastra } from "../mastra/app.js";

export const basicAgent: AgentAdapter = {
    role: AgentRole.AGENT,
    call: async (input) =&amp;gt; {
        const basic = mastra.getAgent("basic");
        const result = await basic.generate(input.messages);
        return result.response.messages as unknown as AgentReturnTypes;
    },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, the tests for the agent’s behavior are defined.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import scenario from "@langwatch/scenario";
import { describe, it, expect } from "vitest";
import { basicAgent } from "./agent-adapter.spec";

describe("Weather Station RAG Agent", () =&amp;gt; {
    it("should provide accurate calibration instructions", async () =&amp;gt; {
        const result = await scenario.run({
            name: "barometric pressure calibration request",
            description: `The user needs help calibrating their Vantage Pro2 weather station's barometric pressure sensor.`,
            agents: [
                basicAgent,
                scenario.userSimulatorAgent(),
                scenario.judgeAgent({
                    criteria: [
                        "Agent should provide calibration steps directly without asking follow-up questions",
                        "Response should include step-by-step instructions",
                        "Response should cite the Vantage Pro2 Operations Manual as source",
                        "Response should use telegraphic style (imperative verbs, minimal articles)",
                        "Agent should not include greetings or filler text",
                        "Instructions should mention setting correct elevation first",
                    ],
                }),
            ],
            script: [
                scenario.user("How do I calibrate pressure?"),
                scenario.agent(),
                scenario.judge(),
            ],
        });

        expect(result.success).toBe(true);
    }, 30_000);

    it("should use weatherTool for temperature and humidity requests", async () =&amp;gt; {
        const result = await scenario.run({
            name: "current weather data request",
            description: `The user wants current temperature and humidity readings.`,
            agents: [
                basicAgent,
                scenario.userSimulatorAgent(),
                scenario.judgeAgent({
                    criteria: [
                        "Agent should use the weatherTool to fetch current weather data",
                        "Agent should respond with 'Here's the weather data that you requested'",
                        "Agent should not ask follow-up questions",
                    ],
                }),
            ],
            script: [
                scenario.user("What's the current temperature and humidity?"),
                scenario.agent(),
                scenario.user("Station ID is TEST123009"),
                scenario.agent(),
                scenario.judge(),
            ],
        });

        expect(result.success).toBe(true);
    }, 30_000);

    it("should use dataTool for soil moisture requests", async () =&amp;gt; {
        const result = await scenario.run({
            name: "soil moisture data request",
            description: `The user wants current soil moisture data from their weather station.`,
            agents: [
                basicAgent,
                scenario.userSimulatorAgent(),
                scenario.judgeAgent({
                    criteria: [
                        "Agent should request for Station ID",
                        "Agent should use the dataTool to fetch soil moisture data",
                        "Agent should respond with 'Here's the soil moisture data that you requested'",
                        "Agent should not provide additional explanation beyond tool usage",
                    ],
                }),
            ],
            script: [
                scenario.user("Show me current soil moisture levels"),
                scenario.agent(),
                scenario.user("Station ID is TEST123009"),
                scenario.agent(),
                scenario.judge(),
            ],
        });

        expect(result.success).toBe(true);
    }, 30_000);

    it("should use gddTool for growing degree day requests", async () =&amp;gt; {
        const result = await scenario.run({
            name: "gdd data request",
            description: `The user wants current growing degree day data from their weather station.`,
            agents: [
                basicAgent,
                scenario.userSimulatorAgent(),
                scenario.judgeAgent({
                    criteria: [
                        "Agent should request for Station ID",
                        "Agent should use the gddTool to fetch growing degree day data",
                        "Agent should respond with 'Here's the growing degree day data that you requested'",
                        "Agent should not provide additional explanation beyond tool usage",
                    ],
                }),
            ],
            script: [
                scenario.user("Show me current GDD data"),
                scenario.agent(),
                scenario.user("Station ID is TEST123009"),
                scenario.agent(),
                scenario.judge(),
            ],
        });

        expect(result.success).toBe(true);
    }, 30_000);

});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
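&lt;p&gt;The soil moisture and GDD tests differ only in the tool name, the data label, and the opening question. As a rough sketch, the shared judge criteria could be generated from a small table. The &lt;code&gt;ToolScenario&lt;/code&gt; shape and the &lt;code&gt;buildToolCriteria&lt;/code&gt; helper are hypothetical names introduced for illustration; only the criteria strings come from the tests above.&lt;/p&gt;

```typescript
// Hypothetical shape describing one tool-usage scenario.
interface ToolScenario {
    toolName: string;  // tool the agent is expected to call
    dataLabel: string; // wording used in the expected reply
    question: string;  // first user message in the script
}

// Builds the judge criteria shared by the dataTool and gddTool tests.
function buildToolCriteria(s: ToolScenario): string[] {
    return [
        "Agent should request for Station ID",
        `Agent should use the ${s.toolName} to fetch ${s.dataLabel} data`,
        `Agent should respond with 'Here's the ${s.dataLabel} data that you requested'`,
        "Agent should not provide additional explanation beyond tool usage",
    ];
}

const toolScenarios: ToolScenario[] = [
    { toolName: "dataTool", dataLabel: "soil moisture", question: "Show me current soil moisture levels" },
    { toolName: "gddTool", dataLabel: "growing degree day", question: "Show me current GDD data" },
];

// Print the criteria each scenario would hand to the judge.
for (const s of toolScenarios) {
    console.log(s.question, buildToolCriteria(s));
}
```

&lt;p&gt;Each resulting array could then be passed as the &lt;code&gt;criteria&lt;/code&gt; value of &lt;code&gt;scenario.judgeAgent&lt;/code&gt;, keeping the expectations for every tool in one place.&lt;/p&gt;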



&lt;p&gt;And then I can view the results both in my terminal and on the LangWatch platform.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftag1ade4gavmrtchbiz1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftag1ade4gavmrtchbiz1.png" alt="LangWatch platform" width="700" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Additionally, I can configure LangWatch for the observability of my traces.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { OtelExporter } from "@mastra/otel-exporter";
import { env } from "../env.js";

export const observabilityExporter = new OtelExporter({
    provider: {
        custom: {
            endpoint: "https://app.langwatch.ai/api/otel/v1/traces",
            headers: { "Authorization": `Bearer ${env.LANGWATCH_API_KEY}` },
        },
    },
})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
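&lt;p&gt;Because the exporter depends on &lt;code&gt;LANGWATCH_API_KEY&lt;/code&gt;, it can help to fail fast when the key is missing instead of sending unauthenticated traces. The sketch below builds the same &lt;code&gt;custom&lt;/code&gt; provider settings with that check. The &lt;code&gt;buildLangWatchProvider&lt;/code&gt; helper and the placeholder key are assumptions for illustration and are not part of the repository.&lt;/p&gt;

```typescript
// Hypothetical helper: builds the custom provider settings shown above
// and throws early when the API key is missing or empty.
function buildLangWatchProvider(apiKey: string | undefined) {
    if (!apiKey) {
        throw new Error("LANGWATCH_API_KEY is not set");
    }
    return {
        custom: {
            endpoint: "https://app.langwatch.ai/api/otel/v1/traces",
            headers: { Authorization: `Bearer ${apiKey}` },
        },
    };
}

// Placeholder key for illustration only.
const provider = buildLangWatchProvider("example-key");
console.log(provider.custom.headers.Authorization);
```

&lt;p&gt;The returned object has the same shape as the &lt;code&gt;provider&lt;/code&gt; option passed to &lt;code&gt;OtelExporter&lt;/code&gt; in the snippet above.&lt;/p&gt;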



&lt;p&gt;Then you can see the traces in the &lt;a href="https://app.langwatch.ai/" rel="noopener noreferrer"&gt;dashboard&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyabi7tajtmm7jyw9jef5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyabi7tajtmm7jyw9jef5.png" alt="Tracing details in LangWatch" width="700" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Testing the app manually&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To test the application, I will use the iPhone simulator.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc59b4rdx3rjegicx7ea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc59b4rdx3rjegicx7ea.png" alt="Sequence of app screenshots." width="700" height="525"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Et voilà!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusions&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This article demonstrates a complete, end-to-end implementation for deploying a powerful, private AI assistant accessible via a mobile application.&lt;/p&gt;

&lt;p&gt;By combining the accessibility of Expo on the frontend with the robustness of AWS Lambda and S3 on the backend, we created a scalable RAG solution.&lt;/p&gt;

&lt;p&gt;A core takeaway is the effective orchestration of the AI Agent using Mastra. The agent is configured with multiple tools, including RAG for documentation and specific data fetching functionalities. This tool-calling capability is crucial for delivering diverse and context-specific responses directly to the user.&lt;/p&gt;

&lt;p&gt;Furthermore, integrating LangWatch proved invaluable for ensuring the agent’s reliability. The use of scenarios for testing agent behavior, especially its tool-use decisions, confirms that the system adheres to the defined prompts and requirements before deployment.&lt;/p&gt;

&lt;p&gt;This makes a personal assistant a very achievable goal.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>expo</category>
      <category>mastra</category>
      <category>rag</category>
    </item>
    <item>
      <title>Launching your RAG system on AWS: CloudFront, Lambda, Bedrock &amp; S3 Vectors</title>
      <dc:creator>Sergio Esteban</dc:creator>
      <pubDate>Mon, 24 Nov 2025 19:05:52 +0000</pubDate>
      <link>https://dev.to/sergioestebance/launching-your-rag-system-on-aws-cloudfront-lambda-bedrock-s3-vectors-pk</link>
      <guid>https://dev.to/sergioestebance/launching-your-rag-system-on-aws-cloudfront-lambda-bedrock-s3-vectors-pk</guid>
      <description>&lt;p&gt;A step-by-step guide with AI SDK, AWS and Terraform&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this post I revisit the &lt;a href="https://medium.com/warike/launching-your-ai-agent-on-aws-bedrock-lambda-api-gateway-bd0934f3697b" rel="noopener noreferrer"&gt;implementation&lt;/a&gt; of an AI Agent, this time adding the ability to return responses tied to a specific context.&lt;/p&gt;

&lt;h2&gt;
  
  
  TLDRS
&lt;/h2&gt;

&lt;p&gt;All code used in this article is available in the &lt;a href="https://github.com/warike/aws-lambda-cloudfront-stream" rel="noopener noreferrer"&gt;repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;As usual, I rely on Serverless services to launch the experiment quickly. With the new capabilities AWS is rolling out, the agent will stream responses in real time as they are generated.&lt;/p&gt;

&lt;p&gt;Below is a high level diagram representing the target architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh0i3dm0mjlz2eask1i0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh0i3dm0mjlz2eask1i0.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Scope
&lt;/h2&gt;

&lt;p&gt;For this project I will implement a RAG system using serverless services.&lt;/p&gt;

&lt;p&gt;We can outline the following requirements for a low volume of 1000 requests per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  Functional requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The system should allow sending messages and receiving a response in real time as it is generated.&lt;/li&gt;
&lt;li&gt;The system should restrict access only through the predefined domain.&lt;/li&gt;
&lt;li&gt;The system should respond only when the question is related to the previously provided content.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Non-functional requirements
&lt;/h3&gt;

&lt;p&gt;As usual we can rely on serverless capabilities and add a few additional characteristics.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The system should be highly available&lt;/li&gt;
&lt;li&gt;The system should scale automatically to handle variable traffic patterns&lt;/li&gt;
&lt;li&gt;The system should remain secure with domain level access control and encrypted traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Out of Scope
&lt;/h2&gt;

&lt;p&gt;This article does not cover data ingestion for the context that the system will use when answering.&lt;/p&gt;

&lt;p&gt;However, notebooks are available for testing.&lt;/p&gt;

&lt;p&gt;It is also worth mentioning that, although this approach is possible, it is not necessarily the ideal way to expose an agent.&lt;/p&gt;

&lt;p&gt;With that out of the way, let’s move on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost breakdown
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;We are running a serverless RAG application with the following components.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For a volume of 1,000 requests per month, the estimated costs are as follows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bedrock API Calls
&lt;/h3&gt;

&lt;p&gt;We will use the most cost effective models, Nova Micro and Titan Embed v2, for language inference and embeddings.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nova Micro input (500K tokens × 0.00035 USD per 1K tokens) plus output (500K tokens × 0.0014 USD per 1K tokens)&lt;/li&gt;
&lt;li&gt;Titan Embed v2 (500K tokens × 0.0001 USD per 1K tokens) plus vector search operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives us an approximate total of 1.05 USD per month.&lt;/p&gt;
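&lt;p&gt;To make the arithmetic explicit, the sketch below recomputes the token-based part of the estimate. It assumes each rate is quoted per 1,000 tokens, which is the only reading consistent with a total near 1 USD. The type and function names are illustrative. The token lines add up to 0.925 USD; the 1.05 USD figure also covers vector search operations, which are not itemized here.&lt;/p&gt;

```typescript
// Assumption: each rate is USD per 1,000 tokens.
interface TokenLine {
    label: string;
    tokens: number;         // tokens per month
    usdPer1kTokens: number; // rate taken from the list above
}

const bedrockLines: TokenLine[] = [
    { label: "Nova Micro input", tokens: 500000, usdPer1kTokens: 0.00035 },
    { label: "Nova Micro output", tokens: 500000, usdPer1kTokens: 0.0014 },
    { label: "Titan Embed v2", tokens: 500000, usdPer1kTokens: 0.0001 },
];

// Cost of one line: thousands of tokens times the per-1K rate.
function lineCost(line: TokenLine): number {
    return (line.tokens / 1000) * line.usdPer1kTokens;
}

// Sum of all token-based lines, excluding vector search operations.
function tokenCostTotal(lines: TokenLine[]): number {
    return lines.reduce((sum, line) => sum + lineCost(line), 0);
}

for (const line of bedrockLines) {
    console.log(`${line.label}: ${lineCost(line).toFixed(3)} USD`);
}
console.log(`Token subtotal: ${tokenCostTotal(bedrockLines).toFixed(3)} USD`);
```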

&lt;h3&gt;
  
  
  AWS Lambda
&lt;/h3&gt;

&lt;p&gt;We will configure a function with 512 MB and a 60 second timeout. This results in roughly 0.20 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  CloudFront
&lt;/h3&gt;

&lt;p&gt;We will use PriceClass_100 (US and Europe) assuming minimal data transfer and HTTPS requests.&lt;/p&gt;

&lt;p&gt;This results in about 0.085 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  Container Registry
&lt;/h3&gt;

&lt;p&gt;We use it to store the Lambda function image, estimating 1 GB for the Docker image and keeping all traffic within the same region.&lt;/p&gt;

&lt;p&gt;This results in about 0.10 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  Route53
&lt;/h3&gt;

&lt;p&gt;We will use a hosted zone regardless of traffic.&lt;/p&gt;

&lt;p&gt;This results in about 0.50 USD per month.&lt;/p&gt;

&lt;h3&gt;
  
  
  CloudWatch
&lt;/h3&gt;

&lt;p&gt;We use it for logging with a seven-day retention period.&lt;/p&gt;

&lt;p&gt;This results in about 0.01 USD per month.&lt;/p&gt;

&lt;p&gt;This brings us to a total of approximately 1.95 USD per month, which fits the low-volume requirements we defined.&lt;/p&gt;
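&lt;p&gt;As a quick check, the per-service estimates can be summed in a few lines. The figures are copied from the sections above, and the variable and function names are only illustrative. The sum comes to 1.945 USD, which rounds to the stated total.&lt;/p&gt;

```typescript
// Monthly estimates in USD, copied from the cost breakdown.
const monthlyCosts: { service: string; usd: number }[] = [
    { service: "Bedrock API Calls", usd: 1.05 },
    { service: "AWS Lambda", usd: 0.20 },
    { service: "CloudFront", usd: 0.085 },
    { service: "Container Registry", usd: 0.10 },
    { service: "Route53", usd: 0.50 },
    { service: "CloudWatch", usd: 0.01 },
];

// Adds up the monthly estimate of every listed service.
function totalMonthlyCost(items: { service: string; usd: number }[]): number {
    return items.reduce((sum, item) => sum + item.usd, 0);
}

console.log(`Estimated total: ${totalMonthlyCost(monthlyCosts).toFixed(3)} USD per month`);
```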

&lt;p&gt;Let’s continue.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  S3 Vector
&lt;/h3&gt;

&lt;p&gt;Terraform does not currently support S3 Vector resources, so everything must be created using the AWS CLI.&lt;/p&gt;

&lt;p&gt;First, create the bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3vectors create-vector-bucket &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--vector-bucket-name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$VECTOR_BUCKET_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profile&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROFILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we will create the Index inside the bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3vectors create-index &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--vector-bucket-name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$VECTOR_BUCKET_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--index-name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$VECTOR_BUCKET_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data-type&lt;/span&gt; float32 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dimension&lt;/span&gt; 1024 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--distance-metric&lt;/span&gt; cosine &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--metadata-configuration&lt;/span&gt; &lt;span class="nv"&gt;nonFilterableMetadataKeys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;,chunk &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profile&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROFILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To continue, we will need the ARN of the index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3vectors list-indexes &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vector-bucket-name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$VECTOR_BUCKET_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--profile&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROFILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"indexes[?indexName=='&lt;/span&gt;&lt;span class="nv"&gt;$INDEX_NAME&lt;/span&gt;&lt;span class="s2"&gt;'].indexArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Unable to retrieve ARN"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With these resources in place we can move forward using Terraform.&lt;/p&gt;

&lt;p&gt;Unlike my previous article, this version will stream responses as they are generated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda creation
&lt;/h3&gt;

&lt;p&gt;To support streamed responses, we wrap the handler in &lt;code&gt;awslambda.streamifyResponse&lt;/code&gt; and use &lt;code&gt;streamText&lt;/code&gt; from the AI SDK, with event types from &lt;code&gt;@types/aws-lambda&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The following is the final result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;node:stream/promises&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./bedrock/config&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;streamText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;bedrock&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./bedrock/model&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;findRelevantContent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./bedrock/query-vector&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;APIGatewayProxyEventV2&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-lambda&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;awslambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;streamifyResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;APIGatewayProxyEventV2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setContentType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/plain; charset=utf-8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Failed to parse request body:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Error: No prompt provided&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;modelId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;similarDocuments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;findRelevantContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;similarDocuments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`You are a helpful assistant. Answer the user's question using ONLY the context provided below. 
            If the answer is not in the context, say "I don't know" or "The provided context does not contain the answer."
            Do not hallucinate or use outside knowledge.

            Context:
            &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error in handler:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="nx"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the general logic ready, we can move on to implementing semantic search inside our Vector Index.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semantic search using S3 Vectors
&lt;/h3&gt;

&lt;p&gt;We need to take the user message and generate embeddings using the &lt;code&gt;amazon.titan-embed-text-v2:0&lt;/code&gt; model.&lt;/p&gt;

&lt;p&gt;Once we get the query response from the vector index, we parse the result according to our needs. In this case, we know each vector's metadata carries a &lt;code&gt;chunk&lt;/code&gt; field with the raw text, which we return as the relevant content.&lt;/p&gt;

&lt;p&gt;With this simple flow we return additional context to the model so it can produce a response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;embed&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;bedrock&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../env&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;QueryVectorsCommand&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;S3VectorsClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@aws-sdk/client-s3vectors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./config&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;VectorMetadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
    &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;s3Vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;S3VectorsClient&lt;/span&gt;&lt;span class="p"&gt;({})&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;findRelevantContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replaceAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s1"&gt;n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;modelId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;amazon.titan-embed-text-v2:0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userQueryEmbedded&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;textEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;providerOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;indexArn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_VECTOR_BUCKET_INDEX_ARN&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;queryVector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userQueryEmbedded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;topK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;returnMetadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;returnDistance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;s3Vectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;QueryVectorsCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vectors&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No vectors found for the query&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;VectorMetadata&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Metadata is required in the vector response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Vector metadata must contain id and chunk fields&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that the retrieved results are not always the right context: the nearest vectors are only as good as the indexed data. To get higher-quality responses, we need to invest more time in preparing the data we index.&lt;/p&gt;
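
&lt;p&gt;One possible way to reduce irrelevant context is to drop results whose distance exceeds a cutoff before building the prompt. The following is a minimal sketch, not part of the original repository: &lt;code&gt;selectContext&lt;/code&gt;, &lt;code&gt;isCloseEnough&lt;/code&gt; and the &lt;code&gt;MAX_DISTANCE&lt;/code&gt; value are hypothetical, and it assumes that a smaller distance means a closer match for the metric configured on the index.&lt;/p&gt;

```typescript
// Shape of the items returned by findRelevantContent in this article.
interface RetrievedChunk {
  id: string;
  chunk: string;
  key?: string;
  distance?: number;
}

// Hypothetical cutoff: tune it against your own data and distance metric.
const MAX_DISTANCE = 0.5;

// A result qualifies only if it reports a distance within the cutoff.
function isCloseEnough(r: RetrievedChunk, maxDistance: number): boolean {
  if (r.distance === undefined) return false;
  return maxDistance >= r.distance;
}

// Filters the results, orders them from closest to farthest,
// and joins the raw text into a single context string.
export function selectContext(
  results: RetrievedChunk[],
  maxDistance: number = MAX_DISTANCE,
): string {
  return results
    .filter((r) => isCloseEnough(r, maxDistance))
    .sort((a, b) => (a.distance ?? 0) - (b.distance ?? 0))
    .map((r) => r.chunk)
    .join("\n\n");
}
```

&lt;p&gt;In the handler, the result of &lt;code&gt;selectContext(similarDocuments)&lt;/code&gt; could take the place of the plain &lt;code&gt;map&lt;/code&gt; and &lt;code&gt;join&lt;/code&gt; call. An empty string then signals that nothing relevant was found.&lt;/p&gt;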

&lt;h3&gt;
  
  
  Docker
&lt;/h3&gt;

&lt;p&gt;To package our solution we define a Dockerfile for the Lambda function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# ---- Build Stage ----&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;public.ecr.aws/lambda/nodejs:22&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /usr/src/app&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;corepack &lt;span class="nb"&gt;enable&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package.json pnpm-lock.yaml* ./&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--frozen-lockfile&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm run build

&lt;span class="c"&gt;# ---- Runtime Stage ----&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/nodejs:22&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${LAMBDA_TASK_ROOT}&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/dist/ ./&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/node_modules ./node_modules&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; [ "/lambda-entrypoint.sh", "index.handler" ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this in place we can move on to creating the cloud resources.&lt;/p&gt;

&lt;h3&gt;
  
  
  Infrastructure
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Container registry
&lt;/h3&gt;

&lt;p&gt;We need a centralized location to host our Docker images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_ecr_repository"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_ecr"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ecr-${local.project_name}"&lt;/span&gt;
  &lt;span class="nx"&gt;image_tag_mutability&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"IMMUTABLE_WITH_EXCLUSION"&lt;/span&gt;

  &lt;span class="nx"&gt;encryption_configuration&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;encryption_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AES256"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;image_scanning_configuration&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;scan_on_push&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;image_tag_mutability_exclusion_filter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;filter&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"*latest"&lt;/span&gt;
    &lt;span class="nx"&gt;filter_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"WILDCARD"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;force_delete&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lambda module
&lt;/h3&gt;

&lt;p&gt;To deploy our Lambda function from the Docker image, we set &lt;code&gt;invoke_mode&lt;/code&gt; to &lt;code&gt;RESPONSE_STREAM&lt;/code&gt;, enable &lt;code&gt;create_lambda_function_url&lt;/code&gt;, and use &lt;code&gt;AWS_IAM&lt;/code&gt; as the authorization type.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;lambda_chat&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"chat-${basename(path.cwd)}"&lt;/span&gt;
    &lt;span class="nx"&gt;image&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${aws_ecr_repository.warike_development_ecr.repository_url}:chat-latest"&lt;/span&gt;
    &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Lambda chat function for ${local.project_name}"&lt;/span&gt;
    &lt;span class="nx"&gt;memory_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;
    &lt;span class="nx"&gt;timeout&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
    &lt;span class="nx"&gt;env_vars&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;AWS_BEARER_TOKEN_BEDROCK&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_bearer_token_bedrock&lt;/span&gt;
      &lt;span class="nx"&gt;AWS_BEDROCK_MODEL&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_bedrock_model&lt;/span&gt;
      &lt;span class="nx"&gt;AWS_VECTOR_BUCKET_INDEX_ARN&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vector_bucket_index_arn&lt;/span&gt;
      &lt;span class="nx"&gt;AWS_BEDROCK_MODEL_EMBEDDING&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_bedrock_model_embedding&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;## Lambda Chat&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_lambda_chat"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 8.1.2"&lt;/span&gt;

  &lt;span class="c1"&gt;## Configuration&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;
  &lt;span class="nx"&gt;memory_size&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;memory_size&lt;/span&gt;
  &lt;span class="nx"&gt;timeout&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timeout&lt;/span&gt;

  &lt;span class="c1"&gt;## Package&lt;/span&gt;
  &lt;span class="nx"&gt;create_package&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;package_type&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Image"&lt;/span&gt;
  &lt;span class="nx"&gt;image_uri&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image&lt;/span&gt;
  &lt;span class="nx"&gt;environment_variables&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env_vars&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c1"&gt;## API Gateway&lt;/span&gt;
  &lt;span class="nx"&gt;create_current_version_allowed_triggers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="c1"&gt;## Permissions&lt;/span&gt;
  &lt;span class="nx"&gt;create_role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;lambda_role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_chat_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;

  &lt;span class="c1"&gt;## Logging&lt;/span&gt;
  &lt;span class="nx"&gt;use_existing_cloudwatch_log_group&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;logging_log_group&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_chat_logs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;logging_log_format&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"JSON"&lt;/span&gt;
  &lt;span class="nx"&gt;logging_application_log_level&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"INFO"&lt;/span&gt;
  &lt;span class="nx"&gt;logging_system_log_level&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"WARN"&lt;/span&gt;

  &lt;span class="c1"&gt;## Response Streaming&lt;/span&gt;
  &lt;span class="nx"&gt;invoke_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RESPONSE_STREAM"&lt;/span&gt;

  &lt;span class="c1"&gt;## Lambda Function URL&lt;/span&gt;
  &lt;span class="nx"&gt;create_lambda_function_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;authorization_type&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AWS_IAM"&lt;/span&gt;

  &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;allow_credentials&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;allow_origins&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;allow_methods&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;allow_headers&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;expose_headers&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;max_age&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_chat_logs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;aws_ecr_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_ecr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;null_resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_build_image_seed&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
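
&lt;p&gt;The environment variables set in &lt;code&gt;env_vars&lt;/code&gt; are read by the function at runtime. As a minimal sketch, assuming we want a clear error instead of an undefined value, the function could validate them before handling requests. The helpers &lt;code&gt;findMissingEnvVars&lt;/code&gt; and &lt;code&gt;assertEnv&lt;/code&gt; are hypothetical and not part of the original repository.&lt;/p&gt;

```typescript
// Variable names taken from the Terraform locals block.
const REQUIRED_ENV_VARS = [
  "AWS_BEARER_TOKEN_BEDROCK",
  "AWS_BEDROCK_MODEL",
  "AWS_VECTOR_BUCKET_INDEX_ARN",
  "AWS_BEDROCK_MODEL_EMBEDDING",
];

type EnvSource = { [name: string]: string | undefined };

// Returns the names that are unset or blank in the given source.
export function findMissingEnvVars(
  source: EnvSource,
  required: string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter((name) => {
    const value = source[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws a single descriptive error listing every missing variable.
export function assertEnv(source: EnvSource): void {
  const missing = findMissingEnvVars(source);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

&lt;p&gt;Calling &lt;code&gt;assertEnv(process.env)&lt;/code&gt; once at module load would surface a misconfigured deployment on the first invocation rather than in the middle of a request.&lt;/p&gt;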



&lt;h3&gt;
  
  
  ACM and Route 53
&lt;/h3&gt;

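&lt;p&gt;Next, to access our Lambda function through CloudFront, we define the DNS records and an ACM certificate. Keep in mind that a certificate used by CloudFront must be issued in the &lt;code&gt;us-east-1&lt;/code&gt; region.&lt;br&gt;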
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;## Amazon Certificate Manager&lt;/span&gt;
module &lt;span class="s2"&gt;"warike_development_acm"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;source&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/acm/aws"&lt;/span&gt;
  version &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 6.1.0"&lt;/span&gt;

  domain_name               &lt;span class="o"&gt;=&lt;/span&gt; local.domain_name
  zone_id                   &lt;span class="o"&gt;=&lt;/span&gt; data.aws_route53_zone.warike_development_warike_tech.id
  subject_alternative_names &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*.&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.domain_name&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;

  validation_method &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"DNS"&lt;/span&gt;

  tags &lt;span class="o"&gt;=&lt;/span&gt; local.tags
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;## Route 53 - Hosted Zone&lt;/span&gt;
data &lt;span class="s2"&gt;"aws_route53_zone"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_warike_tech"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  name &lt;span class="o"&gt;=&lt;/span&gt; local.domain_name
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;## Route 53 - Apex Record&lt;/span&gt;
resource &lt;span class="s2"&gt;"aws_route53_record"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_apex_record"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  zone_id &lt;span class="o"&gt;=&lt;/span&gt; data.aws_route53_zone.warike_development_warike_tech.zone_id
  name    &lt;span class="o"&gt;=&lt;/span&gt; local.domain_name
  &lt;span class="nb"&gt;type&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"A"&lt;/span&gt;

  &lt;span class="nb"&gt;alias&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    name                   &lt;span class="o"&gt;=&lt;/span&gt; module.warike_development_cloudfront.cloudfront_distribution_domain_name
    zone_id                &lt;span class="o"&gt;=&lt;/span&gt; module.warike_development_cloudfront.cloudfront_distribution_hosted_zone_id
    evaluate_target_health &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;

  depends_on &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;
    module.warike_development_cloudfront,
  &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With everything above in place we can start configuring CloudFront.&lt;/p&gt;

&lt;h3&gt;
  
  
  CloudFront
&lt;/h3&gt;

&lt;p&gt;For CloudFront, keep in mind that the module also creates an Origin Access Control (OAC), which signs the requests CloudFront sends to the Lambda Function URL.&lt;/p&gt;

&lt;p&gt;The origin's domain name is derived from the &lt;code&gt;lambda_function_url&lt;/code&gt; output of the Lambda module.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_cloudfront"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/cloudfront/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.0.1"&lt;/span&gt;

  &lt;span class="c1"&gt;## Configuration&lt;/span&gt;
  &lt;span class="nx"&gt;enabled&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;price_class&lt;/span&gt;                    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PriceClass_100"&lt;/span&gt;
  &lt;span class="nx"&gt;retain_on_delete&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;wait_for_deployment&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;is_ipv6_enabled&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;create_monitoring_subscription&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="c1"&gt;## Extra CNAMEs&lt;/span&gt;
  &lt;span class="nx"&gt;aliases&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"${local.domain_name}"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;comment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Chat CloudFront Distribution"&lt;/span&gt;

  &lt;span class="c1"&gt;## Origin access control&lt;/span&gt;
  &lt;span class="nx"&gt;create_origin_access_control&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;origin_access_control&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"chat_lambda_function_url"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;description&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"CloudFront access to Lambda Function URL"&lt;/span&gt;
      &lt;span class="nx"&gt;origin_type&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"lambda"&lt;/span&gt;
      &lt;span class="nx"&gt;signing_behavior&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"always"&lt;/span&gt;
      &lt;span class="nx"&gt;signing_protocol&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sigv4"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;origin&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"chat_lambda_function_url"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;domain_name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;trimsuffix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"https://"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s2"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nx"&gt;origin_access_control&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cloudfront_oac_lambda_function_url&lt;/span&gt;
      &lt;span class="nx"&gt;custom_origin_config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;http_port&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
        &lt;span class="nx"&gt;https_port&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt;
        &lt;span class="nx"&gt;origin_protocol_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"match-viewer"&lt;/span&gt;
        &lt;span class="nx"&gt;origin_ssl_protocols&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"TLSv1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"TLSv1.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"TLSv1.2"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;default_cache_behavior&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;target_origin_id&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cloudfront_oac_lambda_function_url&lt;/span&gt;
    &lt;span class="nx"&gt;viewer_protocol_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"redirect-to-https"&lt;/span&gt;
    &lt;span class="nx"&gt;allowed_methods&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"HEAD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"DELETE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"POST"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"OPTIONS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"PUT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"PATCH"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;cached_methods&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"HEAD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"OPTIONS"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;## Cache policy disabled&lt;/span&gt;
    &lt;span class="nx"&gt;cache_policy_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"4135ea2d-6df8-44a3-9df3-4b5a84be39ad"&lt;/span&gt;

    &lt;span class="c1"&gt;## Forwarded values disabled&lt;/span&gt;
    &lt;span class="nx"&gt;use_forwarded_values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

    &lt;span class="c1"&gt;## TTL settings&lt;/span&gt;
    &lt;span class="nx"&gt;min_ttl&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nx"&gt;default_ttl&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nx"&gt;max_ttl&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nx"&gt;compress&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

    &lt;span class="nx"&gt;function_association&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;viewer-request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;function_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudfront_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_restrict_domain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;viewer_certificate&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;acm_certificate_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_acm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;acm_certificate_arn&lt;/span&gt;
    &lt;span class="nx"&gt;ssl_support_method&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sni-only"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cloudfront_oac_lambda_function_url&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_acm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_chat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudfront_function"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_restrict_domain"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"restrict-domain-${local.project_name}"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cloudfront-js-1.0"&lt;/span&gt;
  &lt;span class="nx"&gt;comment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Restrict access to custom domain only"&lt;/span&gt;
  &lt;span class="nx"&gt;publish&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;code&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"${path.module}/functions/auth.js"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
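
&lt;p&gt;To make the &lt;code&gt;domain_name&lt;/code&gt; expression above easier to follow, here is a minimal Python sketch of what &lt;code&gt;trimsuffix(replace(url, "https://", ""), "/")&lt;/code&gt; does to a Function URL. The function name &lt;code&gt;origin_domain&lt;/code&gt; and the sample URL are hypothetical and only serve as illustration.&lt;/p&gt;

```python
def origin_domain(function_url):
    """Mimic the Terraform expression: drop the scheme, then one trailing slash."""
    # replace(url, "https://", "") removes every occurrence of the scheme
    without_scheme = function_url.replace("https://", "")
    # trimsuffix(value, "/") removes a single trailing slash if present
    return without_scheme.removesuffix("/")


# Hypothetical Function URL, used only for illustration
sample_url = "https://abc123example.lambda-url.us-east-1.on.aws/"
print(origin_domain(sample_url))
```

&lt;p&gt;CloudFront expects a bare host name for an origin, which is why both the scheme and the trailing slash have to go.&lt;/p&gt;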



&lt;h3&gt;
  
  
  CloudFront function
&lt;/h3&gt;

&lt;p&gt;I also added a lightweight CloudFront function that rejects, with a 403 response, any request whose &lt;code&gt;Host&lt;/code&gt; header does not match the predefined custom domain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="err"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;host&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;allowedDomain&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'dev.zaistev.com'&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;

    &lt;span class="nx"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt; &lt;span class="err"&gt;!&lt;/span&gt;&lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="nx"&gt;allowedDomain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;statusDescription&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'Forbidden'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;"content-type"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"value"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"text/plain"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"encoding"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"value"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"Access denied. Please use the custom domain."&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the main resources in place we can move on to the actual cloud deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment
&lt;/h3&gt;

&lt;p&gt;I created a Makefile with a set of targets that automate the deployment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;.PHONY: deploy compile-server push-image terraform-apply

&lt;span class="c"&gt;# Extract current version from terraform file (e.g., chat-v1)&lt;/span&gt;
CURRENT_VERSION :&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;shell &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s1"&gt;'chat-v[0-9]*'&lt;/span&gt; infra/lambda-chat.tf | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 1&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# Extract the number part (e.g., 1)&lt;/span&gt;
VERSION_NUM :&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;shell &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;CURRENT_VERSION&lt;span class="si"&gt;)&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/chat-v//'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# Increment the number&lt;/span&gt;
NEXT_VERSION_NUM :&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;shell &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$$&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;VERSION_NUM&lt;span class="si"&gt;)&lt;/span&gt; + 1&lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# Form the new version string (e.g., chat-v2)&lt;/span&gt;
NEXT_VERSION :&lt;span class="o"&gt;=&lt;/span&gt; chat-v&lt;span class="si"&gt;$(&lt;/span&gt;NEXT_VERSION_NUM&lt;span class="si"&gt;)&lt;/span&gt;

deploy: compile-server push-image terraform-apply &lt;span class="nb"&gt;test

&lt;/span&gt;compile-server:
    @echo &lt;span class="s2"&gt;"Compiling server app..."&lt;/span&gt;
    &lt;span class="nb"&gt;cd &lt;/span&gt;apps/server &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pnpm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pnpm run build

push-image:
    @echo &lt;span class="s2"&gt;"Current version: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;CURRENT_VERSION&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    @echo &lt;span class="s2"&gt;"Next version: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;NEXT_VERSION&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    @echo &lt;span class="s2"&gt;"Building and pushing Docker image..."&lt;/span&gt;
    ./infra/push_chat_image.sh &lt;span class="si"&gt;$(&lt;/span&gt;NEXT_VERSION&lt;span class="si"&gt;)&lt;/span&gt;
    @echo &lt;span class="s2"&gt;"Updating Terraform configuration..."&lt;/span&gt;
    &lt;span class="c"&gt;# Update the version in the terraform file&lt;/span&gt;
    &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="s1"&gt;'s/$(CURRENT_VERSION)/$(NEXT_VERSION)/g'&lt;/span&gt; infra/lambda-chat.tf

terraform-apply:
    @echo &lt;span class="s2"&gt;"Applying Terraform changes..."&lt;/span&gt;
    &lt;span class="nb"&gt;cd &lt;/span&gt;infra &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; terraform apply &lt;span class="nt"&gt;-auto-approve&lt;/span&gt;

&lt;span class="nb"&gt;test&lt;/span&gt;:
    @echo &lt;span class="s2"&gt;"Running integration tests..."&lt;/span&gt;
    ./infra/test_stream.sh

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
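
&lt;p&gt;The version bump performed by the Makefile variables above can be sketched in Python as follows. This is only an illustration of the logic, not part of the repository: the helper name &lt;code&gt;next_version&lt;/code&gt; and the sample Terraform line are assumptions. Unlike the &lt;code&gt;grep&lt;/code&gt; pattern &lt;code&gt;chat-v[0-9]*&lt;/code&gt;, this sketch requires at least one digit after the prefix.&lt;/p&gt;

```python
import re


def next_version(terraform_text, prefix="chat-v"):
    """Return (current, next) version tags based on the first tag found in the text."""
    match = re.search(re.escape(prefix) + r"(\d+)", terraform_text)
    if match is None:
        raise ValueError("no version tag found")
    current_num = int(match.group(1))
    return f"{prefix}{current_num}", f"{prefix}{current_num + 1}"


# Hypothetical line from infra/lambda-chat.tf, used only for illustration
sample = 'image_tag = "chat-v7"'
current, upcoming = next_version(sample)
print(current, upcoming)
```

&lt;p&gt;Keeping the tag in the Terraform file means the next &lt;code&gt;terraform apply&lt;/code&gt; sees a changed image reference and rolls out the new Lambda image.&lt;/p&gt;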



&lt;p&gt;To run the whole pipeline, execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;At this point the vector index does not contain any context data. We can populate it manually with a short Python script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;def insert_vector&lt;span class="o"&gt;(&lt;/span&gt;text, bucket_name, index_name&lt;span class="o"&gt;)&lt;/span&gt;:
    &lt;span class="s2"&gt;"""
    Insert a single text chunk into the S3 Vector Index.
    """&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;not s3_vectors_client:
        print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"❌ S3 Vectors client is not initialized."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;False

    &lt;span class="k"&gt;if &lt;/span&gt;not bucket_name or not index_name:
        print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"⚠️ S3_VECTOR_BUCKET_NAME or S3_VECTOR_INDEX_NAME is not set."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;False

    &lt;span class="c"&gt;# Generate unique ID&lt;/span&gt;
    unique_id &lt;span class="o"&gt;=&lt;/span&gt; generate_nanoid&lt;span class="o"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;# Generate embedding&lt;/span&gt;
    embedding &lt;span class="o"&gt;=&lt;/span&gt; get_embedding&lt;span class="o"&gt;(&lt;/span&gt;text&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;not embedding:
        print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"❌ Failed to generate embedding for text: {text[:50]}..."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;False

    &lt;span class="c"&gt;# Create metadata&lt;/span&gt;
    metadata &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"id"&lt;/span&gt;: unique_id,
        &lt;span class="s2"&gt;"chunk"&lt;/span&gt;: text
    &lt;span class="o"&gt;}&lt;/span&gt;

    try:
        &lt;span class="c"&gt;# Insert vector into S3 Vector Index&lt;/span&gt;
        response &lt;span class="o"&gt;=&lt;/span&gt; s3_vectors_client.put_vectors&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nv"&gt;vectorBucketName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;bucket_name,
            &lt;span class="nv"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;index_name,
            &lt;span class="nv"&gt;vectors&lt;/span&gt;&lt;span class="o"&gt;=[&lt;/span&gt;
                &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="s2"&gt;"key"&lt;/span&gt;: unique_id,
                    &lt;span class="s2"&gt;"data"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"float32"&lt;/span&gt;: embedding&lt;span class="o"&gt;}&lt;/span&gt;,
                    &lt;span class="s2"&gt;"metadata"&lt;/span&gt;: metadata
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;]&lt;/span&gt;
        &lt;span class="o"&gt;)&lt;/span&gt;

        print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"✅ Inserted vector with ID: {unique_id}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"   Text preview: {text[:80]}..."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;True

    except Exception as e:
        print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"❌ Error inserting vector: {e}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;False

&lt;span class="c"&gt;# Insert all predefined texts&lt;/span&gt;
print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;🚀 Starting vector insertion...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
success_count &lt;span class="o"&gt;=&lt;/span&gt; 0
fail_count &lt;span class="o"&gt;=&lt;/span&gt; 0

&lt;span class="k"&gt;for &lt;/span&gt;i, text &lt;span class="k"&gt;in &lt;/span&gt;enumerate&lt;span class="o"&gt;(&lt;/span&gt;predefined_texts, 1&lt;span class="o"&gt;)&lt;/span&gt;:
    print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;[{i}/{len(predefined_texts)}] Processing..."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;insert_vector&lt;span class="o"&gt;(&lt;/span&gt;text, s3_vector_bucket_name, s3_vector_index_name&lt;span class="o"&gt;)&lt;/span&gt;:
        success_count +&lt;span class="o"&gt;=&lt;/span&gt; 1
    &lt;span class="k"&gt;else&lt;/span&gt;:
        fail_count +&lt;span class="o"&gt;=&lt;/span&gt; 1

print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;📊 Insertion Summary:"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"   ✅ Successful: {success_count}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"   ❌ Failed: {fail_count}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"   📝 Total: {len(predefined_texts)}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the index contains data, a similarity query returns results like the following.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;🔍 Testing with query: &lt;span class="s1"&gt;'Explain how EventBridge works in LLM Workflow context?'&lt;/span&gt;

✅ Found 3 similar vectors:

1. ID: HlgHN84M1CUmWH3q5tzHl
   Distance: 0.4293
   Text: An application emits an event &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;example, &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"type"&lt;/span&gt;: &lt;span class="s2"&gt;"orderCreated"&lt;/span&gt;, &lt;span class="s2"&gt;"priority"&lt;/span&gt;: &lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; Amazon EventBridge evaluates the event against its routing rules. Based on an event&lt;span class="s1"&gt;'s attributes, the system dynamically dispatches to the following: HighPriorityOrderProcessor (service A), StandardOrderProcessor (service B), UpdateOrderProcessor (service C). This pattern supports loose coupling, domain-based specialization, and runtime extensibility. This allows systems to respond intelligently to changing requirements and event semantics.

2. ID: Pv2Y1r0zYKxdFRrToZ0fo
   Distance: 0.5672
   Text: LLM-based routing: In agentic systems, routing also performs dynamic task delegation - but instead of Amazon EventBridge rules or metadata filters, the LLM classifies and interprets the user'&lt;/span&gt;s intent through natural language. The result is a flexible, semantic, and adaptive form of dispatching.

3. ID: Cp98hjhcis6SSNooIuhZ8
   Distance: 0.6386
   Text: Agent router workflow: A user submits a natural language request through an SDK. An Amazon Bedrock agent uses an LLM to classify the task &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;example, legal, technical, or scheduling&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; The agent dynamically routes the task through an action group to invoke the required agent: Domain-specific agent, Specialized tool chain, Custom prompt configuration. The selected handler processes the task and returns a tailored response.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
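
&lt;p&gt;To give an intuition for the &lt;code&gt;Distance&lt;/code&gt; values in the output above, here is a minimal pure-Python sketch of ranking vectors by cosine distance, where a smaller distance means a closer match. It assumes the index uses the cosine metric, which the output itself does not confirm, and it uses toy two-dimensional vectors instead of real embeddings. The names &lt;code&gt;cosine_distance&lt;/code&gt; and &lt;code&gt;top_k&lt;/code&gt; are hypothetical helpers, not part of the S3 Vectors API.&lt;/p&gt;

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, vectors, k=3):
    """Return the k (key, distance) pairs closest to the query, nearest first."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


# Toy vectors standing in for stored embeddings (assumption, not real data)
stored = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for key, distance in top_k([1.0, 0.0], stored):
    print(f"{key}: {distance:.4f}")
```

&lt;p&gt;In the real setup the service computes the ranking, so a helper like this is only useful for reasoning about results or for local tests.&lt;/p&gt;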



&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Building this context-aware agent shows that serverless RAG is both practical and cost-efficient. By pairing the AI SDK with AWS Lambda, we achieved real-time streaming without the operational overhead of a more complex implementation such as WebSockets on Lambda.&lt;/p&gt;

&lt;p&gt;We had to rely on the AWS CLI for S3 Vectors since native Terraform support is currently missing. While this adds a manual step, it effectively removes the need for a dedicated and expensive vector database.&lt;/p&gt;

&lt;p&gt;Ultimately, this architecture provides a secure and scalable starting point. With CloudFront and Docker in place, you have a system that keeps costs remarkably low.&lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.plainenglish.io/uploading-documents-to-s3-vector-buckets-f418594feca3" rel="noopener noreferrer"&gt;https://aws.plainenglish.io/uploading-documents-to-s3-vector-buckets-f418594feca3&lt;/a&gt;&lt;br&gt;
&lt;a href="https://dev.to/aws-builders/ai-sdk-streaming-text-from-lambda-cfd"&gt;https://dev.to/aws-builders/ai-sdk-streaming-text-from-lambda-cfd&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Warike technologies&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>lambda</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>Launching Your AI Agent on AWS: Bedrock, Lambda &amp; API Gateway</title>
      <dc:creator>Sergio Esteban</dc:creator>
      <pubDate>Sat, 08 Nov 2025 14:34:29 +0000</pubDate>
      <link>https://dev.to/sergioestebance/ai-agent-on-aws-bedrock-lambda-api-gateway-3cak</link>
      <guid>https://dev.to/sergioestebance/ai-agent-on-aws-bedrock-lambda-api-gateway-3cak</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this post, I’m going to explore the implementation of a service that allows us to send a question and receive an answer back. Each question is handled as a prompt to a specific model using Bedrock.&lt;/p&gt;

&lt;p&gt;To minimize the costs of this implementation, I’ll be utilizing the services of API Gateway, AWS Lambda, and Amazon Bedrock.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;When we want to implement a Generative AI service, there are several possible paths, and we must take the time to understand the minimal requirements before making a decision.&lt;/p&gt;

&lt;p&gt;In this post, I will demonstrate the process of deploying a simple GenAI solution that we can either connect to our current solution or expose to a third party via API Gateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  TLDR
&lt;/h2&gt;

&lt;p&gt;All the content for this implementation can be found in this &lt;a href="https://github.com/warike/showcase/tree/main/projects/11-aws-lambda-apigw-bedrock" rel="noopener noreferrer"&gt;&lt;strong&gt;repository&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scope
&lt;/h2&gt;

&lt;p&gt;For this implementation, I’m going to define a set of quite simple requirements.&lt;/p&gt;

&lt;p&gt;Below is a diagram that represents the implementation we will be building.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz72fa9zwyvfwamkovym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz72fa9zwyvfwamkovym.png" alt="High level Diagram" width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Functional Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;We should be able to send a prompt and receive a response in the form of a model completion from Nova Micro.&lt;/li&gt;
&lt;li&gt;We should be able to expose an HTTP endpoint to trigger the process of generating a response.&lt;/li&gt;
&lt;li&gt;Let's estimate, for the sake of cost calculations, that we will serve a monthly volume of 100 requests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Non-Functional Requirements
&lt;/h3&gt;

&lt;p&gt;Since we're using AWS serverless services, we can lean on their built-in features for our non-functional requirements.&lt;/p&gt;

&lt;p&gt;For automation, the deployment should be handled completely by GitHub Actions. When it comes to availability, the API needs to maintain a monthly uptime of 99.9% or higher.&lt;/p&gt;

&lt;p&gt;On the security front, we'll need IAM-scoped access for Bedrock, use OpenID Connect for authentication, and ensure all traffic is HTTPS only.&lt;/p&gt;

&lt;p&gt;And finally, for observability, we must have structured logs that capture metadata, token usage, and any errors, all visualized with CloudWatch dashboards.&lt;/p&gt;

&lt;p&gt;Simple.&lt;/p&gt;
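
&lt;p&gt;To make the observability requirement more concrete, here is a minimal sketch of a structured log line that carries metadata and token usage. The &lt;code&gt;buildLogLine&lt;/code&gt; helper, the &lt;code&gt;TokenUsage&lt;/code&gt; shape, and the &lt;code&gt;bedrock.completion&lt;/code&gt; message name are assumptions for illustration and are not part of the repository.&lt;/p&gt;

```typescript
// Hypothetical shape for token usage; adapt it to what your SDK version returns.
type TokenUsage = { inputTokens: number; outputTokens: number };

// Build one JSON log line so CloudWatch can filter on individual fields.
function buildLogLine(
  level: "INFO" | "ERROR",
  message: string,
  fields: { [key: string]: unknown },
): string {
  return JSON.stringify({ level, message, ...fields });
}

// Example values taken from the cost estimate in this post.
const usage: TokenUsage = { inputTokens: 22, outputTokens: 232 };
const line = buildLogLine("INFO", "bedrock.completion", {
  modelId: "amazon.nova-micro-v1:0",
  usage,
});
console.log(line);
```

&lt;p&gt;Writing a single JSON object per log call keeps fields such as &lt;code&gt;usage.outputTokens&lt;/code&gt; queryable, which is what a dashboard on token consumption would need.&lt;/p&gt;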

&lt;h2&gt;
  
  
  What is Out of Scope?
&lt;/h2&gt;

&lt;p&gt;For this implementation, I will be excluding &lt;strong&gt;authentication&lt;/strong&gt;, &lt;strong&gt;sanitization&lt;/strong&gt;, &lt;strong&gt;authorization&lt;/strong&gt; and any &lt;strong&gt;security processes&lt;/strong&gt; applied to the requests. This approach allows me to focus on the general GenAI implementation process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost Breakdown per service
&lt;/h2&gt;

&lt;p&gt;Given the predefined volume, we can estimate an average usage per request of 22 input tokens and 232 output tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Bedrock (Nova Micro)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Nova Micro model processes 2,200 input tokens and 23,200 output tokens monthly, which costs about $0.003/month. Output tokens cost about four times as much as input tokens.&lt;/p&gt;
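
&lt;p&gt;The arithmetic behind that figure can be sketched as follows. The per-token prices are assumptions based on the on-demand Nova Micro rates at the time of writing, so check the current Amazon Bedrock pricing page before relying on them. The function name &lt;code&gt;estimateBedrockCost&lt;/code&gt; is hypothetical.&lt;/p&gt;

```typescript
// Assumed on-demand Nova Micro prices in USD per 1,000 tokens.
// These are illustrative; verify them against the Amazon Bedrock pricing page.
const INPUT_PRICE_PER_1K = 0.000035;
const OUTPUT_PRICE_PER_1K = 0.00014;

// Estimate the monthly Bedrock bill for a given request volume.
function estimateBedrockCost(
  requests: number,
  inputTokensPerRequest: number,
  outputTokensPerRequest: number,
): number {
  const inputTokens = requests * inputTokensPerRequest;
  const outputTokens = requests * outputTokensPerRequest;
  return (
    (inputTokens / 1000) * INPUT_PRICE_PER_1K +
    (outputTokens / 1000) * OUTPUT_PRICE_PER_1K
  );
}

// 100 requests at 22 input and 232 output tokens each.
const monthly = estimateBedrockCost(100, 22, 232);
console.log(`Estimated Bedrock cost: $${monthly.toFixed(4)}/month`);
```

&lt;p&gt;Under these assumed prices, almost all of the cost comes from output tokens, so response length is the main lever if the bill ever matters.&lt;/p&gt;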

&lt;p&gt;&lt;strong&gt;AWS Lambda&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Considering only 100 invocations at &lt;strong&gt;3 to 5 seconds per call&lt;/strong&gt; with 512MB of memory, we are nowhere near the Free Tier limits. Lambda gives us 1 million requests and 400,000 GB-seconds free every month.&lt;/p&gt;
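
&lt;p&gt;To see how far below the limit this is, the compute usage can be expressed in GB-seconds (invocations multiplied by duration and by memory in GB). The sketch below takes the pessimistic 5 seconds per call; the helper name &lt;code&gt;lambdaGbSeconds&lt;/code&gt; is hypothetical.&lt;/p&gt;

```typescript
// Monthly Lambda Free Tier compute allowance, as stated in this post.
const FREE_GB_SECONDS = 400000;

// Compute usage in GB-seconds: invocations x duration x memory in GB.
function lambdaGbSeconds(
  invocations: number,
  seconds: number,
  memoryMb: number,
): number {
  return invocations * seconds * (memoryMb / 1024);
}

// Worst case from the estimate: 100 calls, 5 seconds each, 512 MB.
const used = lambdaGbSeconds(100, 5, 512);
const shareOfFreeTier = (used / FREE_GB_SECONDS) * 100;
console.log(`${used} GB-seconds, ${shareOfFreeTier.toFixed(4)}% of the Free Tier`);
```

&lt;p&gt;With these inputs the function uses 250 GB-seconds, a small fraction of one percent of the monthly allowance.&lt;/p&gt;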

&lt;p&gt;&lt;strong&gt;API Gateway&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the first year, we get 1 million requests free. After that, we are looking at maybe $0.0004/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon ECR&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That 300MB Docker image sitting in ECR costs about a cent a month. After the 500MB Free Tier, we are paying for around 136MB of storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens When It Scales?
&lt;/h2&gt;

&lt;p&gt;If it reaches 1,000 requests monthly, costs are around $0.04. At 10K requests, it's about $0.39. Even at 100K requests, we are only paying about &lt;strong&gt;$3.76/month&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Alrighty, let's move ahead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;p&gt;To begin the implementation, we can start by creating the agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Agent
&lt;/h3&gt;

&lt;p&gt;To create the agent, I first need to set up the complete TypeScript project.&lt;/p&gt;

&lt;p&gt;We can move quickly by following these instructions and installing the necessary dependencies so the project can operate correctly.&lt;/p&gt;

&lt;p&gt;Feel free to use the configuration that suits you best.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir -p handler terraform 
cd handler
pnpm init
pnpm --package=typescript dlx tsc --init
mkdir -p src __tests__
touch src/{app,env,index}.ts 

pnpm add -D @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock
pnpm add zod dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this project, I will define &lt;strong&gt;three fundamental parts&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Logic for compatibility&lt;/strong&gt; with AWS Lambda.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logic to manage requests as prompts&lt;/strong&gt;, solicit the text generation based on these, and handle their potential future evolution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core logic for the actual text generation&lt;/strong&gt; using the &lt;strong&gt;&lt;code&gt;@ai-sdk&lt;/code&gt;&lt;/strong&gt; and Bedrock invocations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Below is a summary of the project’s main files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;// index.ts
import &lt;span class="o"&gt;{&lt;/span&gt; main &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"./app"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;const handler &lt;span class="o"&gt;=&lt;/span&gt; async &lt;span class="o"&gt;(&lt;/span&gt;event: any, context: any&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    try &lt;span class="o"&gt;{&lt;/span&gt;
        const body &lt;span class="o"&gt;=&lt;/span&gt; event.body ? JSON.parse&lt;span class="o"&gt;(&lt;/span&gt;event.body&lt;span class="o"&gt;)&lt;/span&gt; : &lt;span class="o"&gt;{}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        const prompt &lt;span class="o"&gt;=&lt;/span&gt; body.prompt ?? &lt;span class="s2"&gt;"Welcome from Warike technologies - GenAI solutions architecture"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        const response &lt;span class="o"&gt;=&lt;/span&gt; await main&lt;span class="o"&gt;(&lt;/span&gt;prompt&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            statusCode: 200,
            body: JSON.stringify&lt;span class="o"&gt;({&lt;/span&gt;
                success: &lt;span class="nb"&gt;true&lt;/span&gt;,
                data: response,
            &lt;span class="o"&gt;})&lt;/span&gt;,
        &lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; catch &lt;span class="o"&gt;(&lt;/span&gt;error&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        console.error&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Error in Lambda handler:'&lt;/span&gt;, error&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            statusCode: 500,
            body: JSON.stringify&lt;span class="o"&gt;({&lt;/span&gt;
                success: &lt;span class="nb"&gt;false&lt;/span&gt;,
                error: error instanceof Error ? error.message : &lt;span class="s1"&gt;'An unexpected error occurred'&lt;/span&gt;
            &lt;span class="o"&gt;})&lt;/span&gt;,
        &lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

// app.ts
import &lt;span class="o"&gt;{&lt;/span&gt; generateResponse &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"./utils/bedrock"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;async &lt;span class="k"&gt;function &lt;/span&gt;main&lt;span class="o"&gt;(&lt;/span&gt;prompt: string&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    try &lt;span class="o"&gt;{&lt;/span&gt;
        console.log&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'🚀 Starting Bedrock:'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return &lt;/span&gt;await generateResponse&lt;span class="o"&gt;(&lt;/span&gt;prompt&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; catch &lt;span class="o"&gt;(&lt;/span&gt;error&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        console.error&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'An unexpected error occurred running workflow:'&lt;/span&gt;, error&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        throw error&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="o"&gt;}&lt;/span&gt;

// utils/bedrock.ts
import &lt;span class="o"&gt;{&lt;/span&gt; config &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"./config"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
import &lt;span class="o"&gt;{&lt;/span&gt; createAmazonBedrock &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s1"&gt;'@ai-sdk/amazon-bedrock'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
import &lt;span class="o"&gt;{&lt;/span&gt; generateText &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s1"&gt;'ai'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;async &lt;span class="k"&gt;function &lt;/span&gt;generateResponse&lt;span class="o"&gt;(&lt;/span&gt;prompt: string&lt;span class="o"&gt;){&lt;/span&gt;
  const &lt;span class="o"&gt;{&lt;/span&gt; regionId, modelId &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; config&lt;span class="o"&gt;({&lt;/span&gt; &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  try &lt;span class="o"&gt;{&lt;/span&gt;
    const bedrock &lt;span class="o"&gt;=&lt;/span&gt; createAmazonBedrock&lt;span class="o"&gt;({&lt;/span&gt; 
        region: regionId
    &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    const &lt;span class="o"&gt;{&lt;/span&gt; text, usage &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; await generateText&lt;span class="o"&gt;({&lt;/span&gt;
        model: bedrock&lt;span class="o"&gt;(&lt;/span&gt;modelId&lt;span class="o"&gt;)&lt;/span&gt;,
        system: &lt;span class="s2"&gt;"You are a helpful assistant."&lt;/span&gt;,
        prompt: &lt;span class="o"&gt;[&lt;/span&gt;
          &lt;span class="o"&gt;{&lt;/span&gt; role: &lt;span class="s2"&gt;"user"&lt;/span&gt;, content: prompt &lt;span class="o"&gt;}&lt;/span&gt;,
        &lt;span class="o"&gt;]&lt;/span&gt;,
        &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  
    console.log&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;model: &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;modelId&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;, &lt;span class="se"&gt;\n&lt;/span&gt; response: &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;, usage: &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.stringify(usage)&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;text&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;}&lt;/span&gt; catch &lt;span class="o"&gt;(&lt;/span&gt;error&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    console.error&lt;span class="o"&gt;(&lt;/span&gt;`ERROR: Can't invoke '${modelId}'. Reason: ${error}`&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    // Rethrow so the handler answers with a 500 instead of a success with no data
    throw error&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
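
&lt;p&gt;In the handler above, a malformed JSON body makes &lt;code&gt;JSON.parse&lt;/code&gt; throw, so the caller receives a 500 even though the request itself was at fault. A minimal sketch of a stricter alternative is shown below. The &lt;code&gt;parsePrompt&lt;/code&gt; helper and its types are hypothetical and not part of the repository; it validates by hand to stay dependency-free, although the same checks could be expressed with &lt;code&gt;zod&lt;/code&gt;, which the project already installs.&lt;/p&gt;

```typescript
// Minimal shape of the fields we read from the incoming event (hypothetical type).
type IncomingEvent = { body?: string | null };

type ParsedPrompt =
  | { ok: true; prompt: string }
  | { ok: false; statusCode: number; error: string };

const DEFAULT_PROMPT =
  "Welcome from Warike technologies - GenAI solutions architecture";

// Parse the request body and extract a usable prompt, or describe why we cannot.
function parsePrompt(event: IncomingEvent): ParsedPrompt {
  // No body at all: keep the handler's behaviour and use the default prompt.
  if (!event.body) {
    return { ok: true, prompt: DEFAULT_PROMPT };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(event.body);
  } catch {
    return { ok: false, statusCode: 400, error: "Request body is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, statusCode: 400, error: "Request body must be a JSON object" };
  }
  const candidate = (parsed as { prompt?: unknown }).prompt;
  if (candidate === undefined || candidate === null) {
    return { ok: true, prompt: DEFAULT_PROMPT };
  }
  if (typeof candidate !== "string" || candidate.trim() === "") {
    return { ok: false, statusCode: 400, error: "prompt must be a non-empty string" };
  }
  return { ok: true, prompt: candidate };
}

console.log(parsePrompt({ body: JSON.stringify({ prompt: "What is Bedrock?" }) }));
console.log(parsePrompt({ body: "{not json" }));
```

&lt;p&gt;The handler could call this helper first and, when &lt;code&gt;ok&lt;/code&gt; is false, return the given &lt;code&gt;statusCode&lt;/code&gt; with the same &lt;code&gt;success: false&lt;/code&gt; envelope it already uses for errors.&lt;/p&gt;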



&lt;p&gt;For local testing, I've defined the following environment variables. We can use an Amazon Bedrock API key for testing purposes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-west-2
&lt;span class="nv"&gt;AWS_BEDROCK_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'amazon.nova-micro-v1:0'&lt;/span&gt;
&lt;span class="nv"&gt;AWS_BEARER_TOKEN_BEDROCK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'aws_bearer_token_bedrock'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's highly recommended to only use &lt;strong&gt;Short-term API keys&lt;/strong&gt; to avoid compromising the system's security.&lt;/p&gt;
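
&lt;p&gt;The &lt;code&gt;utils/config&lt;/code&gt; module imported by &lt;code&gt;utils/bedrock.ts&lt;/code&gt; is not shown in this post. As a rough idea of what such a module might do with these variables, here is a minimal sketch that fails fast when one is missing. The &lt;code&gt;loadConfig&lt;/code&gt; name and its signature are assumptions; the actual file in the repository may look different.&lt;/p&gt;

```typescript
// A map of environment variables, such as process.env in Node.js.
type Env = { [key: string]: string | undefined };

type BedrockConfig = { regionId: string; modelId: string };

// Read the Bedrock settings from the environment and fail fast when one is missing.
function loadConfig(env: Env): BedrockConfig {
  const regionId = env["AWS_REGION"];
  const modelId = env["AWS_BEDROCK_MODEL"];
  if (!regionId) {
    throw new Error("AWS_REGION is not set");
  }
  if (!modelId) {
    throw new Error("AWS_BEDROCK_MODEL is not set");
  }
  return { regionId, modelId };
}

// In the Lambda you would pass process.env; a literal object is used here for the demo.
const example = loadConfig({
  AWS_REGION: "us-west-2",
  AWS_BEDROCK_MODEL: "amazon.nova-micro-v1:0",
});
console.log(example);
```

&lt;p&gt;Taking the environment as a parameter keeps the function easy to test, since no global state has to be changed to exercise the error paths.&lt;/p&gt;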

&lt;h2&gt;
  
  
  &lt;strong&gt;Defining the Infrastructure&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now that the system's logic is functional, we can create its Dockerfile, which will facilitate its deployment to AWS Lambda.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# ---- Build Stage ----&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:22-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /usr/src/app&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;corepack &lt;span class="nb"&gt;enable&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package.json pnpm-lock.yaml* ./&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--frozen-lockfile&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm run build

&lt;span class="c"&gt;# ---- Runtime Stage ----&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/nodejs:22&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${LAMBDA_TASK_ROOT}&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/dist/src ./ &lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/node_modules ./node_modules&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; [ "index.handler" ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With all components ready in our handler, we can proceed to define the resources in Terraform.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Exposing the Service with API Gateway&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We'll start by defining our API Gateway. We'll use the HTTP protocol and focus solely on creating the API, its stage, and its subsequent integration with Lambda.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;api_gateway_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev-http-${local.project_name}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_api_gw"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/apigateway-v2/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"5.4.1"&lt;/span&gt;

  &lt;span class="nx"&gt;name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;api_gateway_name&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"API Gateway for ${local.project_name}"&lt;/span&gt;
  &lt;span class="nx"&gt;protocol_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"HTTP"&lt;/span&gt;

  &lt;span class="nx"&gt;create_domain_name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;create_certificate&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;create_domain_records&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="nx"&gt;cors_configuration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;allow_headers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;allow_methods&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;allow_origins&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Access logs&lt;/span&gt;
  &lt;span class="nx"&gt;stage_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev"&lt;/span&gt;
  &lt;span class="nx"&gt;stage_description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Development API Gateway"&lt;/span&gt;


  &lt;span class="nx"&gt;stage_access_log_settings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nx"&gt;create_log_group&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="nx"&gt;destination_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_api_gw_logs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;

    &lt;span class="nx"&gt;format&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;requestId&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.requestId"&lt;/span&gt;
        &lt;span class="nx"&gt;requestTime&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.requestTime"&lt;/span&gt;
        &lt;span class="nx"&gt;protocol&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.protocol"&lt;/span&gt;
        &lt;span class="nx"&gt;httpMethod&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.httpMethod"&lt;/span&gt;
        &lt;span class="nx"&gt;resourcePath&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.resourcePath"&lt;/span&gt;
        &lt;span class="nx"&gt;routeKey&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.routeKey"&lt;/span&gt;
        &lt;span class="nx"&gt;status&lt;/span&gt;                  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.status"&lt;/span&gt;
        &lt;span class="nx"&gt;responseLength&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.responseLength"&lt;/span&gt;
        &lt;span class="nx"&gt;integrationErrorMessage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.integrationErrorMessage"&lt;/span&gt;

        &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;message&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.error.message"&lt;/span&gt;
          &lt;span class="nx"&gt;responseType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.error.responseType"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;identity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;sourceIP&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.identity.sourceIp"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;integration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;error&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.integration.error"&lt;/span&gt;
          &lt;span class="nx"&gt;integrationStatus&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$context.integration.integrationStatus"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Routes &amp;amp; Integration&lt;/span&gt;
  &lt;span class="nx"&gt;routes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"POST /"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;integration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;uri&lt;/span&gt;                    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_arn&lt;/span&gt;
        &lt;span class="nx"&gt;payload_format_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.0"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;stage_tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${local.api_gateway_name}-dev"&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;api_gateway_name&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_api_gw_logs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudwatch_log_group"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_api_gw_logs"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/aws/api-gw/${local.api_gateway_name}"&lt;/span&gt;
  &lt;span class="nx"&gt;retention_in_days&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Connecting to Amazon Bedrock&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Next, to use Amazon Bedrock, this implementation relies on the US inference profile for Amazon Nova Micro, identified by the &lt;code&gt;us.&lt;/code&gt; prefix on the model ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;model_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"amazon.nova-micro-v1:0"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_bedrock_inference_profile"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_lambda_bedrock_model"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;inference_profile_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us.${local.model_id}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;## Bedrock Policy ##&lt;/span&gt;
&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_lambda_bedrock_policy_doc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s2"&gt;"bedrock:InvokeModel"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lambda Service
&lt;/h3&gt;

&lt;p&gt;Finally, we'll create the Lambda function associated with all the components we've created, along with others that can be found in the repository.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;lambda_function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_name&lt;/span&gt;
  &lt;span class="nx"&gt;lambda_env_vars&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;AWS_BEDROCK_MODEL&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_bedrock_inference_profile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_bedrock_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inference_profile_arn&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_lambda"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"8.1.2"&lt;/span&gt;

  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_name&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Lambda function for ${local.project_name}"&lt;/span&gt;

  &lt;span class="nx"&gt;image_uri&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${aws_ecr_repository.warike_development_ecr.repository_url}:latest"&lt;/span&gt;
  &lt;span class="nx"&gt;package_type&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Image"&lt;/span&gt;
  &lt;span class="nx"&gt;create_package&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;ignore_source_code_hash&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;memory_size&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;
  &lt;span class="nx"&gt;timeout&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;

  &lt;span class="nx"&gt;environment_variables&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_env_vars&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;create_role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;lambda_role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;

  &lt;span class="c1"&gt;## Cloudwatch logging&lt;/span&gt;
  &lt;span class="nx"&gt;use_existing_cloudwatch_log_group&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;logging_log_group&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_logs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;logging_log_format&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"JSON"&lt;/span&gt;
  &lt;span class="nx"&gt;logging_application_log_level&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"INFO"&lt;/span&gt;
  &lt;span class="nx"&gt;logging_system_log_level&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"WARN"&lt;/span&gt;

  &lt;span class="c1"&gt;## function URL&lt;/span&gt;
  &lt;span class="nx"&gt;create_lambda_function_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;aws_cloudwatch_log_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_lambda_logs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;null_resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_seed_ecr_image&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_name&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"null_resource"&lt;/span&gt; &lt;span class="s2"&gt;"warike_development_seed_ecr_image"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;provisioner&lt;/span&gt; &lt;span class="s2"&gt;"local-exec"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOT&lt;/span&gt;&lt;span class="sh"&gt;
      aws ecr get-login-password --region ${local.aws_region} --profile ${local.aws_profile} \
        | docker login --username AWS --password-stdin ${data.aws_caller_identity.current.account_id}.dkr.ecr.${local.aws_region}.amazonaws.com

      docker pull public.ecr.aws/lambda/nodejs:22
      docker tag public.ecr.aws/lambda/nodejs:22 ${aws_ecr_repository.warike_development_ecr.repository_url}:latest
      docker push ${aws_ecr_repository.warike_development_ecr.repository_url}:latest
&lt;/span&gt;&lt;span class="no"&gt;    EOT
&lt;/span&gt;  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_ecr_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warike_development_ecr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important: the &lt;strong&gt;&lt;em&gt;warike_development_seed_ecr_image&lt;/em&gt;&lt;/strong&gt; resource runs Docker and AWS CLI commands through a local-exec provisioner, so Docker must be running on your machine when you apply.&lt;/p&gt;

&lt;p&gt;Let’s move ahead.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Creating the Resources&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Once the resources are created, you should see a message similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform apply
...
Apply &lt;span class="nb"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; Resources: 26 added, 0 changed, 0 destroyed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring CI/CD with GitHub + ECR
&lt;/h2&gt;

&lt;p&gt;Additionally, we need a Docker image containing our project. The following reusable GitHub Actions workflow builds the image, pushes it to ECR, and then deploys it to Lambda.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Lambda CI/CD Common&lt;/span&gt;
&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;on'&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_call&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
      &lt;span class="na"&gt;lambda-function-secret-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
      &lt;span class="na"&gt;pnpm-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;number&lt;/span&gt;
        &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;number&lt;/span&gt;
        &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;22&lt;/span&gt;
    &lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;PROJECT_NAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;AWS_OIDC_ROLE_ARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;AWS_REGION&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;ECR_REPOSITORY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;AWS_LAMBDA_FUNCTION_NAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;AWS_LAMBDA_FUNCTION_ROLE_ARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Test&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;working-directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./handler&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install pnpm&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm/action-setup@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ inputs.pnpm-version }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use Node.js ${{ inputs.node-version }}&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ inputs.node-version }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Dependencies&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm install --frozen-lockfile&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan for critical vulnerabilities&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm audit --audit-level=critical&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run Tests&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DOTENV_QUIET&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm test:ci&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm run build&lt;/span&gt;

  &lt;span class="na"&gt;build-docker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Push Docker Image&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;sha&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.vars.outputs.sha }}&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;working-directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./handler&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_OIDC_ROLE_ARN }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_REGION }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Login to Amazon ECR&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;login-ecr&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/amazon-ecr-login@v2&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;mask-password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;true'&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set commit-sha&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vars&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;calculatedSha=$(git rev-parse --short ${{ github.sha }})&lt;/span&gt;
          &lt;span class="s"&gt;echo "sha=${calculatedSha}" &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Push Docker Image&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DOCKER_IMAGE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ECR_REPOSITORY }}:${{ inputs.app-name }}-${{ steps.vars.outputs.sha }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;echo "Building Docker image $DOCKER_IMAGE"&lt;/span&gt;
          &lt;span class="s"&gt;docker build -t "$DOCKER_IMAGE" .&lt;/span&gt;
          &lt;span class="s"&gt;docker tag "$DOCKER_IMAGE" "${{ secrets.ECR_REPOSITORY }}:latest"&lt;/span&gt;
          &lt;span class="s"&gt;docker push "$DOCKER_IMAGE"&lt;/span&gt;
          &lt;span class="s"&gt;docker push "${{ secrets.ECR_REPOSITORY }}:latest"&lt;/span&gt;

  &lt;span class="na"&gt;deploy-prod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy Production Lambda&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.ref == 'refs/heads/main'&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-docker&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;working-directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./handler&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_OIDC_ROLE_ARN }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_REGION }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy Production Lambda&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/aws-lambda-deploy@v1.1.0&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;function-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_LAMBDA_FUNCTION_NAME }}&lt;/span&gt;
          &lt;span class="na"&gt;package-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Image&lt;/span&gt;
          &lt;span class="na"&gt;image-uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ECR_REPOSITORY }}:${{ inputs.app-name }}-${{ needs.build-docker.outputs.sha }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
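
&lt;p&gt;To make the tagging scheme explicit: the &lt;strong&gt;build-docker&lt;/strong&gt; job pushes an image tagged with the app name and the short commit SHA, and &lt;strong&gt;deploy-prod&lt;/strong&gt; has to reference exactly the same URI. Below is a minimal TypeScript sketch of that convention. The function name and its parameters are hypothetical and are not part of the repository.&lt;/p&gt;

```typescript
// Hypothetical helper mirroring the tag format used by the workflow:
//   ECR_REPOSITORY:app-name-shortSha
function buildImageUri(ecrRepository: string, appName: string, shortSha: string): string {
  // Refuse to build a partial URI, which would point at a tag that was never pushed.
  if (ecrRepository === "" || appName === "" || shortSha === "") {
    throw new Error("All three parts are required to build the image URI");
  }
  return `${ecrRepository}:${appName}-${shortSha}`;
}
```

&lt;p&gt;Because both jobs derive the URI from the same inputs, the deploy step targets the image that was just pushed rather than whatever the latest tag currently points to.&lt;/p&gt;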



&lt;p&gt;If the pipeline completes successfully, the Lambda function runs the new image and we are ready to test the Amazon Bedrock integration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7otc5gbv6bo4pjg8z04g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7otc5gbv6bo4pjg8z04g.png" alt="Github actions" width="800" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;Let's perform a test. From our terminal, we can send the following request and observe the result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="s2"&gt;"https://123456.execute-api.us-west-2.amazonaws.com/dev/"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt":"Heeey hoe gaat het?"}'&lt;/span&gt; | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The expected output is something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hoi! Het gaat prima, bedankt voor het vragen. Hoe gaat het met jou? Is er iets waar ik je kan helpen of iets waar je graag over wilt praten?"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I call that a &lt;strong&gt;success&lt;/strong&gt;.&lt;/p&gt;
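
&lt;p&gt;The same check can be scripted instead of read by eye. The sketch below is a minimal TypeScript version, assuming the response always has the two fields shown in the sample output (success and data). The helper names are hypothetical and not taken from the repository.&lt;/p&gt;

```typescript
// Response shape inferred from the sample output; treat it as an assumption.
interface AgentResponse {
  success: boolean;
  data: string;
}

// Build the request options equivalent to the curl call (POST with a JSON body).
function buildPromptRequest(prompt: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  };
}

// Validate an unknown JSON payload against the assumed shape.
function parseAgentResponse(payload: unknown): AgentResponse {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("Response is not a JSON object");
  }
  const candidate = payload as { success?: unknown; data?: unknown };
  const success = candidate.success;
  const data = candidate.data;
  if (typeof success !== "boolean" || typeof data !== "string") {
    throw new Error("Response does not match the expected shape");
  }
  return { success, data };
}
```

&lt;p&gt;With Node.js 22, the result of buildPromptRequest can be passed as the second argument to the built-in fetch, and the parsed JSON body can then be handed to parseAgentResponse.&lt;/p&gt;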

&lt;h2&gt;
  
  
  &lt;strong&gt;Observability&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Errors can be monitored in &lt;strong&gt;CloudWatch&lt;/strong&gt;, where the Lambda function writes its JSON-formatted logs, so we are not working in the dark.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09miq1bxzmogjho0akz7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09miq1bxzmogjho0akz7.png" alt="Cloudwatch Dashboard" width="800" height="176"&gt;&lt;/a&gt;&lt;/p&gt;
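
&lt;p&gt;Since the function is configured with the JSON log format, exported log lines can also be filtered programmatically. The following TypeScript sketch assumes each line is a JSON object with timestamp, level and message fields, as in Lambda's JSON log format. The function name is hypothetical, and lines that are not valid JSON are skipped.&lt;/p&gt;

```typescript
// Subset of the fields assumed to be present in each JSON log line.
interface LambdaLogEntry {
  timestamp: string;
  level: string;
  message: string;
}

// Return only the entries logged at ERROR level.
function findErrorEntries(lines: string[]): LambdaLogEntry[] {
  const errors: LambdaLogEntry[] = [];
  for (const line of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // not a JSON log line
    }
    if (typeof parsed !== "object" || parsed === null) {
      continue;
    }
    const entry = parsed as { timestamp?: unknown; level?: unknown; message?: unknown };
    if (entry.level === "ERROR") {
      errors.push({
        timestamp: String(entry.timestamp),
        level: "ERROR",
        // Messages may be structured objects, so serialize anything that is not a string.
        message: typeof entry.message === "string" ? entry.message : JSON.stringify(entry.message),
      });
    }
  }
  return errors;
}
```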

&lt;h2&gt;
  
  
  &lt;strong&gt;Cleaning Up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Finally, we clean up by destroying the resources that Terraform created.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform destroy
...
Destroy &lt;span class="nb"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; Resources: 26 destroyed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Overall, implementing this GenAI service using AWS serverless components was quite straightforward.&lt;/p&gt;

&lt;p&gt;The combination of API Gateway, AWS Lambda, and Amazon Bedrock, specifically with the Nova Micro model, is both functional and very cost-effective. The cost analysis showed that the solution stays inexpensive even when traffic is (manually) scaled up significantly.&lt;/p&gt;

&lt;p&gt;By leveraging Terraform for infrastructure management and GitHub Actions for the CI/CD pipeline, we achieved a robust and fully automated deployment process.&lt;/p&gt;

&lt;p&gt;Finally, even without aspects such as authentication, this setup provides a solid and scalable foundation for building more complex Generative AI applications.&lt;/p&gt;

</description>
      <category>bedrock</category>
      <category>aws</category>
      <category>lambda</category>
      <category>apigateway</category>
    </item>
  </channel>
</rss>
