Jimmy for AWS Community Builders


Stream Amazon Bedrock Response with AWS Lambda Response Streaming

AWS Lambda supports response payload streaming, an invocation pattern that lets functions progressively stream response payloads back to clients.

In the traditional request-response model, the response must be fully generated and buffered before it is returned to the client. This hurts time to first byte (TTFB) because the client has to wait for the entire response to be generated.

You can use Lambda response payload streaming to send response data to callers as it becomes available. This can improve performance for web and mobile applications. Response streaming also allows you to build functions that return larger payloads and perform long-running operations while reporting incremental progress.
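
In the Node.js managed runtimes this is exposed through awslambda.streamifyResponse(), a global helper provided by the runtime: the wrapped handler receives a writable responseStream and writes chunks to it instead of returning a value. A minimal sketch, with illustrative chunk contents:

exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    // Each write can be flushed to the client before the function finishes.
    responseStream.write("first chunk\n");
    responseStream.write("second chunk\n");
    // Always close the stream when you are done writing.
    responseStream.end();
  }
);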

In this article, we will explore how to create a Lambda function URL with response streaming that can stream an Amazon Bedrock response, like this.
streams
In the demo above, we invoke the Lambda function URL with a prompt payload; our Lambda function then invokes an Amazon Bedrock model with a response stream and streams the response back to the client.

The architecture looks like this:

architecture


AWS Lambda


Amazon Web Services (AWS) Lambda is a serverless computing service provided by Amazon as part of its cloud computing platform. Serverless computing allows you to run code without provisioning or managing servers, which means you can focus on writing your application code without the need to worry about infrastructure.

Amazon Bedrock


Amazon Bedrock is a fully managed service that makes it easy to build and scale generative AI applications. It offers a choice of high-performing foundation models from leading AI companies, along with a broad set of capabilities that you need to build generative AI applications, simplifying development while maintaining privacy and security.
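
Later in this post we will call Bedrock with the streaming variant of its invoke API. For contrast, this is roughly what a buffered (non-streaming) call looks like with the same SDK client: nothing comes back until the model has finished generating (parameter values here are only illustrative).

const {
  BedrockRuntimeClient,
  InvokeModelCommand,
} = require("@aws-sdk/client-bedrock-runtime");

const client = new BedrockRuntimeClient({ region: "us-east-1" });

async function completeOnce(prompt) {
  const response = await client.send(
    new InvokeModelCommand({
      modelId: "anthropic.claude-v2",
      contentType: "application/json",
      accept: "application/json",
      body: JSON.stringify({
        prompt: `\n\nHuman: ${prompt}\n\nAssistant:`,
        max_tokens_to_sample: 512,
        anthropic_version: "bedrock-2023-05-31",
      }),
    })
  );
  // response.body is a Uint8Array holding the entire JSON response.
  return JSON.parse(Buffer.from(response.body).toString("utf-8")).completion;
}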

Prerequisites

  • AWS Serverless Application Model (SAM): AWS SAM is an open-source framework that simplifies serverless application development and deployment on AWS. To install AWS SAM, refer to the AWS SAM installation documentation.
  • AWS Command Line Interface (CLI): The AWS CLI is an open-source tool that lets you manage AWS services from the command line through a single unified interface. It is available for Windows, macOS, and Linux. To install and configure the AWS CLI, refer to the AWS CLI documentation.
  • Amazon Bedrock model access: Make sure you have already requested access to the model we are going to use, the Anthropic Claude model, via the AWS Management Console in the N. Virginia (us-east-1) region. A quick programmatic sanity check is sketched after the screenshot below.

amazon bedrock model access
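
As that sanity check, you can list the Anthropic foundation models available in the region with the @aws-sdk/client-bedrock package. A sketch (note that this lists what exists in the region; the console's Model access page remains the source of truth for whether access has actually been granted):

const {
  BedrockClient,
  ListFoundationModelsCommand,
} = require("@aws-sdk/client-bedrock");

const bedrock = new BedrockClient({ region: "us-east-1" });

bedrock
  .send(new ListFoundationModelsCommand({ byProvider: "Anthropic" }))
  .then((res) =>
    res.modelSummaries.forEach((m) => console.log(m.modelId, "-", m.modelName))
  );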

Create AWS SAM project

  • You can start creating an AWS SAM project by running the following command in your terminal
sam init --name lambda-bedrock --runtime nodejs18.x
  • Then choose 1 - AWS Quick Start Templates.
Which template source would you like to use?
        1 - AWS Quick Start Templates
        2 - Custom Template Location
Choice: 1
  • Next, choose 10 - Lambda Response Streaming.
Choose an AWS Quick Start application template
        1 - Hello World Example
        2 - GraphQLApi Hello World Example
        3 - Hello World Example with Powertools for AWS Lambda
        4 - Multi-step workflow
        5 - Standalone function
        6 - Scheduled task
        7 - Data processing
        8 - Serverless API
        9 - Full Stack
        10 - Lambda Response Streaming
Template: 10
  • Choose N when asked whether to enable X-Ray tracing
Would you like to enable X-Ray tracing on the function(s) in your application?  [y/N]: N
  • Choose N when asked whether to enable monitoring with CloudWatch Application Insights
Would you like to enable monitoring using CloudWatch Application Insights?
For more info, please view https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-application-insights.html [y/N]: N
  • SAM will then start creating your project
Cloning from https://github.com/aws/aws-sam-cli-app-templates (process may take a moment)

    -----------------------
    Generating application:
    -----------------------
    Name: lambda-bedrock
    Runtime: nodejs18.x
    Architectures: x86_64
    Dependency Manager: npm
    Application Template: response-streaming
    Output Directory: .
    Configuration file: lambda-bedrock\samconfig.toml

    Next steps can be found in the README file at lambda-bedrock\README.md
  • Our project folder structure will look like this
.
|-- README.md
|-- __tests__
|   `-- unit
|-- package.json
|-- samconfig.toml
|-- src
|   `-- index.js
`-- template.yaml
  • There are two files we need to modify: src\index.js, which contains our function handler logic, and template.yaml, which is our AWS SAM template. Depending on the AWS SDK version bundled with the nodejs18.x runtime, you may also need to add @aws-sdk/client-bedrock-runtime as a dependency in package.json.

Lambda Function Handler

  • Go to src\index.js.
  • Modify the index.js handler as follows:
const {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} = require("@aws-sdk/client-bedrock-runtime"); 

const bedrock = new BedrockRuntimeClient({ region: "us-east-1" });

exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    const { prompt } = JSON.parse(event.body);

    const params = {
      modelId: "anthropic.claude-v2",
      contentType: "application/json",
      accept: "*/*",
      body: JSON.stringify({
        prompt: `\n\nHuman: ${prompt} \n\nAssistant:`,
        max_tokens_to_sample: 2048,
        temperature: 0.5,
        top_k: 250,
        top_p: 0.5,
        stop_sequences: ["\n\nHuman:"],
        anthropic_version: "bedrock-2023-05-31",
      }),
    };

    console.log(params);

    const command = new InvokeModelWithResponseStreamCommand(params);

    const response = await bedrock.send(command);
    const chunks = [];

    for await (const chunk of response.body) {
      const parsed = JSON.parse(
        Buffer.from(chunk.chunk.bytes, "base64").toString("utf-8")
      );
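      // `parsed` is a small JSON document; for Claude's text-completions API each
      // chunk looks roughly like { completion: " partial text", stop_reason: null }.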
      chunks.push(parsed.completion);
      responseStream.write(parsed.completion);
    }

    console.log(chunks.join(""));
    responseStream.end();
  }
);

  • This is where we define the Claude model parameters: the prompt uses the Human/Assistant format expected by Claude's text-completions API, max_tokens_to_sample caps the response length, and temperature, top_k, and top_p control sampling randomness.
const params = {
      modelId: "anthropic.claude-v2",
      contentType: "application/json",
      accept: "*/*",
      body: JSON.stringify({
        prompt: `\n\nHuman: ${prompt} \n\nAssistant:`,
        max_tokens_to_sample: 2048,
        temperature: 0.5,
        top_k: 250,
        top_p: 0.5,
        stop_sequences: ["\n\nHuman:"],
        anthropic_version: "bedrock-2023-05-31",
      }),
    };
  • Then we invoke the Bedrock model with a response stream:
const command = new InvokeModelWithResponseStreamCommand(params);
const response = await bedrock.send(command);
  • Then we write the response back to the client chunk by chunk, decoding each event as it arrives:
 for await (const chunk of response.body) {
      const parsed = JSON.parse(
        Buffer.from(chunk.chunk.bytes, "base64").toString("utf-8")
      );
      chunks.push(parsed.completion);
      responseStream.write(parsed.completion);
    }
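
One thing the handler above skips is error handling: if the Bedrock call or the chunk parsing throws, the function exits without ever closing the response stream. A minimal way to harden it, reusing the handler's bedrock, command, and responseStream variables (the fallback message is illustrative):

try {
  const response = await bedrock.send(command);
  for await (const chunk of response.body) {
    const parsed = JSON.parse(
      Buffer.from(chunk.chunk.bytes, "base64").toString("utf-8")
    );
    responseStream.write(parsed.completion);
  }
} catch (err) {
  console.error("Bedrock streaming failed:", err);
  responseStream.write("Sorry, something went wrong generating a response.");
} finally {
  responseStream.end();
}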

SAM Template

  • Next, we will modify our SAM template template.yaml into this:
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: lambda-bedrock template

Resources:
  BedrockFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: index.handler
      Runtime: nodejs18.x
      Timeout: 30
      FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM
        Cors:
          AllowMethods:
            - POST
          AllowHeaders:
            - "*"
          AllowOrigins:
            - "*"
      Policies:
        - Statement:
            - Effect: Allow
              Action: bedrock:InvokeModelWithResponseStream
              Resource: "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2"

Outputs:
  FunctionUrlEndpoint:
    Description: "Lambda Bedrock URL Endpoint"
    Value:
      Fn::GetAtt: BedrockFunctionUrl.FunctionUrl

  • In this template, we create a BedrockFunction with the Node.js runtime nodejs18.x:
BedrockFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: index.handler
      Runtime: nodejs18.x
      Timeout: 30
  • Then the Lambda function has a function URL enabled with the RESPONSE_STREAM invoke mode:
FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM
  • Finally, we output our Lambda function URL:
Outputs:
  FunctionUrlEndpoint:
    Description: "Lambda Bedrock URL Endpoint"
    Value:
      Fn::GetAtt: BedrockFunctionUrl.FunctionUrl

Deploy AWS SAM project

  • Before we can deploy our SAM project, we need to run the build command like this
sam build
  • It will return this output
Starting Build use cache
Manifest is not changed for (BedrockFunction), running incremental build
Building codeuri: D:\lambda-bedrock\src runtime: nodejs18.x metadata: {} architecture: x86_64 functions: BedrockFunction
 Running NodejsNpmBuilder:NpmPack
 Running NodejsNpmBuilder:CopyNpmrcAndLockfile
 Running NodejsNpmBuilder:CopySource
 Running NodejsNpmBuilder:CopySource
 Running NodejsNpmBuilder:CleanUpNpmrc
 Running NodejsNpmBuilder:LockfileCleanUp
 Running NodejsNpmBuilder:LockfileCleanUp

Build Succeeded

Built Artifacts  : .aws-sam\build
Built Template   : .aws-sam\build\template.yaml

Commands you can use next
=========================
[*] Validate SAM template: sam validate
[*] Invoke Function: sam local invoke
[*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch
[*] Deploy: sam deploy --guided
  • Next, to deploy it for the first time we can run this command
sam deploy --guided --region us-east-1
  • Then you can configure your SAM deployment like this
Configuring SAM deploy
======================

        Looking for config file [samconfig.toml] :  Found
        Reading default arguments  :  Success

        Setting default arguments for 'sam deploy'
        =========================================
        Stack Name [lambda-bedrock]: 
        AWS Region [us-east-1]: 
        #Shows you resources changes to be deployed and require a 'Y' to initiate deploy
        Confirm changes before deploy [Y/n]: 
        #SAM needs permission to be able to create roles to connect to the resources in your template
        Allow SAM CLI IAM role creation [Y/n]: 
        #Preserves the state of previously provisioned resources when an operation fails
        Disable rollback [y/N]: 
        BedrockFunction Function Url has no authentication. Is this okay? [y/N]: y
        Save arguments to configuration file [Y/n]: y
        SAM configuration file [samconfig.toml]: 
        SAM configuration environment [default]: 
  • Then SAM will start provisioning your defined resources like this
Initiating deployment
=====================

File with same data already exists at lambda-bedrock/bbf3c1abca051b90e7660be4b7b51222.template, skipping upload


Waiting for changeset to be created..

CloudFormation stack changeset
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
Operation                               LogicalResourceId                       ResourceType                            Replacement
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
+ Add                                   BedrockFunctionRole                     AWS::IAM::Role                          N/A
+ Add                                   BedrockFunctionUrlPublicPermissions     AWS::Lambda::Permission                 N/A
+ Add                                   BedrockFunctionUrl                      AWS::Lambda::Url                        N/A
+ Add                                   BedrockFunction                         AWS::Lambda::Function                   N/A
-------------------------------------------------------------------------------------------------------------------------------------------------------------   


Changeset created successfully. arn:aws:cloudformation:us-east-1:319450934564:changeSet/samcli-deploy1700755410/ae1f9cd1-ec0c-4aee-91cb-1feac84e9b5e


Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y

2023-11-23 23:03:48 - Waiting for stack create/update to complete

CloudFormation events from stack operations (refresh every 5.0 seconds)
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
ResourceStatus                          ResourceType                            LogicalResourceId                       ResourceStatusReason
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
CREATE_IN_PROGRESS                      AWS::CloudFormation::Stack              lambda-bedrock                          User Initiated
CREATE_IN_PROGRESS                      AWS::IAM::Role                          BedrockFunctionRole                     -
CREATE_IN_PROGRESS                      AWS::IAM::Role                          BedrockFunctionRole                     Resource creation Initiated
CREATE_COMPLETE                         AWS::IAM::Role                          BedrockFunctionRole                     -
CREATE_IN_PROGRESS                      AWS::Lambda::Function                   BedrockFunction                         -
CREATE_IN_PROGRESS                      AWS::Lambda::Function                   BedrockFunction                         Resource creation Initiated
CREATE_COMPLETE                         AWS::Lambda::Function                   BedrockFunction                         -
CREATE_IN_PROGRESS                      AWS::Lambda::Url                        BedrockFunctionUrl                      -
CREATE_IN_PROGRESS                      AWS::Lambda::Permission                 BedrockFunctionUrlPublicPermissions     -
CREATE_IN_PROGRESS                      AWS::Lambda::Permission                 BedrockFunctionUrlPublicPermissions     Resource creation Initiated
CREATE_IN_PROGRESS                      AWS::Lambda::Url                        BedrockFunctionUrl                      Resource creation Initiated
CREATE_COMPLETE                         AWS::Lambda::Permission                 BedrockFunctionUrlPublicPermissions     -
CREATE_COMPLETE                         AWS::Lambda::Url                        BedrockFunctionUrl                      -
CREATE_COMPLETE                         AWS::CloudFormation::Stack              lambda-bedrock                          -
------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  • Finally SAM will output our function URL endpoint like this
CloudFormation outputs from deployed stack
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
Outputs
-------------------------------------------------------------------------------------------------------------------------------------------------------------   
Key                 FunctionUrlEndpoint
Description         Lambda Bedrock URL Endpoint
Value               https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/
------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  • Take note of the URL output https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/; we will use it for testing.

Test

To test our Lambda endpoint, run the following command in your terminal (you can also add curl's -N/--no-buffer flag to watch the chunks arrive):

curl -k \
-X POST \
-H 'Content-Type: application/json' \
-d '{"prompt":"what is serverless"}' \
https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/

The result will look like this:
streams
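
If you would rather test from Node.js than curl, Node 18's built-in fetch can consume the same stream, since Node's ReadableStream is async-iterable. A sketch (replace the URL with your own FunctionUrlEndpoint):

const url =
  "https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/";

async function ask(prompt) {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const decoder = new TextDecoder();
  // Print each chunk as soon as it arrives.
  for await (const chunk of res.body) {
    process.stdout.write(decoder.decode(chunk, { stream: true }));
  }
}

ask("what is serverless");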

[BONUS] Stream Response to Browser

To stream the response to a browser, you can create an HTML page that uses the following script:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Send Data to Server</title>
  </head>
  <body>
    <div>
      <label for="dataInput">Enter data:</label>
      <input type="text" id="dataInput" />
      <button onclick="sendData()">Send Data</button>
      <div id="streamed-data"></div>
    </div>

    <script>
      function sendData() {
        const dataInput = document.getElementById("dataInput");
        const inputData = dataInput.value;
        const streamedData = document.getElementById("streamed-data");
        streamedData.innerHTML = "";
        fetch(
          "https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/",
          {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ prompt: inputData }),
          }
        )
          .then((response) => {
            const reader = response.body.getReader();

            // Read data from the stream
            const readData = () => {
              return reader.read().then(({ done, value }) => {
                if (done) {
                  console.log("Stream complete");
                } else {
                  streamedData.innerHTML += `${new TextDecoder()
                    .decode(value)
                    .replace(/\n/g, "<br>")}`;
                  // Continue reading data
                  return readData();
                }
              });
            };

            // Start reading data from the stream
            return readData();
          })
          .catch((error) => console.error("Error:", error));
      }
    </script>
  </body>
</html>

The above script lets us stream the response into the browser like this:

stream to browser
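
As an aside, if you prefer async/await to the recursive .then() chain, the same sendData function can be written as a drop-in replacement like this:

async function sendData() {
  const inputData = document.getElementById("dataInput").value;
  const streamedData = document.getElementById("streamed-data");
  streamedData.innerHTML = "";

  const response = await fetch(
    "https://mxtyk5msee54ddynby4h6wvkt40xoyab.lambda-url.us-east-1.on.aws/",
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: inputData }),
    }
  );

  // Read and render each chunk as it arrives.
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    streamedData.innerHTML += decoder.decode(value).replace(/\n/g, "<br>");
  }
}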

Cleanup

To delete all your resources you can run this command

sam delete --region us-east-1


Conclusions

As you can see, with the above configuration our Lambda function can return the response from Amazon Bedrock as a stream instead of buffering the full response first. This can improve perceived performance for web and mobile applications.

Response streaming is currently supported on the Node.js 14.x and later managed runtimes, and you can also implement it in custom runtimes. You can progressively stream response payloads through Lambda function URLs (including as an Amazon CloudFront origin) and through the AWS SDK using Lambda's invoke API. You cannot progressively stream response payloads through Amazon API Gateway or Application Load Balancer, although you can still use the functionality to return larger payloads with API Gateway.
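
For example, if you later drop the public function URL and call the function directly with IAM credentials, the same stream can be consumed through the Lambda invoke API with @aws-sdk/client-lambda. A sketch (the function name is a placeholder for your deployed function, and the payload mirrors the function URL event shape our handler expects):

const {
  LambdaClient,
  InvokeWithResponseStreamCommand,
} = require("@aws-sdk/client-lambda");

const lambda = new LambdaClient({ region: "us-east-1" });

async function invokeStream(prompt) {
  const res = await lambda.send(
    new InvokeWithResponseStreamCommand({
      FunctionName: "lambda-bedrock-BedrockFunction-XXXX", // placeholder name
      Payload: Buffer.from(JSON.stringify({ body: JSON.stringify({ prompt }) })),
    })
  );
  for await (const event of res.EventStream) {
    if (event.PayloadChunk && event.PayloadChunk.Payload) {
      process.stdout.write(Buffer.from(event.PayloadChunk.Payload));
    }
  }
}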


