Abhishek Gupta for AWS

Posted on • Originally published at community.aws

A single API for all your conversational generative AI applications

Use the Converse API in Amazon Bedrock to create generative AI applications using a single API across multiple foundation models

You can now use the Converse API in Amazon Bedrock to create conversational applications like chatbots and support assistants. It is a consistent, unified API that works with all Amazon Bedrock models that support messages. The benefit is that you can write your application once and use it with different models, which makes the Converse API preferable to the InvokeModel (or InvokeModelWithResponseStream) APIs.

I will walk you through how to use this API with the AWS SDK for Go v2.

Converse API overview

Here is a super-high level overview of the API - you will see these in action when we go through some of the examples.

  • The API consists of two operations - Converse and ConverseStream
  • A conversation is a list of Message objects, each of which contains one or more ContentBlocks.
  • A ContentBlock can also have images, which are represented by an ImageBlock.
  • A message can have one of two roles - user or assistant
  • For streaming response, use the ConverseStream API
  • The streaming output (ConverseStreamOutput) has multiple events, each of which has different response items such as the text output, metadata etc.
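To make the relationship between these types concrete, here is a minimal sketch that models a short conversation with plain Go structs and prints it as JSON. The struct names (message, contentBlock, imageBlock, imageSource) are local stand-ins written for this illustration, not the SDK types, and the JSON field names are my assumption of what the request shape looks like. In a real application, the AWS SDK for Go v2 builds and serializes the request for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins for illustration only; the real types live in the SDK.
type imageSource struct {
	Bytes []byte `json:"bytes"` // encoding/json renders []byte as base64
}

type imageBlock struct {
	Format string      `json:"format"`
	Source imageSource `json:"source"`
}

// contentBlock carries one kind of content: text or an image.
type contentBlock struct {
	Text  string      `json:"text,omitempty"`
	Image *imageBlock `json:"image,omitempty"`
}

// message is one turn in the conversation, with role "user" or "assistant".
type message struct {
	Role    string         `json:"role"`
	Content []contentBlock `json:"content"`
}

// encodeConversation serializes the list of messages into a JSON document.
func encodeConversation(msgs []message) (string, error) {
	b, err := json.Marshal(map[string][]message{"messages": msgs})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	conversation := []message{
		{Role: "user", Content: []contentBlock{{Text: "Hello"}}},
		{Role: "assistant", Content: []contentBlock{{Text: "Hi! How can I help?"}}},
	}
	out, err := encodeConversation(conversation)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The point to notice is the nesting: the conversation holds messages, each message holds content blocks, and an image is just another kind of content block.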

Let's explore a few sample apps now.

Basic example

Refer to the Before You Begin section in this blog post to complete the prerequisites for running the examples. This includes installing Go, configuring Amazon Bedrock access and providing necessary IAM permissions.

Let's start off with a simple example. You can refer to the complete code here.

To run the example:



git clone https://github.com/abhirockzz/converse-api-bedrock-go
cd converse-api-bedrock-go

go run basic/main.go



The response may be different in your case:

Image description

The crux of the app is a for loop in which:

  • A types.Message instance is created with the appropriate role (user or assistant)
  • Sent using the Converse API
  • The response is collected and added to the existing list of messages
  • The conversation continues, until the app is exited


//...
    for {
        fmt.Print("\nEnter your message: ")
        input, err := reader.ReadString('\n')
        if err != nil {
            log.Fatal(err)
        }
        input = strings.TrimSpace(input)

        // Wrap the user's input in a message with the user role.
        userMsg := types.Message{
            Role: types.ConversationRoleUser,
            Content: []types.ContentBlock{
                &types.ContentBlockMemberText{
                    Value: input,
                },
            },
        }

        // Send the whole conversation so far, not just the latest message.
        converseInput.Messages = append(converseInput.Messages, userMsg)
        output, err := brc.Converse(context.Background(), converseInput)
        if err != nil {
            log.Fatal(err)
        }

        response, ok := output.Output.(*types.ConverseOutputMemberMessage)
        if !ok || len(response.Value.Content) == 0 {
            log.Fatal("unexpected response from the model")
        }

        // This example expects the first content block to be text.
        if text, ok := response.Value.Content[0].(*types.ContentBlockMemberText); ok {
            fmt.Println(text.Value)
        }

        // Add the assistant's reply to the history so the next turn has context.
        assistantMsg := types.Message{
            Role:    types.ConversationRoleAssistant,
            Content: response.Value.Content,
        }

        converseInput.Messages = append(converseInput.Messages, assistantMsg)
    }
//...



I used the Claude Sonnet model in the example. Refer to Supported models and model features for a complete list.

Multi-modal conversations: Combine image and text

You can also use the Converse API to build multi-modal applications that work with images. Note that they only return text, for now.

You can refer to the complete code here.

To run the example:



go run multi-modal-chat/main.go



I used the following picture of pizza and asked "what's in the image?":

Here is the output:

Image description

This is a simple single-turn exchange, but feel free to continue the conversation using a combination of images and text.

The conversation for loop is similar to the previous example, but it also sends image data with the help of types.ImageBlock:



//...
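// The content block is passed as a pointer, like the text block in the basic example.
&types.ContentBlockMemberImage{
    Value: types.ImageBlock{
        Format: types.ImageFormatJpeg, // must match the actual image format
        Source: &types.ImageSourceMemberBytes{
            Value: imageContents, // raw image bytes
        },
    },
}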
//...



Note: imageContents is nothing but a []byte representation of the image.
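If you are not sure which format a file is in, one option is to sniff it from the bytes rather than hard-coding JPEG. The sketch below uses only the Go standard library. detectImageFormat is a hypothetical helper (it is not part of the sample repository or the SDK), and the strings it returns assume the format names jpeg, png, gif and webp, which you would then map to the matching types.ImageFormat value.

```go
package main

import (
	"fmt"
	"net/http"
)

// detectImageFormat is a hypothetical helper: it inspects the leading bytes of
// an image and returns a short format name, or an error for anything else.
func detectImageFormat(data []byte) (string, error) {
	// DetectContentType considers at most the first 512 bytes of data.
	switch ct := http.DetectContentType(data); ct {
	case "image/jpeg":
		return "jpeg", nil
	case "image/png":
		return "png", nil
	case "image/gif":
		return "gif", nil
	case "image/webp":
		return "webp", nil
	default:
		return "", fmt.Errorf("unsupported content type %q", ct)
	}
}

func main() {
	// A PNG file starts with this 8-byte signature.
	pngHeader := []byte("\x89PNG\r\n\x1a\n")

	format, err := detectImageFormat(pngHeader)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(format)
}
```

In a real application you would pass in the bytes returned by os.ReadFile instead of a hand-written header.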

Streaming chat

Streaming provides a better user experience because the client application does not need to wait for the complete response to be generated before it starts showing up in the conversation.

You can refer to the complete code here.

To run the example:



go run chat-streaming/main.go



Streaming-based implementations can be a bit complicated. But in this case, the clear abstractions in the Converse API keep things simple, including partial response types such as types.ContentBlockDeltaMemberText.

The application invokes the ConverseStream API and then processes the output components in bedrockruntime.ConverseStreamOutput.



func processStreamingOutput(output *bedrockruntime.ConverseStreamOutput, handler StreamingOutputHandler) (types.Message, error) {

    var combinedResult string

    msg := types.Message{}

    for event := range output.GetStream().Events() {
        switch v := event.(type) {
        case *types.ConverseStreamOutputMemberMessageStart:

            msg.Role = v.Value.Role

        case *types.ConverseStreamOutputMemberContentBlockDelta:

            textResponse := v.Value.Delta.(*types.ContentBlockDeltaMemberText)
            handler(context.Background(), textResponse.Value)
            combinedResult = combinedResult + textResponse.Value

        case *types.UnknownUnionMember:
            fmt.Println("unknown tag:", v.Tag)
        }
    }

    msg.Content = append(msg.Content,
        &types.ContentBlockMemberText{
            Value: combinedResult,
        },
    )

    return msg, nil
}
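The accumulate-and-forward pattern in this function can be tried out without calling Bedrock at all. In the sketch below, a plain channel of strings stands in for the event stream, and chunkHandler is a hypothetical callback type similar in spirit to StreamingOutputHandler. It collects the chunks with strings.Builder, which is one way to avoid repeated string concatenation for long responses.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkHandler is a hypothetical callback invoked for every partial response.
type chunkHandler func(chunk string)

// collectStream forwards each chunk to the handler as soon as it arrives and
// returns the full text once the channel is closed.
func collectStream(chunks <-chan string, handler chunkHandler) string {
	var combined strings.Builder
	for chunk := range chunks {
		handler(chunk)
		combined.WriteString(chunk)
	}
	return combined.String()
}

func main() {
	// Simulate a model that emits its answer in three pieces.
	chunks := make(chan string)
	go func() {
		defer close(chunks)
		for _, c := range []string{"Hello", ", ", "world"} {
			chunks <- c
		}
	}()

	full := collectStream(chunks, func(chunk string) {
		fmt.Print(chunk) // show partial output immediately
	})
	fmt.Println()
	fmt.Println("complete response:", full)
}
```

The combined text is what gets added to the conversation history as the assistant's message, while the handler takes care of showing partial output to the user.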




Wrap up

There are a few other awesome things the Converse API does to make your life easier.

  • It allows you to pass inference parameters specific to a model.
  • You can also use the Converse API to implement tool use in your applications.
  • If you are using Mistral AI or Llama 2 Chat models, the Converse API will embed your input in a model-specific prompt template that enables conversations - one less thing to worry about!

Like I always say, Python does not have to be the only way to build generative AI-powered machine learning applications. As an AI engineer, choose the right tools (including foundation models) and programming languages for your solutions. I may be biased towards Go, but this applies equally well to Java, JS/TS, C# etc.

Happy building!
