For the last two weeks, I've been learning about generative AI and its various use cases, and here is my first technical blog post about how to build your own Q&A AI app using AWS Kendra and a generative AI service.
This blog is for you if you want to learn more about the power of generative AI on AWS.
I'll show you how to create a Q&A app with Amazon Bedrock, the Kendra database, and Streamlit (for the UI).
Before we start developing the app, let's look at how we can get an LLM to answer questions about our own topics.
There are two approaches to enable an LLM to understand and answer such questions:
Fine-tune the LLM on text data covering the topic.
Use Retrieval Augmented Generation (RAG), a technique that incorporates a retrieval component into the generation process. It allows you to retrieve relevant information and feed it into the generation model as a secondary source of data.
We will go with option 2.
RAG requires an external "knowledge database" to store and retrieve essential information. Think of this database as our LLM's external long-term memory.
A semantic search database will be used to retrieve information that is semantically related to our query.
Database for semantic search
A semantic search database uses natural language processing to understand the meanings and relationships between words and phrases, so it can return highly relevant search results.
This approach is based on the idea that a search engine should try to understand the user's intent and the relationships between the words in a query, rather than simply matching keywords.
Instead of merely matching phrases, semantic search aims to return more specific and meaningful results that better reflect the user's intent. This makes it particularly useful for sophisticated queries such as scientific research, medical information, and legal documents.
AWS services
For the Generative AI LLMs:
AWS Bedrock
For the knowledge database:
AWS Kendra
AWS S3
The following diagram shows how the AWS services interact with each other:
How does the Q&A app work?
The personal documents are kept in an S3 bucket.
The Kendra index is connected to an S3 connector. Every N minutes, the index scans the S3 bucket for new data. When new content is uploaded to the bucket, it is automatically processed and saved to the Kendra database.
When a user runs a query using the Streamlit app, the app performs the following actions:
Retrieves the relevant information from Kendra for the supplied query.
Assembles the prompt.
Sends the prompt to one of the available Bedrock LLMs and outputs the response.
One of the best aspects of using AWS Kendra (in conjunction with AWS S3) as our knowledge database is that the "ingest process" (shown in the diagram above) is fully automated, so you don't have to do anything.
When we add, update, or delete a document from the S3 bucket, the content is automatically processed and saved in Kendra.
Prerequisites
By default, Bedrock only gives you access to the Amazon Titan LLM. To use any of the third-party LLMs (the Anthropic and AI21 Labs models), you must request access to them separately.
Deploy the required AWS services
To make the app work we need to deploy the following AWS services:
An S3 bucket for uploading our private docs.
A Kendra index with an S3 connector.
An IAM role with the required permissions to make everything work.
Use the Terraform files in the GitHub repository to create the required services in your AWS account:
https://github.com/selvakumarsai/ai-qa-app-awskendra-benrock-streamli.git
admin@192-168-1-191 infra % terraform apply
data.aws_caller_identity.current: Reading...
data.aws_region.current: Reading...
data.aws_region.current: Read complete after 0s [id=us-east-1]
.
.
.
.
aws_kendra_index.kendra_docs_index: Creating...
aws_kendra_index.kendra_docs_index: Still creating... [10s elapsed]
aws_kendra_index.kendra_docs_index: Still creating... [20s elapsed]
aws_kendra_index.kendra_docs_index: Still creating... [30s elapsed]
aws_kendra_index.kendra_docs_index: Creation complete after 38s [id=f40139ce-f7fb-4ca9-a95f-759431c91fdb]
aws_kendra_data_source.kendra_docs_s3_connector: Creating...
aws_kendra_data_source.kendra_docs_s3_connector: Creation complete after 4s [id=cbfb3da7-660b-4f38-b7f0-b3964548609e/f40139ce-f7fb-4ca9-a95f-759431c91fdb]
The app has a simple UI, sketched below:
A text input field where the users can type the question they want to ask.
A numeric input where the users can set the LLM max tokens.
A numeric input where the users can set the LLM temperature.
A dropdown to select which AWS Bedrock LLM we want to use to generate the response.
And a submit button.
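Here is a minimal sketch of what that UI could look like in Streamlit. The widget labels, default values, and model IDs below are assumptions for illustration; the actual app may differ.

import streamlit as st

st.title("Q&A over your private documents")

# Text input for the user's question
question = st.text_input("Ask a question about your documents")

# LLM generation parameters (defaults are illustrative)
max_tokens = st.number_input("Max tokens", min_value=1, max_value=4096, value=512)
temperature = st.number_input("Temperature", min_value=0.0, max_value=1.0, value=0.0, step=0.1)

# Bedrock model selection (the model IDs shown are examples)
model_id = st.selectbox(
    "Bedrock LLM",
    ["amazon.titan-text-express-v1", "anthropic.claude-v2", "ai21.j2-ultra-v1"],
)

if st.button("Submit") and question:
    # This is where the real app runs the RAG flow:
    # retrieve context from Kendra, build the prompt, call Bedrock.
    st.write(f"Question sent to {model_id}: {question}")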
How to run the app
The repository has a .env file that contains the environment variables required for the app to run successfully:
KENDRA_INDEX='<kendra-index>'
AWS_BEDROCK_REGION='<bedrock-region>'
AWS_KENDRA_REGION='<region-where-kendra-index-is-deployed>'
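Assuming the app loads these variables with python-dotenv (an assumption on my part; it could also read them straight from the shell environment), the loading code would look roughly like this:

import os
from dotenv import load_dotenv  # pip install python-dotenv

# Load the variables defined in the .env file into the process environment
load_dotenv()

kendra_index = os.environ["KENDRA_INDEX"]
bedrock_region = os.environ["AWS_BEDROCK_REGION"]
kendra_region = os.environ["AWS_KENDRA_REGION"]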
Install the dependencies
pip install -r requirements.txt
When you install Streamlit, you also get a command-line interface (CLI) tool whose purpose is to run Streamlit programs.
Simply run the following command to launch the app:
streamlit run app.py
Retrieve the relevant information from Kendra
The LangChain AmazonKendraRetriever class will be used to obtain the relevant docs from our knowledge database (AWS Kendra).
The AmazonKendraRetriever class makes use of Amazon Kendra's Retrieve API to query the Amazon Kendra index and retrieve the docs most relevant to the user query.
To construct the RAG pattern, the AmazonKendraRetriever class will be plugged into a LangChain chain.
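As a rough sketch (the exact import path and constructor arguments vary across LangChain versions), wiring up the retriever could look like this:

import os
from langchain.retrievers import AmazonKendraRetriever

# Retriever backed by the Kendra index; the index id and region come from the environment
retriever = AmazonKendraRetriever(
    index_id=os.environ["KENDRA_INDEX"],
    region_name=os.environ["AWS_KENDRA_REGION"],
    top_k=3,  # number of passages to retrieve
)

# Fetch the passages most relevant to the user's question
docs = retriever.get_relevant_documents("What is a microservice?")
for doc in docs:
    print(doc.page_content)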
The boto3 Kendra client's retrieve method allows us to retrieve relevant documents from our knowledge database.
After retrieving the documents from Kendra, we combine them into a single "string".
This "string" represents the context that will be added to the prompt, telling the Bedrock LLM that it may only answer using the information provided in this context; it must not answer our question using data from outside this context.
Prepare the prompt
We build the prompt that will be sent to a Bedrock LLM.
The prompt contains two placeholders: query and docs.
The app will insert the user's question into the query placeholder and add the context retrieved from Kendra to the docs placeholder.
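As an illustration, the prompt template could look something like this (the exact wording used by the app may differ):

PROMPT_TEMPLATE = """Answer the question using only the information in the context below.
If the answer is not in the context, say that you don't know.

Context:
{docs}

Question: {query}

Answer:"""

def build_prompt(query: str, docs: str) -> str:
    # Fill the query and docs placeholders with the user's question
    # and the context retrieved from Kendra
    return PROMPT_TEMPLATE.format(query=query, docs=docs)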
The final step is to transmit the prompt to one of the Bedrock LLMs via the invoke_model method from the boto3 Bedrock client and receive the response.
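Here is a sketch of that call, assuming the Anthropic Claude v2 model; other Bedrock models expect different request and response shapes, and in recent boto3 versions the runtime client is named bedrock-runtime.

import json
import os
import boto3

bedrock = boto3.client("bedrock-runtime", region_name=os.environ["AWS_BEDROCK_REGION"])

def ask_bedrock(prompt: str, max_tokens: int = 512, temperature: float = 0.0) -> str:
    # Claude models expect the Human/Assistant prompt format
    body = json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
    })
    response = bedrock.invoke_model(modelId="anthropic.claude-v2", body=body)
    # The Claude response body contains the generated text in the "completion" field
    return json.loads(response["body"].read())["completion"]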
Testing the Q&A app
Let’s test if the Q&A app works correctly.
Remember that the knowledge base was populated with the Microsoft .NET Microservices book, so any questions we ask should be about that specific topic.
The repository also has a Dockerfile in case you prefer to run the app in a container:
docker build -t aws-rag-app .