Budiono Santoso

RAG (Retrieval Augmented Generation) using Amazon Bedrock and Pinecone

UPDATE : AWS now has Amazon Bedrock AgentCore, which lets you build AI agents with memory, identity, gateway, and observability, and also connects with AI agent frameworks such as LangGraph, CrewAI, etc. Stay tuned!

PROBLEM : The previous customer support process had problems such as repeated customer questions, slow response times, and limited availability.

SOLUTION : I built a Retrieval-Augmented Generation (RAG) system to improve financial customer service.
I created a Bedrock Knowledge Base using Amazon Titan Embeddings and Pinecone, stored the knowledge base documents (CSV, PDF, or anything else) in S3, and tested the knowledge base before deploying it to a Bedrock Agent.
After deploying to the Bedrock Agent, I invoked the agent to make sure the LLM successfully retrieves information from the knowledge base and then answers the question.

REQUIREMENTS :

  1. An AWS account, you can sign up/sign in here
  2. A Pinecone account to get a Pinecone API key, you can sign up/sign in here

STEP-BY-STEP :

A. I use Pinecone because it is the most widely used vector database product, it has a free vector database index, and it can connect to Amazon Bedrock. Open a Pinecone account and create a database index using the following steps (a scripted equivalent with the Pinecone Python client is sketched after the steps).

  1. Click "Create index".
    Create index

  2. Select a text embedding model. Click "llama-text-embed-v2".
    Select text embedding

  3. Configure the text embedding, for example set the dimension to 512.
    Text embedding configuration

  4. Select the capacity mode and cloud provider. On the starter plan, you can only select serverless and AWS; you cannot select pods or other cloud providers such as Google Cloud and Azure.
    For the starter plan, the capacity mode is serverless only and AWS is the only cloud provider

  5. You cannot select another region. Click "Create index".
    The only available region is Virginia, then click Create index

My Pinecone database index for Amazon Bedrock Knowledge Base
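
If you prefer to script the index creation instead of clicking through the console, here is a minimal sketch using the Pinecone Python client. The index name bedrock-kb-index is just an example; the sketch creates a plain 512-dimension serverless index so the dimension matches the Titan embedding configuration used later in Bedrock.

```python
# pip install pinecone
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")  # the same key you will store in Secrets Manager

# Serverless index on AWS us-east-1 (the only capacity mode and cloud on the starter plan).
# The dimension (512) must match the embedding dimension configured in Bedrock later.
pc.create_index(
    name="bedrock-kb-index",  # example name
    dimension=512,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

# The index host URL is used later as the "connection string" in the Bedrock storage configuration.
print(pc.describe_index("bedrock-kb-index").host)
```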

B. Open your AWS account, search for the AWS Secrets Manager service, and store the Pinecone API key as a secret using the following steps (a boto3 sketch follows the steps).

  1. Click "store a new secret".
    AWS Secret Manager

  2. Select "Other type of secret". Fill in the key/value as Pinecone API Key. It must be "apiKey" and Pinecone API Key because when I fill different other than "apiKey", such as "pineconeapikey", will failed.
    Pinecone API Key

  3. Fill in the secret name and click "Store".
    Secret name

  4. After the Pinecone API key is stored, copy the secret ARN.
    Secret ARN
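
The same secret can also be stored with boto3. A minimal sketch, assuming the secret name pinecone-api-key (any name works, but the key inside the secret must be "apiKey"):

```python
import json
import boto3

secrets = boto3.client("secretsmanager", region_name="us-east-1")

# The key name inside the secret must be exactly "apiKey" for the Bedrock-Pinecone integration.
response = secrets.create_secret(
    Name="pinecone-api-key",  # example secret name
    SecretString=json.dumps({"apiKey": "YOUR_PINECONE_API_KEY"}),
)

# Keep this ARN; it goes into the knowledge base storage configuration later.
print(response["ARN"])
```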

C. Create the Amazon Bedrock Knowledge Base and Agent using the boto3 SDK. The source code is here.

Knowledge base.
Knowledge base

Knowledge base overview.
Knowledge base overview

The embedding model is Amazon Titan Embeddings and the vector database is Pinecone.
Embedding model and vector database
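
The full source code is linked above; as a rough sketch, the knowledge base with Titan embeddings and Pinecone storage can be created with the bedrock-agent client roughly like this. The role ARN, Pinecone host, secret ARN, and field names are placeholders that must match your own setup.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

kb = bedrock_agent.create_knowledge_base(
    name="finance-kb",  # example name
    roleArn="arn:aws:iam::123456789012:role/BedrockKnowledgeBaseRole",  # placeholder service role
    knowledgeBaseConfiguration={
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            # Amazon Titan Text Embeddings v2, with 512 dimensions to match the Pinecone index
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0",
            "embeddingModelConfiguration": {
                "bedrockEmbeddingModelConfiguration": {"dimensions": 512}
            },
        },
    },
    storageConfiguration={
        "type": "PINECONE",
        "pineconeConfiguration": {
            "connectionString": "https://bedrock-kb-index-xxxxxxx.svc.pinecone.io",  # your index host
            "credentialsSecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key-xxxxxx",
            "fieldMapping": {"textField": "text", "metadataField": "metadata"},
        },
    },
)

kb_id = kb["knowledgeBase"]["knowledgeBaseId"]
print(kb_id)
```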

Data source from Amazon S3.
Data source

Data source configuration.
Data source configuration

Sync the knowledge base. The sync does not run automatically; it must be triggered manually whenever the data source changes.
Sync all knowledge base
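
Creating the S3 data source and triggering the manual sync (an ingestion job) can also be scripted. A minimal sketch, assuming the bucket from the screenshots ("all-in-bedrock") and the knowledge base ID from the previous step:

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")
kb_id = "XXXXXXXXXX"  # knowledge base ID from the previous step

# S3 data source pointing at the bucket that holds the CSV/PDF documents.
ds = bedrock_agent.create_data_source(
    knowledgeBaseId=kb_id,
    name="s3-documents",  # example name
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::all-in-bedrock"},
    },
)

# The sync is not automatic: start an ingestion job whenever the bucket content changes.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=kb_id,
    dataSourceId=ds["dataSource"]["dataSourceId"],
)
print(job["ingestionJob"]["status"])
```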

Copy the inference profile ARN. For this tutorial, copy US Nova Micro. I use Amazon Nova Micro because I only need text, the lowest-latency responses, and very low cost.
Inference profile ARN

Testing the knowledge base with a question outside the knowledge base, before deploying to the Bedrock Agent.
Test knowledge base

Testing the knowledge base with a question covered by the knowledge base, before deploying to the Bedrock Agent.
Test knowledge base
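
Both tests can also be run programmatically with the bedrock-agent-runtime client. A minimal sketch, assuming the US Nova Micro inference profile ARN copied above; the question shown is only an example:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.retrieve_and_generate(
    input={"text": "What is the monthly fee for a savings account?"},  # example question
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "XXXXXXXXXX",  # your knowledge base ID
            # US Nova Micro inference profile ARN copied in the previous step
            "modelArn": "arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.amazon.nova-micro-v1:0",
        },
    },
)
print(response["output"]["text"])
```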

But what if you want to sync from another S3 bucket? Here is the solution.

Go to IAM, then click the IAM role for the Bedrock Knowledge Base, like this. Click the S3 policy.
IAM role for Bedrock Knowledge Base

This policy can only sync the "all-in-bedrock" S3 bucket. If I want to sync another S3 bucket, for example one named "bedrock-kb", the sync fails because this S3 policy does not grant access to it.

S3 policy

S3 policy

Edit the S3 policy configuration as in this screenshot so that it can sync any S3 bucket (a boto3 sketch follows).
After edit S3 policy
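
If you prefer to script that policy change, here is a minimal boto3 sketch that overwrites the inline S3 policy on the knowledge base role so it can read any bucket in the account. The role name and policy name are placeholders; in production you would list the specific buckets instead of a wildcard.

```python
import json
import boto3

iam = boto3.client("iam")

# Broadened S3 permissions so the knowledge base role can sync from any bucket in the account.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::*", "arn:aws:s3:::*/*"],
        }
    ],
}

iam.put_role_policy(
    RoleName="AmazonBedrockExecutionRoleForKnowledgeBase_xxxx",  # placeholder role name
    PolicyName="S3Policy",                                       # placeholder inline policy name
    PolicyDocument=json.dumps(policy_document),
)
```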

Go to Amazon Bedrock, then click the Agent overview.
Bedrock Agent overview

Bedrock Agent detail.
Agent detail

Instructions for this agent, which uses Amazon Nova Micro.
Instruction for this agent

Associate the agent with the knowledge base that was already created (a boto3 sketch follows).
Associate agent with knowledge base
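
The agent creation and knowledge base association shown in these screenshots map to a few bedrock-agent calls. A minimal sketch; the agent name, role ARN, instruction text, and IDs are placeholders:

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

agent = bedrock_agent.create_agent(
    agentName="finance-support-agent",  # example name
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",  # placeholder role
    # US Nova Micro inference profile, as chosen in the console
    foundationModel="arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.amazon.nova-micro-v1:0",
    instruction="You are a financial customer service assistant. Answer only from the knowledge base.",
)
agent_id = agent["agent"]["agentId"]

# Attach the knowledge base to the DRAFT version of the agent.
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion="DRAFT",
    knowledgeBaseId="XXXXXXXXXX",  # your knowledge base ID
    description="Financial customer service knowledge base",
)

# Prepare the agent so a version and alias can be created from it.
bedrock_agent.prepare_agent(agentId=agent_id)
```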

Bedrock Agent version and alias.

Copy the inference profile ARN again for the agent. As before, I use US Nova Micro because I only need text, the lowest-latency responses, and very low cost.
Inference profile ARN

Testing the Bedrock Agent with a question that is not related to the knowledge base. The result: the agent cannot answer because the question is outside the knowledge base.
Test Bedrock Agent

Testing the Bedrock Agent with a question that is related to the knowledge base. The result: the agent can answer because the question is covered by the knowledge base.
Test Bedrock Agent
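
To reproduce these two tests outside the console, the agent can be invoked with the bedrock-agent-runtime client. A minimal sketch; the agent ID, alias ID, and question are placeholders:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.invoke_agent(
    agentId="XXXXXXXXXX",       # your agent ID
    agentAliasId="YYYYYYYYYY",  # alias created from the prepared agent version
    sessionId="test-session-1",
    inputText="How do I reset my internet banking password?",  # example question
)

# The answer is streamed back as chunks; concatenate them into the final text.
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")
print(answer)
```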

You can see the results of testing other questions in the source code.

CONCLUSION : After building this project, I saw impact such as more accurate and faster responses, no need for training or fine-tuning from scratch, less hallucination, proper handling of reworded questions, refusal to answer off-topic questions, and better operational efficiency.

Through this project, I learned hard skills such as vector databases (Pinecone), embeddings (Amazon Titan), and LLMs (Amazon Nova). I also learned soft skills such as continuous learning, because AI trends change quickly, and attention to detail, to make sure the answers to questions are correct and grounded in the dataset.

Thank you,
Budi :)
