Budiono Santoso

RAG (Retrieval Augmented Generation) using Amazon Bedrock and Pinecone

UPDATE : AWS now has Amazon Bedrock AgentCore, which lets you build AI agents with memory, identity, gateway, and observability, and also connects with AI agent frameworks such as LangGraph, CrewAI, etc. Stay tuned!

PROBLEM : The previous customer support process had problems such as repeated customer questions, slow response times, and limited availability.

SOLUTION : I built a Retrieval-Augmented Generation (RAG) system to improve financial customer service.
I created a Bedrock Knowledge Base using Amazon Titan Embeddings and Pinecone, stored the knowledge base documents (CSV, PDF, or anything else) in S3, and tested the knowledge base before deploying it to a Bedrock Agent.
After deploying to the Bedrock Agent, I invoked the agent to make sure the LLM successfully retrieves information from the knowledge base and then answers the question.

REQUIREMENTS :

  1. An AWS account, you can sign up/sign in here
  2. A Pinecone account to get a Pinecone API key, you can sign up/sign in here

STEP-BY-STEP :

A. I use Pinecone because it is the most widely used vector database product, it has a free vector database index, and it can connect to Amazon Bedrock. Open a Pinecone account and create a database index using the following steps (a scripted equivalent with the Pinecone Python client is sketched after the steps).

  1. Click "Create index".
    Create index

  2. Select a text embedding model. Click "llama-text-embed-v2".
    Select text embedding

  3. Configure the text embedding, for example set the dimension to 512.
    Text embedding configuration

  4. Select the capacity mode and cloud provider. On the starter plan, you can only select serverless and AWS; you cannot select pods or other cloud providers such as Google Cloud and Azure.
    For the starter plan, the capacity mode is serverless only and AWS is the only cloud provider

  5. You cannot select another region. Click "Create index".
    The only available region is Virginia, then click Create index

My Pinecone database index for Amazon Bedrock Knowledge Base
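
If you prefer to script the index creation instead of clicking through the console, here is a minimal sketch using the Pinecone Python client. The index name bedrock-kb-index is just an example; the sketch creates a plain 512-dimension serverless index so the dimension matches the Titan embedding configuration used later in Bedrock.

```python
# pip install pinecone
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")  # the same key you will store in Secrets Manager

# Serverless index on AWS us-east-1 (the only capacity mode and cloud on the starter plan).
# The dimension (512) must match the embedding dimension configured in Bedrock later.
pc.create_index(
    name="bedrock-kb-index",  # example name
    dimension=512,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

# The index host URL is used later as the "connection string" in the Bedrock storage configuration.
print(pc.describe_index("bedrock-kb-index").host)
```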

B. Open your AWS account, search for the AWS Secrets Manager service, and store the Pinecone API key as a secret using the following steps (a boto3 sketch follows the steps).

  1. Click "store a new secret".
    AWS Secret Manager

  2. Select "Other type of secret". Fill in the key/value as Pinecone API Key. It must be "apiKey" and Pinecone API Key because when I fill different other than "apiKey", such as "pineconeapikey", will failed.
    Pinecone API Key

  3. Fill in the secret name and click "Store".
    Secret name

  4. After the Pinecone API key is stored, copy the secret ARN.
    Secret ARN
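
The same secret can also be stored with boto3. A minimal sketch, assuming the secret name pinecone-api-key (any name works, but the key inside the secret must be "apiKey"):

```python
import json
import boto3

secrets = boto3.client("secretsmanager", region_name="us-east-1")

# The key name inside the secret must be exactly "apiKey" for the Bedrock-Pinecone integration.
response = secrets.create_secret(
    Name="pinecone-api-key",  # example secret name
    SecretString=json.dumps({"apiKey": "YOUR_PINECONE_API_KEY"}),
)

# Keep this ARN; it goes into the knowledge base storage configuration later.
print(response["ARN"])
```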

C. Create the Amazon Bedrock Knowledge Base and Agent using the boto3 SDK. The source code is here.

Knowledge base.
Knowledge base

Knowledge base overview.
Knowledge base overview

The embedding model is Amazon Titan Embeddings and the vector database is Pinecone.
Embedding model and vector database
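
The full source code is linked above; as a rough sketch, the knowledge base with Titan embeddings and Pinecone storage can be created with the bedrock-agent client roughly like this. The role ARN, Pinecone host, secret ARN, and field names are placeholders that must match your own setup.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

kb = bedrock_agent.create_knowledge_base(
    name="finance-kb",  # example name
    roleArn="arn:aws:iam::123456789012:role/BedrockKnowledgeBaseRole",  # placeholder service role
    knowledgeBaseConfiguration={
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            # Amazon Titan Text Embeddings v2, with 512 dimensions to match the Pinecone index
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0",
            "embeddingModelConfiguration": {
                "bedrockEmbeddingModelConfiguration": {"dimensions": 512}
            },
        },
    },
    storageConfiguration={
        "type": "PINECONE",
        "pineconeConfiguration": {
            "connectionString": "https://bedrock-kb-index-xxxxxxx.svc.pinecone.io",  # your index host
            "credentialsSecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key-xxxxxx",
            "fieldMapping": {"textField": "text", "metadataField": "metadata"},
        },
    },
)

kb_id = kb["knowledgeBase"]["knowledgeBaseId"]
print(kb_id)
```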

Data source from Amazon S3.
Data source

Data source configuration.
Data source configuration

Sync the knowledge base. The sync does not run automatically; it must be triggered manually whenever the data source changes.
Sync all knowledge base
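
Creating the S3 data source and triggering the manual sync (an ingestion job) can also be scripted. A minimal sketch, assuming the bucket from the screenshots ("all-in-bedrock") and the knowledge base ID from the previous step:

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")
kb_id = "XXXXXXXXXX"  # knowledge base ID from the previous step

# S3 data source pointing at the bucket that holds the CSV/PDF documents.
ds = bedrock_agent.create_data_source(
    knowledgeBaseId=kb_id,
    name="s3-documents",  # example name
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::all-in-bedrock"},
    },
)

# The sync is not automatic: start an ingestion job whenever the bucket content changes.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=kb_id,
    dataSourceId=ds["dataSource"]["dataSourceId"],
)
print(job["ingestionJob"]["status"])
```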

Copy the inference profile ARN. For this tutorial, copy US Nova Micro. I use Amazon Nova Micro because I only need text, the lowest-latency responses, and very low cost.
Inference profile ARN

Testing the knowledge base with a question outside the knowledge base, before deploying to the Bedrock Agent.
Test knowledge base

Testing the knowledge base with a question covered by the knowledge base, before deploying to the Bedrock Agent.
Test knowledge base
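
Both tests can also be run programmatically with the bedrock-agent-runtime client. A minimal sketch, assuming the US Nova Micro inference profile ARN copied above; the question shown is only an example:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.retrieve_and_generate(
    input={"text": "What is the monthly fee for a savings account?"},  # example question
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "XXXXXXXXXX",  # your knowledge base ID
            # US Nova Micro inference profile ARN copied in the previous step
            "modelArn": "arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.amazon.nova-micro-v1:0",
        },
    },
)
print(response["output"]["text"])
```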

But what if you want to sync from another S3 bucket? Here is the solution.

Go to IAM, then click the IAM role for the Bedrock Knowledge Base, like this. Click the S3 policy.
IAM role for Bedrock Knowledge Base

This policy can only sync the "all-in-bedrock" S3 bucket. If I want to sync another S3 bucket, for example one named "bedrock-kb", the sync fails because this S3 policy does not grant access to it.

S3 policy

S3 policy

Edit the S3 policy configuration as in this screenshot so that it can sync any S3 bucket (a boto3 sketch follows).
After edit S3 policy
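
If you prefer to script that policy change, here is a minimal boto3 sketch that overwrites the inline S3 policy on the knowledge base role so it can read any bucket in the account. The role name and policy name are placeholders; in production you would list the specific buckets instead of a wildcard.

```python
import json
import boto3

iam = boto3.client("iam")

# Broadened S3 permissions so the knowledge base role can sync from any bucket in the account.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::*", "arn:aws:s3:::*/*"],
        }
    ],
}

iam.put_role_policy(
    RoleName="AmazonBedrockExecutionRoleForKnowledgeBase_xxxx",  # placeholder role name
    PolicyName="S3Policy",                                       # placeholder inline policy name
    PolicyDocument=json.dumps(policy_document),
)
```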

Go to Amazon Bedrock, then click the Agent overview.
Bedrock Agent overview

Bedrock Agent detail.
Agent detail

Instructions for this agent, which uses Amazon Nova Micro.
Instruction for this agent

Associate the agent with the knowledge base that was already created (a boto3 sketch follows).
Associate agent with knowledge base
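
The agent creation and knowledge base association shown in these screenshots map to a few bedrock-agent calls. A minimal sketch; the agent name, role ARN, instruction text, and IDs are placeholders:

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

agent = bedrock_agent.create_agent(
    agentName="finance-support-agent",  # example name
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",  # placeholder role
    # US Nova Micro inference profile, as chosen in the console
    foundationModel="arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.amazon.nova-micro-v1:0",
    instruction="You are a financial customer service assistant. Answer only from the knowledge base.",
)
agent_id = agent["agent"]["agentId"]

# Attach the knowledge base to the DRAFT version of the agent.
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion="DRAFT",
    knowledgeBaseId="XXXXXXXXXX",  # your knowledge base ID
    description="Financial customer service knowledge base",
)

# Prepare the agent so a version and alias can be created from it.
bedrock_agent.prepare_agent(agentId=agent_id)
```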

Bedrock Agent version and alias.

Copy the inference profile ARN again for the agent. As before, I use US Nova Micro because I only need text, the lowest-latency responses, and very low cost.
Inference profile ARN

Testing the Bedrock Agent with a question that is not related to the knowledge base. The result: the agent cannot answer because the question is outside the knowledge base.
Test Bedrock Agent

Testing the Bedrock Agent with a question that is related to the knowledge base. The result: the agent can answer because the question is covered by the knowledge base.
Test Bedrock Agent
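
To reproduce these two tests outside the console, the agent can be invoked with the bedrock-agent-runtime client. A minimal sketch; the agent ID, alias ID, and question are placeholders:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.invoke_agent(
    agentId="XXXXXXXXXX",       # your agent ID
    agentAliasId="YYYYYYYYYY",  # alias created from the prepared agent version
    sessionId="test-session-1",
    inputText="How do I reset my internet banking password?",  # example question
)

# The answer is streamed back as chunks; concatenate them into the final text.
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")
print(answer)
```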

You can see the results of testing other questions in the source code.

CONCLUSION : After building this project, I saw impact such as more accurate and faster responses, no need for training or fine-tuning from scratch, less hallucination, proper handling of reworded questions, refusal to answer off-topic questions, and better operational efficiency.

Through this project, I learned hard skills such as vector databases (Pinecone), embeddings (Amazon Titan), and LLMs (Amazon Nova). I also learned soft skills such as continuous learning, because AI trends change quickly, and attention to detail, to make sure the answers to questions are correct and grounded in the dataset.

Thank you,
Budi :)
