Gustavo Aleixo

Posted on Jun 14

Enhancing RAG Precision Using Bedrock Metadata

#rag #aws #ai #programming

Enhance Your Retrieval Accuracy with Bedrock Knowledge Base Metadata

In one of the projects where I worked with Retrieval-Augmented Generation (RAG) using AWS Bedrock and Knowledge Base, I encountered a retrieval accuracy issue. The client had multiple documents containing similar content, but each was tailored for different audiences.

Example Scenario:

Imagine a high school textbook and a college textbook. Both may cover the same subject matter, but typically, the college version contains more advanced content.

Now, how can we ensure that high school students only retrieve information from the high school materials, even when both files contain overlapping topics?

The solution I implemented for this challenge was to use metadata filters.

Why Use Metadata Filters?

Metadata lets you tag your files with attributes that Bedrock understands and can filter on during retrieval. Responses are then generated only from the documents that match the filter, which narrows the search space and makes the results more relevant to the person asking.

In my case, since I needed to separate and categorize the search results based on the target audience, I created a custom metadata attribute for each document.

How to Use Metadata with Bedrock Knowledge Base

If you’re using an S3 bucket as the data source for your Knowledge Base, you can create a .metadata.json file for each document you upload. This JSON file must sit in the same folder as the document and use the document’s full file name, including its original extension, followed by .metadata.json.

Example:

For a college textbook (college-book.pdf), create:

File: college-book.pdf.metadata.json

Content:

{
  "metadataAttributes": {
    "level": "college"
  }
}

For a high school textbook (high-school-book.pdf), create:

File: high-school-book.pdf.metadata.json

Content:

{
  "metadataAttributes": {
    "level": "high-school"
  }
}
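If you have many documents, writing these files by hand gets tedious. Here is a minimal sketch of how the sidecar files could be generated with Python's standard library. The helper name write_metadata_sidecar is my own, not part of any AWS SDK, and it assumes the documents are on local disk before upload.

```python
import json
from pathlib import Path


def write_metadata_sidecar(document: Path, attributes: dict) -> Path:
    """Write <document file name>.metadata.json next to the document.

    Hypothetical helper for illustration: it only builds the JSON shape
    shown above and does not talk to AWS.
    """
    # Keep the original extension: college-book.pdf -> college-book.pdf.metadata.json
    sidecar = document.with_name(document.name + ".metadata.json")
    payload = {"metadataAttributes": attributes}
    sidecar.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return sidecar
```

For example, calling write_metadata_sidecar(Path("college-book.pdf"), {"level": "college"}) would produce the first file shown above.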

Upload these files to the S3 bucket, alongside the documents they describe.
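The upload can also be scripted. The sketch below assumes boto3 is installed and that you pass in a client created with boto3.client("s3"). The function name upload_with_metadata and the prefix parameter are hypothetical, chosen only for this example.

```python
from pathlib import Path


def upload_with_metadata(s3_client, bucket: str, document: Path, prefix: str = "") -> list:
    """Upload a document and its .metadata.json sidecar under the same S3 prefix.

    s3_client is assumed to be a boto3 S3 client. The sidecar is expected
    to already exist next to the document on local disk.
    """
    sidecar = document.with_name(document.name + ".metadata.json")
    uploaded_keys = []
    for path in (document, sidecar):
        # Same prefix for both files, so the metadata lands in the same folder.
        key = f"{prefix}{path.name}"
        s3_client.upload_file(str(path), bucket, key)
        uploaded_keys.append(key)
    return uploaded_keys
```

Uploading both files in one call keeps the pair together, which matters because the metadata file is matched to its document by name and location.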

After that, run a sync on the Knowledge Base data source, wait a few minutes, and that’s it!

Your sync history will show the number of source files and metadata files found in your bucket.
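If you prefer to trigger the sync from code instead of the console, a rough sketch with boto3 could look like the following. It assumes a client created with boto3.client("bedrock-agent") and that you already know your Knowledge Base ID and data source ID. The function name sync_knowledge_base and the polling interval are my own choices.

```python
import time


def sync_knowledge_base(agent_client, knowledge_base_id: str, data_source_id: str,
                        poll_seconds: float = 10.0) -> dict:
    """Start an ingestion job (a sync) and poll until it leaves the running states.

    agent_client is assumed to be a boto3 "bedrock-agent" client.
    Returns the final ingestionJob dictionary reported by the service.
    """
    job = agent_client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]

    while job["status"] in ("STARTING", "IN_PROGRESS"):
        time.sleep(poll_seconds)
        job = agent_client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job
```

The returned job includes a statistics section, which is where counts such as scanned documents and scanned metadata documents are reported, so you can check that your metadata files were picked up.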

Testing your metadata

To test your metadata, go to the Knowledge Base screen, click the “Test Knowledge Base” button, and then:

Select your model
Find the “Data manipulation” section
Click Filters
Select the “Manual filters” option
Enter your metadata filter (for example, the key level with the value high-school)

Final Tip

When making a query to the Knowledge Base, use the metadata filter option provided by Bedrock (e.g., filtering by "level": "high-school"), so only documents with the specified metadata will be considered during retrieval.
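In code, the same filter can be passed to the Retrieve API. This is a minimal sketch, assuming a client created with boto3.client("bedrock-agent-runtime"). The helper names level_filter and retrieve_for_level, and the default of five results, are assumptions made for this example.

```python
def level_filter(level: str) -> dict:
    """Build an equality filter on the custom "level" metadata attribute."""
    return {"equals": {"key": "level", "value": level}}


def retrieve_for_level(runtime_client, knowledge_base_id: str, query: str,
                       level: str, top_k: int = 5) -> list:
    """Retrieve chunks only from documents whose metadata matches the level.

    runtime_client is assumed to be a boto3 "bedrock-agent-runtime" client.
    """
    response = runtime_client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": top_k,
                # Only documents tagged with this level are considered.
                "filter": level_filter(level),
            }
        },
    )
    return response["retrievalResults"]
```

With this in place, a call such as retrieve_for_level(client, kb_id, "What is photosynthesis?", "high-school") should only return passages from the high school material.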

By using this approach, you can improve the precision and relevance of your RAG applications with AWS Bedrock, especially when your documents overlap in content but target different audiences.
