Introduction
Recently, while working with Bedrock Knowledge Bases at my day job, I ran into some challenges related to the service's specifications that I'd like to share.
Background
Currently, I'm developing a multi-tenant application using Bedrock Knowledge Bases (referred to as KB below). Briefly, KB is a managed orchestrator for building LLM RAG: it handles chunking and vectorizing files into a vector store, and when combined with a Bedrock Agent it can generate context-aware conversations.
We're using OpenSearch as our vector store, and our initial design created a separate KB and index for each tenant. This guarantees data isolation between tenants, which seemed like a natural design choice at the time.

The Problem
At some point, while checking Bedrock's service quotas, I found the Knowledge bases per account quota. This caps the number of KBs you can create in a single account at 100, and it's a hard limit that can't be raised. With our initial design, the application could therefore support at most 100 tenants, so we needed to reconsider the design.
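If you want to check this kind of limit programmatically, the Service Quotas API can list Bedrock's quotas. Here's a minimal sketch; the match on the quota's display name is an assumption based on how it appears in the console:

import boto3

# Look up Bedrock's knowledge base limit via the Service Quotas API.
# Assumption: the quota's display name contains "Knowledge bases",
# which is how it's shown in the console.
client = boto3.client('service-quotas')

paginator = client.get_paginator('list_aws_default_service_quotas')
for page in paginator.paginate(ServiceCode='bedrock'):
    for quota in page['Quotas']:
        if 'Knowledge bases' in quota['QuotaName']:
            print(quota['QuotaName'], quota['Value'], 'adjustable:', quota['Adjustable'])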
Solution
In reconsidering the design, we changed it so that a single KB and index are shared across multiple tenants. Since a KB has several vectorization parameters, such as the chunking strategy and the maximum token count per chunk, we prepared several preset combinations of these parameters and let users pick one; tenants that choose the same preset share a KB.
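To make the preset idea concrete, here's a minimal sketch of how a preset could translate into a chunking configuration when creating a data source with the bedrock-agent client. The preset names, parameter values, KB ID, and bucket are all placeholders for illustration, not our actual settings:

import boto3

# Hypothetical presets: each option a user can pick maps to a chunking
# configuration. Names and values here are made up for illustration.
CHUNKING_PRESETS = {
    'small_chunks': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {'maxTokens': 300, 'overlapPercentage': 20},
    },
    'large_chunks': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {'maxTokens': 1000, 'overlapPercentage': 10},
    },
}

client = boto3.client('bedrock-agent')

# One data source per preset, attached to the KB shared by tenants on that preset.
response = client.create_data_source(
    knowledgeBaseId='$shared_kb_id',
    name='shared-small-chunks',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {'bucketArn': 'arn:aws:s3:::example-kb-bucket'},
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': CHUNKING_PRESETS['small_chunks'],
    },
)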
An important consideration with this approach is making sure that one tenant's data is never referenced during another tenant's conversations. KB can attach custom metadata to documents during vectorization, so we adopted a method of attaching a tenant_id as metadata and filtering documents by that ID during conversations.
https://docs.aws.amazon.com/bedrock/latest/userguide/kb-metadata.html
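As an aside, for documents ingested from an S3 data source, the page above describes attaching metadata through a sidecar file named <document>.metadata.json placed next to the document. A minimal sketch of uploading one; the bucket and key names are placeholders:

import json
import boto3

# Sidecar metadata file for an S3 data source: the file name must be the
# document's key plus ".metadata.json". Bucket and keys are placeholders.
s3 = boto3.client('s3')
metadata = {'metadataAttributes': {'tenant_id': 'tenant-123'}}
s3.put_object(
    Bucket='example-kb-bucket',
    Key='docs/report.pdf.metadata.json',
    Body=json.dumps(metadata),
)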
Here's the conceptual architecture:
- Shared Knowledge Base across multiple tenants
- Custom metadata (tenant_id) attached to each document
- Metadata filtering during retrieval to ensure data isolation
Below is sample code for attaching metadata to documents:
# ingest_documents: attach a tenant_id as inline metadata at ingestion time
import boto3

client = boto3.client('bedrock-agent')

response = client.ingest_knowledge_base_documents(
    knowledgeBaseId='string',
    dataSourceId='string',
    clientToken='string',
    documents=[
        {
            # Each inline attribute is stored alongside the document's vectors
            'metadata': {
                'type': 'IN_LINE_ATTRIBUTE',
                'inlineAttributes': [
                    {
                        'key': 'tenant_id',
                        'value': {
                            'type': 'STRING',
                            'stringValue': '$tenant_id',
                        },
                    },
                ],
            },
            'content': {
                ...
            },
        },
    ],
)
To filter documents by metadata when conversing with the agent, pass a retrieval filter in the session state like this:
# invoke_agent: scope retrieval to the calling tenant with a metadata filter
import boto3

client = boto3.client('bedrock-agent-runtime')

response = client.invoke_agent(
    agentId='string',
    agentAliasId='string',
    sessionId='string',
    inputText='string',
    sessionState={
        'knowledgeBaseConfigurations': [
            {
                'knowledgeBaseId': '$vector_store_id',
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        # Only chunks whose tenant_id matches are retrieved
                        'filter': {
                            'equals': {
                                'key': 'tenant_id',
                                'value': '$tenant_id',
                            },
                        },
                    },
                },
            },
        ],
    },
)
Future Plans
With the above implementation, we can build the application while staying clear of the quota.
In multi-tenant applications, it's crucial to verify that one tenant cannot access another tenant's data, so I'm thinking of building a monitoring mechanism for exactly that. For example: create several test tenants, insert different documents into the same vector store for each, ask questions about other tenants' documents, and verify that no answers come back. A script like this could run regularly in staging environments (a sketch follows below). Monitoring system resources like CPU is important, but I believe it's equally crucial to monitor the data itself, ensuring that states the system's specifications forbid never actually occur.
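Here's a minimal sketch of what such a check might look like, using the Retrieve API directly rather than a full agent conversation. The tenant IDs, the helper function, and the query text are placeholders I made up; treat this as one possible shape for the check, not our actual script:

import boto3

# Cross-tenant isolation check (sketch): query with tenant A's filter and
# assert that nothing belonging to tenant B comes back.
client = boto3.client('bedrock-agent-runtime')

def assert_isolated(kb_id, querying_tenant, other_tenant, query_text):
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': query_text},
        retrievalConfiguration={
            'vectorSearchConfiguration': {
                'filter': {'equals': {'key': 'tenant_id', 'value': querying_tenant}},
            },
        },
    )
    for result in response['retrievalResults']:
        metadata = result.get('metadata', {})
        assert metadata.get('tenant_id') != other_tenant, 'cross-tenant leak detected'

assert_isolated('$vector_store_id', 'tenant-a', 'tenant-b',
                'a question about tenant-b documents')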
Conclusion
These are the issues we ran into with KB's specifications and how we worked around them. Through this experience, I was reminded of the importance of checking a cloud service's quotas and specifications before settling on a system design. I hope this article is helpful to you.