OpenSearch Serverless costs a minimum of ~$700/month even at zero traffic. S3 Vectors cuts vector storage costs by up to 90% with zero infrastructure to manage. Here's how to wire it into Bedrock Knowledge Bases with Terraform.
In RAG Post 1, we deployed a Bedrock Knowledge Base with OpenSearch Serverless as the vector store. It works, but there's a painful cost problem: OpenSearch Serverless has a minimum of 4 OCUs (2 indexing + 2 search) that run 24/7, costing roughly $700/month even with zero queries.
For dev environments, internal tools, or any RAG pipeline that isn't serving real-time, user-facing search, that's a hard number to justify. Amazon S3 Vectors changes the equation: it's a new S3 bucket type with native vector storage and similarity search built in - no infrastructure to provision, no minimum costs, and up to 90% cheaper than conventional vector databases. It's now generally available, with support for up to 2 billion vectors per index. This post covers how to set it up with Terraform and when to use it. 🎯
## 💸 The Cost Problem with OpenSearch Serverless
Let's look at what a dev/test Knowledge Base actually costs with OpenSearch Serverless:
| Component | Monthly Cost (us-east-1) |
|---|---|
| OpenSearch Serverless (4 OCU minimum) | ~$700 |
| S3 data source bucket | ~$1 |
| Embedding model (Titan V2, light usage) | ~$5 |
| Total | ~$706 |
Even with standby replicas disabled (our Post 1 dev config), you're paying for compute that sits idle most of the time. For a team running four Knowledge Bases in dev, that's $2,800/month in vector store costs alone.
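The ~$700 figure is simple OCU arithmetic. Here's a quick sanity check, assuming the us-east-1 list price of $0.24 per OCU-hour (verify against current AWS pricing):

```python
# Back-of-envelope: OpenSearch Serverless minimum-footprint cost.
# Assumes $0.24/OCU-hour (us-east-1 list price at time of writing).
OCU_PRICE_PER_HOUR = 0.24
HOURS_PER_MONTH = 730  # average hours in a month

min_ocus = 4  # 2 indexing + 2 search, running 24/7
monthly_cost = min_ocus * OCU_PRICE_PER_HOUR * HOURS_PER_MONTH
print(f"${monthly_cost:,.0f}/month at zero traffic")
```

That floor exists whether you run one query or one million.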
## 🪣 S3 Vectors: What It Is
S3 Vectors is a purpose-built S3 bucket type for storing and querying vector embeddings. Inside a vector bucket, you create vector indexes that hold your embeddings along with metadata. The key differences from a regular S3 bucket:
- Native similarity search - query vectors directly with cosine or Euclidean distance
- Zero infrastructure - no clusters, OCUs, or provisioning
- Pay per use - storage at $0.06/GB/month, queries at $2.50 per million, PUTs at $0.20/GB
- Scale - up to 2 billion vectors per index, 10,000 indexes per bucket
- Sub-second queries - 100ms for frequent queries, up to 800ms for infrequent
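Those pay-per-use prices make a back-of-envelope estimate easy. A sketch for a hypothetical workload of 10K documents (~50K chunks at 1,024 dimensions) and 100K queries/month, using the list prices above - the assumptions here (chunks per doc, metadata overhead ignored) are illustrative, and real bills also depend on per-query data processed:

```python
# Rough S3 Vectors cost estimate -- illustrative assumptions, not a quote.
STORAGE_PER_GB_MONTH = 0.06
QUERY_PER_MILLION = 2.50
PUT_PER_GB = 0.20

vectors = 10_000 * 5           # assume ~5 chunks per document
bytes_per_vector = 1024 * 4    # 1,024-dim float32 embedding
storage_gb = vectors * bytes_per_vector / 1e9

storage_cost = storage_gb * STORAGE_PER_GB_MONTH   # recurring
query_cost = (100_000 / 1_000_000) * QUERY_PER_MILLION
put_cost = storage_gb * PUT_PER_GB                 # one-time ingestion

monthly = storage_cost + query_cost
print(f"{storage_gb:.2f} GB stored -> ${monthly:.2f}/month + ${put_cost:.2f} one-time")
```

Even with generous headroom for metadata and extra query volume, this lands orders of magnitude below the OCU floor.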
### Cost Comparison: Same RAG Pipeline
| Component | OpenSearch Serverless | S3 Vectors |
|---|---|---|
| Vector store (10K docs, light queries) | ~$700/month | ~$2/month |
| S3 data source | ~$1 | ~$1 |
| Embedding model | ~$5 | ~$5 |
| Total | ~$706 | ~$8 |
That's a 98% cost reduction for a typical dev/test workload.
## 🔧 Terraform Setup
S3 Vectors support in the Terraform AWS provider landed in v6.27.0. Here's the full setup:
### Vector Bucket and Index
```hcl
# rag/s3_vectors.tf
resource "aws_s3vectors_bucket" "kb_vectors" {
  bucket_name = "${var.environment}-${var.project}-kb-vectors"

  encryption_configuration {
    sse_algorithm = "aws:kms"
    kms_key_arn   = var.kms_key_arn
  }

  tags = var.tags
}

resource "aws_s3vectors_index" "kb_index" {
  bucket_name     = aws_s3vectors_bucket.kb_vectors.bucket_name
  index_name      = "bedrock-knowledge-base-default-index"
  dimension       = var.embedding_dimensions
  distance_metric = "cosine"

  tags = var.tags
}
```
### IAM Role for Bedrock
The Knowledge Base service role needs permissions for both S3 Vectors APIs and the data source bucket:
```hcl
# rag/iam.tf
data "aws_caller_identity" "current" {}

resource "aws_iam_role" "kb_role" {
  name = "${var.environment}-${var.project}-kb-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "bedrock.amazonaws.com" }
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = data.aws_caller_identity.current.account_id
        }
      }
    }]
  })
}

# S3 Vectors permissions (note: the s3vectors API actions are plural)
resource "aws_iam_role_policy" "kb_s3vectors" {
  name = "s3-vectors-access"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3vectors:GetIndex",
        "s3vectors:PutVectors",
        "s3vectors:GetVectors",
        "s3vectors:DeleteVectors",
        "s3vectors:QueryVectors",
        "s3vectors:ListVectors"
      ]
      Resource = [
        aws_s3vectors_bucket.kb_vectors.arn,
        "${aws_s3vectors_bucket.kb_vectors.arn}/*"
      ]
    }]
  })
}

# Embedding model access
resource "aws_iam_role_policy" "kb_bedrock" {
  name = "bedrock-invoke"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "bedrock:InvokeModel"
      Resource = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }]
  })
}

# S3 data source read access
resource "aws_iam_role_policy" "kb_s3_read" {
  name = "s3-data-source-read"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["s3:GetObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.kb_docs.arn,
        "${aws_s3_bucket.kb_docs.arn}/*"
      ]
    }]
  })
}
```
### Knowledge Base with S3 Vectors
```hcl
# rag/knowledge_base.tf
resource "aws_bedrockagent_knowledge_base" "this" {
  name     = "${var.environment}-${var.project}-kb"
  role_arn = aws_iam_role.kb_role.arn

  knowledge_base_configuration {
    type = "VECTOR"
    vector_knowledge_base_configuration {
      embedding_model_arn = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }
  }

  storage_configuration {
    type = "S3_VECTORS"
    s3_vectors_configuration {
      s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
      vector_index_arn     = aws_s3vectors_index.kb_index.arn
    }
  }
}

resource "aws_bedrockagent_data_source" "s3" {
  name              = "${var.environment}-${var.project}-s3-source"
  knowledge_base_id = aws_bedrockagent_knowledge_base.this.id

  data_source_configuration {
    type = "S3"
    s3_configuration {
      bucket_arn = aws_s3_bucket.kb_docs.arn
    }
  }

  vector_ingestion_configuration {
    chunking_configuration {
      chunking_strategy = var.chunking_strategy

      # Only emit the fixed-size block when that strategy is selected
      dynamic "fixed_size_chunking_configuration" {
        for_each = var.chunking_strategy == "FIXED_SIZE" ? [1] : []
        content {
          max_tokens         = var.chunk_max_tokens
          overlap_percentage = var.chunk_overlap_pct
        }
      }
    }
  }
}
```
## ⚠️ Limitations: What S3 Vectors Can't Do
Before you migrate everything, know the trade-offs:
| Feature | S3 Vectors | OpenSearch Serverless |
|---|---|---|
| Hybrid search | ❌ Semantic only | ✅ Vector + keyword |
| Query latency | 100-800ms | 10-100ms |
| Throughput | Hundreds of QPS per bucket | Thousands of QPS |
| Hierarchical chunking | ⚠️ Limited (metadata size constraints) | ✅ Full support |
| Metadata filtering | ⚠️ Post-search filtering | ✅ Pre-search filtering |
| Minimum cost | $0 (pay per use) | ~$700/month (4 OCUs) |
| Max vectors/index | 2 billion | Varies by OCU count |
The big one: No hybrid search. S3 Vectors supports semantic (vector) search only. If your users search for product codes, policy numbers, or exact identifiers, you need keyword matching - and that means OpenSearch. For natural language questions against internal docs, semantic-only search works well.
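If you're on S3 Vectors and still occasionally get identifier-style queries, one pragmatic workaround (a hypothetical sketch, not a Bedrock feature) is to detect them up front and route them to a different path, such as a metadata filter or an exact-match lookup:

```python
import re

# Hypothetical router: identifier-looking queries (SKUs, policy numbers)
# go to an exact-match path; everything else goes to semantic search.
IDENTIFIER_PATTERN = re.compile(r"\b(?:[A-Z]{2,}-\d+|\d{6,})\b")

def route_query(query: str) -> str:
    if IDENTIFIER_PATTERN.search(query):
        return "keyword"   # exact-match path (metadata filter / OpenSearch)
    return "semantic"      # S3 Vectors similarity search

print(route_query("What is policy POL-48213 about?"))
print(route_query("How do refunds work?"))
```

It's a band-aid, not a substitute for true hybrid search, but it covers the common "paste an ID into the chatbot" case.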
## 📊 When to Use What
| Scenario | Recommended Store | Why |
|---|---|---|
| Dev/test environments | S3 Vectors | Zero idle cost, adequate performance |
| Internal doc search (< 100 QPS) | S3 Vectors | Cost-effective, latency acceptable |
| Customer-facing chatbot | OpenSearch Serverless | Low latency, hybrid search |
| Mixed content with codes/IDs | OpenSearch Serverless | Needs keyword matching |
| Staging (prod parity needed) | OpenSearch Serverless | Match prod behavior |
| Archive/long-tail vectors | S3 Vectors | 90% cheaper long-term storage |
### The Tiered Strategy
For production at scale, AWS recommends combining both: S3 Vectors for long-term, infrequently queried vectors, and OpenSearch for hot, real-time search. S3 Vectors can export an index to OpenSearch when those vectors need higher performance, but the tiering itself isn't automatic - you decide when to promote vectors and trigger the export through the console or API.
## 🌍 Environment-Specific Configuration
Use Terraform variables to swap vector stores per environment:
```hcl
# environments/dev.tfvars
vector_store_type    = "S3_VECTORS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "FIXED_SIZE"
chunk_max_tokens     = 300
chunk_overlap_pct    = 10
```

```hcl
# environments/prod.tfvars
vector_store_type    = "OPENSEARCH_SERVERLESS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "HIERARCHICAL"
chunk_max_tokens     = 1500 # parent chunk size
chunk_overlap_pct    = 20
```
Then use dynamic blocks in your Knowledge Base resource to switch between S3 Vectors and OpenSearch based on the variable. This gives you S3 Vectors in dev (near-zero cost) and OpenSearch Serverless in prod (full capabilities).
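One way to sketch that switch - the S3 Vectors attribute names follow the Knowledge Base block shown earlier, the OpenSearch block and `var.opensearch_collection_arn` are assumptions to verify against your provider version:

```hcl
# Inside aws_bedrockagent_knowledge_base: pick the store per environment.
storage_configuration {
  type = var.vector_store_type # "S3_VECTORS" or "OPENSEARCH_SERVERLESS"

  dynamic "s3_vectors_configuration" {
    for_each = var.vector_store_type == "S3_VECTORS" ? [1] : []
    content {
      s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
      vector_index_arn     = aws_s3vectors_index.kb_index.arn
    }
  }

  dynamic "opensearch_serverless_configuration" {
    for_each = var.vector_store_type == "OPENSEARCH_SERVERLESS" ? [1] : []
    content {
      collection_arn    = var.opensearch_collection_arn # assumed variable
      vector_index_name = "bedrock-knowledge-base-default-index"
      field_mapping {
        vector_field   = "bedrock-knowledge-base-default-vector"
        text_field     = "AMAZON_BEDROCK_TEXT_CHUNK"
        metadata_field = "AMAZON_BEDROCK_METADATA"
      }
    }
  }
}
```

The `for_each = [...] ? [1] : []` trick emits exactly one of the two configuration blocks, so each environment plans only the store it actually uses.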
## 🧪 Querying: Same API, Different Store
The best part - your application code doesn't change. RetrieveAndGenerate and Retrieve work identically regardless of the backing vector store:
```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is our return policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
        },
    },
)
print(response["output"]["text"])
```
Same code, same API, same citations. The only difference is what's behind the curtain.
## ➡️ What's Next
This is Post 3 of the AWS RAG Pipeline with Terraform series.
- Post 1: Bedrock Knowledge Base - Basic Setup 🚀
- Post 2: Advanced RAG - Chunking, Search, Reranking 🧠
- Post 3: S3 Vectors - Cheapest Vector Store (you are here) 💰
Your dev Knowledge Base just went from $706/month to $8/month. S3 Vectors gives you the same Bedrock APIs with zero infrastructure and pay-per-use pricing. Use it in dev, use it for internal tools, and keep OpenSearch Serverless where you need real-time hybrid search. 💰
Found this helpful? Follow for the full RAG Pipeline with Terraform series! 💬