Suhas Mallesh
S3 Vectors for RAG: The Cheapest Vector Store for Bedrock Knowledge Bases with Terraform 💰

OpenSearch Serverless costs a minimum of ~$700/month even at zero traffic. S3 Vectors cuts vector storage costs by up to 90% with zero infrastructure to manage. Here's how to wire it into Bedrock Knowledge Bases with Terraform.

In RAG Post 1, we deployed a Bedrock Knowledge Base with OpenSearch Serverless as the vector store. It works, but there's a painful cost problem: OpenSearch Serverless has a minimum of 4 OCUs (2 indexing + 2 search) that run 24/7, costing roughly $700/month even with zero queries.

For dev environments, internal tools, or any RAG pipeline that isn't serving real-time user-facing search, that's a hard number to justify. Amazon S3 Vectors changes the equation entirely. It's a new S3 bucket type with native vector storage and similarity search built in - no infrastructure to provision, no minimum costs, and up to 90% cheaper than conventional vector databases. It's now generally available, with support for up to 2 billion vectors per index. This post covers how to set it up with Terraform and when to use it. 🎯

💸 The Cost Problem with OpenSearch Serverless

Let's look at what a dev/test Knowledge Base actually costs with OpenSearch Serverless:

| Component | Monthly Cost (us-east-1) |
| --- | --- |
| OpenSearch Serverless (4 OCU minimum) | ~$700 |
| S3 data source bucket | ~$1 |
| Embedding model (Titan V2, light usage) | ~$5 |
| **Total** | **~$706** |

Even with standby replicas disabled (our Post 1 dev config), you're still paying for compute that sits idle most of the time. For a team running 3-4 Knowledge Bases in dev, that's $2,800/month in vector store costs alone.

🪣 S3 Vectors: What It Is

S3 Vectors is a purpose-built S3 bucket type for storing and querying vector embeddings. Inside a vector bucket, you create vector indexes that hold your embeddings along with metadata. The key differences from a regular S3 bucket:

  • Native similarity search - query vectors directly with cosine or Euclidean distance
  • Zero infrastructure - no clusters, OCUs, or provisioning
  • Pay per use - storage at $0.06/GB/month, queries at $2.50 per million, PUTs at $0.20/GB
  • Scale - up to 2 billion vectors per index, 10,000 indexes per bucket
  • Sub-second queries - 100ms for frequent queries, up to 800ms for infrequent
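To make the distance metric concrete, here's a minimal sketch of cosine distance in plain Python. This is illustrative only - S3 Vectors computes this server-side when you query an index:

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; lower means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 1.0
```

When you create an index with `distance_metric = "cosine"`, every query ranks stored embeddings by this measure against your query vector.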

Cost Comparison: Same RAG Pipeline

| Component | OpenSearch Serverless | S3 Vectors |
| --- | --- | --- |
| Vector store (10K docs, light queries) | ~$700/month | ~$2/month |
| S3 data source | ~$1 | ~$1 |
| Embedding model | ~$5 | ~$5 |
| **Total** | **~$706** | **~$8** |

That's a 98% cost reduction for a typical dev/test workload.
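To sanity-check those numbers, here's a rough estimator using the list prices quoted above ($0.06/GB-month storage, $2.50 per million queries, $0.20/GB on PUT). It assumes 4 bytes per float32 dimension and ignores metadata overhead and per-request minimums, so treat it as an order-of-magnitude sketch, not a billing calculator:

```python
def s3_vectors_monthly_cost(num_vectors, dims, queries_per_month, gb_ingested):
    """Rough monthly estimate from list prices; ignores metadata overhead."""
    storage_gb = num_vectors * dims * 4 / 1e9    # float32 = 4 bytes/dimension
    storage_cost = storage_gb * 0.06             # $0.06 per GB-month
    query_cost = queries_per_month / 1e6 * 2.50  # $2.50 per million queries
    put_cost = gb_ingested * 0.20                # $0.20 per GB ingested
    return storage_cost + query_cost + put_cost

# ~100K chunks at 1024 dimensions, 50K queries/month, 0.5 GB ingested
print(f"${s3_vectors_monthly_cost(100_000, 1024, 50_000, 0.5):.2f}/month")
```

Even with generous padding for metadata and request overhead, this lands orders of magnitude below the ~$700 OCU floor.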

🔧 Terraform Setup

S3 Vectors support in the Terraform AWS provider landed in v6.27.0. Here's the full setup:

Vector Bucket and Index

# rag/s3_vectors.tf

resource "aws_s3vectors_bucket" "kb_vectors" {
  bucket_name = "${var.environment}-${var.project}-kb-vectors"

  encryption_configuration {
    sse_algorithm = "aws:kms"
    kms_key_arn   = var.kms_key_arn
  }

  tags = var.tags
}

resource "aws_s3vectors_index" "kb_index" {
  bucket_name = aws_s3vectors_bucket.kb_vectors.bucket_name
  index_name  = "bedrock-knowledge-base-default-index"

  dimension       = var.embedding_dimensions
  distance_metric = "cosine"

  tags = var.tags
}

IAM Role for Bedrock

The Knowledge Base service role needs permissions for both S3 Vectors APIs and the data source bucket:

# rag/iam.tf

resource "aws_iam_role" "kb_role" {
  name = "${var.environment}-${var.project}-kb-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "bedrock.amazonaws.com" }
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = data.aws_caller_identity.current.account_id
        }
      }
    }]
  })
}

# S3 Vectors permissions
resource "aws_iam_role_policy" "kb_s3vectors" {
  name = "s3-vectors-access"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3vectors:GetIndex",
        "s3vectors:GetVectors",
        "s3vectors:PutVectors",
        "s3vectors:DeleteVectors",
        "s3vectors:QueryVectors",
        "s3vectors:ListVectors"
      ]
      Resource = [
        aws_s3vectors_bucket.kb_vectors.arn,
        "${aws_s3vectors_bucket.kb_vectors.arn}/*"
      ]
    }]
  })
}

# Embedding model access
resource "aws_iam_role_policy" "kb_bedrock" {
  name = "bedrock-invoke"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "bedrock:InvokeModel"
      Resource = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }]
  })
}

# S3 data source read access
resource "aws_iam_role_policy" "kb_s3_read" {
  name = "s3-data-source-read"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.kb_docs.arn,
        "${aws_s3_bucket.kb_docs.arn}/*"
      ]
    }]
  })
}

Knowledge Base with S3 Vectors

# rag/knowledge_base.tf

resource "aws_bedrockagent_knowledge_base" "this" {
  name     = "${var.environment}-${var.project}-kb"
  role_arn = aws_iam_role.kb_role.arn

  knowledge_base_configuration {
    type = "VECTOR"
    vector_knowledge_base_configuration {
      embedding_model_arn = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }
  }

  storage_configuration {
    type = "S3_VECTORS"
    s3_vectors_configuration {
      s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
      vector_index_arn     = aws_s3vectors_index.kb_index.arn
    }
  }
}

resource "aws_bedrockagent_data_source" "s3" {
  name              = "${var.environment}-${var.project}-s3-source"
  knowledge_base_id = aws_bedrockagent_knowledge_base.this.id

  data_source_configuration {
    type = "S3"
    s3_configuration {
      bucket_arn = aws_s3_bucket.kb_docs.arn
    }
  }

  vector_ingestion_configuration {
    chunking_configuration {
      chunking_strategy = var.chunking_strategy

      dynamic "fixed_size_chunking_configuration" {
        for_each = var.chunking_strategy == "FIXED_SIZE" ? [1] : []
        content {
          max_tokens         = var.chunk_max_tokens
          overlap_percentage = var.chunk_overlap_pct
        }
      }
    }
  }
}

⚠️ Limitations: What S3 Vectors Can't Do

Before you migrate everything, know the trade-offs:

| Feature | S3 Vectors | OpenSearch Serverless |
| --- | --- | --- |
| Hybrid search | ❌ Semantic only | ✅ Vector + keyword |
| Query latency | 100-800ms | 10-100ms |
| Throughput | Hundreds of QPS per bucket | Thousands of QPS |
| Hierarchical chunking | ⚠️ Limited (metadata size constraints) | ✅ Full support |
| Metadata filtering | ✅ Post-search filtering | ✅ Pre-search filtering |
| Minimum cost | $0 (pay per use) | ~$700/month (4 OCUs) |
| Max vectors/index | 2 billion | Varies by OCU |

The big one: No hybrid search. S3 Vectors supports semantic (vector) search only. If your users search for product codes, policy numbers, or exact identifiers, you need keyword matching - and that means OpenSearch. For natural language questions against internal docs, semantic-only search works well.
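If only an occasional query needs exact-identifier matching, one stopgap (my own workaround sketch, not an AWS feature) is to over-fetch from Retrieve and filter the returned chunks client-side. Note this can only narrow results - it can't surface a chunk the vector search never returned:

```python
def keyword_filter(chunks, required_term):
    """Keep only retrieved chunks whose text contains the exact term."""
    return [c for c in chunks if required_term.lower() in c["text"].lower()]

# Shape mimics Retrieve results: chunk text plus relevance score
chunks = [
    {"text": "Policy POL-1234 covers returns within 30 days.", "score": 0.81},
    {"text": "Our general return policy allows refunds.", "score": 0.78},
]

print(keyword_filter(chunks, "POL-1234"))  # only the first chunk survives
```

If most of your queries look like this, don't fight it - use OpenSearch and get real hybrid search.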

πŸ“ When to Use What

| Scenario | Recommended Store | Why |
| --- | --- | --- |
| Dev/test environments | S3 Vectors | Zero idle cost, adequate performance |
| Internal doc search (< 100 QPS) | S3 Vectors | Cost-effective, latency acceptable |
| Customer-facing chatbot | OpenSearch Serverless | Low latency, hybrid search |
| Mixed content with codes/IDs | OpenSearch Serverless | Needs keyword matching |
| Staging (prod parity needed) | OpenSearch Serverless | Match prod behavior |
| Archive/long-tail vectors | S3 Vectors | 90% cheaper long-term storage |

The Tiered Strategy

For production at scale, AWS recommends combining both: S3 Vectors for long-term, infrequently queried vectors, and OpenSearch for hot, real-time search. When vectors need higher performance, you can export them from S3 Vectors into an OpenSearch collection. The ongoing tiering itself isn't automated yet, though - you manage it through API calls.

🔄 Environment-Specific Configuration

Use Terraform variables to swap vector stores per environment:

# environments/dev.tfvars
vector_store_type    = "S3_VECTORS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "FIXED_SIZE"
chunk_max_tokens     = 300
chunk_overlap_pct    = 10

# environments/prod.tfvars
vector_store_type    = "OPENSEARCH_SERVERLESS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "HIERARCHICAL"
chunk_max_tokens     = 1500  # parent
chunk_overlap_pct    = 20

Then use dynamic blocks in your Knowledge Base resource to switch between S3 Vectors and OpenSearch based on the variable. This gives you S3 Vectors in dev (near-zero cost) and OpenSearch Serverless in prod (full capabilities).
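A sketch of what that switch could look like, building on the resources shown earlier. The `vector_store_type` variable comes from the tfvars above, while `opensearch_collection_arn` is an assumed variable you'd define for prod; verify the block and attribute names against your provider version's docs:

```hcl
# rag/knowledge_base.tf - storage_configuration switched per environment
resource "aws_bedrockagent_knowledge_base" "this" {
  # ... name, role_arn, knowledge_base_configuration as shown earlier ...

  storage_configuration {
    type = var.vector_store_type

    dynamic "s3_vectors_configuration" {
      for_each = var.vector_store_type == "S3_VECTORS" ? [1] : []
      content {
        s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
        vector_index_arn     = aws_s3vectors_index.kb_index.arn
      }
    }

    dynamic "opensearch_serverless_configuration" {
      for_each = var.vector_store_type == "OPENSEARCH_SERVERLESS" ? [1] : []
      content {
        collection_arn    = var.opensearch_collection_arn  # assumed variable
        vector_index_name = "bedrock-knowledge-base-default-index"
        field_mapping {
          vector_field   = "bedrock-knowledge-base-default-vector"
          text_field     = "AMAZON_BEDROCK_TEXT_CHUNK"
          metadata_field = "AMAZON_BEDROCK_METADATA"
        }
      }
    }
  }
}
```

Exactly one of the two dynamic blocks renders per environment, so a single module serves both dev and prod.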

🧪 Querying: Same API, Different Store

The best part - your application code doesn't change. RetrieveAndGenerate and Retrieve work identically regardless of the backing vector store:

import boto3

# bedrock-agent-runtime hosts the RetrieveAndGenerate / Retrieve APIs
client = boto3.client("bedrock-agent-runtime")

response = client.retrieve_and_generate(
    input={"text": "What is our return policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-20250514"
        }
    }
)

print(response["output"]["text"])

Same code, same API, same citations. The only difference is what's behind the curtain.

⏭️ What's Next

This is Post 3 of the AWS RAG Pipeline with Terraform series.


Your dev Knowledge Base just went from $706/month to $8/month. S3 Vectors gives you the same Bedrock APIs with zero infrastructure and pay-per-use pricing. Use it in dev, use it for internal tools, and keep OpenSearch Serverless where you need real-time hybrid search. 💰

Found this helpful? Follow for the full RAG Pipeline with Terraform series! 💬

Top comments (2)

Varun S

Love the "When to Use What" comparison.

Most of the time when I hear requirements, architects want to use S3 because "S3 is cheap". Cheap doesn't mean it will meet the production requirements. The answer is always - it depends!

Suhas Mallesh

Thank you :)