OpenSearch Serverless costs a minimum of ~$700/month even at zero traffic. S3 Vectors cuts vector storage costs by up to 90% with zero infrastructure to manage. Here's how to wire it into Bedrock Knowledge Bases with Terraform.
In RAG Post 1, we deployed a Bedrock Knowledge Base with OpenSearch Serverless as the vector store. It works, but there's a painful cost problem: OpenSearch Serverless has a minimum of 4 OCUs (2 indexing + 2 search) that run 24/7, costing roughly $700/month even with zero queries.
For dev environments, internal tools, or any RAG pipeline that isn't serving real-time, user-facing search, that's a hard number to justify. Amazon S3 Vectors changes the equation: it's a new S3 bucket type with native vector storage and similarity search built in - no infrastructure to provision, no minimum costs, and up to 90% cheaper than conventional vector databases. It's now generally available, with support for up to 2 billion vectors per index. This post covers how to set it up with Terraform and when to use it. 🎯
## 💸 The Cost Problem with OpenSearch Serverless
Let's look at what a dev/test Knowledge Base actually costs with OpenSearch Serverless:
| Component | Monthly Cost (us-east-1) |
|---|---|
| OpenSearch Serverless (4 OCU minimum) | ~$700 |
| S3 data source bucket | ~$1 |
| Embedding model (Titan V2, light usage) | ~$5 |
| Total | ~$706 |
Even with standby replicas disabled (our Post 1 dev config), you're paying for compute that sits idle most of the time. For a team running four Knowledge Bases in dev, that's $2,800/month in vector store costs alone.
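The ~$700 figure is simple OCU arithmetic. Here's a quick sanity check, assuming the us-east-1 list price of $0.24 per OCU-hour (verify against current AWS pricing):

```python
# Back-of-envelope: OpenSearch Serverless minimum-footprint cost.
# Assumes $0.24/OCU-hour (us-east-1 list price at time of writing).
OCU_PRICE_PER_HOUR = 0.24
HOURS_PER_MONTH = 730  # average hours in a month

min_ocus = 4  # 2 indexing + 2 search, running 24/7
monthly_cost = min_ocus * OCU_PRICE_PER_HOUR * HOURS_PER_MONTH
print(f"${monthly_cost:,.0f}/month at zero traffic")
```

That floor exists whether you run one query or one million.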
## 🪣 S3 Vectors: What It Is
S3 Vectors is a purpose-built S3 bucket type for storing and querying vector embeddings. Inside a vector bucket, you create vector indexes that hold your embeddings along with metadata. The key differences from a regular S3 bucket:
- Native similarity search - query vectors directly with cosine or Euclidean distance
- Zero infrastructure - no clusters, OCUs, or provisioning
- Pay per use - storage at $0.06/GB/month, queries at $2.50 per million, PUTs at $0.20/GB
- Scale - up to 2 billion vectors per index, 10,000 indexes per bucket
- Sub-second queries - 100ms for frequent queries, up to 800ms for infrequent
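Those pay-per-use prices make a back-of-envelope estimate easy. A sketch for a hypothetical workload of 10K documents (~50K chunks at 1,024 dimensions) and 100K queries/month, using the list prices above - the assumptions here (chunks per doc, metadata overhead ignored) are illustrative, and real bills also depend on per-query data processed:

```python
# Rough S3 Vectors cost estimate -- illustrative assumptions, not a quote.
STORAGE_PER_GB_MONTH = 0.06
QUERY_PER_MILLION = 2.50
PUT_PER_GB = 0.20

vectors = 10_000 * 5           # assume ~5 chunks per document
bytes_per_vector = 1024 * 4    # 1,024-dim float32 embedding
storage_gb = vectors * bytes_per_vector / 1e9

storage_cost = storage_gb * STORAGE_PER_GB_MONTH   # recurring
query_cost = (100_000 / 1_000_000) * QUERY_PER_MILLION
put_cost = storage_gb * PUT_PER_GB                 # one-time ingestion

monthly = storage_cost + query_cost
print(f"{storage_gb:.2f} GB stored -> ${monthly:.2f}/month + ${put_cost:.2f} one-time")
```

Even with generous headroom for metadata and extra query volume, this lands orders of magnitude below the OCU floor.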
### Cost Comparison: Same RAG Pipeline
| Component | OpenSearch Serverless | S3 Vectors |
|---|---|---|
| Vector store (10K docs, light queries) | ~$700/month | ~$2/month |
| S3 data source | ~$1 | ~$1 |
| Embedding model | ~$5 | ~$5 |
| Total | ~$706 | ~$8 |
That's a 98% cost reduction for a typical dev/test workload.
## 🔧 Terraform Setup
S3 Vectors support in the Terraform AWS provider landed in v6.27.0. Here's the full setup:
### Vector Bucket and Index
```hcl
# rag/s3_vectors.tf
resource "aws_s3vectors_bucket" "kb_vectors" {
  bucket_name = "${var.environment}-${var.project}-kb-vectors"

  encryption_configuration {
    sse_algorithm = "aws:kms"
    kms_key_arn   = var.kms_key_arn
  }

  tags = var.tags
}

resource "aws_s3vectors_index" "kb_index" {
  bucket_name     = aws_s3vectors_bucket.kb_vectors.bucket_name
  index_name      = "bedrock-knowledge-base-default-index"
  dimension       = var.embedding_dimensions
  distance_metric = "cosine"

  tags = var.tags
}
```
### IAM Role for Bedrock
The Knowledge Base service role needs permissions for both S3 Vectors APIs and the data source bucket:
```hcl
# rag/iam.tf
data "aws_caller_identity" "current" {}

resource "aws_iam_role" "kb_role" {
  name = "${var.environment}-${var.project}-kb-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "bedrock.amazonaws.com" }
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = data.aws_caller_identity.current.account_id
        }
      }
    }]
  })
}

# S3 Vectors permissions (note: the s3vectors API actions are plural)
resource "aws_iam_role_policy" "kb_s3vectors" {
  name = "s3-vectors-access"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3vectors:GetIndex",
        "s3vectors:PutVectors",
        "s3vectors:GetVectors",
        "s3vectors:DeleteVectors",
        "s3vectors:QueryVectors",
        "s3vectors:ListVectors"
      ]
      Resource = [
        aws_s3vectors_bucket.kb_vectors.arn,
        "${aws_s3vectors_bucket.kb_vectors.arn}/*"
      ]
    }]
  })
}

# Embedding model access
resource "aws_iam_role_policy" "kb_bedrock" {
  name = "bedrock-invoke"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "bedrock:InvokeModel"
      Resource = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }]
  })
}

# S3 data source read access
resource "aws_iam_role_policy" "kb_s3_read" {
  name = "s3-data-source-read"
  role = aws_iam_role.kb_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["s3:GetObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.kb_docs.arn,
        "${aws_s3_bucket.kb_docs.arn}/*"
      ]
    }]
  })
}
```
### Knowledge Base with S3 Vectors
```hcl
# rag/knowledge_base.tf
resource "aws_bedrockagent_knowledge_base" "this" {
  name     = "${var.environment}-${var.project}-kb"
  role_arn = aws_iam_role.kb_role.arn

  knowledge_base_configuration {
    type = "VECTOR"
    vector_knowledge_base_configuration {
      embedding_model_arn = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }
  }

  storage_configuration {
    type = "S3_VECTORS"
    s3_vectors_configuration {
      s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
      vector_index_arn     = aws_s3vectors_index.kb_index.arn
    }
  }
}

resource "aws_bedrockagent_data_source" "s3" {
  name              = "${var.environment}-${var.project}-s3-source"
  knowledge_base_id = aws_bedrockagent_knowledge_base.this.id

  data_source_configuration {
    type = "S3"
    s3_configuration {
      bucket_arn = aws_s3_bucket.kb_docs.arn
    }
  }

  vector_ingestion_configuration {
    chunking_configuration {
      chunking_strategy = var.chunking_strategy

      # Only emit the fixed-size block when that strategy is selected
      dynamic "fixed_size_chunking_configuration" {
        for_each = var.chunking_strategy == "FIXED_SIZE" ? [1] : []
        content {
          max_tokens         = var.chunk_max_tokens
          overlap_percentage = var.chunk_overlap_pct
        }
      }
    }
  }
}
```
## ⚠️ Limitations: What S3 Vectors Can't Do
Before you migrate everything, know the trade-offs:
| Feature | S3 Vectors | OpenSearch Serverless |
|---|---|---|
| Hybrid search | ❌ Semantic only | ✅ Vector + keyword |
| Query latency | 100-800ms | 10-100ms |
| Throughput | Hundreds of QPS per bucket | Thousands of QPS |
| Hierarchical chunking | ⚠️ Limited (metadata size constraints) | ✅ Full support |
| Metadata filtering | ⚠️ Post-search filtering | ✅ Pre-search filtering |
| Minimum cost | $0 (pay per use) | ~$700/month (4 OCUs) |
| Max vectors/index | 2 billion | Varies by OCU count |
The big one: No hybrid search. S3 Vectors supports semantic (vector) search only. If your users search for product codes, policy numbers, or exact identifiers, you need keyword matching - and that means OpenSearch. For natural language questions against internal docs, semantic-only search works well.
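If you're on S3 Vectors and still occasionally get identifier-style queries, one pragmatic workaround (a hypothetical sketch, not a Bedrock feature) is to detect them up front and route them to a different path, such as a metadata filter or an exact-match lookup:

```python
import re

# Hypothetical router: identifier-looking queries (SKUs, policy numbers)
# go to an exact-match path; everything else goes to semantic search.
IDENTIFIER_PATTERN = re.compile(r"\b(?:[A-Z]{2,}-\d+|\d{6,})\b")

def route_query(query: str) -> str:
    if IDENTIFIER_PATTERN.search(query):
        return "keyword"   # exact-match path (metadata filter / OpenSearch)
    return "semantic"      # S3 Vectors similarity search

print(route_query("What is policy POL-48213 about?"))
print(route_query("How do refunds work?"))
```

It's a band-aid, not a substitute for true hybrid search, but it covers the common "paste an ID into the chatbot" case.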
## 📊 When to Use What
| Scenario | Recommended Store | Why |
|---|---|---|
| Dev/test environments | S3 Vectors | Zero idle cost, adequate performance |
| Internal doc search (< 100 QPS) | S3 Vectors | Cost-effective, latency acceptable |
| Customer-facing chatbot | OpenSearch Serverless | Low latency, hybrid search |
| Mixed content with codes/IDs | OpenSearch Serverless | Needs keyword matching |
| Staging (prod parity needed) | OpenSearch Serverless | Match prod behavior |
| Archive/long-tail vectors | S3 Vectors | 90% cheaper long-term storage |
### The Tiered Strategy
For production at scale, AWS recommends combining both: S3 Vectors for long-term, infrequently queried vectors, and OpenSearch for hot, real-time search. S3 Vectors can export an index to OpenSearch when those vectors need higher performance, but the tiering itself isn't automatic - you decide when to promote vectors and trigger the export through the console or API.
## 🌍 Environment-Specific Configuration
Use Terraform variables to swap vector stores per environment:
```hcl
# environments/dev.tfvars
vector_store_type    = "S3_VECTORS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "FIXED_SIZE"
chunk_max_tokens     = 300
chunk_overlap_pct    = 10
```

```hcl
# environments/prod.tfvars
vector_store_type    = "OPENSEARCH_SERVERLESS"
embedding_model      = "amazon.titan-embed-text-v2:0"
embedding_dimensions = 1024
chunking_strategy    = "HIERARCHICAL"
chunk_max_tokens     = 1500 # parent chunk size
chunk_overlap_pct    = 20
```
Then use dynamic blocks in your Knowledge Base resource to switch between S3 Vectors and OpenSearch based on the variable. This gives you S3 Vectors in dev (near-zero cost) and OpenSearch Serverless in prod (full capabilities).
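One way to sketch that switch - the S3 Vectors attribute names follow the Knowledge Base block shown earlier, the OpenSearch block and `var.opensearch_collection_arn` are assumptions to verify against your provider version:

```hcl
# Inside aws_bedrockagent_knowledge_base: pick the store per environment.
storage_configuration {
  type = var.vector_store_type # "S3_VECTORS" or "OPENSEARCH_SERVERLESS"

  dynamic "s3_vectors_configuration" {
    for_each = var.vector_store_type == "S3_VECTORS" ? [1] : []
    content {
      s3_vector_bucket_arn = aws_s3vectors_bucket.kb_vectors.arn
      vector_index_arn     = aws_s3vectors_index.kb_index.arn
    }
  }

  dynamic "opensearch_serverless_configuration" {
    for_each = var.vector_store_type == "OPENSEARCH_SERVERLESS" ? [1] : []
    content {
      collection_arn    = var.opensearch_collection_arn # assumed variable
      vector_index_name = "bedrock-knowledge-base-default-index"
      field_mapping {
        vector_field   = "bedrock-knowledge-base-default-vector"
        text_field     = "AMAZON_BEDROCK_TEXT_CHUNK"
        metadata_field = "AMAZON_BEDROCK_METADATA"
      }
    }
  }
}
```

The `for_each = [...] ? [1] : []` trick emits exactly one of the two configuration blocks, so each environment plans only the store it actually uses.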
## 🧪 Querying: Same API, Different Store
The best part - your application code doesn't change. RetrieveAndGenerate and Retrieve work identically regardless of the backing vector store:
```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is our return policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
        },
    },
)
print(response["output"]["text"])
```
Same code, same API, same citations. The only difference is what's behind the curtain.
## ➡️ What's Next
This is Post 3 of the AWS RAG Pipeline with Terraform series.
- Post 1: Bedrock Knowledge Base - Basic Setup 🚀
- Post 2: Advanced RAG - Chunking, Search, Reranking 🧠
- Post 3: S3 Vectors - Cheapest Vector Store (you are here) 💰
Your dev Knowledge Base just went from $706/month to $8/month. S3 Vectors gives you the same Bedrock APIs with zero infrastructure and pay-per-use pricing. Use it in dev, use it for internal tools, and keep OpenSearch Serverless where you need real-time hybrid search. 💰
Found this helpful? Follow for the full RAG Pipeline with Terraform series! 💬