Suhas Mallesh
Bedrock Knowledge Base with Terraform: Your First RAG Pipeline on AWS πŸ”

Bedrock Knowledge Bases handle chunking, embedding, and retrieval so you don't have to. But the infrastructure underneath - OpenSearch Serverless, S3, IAM policies - needs Terraform to be production-ready.

You have a Bedrock endpoint that can answer general questions. But ask it about your company's internal docs and it hallucinates confidently. That's the gap RAG fills - Retrieval-Augmented Generation grounds model responses in your actual data.

AWS Bedrock Knowledge Bases is a fully managed RAG service. You point it at an S3 bucket containing your documents, it chunks the content, generates embeddings, stores vectors in OpenSearch Serverless, and handles retrieval at query time. No custom embedding pipelines, no vector database management, no retrieval logic to write.

The catch: the console "quick create" abstracts away a dozen resources. In production, you need to own every one of them - the OpenSearch collection policies, the IAM roles, the S3 bucket, the knowledge base configuration. Terraform makes that explicit. 🎯

πŸ—οΈ Architecture Overview

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  S3 Bucket   │────>β”‚  Bedrock     │────>β”‚ OpenSearch         β”‚
β”‚  (Documents) β”‚     β”‚  Knowledge   β”‚     β”‚ Serverless         β”‚
β”‚              β”‚     β”‚  Base        β”‚     β”‚ (Vector Collection)β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                            β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚ RetrieveAnd   β”‚
                    β”‚ Generate API  β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Data flow: Documents in S3 are chunked and embedded during sync. Queries hit the RetrieveAndGenerate API, which embeds the question, searches OpenSearch for relevant chunks, and passes them as context to the foundation model.

πŸ“¦ Step 1: S3 Bucket for Documents

This bucket holds your source documents - PDFs, Markdown, HTML, Word, and plain text:

# rag/s3.tf

resource "aws_s3_bucket" "knowledge_base_docs" {
  bucket        = "${var.environment}-${var.kb_name}-documents"
  force_destroy = var.environment != "prod"

  tags = {
    Environment = var.environment
    Purpose     = "bedrock-knowledge-base-source"
  }
}

resource "aws_s3_bucket_versioning" "knowledge_base_docs" {
  bucket = aws_s3_bucket.knowledge_base_docs.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "knowledge_base_docs" {
  bucket = aws_s3_bucket.knowledge_base_docs.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
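Uploading documents is a plain `aws s3 cp` or boto3 `put_object`; the only subtlety is skipping files Bedrock can't ingest and setting a sensible content type. A small helper sketch (the extension list mirrors the formats mentioned above; treat it as an assumption, not the service's authoritative list):

```python
import mimetypes
from pathlib import Path

# Document formats listed above as ingestible by the knowledge base.
SUPPORTED_SUFFIXES = {".pdf", ".md", ".html", ".doc", ".docx", ".txt"}

def s3_upload_args(path):
    """Return put_object kwargs for a supported document, or None to skip it."""
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        return None
    content_type, _ = mimetypes.guess_type(p.name)
    return {
        "Key": p.name,
        "ContentType": content_type or "application/octet-stream",
    }

print(s3_upload_args("refund-policy.pdf"))
# {'Key': 'refund-policy.pdf', 'ContentType': 'application/pdf'}
print(s3_upload_args("logo.png"))  # None - not an ingestible document
```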

πŸ” Step 2: OpenSearch Serverless Collection

OpenSearch Serverless is the vector store. It requires three policies before the collection can be created - encryption, network, and data access:

# rag/opensearch.tf

data "aws_caller_identity" "current" {}

resource "aws_opensearchserverless_security_policy" "encryption" {
  name = "${var.environment}-${var.kb_name}-enc"
  type = "encryption"
  policy = jsonencode({
    Rules = [{
      Resource     = ["collection/${var.environment}-${var.kb_name}"]
      ResourceType = "collection"
    }]
    AWSOwnedKey = true
  })
}

resource "aws_opensearchserverless_security_policy" "network" {
  name = "${var.environment}-${var.kb_name}-net"
  type = "network"
  policy = jsonencode([{
    Rules = [
      {
        ResourceType = "collection"
        Resource     = ["collection/${var.environment}-${var.kb_name}"]
      },
      {
        ResourceType = "dashboard"
        Resource     = ["collection/${var.environment}-${var.kb_name}"]
      }
    ]
    AllowFromPublic = true
  }])
}

resource "aws_opensearchserverless_access_policy" "data" {
  name = "${var.environment}-${var.kb_name}-data"
  type = "data"
  policy = jsonencode([{
    Rules = [{
      ResourceType = "index"
      Resource     = ["index/${var.environment}-${var.kb_name}/*"]
      Permission = [
        "aoss:CreateIndex",
        "aoss:DeleteIndex",
        "aoss:DescribeIndex",
        "aoss:ReadDocument",
        "aoss:WriteDocument",
        "aoss:UpdateIndex"
      ]
    }]
    Principal = [
      aws_iam_role.knowledge_base.arn,
      data.aws_caller_identity.current.arn
    ]
  }])
}

resource "aws_opensearchserverless_collection" "kb" {
  name             = "${var.environment}-${var.kb_name}"
  type             = "VECTORSEARCH"
  standby_replicas = var.environment == "prod" ? "ENABLED" : "DISABLED"

  depends_on = [
    aws_opensearchserverless_security_policy.encryption,
    aws_opensearchserverless_security_policy.network,
    aws_opensearchserverless_access_policy.data
  ]
}

Critical: All three policies must exist before creating the collection. The depends_on block ensures correct ordering. In dev, disable standby replicas to reduce costs.
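One more constraint worth catching before `terraform apply`: collection names must be 3-32 characters of lowercase letters, digits, and hyphens, starting with a letter, so `${var.environment}-${var.kb_name}` can silently overflow or break the rule. A quick pre-flight check (a sketch; the regex reflects the documented naming rule as I understand it):

```python
import re

# AOSS collection names: 3-32 chars, lowercase letters/digits/hyphens,
# must start with a lowercase letter.
NAME_RE = re.compile(r"^[a-z][a-z0-9-]{2,31}$")

def valid_collection_name(environment, kb_name):
    return bool(NAME_RE.fullmatch(f"{environment}-{kb_name}"))

print(valid_collection_name("dev", "company-docs"))    # True
print(valid_collection_name("prod", "Company_Docs"))   # False - uppercase and underscore
```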

πŸ”‘ Step 3: IAM Role

The knowledge base needs permissions for Bedrock models (embedding), S3 (reading documents), and OpenSearch (reading/writing vectors):

# rag/iam.tf

resource "aws_iam_role" "knowledge_base" {
  name = "${var.environment}-${var.kb_name}-kb-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "bedrock.amazonaws.com" }
      Action    = "sts:AssumeRole"
      Condition = {
        StringEquals = { "aws:SourceAccount" = data.aws_caller_identity.current.account_id }
      }
    }]
  })
}

resource "aws_iam_role_policy" "kb_bedrock" {
  name = "bedrock-model-access"
  role = aws_iam_role.knowledge_base.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["bedrock:InvokeModel"]
      Resource = ["arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"]
    }]
  })
}

resource "aws_iam_role_policy" "kb_s3" {
  name = "s3-read-access"
  role = aws_iam_role.knowledge_base.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.knowledge_base_docs.arn,
        "${aws_s3_bucket.knowledge_base_docs.arn}/*"
      ]
    }]
  })
}

resource "aws_iam_role_policy" "kb_opensearch" {
  name = "opensearch-access"
  role = aws_iam_role.knowledge_base.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["aoss:APIAccessAll"]
      Resource = [aws_opensearchserverless_collection.kb.arn]
    }]
  })
}
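Note the double colon in the model ARN: foundation models are not account-scoped, so the account-ID segment is intentionally empty. A tiny helper makes that explicit (a sketch; the function name is mine, and `region`/`model_id` mirror the variables used above):

```python
def foundation_model_arn(region, model_id):
    # Foundation-model ARNs leave the account field empty: arn:aws:bedrock:REGION::...
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"

arn = foundation_model_arn("us-east-1", "amazon.titan-embed-text-v2:0")
print(arn)
# arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0
```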

🧠 Step 4: Knowledge Base and Data Source

Now wire everything together - the knowledge base links the embedding model to the vector store, and the data source points at S3. One gotcha: Bedrock expects the vector index named by vector_index_name to already exist in the collection. The AWS provider doesn't create it, so provision it via the community opensearch Terraform provider or a one-off script before the first apply:

# rag/knowledge_base.tf

resource "aws_bedrockagent_knowledge_base" "this" {
  name     = "${var.environment}-${var.kb_name}"
  role_arn = aws_iam_role.knowledge_base.arn

  knowledge_base_configuration {
    type = "VECTOR"
    vector_knowledge_base_configuration {
      embedding_model_arn = "arn:aws:bedrock:${var.region}::foundation-model/${var.embedding_model}"
    }
  }

  storage_configuration {
    type = "OPENSEARCH_SERVERLESS"
    opensearch_serverless_configuration {
      collection_arn    = aws_opensearchserverless_collection.kb.arn
      vector_index_name = var.vector_index_name

      field_mapping {
        vector_field   = "bedrock-knowledge-base-default-vector"
        text_field     = "AMAZON_BEDROCK_TEXT_CHUNK"
        metadata_field = "AMAZON_BEDROCK_METADATA"
      }
    }
  }

  depends_on = [
    aws_iam_role_policy.kb_bedrock,
    aws_iam_role_policy.kb_s3,
    aws_iam_role_policy.kb_opensearch
  ]
}

resource "aws_bedrockagent_data_source" "s3" {
  name                 = "${var.environment}-${var.kb_name}-s3-source"
  knowledge_base_id    = aws_bedrockagent_knowledge_base.this.id

  data_source_configuration {
    type = "S3"
    s3_configuration {
      bucket_arn = aws_s3_bucket.knowledge_base_docs.arn
    }
  }

  vector_ingestion_configuration {
    chunking_configuration {
      chunking_strategy = var.chunking_strategy

      dynamic "fixed_size_chunking_configuration" {
        for_each = var.chunking_strategy == "FIXED_SIZE" ? [1] : []
        content {
          max_tokens         = var.chunk_max_tokens
          overlap_percentage = var.chunk_overlap_percentage
        }
      }
    }
  }
}
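Terraform creates the data source but never syncs it - ingestion is a separate job you trigger after uploading documents. A hedged sketch using the `bedrock-agent` client's `start_ingestion_job` and `get_ingestion_job` calls (the KB and data-source IDs are placeholders; poll interval and status set are my assumptions):

```python
import time

# Statuses at which polling should stop (assumed terminal set).
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}

def is_terminal(status):
    return status in TERMINAL_STATUSES

def sync_data_source(kb_id, ds_id):
    """Start an ingestion job and poll until it reaches a terminal status."""
    import boto3  # imported lazily so the helpers above work without AWS creds
    client = boto3.client("bedrock-agent")
    job = client.start_ingestion_job(knowledgeBaseId=kb_id, dataSourceId=ds_id)
    job_id = job["ingestionJob"]["ingestionJobId"]
    while True:
        status = client.get_ingestion_job(
            knowledgeBaseId=kb_id, dataSourceId=ds_id, ingestionJobId=job_id
        )["ingestionJob"]["status"]
        if is_terminal(status):
            return status
        time.sleep(10)

# sync_data_source("YOUR_KB_ID", "YOUR_DS_ID")
```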

πŸ”§ Step 5: Variables

# rag/variables.tf

variable "environment" { type = string }
variable "region" { type = string }
variable "kb_name" { type = string }

variable "embedding_model" {
  type    = string
  default = "amazon.titan-embed-text-v2:0"
}

variable "vector_index_name" {
  type    = string
  default = "bedrock-knowledge-base-default-index"
}

variable "chunking_strategy" {
  type    = string
  default = "FIXED_SIZE"
}

variable "chunk_max_tokens" {
  type    = number
  default = 512
}

variable "chunk_overlap_percentage" {
  type    = number
  default = 20
}

Per-environment configs:

# environments/dev.tfvars
kb_name                  = "company-docs"
chunking_strategy        = "FIXED_SIZE"
chunk_max_tokens         = 300
chunk_overlap_percentage = 10

# environments/prod.tfvars
kb_name                  = "company-docs"
chunking_strategy        = "FIXED_SIZE"
chunk_max_tokens         = 512
chunk_overlap_percentage = 20
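The two chunking knobs trade retrieval precision against token cost. A rough back-of-envelope for how many chunks a document yields (an estimate only - Bedrock's exact chunker may differ; overlap is assumed to be a percentage of max_tokens):

```python
import math

def estimate_chunks(total_tokens, max_tokens, overlap_pct):
    """Rough chunk count: each new chunk advances by max_tokens minus the overlap."""
    if total_tokens <= max_tokens:
        return 1
    stride = max_tokens * (1 - overlap_pct / 100)
    return math.ceil((total_tokens - max_tokens) / stride) + 1

# A ~10k-token document under the prod settings (512 tokens, 20% overlap):
print(estimate_chunks(10_000, 512, 20))  # 25
# The dev settings (300 tokens, 10% overlap) yield more, smaller chunks:
print(estimate_chunks(10_000, 300, 10))  # 37
```

More chunks mean more embedding tokens at sync time and finer-grained retrieval at query time.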

πŸ” Step 6: Querying Your Knowledge Base

After syncing documents (triggered via console or CLI), query using the RetrieveAndGenerate API:

import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-20250514"
        }
    }
)

print(response["output"]["text"])
# Includes citations back to source documents

The response includes source citations, so users can verify answers against the original documents.
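The citations sit in the response alongside the generated text. A helper that pulls out the cited S3 URIs (a sketch; the nesting follows the bedrock-agent-runtime response shape as I understand it, and the sample dict below is fabricated for illustration):

```python
def citation_uris(response):
    """Collect the S3 URIs of every chunk the model cited."""
    uris = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri:
                uris.append(uri)
    return uris

# Works against the dict returned by retrieve_and_generate():
sample = {
    "output": {"text": "Refunds are processed within 14 days."},
    "citations": [{
        "retrievedReferences": [{
            "content": {"text": "Our refund policy states..."},
            "location": {"s3Location": {"uri": "s3://dev-company-docs-documents/refund-policy.pdf"}},
        }]
    }],
}
print(citation_uris(sample))
# ['s3://dev-company-docs-documents/refund-policy.pdf']
```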

πŸ’° Cost Considerations

| Component | Pricing Model | Dev Optimization |
| --- | --- | --- |
| OpenSearch Serverless | OCU-hours (min 2 OCUs for indexing, 2 for search) | Disable standby replicas |
| Titan Embeddings V2 | $0.00002/1K tokens | Smaller chunks = fewer tokens |
| S3 | Standard storage pricing | Minimal cost |
| Bedrock queries | Per-model pricing for generation | Use smaller models in dev |
OpenSearch Serverless is the biggest cost driver. At minimum, you're paying for 4 OCUs (2 indexing + 2 search) even at zero traffic. For dev/test, consider whether a simpler vector store like S3 Vectors (newer, cheaper option) fits your needs.
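To put a number on that floor (assuming the us-east-1 list price of roughly $0.24 per OCU-hour - verify against current AWS pricing before budgeting):

```python
OCU_HOURLY_USD = 0.24   # assumed us-east-1 list price; check current pricing
HOURS_PER_MONTH = 730

def monthly_floor(ocus):
    """Idle-cost floor: the OCUs you pay for even at zero traffic."""
    return ocus * OCU_HOURLY_USD * HOURS_PER_MONTH

print(f"${monthly_floor(4):.2f}/month")  # 2 indexing + 2 search OCUs
print(f"${monthly_floor(2):.2f}/month")  # dev, with standby replicas disabled
```

Disabling standby replicas roughly halves the minimum, which is exactly what the `standby_replicas` conditional in Step 2 does for non-prod environments.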

Your first RAG pipeline is deployed. Documents in S3, vectors in OpenSearch, retrieval via Bedrock - all managed by Terraform, all repeatable across environments. πŸ”

Found this helpful? Follow for the full RAG Pipeline with Terraform series! πŸ’¬
