Suhas Mallesh


Your AWS Account is AI-Ready: Deploy your first AI endpoint with Terraform in 10 minutes⚡

Amazon Bedrock gives you instant access to Claude, Llama, and Titan models with zero ML experience. Here's how to deploy your first AI endpoint with Terraform in 10 minutes.

Every AWS account has access to Claude, Llama, Titan, and other foundation models right now. No GPU provisioning. No Docker containers. No ML degree required.

Amazon Bedrock is a fully managed service that lets you call AI models via API — just like calling DynamoDB or S3. You send a prompt, you get a response. That's it.

The problem? Most teams still set this up manually through the console. Let's fix that with Terraform and build something production-ready from day one. 🏗️

🤔 Bedrock vs SageMaker — Which One Do You Need?

Before writing any code, let's clear this up:

|                         | Bedrock                             | SageMaker                                    |
| ----------------------- | ----------------------------------- | -------------------------------------------- |
| What it is              | API access to pre-trained models    | Platform to build/train/deploy custom models |
| Infrastructure          | Serverless (zero to manage)         | You manage instances, endpoints, clusters    |
| Use when                | You want to USE AI models           | You want to BUILD AI models                  |
| Models                  | Claude, Llama, Titan, Mistral, etc. | Your own custom models                       |
| ML expertise needed     | None                                | Significant                                  |
| Time to first API call  | 10 minutes                          | Days to weeks                                |

Rule of thumb: Start with Bedrock. Move to SageMaker only when you need custom model training. 90% of enterprise AI use cases (chatbots, summarization, RAG, code generation) work great with Bedrock. 🎯

🏗️ Step 1: Enable Model Access with Terraform

Before calling any model, AWS requires you to explicitly request access. This is a one-time setup per account/region.

# bedrock/main.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.60.0"  # Bedrock support
    }
  }
}

provider "aws" {
  region = var.region
}

variable "region" {
  type    = string
  default = "us-east-1"  # Best Bedrock model availability
}

⚠️ Note: Model access activation currently requires a one-time console click or API call. Terraform manages the infrastructure around Bedrock, but the initial model enablement is done via the AWS console under Amazon Bedrock → Model access → Manage model access. This is intentional — AWS wants humans to explicitly approve which models an account can use.
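Once access is granted, it's worth verifying that every model your Terraform references is actually enabled in the account. A minimal sketch — `missing_models` is a hypothetical helper, and the sample dict only mimics the shape of boto3's `bedrock` `list_foundation_models()` response; in practice you'd pass the real API response:

```python
# Sketch: confirm the model IDs your Terraform policy references are visible
# in this account/region. The sample response mimics the shape of boto3's
# bedrock.list_foundation_models() output.
def missing_models(response, required_ids):
    """Return the required model IDs not present in the account's model list."""
    available = {m["modelId"] for m in response["modelSummaries"]}
    return sorted(set(required_ids) - available)

sample_response = {
    "modelSummaries": [
        {"modelId": "anthropic.claude-3-haiku-20240307-v1:0", "providerName": "Anthropic"},
        {"modelId": "amazon.titan-embed-text-v2:0", "providerName": "Amazon"},
    ]
}

required = [
    "anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-5-sonnet-20241022-v2:0",
]

# Anything printed here still needs enabling under Model access in the console.
print(missing_models(sample_response, required))
```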

🔐 Step 2: IAM Role for Bedrock Access

This is where Terraform shines. Lock down exactly which models your application can call:

# bedrock/iam.tf

# Role for your application to invoke Bedrock
resource "aws_iam_role" "bedrock_invoker" {
  name = "${var.environment}-bedrock-invoker"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "lambda.amazonaws.com"  # Or ecs-tasks, ec2, etc.
      }
    }]
  })

  tags = {
    Environment = var.environment
    Purpose     = "bedrock-ai"
    ManagedBy   = "terraform"
  }
}

# Granular Bedrock permissions
resource "aws_iam_role_policy" "bedrock_invoke" {
  name = "bedrock-invoke-policy"
  role = aws_iam_role.bedrock_invoker.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowModelInvocation"
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ]
        # Lock down to specific models only 👇
        Resource = [
          "arn:aws:bedrock:${var.region}::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0",
          "arn:aws:bedrock:${var.region}::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
          "arn:aws:bedrock:${var.region}::foundation-model/amazon.titan-embed-text-v2:0"
        ]
      },
      {
        Sid    = "AllowModelListing"
        Effect = "Allow"
        Action = [
          "bedrock:ListFoundationModels",
          "bedrock:GetFoundationModel"
        ]
        Resource = "*"
      }
    ]
  })
}

# CloudWatch Logs for Lambda
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.bedrock_invoker.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

variable "environment" {
  type    = string
  default = "dev"
}

Why this matters: Most tutorials grant bedrock:* on Resource: "*". That means any developer can call any model, including the expensive ones. Locking the policy down to specific model ARNs is how enterprises do it. 🔒
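If you later generate that policy from a list of model IDs (as the module in Step 5 does), a tiny helper keeps the ARN format in one place. A sketch assuming the standard foundation-model ARN layout used above — note the empty account field (`::`), since foundation-model ARNs are not account-scoped:

```python
# Sketch: build the foundation-model ARNs the IAM policy expects from a
# list of model IDs, mirroring the Resource entries above.
def model_arns(region, model_ids):
    return [f"arn:aws:bedrock:{region}::foundation-model/{m}" for m in model_ids]

arns = model_arns("us-east-1", ["anthropic.claude-3-haiku-20240307-v1:0"])
print(arns[0])
# arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0
```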

⚡ Step 3: Your First AI-Powered Lambda

Let's build a Lambda function that calls Claude and returns a response:

# bedrock/lambda.tf

resource "aws_lambda_function" "ai_endpoint" {
  filename         = data.archive_file.ai_lambda.output_path
  function_name    = "${var.environment}-bedrock-ai"
  role             = aws_iam_role.bedrock_invoker.arn
  handler          = "index.handler"
  runtime          = "python3.12"
  timeout          = 30
  memory_size      = 256
  source_code_hash = data.archive_file.ai_lambda.output_base64sha256

  environment {
    variables = {
      MODEL_ID = "anthropic.claude-3-5-sonnet-20241022-v2:0"
      # AWS_REGION is reserved by Lambda and set automatically; the SDK picks it up.
    }
  }

  tags = {
    Environment = var.environment
    Purpose     = "bedrock-ai"
    ManagedBy   = "terraform"
  }
}

data "archive_file" "ai_lambda" {
  type        = "zip"
  output_path = "${path.module}/ai_lambda.zip"

  source {
    content  = <<-PYTHON
import boto3
import json
import os

bedrock = boto3.client('bedrock-runtime')

def handler(event, context):
    """
    Simple Bedrock invocation endpoint.

    Event format:
    {
      "prompt": "Explain what Kubernetes is in 2 sentences.",
      "max_tokens": 500,
      "temperature": 0.7
    }
    """
    # Function URL requests wrap the JSON payload in event['body'];
    # direct Lambda invokes pass the fields at the top level.
    if isinstance(event.get('body'), str):
        event = json.loads(event['body'])

    prompt = event.get('prompt', 'Say hello!')
    max_tokens = event.get('max_tokens', 500)
    temperature = event.get('temperature', 0.7)
    model_id = os.environ.get('MODEL_ID')

    # Bedrock Messages API format
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ]
    })

    try:
        response = bedrock.invoke_model(
            modelId=model_id,
            contentType='application/json',
            accept='application/json',
            body=body
        )

        result = json.loads(response['body'].read())

        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            # Function URLs expect 'body' to be a string, not a dict
            'body': json.dumps({
                'response': result['content'][0]['text'],
                'model': model_id,
                'input_tokens': result['usage']['input_tokens'],
                'output_tokens': result['usage']['output_tokens']
            })
        }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
    PYTHON
    filename = "index.py"
  }
}

# Function URL for quick testing (no API Gateway needed)
resource "aws_lambda_function_url" "ai_endpoint" {
  function_name      = aws_lambda_function.ai_endpoint.function_name
  authorization_type = var.environment == "prod" ? "AWS_IAM" : "NONE"
}

output "ai_endpoint_url" {
  value       = aws_lambda_function_url.ai_endpoint.function_url
  description = "URL to invoke your AI endpoint"
}

output "lambda_function_name" {
  value = aws_lambda_function.ai_endpoint.function_name
}
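Before deploying, you can sanity-check the request body the handler builds without touching AWS at all. A local sketch of the same Messages API payload — `build_body` is a hypothetical helper, not part of the Lambda above:

```python
import json

# Local sketch of the request body the Lambda handler sends to Bedrock's
# Messages API; building it outside Lambda makes the shape unit-testable.
def build_body(prompt, max_tokens=500, temperature=0.7):
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })

body = json.loads(build_body("What is Terraform?", max_tokens=200))
print(body["max_tokens"], body["messages"][0]["role"])
# 200 user
```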

🧪 Step 4: Test It

After terraform apply, test your AI endpoint:

# Via Lambda Function URL
curl -X POST $(terraform output -raw ai_endpoint_url) \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain what Kubernetes is in 2 sentences.",
    "max_tokens": 200
  }'

# Or via AWS CLI
aws lambda invoke \
  --function-name dev-bedrock-ai \
  --payload '{"prompt": "What is Terraform?", "max_tokens": 200}' \
  --cli-binary-format raw-in-base64-out \
  response.json

cat response.json

Response (the Function URL returns the JSON body directly):

{
  "response": "Kubernetes is an open-source container orchestration platform...",
  "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
  "input_tokens": 14,
  "output_tokens": 52
}

You just called Claude from your own AWS infrastructure. No API keys to manage. No third-party billing. Just IAM roles and Bedrock. ✅

🏢 Step 5: Environment-Aware Module

For production, wrap everything in a reusable module:

# modules/bedrock-endpoint/variables.tf

variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be: dev, staging, or prod."
  }
}

variable "allowed_models" {
  type        = list(string)
  description = "List of Bedrock model IDs this environment can access"
  default     = ["anthropic.claude-3-haiku-20240307-v1:0"]  # Cheapest by default
}

variable "default_model" {
  type        = string
  description = "Default model for the Lambda endpoint"
  default     = "anthropic.claude-3-haiku-20240307-v1:0"
}

variable "region" {
  type    = string
  default = "us-east-1"
}

Usage per environment:

# Dev: Cheapest model only
module "bedrock_dev" {
  source      = "./modules/bedrock-endpoint"
  environment = "dev"
  allowed_models = [
    "anthropic.claude-3-haiku-20240307-v1:0"
  ]
  default_model = "anthropic.claude-3-haiku-20240307-v1:0"
}

# Production: Multiple models, stricter access
module "bedrock_prod" {
  source      = "./modules/bedrock-endpoint"
  environment = "prod"
  allowed_models = [
    "anthropic.claude-3-5-sonnet-20241022-v2:0",
    "anthropic.claude-3-haiku-20240307-v1:0",
    "amazon.titan-embed-text-v2:0"
  ]
  default_model = "anthropic.claude-3-5-sonnet-20241022-v2:0"
}

Why this pattern matters: Dev gets the cheapest model (Haiku), prod gets Sonnet. Nobody accidentally racks up a bill testing with the most expensive model. IAM enforces it — not developer discipline. 🛡️
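One gap in the module as sketched: nothing enforces that default_model is actually in allowed_models, so a typo could deploy a Lambda that can't invoke its own default. The invariant, expressed in Python for illustration (`validate_module` is a hypothetical check, not part of the module):

```python
# Sketch of the invariant the module should enforce: the default model must
# be one of the allowed models, or the endpoint fails at invoke time.
def validate_module(allowed_models, default_model):
    if default_model not in allowed_models:
        raise ValueError(f"default_model {default_model!r} not in allowed_models")
    return True

print(validate_module(
    ["anthropic.claude-3-haiku-20240307-v1:0"],
    "anthropic.claude-3-haiku-20240307-v1:0",
))
# True
```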

💡 Bedrock Model Quick Reference

| Model               | Best For                  | Input Cost/1K tokens | Speed   |
| ------------------- | ------------------------- | -------------------- | ------- |
| Claude 3.5 Sonnet   | Complex reasoning, coding | $0.003               | Fast    |
| Claude 3 Haiku      | Simple tasks, high volume | $0.00025             | Fastest |
| Llama 3.1 70B       | Open-source alternative   | $0.00099             | Fast    |
| Titan Text Express  | Basic text generation     | $0.0002              | Fast    |
| Titan Embeddings v2 | Vector embeddings for RAG | $0.00002             | Fastest |

💰 Pro tip: Use Haiku or Titan for dev/testing, Sonnet for production. The cost difference is 10x+.
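To put numbers on that tip, a quick back-of-the-envelope comparison using the input prices from the table (output tokens, which cost more, are left out for simplicity):

```python
# Rough input-cost comparison using the table's per-1K-token input prices.
INPUT_PRICE_PER_1K = {
    "claude-3-5-sonnet": 0.003,
    "claude-3-haiku":    0.00025,
}

def input_cost(model, tokens):
    """Dollar cost of sending `tokens` input tokens to `model`."""
    return tokens / 1000 * INPUT_PRICE_PER_1K[model]

tokens = 1_000_000  # say, a month of dev testing
sonnet = input_cost("claude-3-5-sonnet", tokens)
haiku = input_cost("claude-3-haiku", tokens)
print(f"Sonnet: ${sonnet:.2f}  Haiku: ${haiku:.2f}  ratio: {sonnet / haiku:.0f}x")
# Sonnet: $3.00  Haiku: $0.25  ratio: 12x
```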

🎯 What You Just Built

┌─────────────────────────────────────────┐
│  Your Application / curl / API client   │
└──────────────┬──────────────────────────┘
               │
               ▼
┌──────────────────────────┐
│  Lambda Function URL     │
│  (or API Gateway later)  │
└──────────────┬───────────┘
               │
               ▼
┌──────────────────────────┐
│  Lambda Function         │
│  (Python 3.12, 256MB)    │
│  Role: bedrock-invoker   │
└──────────────┬───────────┘
               │ IAM-authenticated
               ▼
┌──────────────────────────┐
│  Amazon Bedrock          │
│  Claude / Llama / Titan  │
│  (Fully managed, no GPU) │
└──────────────────────────┘

All deployed with Terraform. All version-controlled. All reproducible across environments. 🚀

⏭️ What's Next

This is Post 1 of the AWS AI Infrastructure with Terraform series. Coming up:

Post 2: Bedrock Guardrails — Stop your AI from leaking PII or going off-topic
Post 3: Invocation Logging — Track every AI call for compliance and debugging
Post 4: RAG Knowledge Base — Connect your company docs to AI with Bedrock + OpenSearch


You just deployed an AI endpoint in your AWS account with Terraform. No GPUs, no containers, no ML degree. That's the power of Bedrock. 🧠

Found this helpful? Follow for the full AWS AI Infrastructure with Terraform series! 💬
