Suhas Mallesh

Posted on Mar 13

Deploy Your First Bedrock Agent with Terraform: Model-Agnostic and Future-Proof 🤖

#aws #terraform #devops #ai

Bedrock Agents add reasoning, planning, and tool use on top of foundation models. Here's how to deploy one with Terraform where upgrading to the next Claude or Nova model is a single variable change.

In Series 1, we deployed a Bedrock endpoint and called a foundation model directly. Send a prompt, get a response. Stateless, single-turn, no tools.

Bedrock Agents are different. An agent wraps a foundation model with an orchestration layer that can reason through multi-step tasks, maintain conversation state across turns, decide when to use tools, and loop until it reaches a satisfactory answer. The model is the brain, but the agent is the one making decisions about what to do next.

The problem: AWS ships new models constantly. Claude 3.5 Sonnet was followed by Claude Sonnet 4, and the Nova family keeps gaining tiers such as Nova Premier. If you hardcode anthropic.claude-3-5-sonnet-20241022-v2:0 throughout your Terraform configs, upgrading means a search-and-replace across files. This post builds an agent where swapping to the next model is a single variable change in one .tfvars file. 🎯

🧠 Endpoint vs Agent: What's Different

Capability	Direct Endpoint (Series 1)	Bedrock Agent
Interaction	Single request/response	Multi-turn conversation
State	Stateless	Session memory (configurable TTL)
Tool use	Manual implementation	Built-in action groups
Reasoning	Whatever the model does	ReAct-style orchestration loop
Versioning	Model version only	Agent alias + version snapshots
Guardrails	Separate API call	Attach directly to agent

The agent adds a layer of autonomy. Instead of you deciding what API to call, the agent decides based on the user's request and the tools available.

🔧 Model Configuration: The Future-Proof Layer

The key to model agnosticism is separating model selection from agent logic. Define models as variables with sensible defaults, and override per environment:

# variables.tf

variable "agent_model" {
  description = "Foundation model ID for the agent. Change this to upgrade models."
  type = object({
    id      = string  # Bedrock model ID
    display = string  # Human-readable name for tags/logs
  })
  default = {
    id      = "anthropic.claude-sonnet-4-20250514-v1:0"
    display = "Claude Sonnet 4"
  }
}

variable "agent_name" {
  description = "Name of the Bedrock agent"
  type        = string
  default     = "assistant"
}

variable "agent_instruction" {
  description = "System instruction for the agent. Defines its behavior and persona."
  type        = string
}

variable "idle_session_ttl" {
  description = "How long agent sessions stay open (seconds). Max 3600."
  type        = number
  default     = 600
}

# The resources below also reference var.environment and var.region.

variable "environment" {
  description = "Deployment environment name (e.g. dev, prod). Used in resource names and tags."
  type        = string
}

variable "region" {
  description = "AWS region the agent runs in. Used to build the model ARN in the IAM policy."
  type        = string
}

When the next model drops, you change one block in your .tfvars:

# environments/dev.tfvars
agent_model = {
  id      = "anthropic.claude-sonnet-4-20250514-v1:0"
  display = "Claude Sonnet 4"
}

# environments/prod.tfvars
agent_model = {
  id      = "anthropic.claude-sonnet-4-20250514-v1:0"
  display = "Claude Sonnet 4"
}

# When a new model launches, update ONE place:
# agent_model = {
#   id      = "anthropic.claude-next-model-v1:0"
#   display = "Claude Next"
# }

Run terraform apply and your agent uses the new model. No code changes, no redeployment of application logic.

🔧 Terraform: Deploy the Agent

Data Source for Model Validation

# agent/main.tf

data "aws_bedrock_foundation_model" "agent" {
  model_id = var.agent_model.id
}

This checks that the model ID exists and is offered in your region when Terraform reads the data source during plan. If you mistype the model ID or the model isn't available there, terraform plan fails early instead of during apply.
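The same check can also run outside Terraform, for example as a CI step before plan. This is a minimal sketch, assuming boto3's `bedrock` control-plane client and its `get_foundation_model` call; the helper name `model_status` is hypothetical, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
def model_status(bedrock_client, model_id):
    """Return the lifecycle status Bedrock reports for a model ID.

    bedrock_client is assumed to be boto3.client("bedrock") created for the
    target region. An unknown or unavailable ID surfaces as an exception
    raised by the call, not as a return value.
    """
    response = bedrock_client.get_foundation_model(modelIdentifier=model_id)
    details = response["modelDetails"]
    # modelLifecycle.status distinguishes current models from legacy ones.
    return details.get("modelLifecycle", {}).get("status", "UNKNOWN")


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   status = model_status(boto3.client("bedrock"),
#                         "anthropic.claude-sonnet-4-20250514-v1:0")
```

A status other than ACTIVE is a hint that the next upgrade should be scheduled before the model is retired.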

IAM Role

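# agent/iam.tf

# Account ID referenced by the trust policy condition below
data "aws_caller_identity" "current" {}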

resource "aws_iam_role" "agent" {
  name = "${var.environment}-${var.agent_name}-agent-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "bedrock.amazonaws.com" }
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = data.aws_caller_identity.current.account_id
        }
      }
    }]
  })
}

resource "aws_iam_role_policy" "agent_invoke" {
  name = "bedrock-invoke"
  role = aws_iam_role.agent.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "bedrock:InvokeModel"
      Resource = "arn:aws:bedrock:${var.region}::foundation-model/${var.agent_model.id}"
    }]
  })
}

The IAM policy references var.agent_model.id, so when you swap models the permissions update automatically. No manual IAM changes.
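To make the effect of a model swap concrete, here is a small Python sketch that builds the same policy document the jsonencode call renders. The function name `invoke_policy` and the region `us-east-1` are assumptions for illustration; only the ARN format and the action come from the Terraform above.

```python
import json


def invoke_policy(region, model_id):
    """Build the policy document that aws_iam_role_policy.agent_invoke renders."""
    # Foundation-model ARNs have an empty account field, hence the "::".
    arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": arn}
        ],
    }


# Assumed region; the two model IDs are the ones mentioned in this post.
old = invoke_policy("us-east-1", "anthropic.claude-3-5-sonnet-20241022-v2:0")
new = invoke_policy("us-east-1", "anthropic.claude-sonnet-4-20250514-v1:0")
print(json.dumps(new, indent=2))
```

Only the Resource line differs between `old` and `new`, which is the diff terraform plan shows for the policy when the variable changes.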

The Agent Resource

# agent/agent.tf

resource "aws_bedrockagent_agent" "this" {
  agent_name                  = "${var.environment}-${var.agent_name}"
  agent_resource_role_arn     = aws_iam_role.agent.arn
  description                 = "Agent: ${var.agent_model.display} | Env: ${var.environment}"
  foundation_model            = data.aws_bedrock_foundation_model.agent.model_id
  instruction                 = var.agent_instruction
  idle_session_ttl_in_seconds = var.idle_session_ttl
  prepare_agent               = true

  tags = {
    Environment = var.environment
    Model       = var.agent_model.display
    ManagedBy   = "terraform"
  }
}

prepare_agent = true is important. Bedrock agents have a lifecycle: after creating or updating, the agent must be "prepared" before it can serve requests. Setting this to true triggers preparation automatically on create and update.
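If a script or test needs to confirm the lifecycle state itself, the status can be polled through the API. This is a sketch under assumptions: it relies on boto3's `bedrock-agent` client and the `agentStatus` field returned by `get_agent`; the helper name, timeout, and poll interval are hypothetical.

```python
import time


def wait_until_prepared(agent_client, agent_id, timeout=120.0, poll=2.0):
    """Poll get_agent until the agent reports PREPARED.

    agent_client is assumed to be boto3.client("bedrock-agent").
    Raises RuntimeError if the agent reports FAILED, and TimeoutError if
    the deadline passes while it is still in another state.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = agent_client.get_agent(agentId=agent_id)["agent"]["agentStatus"]
        if status == "PREPARED":
            return status
        if status == "FAILED":
            raise RuntimeError(f"agent {agent_id} failed to prepare")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"agent {agent_id} still {status} after {timeout}s")
        time.sleep(poll)
```

With prepare_agent = true Terraform already does this waiting for you, so a helper like this is mainly useful in smoke tests that run after apply.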

Agent Alias (Deployable Endpoint)

# agent/alias.tf

resource "aws_bedrockagent_agent_alias" "live" {
  agent_alias_name = "${var.environment}-live"
  agent_id         = aws_bedrockagent_agent.this.agent_id
  description      = "Live alias for ${var.environment}"

  tags = {
    Environment = var.environment
    Model       = var.agent_model.display
  }
}

The alias is what your application calls. Think of it like a DNS name pointing to an agent version. When an alias is created without an explicit routing configuration, Bedrock snapshots the agent into a new numbered version and points the alias at it. Your application code never changes because it references the alias, not a specific version.

🧪 Invoke the Agent

Your application code uses the agent ID and alias ID, which stay stable across model upgrades:

import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.invoke_agent(
    agentId="YOUR_AGENT_ID",
    agentAliasId="YOUR_ALIAS_ID",
    sessionId="user-session-123",
    inputText="What were our Q3 revenue numbers?"
)

# Stream the response
for event in response["completion"]:
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode(), end="")

Notice: sessionId enables multi-turn conversation. Reuse the same sessionId across calls and the agent keeps context until the session has been idle for idle_session_ttl seconds. Beyond choosing and reusing that ID, no session management code is needed on your side.
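A multi-turn exchange is then the same call repeated with one sessionId. The sketch below wraps the invoke_agent call from above in two hypothetical helpers, `ask` and `converse`; the runtime client is passed in (assumed to be boto3.client("bedrock-agent-runtime")) so the logic can be tried against a stub.

```python
def ask(runtime_client, agent_id, alias_id, session_id, text):
    """Send one turn to the agent and return the full reply as a string."""
    response = runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=alias_id,
        sessionId=session_id,
        inputText=text,
    )
    parts = []
    for event in response["completion"]:
        # Only "chunk" events carry reply bytes; other event types are skipped.
        if "chunk" in event:
            parts.append(event["chunk"]["bytes"])
    # Join before decoding so a character split across chunks decodes cleanly.
    return b"".join(parts).decode("utf-8")


def converse(runtime_client, agent_id, alias_id, session_id, turns):
    """Run several turns under one sessionId so the agent keeps context."""
    return [ask(runtime_client, agent_id, alias_id, session_id, t) for t in turns]
```

Generating a fresh sessionId per user conversation (a UUID, for instance) keeps unrelated conversations from sharing context.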

📐 The Upgrade Workflow

When a new model launches, the upgrade is three steps:

1. Update .tfvars - change agent_model.id and agent_model.display
2. Run terraform plan - review the changes (agent updated, IAM policy updated, tags updated)
3. Run terraform apply - the agent is updated and re-prepared with the new model

Your application code doesn't change. The alias ID stays the same. The agent just got smarter.

This works because we externalized everything that changes (model ID, instructions, session TTL) into variables, and everything that stays stable (alias ID, agent ID, application integration) is infrastructure.
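After an apply, a short check can confirm the agent really is on the model named in .tfvars. This is a sketch, assuming boto3's `bedrock-agent` client and the `foundationModel` field in the `get_agent` response; the helper name `check_agent_model` is hypothetical.

```python
def check_agent_model(agent_client, agent_id, expected_model_id):
    """Return the agent's configured model, or raise if it is not the expected one.

    agent_client is assumed to be boto3.client("bedrock-agent"). get_agent
    describes the agent's working draft, which is what Terraform updates.
    """
    actual = agent_client.get_agent(agentId=agent_id)["agent"]["foundationModel"]
    if actual != expected_model_id:
        raise ValueError(f"agent {agent_id} uses {actual}, expected {expected_model_id}")
    return actual
```

Feeding it the same model ID that went into the .tfvars file turns the upgrade into something a pipeline can verify rather than assume.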

🛡️ Attaching Guardrails (Optional)

If you deployed guardrails in Series 1, attach them to the agent:

resource "aws_bedrockagent_agent" "this" {
  # ... all previous config ...

  guardrail_configuration {
    guardrail_identifier = var.guardrail_id
    guardrail_version    = var.guardrail_version
  }
}

The guardrails apply to every agent interaction automatically. No per-request configuration needed.

⚠️ Gotchas and Tips

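Model availability varies by region. Not every model is available in every AWS region. The data.aws_bedrock_foundation_model data source catches this at plan time. Note that some newer models can only be invoked on demand through a cross-region inference profile rather than the bare model ID; if invocation fails with an on-demand throughput error, check the model's supported inference options in the Bedrock console.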

Agent preparation is async. When prepare_agent = true, Terraform waits for preparation to complete. If you're adding action groups (covered in the next post), note that changes to action groups also require re-preparation, which may need a null_resource with a local-exec provisioner to trigger.

Session TTL costs. Longer sessions mean more memory usage. For dev, 600 seconds (10 minutes) is fine. For production chatbots, consider 1800-3600 seconds. Sessions expire silently - the next message starts a new session.

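Alias versioning. An alias routes to a numbered version snapshot, not to the working draft. Bedrock creates a new version when an alias is created or updated without an explicit routing configuration, so after changing the agent, confirm that the live alias routes to a version built from the new configuration; depending on provider behavior you may need to force the alias resource to update. If you need rollback capability, create a separate alias pointing to a specific version number.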
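To see which version an alias serves and which model that version was built with, the routing configuration can be read back through the API. This is a sketch under assumptions: it uses boto3's `bedrock-agent` client with `get_agent_alias` and `get_agent_version`, and the helper names are hypothetical.

```python
def alias_versions(agent_client, agent_id, alias_id):
    """Return the agent version strings the alias currently routes to."""
    response = agent_client.get_agent_alias(agentId=agent_id, agentAliasId=alias_id)
    routing = response["agentAlias"].get("routingConfiguration", [])
    return [entry["agentVersion"] for entry in routing if "agentVersion" in entry]


def alias_models(agent_client, agent_id, alias_id):
    """Map each routed version to the foundation model it was snapshotted with."""
    models = {}
    for version in alias_versions(agent_client, agent_id, alias_id):
        detail = agent_client.get_agent_version(agentId=agent_id, agentVersion=version)
        models[version] = detail["agentVersion"]["foundationModel"]
    return models
```

If the model reported here lags behind the one in .tfvars, the alias is still serving an older snapshot and needs to be pointed at a new version.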

⏭️ What's Next

This is Post 1 of the AI Agents with Terraform series.

Post 1: Deploy First Bedrock Agent (you are here) 🤖
Post 2: Action Groups - Connect Your Agent to APIs
Post 3: Multi-Agent Orchestration
Post 4: Agent + Knowledge Base Grounding

Your first AI agent is deployed. It reasons, it remembers, and when the next model launches, upgrading is a one-line change. The foundation model is a variable, not a commitment. 🤖

Found this helpful? Follow for the full AI Agents with Terraform series! 💬

DEV Community