Suhas Mallesh

Deploy Your First Azure AI Agent with Terraform: Model-Agnostic and Future-Proof 🤖

Azure AI Foundry hosts agents that reason, remember, and use tools. Here's how to deploy the infrastructure with Terraform and create your first agent with the Python SDK, where upgrading to the next GPT or Claude model is a single variable change.

In Series 1, we deployed an Azure AI Foundry endpoint and called GPT-4o directly. Single prompt, single response, stateless.

Azure AI agents are different. An agent wraps a model deployment with an orchestration layer that maintains conversation threads, decides when to call tools, and loops through reasoning steps until it resolves the user's request. The model generates text; the agent decides what to do with it.

Azure's agent stack lives inside AI Foundry: you create an AI Services resource with project management enabled, deploy a model, then use the Azure AI Agent SDK to create agents that use that deployment. Terraform provisions the infrastructure (Foundry resource, project, model deployment, IAM). The Python SDK creates and manages agents. When GPT-5.2 or the next Claude drops, you change one variable. 🎯

🧠 Endpoint vs Agent: What's Different

| Capability | Direct Endpoint (Series 1) | Azure AI Agent |
|---|---|---|
| Interaction | Single request/response | Multi-turn conversation threads |
| State | Stateless | Thread-based message history |
| Tool use | Manual implementation | Function calling, code interpreter, file search |
| Reasoning | Whatever the model does | Run-based orchestration loop |
| Deployment | Model deployment only | Agent + model deployment + thread management |
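That "run-based orchestration loop" row is the core difference, and it's worth seeing in miniature. Here's a toy sketch of the loop in pure Python — `fake_model`, `TOOLS`, and the message shapes are stand-ins invented for illustration, not the Azure SDK:

```python
# Toy model of the agent run loop: the service repeats
# model call -> tool dispatch until the model stops asking for tools.
# fake_model and TOOLS are hypothetical stand-ins, not Azure APIs.

TOOLS = {"get_revenue": lambda quarter: {"Q3": "4.2M"}.get(quarter, "unknown")}

def fake_model(messages):
    # Ask for a tool once, then answer using the tool's output.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": ("get_revenue", "Q3")}
    tool_output = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"text": f"Q3 revenue was {tool_output}."}

def run_agent(user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):                      # the "run" loop
        step = fake_model(messages)
        if "tool_call" in step:                     # model wants a tool
            name, arg = step["tool_call"]
            messages.append({"role": "tool", "content": TOOLS[name](arg)})
            continue                                # feed the result back in
        return step["text"]                         # model produced an answer
    return "max steps exceeded"

print(run_agent("What were our Q3 revenue numbers?"))
```

The endpoint from Series 1 is just the `fake_model` call; the agent service is everything around it.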

🔧 Model Configuration: The Future-Proof Layer

# variables.tf

variable "agent_model" {
  description = "Model for the agent. Change this to upgrade."
  type = object({
    name    = string  # Azure OpenAI model name
    version = string  # Model version
    sku     = string  # Deployment SKU (Standard, GlobalStandard)
    tpm     = number  # Capacity in thousands of tokens per minute (30 = 30K TPM)
  })
  default = {
    name    = "gpt-4o"
    version = "2024-11-20"
    sku     = "GlobalStandard"
    tpm     = 30
  }
}

variable "agent_name" {
  description = "Name of the AI agent"
  type        = string
  default     = "assistant"
}

variable "agent_instruction" {
  description = "System instruction for the agent"
  type        = string
}
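If you want Terraform to catch a bad SKU at plan time instead of at the ARM API, a `validation` block inside `agent_model` helps. A sketch — the allowed-SKU list here is an assumption, so trim it to what your subscription actually offers:

```hcl
# Inside variable "agent_model" — rejects unknown SKUs at plan time.
# The allowed values below are illustrative; adjust for your subscription.
validation {
  condition     = contains(["Standard", "GlobalStandard"], var.agent_model.sku)
  error_message = "agent_model.sku must be Standard or GlobalStandard."
}
```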

Environment-specific overrides:

# environments/dev.tfvars
agent_model = {
  name    = "gpt-4o-mini"
  version = "2024-07-18"
  sku     = "GlobalStandard"
  tpm     = 30
}

# environments/prod.tfvars
agent_model = {
  name    = "gpt-4o"
  version = "2024-11-20"
  sku     = "GlobalStandard"
  tpm     = 80
}

# When a new model launches:
# agent_model = {
#   name    = "gpt-5"
#   version = "2025-08-07"
#   sku     = "GlobalStandard"
#   tpm     = 80
# }

🔧 Terraform: Provision the Infrastructure

AI Foundry Resource and Project

Azure AI Agent Service requires an AI Services resource with project_management_enabled = true. This is the newer "Foundry" resource type that supports agents natively:

# agent/foundry.tf

resource "azurerm_cognitive_account" "ai_foundry" {
  name                       = "${var.environment}-${var.project}-foundry"
  location                   = azurerm_resource_group.this.location
  resource_group_name        = azurerm_resource_group.this.name
  kind                       = "AIServices"
  sku_name                   = "S0"
  custom_subdomain_name      = "${var.environment}-${var.project}-foundry"
  project_management_enabled = true

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = var.environment
    Model       = var.agent_model.name
    ManagedBy   = "terraform"
  }
}

resource "azurerm_cognitive_account_project" "agent_project" {
  name                  = "${var.environment}-agent-project"
  cognitive_account_id  = azurerm_cognitive_account.ai_foundry.id
}

project_management_enabled = true is the critical setting. Without it, the AI Services resource won't support agent operations. The S0 SKU is required for stateful features including the agent service.

Model Deployment

# agent/deployment.tf

resource "azurerm_cognitive_deployment" "agent_model" {
  name                 = "${var.environment}-${var.agent_model.name}"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id

  sku {
    name     = var.agent_model.sku
    capacity = var.agent_model.tpm
  }

  model {
    format  = "OpenAI"
    name    = var.agent_model.name
    version = var.agent_model.version
  }
}

When you change agent_model in .tfvars, Terraform replaces the deployment (the resource name embeds the model name) and regenerates agent_config.json, so the SDK picks up the new deployment name without any code changes.

RBAC for Agent Operations

# agent/iam.tf

data "azurerm_client_config" "current" {}

# Your identity needs Cognitive Services User to create agents
resource "azurerm_role_assignment" "agent_user" {
  scope                = azurerm_cognitive_account.ai_foundry.id
  role_definition_name = "Cognitive Services User"
  principal_id         = data.azurerm_client_config.current.object_id
}

🐍 Create the Agent with Python SDK

Terraform provisions the infrastructure. The Python SDK creates and manages agents:

# agent_code/create_agent.py

from azure.ai.agents import AgentsClient
from azure.identity import DefaultAzureCredential
import json

# Load config from Terraform output
with open("agent_config.json") as f:
    config = json.load(f)

client = AgentsClient(
    endpoint=config["foundry_endpoint"],
    credential=DefaultAzureCredential(),
)

# Create the agent - model is a variable, not hardcoded
agent = client.create_agent(
    model=config["deployment_name"],
    name=config["agent_name"],
    instructions=config["instruction"],
)

print(f"Agent created: {agent.id}")

Terraform Outputs for SDK

Bridge Terraform infrastructure to the Python SDK:

# agent/outputs.tf

resource "local_file" "agent_config" {
  filename = "${path.module}/agent_code/agent_config.json"
  content = jsonencode({
    foundry_endpoint = azurerm_cognitive_account.ai_foundry.endpoint
    deployment_name  = azurerm_cognitive_deployment.agent_model.name
    agent_name       = "${var.environment}-${var.agent_name}"
    instruction      = var.agent_instruction
  })
}
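create_agent.py trusts agent_config.json blindly. A small guard that fails fast when `terraform apply` hasn't run yet is cheap insurance — a sketch, with the required keys mirroring the jsonencode output in outputs.tf:

```python
# Validate the Terraform-generated config before touching the Azure SDK.
import json
import sys

REQUIRED_KEYS = {"foundry_endpoint", "deployment_name", "agent_name", "instruction"}

def load_agent_config(path="agent_config.json"):
    try:
        with open(path) as f:
            config = json.load(f)
    except FileNotFoundError:
        sys.exit(f"{path} not found - run 'terraform apply' first")
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        sys.exit(f"{path} is missing keys: {sorted(missing)}")
    return config
```

Drop it in at the top of create_agent.py in place of the bare `json.load`.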

🧪 Invoke the Agent

Azure agents use a thread-based conversation model: create a thread, add messages, create a run, and stream the response:

# Create a conversation thread
thread = client.threads.create()

# Add user message
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="What were our Q3 revenue numbers?"
)

# Run the agent on the thread
run = client.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
)

# Get the response (messages are returned newest-first by default)
if run.status == "completed":
    messages = client.messages.list(thread_id=thread.id)
    for msg in messages:
        if msg.role == "assistant":
            print(msg.content[0].text.value)
else:
    print(f"Run ended with status: {run.status}")

Thread = conversation session. Each thread maintains full message history. The agent sees all previous messages when generating a response. Threads persist until you delete them.

📐 The Upgrade Workflow

When a new model launches:

  1. Update .tfvars - change agent_model.name, agent_model.version
  2. Run terraform apply - deployment updated, config regenerated
  3. Recreate agent - python create_agent.py uses the new deployment

Existing threads continue working. The agent just uses the new model for future runs. No application code changes needed because the deployment name is derived from the variable.
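Step 3 can become a no-op when nothing changed if you record which deployment the live agent was created against. A sketch of that decision — agent_state.json and its shape are conventions invented for this example, not part of the SDK:

```python
# Decide whether the agent needs recreating after 'terraform apply'.
# agent_state.json (the deployment the live agent was built with) is
# a convention of this sketch, not an Azure artifact.
import json
import os

def needs_recreate(config_path="agent_config.json", state_path="agent_state.json"):
    with open(config_path) as f:
        desired = json.load(f)["deployment_name"]
    if not os.path.exists(state_path):
        return True                       # never deployed
    with open(state_path) as f:
        current = json.load(f).get("deployment_name")
    return current != desired             # model changed -> new deployment name
```

Write agent_state.json after a successful create_agent call, and the upgrade script only recreates when the deployment actually moved.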

⚠️ Gotchas and Tips

Use azapi for gaps. The azurerm provider covers AI Foundry resources well, but some features (workspace connections, certain preview APIs) need the azapi provider as an escape hatch.
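For reference, the escape hatch looks roughly like this — a hypothetical sketch, where the resource type and API version are assumptions you should check against the ARM REST reference:

```hcl
# azapi lets you reach ARM surface area azurerm doesn't expose yet.
# The type/API version and properties below are illustrative only.
resource "azapi_resource" "foundry_connection" {
  type      = "Microsoft.CognitiveServices/accounts/connections@2025-04-01-preview"
  name      = "example-connection"
  parent_id = azurerm_cognitive_account.ai_foundry.id

  body = {
    properties = {
      category = "CustomKeys"
      authType = "None"
    }
  }
}
```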

Model availability varies by region. GPT-4o is widely available, but newer models may be limited to specific regions. Check Azure OpenAI model availability before setting your region.

Thread cleanup. Threads don't auto-expire. In production, implement a cleanup policy to delete old threads and avoid storage buildup.
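The policy itself is just date math over the thread list. A sketch of the selection step — the (thread_id, created_at) pairs would come from listing threads via the SDK, stubbed here:

```python
# Pick threads older than a retention window for deletion.
from datetime import datetime, timedelta, timezone

def threads_to_delete(threads, max_age_days=30, now=None):
    """threads: iterable of (thread_id, created_at datetime) pairs."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [tid for tid, created in threads if created < cutoff]
```

Run it on a schedule and call the SDK's thread delete operation on each returned ID.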

Agent recreation. Unlike AWS Bedrock agents, which you update in place, Azure agents here are recreated with a new create_agent call when instructions or the model change (the SDK also offers update_agent if you'd rather patch in place). Either way, store the agent ID in your application config.

⏭️ What's Next

This is Post 1 of the Azure AI Agents with Terraform series.

  • Post 1: Deploy First Azure AI Agent (you are here) 🤖
  • Post 2: Function Calling - Connect to APIs
  • Post 3: Multi-Agent with AI Agent Service
  • Post 4: Agent + Bing Grounding

Your first Azure agent is deployed. Terraform provisions the Foundry, the SDK creates the agent, and when the next model launches, upgrading is a variable change. The model is configuration, not commitment. 🤖

Found this helpful? Follow for the full AI Agents with Terraform series! 💬
