Suhas Mallesh

Azure AI Foundry Content Safety with Terraform: RAI Policies + Content Filters as Code πŸ›‘οΈ

Azure gives you RAI policies for content filtering and a standalone Content Safety service for PII and custom screening. Here's how to deploy both with Terraform so every safety rule is version-controlled.

You deployed your first Azure AI Foundry endpoint (Post 1). GPT responds, tokens flow. But what stops it from generating hate speech, leaking a customer's credit card number, or falling for a jailbreak?

Azure gives you two safety layers:

  1. RAI (Responsible AI) Policies - content filter policies attached directly to model deployments, filtering hate, violence, sexual content, self-harm, jailbreaks, and protected material
  2. Azure AI Content Safety - a standalone cognitive service for text/image moderation, custom blocklists, and PII detection via external API calls

The key architectural difference from AWS and GCP: on Azure, content filters are attached to the deployment itself via rai_policy_name. The policy and the deployment are tightly coupled. This means safety is enforced at the infrastructure level, not the application level. 🎯

🧱 Azure Safety Architecture

| Layer | Service | What It Does | Managed By |
|---|---|---|---|
| Content filtering | RAI Policy (azurerm_cognitive_account_rai_policy) | Blocks hate, violence, sexual, self-harm content | Terraform |
| Jailbreak detection | RAI Policy (content_filter: Jailbreak) | Blocks prompt injection attempts | Terraform |
| Indirect attacks | RAI Policy (content_filter: Indirect Attack) | Blocks indirect prompt injection via data | Terraform |
| Protected material | RAI Policy (content_filter) | Blocks copyrighted text/code | Terraform |
| Standalone moderation | Content Safety account (kind = "ContentSafety") | Text/image moderation API, custom blocklists | Terraform |
| PII detection | Content Safety + Azure DLP | Detects/redacts sensitive data | Application code |

πŸ—οΈ Step 1: RAI Policy with Content Filters

The azurerm_cognitive_account_rai_policy resource defines content filter rules. Each harm-category filter needs both a Prompt (input) and a Completion (output) entry; the jailbreak, indirect-attack, and protected-material filters apply to only one direction:

# content-safety/rai_policy.tf

resource "azurerm_cognitive_account_rai_policy" "ai_safety" {
  name                 = "${var.environment}-content-filter"
  cognitive_account_id = var.cognitive_account_id
  base_policy_name     = "Microsoft.Default"

  # ━━━ Hate ━━━
  content_filter {
    name               = "Hate"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["hate"]
    source             = "Prompt"
  }
  content_filter {
    name               = "Hate"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["hate"]
    source             = "Completion"
  }

  # ━━━ Sexual ━━━
  content_filter {
    name               = "Sexual"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["sexual"]
    source             = "Prompt"
  }
  content_filter {
    name               = "Sexual"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["sexual"]
    source             = "Completion"
  }

  # ━━━ Violence ━━━
  content_filter {
    name               = "Violence"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["violence"]
    source             = "Prompt"
  }
  content_filter {
    name               = "Violence"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["violence"]
    source             = "Completion"
  }

  # ━━━ Self-Harm ━━━
  content_filter {
    name               = "SelfHarm"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["selfharm"]
    source             = "Prompt"
  }
  content_filter {
    name               = "SelfHarm"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = var.content_filter_thresholds["selfharm"]
    source             = "Completion"
  }

  # ━━━ Jailbreak Detection ━━━
  content_filter {
    name               = "Jailbreak"
    filter_enabled     = true
    block_enabled      = true
    severity_threshold = "High"  # Required by provider, not used by Azure
    source             = "Prompt"
  }

  # ━━━ Indirect Prompt Attack ━━━
  content_filter {
    name               = "Indirect Attack"
    filter_enabled     = var.enable_indirect_attack_filter
    block_enabled      = var.enable_indirect_attack_filter
    severity_threshold = "High"
    source             = "Prompt"
  }

  # ━━━ Protected Material ━━━
  content_filter {
    name               = "Protected Material Text"
    filter_enabled     = var.enable_protected_material
    block_enabled      = var.enable_protected_material
    severity_threshold = "High"
    source             = "Completion"
  }
  content_filter {
    name               = "Protected Material Code"
    filter_enabled     = var.enable_protected_material
    block_enabled      = var.enable_protected_material
    severity_threshold = "High"
    source             = "Completion"
  }
}
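The four harm categories above are eight nearly identical blocks. If the repetition bothers you, a dynamic block can generate one filter per category/source pair. This is a sketch under the same variables; verify the resulting plan matches your hand-written policy before adopting it:

```hcl
# Sketch: generate the four harm-category filters from
# var.content_filter_thresholds instead of repeating blocks.
locals {
  harm_filters = {
    Hate     = var.content_filter_thresholds["hate"]
    Sexual   = var.content_filter_thresholds["sexual"]
    Violence = var.content_filter_thresholds["violence"]
    SelfHarm = var.content_filter_thresholds["selfharm"]
  }
}

resource "azurerm_cognitive_account_rai_policy" "ai_safety" {
  name                 = "${var.environment}-content-filter"
  cognitive_account_id = var.cognitive_account_id
  base_policy_name     = "Microsoft.Default"

  dynamic "content_filter" {
    # One block per (category, source) pair: Hate/Prompt, Hate/Completion, ...
    for_each = {
      for pair in setproduct(keys(local.harm_filters), ["Prompt", "Completion"]) :
      "${pair[0]}-${pair[1]}" => { name = pair[0], source = pair[1] }
    }
    content {
      name               = content_filter.value.name
      filter_enabled     = true
      block_enabled      = true
      severity_threshold = local.harm_filters[content_filter.value.name]
      source             = content_filter.value.source
    }
  }

  # Jailbreak / Indirect Attack / Protected Material blocks as before...
}
```

Note that dynamic blocks are emitted in lexical key order, which may differ from the hand-written order; run terraform plan to confirm the diff is cosmetic.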

πŸ”§ Step 2: Variable-Driven Configuration

# content-safety/variables.tf

variable "environment" {
  type = string
}

variable "cognitive_account_id" {
  type        = string
  description = "ID of the AI Foundry cognitive account from Post 1"
}

variable "content_filter_thresholds" {
  type        = map(string)
  description = "Severity threshold per category: Low, Medium, High"
  default = {
    hate     = "Medium"
    sexual   = "Medium"
    violence = "Medium"
    selfharm = "Low"
  }
}

variable "enable_indirect_attack_filter" {
  type    = bool
  default = true
}

variable "enable_protected_material" {
  type    = bool
  default = true
}
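Since severity_threshold only accepts Low, Medium, or High, a validation block on the same variable catches typos at plan time rather than apply time. A sketch:

```hcl
variable "content_filter_thresholds" {
  type        = map(string)
  description = "Severity threshold per category: Low, Medium, High"
  default = {
    hate     = "Medium"
    sexual   = "Medium"
    violence = "Medium"
    selfharm = "Low"
  }

  validation {
    condition = alltrue([
      for t in values(var.content_filter_thresholds) :
      contains(["Low", "Medium", "High"], t)
    ])
    error_message = "Each threshold must be Low, Medium, or High."
  }
}
```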

Per-environment configs:

# environments/dev.tfvars
environment = "dev"
content_filter_thresholds = {
  hate     = "High"      # Lenient - only block extreme content
  sexual   = "Medium"
  violence = "High"
  selfharm = "Medium"
}
enable_indirect_attack_filter = false  # Skip in dev
enable_protected_material     = false

# environments/prod.tfvars
environment = "prod"
content_filter_thresholds = {
  hate     = "Low"       # Strict - block anything borderline
  sexual   = "Low"
  violence = "Low"
  selfharm = "Low"
}
enable_indirect_attack_filter = true
enable_protected_material     = true

Threshold behavior: Low blocks the most content (anything at low severity or above). High only blocks clearly extreme content. This is the opposite of what you might expect - think of it as "block when severity reaches this level."
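To make the rule concrete, here's a tiny illustrative helper - not an Azure API, just the "block when severity reaches this level" behavior described above, using Azure's four severity labels:

```python
# Illustrative only: models the threshold rule described above.
# Azure's severity labels, lowest to highest.
SEVERITIES = ["Safe", "Low", "Medium", "High"]

def is_blocked(content_severity: str, threshold: str) -> bool:
    """True if content at `content_severity` trips a filter set to `threshold`."""
    return SEVERITIES.index(content_severity) >= SEVERITIES.index(threshold)

# A "Low" threshold blocks the most content:
assert is_blocked("Low", "Low") and is_blocked("Medium", "Low")
# A "High" threshold only blocks clearly extreme content:
assert not is_blocked("Medium", "High")
assert is_blocked("High", "High")
# "Safe" content always passes:
assert not is_blocked("Safe", "Low")
```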

πŸ”Œ Step 3: Attach RAI Policy to Your Deployment

The RAI policy links to your model deployment via rai_policy_name. Update your deployment from Post 1:

resource "azurerm_cognitive_deployment" "primary" {
  name                 = "${var.environment}-${var.primary_model.name}"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id
  rai_policy_name      = azurerm_cognitive_account_rai_policy.ai_safety.name

  model {
    format  = "OpenAI"
    name    = var.primary_model.name
    version = var.primary_model.version
  }

  sku {
    name     = "Standard"
    capacity = var.primary_model.tpm
  }
}

That's it. Every request to this deployment now runs through your content filter. No application code changes. No SDK wrappers. The filtering happens at Azure's infrastructure layer before your app ever sees the response. πŸ”’
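Your app does still need to handle the rejection: when a filter trips, Azure OpenAI returns an HTTP 400 whose error code is content_filter, with per-category results in the error body. A sketch of parsing that body - the payload shape follows Azure's documented format, but error_body here is a hypothetical example, so treat the field names as an assumption to verify:

```python
def blocked_categories(error_body: dict) -> list[str]:
    """Extract which filter categories fired from an Azure OpenAI
    content_filter error body (shape assumed from Azure docs)."""
    err = error_body.get("error", {})
    if err.get("code") != "content_filter":
        return []
    results = err.get("innererror", {}).get("content_filter_result", {})
    return [cat for cat, r in results.items() if r.get("filtered")]

# Hypothetical error payload for a prompt blocked by the Hate filter:
error_body = {
    "error": {
        "code": "content_filter",
        "message": "The response was filtered...",
        "innererror": {
            "code": "ResponsibleAIPolicyViolation",
            "content_filter_result": {
                "hate":     {"filtered": True,  "severity": "medium"},
                "violence": {"filtered": False, "severity": "safe"},
            },
        },
    }
}

print(blocked_categories(error_body))  # → ['hate']
```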

πŸ›‘οΈ Step 4: Standalone Content Safety (Optional Layer)

For additional screening beyond RAI policies - custom blocklists, standalone text moderation, or screening content from non-Azure-OpenAI sources - deploy the Content Safety service:

resource "azurerm_cognitive_account" "content_safety" {
  name                  = "${var.environment}-content-safety"
  location              = var.location
  resource_group_name   = var.resource_group_name
  kind                  = "ContentSafety"
  sku_name              = "S0"
  custom_subdomain_name = "${var.environment}-content-safety-${var.unique_suffix}"

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = var.environment
    Purpose     = "ai-content-moderation"
  }
}
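The Function App authenticates to this service with DefaultAzureCredential, so its managed identity needs RBAC on the account - the built-in "Cognitive Services User" role is enough to call the analyze APIs. A sketch, where the output name and var.function_app_principal_id are assumptions for your wiring:

```hcl
# Expose the endpoint for the Function App's settings.
output "content_safety_endpoint" {
  value = azurerm_cognitive_account.content_safety.endpoint
}

# Grant the Function App's system-assigned identity access to Content Safety.
resource "azurerm_role_assignment" "func_content_safety" {
  scope                = azurerm_cognitive_account.content_safety.id
  role_definition_name = "Cognitive Services User"
  principal_id         = var.function_app_principal_id # hypothetical variable
}
```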

Use it in your Function App for additional screening:

import os

from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.identity import DefaultAzureCredential

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

def screen_text(text):
    """Screen content through Azure AI Content Safety."""
    result = client.analyze_text(AnalyzeTextOptions(text=text))

    # Default output uses four severity levels: 0, 2, 4, 6.
    # Block at 4 (medium) and above.
    for category in result.categories_analysis:
        if category.severity and category.severity >= 4:
            return False, f"Blocked: {category.category}"
    return True, None

This gives you the same standalone moderation pattern as AWS Bedrock's ApplyGuardrail API or GCP's Model Armor - useful for screening content from any source.

πŸ“Š Tri-Cloud Safety Comparison

| Capability | AWS Bedrock | GCP Vertex AI | Azure AI Foundry |
|---|---|---|---|
| Content filtering | Guardrail resource | Model Armor + Gemini SafetySettings | RAI Policy (attached to deployment) |
| Filter attachment | Per-request parameter | Per-request + standalone API | Per-deployment (infrastructure-level) |
| PII/DLP | Built into guardrail | Model Armor + SDP | Content Safety API + Azure DLP |
| Jailbreak detection | Content filter category | Model Armor filter | RAI Policy filter |
| Custom blocklists | Word policy in guardrail | System instructions | RAI Policy + Content Safety |
| Protected material | Not available | Not available | RAI Policy filter (text + code) |
| Terraform resource | aws_bedrock_guardrail | google_model_armor_template | azurerm_cognitive_account_rai_policy |

Azure's unique advantage: Protected material detection. Azure can detect and block copyrighted text and code in model outputs - a feature neither AWS nor GCP offers natively. For enterprises concerned about IP liability, this is significant. 🎯

🎯 What You Just Built

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  User Input                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  RAI Policy (Prompt Filters)     β”‚
β”‚  βœ“ Hate / Sexual / Violence      β”‚
β”‚  βœ“ Self-harm detection           β”‚
β”‚  βœ“ Jailbreak detection           β”‚
β”‚  βœ“ Indirect attack detection     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚ Passed?
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Azure OpenAI Model              β”‚
β”‚  (GPT-4.1, o4-mini, GPT-5, etc.) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  RAI Policy (Completion Filters) β”‚
β”‚  βœ“ Content safety filters        β”‚
β”‚  βœ“ Protected material (text+code)β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚ Passed?
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  User Response                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

RAI policy attached at the deployment level. Every request filtered automatically. All managed by Terraform with environment-specific .tfvars. πŸš€

⏭️ What's Next

This is Post 2 of the AI Infra on Azure with Terraform series.


Content safety attached at the infrastructure level - not bolted on in application code. When compliance asks how content is filtered, point them to a Terraform plan, not a portal screenshot. πŸ”’

Found this helpful? Follow for the full AI Infra on Azure with Terraform series! πŸ’¬
