Azure doesn't send AI diagnostic logs anywhere by default. One Terraform resource changes that - azurerm_monitor_diagnostic_setting routes audit logs, request/response data, and metrics to Log Analytics and Storage.
You've deployed your Azure AI Foundry endpoint (Post 1) and added content safety policies (Post 2). Your models are serving responses in production. Then your compliance team asks:
"Can you prove who called which model, when, and how long each request took?"
Azure AI services emit three categories of diagnostic logs - Audit, RequestResponse, and Trace. But they don't go anywhere by default. Without a diagnostic setting, every API call vanishes into the void. One Terraform resource fixes this: `azurerm_monitor_diagnostic_setting` routes those logs to Log Analytics for real-time queries and to Storage for long-term compliance retention.
## What Gets Logged
Azure Cognitive Services (the resource backing AI Foundry) emits three log categories:
| Category | What It Captures | Compliance Value |
|---|---|---|
| Audit | Key access events (ListKeys operations) | Security audit trail |
| RequestResponse | Every API call - model, operation, duration, caller IP, status code | Usage tracking, performance monitoring |
| Trace | Internal service trace data | Debugging |
The RequestResponse category is the workhorse. Each entry includes the operation name (completions, chat, embeddings), duration in milliseconds, HTTP status code, and partial caller IP. As of late 2024, Entra ID object IDs are also included when using AAD authentication instead of API keys.
Important limitation: RequestResponse logs capture metadata about calls (which model, how long, success/failure) but do not include the actual prompt or response content. For full prompt/response capture, you need Azure API Management (APIM) in front of your endpoint or application-level logging.
## Step 1: Log Analytics Workspace
Log Analytics is where you'll run real-time KQL queries against your AI logs:
```hcl
# logging/log_analytics.tf
resource "azurerm_log_analytics_workspace" "ai_logs" {
  name                = "${var.environment}-ai-foundry-logs"
  location            = var.location
  resource_group_name = var.resource_group_name
  sku                 = "PerGB2018"
  retention_in_days   = var.log_analytics_retention_days

  tags = {
    Environment = var.environment
    Purpose     = "ai-diagnostic-logging"
  }
}
```
## Step 2: Storage Account for Long-Term Retention
For compliance retention beyond the Log Analytics interactive-retention limit (730 days), archive to a Storage Account with a lifecycle management policy:
```hcl
# logging/storage.tf
resource "azurerm_storage_account" "ai_logs" {
  name                     = "${var.environment}ailogstore"
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = var.environment == "prod" ? "GRS" : "LRS"
  min_tls_version          = "TLS1_2"

  tags = {
    Environment = var.environment
    Purpose     = "ai-diagnostic-logging"
  }
}

resource "azurerm_storage_management_policy" "ai_logs" {
  storage_account_id = azurerm_storage_account.ai_logs.id

  rule {
    name    = "archive-old-logs"
    enabled = true

    filters {
      blob_types = ["blockBlob"]
    }

    actions {
      base_blob {
        tier_to_cool_after_days_since_modification_greater_than    = var.cool_tier_days
        tier_to_archive_after_days_since_modification_greater_than = var.archive_tier_days
        delete_after_days_since_modification_greater_than          = var.storage_retention_days
      }
    }
  }
}
```
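Diagnostic-setting exports land in containers named with an `insights-logs-` prefix (one per category). If the account also stores other blobs, you can scope the lifecycle rule to just those containers. A sketch of a narrower variant, assuming the default container naming:

```hcl
# Variant of the policy above, scoped to diagnostic-log containers only.
resource "azurerm_storage_management_policy" "ai_logs_scoped" {
  storage_account_id = azurerm_storage_account.ai_logs.id

  rule {
    name    = "archive-ai-diagnostic-logs"
    enabled = true

    filters {
      blob_types = ["blockBlob"]
      # Assumes Azure's default "insights-logs-<category>" container naming.
      prefix_match = ["insights-logs-"]
    }

    actions {
      base_blob {
        tier_to_cool_after_days_since_modification_greater_than    = var.cool_tier_days
        tier_to_archive_after_days_since_modification_greater_than = var.archive_tier_days
        delete_after_days_since_modification_greater_than          = var.storage_retention_days
      }
    }
  }
}
```

Note that a storage account accepts only one management policy resource, so this replaces the broader rule above rather than coexisting with it.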
## Step 3: The Diagnostic Setting
This is the core resource. It attaches to your Cognitive Services account and routes all three log categories plus metrics to both destinations:
```hcl
# logging/diagnostic_setting.tf
resource "azurerm_monitor_diagnostic_setting" "ai_foundry" {
  name                       = "${var.environment}-ai-foundry-diagnostics"
  target_resource_id         = var.cognitive_account_id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.ai_logs.id
  storage_account_id         = azurerm_storage_account.ai_logs.id

  enabled_log {
    category = "Audit"
  }

  enabled_log {
    category = "RequestResponse"
  }

  enabled_log {
    category = "Trace"
  }

  metric {
    category = "AllMetrics"
  }
}
```
That's it. Every API call to your AI Foundry endpoint now flows to both Log Analytics and Storage. No RBAC role assignments to configure, no bucket-policy equivalents - Azure handles the plumbing internally.
Key detail: The `target_resource_id` is your `azurerm_cognitive_account` resource ID from Post 1. If you have multiple cognitive accounts (e.g., separate ones per environment), each needs its own diagnostic setting.
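For the multi-account case, a `for_each` keeps that to one resource block. A sketch, assuming a hypothetical `cognitive_account_ids` map variable (logical name => resource ID):

```hcl
# Hypothetical variable: one entry per cognitive account to instrument.
variable "cognitive_account_ids" {
  type = map(string)
}

resource "azurerm_monitor_diagnostic_setting" "ai_foundry" {
  for_each = var.cognitive_account_ids

  name                       = "${var.environment}-${each.key}-diagnostics"
  target_resource_id         = each.value
  log_analytics_workspace_id = azurerm_log_analytics_workspace.ai_logs.id
  storage_account_id         = azurerm_storage_account.ai_logs.id

  enabled_log {
    category = "Audit"
  }

  enabled_log {
    category = "RequestResponse"
  }

  enabled_log {
    category = "Trace"
  }

  metric {
    category = "AllMetrics"
  }
}
```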
## Step 4: Variables
```hcl
# logging/variables.tf
variable "environment" { type = string }
variable "location" { type = string }
variable "resource_group_name" { type = string }
variable "cognitive_account_id" { type = string }
variable "action_group_id" { type = string } # consumed by the alerts in Step 6

variable "log_analytics_retention_days" {
  type    = number
  default = 30
}

variable "cool_tier_days" {
  type    = number
  default = 30
}

variable "archive_tier_days" {
  type    = number
  default = 90
}

variable "storage_retention_days" {
  type    = number
  default = 365
}
```
Per-environment configs:
```hcl
# environments/dev.tfvars
log_analytics_retention_days = 30
cool_tier_days               = 14
archive_tier_days            = 30
storage_retention_days       = 90
```

```hcl
# environments/prod.tfvars
log_analytics_retention_days = 90
cool_tier_days               = 90
archive_tier_days            = 365
storage_retention_days       = 2555 # 7 years for regulated industries
```
## Step 5: Query Your Logs
Once diagnostic settings are active, query your AI logs using KQL in Log Analytics:
```kusto
// All AI API calls in the last 24 hours
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where TimeGenerated > ago(24h)
| project TimeGenerated, OperationName, DurationMs, ResultType
| order by TimeGenerated desc

// Average response time by operation
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize avg(DurationMs), count() by OperationName
| order by count_ desc

// Error rate over time (5-minute buckets)
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize
    total = count(),
    errors = countif(ResultType != "Success")
    by bin(TimeGenerated, 5m)
| extend error_rate = round(errors * 100.0 / total, 2)
| order by TimeGenerated desc

// Request volume by hour (capacity planning)
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by bin(TimeGenerated, 1h), OperationName
| render timechart
```
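And for the compliance question that opened this post - who called what - a sketch of a caller breakdown, assuming the standard `AzureDiagnostics` schema (`CallerIPAddress` is partially masked for Cognitive Services, and identity columns vary by auth mode):

```kusto
// Call volume and latency grouped by caller IP and operation
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| summarize calls = count(), avg(DurationMs) by CallerIPAddress, OperationName
| order by calls desc
```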
## Step 6: Alerting
Set up alerts for error spikes and latency degradation:
```hcl
# logging/alerts.tf
resource "azurerm_monitor_metric_alert" "ai_latency" {
  name                = "${var.environment}-ai-high-latency"
  resource_group_name = var.resource_group_name
  scopes              = [var.cognitive_account_id]
  description         = "AI Foundry response latency exceeds threshold"
  severity            = 2
  frequency           = "PT5M"
  window_size         = "PT15M"

  criteria {
    metric_namespace = "Microsoft.CognitiveServices/accounts"
    metric_name      = "Latency"
    aggregation      = "Average"
    operator         = "GreaterThan"
    threshold        = 5000
  }

  action {
    action_group_id = var.action_group_id
  }
}

resource "azurerm_monitor_metric_alert" "ai_errors" {
  name                = "${var.environment}-ai-high-error-rate"
  resource_group_name = var.resource_group_name
  scopes              = [var.cognitive_account_id]
  description         = "AI Foundry error rate spike"
  severity            = 1
  frequency           = "PT1M"
  window_size         = "PT5M"

  criteria {
    metric_namespace = "Microsoft.CognitiveServices/accounts"
    metric_name      = "ClientErrors"
    aggregation      = "Total"
    operator         = "GreaterThan"
    threshold        = 50
  }

  action {
    action_group_id = var.action_group_id
  }
}
```
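Both alerts reference `var.action_group_id`. If you don't already have an action group, a minimal sketch - file name, group name, and email address are placeholders:

```hcl
# logging/action_group.tf (hypothetical file)
resource "azurerm_monitor_action_group" "ai_ops" {
  name                = "${var.environment}-ai-ops"
  resource_group_name = var.resource_group_name
  short_name          = "aiops" # 12-character limit

  email_receiver {
    name          = "oncall"
    email_address = "oncall@example.com" # placeholder
  }
}
```

Pass `azurerm_monitor_action_group.ai_ops.id` as the `action_group_id` input instead of wiring it from outside the module.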
## Production Architecture
```
+----------------------------------+
|    Azure AI Foundry API Call     |
|   (Chat / Completions / Embed)   |
+----------------+-----------------+
                 |
         Diagnostic Setting
 (azurerm_monitor_diagnostic_setting)
                 |
       +---------+---------+
       |         |         |
       v         v         v
 +---------+ +---------+ +---------+
 |   Log   | | Storage | | Metric  |
 |Analytics| | Account | | Alerts  |
 |         | |         | |         |
 |   KQL   | |  Long-  | | Latency |
 | queries | |  term   | | & error |
 | & dash  | | archive | | alerts  |
 +---------+ +---------+ +---------+
```
Dual-destination pattern: Log Analytics for real-time queries and dashboards (shorter retention, KQL access). Storage Account for compliance retention (lifecycle to Cool/Archive tiers, years of data).
## Tri-Cloud Comparison: Logging Architecture
| Aspect | AWS (Bedrock) | GCP (Vertex AI) | Azure (AI Foundry) |
|---|---|---|---|
| Core Terraform resource | `aws_bedrock_model_invocation_logging_configuration` | `google_project_iam_audit_config` + log sinks | `azurerm_monitor_diagnostic_setting` |
| Prompt/response bodies | Included in logs (inline) | Separate BigQuery logging (API config) | Not in diagnostic logs (needs APIM) |
| Real-time query engine | CloudWatch Insights | BigQuery SQL | KQL (Log Analytics) |
| Long-term storage | S3 + Glacier lifecycle | GCS + Nearline/Coldline lifecycle | Storage Account + Cool/Archive lifecycle |
| Scope | Per-region, per-account | Per-project, per-service | Per-resource |
| Log categories | Single config (text/image/embedding toggles) | Three audit log types (Admin/Data Read/Data Write) | Three categories (Audit/RequestResponse/Trace) |
The biggest Azure difference: diagnostic settings are a generic Azure Monitor pattern that works identically across all Azure services, not an AI-specific feature. The same `azurerm_monitor_diagnostic_setting` you'd use for a SQL database or Key Vault works for AI Foundry. This makes it familiar if you're already on Azure, but it also means the AI-specific logging depth (like full prompt/response capture) requires additional architecture.
## What's Next
This is Post 3 of the Azure AI Infrastructure with Terraform series.
- Post 1: Deploy AI Foundry: First AI Endpoint
- Post 2: AI Foundry Content Safety
- Post 3: Diagnostic Logging (you are here)
Every AI Foundry call now has a paper trail. Operation names, durations, status codes, and caller identity - all flowing to Log Analytics and Storage, all managed by Terraform, all queryable with KQL.
Found this helpful? Follow for the full Azure AI Infrastructure with Terraform series!