Brian Spann

Posted on Feb 18

Azure AI Agent Service Part 5: Deploying Multi-Agent Systems to Azure AI Foundry

#ai #csharp #dotnet #azure

Deploying Multi-Agent Systems to Azure AI Foundry

Part 5 of 5: From Development to Production

Welcome to the finale of our Azure AI Agent Service series! We've covered a lot of ground—from foundational concepts to building your first agent, advanced patterns, and multi-agent orchestration. Now it's time to take everything we've built and deploy it to production.

In this final installment, we'll walk through deploying multi-agent systems to Azure AI Foundry, covering infrastructure setup, CI/CD pipelines, scaling strategies, and monitoring. By the end, you'll have a complete production-ready deployment pipeline.

Understanding Azure AI Foundry

Azure AI Foundry (formerly Azure AI Studio) is Microsoft's unified platform for building, deploying, and managing AI applications. Think of it as the production home for your agents—providing the infrastructure, security, and observability you need for enterprise deployments.

Key Components for Agent Deployments

Component	Purpose
AI Hub	Central governance and resource sharing
AI Project	Isolated workspace for your agent application
Connections	Secure credentials for Azure OpenAI, storage, etc.
Deployments	Model endpoints your agents consume
Prompt Flow	Orchestration runtime for complex agent flows

Why Azure AI Foundry for Agents?

Integrated Security - Managed identities, key vault integration, and VNET support
Unified Monitoring - Built-in tracing with Azure Monitor and Application Insights
Model Management - Version control and A/B testing for model deployments
Cost Controls - Budgets, quotas, and consumption tracking
Compliance - Enterprise-grade data residency and privacy controls

Infrastructure as Code: Setting Up with Bicep

Let's define our production infrastructure. We'll use Bicep (Azure's declarative IaC language) to create a repeatable, version-controlled deployment.

The Complete Infrastructure Template

// main.bicep - Azure AI Foundry Infrastructure for Multi-Agent System

@description('Environment name (dev, staging, prod)')
param environment string = 'prod'

@description('Azure region for deployment')
param location string = resourceGroup().location

@description('Base name for all resources')
param baseName string = 'aiagents'

// Variables
var uniqueSuffix = uniqueString(resourceGroup().id)
var hubName = '${baseName}-hub-${environment}'
var projectName = '${baseName}-project-${environment}'
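var openAiName = '${baseName}-openai-${uniqueSuffix}'
// Storage account names are limited to 3-24 lowercase letters and digits
var storageAccountName = take(toLower('${baseName}st${uniqueSuffix}'), 24)
var appInsightsName = '${baseName}-insights-${environment}'
// Key Vault names are limited to 3-24 characters
var keyVaultName = take('${baseName}-kv-${uniqueSuffix}', 24)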

// Log Analytics Workspace (required for App Insights)
resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: '${baseName}-logs-${environment}'
  location: location
  properties: {
    sku: {
      name: 'PerGB2018'
    }
    retentionInDays: 30
  }
}

// Application Insights for monitoring
resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: appInsightsName
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalytics.id
    publicNetworkAccessForIngestion: 'Enabled'
    publicNetworkAccessForQuery: 'Enabled'
  }
}

// Key Vault for secrets
resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: keyVaultName
  location: location
  properties: {
    sku: {
      family: 'A'
      name: 'standard'
    }
    tenantId: subscription().tenantId
    enableRbacAuthorization: true
    enableSoftDelete: true
    softDeleteRetentionInDays: 7
  }
}

// Storage Account for agent artifacts
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
  properties: {
    minimumTlsVersion: 'TLS1_2'
    supportsHttpsTrafficOnly: true
  }
}

// Azure OpenAI Service
resource openAi 'Microsoft.CognitiveServices/accounts@2023-10-01-preview' = {
  name: openAiName
  location: location
  kind: 'OpenAI'
  sku: {
    name: 'S0'
  }
  properties: {
    customSubDomainName: openAiName
    publicNetworkAccess: 'Enabled'
  }
}

// GPT-4o Deployment for agents
resource gpt4Deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview' = {
  parent: openAi
  name: 'gpt-4o'
  sku: {
    name: 'Standard'
    capacity: 30 // TPM in thousands
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4o'
      version: '2024-08-06'
    }
  }
}

// AI Hub - Central governance
resource aiHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: hubName
  location: location
  kind: 'Hub'
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    friendlyName: 'AI Agents Hub (${environment})'
    storageAccount: storageAccount.id
    keyVault: keyVault.id
    applicationInsights: appInsights.id
    publicNetworkAccess: 'Enabled'
  }
}

// AI Project - Your agent workspace
resource aiProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
  name: projectName
  location: location
  kind: 'Project'
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    friendlyName: 'Multi-Agent System (${environment})'
    hubResourceId: aiHub.id
    publicNetworkAccess: 'Enabled'
  }
}

// Connection to Azure OpenAI
resource openAiConnection 'Microsoft.MachineLearningServices/workspaces/connections@2024-04-01' = {
  parent: aiHub
  name: 'azure-openai-connection'
  properties: {
    category: 'AzureOpenAI'
    target: openAi.properties.endpoint
    authType: 'AAD'
    metadata: {
      ApiType: 'Azure'
      ResourceId: openAi.id
    }
  }
}

// Outputs for CI/CD
output projectId string = aiProject.id
output projectName string = aiProject.name
output openAiEndpoint string = openAi.properties.endpoint
output appInsightsConnectionString string = appInsights.properties.ConnectionString
output storageAccountName string = storageAccount.name

Deploying the Infrastructure

# Create resource group
az group create --name rg-aiagents-prod --location eastus2

# Deploy infrastructure
az deployment group create \
  --resource-group rg-aiagents-prod \
  --template-file main.bicep \
  --parameters environment=prod baseName=myagents

# Get outputs for application configuration
# (the deployment name defaults to the template file name, here "main")
az deployment group show \
  --resource-group rg-aiagents-prod \
  --name main \
  --query properties.outputs
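
The last command prints each output as an object with `type` and `value` fields. As a minimal sketch, assuming that JSON shape and a hypothetical helper name (`DeploymentOutputs`), the outputs can be flattened into simple name/value pairs before they are copied into app settings or pipeline variables:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of any Azure SDK.
public static class DeploymentOutputs
{
    // Flattens { "name": { "type": "String", "value": "..." } } into name -> value.
    public static Dictionary<string, string> Parse(string outputsJson)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        using var doc = JsonDocument.Parse(outputsJson);

        foreach (var output in doc.RootElement.EnumerateObject())
        {
            // Skip anything that does not look like a deployment output entry.
            if (output.Value.ValueKind != JsonValueKind.Object ||
                !output.Value.TryGetProperty("value", out var value))
            {
                continue;
            }

            // String outputs are unwrapped; other types keep their raw JSON text.
            result[output.Name] = value.ValueKind == JsonValueKind.String
                ? value.GetString() ?? ""
                : value.GetRawText();
        }

        return result;
    }
}
```

For example, the `appInsightsConnectionString` output could then be supplied to the host as the `ApplicationInsights__ConnectionString` environment variable, which ASP.NET Core maps to the `ApplicationInsights:ConnectionString` configuration key.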

Deploying Agents to Production

With infrastructure in place, let's deploy our multi-agent system. We'll create a containerized application that hosts our agents.

Production Agent Host

// Program.cs - Production Agent Host
using Azure.AI.Projects;
using Azure.Identity;
using Azure.Monitor.OpenTelemetry.AspNetCore;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using System.Text.Json;

var builder = WebApplication.CreateBuilder(args);

// Configure Azure Monitor for distributed tracing
builder.Services.AddOpenTelemetry()
    .UseAzureMonitor(options =>
    {
        options.ConnectionString = builder.Configuration["ApplicationInsights:ConnectionString"];
    });

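// Register AI Project client
builder.Services.AddSingleton(sp =>
{
    var connectionString = builder.Configuration["AzureAI:ProjectConnectionString"]
        ?? throw new InvalidOperationException("AzureAI:ProjectConnectionString is not configured.");
    return new AIProjectClient(connectionString, new DefaultAzureCredential());
});

// Register our agent orchestrator
builder.Services.AddSingleton<MultiAgentOrchestrator>();
builder.Services.AddHostedService<AgentInitializationService>();

// Health checks (AgentHealthCheck and OpenAIHealthCheck are IHealthCheck implementations not shown here).
// The "ready" tag is what the /health/ready predicate filters on; without it,
// that endpoint would run no checks and always report healthy.
builder.Services.AddHealthChecks()
    .AddCheck<AgentHealthCheck>("agents", tags: new[] { "ready" })
    .AddCheck<OpenAIHealthCheck>("openai", tags: new[] { "ready" });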

var app = builder.Build();

// Health endpoints for Kubernetes/Container Apps
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false // Just checks if app is running
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready")
});

// Agent API endpoints
app.MapPost("/api/orchestrate", async (
    OrchestrationRequest request,
    MultiAgentOrchestrator orchestrator,
    CancellationToken ct) =>
{
    var result = await orchestrator.ExecuteAsync(request.Task, request.Context, ct);
    return Results.Ok(result);
});

app.MapGet("/api/agents", (MultiAgentOrchestrator orchestrator) =>
{
    return Results.Ok(orchestrator.GetAgentStatus());
});

app.Run();

// Request model
public record OrchestrationRequest(string Task, Dictionary<string, object>? Context = null);

Agent Initialization Service

// Services/AgentInitializationService.cs
using Azure.AI.Projects;

public class AgentInitializationService : IHostedService
{
    private readonly AIProjectClient _client;
    private readonly MultiAgentOrchestrator _orchestrator;
    private readonly ILogger<AgentInitializationService> _logger;
    private readonly IConfiguration _configuration;

    public AgentInitializationService(
        AIProjectClient client,
        MultiAgentOrchestrator orchestrator,
        ILogger<AgentInitializationService> logger,
        IConfiguration configuration)
    {
        _client = client;
        _orchestrator = orchestrator;
        _logger = logger;
        _configuration = configuration;
    }

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        _logger.LogInformation("Initializing production agents...");

        var agentsClient = _client.GetAgentsClient();

        // Load agent configurations from settings
        var agentConfigs = _configuration
            .GetSection("Agents")
            .Get<List<AgentConfiguration>>() ?? new();

        foreach (var config in agentConfigs)
        {
            try
            {
                var agent = await GetOrCreateAgentAsync(agentsClient, config, cancellationToken);
                _orchestrator.RegisterAgent(config.Role, agent);
                _logger.LogInformation("Registered agent: {Role} ({Id})", config.Role, agent.Id);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Failed to initialize agent: {Role}", config.Role);
                throw; // Fail fast in production
            }
        }

        _logger.LogInformation("All {Count} agents initialized successfully", agentConfigs.Count);
    }

    private async Task<Agent> GetOrCreateAgentAsync(
        AgentsClient client, 
        AgentConfiguration config,
        CancellationToken ct)
    {
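        // Try to find existing agent by name (idempotent deployments).
        // Note: the return shape of GetAgentsAsync differs across preview SDK versions;
        // adjust the enumeration if your version returns a paged list instead.
        await foreach (var agent in client.GetAgentsAsync(cancellationToken: ct))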
        {
            if (agent.Name == config.Name)
            {
                _logger.LogInformation("Found existing agent: {Name}", config.Name);

                // Update if instructions changed
                if (agent.Instructions != config.Instructions)
                {
                    return await client.UpdateAgentAsync(
                        agent.Id,
                        instructions: config.Instructions,
                        cancellationToken: ct);
                }
                return agent;
            }
        }

        // Create new agent
        _logger.LogInformation("Creating new agent: {Name}", config.Name);
        return await client.CreateAgentAsync(
            model: config.Model,
            name: config.Name,
            instructions: config.Instructions,
            tools: config.GetToolDefinitions(),
            cancellationToken: ct);
    }

    public Task StopAsync(CancellationToken cancellationToken)
    {
        _logger.LogInformation("Agent host shutting down");
        return Task.CompletedTask;
    }
}

// Configuration model
public class AgentConfiguration
{
    public string Name { get; set; } = "";
    public string Role { get; set; } = "";
    public string Model { get; set; } = "gpt-4o";
    public string Instructions { get; set; } = "";
    public List<string> Tools { get; set; } = new();

    public IEnumerable<ToolDefinition> GetToolDefinitions()
    {
        foreach (var tool in Tools)
        {
            yield return tool switch
            {
                "code_interpreter" => new CodeInterpreterToolDefinition(),
                "file_search" => new FileSearchToolDefinition(),
                _ => throw new ArgumentException($"Unknown tool: {tool}")
            };
        }
    }
}
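
Because `GetOrCreateAgentAsync` matches existing agents by name and the orchestrator registers them by role, a duplicate name or role in configuration would silently collide. Here is a minimal sketch of a startup validation pass that catches this early. The `AgentConfigValidator` class and its `AgentEntry` record are hypothetical stand-ins for the configuration model above, not part of the SDK:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical validator; AgentEntry mirrors the fields of AgentConfiguration.
public static class AgentConfigValidator
{
    // Tool names that GetToolDefinitions() in the article knows how to map.
    private static readonly HashSet<string> KnownTools =
        new(StringComparer.Ordinal) { "code_interpreter", "file_search" };

    public record AgentEntry(
        string Name, string Role, string Model, string Instructions, IReadOnlyList<string> Tools);

    // Returns a list of human-readable problems; an empty list means the config is usable.
    public static IReadOnlyList<string> Validate(IEnumerable<AgentEntry> agents)
    {
        var list = agents.ToList();
        var errors = new List<string>();
        if (list.Count == 0) errors.Add("No agents configured.");

        foreach (var agent in list)
        {
            var label = string.IsNullOrWhiteSpace(agent.Name) ? "(unnamed)" : agent.Name;
            if (string.IsNullOrWhiteSpace(agent.Name)) errors.Add("Agent name is required.");
            if (string.IsNullOrWhiteSpace(agent.Role)) errors.Add($"{label}: role is required.");
            if (string.IsNullOrWhiteSpace(agent.Model)) errors.Add($"{label}: model is required.");
            if (string.IsNullOrWhiteSpace(agent.Instructions)) errors.Add($"{label}: instructions are required.");

            foreach (var tool in agent.Tools.Where(t => !KnownTools.Contains(t)))
                errors.Add($"{label}: unknown tool '{tool}'.");
        }

        // Names and roles are used as lookup keys, so each must be unique.
        foreach (var group in list.Where(a => !string.IsNullOrWhiteSpace(a.Name))
                                  .GroupBy(a => a.Name, StringComparer.Ordinal)
                                  .Where(g => g.Count() > 1))
            errors.Add($"Duplicate agent name '{group.Key}'.");

        foreach (var group in list.Where(a => !string.IsNullOrWhiteSpace(a.Role))
                                  .GroupBy(a => a.Role, StringComparer.Ordinal)
                                  .Where(g => g.Count() > 1))
            errors.Add($"Duplicate agent role '{group.Key}'.");

        return errors;
    }
}
```

Running a check like this before the first call to the service keeps the fail-fast behavior, but reports every configuration problem at once instead of stopping at the first failed agent.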

Production Configuration

// appsettings.Production.json
{
  "AzureAI": {
    "ProjectConnectionString": "Set via environment variable"
  },
  "ApplicationInsights": {
    "ConnectionString": "Set via environment variable"
  },
  "Agents": [
    {
      "Name": "coordinator-agent-prod",
      "Role": "Coordinator",
      "Model": "gpt-4o",
      "Instructions": "You are the coordinator agent. Analyze incoming requests and delegate to specialist agents. Always validate outputs before returning.",
      "Tools": []
    },
    {
      "Name": "research-agent-prod",
      "Role": "Researcher",
      "Model": "gpt-4o",
      "Instructions": "You are a research specialist. Gather and synthesize information accurately. Always cite sources.",
      "Tools": ["file_search"]
    },
    {
      "Name": "analyst-agent-prod",
      "Role": "Analyst",
      "Model": "gpt-4o",
      "Instructions": "You are a data analyst. Process data, generate insights, and create visualizations when helpful.",
      "Tools": ["code_interpreter"]
    }
  ],
  "Orchestration": {
    "MaxConcurrentTasks": 10,
    "TaskTimeoutSeconds": 300,
    "RetryPolicy": {
      "MaxRetries": 3,
      "DelaySeconds": 2
    }
  }
}
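
The `Orchestration` section defines a timeout and a retry policy, but the host code above never shows how they are applied. One possible reading, as a sketch: `TaskTimeoutSeconds` bounds the whole task including retries, and `DelaySeconds` is the base of an exponential backoff. Both interpretations are assumptions, as are the `RetryPolicyOptions` and `OrchestrationRetry` names:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical options type mirroring the "RetryPolicy" section.
public sealed record RetryPolicyOptions(int MaxRetries = 3, int DelaySeconds = 2);

public static class OrchestrationRetry
{
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        RetryPolicyOptions policy,
        TimeSpan taskTimeout,
        CancellationToken ct = default)
    {
        // One timeout for the whole task, linked to the caller's token.
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        timeoutCts.CancelAfter(taskTimeout);

        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await operation(timeoutCts.Token);
            }
            // Cancellation and timeouts are never retried; other failures are
            // retried until MaxRetries is used up, then the exception propagates.
            catch (Exception ex) when (ex is not OperationCanceledException && attempt < policy.MaxRetries)
            {
                // Assumed backoff: DelaySeconds, then 2x, 4x, ...
                var wait = TimeSpan.FromSeconds(policy.DelaySeconds * Math.Pow(2, attempt));
                await Task.Delay(wait, timeoutCts.Token);
            }
        }
    }
}
```

With the values above, a task that keeps failing would be attempted four times (one initial call plus three retries) with waits of 2, 4, and 8 seconds, unless the 300-second timeout fires first.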

Scaling Considerations

Multi-agent systems have unique scaling challenges. Here's how to handle them:

Horizontal Scaling with Azure Container Apps

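// container-app.bicep
// Fragment: location, imageTag, containerAppEnvironment, keyVault, and containerRegistry
// are assumed to be declared elsewhere in the template.
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: 'aiagent-host'
  location: location
  // A system-assigned identity is required for the Key Vault secret reference below
  // (identity: 'system'); it also needs permission to read secrets from the vault.
  identity: {
    type: 'SystemAssigned'
  }
  properties: {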
    environmentId: containerAppEnvironment.id
    configuration: {
      ingress: {
        external: true
        targetPort: 8080
        transport: 'http'
      }
      secrets: [
        {
          name: 'ai-connection-string'
          keyVaultUrl: '${keyVault.properties.vaultUri}secrets/ai-connection-string'
          identity: 'system'
        }
      ]
    }
    template: {
      containers: [
        {
          name: 'agent-host'
          image: '${containerRegistry.properties.loginServer}/agent-host:${imageTag}'
          resources: {
            cpu: json('1.0')
            memory: '2Gi'
          }
          env: [
            {
              name: 'AzureAI__ProjectConnectionString'
              secretRef: 'ai-connection-string'
            }
          ]
          probes: [
            {
              type: 'Liveness'
              httpGet: {
                path: '/health/live'
                port: 8080
              }
              periodSeconds: 10
            }
            {
              type: 'Readiness'
              httpGet: {
                path: '/health/ready'
                port: 8080
              }
              periodSeconds: 5
            }
          ]
        }
      ]
      scale: {
        minReplicas: 2
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '20'
              }
            }
          }
          {
            name: 'cpu-scaling'
            custom: {
              type: 'cpu'
              metadata: {
                type: 'Utilization'
                value: '70'
              }
            }
          }
        ]
      }
    }
  }
}

Rate Limiting & Token Management

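// Services/TokenBudgetManager.cs
// Note: this budget is tracked in process memory, so each replica enforces its own
// limit. With N replicas, the combined usage can reach N times TokensPerMinute.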
public class TokenBudgetManager
{
    private readonly SemaphoreSlim _semaphore;
    private readonly ILogger<TokenBudgetManager> _logger;
    private int _tokensUsedThisMinute;
    private DateTime _windowStart = DateTime.UtcNow;
    private readonly int _tokensPerMinute;

    public TokenBudgetManager(IConfiguration config, ILogger<TokenBudgetManager> logger)
    {
        _tokensPerMinute = config.GetValue<int>("RateLimits:TokensPerMinute", 30000);
        _semaphore = new SemaphoreSlim(1, 1);
        _logger = logger;
    }

    public async Task<bool> TryAcquireAsync(int estimatedTokens, CancellationToken ct)
    {
        await _semaphore.WaitAsync(ct);
        try
        {
            ResetWindowIfNeeded();

            if (_tokensUsedThisMinute + estimatedTokens > _tokensPerMinute)
            {
                _logger.LogWarning(
                    "Token budget exceeded. Used: {Used}, Requested: {Requested}, Limit: {Limit}",
                    _tokensUsedThisMinute, estimatedTokens, _tokensPerMinute);
                return false;
            }

            _tokensUsedThisMinute += estimatedTokens;
            return true;
        }
        finally
        {
            _semaphore.Release();
        }
    }

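    public void RecordActualUsage(int actualTokens, int estimated)
    {
        // Adjust under the same lock as TryAcquireAsync; mixing Interlocked here with
        // plain reads and writes elsewhere would not be thread-safe.
        _semaphore.Wait();
        try
        {
            var difference = actualTokens - estimated;
            // Clamp at zero in case the window was reset after the reservation was made.
            _tokensUsedThisMinute = Math.Max(0, _tokensUsedThisMinute + difference);
        }
        finally
        {
            _semaphore.Release();
        }
    }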

    private void ResetWindowIfNeeded()
    {
        if (DateTime.UtcNow - _windowStart > TimeSpan.FromMinutes(1))
        {
            _tokensUsedThisMinute = 0;
            _windowStart = DateTime.UtcNow;
        }
    }
}
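
The manager only answers "is there budget right now?", so callers still need to estimate tokens, wait when the budget is exhausted, and report actual usage afterwards. A minimal sketch of that calling pattern follows. The `BudgetedCall` helper is hypothetical, it takes the manager's two methods as delegates so it stays self-contained, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper that wraps a model call with a token budget.
public static class BudgetedCall
{
    // Rough heuristic: about 4 characters per token for English text,
    // plus the number of completion tokens the caller expects.
    public static int EstimateTokens(string prompt, int expectedCompletionTokens) =>
        (prompt.Length + 3) / 4 + expectedCompletionTokens;

    public static async Task<T> RunAsync<T>(
        Func<int, CancellationToken, Task<bool>> tryAcquire,
        Action<int, int> recordActualUsage,
        int estimatedTokens,
        Func<CancellationToken, Task<(T Result, int ActualTokens)>> call,
        TimeSpan pollInterval,
        CancellationToken ct)
    {
        // Poll until the current window has room. If the estimate exceeds the
        // whole per-minute limit, this only ends when ct is cancelled.
        while (!await tryAcquire(estimatedTokens, ct))
        {
            await Task.Delay(pollInterval, ct);
        }

        // If the call throws, the reservation is left in place, which errs on
        // the side of under-using the quota.
        var (result, actualTokens) = await call(ct);
        recordActualUsage(actualTokens, estimatedTokens);
        return result;
    }
}
```

With the class above, the first two arguments would be `manager.TryAcquireAsync` and `manager.RecordActualUsage`.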

CI/CD Pipeline for Agent Deployments

Here's a production-ready GitHub Actions pipeline:

# .github/workflows/deploy-agents.yml
name: Deploy Multi-Agent System

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  AZURE_RESOURCE_GROUP: rg-aiagents-prod
  CONTAINER_REGISTRY: myagentscr.azurecr.io
  IMAGE_NAME: agent-host

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '8.0.x'

      - name: Restore dependencies
        run: dotnet restore

      - name: Build
        run: dotnet build --no-restore

      - name: Run unit tests
        run: dotnet test --no-build --verbosity normal --filter "Category!=Integration"

      - name: Run integration tests
        run: dotnet test --filter Category=Integration --verbosity normal
        env:
          AZURE_AI_CONNECTION_STRING: ${{ secrets.AZURE_AI_CONNECTION_STRING_TEST }}

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}

    steps:
      - uses: actions/checkout@v4

      - name: Azure Login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Login to ACR
        run: az acr login --name myagentscr

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}
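          # type=raw defaults to a higher priority than type=sha, so without an explicit
          # priority the "version" output (used as image-tag above) would be "latest".
          tags: |
            type=sha,prefix=,priority=300
            type=raw,value=latest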

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  deploy-staging:
    needs: build-and-push
    runs-on: ubuntu-latest
    environment: staging

    steps:
      - uses: actions/checkout@v4

      - name: Azure Login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Deploy to staging
        run: |
          az containerapp update \
            --name aiagent-host-staging \
            --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
            --image ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build-and-push.outputs.image-tag }}

      - name: Run smoke tests
        run: |
          STAGING_URL=$(az containerapp show --name aiagent-host-staging --resource-group ${{ env.AZURE_RESOURCE_GROUP }} --query properties.configuration.ingress.fqdn -o tsv)
          curl -f https://$STAGING_URL/health/ready || exit 1
          curl -f -X POST https://$STAGING_URL/api/orchestrate \
            -H "Content-Type: application/json" \
            -d '{"task": "smoke test: return OK"}' || exit 1

  deploy-production:
    needs: [build-and-push, deploy-staging]
    runs-on: ubuntu-latest
    environment: production

    steps:
      - uses: actions/checkout@v4

      - name: Azure Login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Deploy to production
        run: |
          az containerapp update \
            --name aiagent-host \
            --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
            --image ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build-and-push.outputs.image-tag }}

      - name: Verify deployment
        run: |
          PROD_URL=$(az containerapp show --name aiagent-host --resource-group ${{ env.AZURE_RESOURCE_GROUP }} --query properties.configuration.ingress.fqdn -o tsv)
          # Wait for healthy status
          for i in {1..30}; do
            if curl -sf https://$PROD_URL/health/ready; then
              echo "Deployment successful!"
              exit 0
            fi
            sleep 10
          done
          echo "Deployment verification failed"
          exit 1

Monitoring Deployed Agents

Visibility into your agent system is crucial. Here's a comprehensive monitoring setup:

Custom Telemetry

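// Services/AgentTelemetry.cs
// TelemetryClient comes from the Application Insights SDK (Microsoft.ApplicationInsights)
// and must be registered in DI, e.g. via AddApplicationInsightsTelemetry(). It is
// separate from the OpenTelemetry setup in Program.cs.
using System.Diagnostics;
using Microsoft.ApplicationInsights;

public class AgentTelemetry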
{
    private readonly ILogger<AgentTelemetry> _logger;
    private readonly TelemetryClient _telemetry;

    public AgentTelemetry(ILogger<AgentTelemetry> logger, TelemetryClient telemetry)
    {
        _logger = logger;
        _telemetry = telemetry;
    }

    public IDisposable TrackAgentOperation(string agentName, string operation)
    {
        var operationId = Activity.Current?.Id ?? Guid.NewGuid().ToString();

        return new AgentOperationScope(
            _telemetry,
            agentName,
            operation,
            operationId,
            _logger);
    }

    public void TrackAgentMessage(string agentName, string role, int tokenCount)
    {
        _telemetry.TrackEvent("AgentMessage", new Dictionary<string, string>
        {
            ["AgentName"] = agentName,
            ["MessageRole"] = role
        }, new Dictionary<string, double>
        {
            ["TokenCount"] = tokenCount
        });
    }

    public void TrackToolExecution(string agentName, string toolName, bool success, TimeSpan duration)
    {
        _telemetry.TrackDependency(
            "AgentTool",
            toolName,
            agentName,
            DateTimeOffset.UtcNow - duration,
            duration,
            success);
    }

    public void TrackOrchestrationComplete(
        string taskType, 
        int agentsInvolved, 
        TimeSpan totalDuration,
        bool success)
    {
        _telemetry.TrackEvent("OrchestrationComplete", new Dictionary<string, string>
        {
            ["TaskType"] = taskType,
            ["Success"] = success.ToString()
        }, new Dictionary<string, double>
        {
            ["AgentsInvolved"] = agentsInvolved,
            ["DurationMs"] = totalDuration.TotalMilliseconds
        });

        _telemetry.GetMetric("orchestration.duration").TrackValue(totalDuration.TotalMilliseconds);
        _telemetry.GetMetric("orchestration.agents_per_task").TrackValue(agentsInvolved);
    }

    private class AgentOperationScope : IDisposable
    {
        private readonly TelemetryClient _telemetry;
        private readonly string _agentName;
        private readonly string _operation;
        private readonly Stopwatch _stopwatch;
        private readonly ILogger _logger;
        private bool _disposed;

        public AgentOperationScope(
            TelemetryClient telemetry, 
            string agentName, 
            string operation,
            string operationId,
            ILogger logger)
        {
            _telemetry = telemetry;
            _agentName = agentName;
            _operation = operation;
            _logger = logger;
            _stopwatch = Stopwatch.StartNew();

            _logger.LogInformation(
                "Starting agent operation: {Agent}.{Operation} [{OperationId}]",
                agentName, operation, operationId);
        }

        public void Dispose()
        {
            if (_disposed) return;
            _disposed = true;

            _stopwatch.Stop();

            _telemetry.TrackMetric($"agent.{_agentName}.{_operation}.duration", 
                _stopwatch.ElapsedMilliseconds);

            _logger.LogInformation(
                "Completed agent operation: {Agent}.{Operation} in {Duration}ms",
                _agentName, _operation, _stopwatch.ElapsedMilliseconds);
        }
    }
}
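
The class above relies on `TelemetryClient`, while `Program.cs` configures OpenTelemetry through `UseAzureMonitor`. If you would rather stay on the OpenTelemetry path, agent operations can also be recorded as spans with `System.Diagnostics.ActivitySource`. This is an alternative sketch, not the approach used in the rest of this series; the `AgentTracing` class, source name, and tag names are all hypothetical:

```csharp
using System;
using System.Diagnostics;

// Hypothetical tracing helper built on the .NET ActivitySource API.
public static class AgentTracing
{
    // Assumed source name; the tracer must be told to listen to it.
    public const string SourceName = "MultiAgent.Orchestration";

    private static readonly ActivitySource Source = new(SourceName);

    // Returns null when nothing is listening to the source, so callers
    // should use null-conditional access on the result.
    public static Activity? StartAgentOperation(string agentName, string operation)
    {
        var activity = Source.StartActivity($"{agentName}.{operation}", ActivityKind.Internal);
        activity?.SetTag("agent.name", agentName);
        activity?.SetTag("agent.operation", operation);
        return activity;
    }

    // Attaches token counts to the span so they can be queried alongside duration.
    public static void RecordTokens(Activity? activity, int promptTokens, int completionTokens)
    {
        activity?.SetTag("agent.tokens.prompt", promptTokens);
        activity?.SetTag("agent.tokens.completion", completionTokens);
    }
}
```

A call site would wrap each agent run in `using var activity = AgentTracing.StartAgentOperation("research-agent-prod", "run");`. For the spans to be exported, the source name also has to be registered with the tracer, for example with `AddSource(AgentTracing.SourceName)` when configuring OpenTelemetry tracing.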

Azure Monitor Dashboard (ARM Template)

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "appInsightsId": {
      "type": "string"
    }
  },
  "resources": [
    {
      "type": "Microsoft.Portal/dashboards",
      "apiVersion": "2020-09-01-preview",
      "name": "agent-system-dashboard",
      "location": "[resourceGroup().location]",
      "properties": {
        "lenses": [
          {
            "parts": [
              {
                "position": { "x": 0, "y": 0, "colSpan": 6, "rowSpan": 4 },
                "metadata": {
                  "type": "Extension/Microsoft_Azure_Monitoring/PartType/MetricsChartPart",
                  "settings": {
                    "title": "Agent Response Times",
                    "chartType": "Line",
                    "metrics": [
                      {
                        "resourceId": "[parameters('appInsightsId')]",
                        "name": "customMetrics/orchestration.duration",
                        "aggregationType": "Average"
                      }
                    ]
                  }
                }
              },
              {
                "position": { "x": 6, "y": 0, "colSpan": 6, "rowSpan": 4 },
                "metadata": {
                  "type": "Extension/Microsoft_Azure_Monitoring/PartType/MetricsChartPart",
                  "settings": {
                    "title": "Token Usage by Agent",
                    "chartType": "Bar"
                  }
                }
              }
            ]
          }
        ]
      }
    }
  ]
}

Key Metrics to Monitor

Metric	Alert Threshold	Action
Agent Response Time (p95)	> 30s	Scale out or optimize prompts
Token Usage Rate	> 80% of TPM	Upgrade quota or add throttling
Error Rate	> 5%	Investigate logs, check model health
Concurrent Tasks	> 80% capacity	Scale out
Tool Execution Failures	> 10%	Check tool configurations
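
To make the first three thresholds concrete, here is a small sketch that evaluates them against a batch of samples. It assumes the nearest-rank definition of p95 and hypothetical input shapes; in practice these rules would more likely live in Azure Monitor alert rules than in application code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical evaluator for the response time, token usage, and error rate rows.
public static class AgentAlerts
{
    // Nearest-rank percentile: the smallest sample with at least p% of values at or below it.
    public static double Percentile(IReadOnlyList<double> samples, double p)
    {
        if (samples.Count == 0) throw new ArgumentException("No samples.", nameof(samples));
        var sorted = samples.OrderBy(x => x).ToArray();
        var rank = (int)Math.Ceiling(p * sorted.Length / 100.0);
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }

    public static IReadOnlyList<string> Evaluate(
        IReadOnlyList<double> responseSeconds,
        int tokensUsedPerMinute,
        int tokensPerMinuteQuota,
        int failedRequests,
        int totalRequests)
    {
        var alerts = new List<string>();

        if (Percentile(responseSeconds, 95) > 30)
            alerts.Add("Response time p95 above 30s: scale out or optimize prompts.");

        if (tokensUsedPerMinute > 0.8 * tokensPerMinuteQuota)
            alerts.Add("Token usage above 80% of TPM: upgrade quota or add throttling.");

        if (totalRequests > 0 && (double)failedRequests / totalRequests > 0.05)
            alerts.Add("Error rate above 5%: investigate logs and check model health.");

        return alerts;
    }
}
```

Using p95 rather than the average matters here: a handful of slow multi-agent runs can breach the 30-second threshold while the mean still looks healthy.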

Series Wrap-Up

Congratulations! You've made it through the complete Azure AI Agent Service journey. Let's recap what we've covered:

Series Summary

Part	Topic	Key Takeaways
1	Foundations	Understanding agents, Azure AI Agent Service architecture
2	First Agent	Creating agents, conversations, and basic tool usage
3	Advanced Patterns	Stateful agents, complex tools, error handling
4	Multi-Agent Systems	Orchestration patterns, agent communication
5	Production Deployment	Infrastructure, CI/CD, scaling, monitoring

What's Next?

The agent ecosystem is evolving rapidly. Here's where to focus next:

Explore Semantic Kernel - Microsoft's SDK for AI orchestration complements Azure AI Agent Service beautifully for complex workflows.
Experiment with Specialized Models - As new models emerge, consider specialized agents for different tasks (fast models for routing, powerful models for complex reasoning).
Implement Evaluation Pipelines - Build automated evaluations for agent responses using frameworks like Prompt Flow evaluations.
Consider Hybrid Architectures - Combine agents with traditional APIs and workflows for the best of both worlds.
Stay Current - Follow the Azure AI documentation and Azure updates for new capabilities.

Resources

GitHub Repository: Azure AI Agent Service Samples
Documentation: Azure AI Foundry Docs
Community: Azure AI Discord
SDK Reference: Azure.AI.Projects NuGet

Final Thoughts

Building production multi-agent systems is a journey that combines software engineering fundamentals with the emerging patterns of AI development. The key principles remain the same: observability, reliability, and iterative improvement.

Start small, measure everything, and evolve your system based on real-world feedback. The infrastructure and patterns we've covered give you a solid foundation—now it's time to build something amazing.

Thank you for joining me on this series. I'd love to hear about what you're building with Azure AI Agent Service. Drop a comment below.

Happy building! 🚀

This is Part 5 of 5 in the Azure AI Agent Service series. Check out Part 1, Part 2, Part 3, and Part 4 if you haven't already.
