Single agents hit their limits on complex workflows. Microsoft Agent Framework gives you sequential, concurrent, handoff, and group chat orchestration patterns for multi-agent systems. Here's how to build them with Terraform provisioning the infrastructure.
In the previous posts, we deployed a single Azure AI agent with function calling. That handles focused tasks. But real business processes span multiple domains: a customer onboarding flow needs compliance checking, document processing, account creation, and welcome messaging. One agent with all those responsibilities performs poorly.
Microsoft Agent Framework is the open-source SDK (successor to Semantic Kernel and AutoGen) that provides orchestration patterns for multi-agent systems. You define specialized agents, then compose them into workflows - sequential pipelines, concurrent execution, handoff chains, and group chat patterns. Terraform provisions the Azure infrastructure; Python defines the agent team.
Three Agent Types in Foundry
Before diving into orchestration, understand the three agent types Foundry Agent Service supports:
| Agent Type | Definition | Orchestration | Best For |
|---|---|---|---|
| Prompt agents | Configuration only (portal/API) | Managed by Foundry | Rapid prototyping, simple tasks |
| Workflow agents | Visual/YAML definitions | Sequential, branching, group chat | Multi-step business processes |
| Hosted agents | Your code in a container | Full custom control | Complex multi-agent systems |
This post focuses on building multi-agent systems with Agent Framework's orchestration patterns, deployable as hosted agents or run locally.
Four Orchestration Patterns
Pattern 1: Sequential Pipeline
Agents run one after another. The output of one feeds the next:
from agent_framework.azure import AzureOpenAIChatClient
from agent_framework.orchestrations import SequentialBuilder
from azure.identity import AzureCliCredential

client = AzureOpenAIChatClient(credential=AzureCliCredential())

researcher = client.as_agent(
    name="researcher",
    instructions="Research the given topic thoroughly. Provide detailed findings.",
)
writer = client.as_agent(
    name="writer",
    instructions="Based on the research provided, write a clear blog post draft.",
)
editor = client.as_agent(
    name="editor",
    instructions="Review the draft for clarity, grammar, and tone. Output the final version.",
)

# Build sequential pipeline: researcher -> writer -> editor
pipeline = (
    SequentialBuilder()
    .add(researcher)
    .add(writer)
    .add(editor)
    .build()
)

# Run inside an async function (e.g. via asyncio.run)
result = await pipeline.invoke("Write about the impact of AI on healthcare")
print(result)
Each agent receives the previous agent's output as context. The pipeline produces a final result after all agents have processed.
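To make the data flow concrete, here is a framework-free sketch of what a sequential pipeline does under the hood. Plain async functions stand in for LLM-backed agents; none of this is Agent Framework API, just the pattern itself:

```python
import asyncio

# Stand-ins for LLM calls: each "agent" is just an async function.
async def researcher(text: str) -> str:
    return f"findings on: {text}"

async def writer(text: str) -> str:
    return f"draft based on ({text})"

async def editor(text: str) -> str:
    return f"final: {text}"

async def run_sequential(task, agents):
    # Feed each agent's output into the next, in order.
    result = task
    for agent in agents:
        result = await agent(result)
    return result

result = asyncio.run(run_sequential("AI in healthcare", [researcher, writer, editor]))
print(result)
# final: draft based on (findings on: AI in healthcare)
```

The builder in the snippet above encapsulates exactly this loop: ordered invocation with each output becoming the next input.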
Pattern 2: Concurrent Execution
Independent tasks run simultaneously for speed:
from agent_framework.orchestrations import ConcurrentBuilder

energy_analyst = client.as_agent(
    name="energy_analyst",
    instructions="Analyze renewable energy market trends.",
)
ev_analyst = client.as_agent(
    name="ev_analyst",
    instructions="Analyze electric vehicle adoption trends.",
)

# Run both analysts concurrently
concurrent = (
    ConcurrentBuilder()
    .add(energy_analyst)
    .add(ev_analyst)
    .build()
)

results = await concurrent.invoke("Analyze 2025 sustainability trends")
# results contains outputs from both analysts
Use concurrent execution when tasks are independent. Both agents run at the same time, reducing total latency.
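The fan-out itself is the same idea as asyncio.gather. A framework-free sketch (async functions standing in for agents, not Agent Framework API):

```python
import asyncio

# The same task goes to every "agent" at once; results come back as a list.
async def energy_analyst(task: str) -> str:
    await asyncio.sleep(0.01)  # simulate model latency
    return f"energy view on {task}"

async def ev_analyst(task: str) -> str:
    await asyncio.sleep(0.01)
    return f"ev view on {task}"

async def run_concurrent(task, agents):
    # asyncio.gather starts all agent calls together and preserves order.
    return await asyncio.gather(*(agent(task) for agent in agents))

results = asyncio.run(run_concurrent("2025 trends", [energy_analyst, ev_analyst]))
print(results)  # ['energy view on 2025 trends', 'ev view on 2025 trends']
```

Because both simulated calls sleep concurrently, total latency is roughly one call's latency, not the sum.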
Pattern 3: Handoff Chain
One agent transfers control to another based on the conversation:
from agent_framework.orchestrations import HandoffBuilder

order_agent = client.as_agent(
    name="order_agent",
    instructions="""You handle order lookups and cancellations.
    If the customer asks about payments or refunds,
    transfer to payments_agent.""",
)
payments_agent = client.as_agent(
    name="payments_agent",
    instructions="""You handle refunds and billing inquiries.
    If the customer asks about order status,
    transfer to order_agent.""",
)

# Build handoff chain
handoff = (
    HandoffBuilder()
    .add(order_agent)
    .add(payments_agent)
    .build()
)

result = await handoff.invoke(
    "I need to cancel order #1234 and get a refund"
)
The handoff pattern is ideal for customer support where the conversation naturally moves between domains. Each agent decides when to transfer based on its instructions.
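Under the hood, a handoff is a routing loop: the active agent either answers or names a successor. A framework-free sketch (keyword routing stands in for the LLM's transfer decision; none of these names are Agent Framework API):

```python
import asyncio

# Each "agent" returns ("answer", text) or ("transfer", next_agent_name).
async def order_agent(msg: str):
    if "refund" in msg or "payment" in msg:
        return ("transfer", "payments_agent")
    return ("answer", "order handled")

async def payments_agent(msg: str):
    return ("answer", "refund issued")

AGENTS = {"order_agent": order_agent, "payments_agent": payments_agent}

async def run_handoff(msg, start="order_agent", max_hops=5):
    current = start
    for _ in range(max_hops):  # cap hops so two agents can't ping-pong forever
        kind, value = await AGENTS[current](msg)
        if kind == "answer":
            return value
        current = value  # transfer control to the named agent
    raise RuntimeError("too many handoffs")

answer = asyncio.run(run_handoff("I need to cancel order #1234 and get a refund"))
print(answer)  # refund issued
```

The hop cap is worth copying into any real implementation: two agents with overlapping instructions can otherwise transfer to each other forever.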
Pattern 4: Group Chat
Multiple agents collaborate on a shared conversation:
from agent_framework.orchestrations import GroupChatBuilder

analyst = client.as_agent(
    name="analyst",
    instructions="You analyze data and provide insights. Contribute your analysis to the discussion.",
)
strategist = client.as_agent(
    name="strategist",
    instructions="You develop strategies based on analysis. Build on the analyst's insights.",
)
critic = client.as_agent(
    name="critic",
    instructions="You challenge assumptions and identify risks. Push back on weak reasoning.",
)

# Build group chat
group = (
    GroupChatBuilder()
    .add(analyst)
    .add(strategist)
    .add(critic)
    .set_max_rounds(3)
    .build()
)

result = await group.invoke("Should we expand into the European market?")
Group chat creates a collaborative discussion where agents build on each other's contributions. max_rounds limits the conversation length. The result is the synthesized output after all rounds complete.
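In its simplest round-robin form, group chat is a bounded loop over a shared transcript. A framework-free sketch (canned replies stand in for LLM turns; real managers may also pick speakers dynamically):

```python
import asyncio

# Stand-ins for LLM turns; each agent receives the full transcript so far.
async def analyst(history):
    return "analysis: market data looks strong"

async def strategist(history):
    return "strategy: enter via partnership"

async def critic(history):
    return "risk: regulatory exposure in the EU"

async def run_group_chat(task, agents, max_rounds=3):
    transcript = [task]
    for _ in range(max_rounds):      # max_rounds bounds the discussion
        for agent in agents:         # round-robin: each agent takes a turn
            transcript.append(await agent(transcript))
    return transcript

transcript = asyncio.run(
    run_group_chat("Expand into Europe?", [analyst, strategist, critic], max_rounds=2)
)
print(len(transcript))  # 7 = task + 2 rounds x 3 agents
```

The transcript growth makes the token cost explicit: every round appends one message per agent, and every agent re-reads the whole history.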
Terraform: Infrastructure for Multi-Agent Systems
Terraform provisions the Foundry resource, model deployments, and supporting infrastructure. The multi-agent logic lives in Python:
# agents/deployments.tf

# Powerful model for the supervisor/orchestrator
resource "azurerm_cognitive_deployment" "orchestrator_model" {
  name                 = "${var.environment}-orchestrator"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id

  sku {
    name     = var.orchestrator_model.sku
    capacity = var.orchestrator_model.tpm
  }

  model {
    format  = "OpenAI"
    name    = var.orchestrator_model.name
    version = var.orchestrator_model.version
  }
}

# Cost-effective model for specialist agents
resource "azurerm_cognitive_deployment" "specialist_model" {
  name                 = "${var.environment}-specialist"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id

  sku {
    name     = var.specialist_model.sku
    capacity = var.specialist_model.tpm
  }

  model {
    format  = "OpenAI"
    name    = var.specialist_model.name
    version = var.specialist_model.version
  }
}
Environment Configuration
# environments/prod.tfvars
orchestrator_model = {
  name    = "gpt-4o"
  version = "2024-11-20"
  sku     = "GlobalStandard"
  tpm     = 80
}

specialist_model = {
  name    = "gpt-4o-mini"
  version = "2024-07-18"
  sku     = "GlobalStandard"
  tpm     = 120
}
Use the powerful model for orchestration decisions and complex reasoning. Use the cheaper model for specialist agents that execute focused tasks. The cost difference adds up fast when multiple agents process every request.
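To see how fast the difference compounds, here is a back-of-the-envelope comparison. The per-million-token prices below are illustrative assumptions, not current Azure OpenAI pricing; check the official price sheet for real numbers:

```python
# Illustrative USD prices per 1M input tokens (assumptions, not real pricing).
PRICES = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}

def pipeline_cost(tokens_per_agent, models):
    # Each agent in the pipeline consumes tokens independently, so costs add.
    return sum(tokens_per_agent * PRICES[m] / 1_000_000 for m in models)

# 4-agent pipeline, ~5k input tokens per agent per request
all_big = pipeline_cost(5_000, ["gpt-4o"] * 4)
tiered  = pipeline_cost(5_000, ["gpt-4o"] + ["gpt-4o-mini"] * 3)
print(f"all gpt-4o: ${all_big:.4f}/request  tiered: ${tiered:.4f}/request")
```

With these assumed prices, tiering cuts the per-request cost by roughly two thirds, and the gap scales linearly with request volume.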
Combining Patterns
Real systems nest patterns. A customer support pipeline might use sequential processing with concurrent data gathering inside:
# Concurrent: gather order + payment data simultaneously
data_gathering = (
    ConcurrentBuilder()
    .add(order_lookup_agent)
    .add(payment_lookup_agent)
    .build()
)

# Sequential: gather data, then draft response, then review
full_pipeline = (
    SequentialBuilder()
    .add_orchestration(data_gathering)
    .add(response_drafter)
    .add(quality_reviewer)
    .build()
)

result = await full_pipeline.invoke("Customer asking about order #1234 refund status")
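The nesting works because a concurrent stage is just another awaitable step from the sequential pipeline's point of view. A framework-free sketch of the same shape (async functions standing in for the agents above; not Agent Framework API):

```python
import asyncio

# Stand-ins for the lookup, drafting, and review agents.
async def order_lookup(task):   return f"order data for {task}"
async def payment_lookup(task): return f"payment data for {task}"
async def drafter(ctx):         return f"draft using [{ctx}]"
async def reviewer(draft):      return f"approved: {draft}"

async def concurrent_stage(task):
    # Fan out to both lookups, then join their outputs into one context.
    parts = await asyncio.gather(order_lookup(task), payment_lookup(task))
    return "; ".join(parts)

async def full_pipeline(task):
    # Sequential: concurrent gathering -> draft -> review.
    result = task
    for step in (concurrent_stage, drafter, reviewer):
        result = await step(result)
    return result

final = asyncio.run(full_pipeline("order #1234"))
print(final)
```

The concurrent stage collapses to a single value before the next sequential step, which is exactly what lets the two patterns compose.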
Workflow Agents in Foundry Portal
For teams that prefer visual orchestration over code, Foundry Agent Service supports workflow agents defined through the portal or YAML. These are declarative definitions that support:
- Sequential and branching execution paths
- Human-in-the-loop approval steps
- Agent-to-agent coordination with context sharing
- Built-in error handling and retries
Workflow agents are managed by Foundry with no custom hosting needed. Use them when the orchestration logic is straightforward and you want zero infrastructure management.
⚠️ Gotchas and Tips
Agent instructions drive routing quality. In handoff patterns, agents decide when to transfer based on their instructions. Vague instructions lead to agents trying to handle everything themselves. Be explicit about boundaries: "If the customer asks about X, transfer to Y_agent."
Cost multiplier. Every agent in a pipeline or group chat consumes tokens independently. A 4-agent sequential pipeline uses roughly 4x the tokens of a single agent. Use gpt-4o-mini for specialist agents to keep costs manageable.
Group chat rounds. Always set max_rounds in group chat patterns. Without it, agents can loop indefinitely, debating each other and burning tokens.
State management. Agent Framework provides session-based state management for long-running workflows. For workflows that span minutes or hours, leverage the built-in state persistence rather than building your own.
Streaming support. All orchestration patterns support streaming. For user-facing applications, stream the final agent's output while the pipeline processes internally.
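The shape of that streaming consumption is an async-generator loop. A framework-free sketch (the generator stands in for a streaming LLM call; not Agent Framework API):

```python
import asyncio

# Stand-in for a streaming LLM call: yields tokens as they "arrive".
async def final_agent_stream(context):
    for token in context.split():
        await asyncio.sleep(0)  # simulate tokens arriving over time
        yield token + " "

async def run_streaming(context):
    chunks = []
    async for chunk in final_agent_stream(context):
        chunks.append(chunk)    # in a real app: forward each chunk to the UI
    return "".join(chunks)

out = asyncio.run(run_streaming("final answer streamed to the user"))
print(out)
```

The earlier pipeline stages still run to completion internally; only the last agent's output is forwarded chunk by chunk, so the user sees text as soon as the final stage starts producing it.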
What's Next
This is Post 3 of the Azure AI Agents with Terraform series.
- Post 1: Deploy First Azure AI Agent
- Post 2: Function Calling - Connect to APIs
- Post 3: Multi-Agent Orchestration (you are here)
- Post 4: Agent + Bing Grounding
Your single agent is now a team. Sequential pipelines, concurrent fan-out, handoff chains, and group chat debates - Agent Framework gives you the patterns. Terraform provisions the infrastructure. Python defines the workflow.
Found this helpful? Follow for the full AI Agents with Terraform series!