Single agents hit their limits on complex workflows. Microsoft Agent Framework gives you sequential, concurrent, handoff, and group chat orchestration patterns for multi-agent systems. Here's how to build them with Terraform provisioning the infrastructure.
In the previous posts, we deployed a single Azure AI agent with function calling. That handles focused tasks. But real business processes span multiple domains: a customer onboarding flow needs compliance checking, document processing, account creation, and welcome messaging. One agent with all those responsibilities performs poorly.
Microsoft Agent Framework is the open-source SDK (successor to Semantic Kernel and AutoGen) that provides orchestration patterns for multi-agent systems. You define specialized agents, then compose them into workflows - sequential pipelines, concurrent execution, handoff chains, and group chat patterns. Terraform provisions the Azure infrastructure; Python defines the agent team.
Three Agent Types in Foundry
Before diving into orchestration, understand the three agent types Foundry Agent Service supports:
| Agent Type | Definition | Orchestration | Best For |
|---|---|---|---|
| Prompt agents | Configuration only (portal/API) | Managed by Foundry | Rapid prototyping, simple tasks |
| Workflow agents | Visual/YAML definitions | Sequential, branching, group chat | Multi-step business processes |
| Hosted agents | Your code in a container | Full custom control | Complex multi-agent systems |
This post focuses on building multi-agent systems with Agent Framework's orchestration patterns, deployable as hosted agents or run locally.
Four Orchestration Patterns
Pattern 1: Sequential Pipeline
Agents run one after another. The output of one feeds the next:
from agent_framework.azure import AzureOpenAIChatClient
from agent_framework.orchestrations import SequentialBuilder
from azure.identity import AzureCliCredential

client = AzureOpenAIChatClient(credential=AzureCliCredential())

researcher = client.as_agent(
    name="researcher",
    instructions="Research the given topic thoroughly. Provide detailed findings.",
)
writer = client.as_agent(
    name="writer",
    instructions="Based on the research provided, write a clear blog post draft.",
)
editor = client.as_agent(
    name="editor",
    instructions="Review the draft for clarity, grammar, and tone. Output the final version.",
)

# Build sequential pipeline: researcher -> writer -> editor
pipeline = (
    SequentialBuilder()
    .add(researcher)
    .add(writer)
    .add(editor)
    .build()
)

# Run inside an async function (e.g. via asyncio.run)
result = await pipeline.invoke("Write about the impact of AI on healthcare")
print(result)
Each agent receives the previous agent's output as context. The pipeline produces a final result after all agents have processed.
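To make the data flow concrete, here is a framework-free sketch of what a sequential pipeline does under the hood. Plain async functions stand in for LLM-backed agents; none of this is Agent Framework API, just the pattern itself:

```python
import asyncio

# Stand-ins for LLM calls: each "agent" is just an async function.
async def researcher(text: str) -> str:
    return f"findings on: {text}"

async def writer(text: str) -> str:
    return f"draft based on ({text})"

async def editor(text: str) -> str:
    return f"final: {text}"

async def run_sequential(task, agents):
    # Feed each agent's output into the next, in order.
    result = task
    for agent in agents:
        result = await agent(result)
    return result

result = asyncio.run(run_sequential("AI in healthcare", [researcher, writer, editor]))
print(result)
# final: draft based on (findings on: AI in healthcare)
```

The builder in the snippet above encapsulates exactly this loop: ordered invocation with each output becoming the next input.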
Pattern 2: Concurrent Execution
Independent tasks run simultaneously for speed:
from agent_framework.orchestrations import ConcurrentBuilder

energy_analyst = client.as_agent(
    name="energy_analyst",
    instructions="Analyze renewable energy market trends.",
)
ev_analyst = client.as_agent(
    name="ev_analyst",
    instructions="Analyze electric vehicle adoption trends.",
)

# Run both analysts concurrently
concurrent = (
    ConcurrentBuilder()
    .add(energy_analyst)
    .add(ev_analyst)
    .build()
)

results = await concurrent.invoke("Analyze 2025 sustainability trends")
# results contains outputs from both analysts
Use concurrent execution when tasks are independent. Both agents run at the same time, reducing total latency.
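The fan-out itself is the same idea as asyncio.gather. A framework-free sketch (async functions standing in for agents, not Agent Framework API):

```python
import asyncio

# The same task goes to every "agent" at once; results come back as a list.
async def energy_analyst(task: str) -> str:
    await asyncio.sleep(0.01)  # simulate model latency
    return f"energy view on {task}"

async def ev_analyst(task: str) -> str:
    await asyncio.sleep(0.01)
    return f"ev view on {task}"

async def run_concurrent(task, agents):
    # asyncio.gather starts all agent calls together and preserves order.
    return await asyncio.gather(*(agent(task) for agent in agents))

results = asyncio.run(run_concurrent("2025 trends", [energy_analyst, ev_analyst]))
print(results)  # ['energy view on 2025 trends', 'ev view on 2025 trends']
```

Because both simulated calls sleep concurrently, total latency is roughly one call's latency, not the sum.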
Pattern 3: Handoff Chain
One agent transfers control to another based on the conversation:
from agent_framework.orchestrations import HandoffBuilder

order_agent = client.as_agent(
    name="order_agent",
    instructions="""You handle order lookups and cancellations.
    If the customer asks about payments or refunds,
    transfer to payments_agent.""",
)
payments_agent = client.as_agent(
    name="payments_agent",
    instructions="""You handle refunds and billing inquiries.
    If the customer asks about order status,
    transfer to order_agent.""",
)

# Build handoff chain
handoff = (
    HandoffBuilder()
    .add(order_agent)
    .add(payments_agent)
    .build()
)

result = await handoff.invoke(
    "I need to cancel order #1234 and get a refund"
)
The handoff pattern is ideal for customer support where the conversation naturally moves between domains. Each agent decides when to transfer based on its instructions.
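Under the hood, a handoff is a routing loop: the active agent either answers or names a successor. A framework-free sketch (keyword routing stands in for the LLM's transfer decision; none of these names are Agent Framework API):

```python
import asyncio

# Each "agent" returns ("answer", text) or ("transfer", next_agent_name).
async def order_agent(msg: str):
    if "refund" in msg or "payment" in msg:
        return ("transfer", "payments_agent")
    return ("answer", "order handled")

async def payments_agent(msg: str):
    return ("answer", "refund issued")

AGENTS = {"order_agent": order_agent, "payments_agent": payments_agent}

async def run_handoff(msg, start="order_agent", max_hops=5):
    current = start
    for _ in range(max_hops):  # cap hops so two agents can't ping-pong forever
        kind, value = await AGENTS[current](msg)
        if kind == "answer":
            return value
        current = value  # transfer control to the named agent
    raise RuntimeError("too many handoffs")

answer = asyncio.run(run_handoff("I need to cancel order #1234 and get a refund"))
print(answer)  # refund issued
```

The hop cap is worth copying into any real implementation: two agents with overlapping instructions can otherwise transfer to each other forever.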
Pattern 4: Group Chat
Multiple agents collaborate on a shared conversation:
from agent_framework.orchestrations import GroupChatBuilder

analyst = client.as_agent(
    name="analyst",
    instructions="You analyze data and provide insights. Contribute your analysis to the discussion.",
)
strategist = client.as_agent(
    name="strategist",
    instructions="You develop strategies based on analysis. Build on the analyst's insights.",
)
critic = client.as_agent(
    name="critic",
    instructions="You challenge assumptions and identify risks. Push back on weak reasoning.",
)

# Build group chat
group = (
    GroupChatBuilder()
    .add(analyst)
    .add(strategist)
    .add(critic)
    .set_max_rounds(3)
    .build()
)

result = await group.invoke("Should we expand into the European market?")
Group chat creates a collaborative discussion where agents build on each other's contributions. max_rounds limits the conversation length. The result is the synthesized output after all rounds complete.
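In its simplest round-robin form, group chat is a bounded loop over a shared transcript. A framework-free sketch (canned replies stand in for LLM turns; real managers may also pick speakers dynamically):

```python
import asyncio

# Stand-ins for LLM turns; each agent receives the full transcript so far.
async def analyst(history):
    return "analysis: market data looks strong"

async def strategist(history):
    return "strategy: enter via partnership"

async def critic(history):
    return "risk: regulatory exposure in the EU"

async def run_group_chat(task, agents, max_rounds=3):
    transcript = [task]
    for _ in range(max_rounds):      # max_rounds bounds the discussion
        for agent in agents:         # round-robin: each agent takes a turn
            transcript.append(await agent(transcript))
    return transcript

transcript = asyncio.run(
    run_group_chat("Expand into Europe?", [analyst, strategist, critic], max_rounds=2)
)
print(len(transcript))  # 7 = task + 2 rounds x 3 agents
```

The transcript growth makes the token cost explicit: every round appends one message per agent, and every agent re-reads the whole history.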
Terraform: Infrastructure for Multi-Agent Systems
Terraform provisions the Foundry resource, model deployments, and supporting infrastructure. The multi-agent logic lives in Python:
# agents/deployments.tf

# Powerful model for the supervisor/orchestrator
resource "azurerm_cognitive_deployment" "orchestrator_model" {
  name                 = "${var.environment}-orchestrator"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id

  sku {
    name     = var.orchestrator_model.sku
    capacity = var.orchestrator_model.tpm
  }

  model {
    format  = "OpenAI"
    name    = var.orchestrator_model.name
    version = var.orchestrator_model.version
  }
}

# Cost-effective model for specialist agents
resource "azurerm_cognitive_deployment" "specialist_model" {
  name                 = "${var.environment}-specialist"
  cognitive_account_id = azurerm_cognitive_account.ai_foundry.id

  sku {
    name     = var.specialist_model.sku
    capacity = var.specialist_model.tpm
  }

  model {
    format  = "OpenAI"
    name    = var.specialist_model.name
    version = var.specialist_model.version
  }
}
Environment Configuration
# environments/prod.tfvars
orchestrator_model = {
  name    = "gpt-4o"
  version = "2024-11-20"
  sku     = "GlobalStandard"
  tpm     = 80
}

specialist_model = {
  name    = "gpt-4o-mini"
  version = "2024-07-18"
  sku     = "GlobalStandard"
  tpm     = 120
}
Use the powerful model for orchestration decisions and complex reasoning. Use the cheaper model for specialist agents that execute focused tasks. The cost difference adds up fast when multiple agents process every request.
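To see how fast the difference compounds, here is a back-of-the-envelope comparison. The per-million-token prices below are illustrative assumptions, not current Azure OpenAI pricing; check the official price sheet for real numbers:

```python
# Illustrative USD prices per 1M input tokens (assumptions, not real pricing).
PRICES = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}

def pipeline_cost(tokens_per_agent, models):
    # Each agent in the pipeline consumes tokens independently, so costs add.
    return sum(tokens_per_agent * PRICES[m] / 1_000_000 for m in models)

# 4-agent pipeline, ~5k input tokens per agent per request
all_big = pipeline_cost(5_000, ["gpt-4o"] * 4)
tiered  = pipeline_cost(5_000, ["gpt-4o"] + ["gpt-4o-mini"] * 3)
print(f"all gpt-4o: ${all_big:.4f}/request  tiered: ${tiered:.4f}/request")
```

With these assumed prices, tiering cuts the per-request cost by roughly two thirds, and the gap scales linearly with request volume.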
Combining Patterns
Real systems nest patterns. A customer support pipeline might use sequential processing with concurrent data gathering inside:
# Concurrent: gather order + payment data simultaneously
data_gathering = (
    ConcurrentBuilder()
    .add(order_lookup_agent)
    .add(payment_lookup_agent)
    .build()
)

# Sequential: gather data, then draft response, then review
full_pipeline = (
    SequentialBuilder()
    .add_orchestration(data_gathering)
    .add(response_drafter)
    .add(quality_reviewer)
    .build()
)

result = await full_pipeline.invoke("Customer asking about order #1234 refund status")
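The nesting works because a concurrent stage is just another awaitable step from the sequential pipeline's point of view. A framework-free sketch of the same shape (async functions standing in for the agents above; not Agent Framework API):

```python
import asyncio

# Stand-ins for the lookup, drafting, and review agents.
async def order_lookup(task):   return f"order data for {task}"
async def payment_lookup(task): return f"payment data for {task}"
async def drafter(ctx):         return f"draft using [{ctx}]"
async def reviewer(draft):      return f"approved: {draft}"

async def concurrent_stage(task):
    # Fan out to both lookups, then join their outputs into one context.
    parts = await asyncio.gather(order_lookup(task), payment_lookup(task))
    return "; ".join(parts)

async def full_pipeline(task):
    # Sequential: concurrent gathering -> draft -> review.
    result = task
    for step in (concurrent_stage, drafter, reviewer):
        result = await step(result)
    return result

final = asyncio.run(full_pipeline("order #1234"))
print(final)
```

The concurrent stage collapses to a single value before the next sequential step, which is exactly what lets the two patterns compose.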
Workflow Agents in Foundry Portal
For teams that prefer visual orchestration over code, Foundry Agent Service supports workflow agents defined through the portal or YAML. These are declarative definitions that support:
- Sequential and branching execution paths
- Human-in-the-loop approval steps
- Agent-to-agent coordination with context sharing
- Built-in error handling and retries
Workflow agents are managed by Foundry with no custom hosting needed. Use them when the orchestration logic is straightforward and you want zero infrastructure management.
⚠️ Gotchas and Tips
Agent instructions drive routing quality. In handoff patterns, agents decide when to transfer based on their instructions. Vague instructions lead to agents trying to handle everything themselves. Be explicit about boundaries: "If the customer asks about X, transfer to Y_agent."
Cost multiplier. Every agent in a pipeline or group chat consumes tokens independently. A 4-agent sequential pipeline uses roughly 4x the tokens of a single agent. Use gpt-4o-mini for specialist agents to keep costs manageable.
Group chat rounds. Always set max_rounds in group chat patterns. Without it, agents can loop indefinitely, debating each other and burning tokens.
State management. Agent Framework provides session-based state management for long-running workflows. For workflows that span minutes or hours, leverage the built-in state persistence rather than building your own.
Streaming support. All orchestration patterns support streaming. For user-facing applications, stream the final agent's output while the pipeline processes internally.
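The shape of that streaming consumption is an async-generator loop. A framework-free sketch (the generator stands in for a streaming LLM call; not Agent Framework API):

```python
import asyncio

# Stand-in for a streaming LLM call: yields tokens as they "arrive".
async def final_agent_stream(context):
    for token in context.split():
        await asyncio.sleep(0)  # simulate tokens arriving over time
        yield token + " "

async def run_streaming(context):
    chunks = []
    async for chunk in final_agent_stream(context):
        chunks.append(chunk)    # in a real app: forward each chunk to the UI
    return "".join(chunks)

out = asyncio.run(run_streaming("final answer streamed to the user"))
print(out)
```

The earlier pipeline stages still run to completion internally; only the last agent's output is forwarded chunk by chunk, so the user sees text as soon as the final stage starts producing it.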
What's Next
This is Post 3 of the Azure AI Agents with Terraform series.
- Post 1: Deploy First Azure AI Agent
- Post 2: Function Calling - Connect to APIs
- Post 3: Multi-Agent Orchestration (you are here)
- Post 4: Agent + Bing Grounding
Your single agent is now a team. Sequential pipelines, concurrent fan-out, handoff chains, and group chat debates - Agent Framework gives you the patterns. Terraform provisions the infrastructure. Python defines the workflow.
Found this helpful? Follow for the full AI Agents with Terraform series!