Brian Spann

Posted on Feb 23

Agent Framework Workflows: Beyond Chat — Orchestrating Complex AI Tasks

#azure #csharp #dotnet #ai

Introduction

In Part 1 of this series, we explored how Microsoft Agent Framework unifies Semantic Kernel and AutoGen into a cohesive SDK. We built simple agents, added tools, and managed conversations.

But real-world AI applications often require more than a single agent responding to queries. You need:

Multi-step processes with explicit ordering
Multiple agents collaborating on different aspects of a task
Conditional branching based on intermediate results
Human approval at critical decision points
Durability so long-running tasks survive failures

This is where Workflows come in.

When to Use Workflows vs. Single Agents

Before diving in, let's clarify when workflows make sense:

Scenario	Recommendation
Simple Q&A, chat interfaces	Single agent
Content generation with review cycles	Workflow
Data processing pipelines	Workflow
Tasks requiring human approval	Workflow
Complex research with multiple perspectives	Workflow
Long-running processes (hours/days)	Workflow with checkpointing

The rule of thumb: if your task has explicit steps that should happen in a defined order, or if multiple agents need to collaborate, use a workflow.

Workflow Fundamentals

A workflow in Agent Framework consists of:

Steps: Individual units of work
Transitions: How steps connect to each other
Context: Shared state that flows through the workflow
Agents: The AI agents that execute steps
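To make these four pieces concrete, here is a minimal sketch that models them with plain .NET types only. It is not the framework's API: the dictionaries, the step names, and the runner loop are illustrative stand-ins, and the "agents" are replaced by string formatting.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Context: shared state handed to every step.
var context = new Dictionary<string, object> { ["topic"] = "durable workflows" };

// Steps: named units of work. A real step would call an agent here.
var steps = new Dictionary<string, Func<Dictionary<string, object>, Task>>
{
    ["research"] = ctx => { ctx["notes"] = $"notes on {ctx["topic"]}"; return Task.CompletedTask; },
    ["write"] = ctx => { ctx["draft"] = $"draft from {ctx["notes"]}"; return Task.CompletedTask; },
};

// Transitions: which step follows which (linear in this sketch).
var transitions = new Dictionary<string, string> { ["research"] = "write" };

// Runner: start at the first step and follow transitions until none remain.
string? current = "research";
var executed = new List<string>();
while (current is not null)
{
    await steps[current](context);
    executed.Add(current);
    current = transitions.TryGetValue(current, out var next) ? next : null;
}

Console.WriteLine(string.Join(" -> ", executed)); // research -> write
Console.WriteLine(context["draft"]);              // draft from notes on durable workflows
```

Everything a workflow engine adds on top (branching, parallelism, checkpoints) is a refinement of this loop.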

Your First Workflow

Let's build a simple content creation workflow:

using Microsoft.Agents.AI;
using Microsoft.Agents.AI.Workflows;

// Create agents
var researcher = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Researcher",
    Instructions = """
        You are a research specialist. Given a topic, you:
        1. Identify key aspects to cover
        2. Find relevant facts and statistics
        3. Note any controversies or debates
        4. Summarize your findings in a structured format
        """
});

var writer = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Writer",
    Instructions = """
        You are a content writer. Given research notes, you:
        1. Create an engaging narrative
        2. Use clear, accessible language
        3. Include relevant examples
        4. Structure with headers and bullet points
        """
});

var editor = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Editor",
    Instructions = """
        You are an editor. Review content for:
        1. Factual accuracy
        2. Grammar and style
        3. Clarity and flow
        4. Engagement
        Provide specific, actionable feedback.
        """
});

// Build the workflow
var workflow = new WorkflowBuilder("content-pipeline")
    .AddStep("research", async ctx =>
    {
        var topic = ctx.GetInput<string>("topic");
        var result = await researcher.InvokeAsync(
            $"Research this topic thoroughly: {topic}");
        ctx.Set("research_notes", result.Content);
    })
    .AddStep("write", async ctx =>
    {
        var notes = ctx.Get<string>("research_notes");
        var result = await writer.InvokeAsync(
            $"Write an article based on these research notes:\n\n{notes}");
        ctx.Set("draft", result.Content);
    })
    .AddStep("edit", async ctx =>
    {
        var draft = ctx.Get<string>("draft");
        var result = await editor.InvokeAsync(
            $"Review and improve this article:\n\n{draft}");
        ctx.Set("final_content", result.Content);
    })
    .Connect("research", "write")
    .Connect("write", "edit")
    .Build();

// Run the workflow
var context = new WorkflowContext();
context.SetInput("topic", "The impact of AI on software development in 2026");

await workflow.RunAsync(context);

Console.WriteLine(context.Get<string>("final_content"));

Understanding Workflow Context

The WorkflowContext is the shared state container that flows through your workflow:

// Setting values
context.Set("key", value);           // Any serializable type
context.SetInput("inputKey", value); // Specifically for inputs

// Getting values
var value = context.Get<T>("key");
var input = context.GetInput<T>("inputKey");

// Check existence
if (context.TryGet<T>("key", out var result)) { ... }

// Metadata
context.Metadata["executionId"] = Guid.NewGuid();
context.Metadata["startedAt"] = DateTime.UtcNow;

Conditional Branching

Real workflows aren't always linear. Let's add quality checks and revision loops:

var workflow = new WorkflowBuilder("content-with-review")
    .AddStep("research", async ctx => { /* ... */ })
    .AddStep("write", async ctx => { /* ... */ })
    .AddStep("review", async ctx =>
    {
        var draft = ctx.Get<string>("draft");
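        // A multi-line raw string must start on the line after the opening quotes.
        // $$""" makes {{...}} the interpolation hole, so the JSON braces stay literal.
        var result = await editor.InvokeAsync(
            $$"""
            Review this article and respond with a JSON object:
            {
                "quality": "approved" | "needs_revision",
                "feedback": "your detailed feedback",
                "score": 1-10
            }

            Article:
            {{draft}}
            """);

        // JsonSerializer is in System.Text.Json. The prompt asks for lowercase keys,
        // so match property names case-insensitively and fail loudly on an empty review.
        var review = JsonSerializer.Deserialize<ReviewResult>(
            result.Content,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true })
            ?? throw new InvalidOperationException("Editor returned an empty review.");
        ctx.Set("review", review);
        ctx.Set("quality", review.Quality);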
    })
    .AddConditionalStep("quality_gate", ctx =>
    {
        var quality = ctx.Get<string>("quality");
        var revisions = ctx.GetOrDefault("revision_count", 0);

        // Cap the loop here: the review step overwrites "quality" on every pass,
        // so forcing approval inside "revise" would be undone by the next review.
        return quality == "approved" || revisions >= 3 ? "publish" : "revise";
    })
    .AddStep("revise", async ctx =>
    {
        var draft = ctx.Get<string>("draft");
        var review = ctx.Get<ReviewResult>("review");
        var revisionCount = ctx.GetOrDefault("revision_count", 0);

        // Multi-line raw strings must open and close on their own lines.
        var result = await writer.InvokeAsync(
            $"""
            Revise this article based on the feedback:

            Current draft:
            {draft}

            Feedback:
            {review.Feedback}

            Make specific improvements addressing each point.
            """);

        ctx.Set("draft", result.Content);
        ctx.Set("revision_count", revisionCount + 1);
    })
    .AddStep("publish", async ctx =>
    {
        var content = ctx.Get<string>("draft");
        // Publish logic here
        ctx.Set("published", true);
        ctx.Set("published_at", DateTime.UtcNow);
    })
    // Connections
    .Connect("research", "write")
    .Connect("write", "review")
    .Connect("review", "quality_gate")
    .Connect("quality_gate", "publish", when: "publish")
    .Connect("quality_gate", "revise", when: "revise")
    .Connect("revise", "review")  // Loop back for re-review
    .Build();

This creates a revision loop:

research → write → review → quality_gate
                              ↓         ↓
                           publish    revise
                                        ↓
                                      review (loop)
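Stripped of the workflow machinery, the loop in the diagram is a bounded retry: keep revising until the reviewer approves or a revision cap is reached. The sketch below assumes nothing from the framework; `approves` and `revise` are hypothetical stand-ins for the editor and writer agents.

```csharp
using System;

// Hypothetical stand-ins for the editor and writer agents.
Func<string, bool> approves = candidate => candidate.Contains("v3");
Func<int, string> revise = attempt => $"draft v{attempt + 1}";

const int maxRevisions = 3;
var draft = "draft v1";
var revisions = 0;

// review -> quality_gate -> (publish | revise -> review ...)
while (!approves(draft) && revisions < maxRevisions)
{
    revisions++;
    draft = revise(revisions);
}

Console.WriteLine($"{draft} after {revisions} revision(s)"); // draft v3 after 2 revision(s)
```

The cap belongs in the loop condition (the gate), because that is the only place that decides whether another pass happens.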

Parallel Execution

Some steps can run concurrently. Agent Framework makes this explicit:

var workflow = new WorkflowBuilder("parallel-research")
    .AddStep("init", ctx =>
    {
        ctx.Set("topic", ctx.GetInput<string>("topic"));
        return Task.CompletedTask;
    })
    // These three run in parallel
    .AddParallelSteps("gather",
        ("technical", async ctx =>
        {
            var topic = ctx.Get<string>("topic");
            var result = await technicalResearcher.InvokeAsync(
                $"Research technical aspects of: {topic}");
            ctx.Set("technical_notes", result.Content);
        }),
        ("market", async ctx =>
        {
            var topic = ctx.Get<string>("topic");
            var result = await marketResearcher.InvokeAsync(
                $"Research market trends for: {topic}");
            ctx.Set("market_notes", result.Content);
        }),
        ("competition", async ctx =>
        {
            var topic = ctx.Get<string>("topic");
            var result = await competitionAnalyst.InvokeAsync(
                $"Analyze competitors in: {topic}");
            ctx.Set("competition_notes", result.Content);
        })
    )
    // This waits for all parallel steps to complete
    .AddStep("synthesize", async ctx =>
    {
        var technical = ctx.Get<string>("technical_notes");
        var market = ctx.Get<string>("market_notes");
        var competition = ctx.Get<string>("competition_notes");

        // Multi-line raw strings must open and close on their own lines.
        var result = await synthesizer.InvokeAsync(
            $"""
            Create a comprehensive report combining these perspectives:

            Technical Analysis:
            {technical}

            Market Research:
            {market}

            Competitive Analysis:
            {competition}
            """);

        ctx.Set("report", result.Content);
    })
    .Connect("init", "gather")
    .Connect("gather", "synthesize")
    .Build();
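The fan-out/fan-in shape above maps onto `Task.WhenAll` in plain .NET. The following sketch is an illustration under assumptions, not the framework's implementation: `ResearchAsync` is a hypothetical stand-in for an agent call, and a `ConcurrentDictionary` plays the role of the context because the branches write concurrently (the article does not say whether `WorkflowContext` is thread-safe).

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Parallel branches write at the same time, so the shared state must be thread-safe.
var context = new ConcurrentDictionary<string, string>();
context["topic"] = "edge AI";

// Hypothetical stand-in for an agent call.
async Task ResearchAsync(string angle, int delayMs)
{
    await Task.Delay(delayMs);
    context[$"{angle}_notes"] = $"{angle} notes on {context["topic"]}";
}

// Fan out: start all three branches before awaiting any of them.
var branches = new[]
{
    ResearchAsync("technical", 30),
    ResearchAsync("market", 10),
    ResearchAsync("competition", 20),
};

// Fan in: continue only once every branch has finished.
await Task.WhenAll(branches);

var report = string.Join("; ", new[] { "technical", "market", "competition" }
    .Select(name => context[$"{name}_notes"]));
Console.WriteLine(report);
```

Each branch writes to its own key, which keeps the merge step trivial and avoids ordering problems between branches.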

Parallel with Different Completion Strategies

// Wait for all (default)
.AddParallelSteps("all-required", 
    ParallelCompletion.All, 
    steps...);

// First one wins
.AddParallelSteps("race", 
    ParallelCompletion.First, 
    steps...);

// Majority must complete
.AddParallelSteps("majority", 
    ParallelCompletion.Majority, 
    steps...);

// At least N must complete
.AddParallelSteps("quorum", 
    ParallelCompletion.AtLeast(2), 
    steps...);
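All four strategies can be read as a quorum over the branch results. As a rough sketch using only `Task.WhenAny` from the base class library (the helper name `WhenAtLeast` and the `Branch` stand-in are made up for this example and are not part of the framework):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Completes once `required` of the tasks have finished successfully.
async Task<List<T>> WhenAtLeast<T>(IEnumerable<Task<T>> tasks, int required)
{
    var pending = tasks.ToList();
    var results = new List<T>();
    while (results.Count < required && pending.Count > 0)
    {
        var finished = await Task.WhenAny(pending);
        pending.Remove(finished);
        if (finished.IsCompletedSuccessfully)
            results.Add(finished.Result);
    }
    if (results.Count < required)
        throw new InvalidOperationException(
            $"Only {results.Count} of {required} required branches succeeded.");
    return results;
}

// Hypothetical branch: returns its name after a delay.
async Task<string> Branch(string name, int delayMs)
{
    await Task.Delay(delayMs);
    return name;
}

// "First" is a quorum of 1, "All" a quorum of N, "Majority" a quorum of N / 2 + 1.
var first = await WhenAtLeast(new[] { Branch("fast", 10), Branch("slow", 300) }, 1);
var quorum = await WhenAtLeast(new[] { Branch("a", 10), Branch("b", 20), Branch("c", 300) }, 2);

Console.WriteLine(first[0]);     // fast
Console.WriteLine(quorum.Count); // 2
```

Note that this sketch does not cancel the branches that lose the race. With real agent calls you would likely pass a `CancellationToken` to each branch so that unused work stops consuming tokens.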

Human-in-the-Loop

Critical workflows often need human oversight:

var workflow = new WorkflowBuilder("human-approval")
    .AddStep("generate", async ctx => { /* ... */ })
    .AddHumanStep("approval", new HumanStepOptions
    {
        Prompt = ctx => $"Please review this content:\n\n{ctx.Get<string>("draft")}",
        Timeout = TimeSpan.FromHours(24),
        OnTimeout = HumanStepTimeoutBehavior.Reject,
        AllowedResponses = new[] { "approve", "reject", "revise" },
        // Optional: notify via webhook, email, etc.
        NotificationHandler = async (stepId, ctx) =>
        {
            await emailService.SendAsync(
                to: "reviewer@company.com",
                subject: "Content awaiting approval",
                body: ctx.Get<string>("draft"));
        }
    })
    .AddConditionalStep("route", ctx =>
    {
        return ctx.Get<HumanResponse>("approval").Decision;
    })
    // The "publish", "archive", and "revise" steps are omitted here for brevity
    .Connect("generate", "approval")
    .Connect("approval", "route")
    .Connect("route", "publish", when: "approve")
    .Connect("route", "archive", when: "reject")
    .Connect("route", "revise", when: "revise")
    .Build();

Responding to Human Steps

When a workflow is waiting for human input:

// Get pending human steps
var pending = await workflowRunner.GetPendingHumanStepsAsync();

foreach (var step in pending)
{
    Console.WriteLine($"Workflow: {step.WorkflowId}");
    Console.WriteLine($"Step: {step.StepId}");
    Console.WriteLine($"Prompt: {step.Prompt}");
    Console.WriteLine($"Waiting since: {step.CreatedAt}");
}

// Submit a response
await workflowRunner.SubmitHumanResponseAsync(
    workflowInstanceId: "abc123",
    stepId: "approval",
    response: new HumanResponse
    {
        Decision = "approve",
        Comment = "Looks good! Minor typo on line 3, but acceptable.",
        RespondedBy = "jane@company.com",
        RespondedAt = DateTime.UtcNow
    });
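Conceptually, a human step is a pending task that someone outside the process completes, raced against a timeout. The sketch below shows that mechanism with `TaskCompletionSource` and `Task.WhenAny`; it is an assumption about how such a step could work, not a description of the framework's internals, and `WaitForDecisionAsync` is a name invented for this example.

```csharp
using System;
using System.Threading.Tasks;

// Waits for a human decision, falling back to "reject" when the timeout elapses first.
async Task<string> WaitForDecisionAsync(Task<string> response, TimeSpan timeout)
{
    var winner = await Task.WhenAny(response, Task.Delay(timeout));
    return winner == response ? await response : "reject";
}

// Case 1: the reviewer answers in time.
var answered = new TaskCompletionSource<string>();
var waiting = WaitForDecisionAsync(answered.Task, TimeSpan.FromSeconds(5));
answered.SetResult("approve"); // in a real system this comes from a UI or an API call
var decision = await waiting;
Console.WriteLine(decision); // approve

// Case 2: nobody answers before the timeout.
var ignored = new TaskCompletionSource<string>();
var timedOut = await WaitForDecisionAsync(ignored.Task, TimeSpan.FromMilliseconds(50));
Console.WriteLine(timedOut); // reject
```

An in-memory `TaskCompletionSource` does not survive a restart, which is why a 24-hour approval window only makes sense together with the checkpointing described in the next section.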

Checkpointing and Durability

Long-running workflows need to survive failures. Checkpointing saves the workflow state after each step:

// Configure checkpoint storage
var checkpointStore = new AzureBlobCheckpointStore(
    connectionString: config["Storage:ConnectionString"],
    containerName: "workflow-checkpoints");

var runner = new WorkflowRunner(workflow)
{
    CheckpointStore = checkpointStore,
    CheckpointFrequency = CheckpointFrequency.AfterEachStep,
    OnError = WorkflowErrorBehavior.PauseAndCheckpoint
};

// Start a workflow
var instanceId = await runner.StartAsync(context);
Console.WriteLine($"Started workflow: {instanceId}");

// The workflow runs... then your server crashes...
// Later, after restart:

// Resume any incomplete workflows
var incomplete = await runner.GetIncompleteWorkflowsAsync();
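// The loop variable cannot be named "workflow": that name is already taken by the
// workflow passed to the runner above, and C# does not allow shadowing a local.
foreach (var instance in incomplete)
{
    Console.WriteLine($"Resuming {instance.InstanceId} from step {instance.LastCompletedStep}");
    await runner.ResumeAsync(instance.InstanceId);
}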
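The core idea behind "checkpoint after each step, resume after a crash" fits in a few lines. This sketch assumes a JSON file as the checkpoint store and a linear list of steps; the `Run` helper, the file name, and the simulated crash are all illustrative and unrelated to the store classes shown above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

var checkpointPath = Path.Combine(Path.GetTempPath(), $"checkpoint-{Guid.NewGuid():N}.json");
var stepOrder = new[] { "research", "write", "edit" };
var runLog = new List<string>();

// Runs steps in order, skipping any that the checkpoint records as complete.
// `failAt` simulates a crash just before the named step.
void Run(string? failAt)
{
    var state = File.Exists(checkpointPath)
        ? JsonSerializer.Deserialize<Dictionary<string, string>>(File.ReadAllText(checkpointPath))!
        : new Dictionary<string, string>();

    foreach (var step in stepOrder)
    {
        if (state.ContainsKey(step)) continue; // finished before the crash
        if (step == failAt) throw new IOException($"simulated crash in {step}");

        state[step] = $"{step} output"; // the step's real work goes here
        runLog.Add(step);
        File.WriteAllText(checkpointPath, JsonSerializer.Serialize(state)); // checkpoint
    }
}

try { Run(failAt: "write"); } catch (IOException ex) { Console.WriteLine(ex.Message); }
Run(failAt: null); // "restart": resumes at write, research is not repeated

Console.WriteLine(string.Join(", ", runLog)); // research, write, edit
File.Delete(checkpointPath);
```

Because state is saved only after a step completes, a step that crashes midway runs again on resume, so steps with side effects need to be idempotent.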

Checkpoint Store Options

// Azure Blob Storage
var store = new AzureBlobCheckpointStore(connectionString, container);

// Azure Table Storage (good for many small workflows)
var store = new AzureTableCheckpointStore(connectionString, tableName);

// SQL Server
var store = new SqlCheckpointStore(connectionString);

// File system (development only)
var store = new FileCheckpointStore("./checkpoints");

// In-memory (testing only)
var store = new InMemoryCheckpointStore();

Multi-Agent Orchestration Patterns

Agent Framework provides several built-in patterns for multi-agent collaboration:

Round-Robin Chat

Agents take turns in a fixed order:

var chat = new RoundRobinGroupChat(new[] 
{ 
    analyst, 
    critic, 
    synthesizer 
});

var result = await chat.RunAsync(
    "Analyze the pros and cons of microservices vs monoliths",
    maxRounds: 3);
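For intuition, round-robin turn-taking is two nested loops over a shared transcript. In this sketch the agents are hypothetical functions from the transcript so far to the next message; nothing here calls a model or the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins: each "agent" maps the transcript so far to its next message.
var agents = new (string Name, Func<IReadOnlyList<string>, string> Reply)[]
{
    ("Analyst", history => $"analysis #{history.Count}"),
    ("Critic", history => $"critique of '{history[history.Count - 1]}'"),
    ("Synthesizer", history => $"summary of {history.Count} messages"),
};

var transcript = new List<string> { "User: microservices vs monoliths?" };
const int maxRounds = 2;

// One round = every agent speaks once, always in the same order.
for (var round = 0; round < maxRounds; round++)
{
    foreach (var (name, reply) in agents)
    {
        transcript.Add($"{name}: {reply(transcript)}");
    }
}

transcript.ForEach(Console.WriteLine);
```

Every agent sees the full transcript, so the cost of each turn grows with the number of rounds. That is the main reason to keep `maxRounds` small.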

Selector-Based Routing

An AI selector chooses the next speaker:

var selector = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Selector",
    Instructions = """
        You are a conversation moderator. Based on the conversation so far,
        decide which agent should speak next. Choose from:
        - Researcher: for finding facts
        - Analyst: for interpreting data
        - Writer: for creating content
        - Critic: for reviewing work

        Respond with just the agent name.
        """
});

var chat = new SelectorGroupChat(
    selector: selector,
    agents: new[] { researcher, analyst, writer, critic },
    terminationCondition: conversation => 
        conversation.Messages.Last().Content.Contains("TASK COMPLETE"));

await chat.RunAsync("Write a market analysis report for electric vehicles");

Broadcast Pattern

All agents respond to each message:

var broadcast = new BroadcastGroupChat(new[] 
{ 
    optimist, 
    pessimist, 
    realist 
});

// Each agent will provide their perspective
var responses = await broadcast.CollectResponsesAsync(
    "Should we invest in quantum computing startups?");

foreach (var response in responses)
{
    Console.WriteLine($"{response.Agent.Name}: {response.Content}");
}

Hierarchical Teams

Group chats can be nested, so a whole team acts as a single agent inside a larger chat:

// Research team
var researchTeam = new RoundRobinGroupChat(new[] 
{ 
    seniorResearcher, 
    juniorResearcher, 
    dataAnalyst 
});

// Writing team
var writingTeam = new RoundRobinGroupChat(new[] 
{ 
    contentWriter, 
    copyEditor, 
    factChecker 
});

// Top-level chat that coordinates both teams
var executiveChat = new SelectorGroupChat(
    selector: projectManager,
    agents: new IAgent[] 
    { 
        researchTeam.AsAgent("ResearchTeam"), 
        writingTeam.AsAgent("WritingTeam"),
        stakeholderLiaison 
    });

await executiveChat.RunAsync("Create quarterly market report");

Magentic-One Pattern

Magentic-One is a generalist multi-agent pattern from Microsoft Research for complex, open-ended tasks. It features:

An Orchestrator that decomposes tasks and coordinates
Specialized agents for different capabilities
Dynamic replanning based on progress

var magneticOne = new MagenticOneTeam(new MagenticOneOptions
{
    Orchestrator = new ChatClientAgent(chatClient, new ChatClientAgentOptions
    {
        Name = "Orchestrator",
        Instructions = """
            You are the orchestrator for a team of AI agents. Your job is to:
            1. Break down complex tasks into subtasks
            2. Assign subtasks to the most appropriate agent
            3. Monitor progress and adjust plans as needed
            4. Synthesize results into a coherent output

            Available agents:
            - WebSurfer: Can browse the web and extract information
            - Coder: Can write and execute code
            - FileSurfer: Can read and analyze files
            - ComputerTerminal: Can execute shell commands
            """
    }),
    Agents = new[]
    {
        CreateWebSurferAgent(chatClient),
        CreateCoderAgent(chatClient),
        CreateFileSurferAgent(chatClient),
        CreateTerminalAgent(chatClient)
    },
    MaxIterations = 10,
    TaskLedger = new AzureBlobTaskLedger(blobClient)
});

var result = await magneticOne.ExecuteAsync(
    "Research the latest developments in quantum error correction, " +
    "find the top 5 research papers from 2025, and create a summary " +
    "comparing their approaches.");

Error Handling and Retry Strategies

var workflow = new WorkflowBuilder("resilient")
    .AddStep("risky_operation", async ctx =>
    {
        // This might fail
        await externalApi.CallAsync();
    })
    .WithRetry("risky_operation", new RetryOptions
    {
        MaxAttempts = 3,
        Delay = TimeSpan.FromSeconds(1),
        BackoffMultiplier = 2.0,
        RetryOn = ex => ex is HttpRequestException or TimeoutException
    })
    .WithFallback("risky_operation", async (ctx, ex) =>
    {
        // If all retries fail, use cached data
        ctx.Set("result", await cache.GetLastKnownGoodAsync());
        ctx.Set("used_fallback", true);
    })
    .Build();
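The retry options above describe exponential backoff with a final fallback. A minimal sketch of that policy as a standalone helper follows; `RetryAsync` and its parameters are invented for illustration and only mirror the option names, they are not the framework's retry implementation.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Retries `operation` up to maxAttempts times, multiplying the delay after each failure,
// then calls `fallback` with the last exception if every attempt failed.
async Task<T> RetryAsync<T>(
    Func<Task<T>> operation, Func<Exception, Task<T>> fallback,
    int maxAttempts, TimeSpan delay, double backoffMultiplier)
{
    Exception? last = null;
    for (var attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception ex) when (ex is HttpRequestException or TimeoutException)
        {
            last = ex;
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromMilliseconds(delay.TotalMilliseconds * backoffMultiplier);
            }
        }
    }
    return await fallback(last!);
}

// Simulated flaky call: fails twice, then succeeds.
var calls = 0;
var result = await RetryAsync(
    operation: () => ++calls < 3
        ? Task.FromException<string>(new HttpRequestException("transient"))
        : Task.FromResult("fresh data"),
    fallback: _ => Task.FromResult("cached data"),
    maxAttempts: 3, delay: TimeSpan.FromMilliseconds(10), backoffMultiplier: 2.0);

Console.WriteLine($"{result} after {calls} calls"); // fresh data after 3 calls
```

Exceptions that do not match the filter propagate immediately, so programming errors such as `ArgumentNullException` are not retried.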

Global Error Handling

var runner = new WorkflowRunner(workflow)
{
    OnStepError = async (stepId, context, exception) =>
    {
        logger.LogError(exception, "Step {StepId} failed", stepId);

        await alertService.SendAsync(
            $"Workflow step failed: {stepId}",
            exception.Message);
    },

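    OnWorkflowError = async (context, exception) =>
    {
        // Save partial results before failing
        await savePartialResults(context);

        // A bare `throw;` only compiles inside a catch block, so rethrow the
        // original exception explicitly with its stack trace preserved
        // (ExceptionDispatchInfo is in System.Runtime.ExceptionServices).
        ExceptionDispatchInfo.Capture(exception).Throw();
    }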
};

Observability and Tracing

Workflows integrate with OpenTelemetry:

var runner = new WorkflowRunner(workflow)
{
    ActivitySource = new ActivitySource("Workflows.ContentPipeline")
};

// Each step creates a span
// Trace hierarchy:
// workflow:content-pipeline
//   ├── step:research
//   │     └── agent:Researcher.invoke
//   ├── step:write
//   │     └── agent:Writer.invoke
//   └── step:edit
//         └── agent:Editor.invoke
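The parent/child hierarchy comes from how `System.Diagnostics.Activity` tracks the current span. This sketch reproduces the workflow and step levels with the real `ActivitySource` and `ActivityListener` types; the span names are copied from the tree above, the tag name is made up, and the listener stands in for an OpenTelemetry exporter.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var source = new ActivitySource("Workflows.ContentPipeline");
var spans = new List<string>();

// Without a listener (or an OpenTelemetry exporter) StartActivity returns null.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Workflows.ContentPipeline",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
    ActivityStopped = a => spans.Add($"{a.OperationName} (parent: {a.Parent?.OperationName ?? "none"})"),
};
ActivitySource.AddActivityListener(listener);

using (var workflowSpan = source.StartActivity("workflow:content-pipeline"))
{
    foreach (var step in new[] { "research", "write", "edit" })
    {
        // Activity.Current is the workflow span here, so each step becomes its child.
        using var stepSpan = source.StartActivity($"step:{step}");
        stepSpan?.SetTag("workflow.step", step); // hypothetical tag name
    }
}

spans.ForEach(Console.WriteLine);
```

Spans are reported when they stop, so the three step spans appear before the enclosing workflow span.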

Custom Metrics

workflow.OnStepCompleted += (sender, args) =>
{
    stepDurationHistogram.Record(
        args.Duration.TotalMilliseconds,
        new KeyValuePair<string, object?>("step", args.StepId),
        new KeyValuePair<string, object?>("workflow", args.WorkflowId));

    if (args.Context.TryGet<int>("tokens_used", out var tokens))
    {
        tokenCounter.Add(tokens,
            new KeyValuePair<string, object?>("step", args.StepId));
    }
};
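The handler above uses `stepDurationHistogram` and `tokenCounter` without showing where they come from. One plausible way to create them is with `System.Diagnostics.Metrics`; the meter and instrument names below are assumptions, and the `MeterListener` is only there so the sketch can print what was recorded (in production an OpenTelemetry exporter would take its place).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

var meter = new Meter("Workflows.ContentPipeline");
var stepDurationHistogram = meter.CreateHistogram<double>("workflow.step.duration", unit: "ms");
var tokenCounter = meter.CreateCounter<int>("workflow.tokens.used");

// Collect measurements in memory so the sketch has something to show.
var recorded = new List<string>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Workflows.ContentPipeline")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
    recorded.Add($"{instrument.Name}={value}"));
listener.SetMeasurementEventCallback<int>((instrument, value, tags, state) =>
    recorded.Add($"{instrument.Name}={value}"));
listener.Start();

// The same calls the OnStepCompleted handler makes.
stepDurationHistogram.Record(125.0, new KeyValuePair<string, object?>("step", "research"));
tokenCounter.Add(900, new KeyValuePair<string, object?>("step", "research"));

recorded.ForEach(Console.WriteLine);
```

Create the `Meter` and its instruments once and reuse them; creating instruments per step or per workflow run defeats aggregation.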

Best Practices

1. Keep Steps Focused

// ❌ Too much in one step
.AddStep("do_everything", async ctx =>
{
    // Research, write, edit, publish... 500 lines
})

// ✅ Single responsibility
.AddStep("research", async ctx => { /* just research */ })
.AddStep("write", async ctx => { /* just writing */ })
.AddStep("edit", async ctx => { /* just editing */ })

2. Use Typed Context Objects

public record ContentWorkflowState
{
    public string Topic { get; init; } = "";
    public string? ResearchNotes { get; set; }
    public string? Draft { get; set; }
    public ReviewResult? Review { get; set; }
    public int RevisionCount { get; set; }
}

// Extension for type safety
public static class ContextExtensions
{
    public static ContentWorkflowState GetState(this WorkflowContext ctx)
        => ctx.Get<ContentWorkflowState>("state");

    public static void SetState(this WorkflowContext ctx, ContentWorkflowState state)
        => ctx.Set("state", state);
}

3. Make Workflows Idempotent

.AddStep("publish", async ctx =>
{
    var articleId = ctx.Get<string>("article_id");

    // Check if already published (in case of retry)
    if (await cms.ExistsAsync(articleId))
    {
        ctx.Set("publish_result", "already_exists");
        return;
    }

    await cms.PublishAsync(articleId, ctx.Get<string>("content"));
    ctx.Set("publish_result", "published");
})

4. Plan for Long-Running Workflows

// Always use checkpointing for production
var runner = new WorkflowRunner(workflow)
{
    CheckpointStore = new AzureTableCheckpointStore(...),
    CheckpointFrequency = CheckpointFrequency.AfterEachStep,

    // Set reasonable timeouts
    StepTimeout = TimeSpan.FromMinutes(5),
    WorkflowTimeout = TimeSpan.FromHours(24),

    // Handle orphaned workflows
    OrphanedWorkflowTimeout = TimeSpan.FromHours(48)
};

What's Next

In Part 3, we'll explore the Model Context Protocol (MCP) — the universal tool standard that lets your agents use tools built in any language, and exposes your C# tools to agents everywhere.
