Peng Qian

Originally published at dataleadsfuture.com

Microsoft Agent Framework (MAF) Middleware Basics: Add Compliance Fences to Your Agent

Introduction

Microsoft added a middleware feature to its Agent Framework (MAF). This means you can use the chain-of-responsibility design pattern to add extra logic before or after an agent run, a function tool call, or an LLM invocation, without changing the original business logic.

This feature matters a lot.

When building enterprise-level agent applications, teams typically collaborate across departments. Besides your own part, you might need to dynamically include permissions, logs, finance checks, and compliance reviews from other teams.

These parts shouldn’t affect the agent’s core ability, but should be easy to install or remove. Like middleware in FastAPI or other web frameworks, MAF middleware enables this capability for agents as well.

In today’s guide, I’ll show how I use MAF’s middleware and AG-UI to add a compliance review that checks user input before sending it to the agent.

Attempts to induce the agent are blocked by compliance rules specific to a business scenario. Image by Author

This will teach you how to use middleware in enterprise agent applications and give you a first look at using AG-UI for microservice distributed agent development. Let’s start.

You can get all the source code at the end. 👇


📫 Don’t forget to follow my blog to stay updated on my latest progress in AI application practices.


System Setup

Install the latest Microsoft Agent Framework

MAF is still evolving quickly and its APIs change often, so this guide uses the newest release. Install the prerelease version:

pip install agent-framework --pre

Or add the dependency to your pyproject.toml file:

"agent-framework>=1.0.0b251223"

Install Microsoft Agent Framework AG-UI

MAF works with AG-UI to support distributed agent development. You’ll need this capability today, so add the latest version of agent-framework-ag-ui to your dependencies; otherwise, the APIs won’t match up.

"agent-framework-ag-ui>=1.0.0b251223"

After installing the needed Python packages, we can move on. First, let's get a quick background on what middleware is and what it can do.


Quick Intro to MAF Middleware

What is middleware

According to the MAF documentation:

Middleware in the Agent Framework intercepts, changes, and enhances agent behavior at different execution points. You can use it for logging, security checks, error handling, and result transformation without changing the agent’s or function’s core logic.

That’s what we’ll learn today.

How middleware works

As I said before, MAF middleware uses the chain-of-responsibility pattern. Each piece of logic lives in its own node. Every node knows the next one. When a node finishes running, it passes control to the next node.

Here’s a simple example:

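from collections.abc import Awaitable, Callable

from agent_framework import AgentRunContext

async def logging_agent_middleware(
    context: AgentRunContext,
    next: Callable[[AgentRunContext], Awaitable[None]]
) -> None:
    print("[Agent] Starting execution")   # runs before the rest of the chain
    await next(context)                   # hand control to the next node
    print("[Agent] Execution completed")  # runs after the rest of the chain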

The next parameter points to the next node. You can run code before or after calling it.

The actual agent logic acts as the last node. Once every middleware has called next, the agent runs, and control then returns back through the middleware in reverse order.
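To make the chain idea concrete, here is a minimal, framework-free sketch. It is not MAF's actual implementation: Context, build_chain, outer, inner, and agent are names I made up for illustration, and the only thing borrowed from MAF is the (context, next) calling convention.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Context:
    """Hypothetical stand-in for a MAF context object."""
    message: str
    log: list[str] = field(default_factory=list)


Next = Callable[[Context], Awaitable[None]]
Middleware = Callable[[Context, Next], Awaitable[None]]


def build_chain(middlewares: list[Middleware], handler: Next) -> Next:
    """Wrap handler so that middlewares[0] runs first and handler runs last."""
    chain = handler
    for mw in reversed(middlewares):
        # Bind mw and the current chain as defaults to avoid late binding.
        async def step(ctx: Context, mw: Middleware = mw, nxt: Next = chain) -> None:
            await mw(ctx, nxt)
        chain = step
    return chain


async def outer(ctx: Context, next: Next) -> None:
    ctx.log.append("outer: before")
    await next(ctx)
    ctx.log.append("outer: after")


async def inner(ctx: Context, next: Next) -> None:
    ctx.log.append("inner: before")
    await next(ctx)
    ctx.log.append("inner: after")


async def agent(ctx: Context) -> None:
    # The real agent logic sits at the end of the chain.
    ctx.log.append(f"agent: handling {ctx.message!r}")


def run_demo() -> list[str]:
    ctx = Context(message="hello")
    asyncio.run(build_chain([outer, inner], agent)(ctx))
    return ctx.log
```

Calling run_demo() returns the log entries in onion order: both "before" entries, then the agent, then the "after" entries in reverse.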

In MAF, middleware can run in three stages:

  • Before or after run or run_stream.
  • Before or after a function call.
  • Before or after calling the LLM.

The Microsoft Agent Framework middleware works at different stages of agent execution. Image by Author

Now let’s look at how different middleware types work.

Function-Based middleware

If your middleware is simple, like just logging agent runs, use a function-based middleware.

You only need a function with two parameters: context and next. The context keeps your runtime info, and next calls the next node.

MAF uses the type annotation of context to tell which stage this code belongs to. For example, if it runs at the agent stage, the type should be AgentRunContext:

async def logging_agent_middleware(
    context: AgentRunContext,
    next: Callable[[AgentRunContext], Awaitable[None]]
) -> None:
    print("[Agent] Starting execution")
    await next(context)
    print("[Agent] Execution completed")

For a function call stage, use FunctionInvocationContext:

async def logging_function_middleware(
    context: FunctionInvocationContext,
    next: Callable[[FunctionInvocationContext], Awaitable[None]],
) -> None:
    print(f"[Function] Calling {context.function.name}")
    await next(context)
    print(f"[Function] {context.function.name} completed")

And for the chat stage, use ChatContext:

async def logging_chat_middleware(
    context: ChatContext,
    next: Callable[[ChatContext], Awaitable[None]],
) -> None:
    print(f"[Chat] Sending {len(context.messages)} messages to AI.")
    await next(context)
    print("[Chat] AI response received.")

If you dislike type annotations, you can use decorators.

@agent_middleware runs at the agent stage:

from agent_framework import agent_middleware

@agent_middleware
async def logging_agent_middleware(context, next) -> None:
    print("[Agent] Starting execution")
    await next(context)
    print("[Agent] Execution completed")

Then you don’t need to add type annotations anymore.

There are also @function_middleware and @chat_middleware for function calls and chat calls.
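If you wonder how a framework can tell the stage from either an annotation or a decorator, here is a rough sketch of the idea. I haven't checked how MAF does it internally, so treat this purely as an illustration: the three context classes are empty stand-ins, and detect_stage, _STAGE_BY_CONTEXT, and the _stage attribute are hypothetical names.

```python
import inspect
from typing import Callable, get_type_hints


# Empty stand-ins for the real MAF context classes.
class AgentRunContext: ...
class FunctionInvocationContext: ...
class ChatContext: ...


_STAGE_BY_CONTEXT = {
    AgentRunContext: "agent",
    FunctionInvocationContext: "function",
    ChatContext: "chat",
}


def agent_middleware(func: Callable) -> Callable:
    """Hypothetical decorator: tag the function with its stage."""
    func._stage = "agent"
    return func


def detect_stage(func: Callable) -> str:
    """Use the decorator tag if present, else the first parameter's annotation."""
    tagged = getattr(func, "_stage", None)
    if tagged is not None:
        return tagged
    first_param = next(iter(inspect.signature(func).parameters))
    context_type = get_type_hints(func).get(first_param)
    if context_type not in _STAGE_BY_CONTEXT:
        raise TypeError(f"cannot infer stage for {func.__name__}")
    return _STAGE_BY_CONTEXT[context_type]


async def log_chat(context: ChatContext, next) -> None:
    await next(context)


@agent_middleware
async def log_agent(context, next) -> None:
    await next(context)


async def untyped(context, next) -> None:
    await next(context)
```

In this sketch detect_stage(log_chat) resolves through the annotation, detect_stage(log_agent) resolves through the decorator tag, and detect_stage(untyped) raises TypeError because neither hint is available.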

If your middleware needs to save state or handle more complex logic, function-based won’t be enough. Use class-based middleware.

Class-Based middleware

Class-based middleware organizes code with object-oriented methods. That lets middleware remember state and handle tricky logic.

A class-based middleware must meet two rules:

  1. Inherit from the right base class: AgentMiddleware, FunctionMiddleware, or ChatMiddleware.
  2. Have a process method with the same parameters as the function-based ones. They use the same contexts.

Here’s an example for a middleware class that runs at the function call stage:

class LoggingFunctionMiddleware(FunctionMiddleware):
    async def process(
        self,
        context: FunctionInvocationContext,
        next: Callable[[FunctionInvocationContext], Awaitable[None]]
    ) -> None:
        print(f"[Function Class] Calling {context.function.name}")
        await next(context)
        print(f"[Function Class] {context.function.name} completed.")

Just make sure to pair the right base class with the right context type. The others follow the same rule.
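The logging class above doesn't keep any state, so here is a small sketch of what "remembering state" can look like. To keep it runnable without MAF installed, I use stand-in classes (FakeFunction, FakeInvocationContext) and skip the base class; in a real project you would inherit from FunctionMiddleware and receive a FunctionInvocationContext. CallStatsMiddleware is a hypothetical name.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class FakeFunction:
    name: str


@dataclass
class FakeInvocationContext:
    """Stand-in for FunctionInvocationContext; only function.name is modelled."""
    function: FakeFunction


class CallStatsMiddleware:
    """Counts calls and accumulates elapsed time per tool name."""

    def __init__(self) -> None:
        self.calls: dict[str, int] = {}
        self.seconds: dict[str, float] = {}

    async def process(
        self,
        context: FakeInvocationContext,
        next: Callable[[FakeInvocationContext], Awaitable[None]],
    ) -> None:
        name = context.function.name
        started = time.perf_counter()
        try:
            await next(context)
        finally:
            # Record the call even if the tool raised.
            elapsed = time.perf_counter() - started
            self.calls[name] = self.calls.get(name, 0) + 1
            self.seconds[name] = self.seconds.get(name, 0.0) + elapsed


async def fake_tool(context: FakeInvocationContext) -> None:
    await asyncio.sleep(0)  # pretend to do some work


async def demo() -> CallStatsMiddleware:
    stats = CallStatsMiddleware()
    for tool in ("get_weather", "get_weather", "get_time"):
        await stats.process(FakeInvocationContext(FakeFunction(tool)), fake_tool)
    return stats
```

Because the counters live on the instance, the same middleware object has to be reused across runs for the numbers to add up.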

How to use middleware

There are three stages for middleware and three ways to build it. Let’s put that in one grid chart to see how they connect.

Use a grid chart to describe the implementations of different middleware. Image by Author

The framework now only supports passing middleware when creating the agent:

agent = chat_client.create_agent(
    name="assistant",
    instructions="You are a helpful assistant",
    tools=[get_weather],
    middleware=[
        logging_agent_middleware,
        LoggingFunctionMiddleware(),
        logging_chat_middleware,
        blocking_middleware,
        logging_function_middleware,
    ]
)

You can mix all nine types freely.

But note that only the last function middleware you add actually works right now. I’m not sure if that’s a bug, but we’ll find out later.


Project Practice: Add Compliance Check to Your Agent

Now let’s get hands-on. I’ll show how to use MAF middleware to add compliance checking to an agent.

Why add compliance checks

Every LLM already has basic compliance setups built in based on local laws. When companies self-host LLMs, they also add custom checks in frameworks like vLLM. But those only watch the model’s input or output.

Now that agents are everywhere, we also need checks at the agent level: preventing prompt injection, checking MCP permissions, and so on. Middleware makes this possible.

In today’s demo, we’ll review every user message to make sure no one tries to make our finance assistant promise investment returns.

In the end, the agent will refuse to answer questions like “Will I lose money?” or “Can you guarantee profit?”

Attempts to induce the agent are blocked by compliance rules specific to a business scenario. Image by Author

How we’ll do it

Why use compliance checks as an example? Because in real web apps, product teams don’t manage compliance themselves. The compliance department creates the rules and sends them as microservices to each product.

That way, teams don’t touch those rules. They just plug them in using framework middleware. It’s common in normal web apps.

We’ll do the same with MAF agents, using middleware to insert compliance logic.

To simulate real setups, this project has two parts: one server and one client.

The compliance check middleware will include both server and client modules. Image by Author

On the compliance department side, we’ll deploy a separate agent that reviews messages. It uses an LLM to check user inputs for prompt injections or non-compliant content.

On the business side, we’ll have a middleware that intercepts user requests and sends them to that server. It decides whether the agent should respond.

The two parts communicate using the AG-UI protocol.
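Before the real code, here is the control flow in miniature. This is only a sketch with made-up names (Verdict, Ctx, remote_review, compliance_fence, business_agent): remote_review is a keyword stub standing in for the compliance server, which in the actual project is an LLM-backed agent reached over AG-UI.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class Verdict:
    is_compliance: bool
    reason: str = ""


@dataclass
class Ctx:
    """Hypothetical stand-in for a middleware context."""
    user_text: str
    result: Optional[str] = None


async def remote_review(text: str) -> Verdict:
    """Stub for the compliance service; the real one asks an LLM."""
    if "guarantee" in text.lower():
        return Verdict(False, "asks for guaranteed investment returns")
    return Verdict(True)


async def compliance_fence(ctx: Ctx, next: Callable[[Ctx], Awaitable[None]]) -> None:
    verdict = await remote_review(ctx.user_text)
    if not verdict.is_compliance:
        # Short-circuit: set the result and do not call next().
        ctx.result = f"Request blocked: {verdict.reason}"
        return
    await next(ctx)


async def business_agent(ctx: Ctx) -> None:
    ctx.result = f"(agent answer to: {ctx.user_text})"


async def ask(text: str) -> str:
    ctx = Ctx(text)
    await compliance_fence(ctx, business_agent)
    return ctx.result or ""
```

The key point is the early return: when the review fails, the middleware fills in the result itself and never calls next, so the business agent is never reached.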

Server implementation

Let’s build the compliance-checking agent server.

Since it only checks user requests, I’ll use the Qwen3-30b-a3b-instruct-2507 model for speed.

agent = OpenAILikeChatClient(
    model_id=Qwen3.Q30B_A3B
).create_agent(
    name="Assistant",
    instructions=dedent("""
    You are a compliance review officer. You will review user requests or system-generated text for compliance.
    Your main task is to check user requests and determine whether they are trying to induce the system to produce content that guarantees investment returns or similar topics.

    You should output a JSON text, like {"is_compliance": 1, "reason": ""}

    Here, is_compliance being 1 means compliant, and 0 means non-compliant.

    reason should state the reason for compliance or non-compliance.

    Only output the JSON text without any markdown formatting, and do not add any introduction or explanation.
    """),
)

We’ll make the output structured as JSON for clarity and speed.

MAF can produce structured output with Qwen models (see my earlier article, "Make Microsoft Agent Framework’s Structured Output Work With Qwen and DeepSeek Models"), but for some reason that doesn’t work when the agent is served over AG-UI.

So we have to describe the output format in the prompt instead.

Next, use add_agent_framework_fastapi_endpoint from agent_framework_ag_ui to register it with FastAPI.

from agent_framework_ag_ui import add_agent_framework_fastapi_endpoint
from fastapi import FastAPI

app = FastAPI(title="AG-UI Server")
add_agent_framework_fastapi_endpoint(app, agent, "/compliance")

Finally, run it with uvicorn:

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8888)

Middleware implementation

This middleware is more complex, so we’ll use class-based middleware.

Here’s the full code:

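from textwrap import fill
from typing import AsyncIterable, Awaitable, Callable

from agent_framework import (
    AgentRunResponse, AgentRunResponseUpdate, ChatContext,
    ChatMessage, ChatMiddleware, Role, TextContent,
)
from agent_framework_ag_ui import AGUIChatClient

# ReviewResults is a Pydantic model with is_compliance and reason fields (definition not shown here).
class ComplianceCheckMiddleware(ChatMiddleware):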
    def __init__(self, *args, **kwargs):
        self._init_compliant_agent()
        super().__init__(*args, **kwargs)

    async def process(
        self,
        context: ChatContext,
        next: Callable[[ChatContext], Awaitable[None]],
    ):
        check_result: ReviewResults = await self._get_compliance_result(context)
        if not check_result.is_compliance:
            self._output_result(
                context,
                f"😒We can’t keep providing the service because:\n{fill(check_result.reason)}")
            return

        await next(context)

    @staticmethod
    def _output_result(context: ChatContext, response: str) -> None:
        if context.is_streaming: #4
            async def output_stream() -> AsyncIterable[AgentRunResponseUpdate]:
                yield AgentRunResponseUpdate(contents=[TextContent(text=response)])
            context.result = output_stream()
        else:
            context.result = AgentRunResponse(
                messages=[ChatMessage(role=Role.ASSISTANT, text=response)]
            )

    async def _get_compliance_result(self, context: ChatContext) -> ReviewResults:
        messages = [message for message in context.messages if message.role.value == "user"][-5:]
        response = await self.agent.run(messages) #2

        check_result = ReviewResults.model_validate_json(response.text) #3
        return check_result

    def _init_compliant_agent(self) -> None:
        client = AGUIChatClient(  #1
            endpoint="http://127.0.0.1:8888/compliance"
        )
        self.agent = client.create_agent(
            name="compliance_agent",
            instructions="You’re a compliance officer, and you review user requests."
        )

A few details to watch:

  1. _init_compliant_agent creates an AG-UI client, which then works just like a normal chat client.
  2. I send the five most recent user messages for better review accuracy. The AgentMiddleware context only holds the latest message, so to get the message history you must use ChatMiddleware.
  3. Since AG-UI doesn’t support response_format, I parse the JSON manually.
  4. _output_result writes the refusal text when a check fails. It builds either a stream or a single response, depending on context.is_streaming.
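The ReviewResults model isn't shown in the code above. The middleware calls model_validate_json on it, so the original is a Pydantic model; the sketch below is a dependency-free stand-in using dataclasses and json, with only the two fields named in the server prompt. The parse_review helper, the markdown-fence stripping, and the choice to treat unreadable replies as non-compliant are my assumptions for illustration, not part of the project code.

```python
import json
from dataclasses import dataclass

_FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable


@dataclass
class ReviewResults:
    is_compliance: bool
    reason: str = ""


def parse_review(raw: str) -> ReviewResults:
    """Parse the reviewer's reply; treat anything unreadable as non-compliant."""
    text = raw.strip()
    # Models sometimes wrap JSON in a markdown fence despite the prompt.
    if text.startswith(_FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
        # The prompt asks for 1 (compliant) or 0 (non-compliant).
        return ReviewResults(data["is_compliance"] == 1, str(data.get("reason", "")))
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        return ReviewResults(False, "compliance review returned an unreadable result")
```

Failing closed is a design choice: if the reviewer's reply can't be parsed, blocking the request is usually safer for a compliance fence than letting it through, though it does mean a flaky reviewer will block legitimate questions.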

Now we can make a business agent. Use a bigger model and a normal system prompt; just remember to load the ComplianceCheckMiddleware.

chat_client = OpenAILikeChatClient(model_id=Qwen3.NEXT)
agent = chat_client.create_agent(
    name="chat_assistant",
    instructions="You are a helpful assistant. Answer the user's question in short and simple words.",
    middleware=[ComplianceCheckMiddleware()]
)

Let’s test it with a multi-turn chat client:

import asyncio

async def main():
    # One thread keeps the conversation history across turns.
    thread = agent.get_new_thread()
    while True:
        user_input = input("\nUser: ")
        if user_input.startswith("exit"):
            break
        stream = agent.run_stream(user_input, thread=thread)
        print("\nAssistant: ")
        async for event in stream:
            print(event.text, end="", flush=True)
        print()

if __name__ == "__main__":
    asyncio.run(main())

You’ll see the agent chats normally most of the time.

If you ask about guaranteed returns, it refuses to answer but continues working fine afterward.

Task done.


Conclusion

In this guide, we explored how middleware works in Microsoft Agent Framework.

Middleware lets us add new logic for logging, permissions, or compliance without touching the main agent code or prompt text.

In the project section, I used class-based middleware to show how to review user inputs for compliance.

We also took a quick look at AG-UI for building agent microservices. This helps when many teams need to make agents collaborate, and I’ll cover AG-UI and A2A in detail later.

If you have questions or want to learn more, leave a comment.

Don’t forget to subscribe to my blog and share this article with your friends—maybe it’ll help someone build smarter agents 😁.



This article was originally published on Data Leads Future.
