Introduction
Microsoft has added middleware support to its Agent Framework (MAF). Built on the chain-of-responsibility design pattern, middleware lets you add extra logic before or after the agent runs, a function tool is called, or the LLM is invoked, without changing the original business logic.
This feature matters a lot.
When building enterprise-level agent applications, teams typically collaborate across departments. Beyond your own business logic, you may need to plug in permission checks, logging, finance controls, and compliance reviews from other teams.
These pieces shouldn’t affect the agent’s core capability, and they should be easy to add or remove. Like middleware in FastAPI and other web frameworks, MAF middleware brings this capability to agents.
In today’s guide, I’ll show how I use MAF’s middleware and AG-UI to add a compliance review that checks user input before sending it to the agent.
This will teach you how to use middleware in enterprise agent applications and give you a first look at using AG-UI for microservice distributed agent development. Let’s start.
You can get all the source code at the end. 👇
📫 Don’t forget to follow my blog to stay updated on my latest progress in AI application practices.
System Setup
Install the latest Microsoft Agent Framework
MAF is still evolving quickly and its APIs change often, so this guide targets the newest release. It’s best to install the prerelease version:
pip install agent-framework --pre
Or add the dependency in your pyproject.toml file:
"agent-framework-ag-ui>=1.0.0b251223"
Install Microsoft Agent Framework AG-UI
MAF works with AG-UI to support distributed agent development. You’ll need that capability today, so install the latest version of agent-framework-ag-ui as well; otherwise, the APIs won’t line up.
"agent-framework-ag-ui>=1.0.0b251223"
After installing the needed Python packages, we can move on. First, let's get a quick background on what middleware is and what it can do.
Quick Intro to MAF Middleware
What is middleware
According to the MAF documentation:
Middleware in the Agent Framework intercepts, changes, and enhances agent behavior at different execution points. You can use it for logging, security checks, error handling, and result transformation without changing the agent’s or function’s core logic.
That’s what we’ll learn today.
How middleware works
As I said before, MAF middleware uses the chain-of-responsibility pattern. Each piece of logic lives in its own node. Every node knows the next one. When a node finishes running, it passes control to the next node.
Here’s a simple example:
async def logging_agent_middleware(
    context: AgentRunContext,
    next: Callable[[AgentRunContext], Awaitable[None]]
) -> None:
    print("[Agent] Starting execution")
    await next(context)
    print("[Agent] Execution completed")
The next parameter points to the next node. You can run code before or after calling it.
The actual agent logic acts as the last node. After all middleware nodes finish, the agent runs.
In MAF, middleware can run in three stages:
- Before or after run or run_stream.
- Before or after a function call.
- Before or after calling the LLM.
Now let’s look at how different middleware types work.
Function-Based middleware
If your middleware is simple, like just logging agent runs, function-based middleware is enough.
You only need a function with two parameters: context and next. The context carries the runtime info, and next invokes the next node.
MAF uses the type annotation of context to tell which stage this code belongs to. For example, if it runs at the agent stage, the type should be AgentRunContext:
from collections.abc import Awaitable, Callable

from agent_framework import AgentRunContext  # MAF's agent-stage middleware context


async def logging_agent_middleware(
    context: AgentRunContext,
    next: Callable[[AgentRunContext], Awaitable[None]]
) -> None:
    print("[Agent] Starting execution")
    await next(context)
    print("[Agent] Execution completed")
For a function call stage, use FunctionInvocationContext:
async def logging_function_middleware(
    context: FunctionInvocationContext,
    next: Callable[[FunctionInvocationContext], Awaitable[None]],
) -> None:
    print(f"[Function] Calling {context.function.name}")
    await next(context)
    print(f"[Function] {context.function.name} completed")
And for the chat stage, use ChatContext:
async def logging_chat_middleware(
    context: ChatContext,
    next: Callable[[ChatContext], Awaitable[None]],
) -> None:
    print(f"[Chat] Sending {len(context.messages)} messages to AI.")
    await next(context)
    print("[Chat] AI response received.")
If you dislike type annotations, you can use decorators.
@agent_middleware runs at the agent stage:
@agent_middleware
async def logging_agent_middleware(context, next) -> None:
    print("[Agent] Starting execution")
    await next(context)
    print("[Agent] Execution completed")
Then you don’t need to add type annotations anymore.
There are also @function_middleware and @chat_middleware for function calls and chat calls.
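For instance, the function-call and chat versions of the logging middleware above could be written with those decorators. A minimal sketch that reuses the same bodies as before:

@function_middleware
async def logging_function_middleware(context, next) -> None:
    print(f"[Function] Calling {context.function.name}")
    await next(context)
    print(f"[Function] {context.function.name} completed")


@chat_middleware
async def logging_chat_middleware(context, next) -> None:
    print(f"[Chat] Sending {len(context.messages)} messages to AI.")
    await next(context)
    print("[Chat] AI response received.")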
If your middleware needs to save state or handle more complex logic, function-based won’t be enough. Use class-based middleware.
Class-Based middleware
Class-based middleware organizes the logic as an object, which lets it keep state and handle more complex workflows.
A class-based middleware must meet two rules:
- Inherit from the right base class: AgentMiddleware, FunctionMiddleware, or ChatMiddleware.
- Implement a process method with the same parameters as the function-based version; it uses the same context types.
Here’s an example for a middleware class that runs at the function call stage:
class LoggingFunctionMiddleware(FunctionMiddleware):
    async def process(
        self,
        context: FunctionInvocationContext,
        next: Callable[[FunctionInvocationContext], Awaitable[None]]
    ) -> None:
        print(f"[Function Class] Calling {context.function.name}")
        await next(context)
        print(f"[Function Class] {context.function.name} completed.")
Just make sure to pair the right base class with the right context type. The others follow the same rule.
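For example, a minimal sketch of the agent-stage equivalent, pairing AgentMiddleware with AgentRunContext:

class LoggingAgentMiddleware(AgentMiddleware):
    async def process(
        self,
        context: AgentRunContext,
        next: Callable[[AgentRunContext], Awaitable[None]]
    ) -> None:
        print("[Agent Class] Starting execution")
        await next(context)
        print("[Agent Class] Execution completed")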
How to use middleware
There are three stages for middleware and three ways to build it. Let’s put that in one grid to see how they connect:
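Stage          | Context type               | Decorator            | Base class
Agent run      | AgentRunContext            | @agent_middleware    | AgentMiddleware
Function call  | FunctionInvocationContext  | @function_middleware | FunctionMiddleware
Chat (LLM)     | ChatContext                | @chat_middleware     | ChatMiddleware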
The framework now only supports passing middleware when creating the agent:
agent = chat_client.create_agent(
    name="assistant",
    instructions="You are a helpful assistant",
    tools=[get_weather],
    middleware=[
        logging_agent_middleware,
        LoggingFunctionMiddleware(),
        logging_chat_middleware,
        blocking_middleware,
        logging_function_middleware,
    ],
)
You can mix all nine combinations freely.
But note that, right now, only the last function middleware you add actually takes effect. I’m not sure whether that’s a bug; we’ll find out later.
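One more note: the blocking_middleware in the list above isn’t defined in this guide. A minimal sketch of what such a middleware might look like, assuming an agent-stage middleware that skips next() when a pre-check fails, so the agent never runs:

BLOCKED = False  # toggle to simulate a failed pre-check


@agent_middleware
async def blocking_middleware(context, next) -> None:
    if BLOCKED:  # hypothetical rule; replace with your own check
        print("[Agent] Request blocked by middleware")
        return  # skipping next(context) stops the chain, so the agent never runs
    await next(context)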
Project Practice: Add Compliance Check to Your Agent
Now let’s get hands-on. I’ll show how to use MAF middleware to add compliance checking to an agent.
Why add compliance checks
Every LLM already ships with basic compliance controls based on local regulations, and companies that self-host models often add custom checks in serving frameworks like vLLM. But those checks only watch the model’s input and output.
Now that agents are everywhere, we also need checks at the agent level: preventing prompt injection, checking MCP permissions, and so on. Middleware makes this possible.
In today’s demo, we’ll review every user message to make sure no one tries to make our finance assistant promise investment returns.
In the end, the agent will refuse to answer questions like “Will I lose money?” or “Can you guarantee profit?”
How will you do it
Why use a compliance check as the example? Because in real web applications, product teams don’t own compliance themselves. The compliance department defines the rules and exposes them to each product as a microservice.
That way, product teams never touch those rules; they simply plug them in through the framework’s middleware. It’s a common pattern in ordinary web apps.
We’ll do the same with MAF agents, using middleware to insert compliance logic.
To simulate real setups, this project has two parts: one server and one client.
On the compliance department side, we’ll deploy a separate agent that reviews messages. It uses an LLM to check user inputs for prompt injections or non-compliant content.
On the business side, we’ll have a middleware that intercepts user requests and sends them to that server. It decides whether the agent should respond.
The two parts communicate using the AG-UI protocol.
Server implementation
Let’s build the compliance-checking agent server.
Since it only checks user requests, I’ll use the Qwen3-30b-a3b-instruct-2507 model for speed.
# OpenAILikeChatClient and the Qwen3 model ids come from the author's setup for
# calling Qwen through an OpenAI-compatible endpoint; they are not part of MAF.
agent = OpenAILikeChatClient(
    model_id=Qwen3.Q30B_A3B
).create_agent(
    name="Assistant",
    instructions=dedent("""
        You are a compliance review officer. You will review user requests or system-generated text for compliance.
        Your main task is to check user requests and determine whether they are trying to induce the system to produce content that guarantees investment returns or similar topics.
        You should output a JSON text, like {"is_compliance": 1, "reason": ""}
        Here, is_compliance being 1 means compliant, and 0 means non-compliant.
        reason should state the reason for compliance or non-compliance.
        Only output the JSON text without any markdown formatting, and do not add any introduction or explanation.
    """),
)
We make the output structured JSON so it’s clear and easy to parse.
Although MAF supports structured output with Qwen models (see: Make Microsoft Agent Framework’s Structured Output Work With Qwen and DeepSeek Models), for some reason it doesn’t work when the agent is served over AG-UI.
So we have to specify the output format in the prompt instead.
Next, use add_agent_framework_fastapi_endpoint from agent_framework_ag_ui to register it with FastAPI.
from fastapi import FastAPI
from agent_framework_ag_ui import add_agent_framework_fastapi_endpoint

app = FastAPI(title="AG-UI Server")
add_agent_framework_fastapi_endpoint(app, agent, "/compliance")
Finally, run it with uvicorn:
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8888)
Middleware implementation
This middleware is more complex, so we’ll use class-based middleware.
Here’s the full code:
class ComplianceCheckMiddleware(ChatMiddleware):
    def __init__(self, *args, **kwargs):
        self._init_compliant_agent()
        super().__init__(*args, **kwargs)

    async def process(
        self,
        context: ChatContext,
        next: Callable[[ChatContext], Awaitable[None]],
    ):
        check_result: ReviewResults = await self._get_compliance_result(context)
        if not check_result.is_compliance:
            self._output_result(
                context,
                f"😒We can’t keep providing the service because:\n{fill(check_result.reason)}")
            return
        await next(context)

    @staticmethod
    def _output_result(context: ChatContext, response: str) -> None:
        if context.is_streaming:  #4
            async def output_stream() -> AsyncIterable[AgentRunResponseUpdate]:
                yield AgentRunResponseUpdate(contents=[TextContent(text=response)])
            context.result = output_stream()
        else:
            context.result = AgentRunResponse(
                messages=[ChatMessage(role=Role.ASSISTANT, text=response)]
            )

    async def _get_compliance_result(self, context: ChatContext) -> ReviewResults:
        messages = [message for message in context.messages if message.role.value == "user"][-5:]
        response = await self.agent.run(messages)  #2
        check_result = ReviewResults.model_validate_json(response.text)  #3
        return check_result

    def _init_compliant_agent(self) -> None:
        client = AGUIChatClient(  #1
            endpoint="http://127.0.0.1:8888/compliance"
        )
        self.agent = client.create_agent(
            name="compliance_agent",
            instructions="You’re a compliance officer, and you review user requests."
        )
A few details to watch:
- _init_compliant_agent creates the AG-UI client but works just like a normal chat client.
- I send the last five user messages for better review accuracy. The AgentMiddleware context only holds the latest message, so to access the message history you have to use ChatMiddleware.
- Since AG-UI doesn’t support response_format, I parse JSON manually.
- _output_result sends text output if a check fails. It switches based on context.is_streaming.
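For reference, the ReviewResults model used above isn’t shown in the snippet. A minimal sketch, assuming it simply mirrors the JSON schema the compliance prompt asks for:

from pydantic import BaseModel


class ReviewResults(BaseModel):
    is_compliance: int  # 1 means compliant, 0 means non-compliant
    reason: str = ""    # why the request passed or failed the review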
Now we can make a business agent. Use a bigger model and a normal system prompt; just remember to load the ComplianceCheckMiddleware.
chat_client = OpenAILikeChatClient(model_id=Qwen3.NEXT)

agent = chat_client.create_agent(
    name="chat_assistant",
    instructions="You are a helpful assistant. Answer the user's question in short and simple words.",
    middleware=[ComplianceCheckMiddleware()]
)
Let’s test it with a multi-turn chat client:
import asyncio


async def main():
    thread = agent.get_new_thread()
    while True:
        user_input = input("\nUser: ")
        if user_input.startswith("exit"):
            break
        stream = agent.run_stream(user_input, thread=thread)
        print("\nAssistant: ")
        async for event in stream:
            print(event.text, end="", flush=True)
        print()


if __name__ == "__main__":
    asyncio.run(main())
You’ll see the agent chats normally most of the time.
If you ask about guaranteed returns, it refuses to answer but continues working fine afterward.
Task done.
Conclusion
In this guide, we explored how middleware works in Microsoft Agent Framework.
Middleware lets us add new logic for logging, permissions, or compliance without touching the main agent code or prompt text.
In the project section, I used class-based middleware to show how to review user inputs for compliance.
We also took a quick look at AG-UI for building agent microservices. This helps when many teams need to make agents collaborate, and I’ll cover AG-UI and A2A in detail later.
If you have questions or want to learn more, leave a comment.
Don’t forget to subscribe to my blog and share this article with your friends—maybe it’ll help someone build smarter agents 😁.
Enjoyed this read? Subscribe now to get more cutting-edge data science tips straight to your inbox! Your feedback and questions are welcome — let’s discuss in the comments below!
This article was originally published on Data Leads Future.