Om Shree

Posted on Sep 3 • Originally published at glama.ai

Elicitation in MCP: Bridging the Human-AI Gap


The evolution of artificial intelligence has largely centered on creating autonomous systems that can access data and execute tasks with minimal human intervention. The Model Context Protocol (MCP) has been a pivotal development in this space, standardizing the communication between Large Language Models (LLMs) and external tools, essentially acting as a universal API for AI applications. However, not all workflows can or should be fully automated. There are critical junctures where an AI agent needs to pause, seek clarification, or request confirmation from a human user. This need for a "human-in-the-loop" capability is precisely what the new Elicitation feature in the MCP specification addresses.

Elicitation provides a dynamic mechanism for an MCP server to request structured input from a user via the MCP client. This is a significant departure from traditional models where all parameters are hardwired or collected upfront. By enabling this bidirectional communication, Elicitation allows for more sophisticated and secure interactive workflows, from confirming a critical action like a financial transaction to simply clarifying an ambiguous request. This article will delve into the technical underpinnings of Elicitation, focusing on its reliance on the Streamable HTTP transport protocol and providing a hands-on example to illustrate its real-world implementation.

The Problem of Static Context and the Rise of Elicitation

Before the introduction of Elicitation, the typical MCP workflow involved a client making a request to an MCP server, which would then execute a tool and return a result. For a scenario like an expense management tool, where a manager's approval is required, this process was cumbersome. The tool would have to be designed to handle a waiting state, and the user interaction would be a custom, non-standardized part of the application logic. There was no native protocol mechanism to represent this human interaction within the MCP flow itself.

Elicitation, introduced in the 2025-06-18 revision of the MCP specification¹ and flagged there as a new feature whose design may still evolve, solves this by standardizing the request for user input. It allows the server to send a structured elicitation request to the client, which can then present it to the user through any interface: a console prompt, a pop-up dialog, or a message on a platform like Slack or Microsoft Teams. The client-server interaction stays consistent with the protocol regardless of the user's interface. Once the user responds, by accepting and supplying the input, declining, or cancelling, the client sends the outcome back to the server, and the workflow resumes. This runtime capability is what transforms MCP from a static tool-calling protocol into a framework capable of handling complex, interactive, multi-turn workflows.
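To make the request and response shapes concrete, here is a minimal sketch of the two JSON-RPC messages involved. The method name elicitation/create, the requestedSchema field, and the accept/decline/cancel actions follow the MCP specification; the helper function names and the expense-approval payload are illustrative assumptions, not part of any library.

```python
import json

def make_elicitation_request(request_id, message, properties, required):
    """Build an elicitation/create JSON-RPC request (server -> client)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {
            "message": message,
            # A flat object schema; the spec limits properties to primitive types.
            "requestedSchema": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def make_elicitation_response(request_id, action, content=None):
    """Build the client's reply; action is "accept", "decline", or "cancel"."""
    if action not in ("accept", "decline", "cancel"):
        raise ValueError(f"unknown action: {action}")
    result = {"action": action}
    if action == "accept":
        # Only an accepted elicitation carries user-supplied content.
        result["content"] = content or {}
    return {"jsonrpc": "2.0", "id": request_id, "result": result}

# Hypothetical expense-approval exchange.
request = make_elicitation_request(
    7,
    "Approve this expense?",
    {"approved": {"type": "boolean"}, "comment": {"type": "string"}},
    ["approved"],
)
response = make_elicitation_response(7, "accept", {"approved": True, "comment": "ok"})

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```

The response reuses the request's id, which is how the server matches the user's answer to the tool call that is waiting on it.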

Streamable HTTP: The Foundation for Elicitation

To understand how Elicitation works, one must first grasp the mechanics of Streamable HTTP, one of the two standard transport mechanisms in MCP (the other being stdio). Unlike traditional stateless REST APIs, Streamable HTTP is designed for stateful, bi-directional communication. This transport protocol is the crucial building block that enables the server to push real-time events, like an elicitation request, to the client.
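As a rough illustration of what the client sees on an SSE stream, the sketch below parses a text/event-stream body into JSON-RPC messages. It handles only the data field and blank-line event boundaries, which is a simplification of the full SSE format, and the sample payload is invented for the example.

```python
import json

def parse_sse(stream_text):
    """Return the JSON payload of each event in a text/event-stream body."""
    messages = []
    data_lines = []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                messages.append(json.loads("\n".join(data_lines)))
                data_lines = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        # Other fields (event:, id:) and comment lines are ignored in this sketch.
    if data_lines:
        messages.append(json.loads("\n".join(data_lines)))
    return messages

# Invented sample stream: one elicitation request, then a log notification.
sample = (
    "event: message\n"
    'data: {"jsonrpc": "2.0", "id": 1, "method": "elicitation/create", '
    '"params": {"message": "Your name?"}}\n'
    "\n"
    ": keep-alive comment\n"
    'data: {"jsonrpc": "2.0", "method": "notifications/message", '
    '"params": {"level": "info", "data": "working"}}\n'
    "\n"
)

for message in parse_sse(sample):
    print(message["method"])
```

In practice an MCP client library does this parsing internally; the point is only that each pushed event is an ordinary JSON-RPC message.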

The communication flow over Streamable HTTP is a multi-step handshake:

Session Initialization: The client sends an initial HTTP POST request carrying the initialize request to begin a session. The server's response includes a unique Mcp-Session-Id header, which the client attaches to every subsequent request in the conversation.
Notification Handshake: The client sends another POST request carrying the notifications/initialized notification, confirming that initialization is complete.
Event Channel Establishment: The client makes a long-lived HTTP GET request to the /mcp endpoint with an Accept header of text/event-stream. The server answers with a Content-Type of text/event-stream, establishing a Server-Sent Events (SSE) channel that lets it stream multiple messages to the client without the client needing to continuously poll.
Tool Discovery: With the channels established, the client can request the list of available tools from the server via the tools/list method.
Tool Invocation: The client passes the tool information and the user's prompt to an LLM. The LLM then determines which tool to call and invokes it.
Elicitation Event: This is where Elicitation comes in. When a tool requires user input, the server pushes an elicitation/create JSON-RPC request to the client over an open SSE stream. The client's registered elicitation handler then takes over, presenting the prompt to the user and collecting their input.
Response and Continuation: The client sends the user's input back to the server as the JSON-RPC response to that request. The server resumes the tool with it, and the LLM receives the result as new context to complete the task.
Session Termination: The client sends a final HTTP DELETE request to close the session and clean up server-side resources.

This detailed, stateful process is what differentiates Streamable HTTP from standard RPC or REST and provides the necessary foundation for interactive features like Elicitation.
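The steps above can be sketched as an ordered plan of HTTP requests. Nothing is sent over the network here; the code only builds the verb, headers, and JSON-RPC body for each step. Method names and headers follow the MCP specification, while the endpoint URL, client name, session ID, the server-chosen elicitation ID, and the tool name are assumptions made for illustration.

```python
import json

MCP_URL = "http://localhost:8000/mcp"  # assumed endpoint

def rpc(method, params=None, request_id=None):
    """Build a JSON-RPC 2.0 message; omitting request_id makes it a notification."""
    message = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        message["params"] = params
    if request_id is not None:
        message["id"] = request_id
    return message

def plan_session(session_id):
    """Return the session's HTTP requests as (verb, headers, body) tuples."""
    post = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    # The session ID comes back from the server on the initialize response,
    # so only the requests after initialization carry it.
    with_session = {**post, "Mcp-Session-Id": session_id}
    init_params = {
        "protocolVersion": "2025-06-18",
        "capabilities": {"elicitation": {}},  # client declares elicitation support
        "clientInfo": {"name": "sketch-client", "version": "0.1"},
    }
    elicitation_reply = {
        "jsonrpc": "2.0",
        "id": "elicit-1",  # hypothetical ID chosen by the server for its request
        "result": {"action": "accept", "content": {"name": "Ada", "age": 36}},
    }
    call = rpc("tools/call", {"name": "collect_user_info", "arguments": {}}, request_id=3)
    return [
        ("POST", post, rpc("initialize", init_params, request_id=1)),
        ("POST", with_session, rpc("notifications/initialized")),
        ("GET", {"Accept": "text/event-stream", "Mcp-Session-Id": session_id}, None),
        ("POST", with_session, rpc("tools/list", request_id=2)),
        ("POST", with_session, call),
        ("POST", with_session, elicitation_reply),
        ("DELETE", {"Mcp-Session-Id": session_id}, None),
    ]

for verb, headers, body in plan_session("example-session-id"):
    print(verb, MCP_URL, json.dumps(body) if body is not None else "")
```

Note that the reply to the elicitation is just another POST: the client answers the server's request by ID rather than opening a separate channel.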

Behind the Scenes / How It Works

Elicitation is a powerful example of how MCP enables dynamic, asynchronous workflows. The core logic hinges on a well-defined data flow and schema.

Server-Side Logic

A developer building an MCP server must define a tool that is capable of eliciting user input. In Python, a library like FastMCP simplifies this by providing a declarative way to create such tools. The key is to use a data class to define the expected user input and then define the tool function to emit the elicitation request.

Here's a schematic of the server-side code:

# Server-side sketch (assumes FastMCP 2.10 or later, which provides Context.elicit)
from dataclasses import dataclass

from fastmcp import FastMCP, Context

@dataclass
class UserInfo:
    name: str
    age: int

mcp = FastMCP("MyElicitationServer")

@mcp.tool
async def collect_user_info(ctx: Context) -> str:
    """Collects user information via elicitation."""
    # Execution pauses here until the client replies to the elicitation request.
    result = await ctx.elicit(
        message="Please provide your information.",
        response_type=UserInfo,
    )
    # The user may accept, decline, or cancel; only "accept" carries data.
    if result.action != "accept":
        return "No information was provided."
    user_data = result.data
    return f"Hello {user_data.name}, you are {user_data.age} years old."

if __name__ == "__main__":
    # Serve over Streamable HTTP so a client can reach http://localhost:8000/mcp
    mcp.run(transport="http", host="127.0.0.1", port=8000)

When the collect_user_info tool is invoked, ctx.elicit packages the message and a JSON Schema derived from response_type into an elicitation/create JSON-RPC request. Because it is a request rather than a notification, it carries an ID and expects a reply. The request is pushed to the client over SSE, and the tool resumes once the client's response arrives.

Client-Side Logic

On the client side, a dedicated handler is responsible for processing the elicitation request. The client registers this handler during its initialization.

# Client-side sketch (assumes FastMCP 2.10 or later)
import asyncio

from fastmcp import Client

async def handle_elicitation(message, response_type, params, context):
    # message is the server's prompt; response_type is a dataclass that
    # FastMCP builds from the schema the server requested.
    print(f"Server is asking for input: {message}")
    print(f"Expected response type: {response_type}")

    name = input("Your name: ")
    age = int(input("Your age: "))

    # Returning the populated object implicitly accepts the elicitation.
    return response_type(name=name, age=age)

async def main():
    client = Client("http://localhost:8000/mcp", elicitation_handler=handle_elicitation)

    async with client:
        # Example tool invocation
        result = await client.call_tool("collect_user_info", {})
        print(f"Result from server: {result}")

if __name__ == "__main__":
    asyncio.run(main())

The handle_elicitation function is the registered callback. It receives the message and the expected response type from the server, interacts with the user (in this simple case, via console input), and returns the populated response object. FastMCP treats a returned value as an accepted elicitation and formats it into a structured JSON-RPC response, which is sent back to the server to complete the elicitation loop.

The brilliance of this design is its decoupling. The server defines what information it needs using a clear schema, but it is completely agnostic to how the client collects it. This allows for diverse implementations, from a command-line interface to a sophisticated graphical user interface, all while maintaining protocol compliance.
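The decoupling can be illustrated without any MCP library at all. In this sketch, the function that builds the protocol response never knows how the answers were gathered; it only calls a pluggable collector. The names answer_elicitation, preset_collector, and declining_collector are hypothetical, and the preset collector stands in for whatever console, GUI, or chat front end a real client would use.

```python
def answer_elicitation(request, collect):
    """Turn an elicitation/create request into a JSON-RPC response.

    `collect` is any callable that takes (message, schema) and returns a
    dict of answers, or None if the user declines.
    """
    params = request["params"]
    answers = collect(params["message"], params["requestedSchema"])
    if answers is None:
        result = {"action": "decline"}
    else:
        result = {"action": "accept", "content": answers}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

def preset_collector(answers):
    """Stand-in for a form UI: returns pre-filled answers for the requested fields."""
    def collect(message, schema):
        return {name: answers[name] for name in schema["properties"]}
    return collect

def declining_collector(message, schema):
    """Stand-in for a user who dismisses the prompt."""
    return None

request = {
    "jsonrpc": "2.0",
    "id": "elicit-1",
    "method": "elicitation/create",
    "params": {
        "message": "Please provide your information.",
        "requestedSchema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            "required": ["name", "age"],
        },
    },
}

accepted = answer_elicitation(request, preset_collector({"name": "Ada", "age": 36}))
declined = answer_elicitation(request, declining_collector)
print(accepted["result"])
print(declined["result"])
```

Swapping the collector changes the user experience but not a single byte of the protocol exchange, which is the property the paragraph above describes.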

My Thoughts

The introduction of Elicitation is a pivotal moment for the MCP ecosystem, elevating it from a simple tool-calling protocol to a framework for building complex, interactive AI applications. The ability to put a human in the loop dynamically and securely is essential for enterprise-grade solutions, particularly in regulated industries or for high-stakes actions.

However, as with any new feature, there are limitations and considerations. A major concern is security. As Janakiram MSV points out in his talk, Elicitation should not be used for sensitive information like passwords or personal identifiers, as the protocol itself does not mandate encryption. Developers must be aware of this and use the appropriate, and recently updated, OAuth and authorization standards within the MCP specification for handling authentication flows and sensitive data².
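One way a client might act on this advice is to refuse elicitation requests whose schema appears to ask for secrets. The sketch below is a name-based heuristic only: the hint list and function names are assumptions invented for this example, not something the MCP specification defines, and a real client would likely also show the request to the user before answering.

```python
# Hypothetical substrings that suggest a field is asking for a secret.
SENSITIVE_HINTS = ("password", "passwd", "secret", "token", "ssn", "card_number")

def find_sensitive_fields(requested_schema):
    """Return the property names that look like they request sensitive data."""
    properties = requested_schema.get("properties", {})
    return sorted(
        name for name in properties
        if any(hint in name.lower() for hint in SENSITIVE_HINTS)
    )

def guard_elicitation(request):
    """Return a decline response if the schema looks sensitive, otherwise None."""
    flagged = find_sensitive_fields(request["params"]["requestedSchema"])
    if flagged:
        print(f"Declining elicitation; suspicious fields: {flagged}")
        return {"jsonrpc": "2.0", "id": request["id"], "result": {"action": "decline"}}
    return None  # safe to hand over to the normal elicitation handler

risky = {
    "jsonrpc": "2.0", "id": 11, "method": "elicitation/create",
    "params": {
        "message": "Log in to continue.",
        "requestedSchema": {
            "type": "object",
            "properties": {"username": {"type": "string"}, "Password": {"type": "string"}},
        },
    },
}
harmless = {
    "jsonrpc": "2.0", "id": 12, "method": "elicitation/create",
    "params": {
        "message": "Which city?",
        "requestedSchema": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}

risky_reply = guard_elicitation(risky)
harmless_reply = guard_elicitation(harmless)
```

A name check like this is easy to evade, so it complements rather than replaces proper authorization flows.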

Furthermore, while the concept of a dynamic schema is powerful, it also places a burden on the client developer to create a robust and adaptable UI. A simple console input is one thing, but a production-ready client must be able to dynamically render forms or UI elements based on the incoming schema, which can become complex.
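As a small illustration of schema-driven rendering, the sketch below maps a flat elicitation schema to generic form-field descriptors and converts raw text input back into typed values. The widget names and both function names are hypothetical; a real client would map the descriptors onto its own UI toolkit.

```python
# Assumed mapping from JSON Schema primitive types to generic widget kinds.
WIDGETS = {"string": "text", "integer": "number", "number": "number", "boolean": "checkbox"}

def schema_to_fields(requested_schema):
    """Describe one form field per schema property."""
    required = set(requested_schema.get("required", []))
    fields = []
    for name, spec in requested_schema.get("properties", {}).items():
        # A string property with an enum becomes a drop-down.
        widget = "select" if "enum" in spec else WIDGETS.get(spec.get("type"), "text")
        fields.append({
            "name": name,
            "label": spec.get("title", name),
            "widget": widget,
            "required": name in required,
            "choices": spec.get("enum", []),
        })
    return fields

def coerce(spec, raw):
    """Convert raw text from the UI into the type the schema asks for."""
    kind = spec.get("type")
    if kind == "integer":
        return int(raw)
    if kind == "number":
        return float(raw)
    if kind == "boolean":
        return raw.strip().lower() in ("true", "yes", "y", "1")
    return raw

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "title": "Full name"},
        "age": {"type": "integer"},
        "subscribe": {"type": "boolean"},
        "plan": {"type": "string", "enum": ["free", "pro"]},
    },
    "required": ["name", "age"],
}

fields = schema_to_fields(schema)
for field in fields:
    print(field)
```

Because the schema is restricted to primitive types, a mapping table like this covers most cases; the complexity grows mainly in validation messages and layout.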

Looking ahead, I believe Elicitation will be a catalyst for a new class of "agentic applications" where the user and the AI agent work together to complete a task. Future improvements could include multi-turn elicitation and more sophisticated data types beyond the currently supported primitive types³. This would enable even more granular and complex workflows, such as guided form filling or multi-step troubleshooting.

Acknowledgements

I want to extend my sincere thanks to Janakiram MSV for his insightful talk, Implementing Elicitation: Bringing Human in the Loop to MCP Workflows. His detailed walkthrough of Streamable HTTP and the practical demos using FastMCP provided an invaluable and accessible look into the inner workings of this critical new feature. His expertise and clear explanations are a tremendous asset to the broader Model Context Protocol community.

References
