Rabinarayan Patra

Originally published at rabinarayanpatra.com

Spring AI 2.0 MCP Annotations: From Tool to Production

Spring AI 2.0.0-M6 dropped on May 8, 2026. Buried in the release notes was the thing I had been waiting for since I first wired an MCP server in Java six months ago: native annotations. @McpTool, @McpResource, @McpPrompt, @McpComplete. All in core. All auto-registered.

If you've written an MCP server with the older Spring AI ToolCallback API, you remember the ritual. Build descriptors, register callbacks, wire up the transport manually, handle JSON schema by hand. The annotation API replaces all of it with a single annotation on a method. Spring AI generates the schema. Auto-configuration handles registration. You write the business logic.

This tutorial walks the full path: building a production MCP server, exposing tools and resources, handling async work with progress reporting, picking a transport, registering with Claude Code, and the gotchas I've hit that nobody mentions in the docs.

What changed in Spring AI 2.0 MCP annotations?

Spring AI 2.0 collapses MCP server and client wiring into a small set of annotations that Spring Boot auto-configuration picks up at startup. The annotated beans become the MCP surface for your Spring Boot app.

The core server annotations are:

  • @McpTool marks a method as an MCP tool. Spring AI builds the JSON schema from your method parameters.
  • @McpResource exposes a resource via a URI template.
  • @McpPrompt exposes a prompt template that clients can fetch.
  • @McpComplete provides auto-completion for prompt or resource arguments.

The auto-configuration also injects special context parameters like McpSyncRequestContext and McpAsyncRequestContext, which give your method access to logging, progress reporting, sampling, and elicitation without polluting the JSON schema.

[Figure: Spring AI 2.0 MCP annotations layered architecture, from Java annotations through Spring Boot auto-configuration to the MCP transport]

For a Spring shop, this is the kind of API change that flips a project's complexity. Before, every team building an MCP server wrote the same plumbing. Now the plumbing is gone.

How do you build an MCP server with @McpTool?

Start with a Spring Boot 4.0+ project on Java 21 or higher; the Spring AI 2.0 line is built on Spring Boot 4 and Spring Framework 7, not Boot 3.x. Add the MCP server starter to your pom.xml. There are three flavors: stdio/SSE default, WebMVC, and WebFlux. For production HTTP transport, pick WebMVC or WebFlux based on whether you want blocking or reactive code.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
    <version>2.0.0-M6</version>
</dependency>

Spring AI 2.0 milestones live in the Spring milestone repository, so add it if you haven't:

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>

Now write a tool. The example everyone reaches for first is a calculator, but let's do something a real MCP client would actually want: fetching a user record from your database.

import org.springframework.ai.mcp.server.annotation.McpTool;
import org.springframework.ai.mcp.server.annotation.McpToolParam;
import org.springframework.stereotype.Component;

@Component
public class UserTools {

    private final UserRepository users;

    public UserTools(UserRepository users) {
        this.users = users;
    }

    @McpTool(
        name = "get_user",
        description = "Fetch a user record by ID. Returns name, email, and signup date."
    )
    public UserDto getUser(
        @McpToolParam(description = "Numeric user ID", required = true) long userId
    ) {
        return users.findById(userId)
            .map(UserDto::from)
            .orElseThrow(() -> new IllegalArgumentException("User not found: " + userId));
    }
}

That's the whole tool. Spring AI does a few things automatically when it scans this bean:

It builds a JSON schema for the userId parameter using the @McpToolParam description and the Java type. The MCP client sees the schema and shows it as a typed input.

It registers the method with the MCP server runtime under the name get_user. The MCP tools/list request returns it. The tools/call request invokes it.

It serializes the return type using Jackson. UserDto becomes a JSON object in the tool result.

You can also return Map<String, Object> if you want loose typing, or a record if you want a strictly typed result that clients can deserialize reliably. Records are my default: the serialization is predictable, the schema follows from the component types, and there is less code than in a Lombok-annotated class.
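The tool above calls UserDto::from, which isn't shown. Here is a minimal sketch of what that record could look like, assuming a hypothetical User entity with the fields below. The entity shape and field names are my assumptions for illustration, not something the starter prescribes.

```java
import java.time.LocalDate;

// Hypothetical persistence entity; stands in for whatever UserRepository returns.
record User(long id, String name, String email, String passwordHash, LocalDate signupDate) {}

// The tool's return type. Only fields that are safe to show a client are copied over.
record UserDto(String name, String email, LocalDate signupDate) {

    // Compact constructor: fail fast on missing data instead of serializing nulls.
    UserDto {
        if (name == null || email == null || signupDate == null) {
            throw new IllegalArgumentException("name, email, and signupDate are required");
        }
    }

    // Maps the entity to the DTO; passwordHash is dropped on purpose.
    static UserDto from(User user) {
        return new UserDto(user.name(), user.email(), user.signupDate());
    }
}
```

The point of the separate DTO is that the tool result is an explicit allow-list. Adding a column to the entity later does not silently widen what an MCP client can read.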

How do you handle progress and async work?

The most common reason a tool feels broken in production is that it does real work without telling the client. The client sits there. The user sits there. Eventually something times out. Spring AI 2.0 fixes this with a request context that exposes a progress channel.

import org.springframework.ai.mcp.server.annotation.McpTool;
import org.springframework.ai.mcp.server.annotation.McpToolParam;
import org.springframework.ai.mcp.server.context.McpSyncRequestContext;
import org.springframework.stereotype.Component;

@Component
public class ReportTools {

    private final ReportService reports;

    public ReportTools(ReportService reports) {
        this.reports = reports;
    }

    @McpTool(
        name = "generate_quarterly_report",
        description = "Generate the quarterly revenue report. Takes a few seconds."
    )
    public ReportResult generateReport(
        McpSyncRequestContext ctx,
        @McpToolParam(description = "Quarter, e.g. 2026-Q1", required = true) String quarter
    ) {
        ctx.logging().info("Loading transactions for " + quarter);
        ctx.progress().report(0.1, "Loading transactions");

        var transactions = reports.loadTransactions(quarter);
        ctx.progress().report(0.5, "Aggregating");

        var aggregated = reports.aggregate(transactions);
        ctx.progress().report(0.9, "Rendering");

        return reports.render(aggregated);
    }
}

McpSyncRequestContext does not appear in the JSON schema. Spring AI knows it's a framework parameter and excludes it. The user-facing parameters stay clean. Inside the method you get logging(), progress(), sampling(), and elicitation() channels, all wired to the MCP client transport.

For reactive code, use McpAsyncRequestContext and return a Mono or Flux. The progress channel returns Reactor types so you can compose progress reporting into a reactive chain.

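@McpTool(name = "async_report", description = "Generate report asynchronously.")
public Mono<ReportResult> asyncReport(
    McpAsyncRequestContext ctx,
    @McpToolParam(description = "Quarter") String quarter
) {
    return reports.loadAsync(quarter)
        // Chain the progress notification into the pipeline instead of calling
        // subscribe() inside doOnNext, so it shares the chain's lifecycle and
        // error handling. A named lambda parameter also compiles on Java 21,
        // where the unnamed "_" is still a preview feature.
        .delayUntil(aggregated -> ctx.progress().report(0.5, "Aggregated"))
        .flatMap(reports::renderAsync);
}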

One thing to flag: the progress channel is not the same thing as streaming results. Progress is metadata. The result is still a single return value. If you want token-by-token streaming, you want sampling, not tools.

How do you expose prompts and resources?

Tools are the most-used MCP primitive, but prompts and resources are what turn a tool collection into a workspace. Prompts let the client request a prefilled prompt template. Resources let the client browse data your server exposes by URI.

A prompt looks like this:

@Component
public class IncidentPrompts {

    @McpPrompt(
        name = "incident_summary",
        description = "Generate an incident summary from a Jira ticket ID."
    )
    public String incidentSummary(
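        // Prompt arguments are declared with @McpArg; @McpToolParam is for tool parameters.
        @McpArg(name = "ticketId", description = "Jira ticket ID, e.g. INC-1234", required = true) String ticketId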
    ) {
        return """
            Summarize the incident in ticket %s for an executive audience.
            Include: timeline, root cause, customer impact, and remediation.
            """.formatted(ticketId);
    }
}

When the client asks for the incident_summary prompt with INC-1234, the server returns the rendered string. The client passes it to its model.

Resources are different. They expose data the server holds, keyed by URI:

@Component
public class CustomerResources {

    private final CustomerRepository customers;

    public CustomerResources(CustomerRepository customers) {
        this.customers = customers;
    }

    @McpResource(
        uri = "customer://{customerId}/profile",
        description = "Customer profile data including billing address and tier."
    )
    public CustomerProfile profile(String customerId) {
        return customers.profileFor(customerId);
    }
}

The client can list resources, pick one, and read it. The URI template is matched on the path variable. Spring AI extracts the customerId and passes it to the method.
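To make the template matching concrete, here is a toy matcher in plain Java. It is a sketch of the general idea (turn each {variable} into a capture group, then read the groups back by name), not Spring AI's actual implementation; the UriTemplate class name and its behavior are my own assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy URI-template matcher; illustrates the idea only, not Spring AI's implementation.
final class UriTemplate {

    private static final Pattern VARIABLE = Pattern.compile("\\{(\\w+)\\}");

    private final Pattern pattern;
    private final List<String> names = new ArrayList<>();

    UriTemplate(String template) {
        StringBuilder regex = new StringBuilder();
        Matcher m = VARIABLE.matcher(template);
        int last = 0;
        while (m.find()) {
            // Quote literal text so characters like '.' or '?' match themselves.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            // Each {variable} matches exactly one segment: anything except '/'.
            regex.append("([^/]+)");
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        this.pattern = Pattern.compile(regex.toString());
    }

    // Returns the extracted variables, or empty if the URI does not match the template.
    Optional<Map<String, String>> match(String uri) {
        Matcher m = pattern.matcher(uri);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> vars = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            vars.put(names.get(i), m.group(i + 1));
        }
        return Optional.of(vars);
    }
}
```

With the template customer://{customerId}/profile, a read of customer://c-42/profile yields customerId = c-42, while customer://c-42/orders does not match at all. One consequence of the single-segment rule in this sketch: an ID containing a slash would need to be encoded by the client.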

The combination of tools, prompts, and resources is what makes MCP feel like an actual application surface rather than a function bag. Tools do work. Prompts give the client wording. Resources expose state.

How do you choose between transports (stdio, SSE, Streamable HTTP)?

Spring AI supports four transports, and the choice matters more than the docs make it sound.

stdio is for local single-process tools. Claude Code spawns your server as a child process and talks to it over stdin and stdout. Fine for personal tooling. Bad for anything multi-user, multi-tenant, or networked.

SSE was the original HTTP transport for MCP. It's being phased out. Spring AI still ships it for backward compatibility, but new servers should not start there.

Streamable HTTP is the current MCP HTTP transport. Stateful, supports bidirectional notifications, works behind reverse proxies, fits production environments. Use this for any server that isn't strictly local.

Stateless Streamable HTTP is the new variant designed for serverless and horizontally scaled deployments. No session affinity required. The tradeoff is that any context that would have lived in the server session has to come back through the request from the client. If you're deploying to Vercel, Cloud Run, or any autoscaled platform, this is your transport.

[Figure: MCP transport comparison. stdio for local use, SSE deprecated, Streamable HTTP for production, Stateless Streamable HTTP for serverless]

For Streamable HTTP with WebMVC, the auto-config wires it up at /mcp by default. You can change the path:

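# Property names as documented for the Spring AI MCP server Boot starter;
# check them against the reference docs for the exact version you run.
spring:
  ai:
    mcp:
      server:
        protocol: STREAMABLE
        name: my-spring-server
        version: 1.0.0
        streamable-http:
          mcp-endpoint: /api/mcp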

The name and version show up in the MCP initialize response. Set them to something recognizable. Default Spring AI names look generic in client UIs.

How do you register and test the server with Claude Code?

Once your Spring Boot app runs and exposes Streamable HTTP at /mcp, register it with Claude Code. For a project-scoped server, the config lives in a .mcp.json file at the project root; you can also add the server with the claude mcp add CLI.

{
  "mcpServers": {
    "my-spring-server": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}

Restart Claude Code. Run /mcp inside a session. You should see your tools, prompts, and resources listed. Try calling one. If the call returns and the result shows up in your transcript, the loop is alive.

For local stdio testing, set type to stdio, set command to java, and pass your packaged jar in args:

{
  "mcpServers": {
    "my-spring-server": {
      "type": "stdio",
      "command": "java",
      "args": ["-jar", "/path/to/server.jar"]
    }
  }
}

Claude Code will spawn the JVM, talk over stdin and stdout, and shut it down when the session ends. The startup cost is real, somewhere between 2 and 5 seconds depending on your dependencies. For HTTP transport, Spring Boot stays running and connections are cheap.

What are the production-readiness gotchas?

The annotation API hides the complexity. That's a feature for most cases, but it also means a few production concerns aren't obvious until they bite.

[Figure: Six production gotchas for Spring AI MCP servers: error handling, schema generation, auth, long-running tools, scaling, tool naming]

Error handling. If your @McpTool method throws, Spring AI wraps the exception and returns an MCP error to the client. Good. But the default error mapping returns the exception message verbatim, which can leak internal details. Customize the error handling with a McpExceptionHandler bean if you're exposing the server to untrusted clients.
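Independent of any framework hook, you can also mask errors inside the tool method itself. The following is a minimal plain-Java sketch under my own assumptions: SafeErrors, guard, and ClientFacingException are hypothetical names, and treating IllegalArgumentException as "safe to show" is a policy choice you would want to review for your own codebase.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical helper: turns unexpected failures into a generic message plus a correlation ID.
final class SafeErrors {

    // Exception type whose message is considered safe to show to an MCP client.
    static final class ClientFacingException extends RuntimeException {
        ClientFacingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Runs the tool body. Validation errors pass through; everything else is masked.
    static <T> T guard(Supplier<T> body) {
        try {
            return body.get();
        } catch (IllegalArgumentException e) {
            // Assumption: these messages were written for the caller, e.g. "User not found: 42".
            throw new ClientFacingException(e.getMessage(), e);
        } catch (RuntimeException e) {
            String id = UUID.randomUUID().toString();
            // Record the real cause server-side under the ID (a real logger is omitted here).
            System.err.println("tool failure " + id + ": " + e);
            throw new ClientFacingException("Internal error, reference " + id, e);
        }
    }
}
```

A tool body then becomes return SafeErrors.guard(() -> ...);. The client gets a reference ID it can quote in a bug report, and the stack trace with hostnames and SQL stays in your logs.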

Schema generation. Records and POJOs work. Generic collections work. Nested polymorphic types are where the generator gets cranky. If your tool returns a List<Animal> where Animal is an interface, expect the schema to be vague. Prefer concrete types for tool return values. Save the polymorphism for internal layers.
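Why an interface yields a vague schema is easier to see with a toy generator. This sketch derives a JSON-schema-like map from record components using reflection. It is not how Spring AI's generator works internally (ToySchema and the Animal, Dog, and Kennel types are made up for illustration), but it hits the same wall: an interface has no fields to inspect.

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy schema generator; not Spring AI's. Handles a few scalar types and records.
// Note: a self-referential record would recurse forever in this sketch.
final class ToySchema {

    static Map<String, Object> of(Class<?> type) {
        if (type == String.class) {
            return Map.of("type", "string");
        }
        if (type == int.class || type == long.class
                || type == Integer.class || type == Long.class) {
            return Map.of("type", "integer");
        }
        if (type == boolean.class || type == Boolean.class) {
            return Map.of("type", "boolean");
        }
        if (type.isRecord()) {
            Map<String, Object> props = new LinkedHashMap<>();
            for (RecordComponent c : type.getRecordComponents()) {
                props.put(c.getName(), of(c.getType()));
            }
            Map<String, Object> schema = new LinkedHashMap<>();
            schema.put("type", "object");
            schema.put("properties", props);
            return schema;
        }
        // Interfaces and abstract types expose no components, so nothing more can be said.
        return Map.of("type", "object");
    }
}

interface Animal {}
record Dog(String name, int age) implements Animal {}
record Kennel(Animal resident) {}
```

ToySchema.of(Dog.class) lists name and age with their types. ToySchema.of(Kennel.class) can only say that resident is an object, which is all a client would learn about an Animal-typed field.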

Auth. Spring AI does not ship MCP-specific auth. You wire it through normal Spring Security. For Streamable HTTP, add a security filter chain matching /mcp and validate bearer tokens or API keys there. The Stateless Streamable HTTP transport makes this easier because there's no session to protect, only requests.
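Whatever filter you put in front of /mcp, the core of an API-key check is small. Here is a sketch of that check in plain Java, with no Spring Security types involved; BearerTokens and its method names are hypothetical, and in a real app the expected key would come from a secret store rather than a string literal.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Optional;

// Sketch of the check a security filter in front of /mcp could perform.
final class BearerTokens {

    private static final String PREFIX = "Bearer ";

    // Extracts the token from an Authorization header value, if it is well-formed.
    static Optional<String> extract(String authorizationHeader) {
        if (authorizationHeader == null
                || !authorizationHeader.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return Optional.empty();
        }
        String token = authorizationHeader.substring(PREFIX.length()).trim();
        return token.isEmpty() ? Optional.empty() : Optional.of(token);
    }

    // Compares against the expected key without stopping at the first differing byte.
    static boolean matches(String presented, String expected) {
        return MessageDigest.isEqual(
                presented.getBytes(StandardCharsets.UTF_8),
                expected.getBytes(StandardCharsets.UTF_8));
    }

    // True only for a well-formed Bearer header carrying the expected key.
    static boolean isAuthorized(String authorizationHeader, String expectedKey) {
        return extract(authorizationHeader)
                .map(token -> matches(token, expectedKey))
                .orElse(false);
    }
}
```

MessageDigest.isEqual is used instead of String.equals so the comparison time does not depend on how many leading characters of a guess are correct.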

Long-running tools. MCP clients have timeouts. Claude Code's default is generous but not infinite. If your tool runs longer than 30 seconds, use the progress channel aggressively, and consider returning a job handle and exposing a second tool to poll status. Trying to hold open a 5-minute synchronous tool call will end in tears.
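The job-handle pattern is simple enough to sketch without any MCP types. The registry below is a hypothetical in-memory version: one method you would call from a "start" tool, one from a "status" tool. The class and method names are mine, and a real server would also expire finished jobs and bound the thread pool.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch of the "start, then poll" pattern. In-memory only; if you scale out,
// this state has to live in a shared store instead.
final class JobRegistry<T> {

    enum State { RUNNING, DONE, FAILED, UNKNOWN }

    record Status<T>(State state, T result, String error) {}

    private final Map<String, CompletableFuture<T>> jobs = new ConcurrentHashMap<>();

    // Body of a hypothetical start_report tool: returns immediately with a handle.
    String start(Supplier<T> work) {
        String jobId = UUID.randomUUID().toString();
        jobs.put(jobId, CompletableFuture.supplyAsync(work));
        return jobId;
    }

    // Body of a hypothetical get_report_status tool: never blocks on unfinished work.
    Status<T> poll(String jobId) {
        CompletableFuture<T> future = jobs.get(jobId);
        if (future == null) {
            return new Status<>(State.UNKNOWN, null, "No such job: " + jobId);
        }
        if (!future.isDone()) {
            return new Status<>(State.RUNNING, null, null);
        }
        if (future.isCompletedExceptionally()) {
            return new Status<>(State.FAILED, null, "Job failed");
        }
        // Safe to join: the future has already completed normally.
        return new Status<>(State.DONE, future.join(), null);
    }
}
```

Each tool call now returns in milliseconds, so client timeouts stop being a concern, and the agent decides how often to poll. The tradeoff is that the job state outlives the request, which is exactly the kind of server-side state the scaling gotcha is about.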

Scaling. Streamable HTTP is stateful per session. Sticky sessions or a shared session store are needed if you scale horizontally. If you want stateless scaling, use the Stateless Streamable HTTP transport and design your tools to be self-contained on each request.

Tool naming. MCP tool names show up in the client UI. Pick verbs. get_user reads better than userQuery. Consistency matters more than cleverness.

Conclusion

The Spring AI 2.0 MCP annotation API is the kind of upgrade you only notice fully when you look back at the old way. The boilerplate is gone. The schema generation is automatic. The transports are real. The progress and async story is sane.

The part I want more Spring teams to internalize: MCP is no longer an experiment. Every major coding agent speaks it. Every Spring service in your fleet is a potential MCP surface. The cost of exposing your internal APIs to an agent has dropped to almost nothing. Whether that's a good idea is a different conversation, but the technical barrier is now low enough that you should decide on purpose, not by default.

For more on Spring AI MCP, see the Spring AI MCP annotations docs, the Spring AI 2.0.0-M6 release post, and the official MCP spec.
