Om Shree
Spring AI SDK for Amazon Bedrock AgentCore: Build Production-Ready Java AI Agents

Java developers have always had a rough deal with agentic AI. Proofs of concept are easy enough: wrap a model call, return a string. But taking that to production means custom controllers, SSE streaming handlers, health check endpoints, rate limiting, memory repositories... weeks of infrastructure work before you've written a single line of actual agent logic.

AWS just GA'd the Spring AI AgentCore SDK, and it collapses most of that into a single annotation.

What's Amazon Bedrock AgentCore?

AgentCore is AWS's managed platform for running AI agents at scale. It handles the infrastructure layer — scaling, reliability, security, observability — and provides building blocks like short and long-term memory, browser automation, and sandboxed code execution.

The problem until now: integrating all of that into a Spring application required implementing the AgentCore Runtime contract yourself. Two specific endpoints (/invocations and /ping), SSE streaming with proper framing, health status signaling for long-running tasks, and all the Spring wiring on top. Not impossible, but tedious and error-prone.

The SDK handles all of it automatically.
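For a sense of what the SDK replaces, a hand-rolled version of the contract would look roughly like this. This is a simplified sketch, not the SDK's code: the real contract also covers SSE framing, async busy-status reporting, and the exact request/response schemas, and the payload shape here is assumed for illustration.

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

import java.util.Map;

@RestController
public class ManualAgentCoreController {

    // POST /invocations: deserialize the payload, run agent logic, serialize a reply
    @PostMapping("/invocations")
    public ResponseEntity<Map<String, String>> invoke(@RequestBody Map<String, Object> payload) {
        String prompt = String.valueOf(payload.get("prompt"));
        // ... call your model here ...
        return ResponseEntity.ok(Map.of("result", "reply to: " + prompt));
    }

    // GET /ping: health signal AgentCore polls for scaling decisions
    @GetMapping("/ping")
    public Map<String, String> ping() {
        return Map.of("status", "Healthy");
    }
}
```

And this is before streaming, backpressure, or signaling that a long-running task is still in flight.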

The Core Idea: One Annotation

Here's a complete, AgentCore-compatible AI agent:

@Service
public class MyAgent {

    private final ChatClient chatClient;

    public MyAgent(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @AgentCoreInvocation
    public String chat(PromptRequest request) {
        return chatClient.prompt()
            .user(request.prompt())
            .call()
            .content();
    }
}

record PromptRequest(String prompt) {}

The @AgentCoreInvocation annotation auto-configures the /invocations POST endpoint and the /ping health endpoint, handles JSON serialization, detects async tasks and reports busy status so AgentCore doesn't scale down mid-execution, and manages response formatting. No custom controllers.
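With the application running locally, the generated contract can be exercised directly. Port 8080 is the SDK default; the request body here is whatever your record deserializes from, so this shape is illustrative:

```shell
# Invoke the agent; the JSON body maps onto PromptRequest
curl -s -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Spring AI?"}'

# Health endpoint AgentCore Runtime polls
curl -s http://localhost:8080/ping
```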

Want streaming? Change the return type:

@AgentCoreInvocation
public Flux<String> streamingChat(PromptRequest request) {
    return chatClient.prompt()
        .user(request.prompt())
        .stream()
        .content();
}

The SDK switches to SSE output automatically and handles framing, backpressure, and connection lifecycle.
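For intuition, the SSE wire format itself is simple: each chunk is prefixed with `data: ` and terminated by a blank line. A minimal sketch of the framing (illustrative only, not the SDK's internal implementation):

```java
public class SseFrame {
    // Produce one Server-Sent Events frame for a streamed chunk.
    // Wire format: "data: <payload>\n\n" (the blank line ends the event).
    static String frame(String chunk) {
        return "data: " + chunk + "\n\n";
    }

    public static void main(String[] args) {
        // Two model chunks become two SSE events on the wire.
        System.out.print(frame("Hello") + frame("world"));
    }
}
```

The hard parts the SDK takes on are everything around this format: flushing, client disconnects, and propagating Reactor backpressure into the HTTP layer.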

Adding Memory

The SDK integrates AgentCore Memory through Spring AI's advisor pattern — interceptors that enrich prompts with context before they hit the model.

Short-term memory uses a sliding window of recent messages. Long-term memory persists across sessions using four strategies: semantic (factual user info), user preference (explicit settings), summary (condensed history), and episodic (past interactions). AgentCore consolidates these asynchronously.

Configuration is minimal:

agentcore.memory.memory-id=${AGENTCORE_MEMORY_ID}
agentcore.memory.long-term.auto-discovery=true

Then compose it into your chat client:

@AgentCoreInvocation
public String chat(PromptRequest request, AgentCoreContext context) {
    String sessionId = context.getHeader(AgentCoreHeaders.SESSION_ID);

    return chatClient.prompt()
        .user(request.prompt())
        .advisors(agentCoreMemory.advisors)
        .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, "user:" + sessionId))
        .call()
        .content();
}

Auto-discovery mode detects available LTM strategies without manual configuration.

Browser and Code Execution as Tools

AgentCore exposes two additional capabilities as Spring AI tool callbacks via ToolCallbackProvider:

Browser automation — agents can navigate websites, extract content, take screenshots, and interact with page elements.

Code interpreter — agents write and run Python, JavaScript, or TypeScript in a secure sandbox. The sandbox includes numpy, pandas, and matplotlib. Generated files go through an artifact store.

Both are added as Maven dependencies and wired in through the constructor:

public MyAgent(
    ChatClient.Builder builder,
    AgentCoreMemory agentCoreMemory,
    @Qualifier("browserToolCallbackProvider") ToolCallbackProvider browserTools,
    @Qualifier("codeInterpreterToolCallbackProvider") ToolCallbackProvider codeInterpreterTools) {

    // Keep the memory bean for use as an advisor in chat calls
    this.agentCoreMemory = agentCoreMemory;
    this.chatClient = builder
        .defaultToolCallbacks(browserTools, codeInterpreterTools)
        .build();
}

The model decides which tool to call based on the user's request; both tool sets are equally visible to it.
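The corresponding starters are pulled in as ordinary Maven dependencies. The artifact IDs below are assumptions extrapolated from the runtime starter's naming; check the repo for the exact coordinates:

```xml
<!-- Hypothetical artifact IDs; verify against the spring-ai-agentcore repo -->
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-agentcore-browser</artifactId>
</dependency>
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-agentcore-code-interpreter</artifactId>
</dependency>
```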

MCP Integration via AgentCore Gateway

Spring AI agents can connect to organizational tools through AgentCore Gateway, which provides MCP support with outbound authentication and a semantic tool registry. Configure your Spring AI MCP client to point at Gateway:

spring.ai.mcp.client.toolcallback.enabled=true
spring.ai.mcp.client.initialized=false
spring.ai.mcp.client.streamable-http.connections.gateway.url=${GATEWAY_URL}

Gateway handles credential management for downstream services. Agents discover and invoke enterprise tools without managing auth themselves.

Deployment Options

AgentCore Runtime — package the app as an ARM64 container, push it to ECR, and create a Runtime pointing at the image. AWS handles scaling and health monitoring, and pricing is pay-per-use (no charge for idle compute). Terraform examples are in the repo.
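A minimal containerization sketch for the ARM64 requirement. The base image, jar path, and names are assumptions; adjust to your build:

```dockerfile
# Assumes a Spring Boot fat jar produced by `mvn package`
FROM --platform=linux/arm64 public.ecr.aws/amazoncorretto/amazoncorretto:21
COPY target/my-agent.jar /app/app.jar
# The SDK serves the contract on port 8080 by default
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

Build with `docker buildx build --platform linux/arm64` on an x86 machine, then push to your ECR repository.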

Standalone — use individual modules (Memory, Browser, Code Interpreter) in applications running on EKS, ECS, EC2, or on-premises. Teams can adopt incrementally — add memory to an existing Spring Boot service before considering a full migration to AgentCore Runtime.

Design Principles

The SDK is built around three ideas: convention over configuration (sensible defaults, port 8080, standard endpoint paths), annotation-driven development (one annotation replaces weeks of infrastructure code), and deployment flexibility (you're not locked into AgentCore Runtime to use the individual modules).

It's open source under Apache 2.0. The repo has five example applications ranging from a minimal agent to a full OAuth-authenticated setup with per-user memory isolation.

What's Coming

The team has flagged three upcoming additions: observability integration with CloudWatch, LangFuse, Datadog, and Dynatrace via OpenTelemetry; an evaluations framework for testing agent response quality; and advanced identity management for streamlined security context handling.

Getting Started

<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-agentcore-runtime-starter</artifactId>
</dependency>

Repo: github.com/spring-ai-community/spring-ai-agentcore

Docs: docs.aws.amazon.com/bedrock-agentcore

There's also a four-hour workshop that walks through building a travel and expense management agent from scratch — memory, browser, code execution, MCP integration, deployed serverless with auth. No ML experience required.


I cover AI infrastructure and developer tools on the Shreesozo YouTube channel. AI Infra Weekly drops every Friday.

Top comments (4)

Archit Mittal

Great walkthrough — the Spring AI + AgentCore combo is starting to feel mature enough for enterprise Java shops. The thing I'd flag for anyone adopting in production: get your observability story right before you ship. Spring AI's ObservationConvention integration is powerful but easy to miss — wiring Micrometer traces into every tool invocation + LLM call saves a ton of pain during incident reviews when you're trying to reconstruct what the agent actually did.

Om Shree

Thanks Sir !
Loved your Insights!!!

PEACEBINFLOW

The part that lands for me isn't the annotation itself—though collapsing weeks of boilerplate into @AgentCoreInvocation is genuinely useful. It's the quiet implication that Java might actually become a reasonable choice for AI agent development again.

For the last couple years, the agent space has felt like it belonged to Python and TypeScript. The ecosystem momentum, the SDKs, the tutorials, the "just clone this repo and run" starter kits—all of it pointed away from the JVM. Java developers were left wrapping REST calls to Python microservices or fighting with half-documented HTTP clients. It worked, but it never felt native.

What the SDK does is make AgentCore feel like a Spring module, not an external service you're awkwardly bolting onto your app. The advisor pattern for memory, the tool callback provider pattern for browser and code execution—these are Spring idioms. A Java developer who's never touched an LLM can look at this and think "oh, it's just like adding caching or transactions." That familiarity lowers the activation energy in a way that matters.

The deployment flexibility is the other piece that feels well-considered. You can use the memory module in your existing EKS deployment without committing to the full AgentCore Runtime. That's the kind of incremental adoption path that gets past architecture review boards. Start with one capability, prove value, expand later.

What I'm curious about is the cold start experience on AgentCore Runtime. The pay-per-use model with no idle cost is appealing, but if every invocation spins up a new container, the latency might make certain interaction patterns feel sluggish. Have you noticed any practical impact on response times when the runtime scales from zero, or is the container startup fast enough that users don't really feel it?

Om Shree

Thanks Sir !
Loved your Insights!!!