
Ashish Sharda


I Built an AI Agent in Java (No Python. No Hype. Just Code.)

Everyone told me I needed Python for AI. I didn't listen. Here's what happened.


AI Agent in Java

Let me be real with you.

Every time I say "I'm building an AI agent," people assume I'm wrist-deep in Python virtual environments, pip dependencies, and a LangChain tutorial from 2023. And when I say "in Java?" — I get the look. You know the one.

So I built it anyway.

A fully functional AI agent. With tool use. With RAG. With MCP. Running on the JVM. Spring Boot 3.5, zero Python sidecars, no regrets.

Here's exactly how I did it — with real code you can run today.


Why Java for AI? (The Short Version)

The honest answer: because that's where my backend already lives.

Python is great for training models. But if your production system is Java — and for most of us in enterprise land, it is — then integrating AI means either maintaining a Python sidecar service, doing HTTP hops between runtimes, or just... not doing it cleanly.

Spring AI changes that equation completely. As of Spring AI 1.1.5 (released April 27, 2026 — yes, last week), you get:

  • A ChatClient that works with 20+ AI model providers (OpenAI including GPT-5, Anthropic Claude, Ollama, Google Gemini, Azure OpenAI, and more)
  • A full Advisors API for RAG and conversation memory
  • Native MCP (Model Context Protocol) support — servers and clients, annotation-driven
  • Prompt caching for Anthropic and AWS Bedrock (up to 90% cost reduction)
  • Switching AI providers = one line in application.yml

💡 Heads up for the curious: Spring AI 2.0 is in active milestone with GA targeting late May 2026. It moves to a Spring Boot 4.0 baseline and adds full null-safety via JSpecify. The 1.1.x line is stable and production-ready right now — that's what we're using here.

Let's build something real.


What We're Building

A Research Agent that can:

  1. Accept a natural language question from an HTTP endpoint
  2. Search a knowledge base (RAG) for relevant context
  3. Call external tools via MCP (a news search tool)
  4. Return a grounded, intelligent answer

No LangChain. No Python. Just Java 21 + Spring Boot 3.5 + Spring AI 1.1.5.


Project Setup

Dependencies (pom.xml)

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-bom</artifactId>
      <version>1.1.5</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <!-- Spring Boot Web -->
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
  </dependency>

  <!-- Spring AI - Anthropic Claude -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
  </dependency>

  <!-- Spring AI - MCP Client -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-client</artifactId>
  </dependency>

  <!-- Spring AI - Vector Store (in-memory for this demo) -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-simple</artifactId>
  </dependency>
</dependencies>

💡 Swap spring-ai-starter-model-anthropic for spring-ai-starter-model-openai and change one config line. The ChatClient code stays identical. That's the point. One caveat for RAG: the vector store needs an EmbeddingModel, and Anthropic doesn't offer an embeddings API — so add an embedding starter alongside it (for example spring-ai-starter-model-transformers for local ONNX embeddings, or the OpenAI starter).


Step 1 — Configure the Model and MCP

# application.yml
spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-sonnet-4-20250514
    mcp:
      client:
        name: research-agent-client
        version: 1.0.0
        tool-callbacks-enabled: true
        streamable-http:
          connections:
            news-server:
              url: http://localhost:8090  # Our MCP tool server (built below)

That's it for config. Spring Boot auto-configuration handles the rest.


Step 2 — Build the MCP Tool Server

This is the agent's "hands." A separate Spring Boot app that exposes tools the AI can call.
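The tool server also needs the MCP server starter on its classpath. A pom fragment, assuming the WebMVC variant (which serves the Streamable HTTP transport the client config above expects):

```xml
<!-- MCP server over Streamable HTTP (WebMVC transport) -->
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
</dependency>
```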

@SpringBootApplication
public class NewsToolServer {
    public static void main(String[] args) {
        SpringApplication.run(NewsToolServer.class, args);
    }
}
@Service
public class NewsSearchTool {

    @Tool(description = "Search for recent news articles on a given topic. " +
                         "Returns a list of relevant headlines and summaries.")
    public List<NewsResult> searchNews(
            @ToolParam(description = "The topic or keyword to search for") String topic,
            @ToolParam(description = "Max number of results to return") int maxResults) {

        // In production: call a real news API (NewsAPI, Bing, etc.)
        // For this demo, we return simulated results
        return List.of(
            new NewsResult(
                "Java Sees Surge in AI Workloads in 2026",
                "Enterprise teams are increasingly choosing Java for AI production systems..."
            ),
            new NewsResult(
                "Spring AI 1.1.5 Ships with Security Fixes and OpenAI SDK Integration",
                "JVM developers can now build AI agents without Python sidecars..."
            )
        ).stream().limit(maxResults).toList();
    }
}

public record NewsResult(String headline, String summary) {}
# application.yml for the tool server
server:
  port: 8090
spring:
  ai:
    mcp:
      server:
        name: news-server
        version: 1.0.0

Spring Boot auto-configuration discovers the @Tool annotation and registers searchNews as an MCP-exposed tool over Streamable HTTP. Zero registration boilerplate: annotate a method and you're done.
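If your exact Spring AI version doesn't auto-register bare @Tool methods with the MCP server, the explicit fallback is a single bean in the tool server — a sketch using Spring AI's MethodToolCallbackProvider:

```java
@Bean
public ToolCallbackProvider newsTools(NewsSearchTool newsSearchTool) {
    // Scans newsSearchTool for @Tool methods and exposes them as callbacks
    return MethodToolCallbackProvider.builder()
        .toolObjects(newsSearchTool)
        .build();
}
```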


Step 3 — Wire the Agent (The Good Part)

Back in our main application. This is where it all comes together.

@Configuration
public class AgentConfig {

    @Bean
    public ChatClient researchAgent(
            ChatClient.Builder builder,
            ToolCallbackProvider mcpTools,
            VectorStore vectorStore) {

        return builder
            // Give the agent access to MCP tools (news search, etc.)
            .defaultToolCallbacks(mcpTools)
            // RAG advisor — searches vector store before every prompt
            .defaultAdvisors(
                new QuestionAnswerAdvisor(vectorStore),
                new MessageChatMemoryAdvisor(new InMemoryChatMemory())
            )
            // System prompt defines agent behavior
            .defaultSystem("""
                You are a research assistant with access to real-time news tools
                and a curated knowledge base. Always cite your sources.
                When answering questions, first check your knowledge base,
                then use available tools to find current information.
                Be concise, accurate, and honest about what you don't know.
                """)
            .build();
    }
}

The ToolCallbackProvider is the key abstraction here. Spring AI auto-populates it with every tool discovered from connected MCP servers, sends the tool schemas to the LLM with the initial prompt, executes tool calls when the model decides it needs them, and feeds results back — all transparently.
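To make that loop concrete, here's a framework-free sketch of a single tool-calling round trip. Everything in it — ToolLoopSketch, ModelTurn, fakeModel — is illustrative shorthand for what Spring AI does internally, not framework API:

```java
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a hand-rolled version of the loop Spring AI runs for you.
public class ToolLoopSketch {

    // A "model response" is either a tool call request or a final answer.
    record ModelTurn(String toolName, String toolArg, String finalAnswer) {
        boolean wantsTool() { return toolName != null; }
    }

    // Stand-in for the LLM: first turn requests a tool, second turn answers.
    static ModelTurn fakeModel(String prompt, String toolResult) {
        if (toolResult == null) {
            return new ModelTurn("searchNews", "Java AI adoption", null);
        }
        return new ModelTurn(null, null, "Based on the news: " + toolResult);
    }

    public static void main(String[] args) {
        // Tool registry: name -> implementation (what ToolCallbackProvider abstracts).
        Map<String, Function<String, String>> tools = Map.of(
            "searchNews", topic -> "1 headline about '" + topic + "'"
        );

        String prompt = "What is happening with Java AI adoption?";
        ModelTurn turn = fakeModel(prompt, null);

        // The framework's job: execute the requested tool, feed the result back.
        while (turn.wantsTool()) {
            String result = tools.get(turn.toolName()).apply(turn.toolArg());
            turn = fakeModel(prompt, result);
        }

        // prints: Based on the news: 1 headline about 'Java AI adoption'
        System.out.println(turn.finalAnswer());
    }
}
```

The real loop also serializes tool schemas to JSON and handles multiple parallel tool calls, but the shape is the same: the model decides, the framework executes.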


Step 4 — The REST Endpoint

@RestController
@RequestMapping("/agent")
public class ResearchAgentController {

    private final ChatClient researchAgent;
    private final VectorStore vectorStore;

    // Both collaborators are constructor-injected; Spring MVC does not
    // resolve arbitrary beans as handler method arguments
    public ResearchAgentController(ChatClient researchAgent, VectorStore vectorStore) {
        this.researchAgent = researchAgent;
        this.vectorStore = vectorStore;
    }

    @GetMapping("/ask")
    public AgentResponse ask(@RequestParam String question,
                             @RequestParam(defaultValue = "session-1") String sessionId) {

        String answer = researchAgent.prompt()
            .user(question)
            .advisors(advisor -> advisor.param(
                MessageChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY, sessionId))
            .call()
            .content();

        return new AgentResponse(question, answer);
    }

    // Seed the knowledge base
    @PostMapping("/knowledge")
    public void addKnowledge(@RequestBody KnowledgeRequest req) {
        vectorStore.add(List.of(
            new Document(req.content(), Map.of("source", req.source()))
        ));
    }

    public record AgentResponse(String question, String answer) {}
    public record KnowledgeRequest(String content, String source) {}
}

Let's See It Work

Start the news tool server on port 8090, then the main agent on port 8080.

Seed some knowledge:

curl -X POST http://localhost:8080/agent/knowledge \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Spring AI 1.1.5 supports 20+ AI model providers including OpenAI with GPT-5, Anthropic Claude, Google Gemini, Ollama, and Azure OpenAI. It provides a unified ChatClient API.",
    "source": "Spring AI release notes"
  }'

Ask the agent:

curl "http://localhost:8080/agent/ask?question=What+AI+models+does+Spring+AI+support+and+what+is+happening+with+Java+AI+adoption?"

What happens under the hood:

  1. QuestionAnswerAdvisor searches the vector store and injects relevant context
  2. The model sees the question + retrieved docs
  3. The model decides it wants current news → calls searchNews("Java AI adoption", 3) via MCP
  4. Spring AI executes the tool call, feeds results back to the model
  5. The model synthesizes a final grounded answer
  6. MessageChatMemoryAdvisor stores this exchange for the next turn in the session

All of that — tool use, RAG, memory — in a single .prompt().user().call().content() chain.
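The memory half of that flow (step 6) is conceptually just a message log keyed by conversation id. A framework-free sketch — SessionMemorySketch and its method names are mine, not Spring AI's:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of what MessageChatMemoryAdvisor maintains per conversation id.
public class SessionMemorySketch {

    private final Map<String, List<String>> memory = new HashMap<>();

    // Append a turn to the session's history.
    public void record(String sessionId, String role, String text) {
        memory.computeIfAbsent(sessionId, k -> new ArrayList<>())
              .add(role + ": " + text);
    }

    // What gets prepended to the next prompt for that session.
    public List<String> history(String sessionId) {
        return memory.getOrDefault(sessionId, List.of());
    }

    public static void main(String[] args) {
        SessionMemorySketch mem = new SessionMemorySketch();
        mem.record("session-1", "user", "What models does Spring AI support?");
        mem.record("session-1", "assistant", "20+ providers, including Claude and GPT.");
        // A different session starts clean -- memory is keyed by conversation id.
        System.out.println(mem.history("session-1").size()); // prints 2
        System.out.println(mem.history("session-2").size()); // prints 0
    }
}
```

This is why the sessionId request parameter matters: pass a different id and the agent forgets everything, pass the same one and each turn sees the full prior exchange.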


The Part That Gets Me

Here's the full ChatClient call:

String answer = researchAgent.prompt()
    .user(question)
    .call()
    .content();

Three lines. The advisors and tool routing handle everything else.

Compare this to the equivalent Python + LangChain setup: agent chains, callback handlers, tool registrations, memory buffer classes, and a requirements.txt that breaks every two months.


Switching AI Providers

Want to swap Claude for GPT-4o? Two changes:

pom.xml: Replace spring-ai-starter-model-anthropic with spring-ai-starter-model-openai

application.yml:

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o

Your ChatClient code? Unchanged. Not a single line.

Want to run locally with zero API costs? Swap in Ollama:

spring:
  ai:
    ollama:
      chat:
        options:
          model: llama3.2

Same code. Different model. That's the portable API promise, and it actually delivers.
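One practical note: if your Ollama daemon isn't on its default port, point Spring AI at it explicitly. A config sketch assuming the standard spring.ai.ollama.base-url property:

```yaml
spring:
  ai:
    ollama:
      base-url: http://localhost:11434  # the Ollama default; change if it runs elsewhere
      chat:
        options:
          model: llama3.2
```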


What This Unlocks

Once you're here, the next steps are straightforward:

  • Multi-agent systems — wire multiple ChatClient beans with different system prompts and tool sets, let them coordinate via MCP
  • Streaming responses — swap .call().content() for .stream().content() and pipe to an SSE endpoint
  • Prompt caching — Spring AI 1.1.5 ships Anthropic prompt caching support out of the box, cutting costs up to 90% for repeated context
  • Production observability — native Micrometer integration gives you token counts, latency, and model call traces
  • GraalVM native — compile the whole agent to a native binary for sub-100ms startup
  • Spring AI 2.0 — if you're starting fresh and don't mind milestones, 2.0 adds Spring Boot 4.0 baseline, full JSpecify null-safety, and Jackson 3. GA is targeted for late May 2026.

The Takeaway

Python is great for training models. For building AI agents on top of them — integrating with your existing infrastructure, your databases, your APIs, your Spring services — Java is not second class anymore.

Spring AI 1.1.5 isn't a wrapper around Python tooling. It's a native, production-grade AI framework built for the JVM by the same team that built Spring Boot. The ChatClient API is clean. The MCP integration is real. The Advisors chain is genuinely powerful.

You don't need a Python sidecar. You don't need to learn LangChain. You don't need to maintain two runtimes.

You just need Spring Boot and an API key.


Built with Spring Boot 3.5, Spring AI 1.1.5, Java 21, Claude Sonnet via Anthropic API. Published May 2026.

Drop a comment if you want a Part 2 — streaming responses, multi-agent coordination, or GraalVM native compilation.
