No Cloud, No Limits: Run AI Locally with Spring Boot and Ollama
If you’ve been playing around with AI models but hate the idea of sending your data to the cloud every time you make a call to OpenAI or Hugging Face, you’re going to love this: You can now run powerful AI models right on your laptop.
In this post, I’ll walk you through how to do exactly that using Ollama and integrate it with your Spring Boot applications using Spring AI.
We’ll cover:
- The different types of models you can run locally (text, audio, visual)
- What models are available on Ollama and their system requirements
- Which models actually run well on a local machine (no crazy GPUs needed)
- How Spring AI makes working with these models ridiculously easy
Let’s dive in.
⸻
First, What Is Ollama?
Think of Ollama like Docker, but for AI models. With a single command, you can pull and run an LLM locally – no GPU required (though it helps). It’s lightweight, fast, and it gives you privacy and control over your AI workflows.
For example, want to run the mistral model? Just type:
ollama run mistral
Types of Models Ollama Supports
Right now, Ollama is primarily focused on language models, but support for other model types is starting to trickle in. Here’s a quick breakdown:
- Text models: These are your bread and butter for chatbots, summarization, Q&A, etc. (e.g. phi, mistral, llama2, gemma)
- Visual models: Some early support here, like llava, for things like image captioning.
- Audio: Not officially in Ollama yet, but the community is starting to explore integrations with models like Whisper.
Bottom line: if you’re building anything text-based, Ollama has you covered today. More is coming soon.
What Models Can You Actually Run Locally?
Memory is the main constraint here. As a rough rule of thumb, 7B models like mistral or llama2 want at least 8 GB of RAM, 13B models want around 16 GB, and smaller models such as phi get by with even less.
Pro tip: If you’re working on a MacBook or any device with 8–16 GB of RAM and no GPU, start with phi or mistral-mini. They’re fast, smart, and run surprisingly well on basic setups.
Enter Spring AI: AI for Java Devs
Now, let’s talk about Spring AI, which is the Spring ecosystem’s way of saying: “You don’t need to be a Python ninja to use AI in your Java apps.”
Spring AI gives you:
- Easy integration with LLMs (OpenAI, Ollama, HuggingFace, etc.)
- Chat models: for building conversational experiences
- Prompt templates: like Handlebars, but for prompts
- Function calling: let models trigger your Spring services
- Agents + tools: build smarter workflows using custom tools
To get started with Ollama, add this to your Spring Boot app:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
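That starter doesn’t declare a version of its own, so depending on the Spring AI release you’re targeting you’ll typically also import the Spring AI BOM in your dependencyManagement section (the version property below is a placeholder – set it to the release you’re using):
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <!-- placeholder: define spring-ai.version as the release you're targeting -->
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>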
Then configure it in application.yml:
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: mistral   # the model used for chat calls
And boom – you’re now talking to an LLM from your Spring Boot controller like this:
import org.springframework.ai.chat.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient backed by your local Ollama model
    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/ask")
    public String ask(@RequestParam String message) {
        return chatClient.call(message);
    }
}
It’s clean, it’s simple, and it keeps your AI interactions private and fast.
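Prompt templates from the feature list above slot in just as easily. Here’s a minimal sketch of a small service that fills a template before calling the model – note that the package names and the call(Prompt) chain assume the same pre-1.0 Spring AI API as the controller above, so double-check them against the version you’re on:
import java.util.Map;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.stereotype.Service;

@Service
public class SummaryService {

    private final ChatClient chatClient;

    public SummaryService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String summarize(String text) {
        // {text} is filled in at call time, much like a Handlebars placeholder
        PromptTemplate template = new PromptTemplate(
                "Summarize the following text in three bullet points:\n{text}");
        Prompt prompt = template.create(Map.of("text", text));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}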
⸻
So Why Should You Care?
Here’s why this setup is a game-changer:
- No more cloud costs: Run models without paying per token.
- Speed: no network round trip, so latency is bound only by your hardware.
- Privacy: Your data stays on your machine.
- Portability: Build once, run anywhere – even air-gapped environments.
If you’re building anything from customer support bots to internal dev tools, this combo of Ollama + Spring AI is about as powerful (and fun) as it gets.
⸻
Bonus: Run It All in Docker
Want to ship your entire AI app as a container? You can run Ollama and your Spring Boot app in a single Docker Compose file. Makes deployment to servers or edge devices super smooth.
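As a rough sketch (the app service and image names here are placeholders – adjust them to your own build), a compose file could look something like this:
services:
  ollama:
    image: ollama/ollama              # official Ollama image, serves the API on 11434
    ports:
      - "11434:11434"
    volumes:
      - ollama-models:/root/.ollama   # keep pulled models between restarts

  app:
    build: .                          # placeholder: your Spring Boot app's Dockerfile
    ports:
      - "8080:8080"
    environment:
      # canonical env-var form of spring.ai.ollama.base-url,
      # pointing the app at the ollama service instead of localhost
      SPRING_AI_OLLAMA_BASEURL: http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama-models:
You’d still need to pull a model into the ollama container once (for example, docker compose exec ollama ollama pull mistral) before the app can use it.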
⸻
Final Thoughts
AI is no longer locked behind cloud APIs and GPU clusters. With tools like Ollama and Spring AI, you can bring AI right into your own machine – and your own Java app stack.
Start small, maybe with phi or mistral-mini, wire it up with Spring AI, and see where it takes you. Whether you’re building a side project or enterprise-grade tools, this setup has you covered. And when you do need more scale, you can gradually move to an external API provider while keeping as much as possible running locally – without spending a penny until you have to.
If you found this helpful or want a follow-up post on vector search, RAG, or agents with Spring AI, let me know!