Getting a local LLM up and running keeps getting easier, so it's time to update the guides. In this article, I set up Ollama and build a minimal web UI, all in Java. What we end up with:
- A full-stack application with a Vaadin frontend and a Spring Boot backend.
- Real-time chat with a local Ollama running in a container.
- Replies streamed as they are generated and rendered as Markdown, for the UX you would expect.
We will need the following installed:
- Java 21
- Maven for building the app
- A local Docker environment for running Ollama
Install and Run Ollama Container
First, set up the Ollama Docker container:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Next, we need to pull one of the supported LLM models. We use 'mistral', which is the default. The download will take a while:
docker exec ollama ollama pull mistral
Once done, you can check if the model is up and running by calling the REST API from the command line. For example:
curl http://localhost:11434/api/chat -d '{"model": "mistral", "messages": [{"role": "user", "content": "is black darker than white?"}], "stream":false}'
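If you would rather run the same check from Java, here is a minimal sketch using the JDK's built-in HttpClient. The class name OllamaChatCheck and its helper methods are my own illustration and not part of the demo app; it assumes the container from above is listening on port 11434.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper, not part of the demo app: sends one chat message to a local Ollama. */
final class OllamaChatCheck {

    // Same endpoint as the curl call; assumes the container publishes port 11434.
    static final String CHAT_URL = "http://localhost:11434/api/chat";

    /** Minimal escaping of backslashes and double quotes; control characters are not handled. */
    static String escapeJson(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the same JSON body as the curl example, with streaming disabled. */
    static String buildBody(String model, String question) {
        return "{\"model\": \"" + escapeJson(model) + "\", "
                + "\"messages\": [{\"role\": \"user\", \"content\": \"" + escapeJson(question) + "\"}], "
                + "\"stream\": false}";
    }

    /** Wraps the body in a POST request to the chat endpoint. */
    static HttpRequest buildRequest(String model, String question) {
        return HttpRequest.newBuilder(URI.create(CHAT_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(model, question)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest("mistral", "is black darker than white?"),
                HttpResponse.BodyHandlers.ofString());
        // Print the HTTP status and the raw JSON answer from Ollama.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running main needs the Ollama container to be up. The request-building methods work without it, which makes them easy to try out in isolation.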
Creating the Web UI
Generate a new Spring Boot project using Spring Initializr. You can add whatever dependencies you like, but for this app we only need:
- Ollama - Spring AI APIs for the local LLM
- Vaadin - for Java web UI
Here is a direct link to the configuration.
This will create a ready-to-run project that you can import into your Java IDE.
Adding Extras
Vaadin add-ons are added as regular dependencies in your pom.xml. We want to use the Viritin add-on for handy Markdown streaming:
<dependency>
    <groupId>in.virit</groupId>
    <artifactId>viritin</artifactId>
    <version>2.8.14</version>
</dependency>
Building the Web UI
We still need the UI. Create a MainView class to handle the chat UI and the interaction with the local LLM. Because Spring beans can be injected straight into a Vaadin view, this is all that is needed:
@Route("") // map view to the root
class MainView extends VerticalLayout {
    private final ArrayList<Message> chatHistory = new ArrayList<>();
    VerticalLayout messageList = new VerticalLayout();
    Scroller messageScroller = new Scroller(messageList);
    MessageInput messageInput = new MessageInput();
    MainView(StreamingChatClient chatClient) {
        add(messageScroller, messageInput);
        setSizeFull();
        setMargin(false);
        messageScroller.setSizeFull();
        messageInput.setWidthFull();
        // Add a system message to steer the AI's behavior
        chatHistory.add(new SystemMessage("Only if the user asks you about Vaadin, reply in bro style. Always show a piece of code."));
        messageInput.addSubmitListener(ev -> {
            // Add the user input as a Markdown message
            chatHistory.add(new UserMessage(ev.getValue()));
            messageList.add(new MarkdownMessage(ev.getValue(),"Me"));
            // Placeholder message for the upcoming AI reply
            MarkdownMessage reply = new MarkdownMessage("Assistant");
            messageList.add(reply);
            // Ask AI and stream back the reply to UI
            Prompt prompt = new Prompt(chatHistory);
            chatClient.stream(prompt)
                    .doOnComplete(() -> chatHistory.add(new AssistantMessage(reply.getMarkdown())))
                    .subscribe(cr -> reply.appendMarkdownAsync(cr.getResult().getOutput().getContent()));
            reply.scrollIntoView();
        });
    }
}
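The history handling in the submit listener is easy to miss: the whole chatHistory is sent with every prompt, chunks are shown as they arrive, and the assistant's reply joins the history only once the stream has completed. Here is a rough plain-Java sketch of that flow, with no Spring AI or Vaadin involved. The class StreamingHistorySketch and the record ChatMessage are made up for illustration, and the chunks are iterated synchronously here, whereas the real app receives them asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the streaming flow in MainView, using plain Java only. */
final class StreamingHistorySketch {

    /** Simplified message; role is "system", "user" or "assistant". */
    record ChatMessage(String role, String content) {}

    private final List<ChatMessage> history = new ArrayList<>();

    StreamingHistorySketch(String systemPrompt) {
        history.add(new ChatMessage("system", systemPrompt));
    }

    /**
     * Records the user message, hands each chunk to the UI callback as it arrives,
     * and stores the complete reply in the history once the stream has ended.
     */
    String ask(String userInput, Iterable<String> streamedChunks, Consumer<String> ui) {
        history.add(new ChatMessage("user", userInput));
        StringBuilder reply = new StringBuilder();
        for (String chunk : streamedChunks) {
            reply.append(chunk);
            ui.accept(chunk); // plays the role of reply.appendMarkdownAsync(...)
        }
        // Plays the role of doOnComplete(...): only the finished reply enters the history.
        history.add(new ChatMessage("assistant", reply.toString()));
        return reply.toString();
    }

    /** Read-only snapshot of what would be sent with the next prompt. */
    List<ChatMessage> history() {
        return List.copyOf(history);
    }

    public static void main(String[] args) {
        StreamingHistorySketch chat = new StreamingHistorySketch("Be brief.");
        chat.ask("Hi", List.of("Hel", "lo", "!"), System.out::print);
        System.out.println();
        System.out.println(chat.history().size()); // 3: system, user, assistant
    }
}
```

Storing the reply only on completion keeps half-finished answers out of the context that the model sees on the next turn.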
You can run the application either by running the main Application.java class from your IDE or from the command line using Maven: `mvn spring-boot:run`.
Now we can chat with the Ollama Mistral model locally at localhost:8080.
Conclusion
This was another one of those step-by-step guides that I wrote mostly for myself when setting up a small demo application. The full app is on GitHub (note the branch), if you wish to use it as a starting point for your own demos.
If you find it useful, let me know in the comments. Thanks!
Top comments (4)
Would it be possible to use Testcontainers and not Docker? They support Ollama. java.testcontainers.org/modules/ol...
Good idea! I haven't tried that, but why not. I don't think you can skip Docker; you would just drive it from the Java app instead. I wonder how it would behave on application restart (during development you get those a lot).
I did some testing, and with some tricks you can actually use Testcontainers. To keep the LLM models over restarts and expose a fixed port from Ollama, you need some container configuration like this: