<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sahaj Gupta</title>
    <description>The latest articles on DEV Community by Sahaj Gupta (@sahaj_gupta_7).</description>
    <link>https://dev.to/sahaj_gupta_7</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3921940%2F7ef70689-fac3-4073-9812-7ac10a3fbb2f.png</url>
      <title>DEV Community: Sahaj Gupta</title>
      <link>https://dev.to/sahaj_gupta_7</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sahaj_gupta_7"/>
    <language>en</language>
    <item>
      <title>Building a Local AI Chatbot with Gemma 4 and Java</title>
      <dc:creator>Sahaj Gupta</dc:creator>
      <pubDate>Sat, 09 May 2026 14:27:41 +0000</pubDate>
      <link>https://dev.to/sahaj_gupta_7/building-a-local-ai-chatbot-with-gemma-4-and-java-cmg</link>
      <guid>https://dev.to/sahaj_gupta_7/building-a-local-ai-chatbot-with-gemma-4-and-java-cmg</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;Building a Local AI Chatbot with Gemma 4 and Java&lt;/h1&gt;

&lt;p&gt;Recently, I started exploring local AI models because I wanted to understand how modern LLMs actually work beyond cloud APIs. Most tutorials I found online were focused on Python, but since I usually work with Java projects, I wanted to try something different.&lt;/p&gt;

&lt;p&gt;So I decided to build a simple local AI chatbot using Gemma 4, Ollama, and Spring Boot.&lt;/p&gt;

&lt;p&gt;Honestly, I expected the setup to be complicated, but it turned out to be much simpler than I thought.&lt;/p&gt;

&lt;p&gt;In this tutorial, I’ll show exactly how I got everything working on my laptop.&lt;/p&gt;

&lt;h2&gt;Why I Wanted to Try Local AI&lt;/h2&gt;

&lt;p&gt;Most AI applications today depend heavily on cloud APIs. While that works well, there are still some drawbacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internet dependency&lt;/li&gt;
&lt;li&gt;API usage limits&lt;/li&gt;
&lt;li&gt;Extra costs&lt;/li&gt;
&lt;li&gt;Privacy concerns&lt;/li&gt;
&lt;li&gt;Occasional latency issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted to see whether a lightweight local model could still generate useful responses without depending completely on external services.&lt;/p&gt;

&lt;p&gt;That’s where Gemma came in.&lt;/p&gt;

&lt;h2&gt;What is Gemma 4?&lt;/h2&gt;

&lt;p&gt;Google released Gemma as a family of lightweight, open-weight AI models designed for developers and researchers.&lt;/p&gt;

&lt;p&gt;One thing I liked about Gemma is that it can run locally on consumer hardware without requiring expensive cloud infrastructure.&lt;/p&gt;

&lt;p&gt;That makes it useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learning AI development&lt;/li&gt;
&lt;li&gt;Offline experimentation&lt;/li&gt;
&lt;li&gt;Personal projects&lt;/li&gt;
&lt;li&gt;Private applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of course, local models are not as powerful as massive cloud-hosted models, but they are surprisingly capable for development and testing.&lt;/p&gt;

&lt;h2&gt;Tools Used&lt;/h2&gt;

&lt;p&gt;For this project, I used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;Gemma 4&lt;/li&gt;
&lt;li&gt;Spring Boot&lt;/li&gt;
&lt;li&gt;Java 21&lt;/li&gt;
&lt;li&gt;IntelliJ IDEA&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;My Laptop Specs&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;16 GB RAM&lt;/li&gt;
&lt;li&gt;Intel i5 processor&lt;/li&gt;
&lt;li&gt;No dedicated GPU&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even without a GPU, the model still worked reasonably well for small prompts.&lt;/p&gt;

&lt;h1&gt;Step 1: Install Ollama&lt;/h1&gt;

&lt;p&gt;Ollama makes running local LLMs extremely simple.&lt;/p&gt;

&lt;p&gt;After installing Ollama, open the terminal and verify the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is installed correctly, you should see the installed version.&lt;/p&gt;

&lt;h1&gt;Step 2: Download and Run Gemma&lt;/h1&gt;

&lt;p&gt;Now download the Gemma model locally. The command below pulls the model published under the &lt;code&gt;gemma3&lt;/code&gt; tag in Ollama's library; swap in the tag for whichever Gemma variant you want to run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run gemma3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first download took several minutes on my system because the model files are quite large.&lt;/p&gt;

&lt;p&gt;After downloading, Ollama automatically starts the local model server.&lt;/p&gt;

&lt;p&gt;This was honestly the easiest part of the setup.&lt;/p&gt;

&lt;h1&gt;Step 3: Create the Spring Boot Project&lt;/h1&gt;

&lt;p&gt;I created a simple Spring Boot application using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Web&lt;/li&gt;
&lt;li&gt;Lombok&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Project structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src
 └── main
      └── java
           └── chatbot
                ├── controller
                ├── service
                └── ChatbotApplication.java
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing complicated here.&lt;/p&gt;




&lt;h1&gt;Step 4: Connecting Java with Gemma&lt;/h1&gt;

&lt;p&gt;What surprised me most was how easy the integration was.&lt;/p&gt;

&lt;p&gt;Ollama exposes a local REST API, so Java can directly communicate with the model.&lt;/p&gt;
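&lt;p&gt;For reference, the request and response bodies of the &lt;code&gt;/api/generate&lt;/code&gt; endpoint look roughly like this (field names follow Ollama's generate API; the response is abridged to the fields used in this post):&lt;/p&gt;

```text
Request:
  { "model": "gemma3", "prompt": "Explain Java Threads", "stream": false }

Response (abridged):
  { "model": "gemma3", "response": "...generated text...", "done": true }
```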

&lt;p&gt;I created the following service class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;RestTemplate&lt;/span&gt; &lt;span class="n"&gt;restTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RestTemplate&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;askAI&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                &lt;span class="s"&gt;"http://localhost:11434/api/generate"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;requestBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""
        {
          "model": "gemma3",
          "prompt": "%s",
          "stream": false
        }
        """&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setContentType&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLICATION_JSON&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;requestBody&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                &lt;span class="n"&gt;restTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;postForEntity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;entity&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;
                &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBody&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sends the prompt to Gemma running locally and returns the raw JSON body from Ollama; the generated text itself sits in the &lt;code&gt;response&lt;/code&gt; field. One caveat: building the body with &lt;code&gt;formatted&lt;/code&gt; produces invalid JSON if the prompt contains quotes or newlines, so a real application should escape the prompt or build the body with a JSON library.&lt;/p&gt;
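&lt;p&gt;Since the service above returns the raw JSON string, here is a minimal stdlib-only sketch of two pieces it leaves out: escaping the prompt before it is interpolated, and pulling the generated text back out. The class and method names are my own, and the regex extraction is a quick hack; a real project would use a JSON library such as Jackson.&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Helper sketch: JSON-escape the outgoing prompt and naively extract
// the "response" field from Ollama's reply body.
public class OllamaJsonUtil {

    // Escape characters that would break the hand-built JSON body.
    public static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"'  -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default   -> sb.append(c);
            }
        }
        return sb.toString();
    }

    private static final Pattern RESPONSE_FIELD =
            Pattern.compile("\"response\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    // Pull the generated text out of the JSON body; "" if absent.
    // Note: does not un-escape JSON escapes inside the captured text.
    public static String extractResponse(String json) {
        Matcher m = RESPONSE_FIELD.matcher(json);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String body = """
        { "model": "gemma3", "prompt": "%s", "stream": false }
        """.formatted(escapeJson("Say \"hello\""));
        System.out.println(body);
        System.out.println(extractResponse(
                "{\"model\":\"gemma3\",\"response\":\"Hello!\",\"done\":true}"));
    }
}
```

&lt;p&gt;With this, the service can build the body as &lt;code&gt;escapeJson(prompt)&lt;/code&gt; and return &lt;code&gt;extractResponse(response.getBody())&lt;/code&gt; instead of the whole JSON blob.&lt;/p&gt;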

&lt;h1&gt;Step 5: Create the Controller&lt;/h1&gt;

&lt;p&gt;Next, I added a REST endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RestController&lt;/span&gt;
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/chat"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Autowired&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;askAI&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the chatbot can be accessed directly from the browser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8080/chat?prompt=Explain+Java+Threads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;First Problem I Faced&lt;/h1&gt;

&lt;p&gt;Initially, the API was not responding.&lt;/p&gt;

&lt;p&gt;After wasting almost 15 minutes debugging the Java code, I realized Ollama was not running in the background.&lt;/p&gt;

&lt;p&gt;Running this command fixed the issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, everything started working properly.&lt;/p&gt;

&lt;h1&gt;Testing the Chatbot&lt;/h1&gt;

&lt;p&gt;I tested several prompts like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Explain operating system deadlocks in simple words.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Write a short Java multithreading example.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What is the difference between stack and heap memory?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The responses were actually much better than I expected from a local model.&lt;/p&gt;

&lt;p&gt;Short prompts generated responses within a few seconds on my laptop.&lt;/p&gt;

&lt;p&gt;Longer prompts were slower, but still usable.&lt;/p&gt;

&lt;h1&gt;What I Learned During This Project&lt;/h1&gt;

&lt;h2&gt;1. Local AI feels more flexible&lt;/h2&gt;

&lt;p&gt;Since everything runs on your own machine, experimentation becomes easier.&lt;/p&gt;

&lt;p&gt;You can test prompts freely without worrying about API usage limits.&lt;/p&gt;

&lt;h2&gt;2. Prompt quality matters a lot&lt;/h2&gt;

&lt;p&gt;Small wording changes can significantly improve responses.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;vague prompts → average output&lt;/li&gt;
&lt;li&gt;specific prompts → much better output&lt;/li&gt;
&lt;/ul&gt;
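&lt;p&gt;As a concrete (made-up) illustration of the difference:&lt;/p&gt;

```text
Vague:    Tell me about threads.
Specific: Explain the difference between Java platform threads and
          virtual threads in three bullet points, with one code line each.
```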

&lt;h2&gt;3. Hardware still matters&lt;/h2&gt;

&lt;p&gt;The biggest limitation is hardware.&lt;/p&gt;

&lt;p&gt;Larger models need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More RAM&lt;/li&gt;
&lt;li&gt;Better CPU/GPU&lt;/li&gt;
&lt;li&gt;More storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Smaller models perform much better on normal laptops.&lt;/p&gt;

&lt;h2&gt;4. Java integration is easier than expected&lt;/h2&gt;

&lt;p&gt;Before this project, I assumed AI integration would mostly require Python.&lt;/p&gt;

&lt;p&gt;But since Ollama exposes a REST API, integrating with Java applications is actually very straightforward.&lt;/p&gt;
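&lt;p&gt;To show how light the integration really is, here is a sketch of the same call using only the JDK's &lt;code&gt;java.net.http&lt;/code&gt; client, no Spring at all. The class name is my own; &lt;code&gt;buildGenerateRequest&lt;/code&gt; only constructs the request, and nothing is sent unless a prompt is passed on the command line, since actually sending requires the Ollama server to be running.&lt;/p&gt;

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Plain-JDK sketch of calling Ollama's generate endpoint.
public class OllamaJdkClient {

    static final String URL = "http://localhost:11434/api/generate";

    // Build (but do not send) the POST request for a given prompt.
    public static HttpRequest buildGenerateRequest(String prompt) {
        String body = """
        { "model": "gemma3", "prompt": "%s", "stream": false }
        """.formatted(prompt);
        return HttpRequest.newBuilder(URI.create(URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildGenerateRequest(
                args.length > 0 ? args[0] : "Hello");
        // prints: POST http://localhost:11434/api/generate
        System.out.println(request.method() + " " + request.uri());
        if (args.length > 0) { // only touches the network when asked to
            var response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }
}
```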

&lt;h1&gt;Performance on My Laptop&lt;/h1&gt;

&lt;p&gt;Here’s what I observed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short prompts: 2–5 seconds&lt;/li&gt;
&lt;li&gt;Longer prompts: 10–20 seconds&lt;/li&gt;
&lt;li&gt;RAM usage increased noticeably during generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even without a dedicated GPU, the project still worked decently for learning purposes.&lt;/p&gt;

&lt;h1&gt;Possible Improvements&lt;/h1&gt;

&lt;p&gt;Some things I want to try next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add chat history&lt;/li&gt;
&lt;li&gt;Create a frontend UI&lt;/li&gt;
&lt;li&gt;Stream responses in real time&lt;/li&gt;
&lt;li&gt;Store conversations in a database&lt;/li&gt;
&lt;li&gt;Add PDF summarization&lt;/li&gt;
&lt;li&gt;Experiment with other Gemma variants&lt;/li&gt;
&lt;/ul&gt;
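&lt;p&gt;Of these, streaming is mostly a parsing change: with &lt;code&gt;"stream": true&lt;/code&gt;, Ollama sends one small JSON object per line, and the client concatenates the &lt;code&gt;response&lt;/code&gt; fragments. A stdlib-only sketch of that accumulation step (the chunk shape follows the generate API; the regex is a naive extraction, not a real JSON parser, and the class name is my own):&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: concatenate "response" fragments from streamed NDJSON chunks.
public class StreamAccumulator {

    private static final Pattern FRAGMENT =
            Pattern.compile("\"response\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    // Each element is one line of Ollama's streaming output.
    public static String accumulate(String[] chunks) {
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            Matcher m = FRAGMENT.matcher(chunk);
            if (m.find()) {
                out.append(m.group(1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] chunks = {
                "{\"response\":\"Hel\",\"done\":false}",
                "{\"response\":\"lo!\",\"done\":false}",
                "{\"response\":\"\",\"done\":true}"
        };
        System.out.println(accumulate(chunks)); // prints "Hello!"
    }
}
```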

&lt;h1&gt;Final Thoughts&lt;/h1&gt;

&lt;p&gt;Before trying this project, I thought local AI development would be difficult and resource-heavy.&lt;/p&gt;

&lt;p&gt;But tools like Ollama make the setup surprisingly beginner-friendly.&lt;/p&gt;

&lt;p&gt;Gemma turned out to be a solid starting point for experimenting with local LLMs, especially for developers who want more control and privacy.&lt;/p&gt;

&lt;p&gt;This project also changed my perspective on Java AI integration. I expected the process to be much harder, but using a simple REST API made everything manageable.&lt;/p&gt;

&lt;p&gt;If you are a Java developer interested in AI, building a small local chatbot is honestly one of the best ways to start experimenting.&lt;/p&gt;

&lt;h1&gt;Useful Links&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;https://ollama.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ai.google.dev/gemma" rel="noopener noreferrer"&gt;https://ai.google.dev/gemma&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://spring.io/projects/spring-boot" rel="noopener noreferrer"&gt;https://spring.io/projects/spring-boot&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;Tags&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#ai
#java
#machinelearning
#gemma
#tutorial
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>devchallenge</category>
    </item>
  </channel>
</rss>
