DEV Community: HashCoder

[Boost]

HashCoder — Fri, 11 Jul 2025 10:00:48 +0000

HashCoder

Jul 10 '25

From Zero to AI Hero: Building a Gemini-Powered Chat App with Spring Boot (Part 1: The Foundation)

#ai #java #spring #genai

Comments

6 min read

From Zero to AI Hero: Building a Gemini-Powered Chat App with Spring Boot (Part 1: The Foundation)

HashCoder — Thu, 10 Jul 2025 23:21:24 +0000

The world of Large Language Models (LLMs) is moving at a breakneck pace, and as Spring developers, we have a first-class ticket to the show: Spring AI. This incredible project from the Spring team demystifies the process of integrating powerful AI capabilities into our applications. It provides a unified, elegant abstraction over various AI providers, from OpenAI to Hugging Face, and, as we'll explore today, Google's powerful Gemini models.

In this series, we'll build a complete, production-minded AI application. For Part 1, we're laying the essential groundwork. We will create a simple yet robust "Movie Expert" chatbot. This bot will leverage the Google Gemini Pro model to answer movie-related questions, and we'll build it with all the Spring Boot best practices you know and love: dependency injection, service layers, and even asynchronous processing.

Ready to wire up your first AI? Let's get started.

The Big Picture: Our Application Architecture

Before we dive into the code, let's look at our application from 10,000 feet. The user flow is straightforward:

A user types a question into a simple web UI.
Our ChatController receives the request.
It passes the user's prompt to a dedicated MovieChatService.
The MovieChatService constructs a carefully worded prompt (telling the AI its "persona") and sends it to the Gemini model via the Spring AI ChatClient.
Simultaneously, we'll asynchronously save the conversation to a PostgreSQL database for logging and future analysis using a ChatLogService.
The AI's response is sent back to the user's screen.

Simple, clean, and scalable.

Setting the Stage: Configuration is Key

The magic of Spring AI begins in your application.properties (or .yml) file. This is where you tell Spring how to connect to the AI model of your choice. Here’s how we configure our application to use Google Gemini.

spring:
  application:
    name: Spring AI Chat
  # --- Database Configuration ---
  datasource:
    url: jdbc:postgresql://localhost:5432/moviedb
    username: admin
    password: secret
    driver-class-name: org.postgresql.Driver
  jpa:
    hibernate:
      ddl-auto: create-drop # For demo purposes; use 'validate' or 'update' in prod
    show-sql: true
    properties:
      hibernate:
        format_sql: true
    database: postgresql
    database-platform: org.hibernate.dialect.PostgreSQLDialect

  # --- Spring AI Configuration for Gemini ---
  ai:
    openai:
      chat:
        options:
          model: gemini-2.0-flash # Or gemini-1.5-flash, etc.
        base-url: https://generativelanguage.googleapis.com
        # This path is crucial for the OpenAI-compatible endpoint
        completions-path: /v1beta/models/{model}:generateContent 
      api-key: ${GEMINI_API_KEY} # Best practice: use environment variables

  # --- Async Task Executor Configuration ---
  task:
    execution:
      pool:
        core-size: 5
        max-size: 10

Let's break down the most important part—the spring.ai block:

Wait, openai? For Gemini? Yes, and this is a brilliant piece of engineering. Google provides an OpenAI-compatible endpoint for Gemini. This means we can use the battle-tested Spring AI openai-spring-boot-starter dependency and simply point it to Google's API. This gives us incredible flexibility without needing a separate, dedicated Gemini starter.
model: Here, we specify which Gemini model we want to use. I'm using gemini-2.0-flash, but you could easily switch to gemini-1.5-flash or another variant.
base-url: This is the root URL for Google's Generative Language API.
completions-path: This is the critical line. It tells Spring AI the exact path to use for generating content, which differs from the standard OpenAI path. Note: The path in the code provided in the prompt was slightly off; the official path for the REST API is /v1beta/models/{model}:generateContent.
api-key: Your secret API key from the Google AI Studio. I strongly recommend loading this from an environment variable (${GEMINI_API_KEY}) rather than hardcoding it.

We've also configured our database and a thread pool for asynchronous tasks, which we'll see in action shortly.

The Core Logic: Communicating with the AI

The MovieChatService is the heart of our application. It's responsible for taking the user's raw input, giving it context, and interacting with the ChatClient.

@Slf4j
@Service
public class MovieChatService {

    private final ChatClient chatClient;
    private final ChatLogService chatLogService;

    // Constructor injection - a Spring best practice
    public MovieChatService(ChatClient chatClient, ChatLogService chatLogService) {
        this.chatClient = chatClient;
        this.chatLogService = chatLogService;
    }

    public String getMovieChatResponse(String userPrompt) {
        // 1. Define the AI's persona with a System Prompt
        var systemPromptTemplate = new SystemPromptTemplate(Constants.MOVIE_PROMPT);
        var systemMessage = systemPromptTemplate.createMessage();

        // 2. Create a prompt from the user's direct input
        var userPromptTemplate = new PromptTemplate(userPrompt);
        var userMessage = userPromptTemplate.createMessage();

        // 3. Combine prompts and create the final Prompt object
        // NOTE: A more idiomatic Spring AI way is to pass a list of Messages
        Prompt prompt = new Prompt(List.of(systemMessage, userMessage));

        // 4. Call the AI model
        String aiResponse = chatClient.prompt(prompt)
                                      .call()
                                      .content();

        // 5. Asynchronously save the interaction
        chatLogService.saveLog(userPrompt, aiResponse);

        return aiResponse;
    }
}

Let's walk through this method:

System Prompt: We use a SystemPromptTemplate to set the AI's persona. This is our "meta-instruction" that governs all its subsequent answers. We're telling it: "You are a helpful and knowledgeable movie expert." This is crucial for controlling the AI's tone and domain.

// Constants.java
public static final String MOVIE_PROMPT = """
         You are a helpful and knowledgeable movie expert.
         Your main goal is to answer questions about movies, actors, directors, and film history.
         Please be concise and friendly in your responses.
         If the user asks a question that is NOT related to movies, politely decline to answer
         and remind them that you are a movie expert. Give only 1 fact at a time.
        """;

User Prompt: We wrap the user's raw string in a PromptTemplate.
Combine and Call: The modern Spring AI approach is to create a Prompt object containing a list of Message objects (in this case, our system and user messages). This structured format is what the ChatClient expects.
The Magic Call: The line chatClient.prompt(prompt).call().content() is the beautiful abstraction Spring AI provides. Under the hood, it's creating an HTTP request to the Gemini API, including your API key, formatting the JSON payload, sending the request, and parsing the response to give you back a simple String. All that complexity, hidden behind a clean, fluent API.
Async Logging: We immediately fire off the chatLogService.saveLog() method. We don't wait for it to complete. This is a key performance optimization.

The Supporting Cast: Web Layer and Async Logging

To make our application whole, we need a few more pieces.

Asynchronous Logging: Why bother with @Async? Imagine your database is slow or under heavy load. If we saved the log synchronously, the user would be stuck waiting for the database write to finish after the AI has already responded. By making it asynchronous (@EnableAsync on the main class is required), we send the response back to the user instantly while the database operation proceeds in a separate thread from our configured pool. It’s a small change that drastically improves user experience.

@Service
public class ChatLogService {
    // ... repository ...

    @Async // Run this method in a background thread
    public void saveLog(String userPrompt, String aiResponse) {
        log.info("Saving chat log in a separate thread...");
        ChatLog chatLog = new ChatLog();
        // ... set properties and save ...
        chatLogRepository.save(chatLog);
    }
}

The Controller: Our ChatController is a standard Spring MVC controller. It manages a simple in-memory list to display the conversation history on a Thymeleaf template. Its primary job is to orchestrate the flow between the user's browser and our backend services.

@Controller
public class ChatController {
    // ... dependencies and chat history list ...

    @PostMapping("/chat")
    public String handleChat(@RequestParam String prompt, Model model) {
        chatHistory.add(new ChatMessage("You", prompt));
        String aiResponse = movieChatService.getMovieChatResponse(prompt);
        chatHistory.add(new ChatMessage("MovieBot", aiResponse));
        return "redirect:/chat"; // Redirect-after-post pattern
    }
}

Conclusion and What's Next

And there you have it! In a surprisingly small amount of code, we've built the foundation of a robust, AI-powered chatbot. We've seen how to:

Configure Spring AI to use Google's Gemini models via its OpenAI-compatible endpoint.
Craft effective system and user prompts to guide the AI's behavior.
Use the elegant ChatClient to handle all the complex API communication.
Implement a smart, non-blocking logging strategy using @Async.

This is just the beginning. Our bot is smart, but it has no knowledge of any specific data in our own database. In Part 2 of this series, we'll level up our application by introducing Retrieval-Augmented Generation (RAG). We'll teach our AI to query our movies and actors tables to answer questions with data it wasn't originally trained on. Stay tuned

The Developer's Guide to Taming PostgreSQL: How We Tripled Throughput with PgBouncer

HashCoder — Wed, 09 Jul 2025 10:41:42 +0000

The Developer's Guide to Taming PostgreSQL: How We Tripled Throughput with PgBouncer

Last month, our card transaction processing service began failing during peak hours. The system, responsible for authorizing millions of dollars in transactions, was throwing ConnectionTimeoutException errors. The culprit? Our PostgreSQL database was drowning under the sheer volume of connection requests from our fleet of Spring Boot microservices.

After a deep dive, we implemented PgBouncer as a centralized connection pooler. The results were immediate and dramatic: we slashed our p95 latency by 75% (from 1.2s to 300ms) and tripled our peak transaction authorization throughput, completely eliminating connection timeouts.

This isn't just another "how-to." This is the production-grade blueprint, complete with the configurations, JVM tuning, and architectural reasoning that actually matters when the system cannot fail.

The Problem: Connection Pool Anarchy

Our architecture consisted of 8 Spring Boot microservice instances, each configured with HikariCP's default maximum-pool-size of 10. In theory, that's 80 concurrent connections. In reality, during traffic spikes and deployments, connection churn was causing massive contention on PostgreSQL.

The symptoms were a classic sign of database saturation:

High ConnectionTimeoutException error rates during peak traffic.
Excessive memory usage on the PostgreSQL server from managing hundreds of idle processes.
Slow query performance, even for indexed SELECT statements.
Cascading failures as upstream services timed out waiting for our service.

The Solution: PgBouncer as the Master Connection Manager

PgBouncer is a lightweight, highly-optimized proxy that sits between your applications and PostgreSQL. It maintains a master pool of physical connections to the database and serves thousands of client connections by multiplexing them through this small, efficient pool. It's the traffic controller your database desperately needs.

Here is our complete setup, ready to run in Docker.

Docker Compose Configuration

version: '3.8'

services:
  postgresql:
    image: postgres:16-alpine
    container_name: postgres-db
    environment:
      POSTGRES_DB: card_transactions_db
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: secure_password
      # Enforce modern, secure authentication from the start
      POSTGRES_INITDB_ARGS: "--auth-host=scram-sha-256"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./postgresql.conf:/etc/postgresql/postgresql.conf
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app_user -d card_transactions_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  pgbouncer:
    image: pgbouncer/pgbouncer:1.21.0
    container_name: pgbouncer-proxy
    ports:
      - "6432:6432"
    depends_on:
      postgresql:
        condition: service_healthy
    volumes:
      # We mount our specific configs for production-like control
      - ./pgbouncer.ini:/etc/pgbouncer/pgbouncer.ini
      - ./userlist.txt:/etc/pgbouncer/userlist.txt

  card-transaction-service:
    build: .
    container_name: card-transaction-service
    environment:
      # Note: We connect to pgbouncer, not postgresql directly
      SPRING_DATASOURCE_URL: jdbc:postgresql://pgbouncer:6432/card_transactions_db
      SPRING_DATASOURCE_USERNAME: app_user
      SPRING_DATASOURCE_PASSWORD: secure_password
      SPRING_PROFILES_ACTIVE: docker
    ports:
      - "8080:8080"
    depends_on:
      - pgbouncer
    mem_limit: 1g
    mem_reservation: 512m

volumes:
  postgres_data:

Architect's Note on Startup: Docker's depends_on only ensures the container has started, not that the PgBouncer process within is ready. For true production resilience, use a wrapper script with a tool like wait-for-it.sh or build connection retry logic into your Spring application's startup sequence.

PgBouncer Configuration (`pgbouncer.ini`)

This is where the magic happens. The defaults are not sufficient for production.

[databases]
card_transactions_db = host=postgresql port=5432 dbname=card_transactions_db

[pgbouncer]
# Network and Auth
listen_addr = *
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

# Pooling Mode - CRITICAL
pool_mode = transaction

# Connection Limits
max_client_conn = 1000      # Max inbound connections from all apps
default_pool_size = 20      # Connections per DB pool kept open to Postgres
min_pool_size = 5           # Keep at least this many connections ready
reserve_pool_size = 5       # Emergency connections for admin use
max_db_connections = 40     # A hard cap on physical connections to this PG instance

# Timeouts (seconds)
server_lifetime = 3600            # Max life of a server connection
server_idle_timeout = 600         # Reconnect if idle for 10 mins
server_connect_timeout = 15
server_login_retry = 15
client_login_timeout = 60

# Query and State Management
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits

# Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
stats_period = 60

Let me explain the most critical settings:

pool_mode = transaction: This is the key to high throughput. A physical connection is assigned to a client for the duration of a single transaction and returned to the pool immediately upon COMMIT or ROLLBACK. This provides maximum reuse.
server_reset_query = DISCARD ALL: This is a non-negotiable security and stability setting. It ensures that any session-specific state (e.g., SET variables, temporary tables) is wiped clean before a connection is reused by another client, preventing state leakage between transactions.
Connection Limits: We've set max_client_conn high to accept traffic from all service instances, but default_pool_size low to protect PostgreSQL. The database now only ever sees a maximum of ~40 connections from our entire application fleet, regardless of how many microservices we scale up to.

The "Double Pooling" Dilemma: Why a Pool in Front of a Pool?

You're right to ask: "If PgBouncer is my pooler, why do I still need HikariCP?" This isn't redundancy; it's a separation of concerns.

The Application Pool (HikariCP): Its job is to manage a pool of connections to PgBouncer. These are cheap, fast, local network connections. Its purpose is to eliminate thread contention inside your Java application. When a thread needs to run a transaction, it gets a connection instantly from HikariCP without blocking. This pool should be sized for your application's peak internal concurrency.
The Master Pool (PgBouncer): Its job is to manage the truly scarce resource: physical connections to PostgreSQL. It acts as the gatekeeper, taking the high volume of requests from all your service instances and efficiently funneling them through a small, controlled number of real database connections.

By separating these concerns, you get the best of both worlds: no thread starvation in your application and no connection exhaustion on your database.

Spring Boot Configuration (`application-docker.yml`)

The key is to configure HikariCP to be a good citizen in this new world. It no longer needs to be stingy with connections.

spring:
  datasource:
    url: jdbc:postgresql://pgbouncer:6432/card_transactions_db
    username: app_user
    password: secure_password
    driver-class-name: org.postgresql.Driver
    hikari:
      pool-name: Hikari-Card-Service
      # Size this pool for your app's internal threads, not the DB.
      minimum-idle: 10
      maximum-pool-size: 20
      # Aggressive timeouts because connections are to the local PgBouncer proxy.
      idle-timeout: 30000
      max-lifetime: 600000 # 10 minutes
      connection-timeout: 3000
      validation-timeout: 2000
      connection-test-query: SELECT 1
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: validate
    properties:
      hibernate:
        jdbc.batch_size: 30
        order_inserts: true
        order_updates: true

Architect's Note: We've removed explicit autocommit properties. In a modern Spring Boot application using @Transactional, the Spring DataSourceTransactionManager is the definitive authority. It correctly handles transaction boundaries, making these lower-level settings redundant and potentially confusing.

JVM Tuning for Containerized Performance (`Dockerfile`)

Your application code runs on the JVM, and the JVM needs to be told it's running in a resource-constrained container.

FROM openjdk:21-jre-slim

# Tell the JVM it's in a container and tune the GC for low-latency service work
ENV JAVA_OPTS="-Xms512m -Xmx768m \
    -XX:+UseG1GC \
    -XX:MaxGCPauseMillis=100 \
    -XX:+UseStringDeduplication"

COPY target/card-transaction-service-*.jar app.jar

EXPOSE 8080

ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app.jar"]

Performance Testing Results

The numbers speak for themselves.

Metric	Before PgBouncer	After PgBouncer	Improvement
Avg. Response Time	450ms	150ms	67% reduction
95th Percentile	1.2s	300ms	75% reduction
Connection Errors	~15-20%	0%	Eliminated
Active DB Connections	80-120	20-25	~75% reduction

Load Testing with Artillery

Verifying performance requires a realistic load test.

# load-test.yml
config:
  target: 'http://localhost:8080'
  phases:
    - duration: 300
      arrivalRate: 100
      name: "Ramp-up"
    - duration: 600
      arrivalRate: 250
      name: "Sustained Peak Load"

scenarios:
  - name: "Authorize Card Transaction"
    flow:
      - post:
          url: "/api/transactions/authorize"
          json:
            cardNumber: "4{{ $randomDigits(15) }}"
            expiryMonth: "{{ $randomInt(1, 12) }}"
            expiryYear: 2026
            cvv: "{{ $randomDigits(3) }}"
            amount: "{{ $randomInt(10, 500) }}.00"
            currency: "USD"

Common Pitfalls and How to Avoid Them

Pool Sizing Mismatch: The most common mistake is sizing the application pool larger than PgBouncer's pool, creating a bottleneck.
- Rule: Sum of all HikariCP pools should be comfortably less than pgbouncer.max_client_conn. A single HikariCP pool (maximum-pool-size) should be less than or equal to pgbouncer.default_pool_size.
Using SESSION Pool Mode: Unless you are using features that require a persistent connection for the entire session (like advisory locks), avoid session mode. It defeats the purpose of high-throughput pooling. transaction mode is almost always what you want.
Authentication Errors: Ensure PostgreSQL's pg_hba.conf and PgBouncer's userlist.txt use the same authentication method (scram-sha-256 is the modern standard).
Prepared Statements (PREPARE): By default, PgBouncer in transaction mode doesn't support named prepared statements. Spring/Hibernate uses these under the hood. The PostgreSQL JDBC driver is smart enough to detect a proxy like PgBouncer and will fall back to unnamed statements, but you must use a recent version of the driver.

Key Takeaways

Centralize Connection Pooling: Don't let individual applications manage scarce database connections. Use a dedicated tool like PgBouncer to act as a central gatekeeper.
Tune for a Layered World: Configure your application pool (HikariCP) for in-app thread performance and your master pool (PgBouncer) for database resource protection.
transaction Mode is Your Default: It provides the best balance of performance and compatibility for microservice architectures.
State Management is Critical: Use DISCARD ALL to guarantee transactional isolation when connections are reused.

The performance gains we achieved weren't magic. They were the result of a systematic approach to identifying a bottleneck and applying the right architectural pattern—with the right configuration—to solve it.

How have you tackled database connection scaling in your environment? Share your war stories in the comments.

DEV Community: HashCoder

[Boost]

From Zero to AI Hero: Building a Gemini-Powered Chat App with Spring Boot (Part 1: The Foundation)

From Zero to AI Hero: Building a Gemini-Powered Chat App with Spring Boot (Part 1: The Foundation)

The Big Picture: Our Application Architecture

Setting the Stage: Configuration is Key

The Core Logic: Communicating with the AI

The Supporting Cast: Web Layer and Async Logging

Conclusion and What's Next

The Developer's Guide to Taming PostgreSQL: How We Tripled Throughput with PgBouncer

The Developer's Guide to Taming PostgreSQL: How We Tripled Throughput with PgBouncer

The Problem: Connection Pool Anarchy

The Solution: PgBouncer as the Master Connection Manager

Docker Compose Configuration

PgBouncer Configuration (pgbouncer.ini)

The "Double Pooling" Dilemma: Why a Pool in Front of a Pool?

Spring Boot Configuration (application-docker.yml)

JVM Tuning for Containerized Performance (Dockerfile)

Performance Testing Results

Load Testing with Artillery

Common Pitfalls and How to Avoid Them

Key Takeaways

PgBouncer Configuration (`pgbouncer.ini`)

Spring Boot Configuration (`application-docker.yml`)

JVM Tuning for Containerized Performance (`Dockerfile`)