Retrieval-Augmented Generation (RAG) is a design pattern that grounds Large Language Models (LLMs) in your own proprietary, up-to-date data. This helps reduce hallucinations and often removes the need for expensive model fine-tuning.
This guide walks you through building a completely local RAG application using Spring AI, Ollama, and PostgreSQL with pgvector.
1. Prerequisites and Local Environment Setup
Before touching Java code, you need to set up the infrastructure. Create a compose.yml file to spin up PostgreSQL (with the pgvector extension) and Ollama:
compose.yml
services:
  postgres:
    image: pgvector/pgvector:pg16
    container_name: spring-ai-rag-postgres
    environment:
      POSTGRES_DB: spring_ai_rag
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - "5432:5432"
  ollama:
    image: ollama/ollama:latest
    container_name: spring-ai-rag-ollama
    ports:
      - "11434:11434"
Pulling the Local AI Models
Start your Docker containers by running docker compose up -d. Next, pull the chat and embedding models into your local Ollama container from your terminal:
docker exec -it spring-ai-rag-ollama ollama pull llama3.2
docker exec -it spring-ai-rag-ollama ollama pull nomic-embed-text
2. Project Setup & Dependencies
Head over to start.spring.io and create a standard Spring Boot project using Maven or Gradle. Add the following dependencies to your pom.xml:
<dependencies>
    <!-- Spring Web for REST APIs -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Spring AI Ollama Support -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
    <!-- Spring AI PGVector Store Starter -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
    </dependency>
    <!-- Apache Tika Document Reader Support -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>
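Spring AI artifacts are versioned together through a BOM. If the starters above fail to resolve, importing the BOM in dependencyManagement usually fixes it. A sketch, assuming a Maven build; the version shown is a placeholder, so swap in the Spring AI release you are actually using:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <!-- placeholder version: replace with your Spring AI release -->
            <version>1.0.0-M3</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```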
3. Configuration Management
Add the following properties to your src/main/resources/application.yml file to stitch your components together:
spring:
  application:
    name: spring-ai-local-rag
  # Database Connection Details
  datasource:
    url: jdbc:postgresql://localhost:5432/spring_ai_rag
    username: postgres
    password: postgres
  ai:
    # Ollama Model Configurations
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3.2
      embedding:
        options:
          model: nomic-embed-text
    # Vector Database Settings
    vectorstore:
      pgvector:
        initialize-schema: true
        table-name: rag_documents
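Two optional pgvector properties are worth knowing about (assuming a recent Spring AI version; property names have shifted between milestones, so verify against the docs for your release). nomic-embed-text produces 768-dimensional vectors, and pinning the dimension up front avoids schema surprises when the table is auto-created:

```yaml
spring:
  ai:
    vectorstore:
      pgvector:
        initialize-schema: true
        table-name: rag_documents
        distance-type: COSINE_DISTANCE  # similarity metric for the search
        dimensions: 768                 # nomic-embed-text embedding size
```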
4. Implementing Retrieval and Chat Generation
Next, build the execution pipeline. The RagService below does three things: it ingests a document into the vector store, runs a similarity search against PGVector for each incoming question, and stuffs the retrieved context into the prompt sent through the ChatClient.
import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class RagService {

    @Value("classpath:rag-guide.txt")
    private Resource textfile;

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public void ingestText() {
        System.out.println("Reading document...");
        TikaDocumentReader reader = new TikaDocumentReader(textfile);
        List<Document> documents = reader.get();
        System.out.println("Documents loaded: " + documents.size());
        vectorStore.add(documents);
        System.out.println("Documents added to vector store");
    }

    public String askSimpleQuestion() {
        String question = "What is RAG?";
        System.out.println("Starting similarity search...");
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(3)
                .build();
        List<Document> documents = vectorStore.similaritySearch(searchRequest);
        System.out.println("Similarity search completed");

        // Concatenate the retrieved chunks into a single context block
        StringBuilder context = new StringBuilder();
        for (Document document : documents) {
            context.append(document.getText()).append("\n");
        }

        System.out.println("Calling LLM...");
        String prompt = """
                Given the following context information,
                answer the question.
                Context:
                %s
                Question:
                %s
                """.formatted(context, question);

        String response = chatClient.prompt(prompt)
                .call()
                .content();
        System.out.println("LLM response received");
        return response;
    }
}
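The "prompt stuffing" at the end of askSimpleQuestion is plain Java text-block templating, so it can be sanity-checked without a Spring context or a running model. A minimal standalone sketch (the PromptStuffingDemo class is hypothetical, not part of the project) showing the same pattern:

```java
import java.util.List;

public class PromptStuffingDemo {

    // Joins retrieved chunks and interpolates them into the RAG prompt template.
    static String buildPrompt(List<String> chunks, String question) {
        String context = String.join("\n", chunks);
        return """
                Given the following context information,
                answer the question.
                Context:
                %s
                Question:
                %s
                """.formatted(context, question);
    }

    public static void main(String[] args) {
        String prompt = buildPrompt(
                List.of("RAG grounds LLM answers in retrieved documents."),
                "What is RAG?");
        System.out.println(prompt);
    }
}
```

If the model's answers look ungrounded, printing the assembled prompt like this is the quickest way to confirm whether the retrieval step actually returned relevant chunks.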
5. Data folder
Create a rag-guide.txt file in the src/main/resources folder; its contents will be ingested into the vector store at startup.
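Any plain-text content works here, since Tika handles the parsing. For a quick smoke test, a few sentences on the topic are enough, for example:

```text
Retrieval-Augmented Generation (RAG) combines information retrieval with
text generation. Documents are split into chunks, embedded as vectors, and
stored in a vector database. At query time, the most similar chunks are
retrieved and supplied to the LLM as context for answering the question.
```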
6. Running the application
We will run the application using Spring Boot's CommandLineRunner interface, as shown below:
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class RagCliApplication implements CommandLineRunner {

    @Autowired
    private RagService service;

    public static void main(String[] args) {
        SpringApplication.run(RagCliApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        service.ingestText();
        final String answer = service.askSimpleQuestion();
        System.out.println(answer);
    }
}