Retrieval-Augmented Generation (RAG) is a design pattern that grounds Large Language Models (LLMs) in your own proprietary, up-to-date data. This helps reduce hallucinations and often removes the need for expensive model fine-tuning.
This guide walks you through building a completely local RAG application using Spring AI, Ollama, and PostgreSQL with pgvector.
1. Prerequisites and Local Environment Setup
Before touching Java code, you need to set up the infrastructure. Create a compose.yml file to spin up PostgreSQL (with the pgvector extension) and Ollama:
compose.yml
services:
  postgres:
    image: pgvector/pgvector:pg16
    container_name: spring-ai-rag-postgres
    environment:
      POSTGRES_DB: spring_ai_rag
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - "5432:5432"
  ollama:
    image: ollama/ollama:latest
    container_name: spring-ai-rag-ollama
    ports:
      - "11434:11434"
Pulling the Local AI Models
Start your Docker containers by running docker compose up -d. Next, pull the chat and embedding models into your local Ollama container from your terminal:
docker exec -it spring-ai-rag-ollama ollama pull llama3.2
docker exec -it spring-ai-rag-ollama ollama pull nomic-embed-text
2. Project Setup & Dependencies
Head over to start.spring.io and create a standard Spring Boot project using Maven or Gradle. Add the following dependencies to your pom.xml:
<dependencies>
    <!-- Spring Web for REST APIs -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Spring AI Ollama Support -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
    <!-- Spring AI PGVector Store Starter -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
    </dependency>
    <!-- Apache Tika Document Reader Support -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>
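Spring AI artifacts are versioned together through a BOM. If the starters above fail to resolve, importing the BOM in dependencyManagement usually fixes it. A sketch, assuming a Maven build; the version shown is a placeholder, so swap in the Spring AI release you are actually using:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <!-- placeholder version: replace with your Spring AI release -->
            <version>1.0.0-M3</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```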
3. Configuration Management
Add the following properties to your src/main/resources/application.yml file to stitch your components together:
spring:
  application:
    name: spring-ai-local-rag
  # Database Connection Details
  datasource:
    url: jdbc:postgresql://localhost:5432/spring_ai_rag
    username: postgres
    password: postgres
  ai:
    # Ollama Model Configurations
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3.2
      embedding:
        options:
          model: nomic-embed-text
    # Vector Database Settings
    vectorstore:
      pgvector:
        initialize-schema: true
        table-name: rag_documents
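Two optional pgvector properties are worth knowing about (assuming a recent Spring AI version; property names have shifted between milestones, so verify against the docs for your release). nomic-embed-text produces 768-dimensional vectors, and pinning the dimension up front avoids schema surprises when the table is auto-created:

```yaml
spring:
  ai:
    vectorstore:
      pgvector:
        initialize-schema: true
        table-name: rag_documents
        distance-type: COSINE_DISTANCE  # similarity metric for the search
        dimensions: 768                 # nomic-embed-text embedding size
```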
4. Implementing Retrieval and Chat Generation
Next, build the execution pipeline. The RagService below does three things: it ingests a document into the vector store, runs a similarity search against PGVector for each incoming question, and stuffs the retrieved context into the prompt sent through the ChatClient.
import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class RagService {

    @Value("classpath:rag-guide.txt")
    private Resource textfile;

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public void ingestText() {
        System.out.println("Reading document...");
        TikaDocumentReader reader = new TikaDocumentReader(textfile);
        List<Document> documents = reader.get();
        System.out.println("Documents loaded: " + documents.size());
        vectorStore.add(documents);
        System.out.println("Documents added to vector store");
    }

    public String askSimpleQuestion() {
        String question = "What is RAG?";
        System.out.println("Starting similarity search...");
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(3)
                .build();
        List<Document> documents = vectorStore.similaritySearch(searchRequest);
        System.out.println("Similarity search completed");

        // Concatenate the retrieved chunks into a single context block
        StringBuilder context = new StringBuilder();
        for (Document document : documents) {
            context.append(document.getText()).append("\n");
        }

        System.out.println("Calling LLM...");
        String prompt = """
                Given the following context information,
                answer the question.
                Context:
                %s
                Question:
                %s
                """.formatted(context, question);

        String response = chatClient.prompt(prompt)
                .call()
                .content();
        System.out.println("LLM response received");
        return response;
    }
}
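The "prompt stuffing" at the end of askSimpleQuestion is plain Java text-block templating, so it can be sanity-checked without a Spring context or a running model. A minimal standalone sketch (the PromptStuffingDemo class is hypothetical, not part of the project) showing the same pattern:

```java
import java.util.List;

public class PromptStuffingDemo {

    // Joins retrieved chunks and interpolates them into the RAG prompt template.
    static String buildPrompt(List<String> chunks, String question) {
        String context = String.join("\n", chunks);
        return """
                Given the following context information,
                answer the question.
                Context:
                %s
                Question:
                %s
                """.formatted(context, question);
    }

    public static void main(String[] args) {
        String prompt = buildPrompt(
                List.of("RAG grounds LLM answers in retrieved documents."),
                "What is RAG?");
        System.out.println(prompt);
    }
}
```

If the model's answers look ungrounded, printing the assembled prompt like this is the quickest way to confirm whether the retrieval step actually returned relevant chunks.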
5. Data folder
Create a rag-guide.txt file in the src/main/resources folder; its contents will be ingested into the vector store at startup.
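Any plain-text content works here, since Tika handles the parsing. For a quick smoke test, a few sentences on the topic are enough, for example:

```text
Retrieval-Augmented Generation (RAG) combines information retrieval with
text generation. Documents are split into chunks, embedded as vectors, and
stored in a vector database. At query time, the most similar chunks are
retrieved and supplied to the LLM as context for answering the question.
```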
6. Running the application
We will run the application using Spring Boot's CommandLineRunner interface, as shown below:
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class RagCliApplication implements CommandLineRunner {

    @Autowired
    private RagService service;

    public static void main(String[] args) {
        SpringApplication.run(RagCliApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        service.ingestText();
        final String answer = service.askSimpleQuestion();
        System.out.println(answer);
    }
}