Alkademy

Posted on • Originally published at munonye.com

RAG with Spring Boot — Embeddings and Vector Search Step by Step (2026)

Canonical URL: Republished from munonye.com. Full code on GitHub.

Learn how to build a RAG (retrieval-augmented generation) pipeline with Spring Boot that answers questions from your own documents. This post extends the AI Developer Tutorials series and builds on M7-A Spring AI REST basics.

RAG architecture

Documents → chunk → embed → VectorStore
User question → embed → top-K similar chunks → prompt → LLM → answer
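To make the "top-K similar chunks" step concrete, here is a minimal plain-Java sketch of what a vector store does conceptually: rank stored chunks by cosine similarity to the question's embedding. The class name `TopKSketch`, the `Chunk` record, and the hand-made 3-dimensional vectors are illustrative assumptions; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and pgvector typically uses an index rather than a full sort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopKSketch {

    /** A stored chunk paired with its embedding vector (hypothetical type). */
    record Chunk(String text, double[] embedding) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|). */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks most similar to the query, best match first. */
    static List<Chunk> topK(List<Chunk> store, double[] query, int k) {
        List<Chunk> sorted = new ArrayList<>(store);
        sorted.sort(Comparator
            .comparingDouble((Chunk c) -> cosine(c.embedding(), query))
            .reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        // Hand-made vectors stand in for real embeddings.
        List<Chunk> store = List.of(
            new Chunk("Refunds are issued within 14 days.", new double[] {0.9, 0.1, 0.0}),
            new Chunk("Our office is in Lagos.", new double[] {0.0, 0.2, 0.9}),
            new Chunk("Refund requests need an order number.", new double[] {0.8, 0.3, 0.1}));
        double[] question = {1.0, 0.2, 0.0}; // pretend embedding of "How do refunds work?"
        for (Chunk c : topK(store, question, 2)) {
            System.out.println(c.text());
        }
    }
}
```

With these toy vectors the two refund chunks rank above the office chunk, and those are the texts that would be pasted into the prompt.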

Dependencies

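<!-- Spring AI 1.0 starter names. Pre-1.0 milestone releases used
     spring-ai-pgvector-store-spring-boot-starter and
     spring-ai-openai-spring-boot-starter instead.
     No <version> is given, so this assumes the spring-ai-bom is imported. -->
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>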

Ingestion service

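import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class DocumentIngestionService {
  private final VectorStore vectorStore;
  private final Resource docsFolder;

  public DocumentIngestionService(VectorStore vectorStore,
      @Value("classpath:docs/") Resource docsFolder) {
    this.vectorStore = vectorStore;
    this.docsFolder = docsFolder;
  }

  // Reads every file in docs/, chunks it, and stores the chunks (the store embeds them on add).
  // Resource.getFile() only works for an exploded classpath, not inside a packaged jar.
  public void ingestAll() throws IOException {
    File[] files = docsFolder.getFile().listFiles(File::isFile);
    if (files == null) return; // docs/ is missing or not a directory
    for (File file : files) {
      String text = Files.readString(file.toPath());
      List<Document> chunks = split(text, 800, 100);
      if (!chunks.isEmpty()) vectorStore.add(chunks);
    }
  }

  // Fixed-size character windows; consecutive chunks share `overlap` characters.
  private List<Document> split(String text, int size, int overlap) {
    if (overlap >= size) throw new IllegalArgumentException("overlap must be smaller than size");
    List<Document> out = new ArrayList<>();
    for (int i = 0; i < text.length(); i += size - overlap) {
      int end = Math.min(i + size, text.length());
      out.add(new Document(text.substring(i, end)));
      if (end == text.length()) break; // last window reached; avoid a redundant tail chunk
    }
    return out;
  }
}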
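The sliding-window chunking used by the ingestion service is easier to see on a tiny input. The sketch below is a standalone version that returns plain strings instead of Spring AI `Document` objects; the class name `ChunkingSketch` and the guard conditions are assumptions added for illustration, not part of the original service.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkingSketch {

    /**
     * Splits text into windows of at most `size` characters.
     * Each window starts `size - overlap` characters after the previous one,
     * so consecutive windows share `overlap` characters.
     */
    static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the final window already reaches the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // size 4, overlap 1: each chunk repeats the last character of the previous one.
        System.out.println(split("abcdefghij", 4, 1)); // prints [abcd, defg, ghij]
    }
}
```

The overlap means a sentence cut at a chunk boundary still appears partly in the neighbouring chunk, which tends to help retrieval. Splitting on characters is the simplest option; splitting on sentences or tokens is a common refinement.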

Question endpoint

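// Lives in a @RestController with VectorStore and ChatClient injected as fields.
// Written against Spring AI 1.0; earlier milestones exposed Document.getContent() instead of getText().
@PostMapping("/api/ask")
public AnswerResponse ask(@RequestBody QuestionRequest req) {
  // Retrieve the 5 chunks most similar to the question.
  List<Document> similar = vectorStore.similaritySearch(
      SearchRequest.builder().query(req.question()).topK(5).build());
  String context = similar.stream()
      .map(Document::getText)
      .collect(Collectors.joining("\n---\n"));
  // Ground the model: the retrieved chunks go into the system message.
  String answer = chatClient.prompt()
      .system("Answer only from the context below. Say 'I don't know' if not found.\n" + context)
      .user(req.question())
      .call()
      .content();
  return new AnswerResponse(answer);
}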
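The endpoint relies on two request/response types that the post does not show, and it builds the system prompt inline. The sketch below fills both gaps under stated assumptions: the `QuestionRequest` and `AnswerResponse` records are inferred from how the endpoint calls them, and `PromptSketch`, `systemPrompt`, and the empty-context fallback line are hypothetical additions for illustration.

```java
import java.util.List;

public class PromptSketch {

    // Hypothetical DTOs, inferred from req.question() and new AnswerResponse(answer).
    record QuestionRequest(String question) {}
    record AnswerResponse(String answer) {}

    static final String INSTRUCTIONS =
        "Answer only from the context below. Say 'I don't know' if not found.";

    /** Places the retrieved chunk texts under the instructions, separated by --- lines. */
    static String systemPrompt(List<String> chunkTexts) {
        if (chunkTexts.isEmpty()) {
            // Assumption: state the absence of context explicitly rather than send a bare instruction.
            return INSTRUCTIONS + "\n(no matching context was retrieved)";
        }
        return INSTRUCTIONS + "\n" + String.join("\n---\n", chunkTexts);
    }

    public static void main(String[] args) {
        QuestionRequest req = new QuestionRequest("How long do refunds take?");
        String prompt = systemPrompt(List.of(
            "Refunds are issued within 14 days.",
            "Refund requests need an order number."));
        System.out.println(prompt);
        System.out.println("User: " + req.question());
    }
}
```

Keeping prompt assembly in a small pure function like this makes it possible to unit-test the exact text sent to the model without calling the LLM.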

Next: function calling from Angular

M8-B — Structured JSON from LLMs in Angular


Full tutorial: RAG with Spring Boot — Embeddings and Vector Search Step by Step (2026)

Kindson Munonye · GitHub · LinkedIn · About
