Most AI examples look clean for about five minutes.
Then the framework starts leaking everywhere:
- controllers know about embedding models
- services return framework types
- retrieval becomes a black box
- swapping providers means rewriting half the application
I did not want that here.
This project is a Spring Boot knowledge base backed by PostgreSQL, pgvector, and LangChain4j. It supports a practical RAG-style flow:
- accept documents through an HTTP API
- split them into chunks
- generate embeddings
- store vectors in PostgreSQL
- retrieve relevant chunks with hybrid search
- build a prompt and generate an answer
The interesting part is not that LangChain4j is present. The interesting part is how it is present.
LangChain4j is now part of the real execution flow, but it is still treated as an outbound technology. The application core owns the use cases. PostgreSQL still owns retrieval. LangChain4j helps with chunking, embeddings, prompt templating, and chat, but it does not define the architecture.
The architectural rule
The project is organized by business context first:
- document: ingestion, indexing, chunk persistence, indexing events
- search: retrieval, prompt construction, answer generation
- shared: AI ports, LangChain4j adapters, and configuration
Inside each context, the code follows a hexagonal structure:
- domain
- application
- adapter/in
- adapter/out
That gives the project a simple rule: dependencies point inward.
- controllers call application services
- application services depend on ports
- adapters implement those ports
- domain classes stay free of framework concerns
This matters because the project touches several infrastructure-heavy concerns at once:
- HTTP
- Spring events
- PostgreSQL and pgvector
- LangChain4j
If you let all of those bleed into the core, the use cases disappear. The application becomes a pile of framework-shaped services.
What LangChain4j does here, and what it does not do
The code defines its own application ports:
- DocumentChunker
- EmbeddingPort
- ChatPort
That one decision keeps the boundaries clean.
The application does not talk directly to:
- DocumentSplitter
- EmbeddingModel
- ChatModel
Instead, LangChain4j is pushed to the edge through adapters. That means the use cases depend on the application contracts they need, not on the framework types that happen to implement them today.
This is the difference between "using a framework" and "letting a framework shape your codebase."
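The port-first idea can be sketched in a few lines. The port names below come from the article; their exact signatures and the EmbeddingVector value type are assumptions for illustration, not the project's actual declarations:

```java
import java.util.List;

// Value type owned by the application layer, not by LangChain4j.
// The core sees vector values and a model name -- nothing framework-shaped.
record EmbeddingVector(float[] values, String modelName) {}

// Application-owned ports (signatures assumed). Adapters at the edge
// implement these with LangChain4j; the use cases never see that.
interface DocumentChunker {
    List<String> chunk(String text);
}

interface EmbeddingPort {
    EmbeddingVector embed(String text);
}

interface ChatPort {
    String ask(String prompt);
}
```

Because the core depends only on these interfaces, swapping the framework behind them is invisible to the use cases.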
The indexing flow stays in the application core
The document creation endpoint stays thin. It accepts a request and delegates to the application service. It does not know about chunks, embeddings, or vector storage.
That service persists the document and publishes an application-level event. Then an event listener forwards the event into the indexing use case:
@Transactional
@EventListener
public void handle(KnowledgeDocumentCreatedEvent event) {
    indexer.index(event.documentId());
}
That listener is intentionally boring. It is transport glue, not business logic.
The real work lives in KnowledgeDocumentIndexer:
List<String> chunks = documentChunker.chunk(document.getContent());

int index = 0;
for (String chunkText : chunks) {
    EmbeddingVector embedding = embeddingPort.embed(chunkText);

    KnowledgeDocumentChunk knowledgeChunk = KnowledgeDocumentChunk.builder()
            .documentId(document.getId())
            .chunkIndex(index++)
            .chunkText(chunkText)
            .embedding(embedding.values())
            .embeddingModel(embedding.modelName())
            .build();

    chunkStore.save(knowledgeChunk);
}
This is exactly where chunking and embedding belong: in the indexing use case.
Not in the controller. Not in the event listener. Not hidden in a framework callback.
LangChain4j is useful here because it is constrained
One of the better examples is chunking.
The project does not expose LangChain4j DocumentSplitter directly to the core. Instead, the application depends on DocumentChunker, and the adapter implementation is ParagraphPreservingDocumentSplitter.
That class keeps the original project behavior of one paragraph per chunk, but still uses LangChain4j internally when a paragraph is too large:
@Override
public List<String> chunk(String text) {
    return split(Document.from(text)).stream()
            .map(TextSegment::text)
            .toList();
}
And the actual paragraph handling is explicit:
String[] paragraphs = document.text().split("\\R\\s*\\R");

for (String paragraph : paragraphs) {
    String normalized = paragraph.strip();
    if (normalized.isEmpty()) {
        continue;
    }

    List<TextSegment> paragraphSegments =
            characterSplitter.split(Document.from(normalized));
}
That is a good pattern for framework integration:
- preserve the business behavior you care about
- use the framework for the mechanics it is good at
- do not accept framework defaults blindly
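The paragraph-preserving behavior can be sketched without the framework at all. The class below is illustrative, not the project's adapter: it keeps the same blank-line paragraph boundary, but stands in a plain fixed-size fallback where the real code delegates oversized paragraphs to a LangChain4j splitter:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: one paragraph per chunk, with a fixed-size fallback
// when a paragraph exceeds maxChunkChars. (The real adapter hands oversized
// paragraphs to a LangChain4j character splitter instead.)
class ParagraphChunkerSketch {
    static List<String> chunk(String text, int maxChunkChars) {
        List<String> chunks = new ArrayList<>();
        // Same paragraph boundary as the adapter: blank-line separated.
        for (String paragraph : text.split("\\R\\s*\\R")) {
            String normalized = paragraph.strip();
            if (normalized.isEmpty()) {
                continue;
            }
            if (normalized.length() <= maxChunkChars) {
                chunks.add(normalized);
            } else {
                // Fallback: slice the paragraph into fixed-size pieces.
                for (int i = 0; i < normalized.length(); i += maxChunkChars) {
                    chunks.add(normalized.substring(
                            i, Math.min(normalized.length(), i + maxChunkChars)));
                }
            }
        }
        return chunks;
    }
}
```

The point of the sketch is the shape, not the splitter: the business rule (paragraph boundaries) is explicit code, and the mechanical part (cutting oversized text) is the piece you can delegate.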
The same idea shows up in the embedding adapter:
@Override
public EmbeddingVector embed(String text) {
    return new EmbeddingVector(
            embeddingModel.embed(text).content().vector(),
            embeddingModel.modelName()
    );
}
The use case gets exactly what it needs:
- vector values
- model name
It does not need LangChain4j response wrappers in the application layer.
The chat side follows the same pattern:
@Override
public String ask(String prompt) {
    return chatModel.chat(prompt);
}
The application wants an answer for a prompt. That is the contract. It should not need to know about ChatModel.
Retrieval still belongs to PostgreSQL
This is the part I like most in the project.
LangChain4j was introduced without giving retrieval away to a framework abstraction.
The retrieval flow in RetrievalService is still explicit:
float[] questionEmbedding = embeddingPort.embed(question).values();
String vector = vectorFormatter.toPgVector(questionEmbedding);
String metadataFilterJson = toMetadataFilterJson(metadataFilters);

List<SimilarChunk> results = knowledgeChunkSearchPort.searchTopK(
        vector,
        normalizeKeywordQuery(keywordQuery),
        metadataFilterJson,
        topK
);
The actual search strategy still lives in PostgreSQL:
- vector similarity through pgvector
- keyword ranking through full-text search
- exact metadata filtering through jsonb
That is an important architectural choice.
Too many examples treat retrieval like a magical AI feature. It is not. It is a search problem. In this project, PostgreSQL remains visible as the system that ranks and filters the data. That keeps the behavior understandable and debuggable.
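To make "retrieval is a search problem" concrete, the shape of such a hybrid query might look like the constant below. The table and column names are assumptions, not the project's schema; the operators are standard PostgreSQL: pgvector's `<=>` cosine distance, `ts_rank` for full-text scoring, and jsonb containment `@>` for exact metadata filters:

```java
// Illustrative shape of a hybrid-search query (table and column names
// assumed, not taken from the project). All three ranking concerns are
// visible in plain SQL, which is what keeps retrieval debuggable.
class HybridSearchSql {
    static final String SEARCH_TOP_K = """
            SELECT chunk_text,
                   embedding <=> CAST(? AS vector)        AS vector_distance,
                   ts_rank(chunk_tsv, plainto_tsquery(?)) AS keyword_rank
            FROM knowledge_document_chunk
            WHERE metadata @> CAST(? AS jsonb)
            ORDER BY vector_distance ASC, keyword_rank DESC
            LIMIT ?
            """;
}
```

When the ranking strategy is a SQL statement you can run in psql, tuning it is a database exercise, not a framework archaeology exercise.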
Prompt rendering is framework-assisted, not framework-owned
Prompt construction uses LangChain4j PromptTemplate:
private static final PromptTemplate PROMPT_TEMPLATE = PromptTemplate.from("""
        You are an assistant for a knowledge base.
        Answer only using the context below.
        If the answer is not present in the context, say you do not know.
        Context:
        {{context}}
        User question:
        {{question}}
        Answer:
        """);
But PromptBuilder still returns a plain String to the application layer.
That is the right compromise. LangChain4j helps with the mechanics of prompt templating, but the framework does not become the API of the core service.
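The contract is narrow enough that it could be mimicked with nothing but String.replace. The sketch below is illustrative, not the project's code; its point is that the application layer only ever sees the rendered String, however the template gets filled:

```java
import java.util.Map;

// Minimal stand-in for the prompt-building contract: fill {{placeholders}}
// and return a plain String. (The real adapter uses LangChain4j's
// PromptTemplate for the same mechanics.)
class PromptSketch {
    static final String TEMPLATE = """
            Answer only using the context below.
            Context:
            {{context}}
            User question:
            {{question}}
            Answer:
            """;

    static String render(Map<String, String> variables) {
        String rendered = TEMPLATE;
        for (var entry : variables.entrySet()) {
            rendered = rendered.replace("{{" + entry.getKey() + "}}", entry.getValue());
        }
        return rendered;
    }
}
```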
The fake models are not a shortcut anymore
The project still ships with fake models, and that is a good thing.
The important detail is that they are fake LangChain4j models now:
- FakeEmbeddingModel implements the LangChain4j EmbeddingModel interface
- FakeChatModel implements the LangChain4j ChatModel interface
That means local development and tests can run without provider credentials, while still exercising the same architectural flow a real provider would use.
This is much better than maintaining a fake architecture for local work and a separate real architecture for production. Here, replacing the fake provider is mostly a wiring change.
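The essence of such a fake is easy to sketch. The class below is illustrative and drops the LangChain4j interface the project's fake actually implements; what matters is that it is deterministic and returns the same 1536 dimensions the schema expects:

```java
import java.util.Random;

// Illustrative fake embedding model: deterministic per input, fixed 1536
// dimensions so it satisfies a vector(1536) column. (The project's fake
// implements the LangChain4j EmbeddingModel interface; this sketch does not.)
class FakeEmbeddingSketch {
    static final int DIMENSIONS = 1536;

    static float[] embed(String text) {
        // Seeding with the text's hash keeps the fake deterministic:
        // the same input always yields the same vector, which makes
        // retrieval tests stable without provider credentials.
        Random random = new Random(text.hashCode());
        float[] vector = new float[DIMENSIONS];
        for (int i = 0; i < DIMENSIONS; i++) {
            vector[i] = random.nextFloat();
        }
        return vector;
    }
}
```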
One constraint that should stay explicit
There is one technical detail that should never be buried in the fine print:
- the database column is vector(1536)
- the current fake embedding model also returns 1536 dimensions
If you swap in a real embedding provider, that dimension has to match or the schema has to change.
That is not an implementation detail. It is part of the persistence contract.
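A cheap way to keep that contract explicit is to assert it at the boundary, so a mismatched provider fails fast with a readable message instead of failing on INSERT. The guard below is a sketch under that assumption, not the project's code:

```java
// Illustrative guard for the persistence contract: the embedding width
// must match the schema's vector(1536) column.
class EmbeddingDimensionGuard {
    static final int EXPECTED_DIMENSIONS = 1536; // must match vector(1536)

    static float[] checked(float[] vector) {
        if (vector.length != EXPECTED_DIMENSIONS) {
            throw new IllegalStateException(
                    "Embedding has " + vector.length + " dimensions, schema expects "
                            + EXPECTED_DIMENSIONS
                            + "; change the provider or migrate the column");
        }
        return vector;
    }
}
```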
Why this design works
What makes this project credible is not that it uses LangChain4j.
It is that the project uses LangChain4j without surrendering the architecture.
The core ideas are simple:
- define use cases first
- keep framework dependencies behind ports
- let PostgreSQL stay responsible for retrieval
- keep controllers and listeners thin
- make provider replacement a wiring problem instead of a rewrite
That is the part worth copying.
If you are building AI features into a Spring Boot application, the lesson is not "avoid frameworks." The lesson is narrower and more useful:
Use frameworks as adapters.
Do not let them become your architecture.