Structured LLM Output in Java — Finally

If you've ever tried to integrate an LLM into a Java application, you know the feeling.

You send a prompt. You get back a string. Now what?

The problem

String response = openai.complete(
    "Extract the product name and rating from: " + userText
);
// response = "The product is an iPhone with a rating of 4 out of 5."
// or        = "```json\n{\"name\": \"iPhone\"}\n```"
// or        = "{ name: iPhone, rating: 4 }"  ← invalid JSON

You end up writing brittle parsing logic, adding retry mechanisms, handling
markdown code fences the LLM randomly wraps around your JSON...
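
To make that "brittle parsing logic" concrete, here is a minimal sketch of the fence-stripping step you typically end up writing by hand. `FenceStripper` is a hypothetical helper invented for this post, not part of any library, and it assumes the whole reply is a single fenced block.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (an assumption for illustration): unwraps one Markdown code fence. */
final class FenceStripper {

    // Opening fence with an optional language tag, the payload, then the closing fence.
    // `{3} matches three backticks in a row.
    private static final Pattern FENCED = Pattern.compile(
        "^\\s*`{3}[a-zA-Z0-9]*[ \\t]*\\n(.*?)\\n?\\s*`{3}\\s*$", Pattern.DOTALL);

    static String strip(String raw) {
        Matcher m = FENCED.matcher(raw);
        // If the entire reply is one fenced block, keep only its contents; otherwise just trim.
        return m.matches() ? m.group(1).trim() : raw.trim();
    }
}
```

This only rescues the second response above. The prose answer and the unquoted-key "JSON" still fail to parse, which is why hand-rolled cleanup keeps growing.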

In Python, this problem was solved years ago by libraries like Instructor
and Pydantic. In Java — nothing mature existed. Until now.

Introducing llm4j-schema

llm4j-schema lets you define what you want using a plain Java 21 Record,
and handles everything else automatically.

@LLMSchema(description = "A product review")
public record ProductReview(
    String productName,
    @FieldDescription("Rating from 1 to 5") int rating,
    String summary,
    boolean recommended
) {}

LLMClient client = new OpenAIClient(System.getenv("OPENAI_API_KEY"));
LLMExtractor extractor = new LLMExtractor(client);

ProductReview review = extractor.extract(
    ProductReview.class,
    "I bought the Sony WH-1000XM5 last month. " +
    "Best headphones I've ever used. Noise cancellation is incredible. 5/5."
);

System.out.println(review.productName());  // "Sony WH-1000XM5"
System.out.println(review.rating());       // 5
System.out.println(review.recommended());  // true

Type-safe. Auto-retried on failure. No manual JSON parsing.

How it works under the hood

Schema generation: llm4j-schema reads your Record class via reflection and generates a JSON Schema automatically.
System prompt injection: the schema is sent to the LLM as a system instruction so it knows exactly what format to return.
Typed deserialization: the JSON response is deserialized directly into your Record using Jackson.
Auto-retry: if the LLM returns malformed JSON or wraps it in markdown, llm4j-schema cleans up the response and retries up to 3 times automatically.
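
The schema generation and auto-retry steps can be sketched with nothing but the JDK. This is an illustration of the idea, not llm4j-schema's actual code: `PipelineSketch`, `schemaFor`, `extractWithRetry`, and the `Review` record are all names made up here, the type mapping is deliberately tiny, and the parser is passed in as a function so the sketch stays free of a JSON dependency.

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative stand-in for the pipeline (an assumption, not the library's implementation). */
final class PipelineSketch {

    // Maps a few Java types to JSON Schema type names; everything else is treated as a string.
    static String jsonType(Class<?> t) {
        if (t == int.class || t == long.class || t == Integer.class || t == Long.class) return "integer";
        if (t == double.class || t == float.class || t == Double.class || t == Float.class) return "number";
        if (t == boolean.class || t == Boolean.class) return "boolean";
        return "string";
    }

    // Schema generation: walk the record components in declaration order.
    static String schemaFor(Class<? extends Record> recordType) {
        StringJoiner props = new StringJoiner(",");
        StringJoiner required = new StringJoiner(",");
        for (RecordComponent c : recordType.getRecordComponents()) {
            props.add("\"" + c.getName() + "\":{\"type\":\"" + jsonType(c.getType()) + "\"}");
            required.add("\"" + c.getName() + "\"");
        }
        return "{\"type\":\"object\",\"properties\":{" + props + "},\"required\":[" + required + "]}";
    }

    // Auto-retry: call the model again whenever the parser rejects the reply.
    static <T> T extractWithRetry(Supplier<String> llmCall, Function<String, T> parser, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return parser.apply(llmCall.get());
            } catch (RuntimeException e) {
                last = e; // keep the most recent failure for the final error
            }
        }
        throw new IllegalStateException("No parseable reply after " + maxAttempts + " attempts", last);
    }
}

/** Hypothetical record used only to exercise the sketch. */
record Review(String productName, int rating, boolean recommended) {}
```

Calling `PipelineSketch.schemaFor(Review.class)` yields an object schema with one property per record component, and `extractWithRetry` gives up with an `IllegalStateException` once `maxAttempts` replies in a row have failed to parse.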

Spring Boot integration in 2 minutes

Add the starter:

<dependency>
    <groupId>io.github.karolannmauger</groupId>
    <artifactId>llm4j-schema-spring-boot-starter</artifactId>
    <version>0.1.0</version>
</dependency>

Configure your API key:

# application.yml
llm4j:
  provider: openai        # or: anthropic
  api-key: ${OPENAI_API_KEY}

Inject and use:

@Service
public class ReviewService {

    private final LLMExtractor extractor;

    public ReviewService(LLMExtractor extractor) {
        this.extractor = extractor;
    }

    public ProductReview analyze(String text) {
        return extractor.extract(ProductReview.class, text);
    }
}

That's it. No boilerplate, no configuration classes, no manual wiring.

Switching providers is one line

# Switch from OpenAI to Anthropic (Claude)
llm4j:
  provider: anthropic
  api-key: ${ANTHROPIC_API_KEY}

Your Java code doesn't change at all.

Without Spring Boot

// Pick one client. The alternatives are commented out so the snippet compiles as a unit.

// OpenAI
LLMClient client = new OpenAIClient(System.getenv("OPENAI_API_KEY"));

// Or Anthropic
// LLMClient client = new AnthropicClient(System.getenv("ANTHROPIC_API_KEY"));

// Or your own provider: just implement the interface
// LLMClient client = (systemPrompt, userMessage) -> myCustomLLM.call(...);

LLMExtractor extractor = new LLMExtractor(client);
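
Because the client is a two-argument lambda, a canned-reply test double is easy to write. The sketch below is only a guess at the shape: `TwoArgClient` is a hypothetical stand-in for `LLMClient` (two strings in, raw reply out; the real interface's method name and return type may differ), and `CannedClient` is an invented name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Assumed shape of the client interface; the real LLMClient may differ. */
@FunctionalInterface
interface TwoArgClient {
    String complete(String systemPrompt, String userMessage);
}

/** Test double that replays canned replies in order, so no network call is needed. */
final class CannedClient implements TwoArgClient {
    private final Deque<String> replies;

    CannedClient(List<String> replies) {
        this.replies = new ArrayDeque<>(replies);
    }

    @Override
    public String complete(String systemPrompt, String userMessage) {
        if (replies.isEmpty()) {
            throw new IllegalStateException("No canned replies left");
        }
        return replies.removeFirst(); // hand back the next scripted reply
    }
}
```

Queuing a malformed reply followed by a valid one is a cheap way to exercise retry behavior in a unit test without spending tokens.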

What's next

The roadmap for v0.2.0:

Nested object support — Records containing other Records
List field support — extract arrays of items
Validation annotations — @NotNull, @Range for field constraints
Async extraction — CompletableFuture for non-blocking calls
Ollama support — run extraction against local LLMs

Try it

Available on Maven Central today:

<dependency>
    <groupId>io.github.karolannmauger</groupId>
    <artifactId>llm4j-schema-core</artifactId>
    <version>0.1.0</version>
</dependency>