I've been working on a new project called FluentSchema and would love to get your feedback on the idea.
It's trying to solve a common problem: we often define the same data structure in three different places:

- The Java `record` (our DTO)
- The Jakarta Validation annotations (`@NotNull`, `@Size`, `@Email`, ...)
- The OpenAPI `.yaml` file (for API docs)
All three have to be kept in sync manually. A change in one place means you have to remember to update the other two. It's a lot of boilerplate and easy to make mistakes.
My library tries to fix this by creating a single source of truth, inspired by libraries like Zod in the TypeScript world.
The whole idea is: Define your schema once, in code, and get everything else derived from it.
Here's the "Before" (The typical way)
```java
// 1. The DTO
public record UserRegistration(
    @NotNull @Email String email,
    @NotNull @Size(min = 8, max = 100) String password,
    @NotNull @Size(min = 2, max = 50) String firstName,
    @NotNull @Size(min = 2, max = 50) String lastName,
    @Min(13) @Max(120) Integer age,
    @Valid Address address
) {}

public record Address(
    @NotNull @Size(min = 1, max = 100) String street,
    @NotNull @Size(min = 1, max = 50) String city,
    @NotNull @Pattern(regexp = "^[0-9]{5}$") String zipCode
) {}

// 2. ...plus a separate Validator class for custom logic
// (e.g., "password can't contain email username")

// 3. ...plus 30+ lines of OpenAPI YAML to define the schema
```
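For context, the hand-rolled custom check in step 2 usually boils down to something like this. This is an illustrative sketch, not code from the repo; the `PasswordPolicy` name is hypothetical:

```java
// Hypothetical hand-rolled validator for the rule
// "password can't contain email username" (the email's local part).
class PasswordPolicy {
    static boolean isValid(String email, String password) {
        // Take everything before the '@' and compare case-insensitively.
        String localPart = email.split("@")[0].toLowerCase();
        return !password.toLowerCase().contains(localPart);
    }
}
```

Simple enough on its own, but this logic lives far away from the DTO and the YAML, which is exactly the drift problem described above.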
And here's the "After" (With FluentSchema)
```java
import static io.fluentschema.Schemas.*;

public class UserSchemas {

    @GenerateRecord // This generates the record at compile-time
    public static final ObjectSchema<User> USER_REGISTRATION = object()
        .field("email", string().email().transform(String::toLowerCase))
        .field("password", string().min(8).max(100))
        .field("firstName", string().min(2).max(50).transform(String::trim))
        .field("lastName", string().min(2).max(50).transform(String::trim))
        .field("age", integer().min(13).max(120).optional())
        .field("address", object()
            .field("street", string().min(1).max(100))
            .field("city", string().min(1).max(50))
            .field("zipCode", string().regex("^[0-9]{5}$")))
        .field("tags", array(string().min(2)).optional())
        // Custom validation lives with the schema
        .refine(user -> {
            String emailUser = user.get("email").toString().split("@")[0];
            return !user.get("password").toString().toLowerCase().contains(emailUser);
        }, "Password cannot contain email username")
        .build();
}
```
From this single definition, you get:
- Compile-Time Code Gen: The `@GenerateRecord` annotation processor generates the `User` and `Address` records. No runtime reflection.
- Runtime Validation: You can just call `USER_REGISTRATION.parse(untrustedInput)` and it validates everything, including the custom `.refine()` logic.
- Data Transformation: It handles the `.transform(String::toLowerCase)` and type coercions (like a string `"25"` to an `Integer`) automatically.
- (Planned) OpenAPI Export: The next step is to export this definition directly to JSON Schema/OpenAPI, so you can ditch the manual YAML file.
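To make the parse/transform/refine pipeline concrete, here's a minimal, self-contained sketch of the pattern in plain Java. This is not FluentSchema's actual implementation, just an illustration of how a fluent string schema can combine validation, transformation, and a custom refinement into a single `parse` call:

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Minimal sketch of the schema-as-code pattern (illustrative only):
// each builder method records a rule; parse() applies them all at once.
class StringSchema {
    private int min = 0, max = Integer.MAX_VALUE;
    private Function<String, String> transform = Function.identity();
    private Predicate<String> refinement = s -> true;
    private String refineMessage = "refinement failed";

    StringSchema min(int n) { this.min = n; return this; }
    StringSchema max(int n) { this.max = n; return this; }
    StringSchema transform(Function<String, String> f) {
        this.transform = this.transform.andThen(f);
        return this;
    }
    StringSchema refine(Predicate<String> p, String message) {
        this.refinement = p;
        this.refineMessage = message;
        return this;
    }

    // Validate + transform in one step; throws on invalid input.
    String parse(String input) {
        if (input == null || input.length() < min || input.length() > max)
            throw new IllegalArgumentException(
                "length must be in [" + min + ", " + max + "]");
        String out = transform.apply(input);
        if (!refinement.test(out))
            throw new IllegalArgumentException(refineMessage);
        return out;
    }
}
```

Usage would look like `new StringSchema().min(2).max(50).transform(String::trim).parse("  Alice  ")`, which either returns the cleaned value or throws with the rule that failed. The real library presumably does considerably more (typed fields, nested objects, error accumulation), but the core idea is the same: the schema object is both the validator and the documentation of the shape.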
This is a new project and I'm just looking for some honest feedback.
- What do you think of this "schema-as-code" approach?
- Is this a problem you've actually run into?
- Any obvious downsides or pitfalls I'm not seeing?
Here's the GitHub repo with the full README if you want to see more:
https://github.com/jlapugot/fluentschema
Thanks for taking a look!