
Generating data classes in Java

Bertil Muth ・7 min read

Kotlin has a concise syntax to declare data classes:

data class User(val name: String, val age: Int)

The equivalent Java syntax is verbose. You have to create a Java class with private fields, getter and setter methods for those fields, and additional methods like equals(), hashCode() and toString().
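To make the verbosity concrete, here is a minimal hand-written sketch of what the Kotlin one-liner roughly corresponds to in Java. It assumes mutable fields with setters, as described above (Kotlin's val properties are read-only, so this is not an exact translation), and the class name HandWrittenUser is only illustrative.

```java
import java.util.Objects;

// Hand-written counterpart of the Kotlin data class (illustrative name).
class HandWrittenUser {
    private String name;
    private int age;

    HandWrittenUser(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    // Two users are equal when all their fields are equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof HandWrittenUser)) return false;
        HandWrittenUser that = (HandWrittenUser) other;
        return age == that.age && Objects.equals(name, that.name);
    }

    // Must be consistent with equals(): derived from the same fields.
    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }

    @Override
    public String toString() {
        return "HandWrittenUser(name=" + name + ", age=" + age + ")";
    }
}
```

That is roughly forty lines for two fields, and every new field touches the constructor, the accessors, equals(), hashCode() and toString().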

But who says you have to create the Java code by hand? In this article, I'll show you how to generate Java source files from a YAML file.

Here's the example YAML file:

User:
    name: Name
    age: Integer

Name:
    firstName: String
    lastName: String

The example output of the code generator is two Java source files, User.java and Name.java.

Content of User.java:

public class User{
    private Name name;
    private Integer age;

    public User(){
    }

    public Name getName(){
        return name;
    }
    public void setName(Name name){
        this.name = name;
    }
    public Integer getAge(){
        return age;
    }
    public void setAge(Integer age){
        this.age = age;
    }
}

Name.java is similar.

The point of this article is: You'll learn how to program a code generator from scratch. And it's easy to adapt it to your needs.

The main method

The main() method does two things:

  • Step 1: Read in the YAML file, into class specifications
  • Step 2: Generate Java source files from the class specifications

It decouples reading and generating. So you can change the input format in the future, or support more input formats.
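One way to picture this decoupling is a small reader interface. The sketch below is an assumption for illustration, not code from the article: the SpecReader interface, the InlineSpecReader class and its "ClassName.fieldName=FieldType" line format are all hypothetical, and class specifications are simplified to a map from class name to field name and type. Error handling for malformed lines is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: any input format that yields
// class name -> (field name -> field type).
interface SpecReader {
    Map<String, Map<String, String>> read(List<String> lines);
}

// Hypothetical alternative input format: one "ClassName.fieldName=FieldType" entry per line.
class InlineSpecReader implements SpecReader {
    @Override
    public Map<String, Map<String, String>> read(List<String> lines) {
        Map<String, Map<String, String>> specs = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] keyAndType = line.split("=", 2);
            String[] classAndField = keyAndType[0].trim().split("\\.", 2);
            specs.computeIfAbsent(classAndField[0], k -> new LinkedHashMap<>())
                 .put(classAndField[1], keyAndType[1].trim());
        }
        return specs;
    }
}

// The generating side depends only on the interface, not on the input format.
class GeneratorSketch {
    static String describe(SpecReader reader, List<String> input) {
        StringBuilder out = new StringBuilder();
        reader.read(input).forEach((className, fields) ->
            out.append(className).append(" has fields ").append(fields.keySet()).append("\n"));
        return out.toString();
    }
}
```

Supporting another format then means adding another SpecReader implementation; GeneratorSketch stays unchanged.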

Here's the main() method:

public static void main(String[] args) throws Exception {
    // Make sure there is exactly one command line argument, the path to the YAML file
    if (args.length != 1) {
        System.out.println("Please supply exactly one argument, the absolute path of the YAML file.");
        // Stop here: without the argument there is no file to read
        return;
    }

    // Get the YAML file's handle, and the directory it's contained in
    // (generated files will be placed there)
    final String yamlFilePath = args[0];
    final File yamlFile = new File(yamlFilePath);
    final File outputDirectory = yamlFile.getParentFile();

    // Step 1: Read in the YAML file, into class specifications
    YamlClassSpecificationReader yamlReader = new YamlClassSpecificationReader();
    List<ClassSpecification> classSpecifications = yamlReader.read(yamlFile);

    // Step 2: Generate Java source files from the class specifications
    JavaDataClassGenerator javaDataClassGenerator = new JavaDataClassGenerator();
    javaDataClassGenerator.generateJavaSourceFiles(classSpecifications, outputDirectory);

    System.out.println("Successfully generated files to: " + outputDirectory.getAbsolutePath());
}

Step 1: Read in the YAML file, into class specifications

Let me explain what happens in this line:

List<ClassSpecification> classSpecifications = yamlReader.read(yamlFile);

A class specification is a definition of a class to be generated, and its fields.
Remember the User in the example YAML file?

User:
    name: Name
    age: Integer

When the YAML reader reads that, it will create one ClassSpecification object, with the name User. And that class specification will reference two FieldSpecification objects, called name and age.

The code for the ClassSpecification class and the FieldSpecification class is simple.

Content of ClassSpecification.java:

import java.util.Collections;
import java.util.List;

public class ClassSpecification {
    private String name;
    private List<FieldSpecification> fieldSpecifications;

    public ClassSpecification(String className, List<FieldSpecification> fieldSpecifications) {
        this.name = className;
        this.fieldSpecifications = fieldSpecifications;
    }

    public String getName() {
        return name;
    }

    // Callers get a read-only view of the field specifications
    public List<FieldSpecification> getFieldSpecifications() {
        return Collections.unmodifiableList(fieldSpecifications);
    }
}

Content of FieldSpecification.java:

public class FieldSpecification {
    private String name;
    private String type;

    public FieldSpecification(String fieldName, String fieldType) {
        this.name = fieldName;
        this.type = fieldType;
    }

    public String getName() {
        return name;
    }

    public String getType() {
        return type;
    }
}
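As a quick check of how the two classes fit together, the sketch below builds by hand the specification that the YAML reader is expected to produce for the User entry. The UserSpecificationExample class is hypothetical, and the two specification classes are repeated in condensed form only so the snippet compiles on its own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Condensed copy of FieldSpecification, so the snippet compiles on its own.
class FieldSpecification {
    private final String name;
    private final String type;

    FieldSpecification(String fieldName, String fieldType) {
        this.name = fieldName;
        this.type = fieldType;
    }

    public String getName() { return name; }
    public String getType() { return type; }
}

// Condensed copy of ClassSpecification, so the snippet compiles on its own.
class ClassSpecification {
    private final String name;
    private final List<FieldSpecification> fieldSpecifications;

    ClassSpecification(String className, List<FieldSpecification> fieldSpecifications) {
        this.name = className;
        this.fieldSpecifications = fieldSpecifications;
    }

    public String getName() { return name; }

    public List<FieldSpecification> getFieldSpecifications() {
        return Collections.unmodifiableList(fieldSpecifications);
    }
}

// Hypothetical example class, not part of the article's code.
class UserSpecificationExample {
    // One class specification named User, referencing two field specifications.
    static ClassSpecification userSpecification() {
        return new ClassSpecification("User", Arrays.asList(
            new FieldSpecification("name", "Name"),
            new FieldSpecification("age", "Integer")));
    }

    public static void main(String[] args) {
        ClassSpecification user = userSpecification();
        System.out.println(user.getName());
        for (FieldSpecification field : user.getFieldSpecifications()) {
            System.out.println("  " + field.getName() + ": " + field.getType());
        }
    }
}
```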

The only remaining question for Step 1 is: how do you get from a YAML file to objects of these classes?

The YAML reader uses the SnakeYAML library to parse YAML files.
SnakeYAML makes a YAML file's content available in data structures like maps and lists. For this article, you only need to understand maps, because that's what the YAML files use.

Look at the example again:

User:
    name: Name
    age: Integer

Name:
    firstName: String
    lastName: String

What you see here is two nested maps.
The key of the outer map is the class name (like User).
When you get the value for the User key, you get a map of the class's fields:

    name: Name
    age: Integer

The key of this inner map is the field name, the value is the field type.
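To make the nesting concrete, here is the same structure built by hand with plain Java maps, without any YAML parsing. The NestedMapExample class is hypothetical, and using LinkedHashMap to keep the entries in file order is my own choice for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class, not part of the article's code.
class NestedMapExample {
    // Builds by hand the structure that parsing the example YAML is expected to yield.
    static Map<String, Map<String, String>> exampleSpecifications() {
        Map<String, String> userFields = new LinkedHashMap<>();
        userFields.put("name", "Name");
        userFields.put("age", "Integer");

        Map<String, String> nameFields = new LinkedHashMap<>();
        nameFields.put("firstName", "String");
        nameFields.put("lastName", "String");

        Map<String, Map<String, String>> specifications = new LinkedHashMap<>();
        specifications.put("User", userFields);
        specifications.put("Name", nameFields);
        return specifications;
    }

    public static void main(String[] args) {
        // Outer key = class name, inner key = field name, inner value = field type
        exampleSpecifications().forEach((className, fields) ->
            fields.forEach((fieldName, fieldType) ->
                System.out.println(className + "." + fieldName + " : " + fieldType)));
    }
}
```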

It's a map of strings to a map of strings to strings.
That's important for understanding the code of the YAML reader.
Here's the method that reads in the complete YAML file contents:

private Map<String, Map<String, String>> readYamlClassSpecifications(Reader reader) {
    Yaml yaml = new Yaml();

    // Read in the complete YAML file to a map of strings to a map of strings to strings
    Map<String, Map<String, String>> yamlClassSpecifications = 
        (Map<String, Map<String, String>>) yaml.load(reader);

    return yamlClassSpecifications;
}

With the yamlClassSpecifications as input, the YAML reader creates the ClassSpecification objects:

private List<ClassSpecification> createClassSpecificationsFrom(Map<String, Map<String, String>> yamlClassSpecifications) {
    final Map<String, List<FieldSpecification>> classNameToFieldSpecificationsMap 
        = createClassNameToFieldSpecificationsMap(yamlClassSpecifications);

    // Create one ClassSpecification per map entry (class name -> field specifications)
    List<ClassSpecification> classSpecifications = 
        classNameToFieldSpecificationsMap.entrySet().stream()
            .map(e -> new ClassSpecification(e.getKey(), e.getValue()))
            .collect(Collectors.toList());

    return classSpecifications;
}

The createClassNameToFieldSpecificationsMap() method creates

  • the field specifications for each class, and based on these
  • a map of each class name to its field specifications.

Then, the YAML reader creates a ClassSpecification object for each entry in that map.
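The createClassNameToFieldSpecificationsMap() method itself isn't shown here. The following is one possible implementation, written as a sketch under my own assumptions rather than the article's actual code: it lives in a hypothetical FieldSpecificationMapper class, uses plain loops, and keeps insertion order with a LinkedHashMap. FieldSpecification is repeated in condensed form so the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Condensed copy of FieldSpecification, so the snippet compiles on its own.
class FieldSpecification {
    private final String name;
    private final String type;

    FieldSpecification(String fieldName, String fieldType) {
        this.name = fieldName;
        this.type = fieldType;
    }

    public String getName() { return name; }
    public String getType() { return type; }
}

// Hypothetical holder class for the sketch.
class FieldSpecificationMapper {
    // Assumed implementation: turns each inner map (field name -> field type)
    // into a list of FieldSpecification objects, keyed by the class name.
    static Map<String, List<FieldSpecification>> createClassNameToFieldSpecificationsMap(
            Map<String, Map<String, String>> yamlClassSpecifications) {
        Map<String, List<FieldSpecification>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> classEntry : yamlClassSpecifications.entrySet()) {
            List<FieldSpecification> fields = new ArrayList<>();
            for (Map.Entry<String, String> fieldEntry : classEntry.getValue().entrySet()) {
                fields.add(new FieldSpecification(fieldEntry.getKey(), fieldEntry.getValue()));
            }
            result.put(classEntry.getKey(), fields);
        }
        return result;
    }
}
```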

The contents of the YAML file are now available to Step 2 in a YAML-independent way. We're done with Step 1.

Step 2: Generate Java source files from the class specifications

Apache FreeMarker is a Java template engine that produces textual output. Templates are written in the FreeMarker Template Language (FTL). It lets you mix static text with the content of Java objects.

Here's the template to generate the Java source files, javadataclass.ftl:

public class ${classSpecification.name}{
<#list classSpecification.fieldSpecifications as field>
    private ${field.type} ${field.name};
</#list>

    public ${classSpecification.name}(){
    }

<#list classSpecification.fieldSpecifications as field>
    public ${field.type} get${field.name?cap_first}(){
        return ${field.name};
    }
    public void set${field.name?cap_first}(${field.type} ${field.name}){
        this.${field.name} = ${field.name};
    }
</#list>
}

Let's look at the first line:

public class ${classSpecification.name}{

You see it begins with the static text of a class declaration: public class. The interesting bit is in the middle: ${classSpecification.name}

When FreeMarker processes the template, it accesses the classSpecification object in its data model and calls the getName() method on it.

What about this part of the template?

<#list classSpecification.fieldSpecifications as field>
    private ${field.type} ${field.name};
</#list>

First, FreeMarker calls classSpecification.getFieldSpecifications(). It then iterates over the field specifications and emits one private field declaration for each.
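If the list directive feels unfamiliar, it may help to see roughly the same loop in plain Java. This is only an analogy, not how FreeMarker works internally: the FieldDeclarationSketch class is hypothetical, and fields are simplified to a map from field name to field type.

```java
import java.util.Map;

// Hypothetical example class, not part of the article's code.
class FieldDeclarationSketch {
    // Plain-Java analogue of the list block: one private field declaration per field.
    static String fieldDeclarations(Map<String, String> fieldNameToType) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> field : fieldNameToType.entrySet()) {
            out.append("    private ")
               .append(field.getValue())   // corresponds to ${field.type}
               .append(' ')
               .append(field.getKey())     // corresponds to ${field.name}
               .append(";\n");
        }
        return out.toString();
    }
}
```

The template expresses the same iteration declaratively, which keeps the generated code's layout visible at a glance.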

One last thing. That line is a bit odd:

public ${field.type} get${field.name?cap_first}(){

Let's say the example field is age: Integer (in YAML).
FreeMarker translates this to:

public Integer getAge(){

So ?cap_first means: capitalize the first letter, as the YAML file contains age in lower case letters.
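For a simple field name, the effect can be approximated in plain Java as follows. This is a rough analogue for illustration, not FreeMarker's implementation, and the CapFirstSketch class and its method names are hypothetical.

```java
// Hypothetical example class, not part of the article's code.
class CapFirstSketch {
    // Rough analogue of ?cap_first for a simple field name: upper-case the first character.
    static String capFirst(String text) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        return Character.toUpperCase(text.charAt(0)) + text.substring(1);
    }

    // Builds the getter signature that the template line is expected to produce.
    static String getterSignature(String fieldType, String fieldName) {
        return "public " + fieldType + " get" + capFirst(fieldName) + "(){";
    }
}
```

Only the first character changes, so a camel-case name like firstName becomes FirstName and the getter is named getFirstName.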

Enough about templates. How do you generate the Java source files?
First, you need to configure FreeMarker by creating a Configuration instance. This happens in the constructor of the JavaDataClassGenerator:

public JavaDataClassGenerator() throws IOException {
    configuration = new Configuration(Configuration.VERSION_2_3_28);

    // Set the root of the class path ("") as the location to find templates
    configuration.setClassLoaderForTemplateLoading(getClass().getClassLoader(), "");
}


To generate source files, the JavaDataClassGenerator iterates over the class specifications, and generates a source file for each:

public void generateJavaSourceFiles(Collection<ClassSpecification> classSpecifications, File yamlFileDirectory) throws Exception {
    Map<String, Object> freemarkerDataModel = new HashMap<>();

    // Get the template to generate Java source files
    Template template = configuration.getTemplate("javadataclass.ftl");

    for (ClassSpecification classSpecification : classSpecifications) {
        // Put the classSpecification into the data model.
        // It can be accessed in the template through ${classSpecification}
        freemarkerDataModel.put("classSpecification", classSpecification);

        // The Java source file will be generated in the same directory as the YAML file
        File javaSourceFile = new File(yamlFileDirectory, classSpecification.getName() + ".java");

        // Generate the Java source file.
        // try-with-resources closes the writer, so the output is flushed to disk
        try (Writer javaSourceFileWriter = new FileWriter(javaSourceFile)) {
            template.process(freemarkerDataModel, javaSourceFileWriter);
        }
    }
}

And that's it.


I showed you how to build a Java source code generator based on YAML files. I picked YAML because it is easy to process. And thus, easy to teach.
Replace it with another format if you see fit.

You can find the complete code on GitHub.

To make the code as understandable as possible, I took a few shortcuts:

  • No methods like equals(), hashCode() and toString()
  • No inheritance of data classes
  • Generated Java classes are in the default package
  • The output directory is the same as the input directory
  • Error handling hasn't been my focus

A production-ready solution would need to deal with those issues. Also, for data classes, Project Lombok is an alternative without code generation.

So think of this article as a beginning, not an end.
Imagine what is possible.

A few examples:

  • Scaffold JPA entity classes or Spring repositories
  • Generate several classes from one specification, based on patterns in your application
  • Generate code in different programming languages
  • Produce documentation

I currently use this approach to translate natural language requirements
directly to code, for research purposes. What will you do?


Pretty nice! This could be a good approach when designing large systems with lots of models. I actually had a similar idea in the past, because I needed to generate getters/setters for about 100+ POJOs.

Right now I'm just mixing Kotlin and Java, so I benefit from Kotlin's Data classes whilst using Java for the most parts.


Yes, there are many ways to reach a goal! :-) As you said, my approach is especially useful with large numbers of classes, and if you have to customize the generated classes further.


What will you do?

I will continue to let IntelliJ do it for me...


I got some useful ideas from your post. Thank you for that. It seems that Java 12 introduces a record concept. See


I am glad you found it interesting - I recently read the article you referenced, I didn't know about Java 12's record before. It looks like a spitting image of Kotlin.

Code generation is useful beyond it, of course, it was just a simple example.


We weren't promised that Java 12 would have Records, it's something they are looking at though, if not in Java 12 then in Java 13+. For now, we could use your solution :).