yangbongsoo

Posted on Mar 10

Deep dive into ObjectMapper

#java #eventdriven #kafka #springboot

1. serialize/deserialize issue

When operating on an EDA (Event-Driven Architecture) basis, particular attention must be paid to the serialization/deserialization of messages. This is because the systems are distributed.

For example, suppose System1 modifies a message object. System1 serializes and publishes the message, but the consuming systems owned by other organizations, such as System2 and System3, may be deployed on a different schedule. If a consumer has not yet picked up the change, the mismatch between what is published and what the consumer expects can lead to failures. Therefore, it is crucial to manage the ObjectMapper effectively.

1-1. FAIL_ON_UNKNOWN_PROPERTIES

The FAIL_ON_UNKNOWN_PROPERTIES option always requires careful attention.

The FAIL_ON_UNKNOWN_PROPERTIES option determines whether deserialization should fail when the JSON contains a property that has no matching field in the target class (an unknown property).
cf) A typical case: a new field is added to the serialized object and deployed, but the consuming systems (where the object is deserialized) have not yet added that field.

If this option is set to true, Jackson throws an UnrecognizedPropertyException (a subclass of JsonMappingException) and deserialization fails. To prevent this, either set the option to false (to ignore unknown properties) or annotate the class being deserialized with @JsonIgnoreProperties(ignoreUnknown = true).

1-2. Spring Boot FAIL_ON_UNKNOWN_PROPERTIES

When using Spring Boot, the auto-configured ObjectMapper has FAIL_ON_UNKNOWN_PROPERTIES set to false by default. Spring's Jackson2ObjectMapperBuilder and Jackson2ObjectMapperFactoryBean both apply this customization.

However, if a new ObjectMapper is created directly, it does not go through Jackson2ObjectMapperFactoryBean, thus FAIL_ON_UNKNOWN_PROPERTIES remains true (the default value).

Therefore, it needs to be disabled as shown in the code below.

private static ObjectMapper objectMapper() {
    // Build a standalone mapper that ignores unknown JSON properties
    return JsonMapper.builder()
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
        .build();
}

1-3. codehaus and fasterxml

There are services that mix org.codehaus.jackson (Jackson 1.x) and com.fasterxml.jackson (Jackson 2.x). Because such a service has a long history, it tends to have multiple ObjectMappers. Even when you want to add an option to an ObjectMapper, it is hard to predict the scope of impact, which makes modifications difficult. In other words, neglecting ObjectMapper management carries significant risk. If the impact of changing ObjectMapper settings is hard to predict, it's safer to add the @JsonIgnoreProperties(ignoreUnknown = true) annotation to all classes.

In practice, this has caused problems. A class imported com.fasterxml.jackson.annotation.JsonIgnoreProperties and was annotated with @JsonIgnoreProperties, but the ObjectMapper used for deserialization came from codehaus, so the annotation was not recognized and a "Could not read JSON: Unrecognized field" error occurred. A wrong import is an easy mistake to make, and there's a high chance it will be overlooked even during PR review. In services that mix the two libraries, it's advisable to standardize on the codehaus annotations, since Jackson 2 can also be made to recognize them. Note that this does not happen out of the box: it requires registering a Jackson 1 annotation introspector, such as the one provided by the jackson-legacy-introspector module. Later, when phasing out Jackson 1, it's better to switch to fasterxml in a unified manner.

2. Understanding the Principles of Deserialization

If Jackson decides to deserialize into the Sample class type, it first looks for a no-argument constructor in the Sample class. If one exists, it uses that constructor to create a Sample instance and then looks for setters. It then checks for setters with the same name as the keys in the JSON string and injects the values accordingly.

public static class Sample {
    private String name;
    private String address;

    public Sample() { }

    public void setName(String name) {
        this.name = name;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}

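Let's look at another case. The JSON delivers name, address, and gender as follows:

{
  "name": "ybs",
  "address": "seoul",
  "gender": "male"
}

However, the Sample class being deserialized into has no setter for gender. In this case, name and address are injected through their setters. For gender, Jackson looks for a field with the same name and, if one is usable, injects the value using reflection. Note that with default settings only public fields are auto-detected. A private field such as gender is used only when it is made visible in some other way, for example through a public getter for the same property, a @JsonProperty annotation, or a widened field visibility (setVisibility(PropertyAccessor.FIELD, Visibility.ANY)). If no usable field is found, the property is either ignored or an exception is thrown, depending on the FAIL_ON_UNKNOWN_PROPERTIES setting.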

// 1. setter
// 2. reflection
public static class Sample {
    private String name;
    private String address;
    private String gender;

    public Sample() { }

    public void setName(String name) {
        this.name = name;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}
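The lookup order described above (setter first, then the field) can be sketched with plain JDK reflection. This is not Jackson's implementation: NaiveBinder, inject, and Person are hypothetical names used only for illustration, the sketch handles String values only, and it skips Jackson's visibility rules by calling setAccessible(true) unconditionally. The failOnUnknown flag stands in for FAIL_ON_UNKNOWN_PROPERTIES.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical stand-in for property binding; not Jackson code.
final class NaiveBinder {
    /**
     * Tries a public setX(String) first, then a declared field named x.
     * For an unknown key it returns false, or throws when failOnUnknown is true.
     */
    static boolean inject(Object target, String key, String value, boolean failOnUnknown) {
        Class<?> type = target.getClass();
        String setterName = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
        try {
            try {
                Method setter = type.getMethod(setterName, String.class);
                setter.invoke(target, value);      // step 1: setter
                return true;
            } catch (NoSuchMethodException noSetter) {
                Field field = type.getDeclaredField(key);
                field.setAccessible(true);         // step 2: reflection on the field
                field.set(target, value);
                return true;
            }
        } catch (NoSuchFieldException unknown) {
            if (failOnUnknown) {
                throw new IllegalArgumentException("Unrecognized field \"" + key + "\"");
            }
            return false;                          // unknown property is ignored
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person();
        inject(p, "name", "ybs", true);     // goes through setName
        inject(p, "gender", "male", true);  // no setter, so the field is written directly
        System.out.println(p.describe());
        System.out.println(inject(p, "age", "30", false)); // unknown key, ignored
    }
}

final class Person {
    private String name;
    private String gender; // no setter on purpose

    public void setName(String name) { this.name = name; }

    String describe() { return name + "/" + gender; }
}
```

Passing true as the last argument for the unknown "age" key makes the sketch throw instead, which mirrors the strict setting.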

Next, if there is an all-argument constructor, it injects the values through this constructor. However, simply adding the constructor is not enough; it must be indicated with @JsonCreator and @JsonProperty annotations.

// 1. constructor
// 2. setter
// 3. reflection
public static class Sample {
    private String name;
    private String address;
    private String gender;

    @JsonCreator
    public Sample(
        @JsonProperty("name") String name,
        @JsonProperty("address") String address,
        @JsonProperty("gender") String gender
    ) {
        this.name = name;
        this.address = address;
        this.gender = gender;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}

However, since the above method can be cumbersome, there is an alternative approach that involves using the @ConstructorProperties annotation, as follows.

public static class Sample {
    private final String name;
    private final String address;
    private final String gender;

    @ConstructorProperties({"name", "address", "gender"})
    public Sample(String name, String address, String gender) {
        this.name = name;
        this.address = address;
        this.gender = gender;
    }
}
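To see why @ConstructorProperties is enough on its own, here is a minimal JDK-only sketch of how a library could read the annotation and map keys to constructor arguments. CreatorSketch, create, and Address are hypothetical names, all parameters are assumed to be Strings, and Jackson's real creator detection is considerably more involved.

```java
import java.beans.ConstructorProperties;
import java.lang.reflect.Constructor;
import java.util.Map;

// Hypothetical sketch; not Jackson code.
final class CreatorSketch {
    /** Finds a constructor annotated with @ConstructorProperties and calls it with values looked up by name. */
    static <T> T create(Class<T> type, Map<String, String> values) {
        for (Constructor<?> ctor : type.getDeclaredConstructors()) {
            ConstructorProperties names = ctor.getAnnotation(ConstructorProperties.class);
            if (names == null) {
                continue; // not an annotated creator
            }
            String[] keys = names.value();
            Object[] args = new Object[keys.length];
            for (int i = 0; i < keys.length; i++) {
                args[i] = values.get(keys[i]); // the annotation supplies the parameter names
            }
            try {
                return type.cast(ctor.newInstance(args));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        throw new IllegalArgumentException("No @ConstructorProperties constructor on " + type.getName());
    }

    public static void main(String[] args) {
        Address a = create(Address.class, Map.of("name", "ybs", "city", "seoul"));
        System.out.println(a.describe());
    }
}

final class Address {
    private final String name;
    private final String city;

    @ConstructorProperties({"name", "city"})
    Address(String name, String city) {
        this.name = name;
        this.city = city;
    }

    String describe() { return name + "@" + city; }
}
```

Because the names live in one annotation on the constructor, the fields can stay final and no per-parameter @JsonProperty is needed.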

How does deserialization work with inheritance? Jackson first injects values through the constructor, but the constructor has no parameter for the parent field, so that field is not set there. Next, it looks for a setter. If no setter is available, it finally resorts to reflection to inject the field directly, since the field exists (provided the field is visible to Jackson).

// 1. constructor
// 2. setter
// 3. reflection
public static abstract class Parent {
    private String parent;
}

public static class Sample extends Parent {
    // Fields assigned by the constructor and setters below
    private String name;
    private String address;

    @ConstructorProperties({"name", "address"})
    public Sample(String name, String address) {
        this.name = name;
        this.address = address;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}

However, since using setters is preferable to reflection, setters can be created as follows

public static abstract class Parent {
    private String parent;

    void setParent(String parent) {
        this.parent = parent;
    }
}
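One detail worth seeing in code is that a private field declared on the parent is not returned by getDeclaredField on the child, so any reflection-based injection has to walk up the class hierarchy. The sketch below uses only the JDK. FieldLookup, Child, and ParentBase are hypothetical names, and this is not how Jackson itself is implemented.

```java
import java.lang.reflect.Field;

// Hypothetical sketch; not Jackson code.
final class FieldLookup {
    /** Walks up the hierarchy, because getDeclaredField only sees the class it is called on. */
    static Field find(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException notHere) {
                // keep climbing to the superclass
            }
        }
        return null;
    }

    /** Writes the field wherever it is declared in the hierarchy. */
    static void set(Object target, String name, Object value) {
        Field field = find(target.getClass(), name);
        if (field == null) {
            throw new IllegalArgumentException("No field named " + name);
        }
        try {
            field.setAccessible(true);
            field.set(target, value);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Child child = new Child();
        set(child, "parent", "from-json");
        System.out.println(child.parent());
    }
}

abstract class ParentBase {
    private String parent; // declared on the parent, no setter

    String parent() { return parent; }
}

final class Child extends ParentBase {
    private String name;
}
```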

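For Collection types, there's an additional path: the getter. As in the code below, there might be no setter, but a getter could exist. For a Collection (or Map) property, Jackson can call the getter and, if it returns a non-null collection, add the deserialized elements to that instance. This is governed by MapperFeature.USE_GETTERS_AS_SETTERS, which is enabled by default. If the getter returns null, there is nothing to add to, so this path cannot populate the property. Exactly where the getter ranks relative to direct field access can depend on the configuration (for example, whether the field is final and ALLOW_FINAL_FIELDS_AS_MUTATORS is disabled), so it is worth verifying against the Jackson version in use.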

// 1. constructor
// 2. setter
// 3. getter !!
// 4. reflection
public static class Sample {
    private List<String> samples = new ArrayList<>();

    public List<String> getSamples() {
        return samples;
    }
}

There's a caveat to be aware of. You should not initialize the field with the List.of() factory method as shown below, because it returns an unmodifiable list (an instance from java.util.ImmutableCollections). An UnsupportedOperationException is thrown the moment Jackson attempts to add to it.

public static class Sample {
    private List<String> samples = List.of();

    public List<String> getSamples() {
        return samples;
    }
}

Similarly, if the getter wraps the list with unmodifiableList, an error occurs the moment Jackson attempts to add to it. Therefore, as a solution, one should either handle it with @JsonIgnore or avoid using unmodifiableList.

public static class Sample {
    private List<String> samples = new ArrayList<>();

    // Add @JsonIgnore or remove unmodifiableList()
    public List<String> getSamples() {
        return Collections.unmodifiableList(samples);
    }
}
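Both failure cases come down to plain JDK behavior, which the following sketch reproduces without Jackson. GetterFillDemo and tryFill are hypothetical names; tryFill only imitates the "call the getter, then add to the returned list" step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the getter-based path; not Jackson code.
final class GetterFillDemo {
    /** Adds values to whatever list the getter handed back; returns false if the list rejects the write. */
    static boolean tryFill(List<String> fromGetter, List<String> values) {
        try {
            fromGetter.addAll(values);
            return true;
        } catch (UnsupportedOperationException rejected) {
            return false; // the list is unmodifiable
        }
    }

    public static void main(String[] args) {
        List<String> values = List.of("item1", "item2");
        System.out.println(tryFill(new ArrayList<>(), values));                                  // true
        System.out.println(tryFill(List.of(), values));                                          // false
        System.out.println(tryFill(Collections.unmodifiableList(new ArrayList<>()), values));    // false
    }
}
```

In Jackson the exception is not swallowed like this; it surfaces as a deserialization failure.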

3. Deserialization with polymorphism

When you specify the necessary concrete class (Sample) while using readValue with ObjectMapper, it automatically deserializes into the Sample type.

Sample sample = OBJECT_MAPPER.readValue(json, Sample.class);

The need to deserialize into an interface type typically arises when a framework handles the deserialization for you. For example, consider the scenario where there are AEvent, BEvent, and CEvent classes that implement the Event interface. The raiseEvent method receives an argument of the Event type, but you want it to be deserialized into the appropriate concrete class based on the situation.

public static interface Event {
  ...
}

public static class AEvent implements Event { ... }
public static class BEvent implements Event { ... }
public static class CEvent implements Event { ... }

... 

protected void raiseEvent(Event event) {
    delegate.raiseEvent(event);
}

Of course, we can create a class capable of holding all messages and deserialize into it. However, this approach requires understanding which fields are used and which are not, making it challenging to manage. The prevalence of nullable cases makes it convenient initially but difficult to maintain over time.

To deserialize into the appropriate concrete class even when dealing with an interface type, there needs to be something that can identify the correct class. This is where @JsonTypeInfo comes in handy. By attaching this annotation, the type information is embedded during serialization, indicating which class to deserialize into.

@JsonTypeInfo(use = Id.NAME, include = As.WRAPPER_OBJECT)
private List<Event> events = new ArrayList<>();

In other words, when serializing an interface type, the type name is also serialized alongside the data to ensure that the type can be identified upon deserialization. This is done by embedding an id value, which is used to select the correct class to deserialize into. If the strategy is WRAPPER_OBJECT, the data is wrapped, resulting in an additional layer of depth. This can be observed in the JSON structure, where, for example, AEvent would be used as a key, and the actual data would be nested within an additional layer beneath it.

{
  "events":
    [
      {
        "AEvent":{"sourceVersion":100,"id":1}
      }

        ...
    ]
}

Changing @JsonTypeInfo settings requires caution, as errors can occur if the deserialization side is not aligned with the serialization side. With Id.NAME, serialization writes only a logical type name (here, the simple class name), so moving a class to another package does not affect it. With Id.CLASS, however, the fully-qualified Java class name is serialized, including the package name, so any change in the package affects deserialization.

{
  "events":
    [
      {
        "com.toy.AEvent":{"sourceVersion":100,"id":1}
      }

        ...
    ]
}

And if PROPERTY is used instead of WRAPPER_OBJECT, the type information is passed using the @type key.

{
  "events":
    [
      {
        "@type":"AEvent",
        "sourceVersion":100,
        "id":1
      }

        ...
    ]
}

Next, the @JsonSubTypes annotation, which pairs with @JsonTypeInfo, is specified separately on the setter. It registers the mapping from each serialized type name to the concrete class to restore when deserializing an interface type.

@JsonSubTypes({
    @Type(value = AEvent.class, name = "AEvent"),
    @Type(value = BEvent.class, name = "BEvent"),
    @Type(value = CEvent.class, name = "CEvent"),
})
protected void setEvents(List<Event> events) {
    this.events = events;
}

cf) Caution: When adding a new Event type, it must be added here to ensure proper deserialization.

If DEvent is created but not added to JsonSubTypes, the following error occurs:

com.fasterxml.jackson.databind.exc.InvalidTypeIdException:
Could not resolve type id 'DEvent' as a subtype of `com.toy.Event`: known type ids = [AEvent, BEvent, CEvent] (for POJO property 'events')
at [Source: (String)"{"events":[{"AEvent":{"type":"AEvent","sourceVersion":100,"id":1}},{"BEvent":{"type":"BEvent","sourceVersion":200,"id":2}},{"CEvent":{"type":"CEvent","sourceVersion":300,"id":3}},{"DEvent":{"type":"DEvent","sourceVersion":400,"id":4}}]}"; line: 1, column: 181] (through reference chain: com.toy.Process["events"]->java.util.ArrayList[3])
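The error above is essentially a failed lookup in a name-to-class table. A minimal JDK-only sketch of that idea follows. TypeIdRegistry, register, and resolve are hypothetical names, and the message format only loosely imitates Jackson's.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a type-id table; not Jackson code.
final class TypeIdRegistry {
    private final Map<String, Class<? extends Event>> byName = new LinkedHashMap<>();

    /** Plays the role of one @Type(value = ..., name = ...) entry. */
    TypeIdRegistry register(String name, Class<? extends Event> type) {
        byName.put(name, type);
        return this;
    }

    /** Resolves a serialized type id to a concrete class, or fails listing the known ids. */
    Class<? extends Event> resolve(String typeId) {
        Class<? extends Event> type = byName.get(typeId);
        if (type == null) {
            throw new IllegalArgumentException(
                "Could not resolve type id '" + typeId + "': known type ids = " + byName.keySet());
        }
        return type;
    }

    public static void main(String[] args) {
        TypeIdRegistry registry = new TypeIdRegistry()
            .register("AEvent", AEvent.class)
            .register("BEvent", BEvent.class);
        System.out.println(registry.resolve("AEvent").getSimpleName());
        try {
            registry.resolve("DEvent"); // never registered
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}

interface Event { }

final class AEvent implements Event { }

final class BEvent implements Event { }
```

The consumer can only resolve names it has registered, which is why a new event type has to be added on the deserializing side before producers start publishing it.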

4. Other Options in ObjectMapper

First, we disable the ALLOW_FINAL_FIELDS_AS_MUTATORS setting.

.disable(MapperFeature.ALLOW_FINAL_FIELDS_AS_MUTATORS)

In the code below, the final field 'name' was initialized at the time of declaration, making it immutable thereafter. However, during deserialization, Jackson can use reflection to modify its value. To prevent this, we disable the ALLOW_FINAL_FIELDS_AS_MUTATORS setting.

public static class Sample {
    private final String name = "ybs";
}
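That a final field can be rewritten at all may be surprising, so here is a JDK-only sketch of the underlying mechanism. FinalFieldDemo, overwriteName, and Profile are hypothetical names. The field is assigned in the constructor rather than at the declaration, because a final String initialized with a literal is a compile-time constant whose reads the compiler may inline. Newer JDK releases may warn about or restrict this kind of write, so treat it as an illustration of current behavior rather than a guarantee.

```java
import java.lang.reflect.Field;

// Hypothetical sketch; not Jackson code.
final class FinalFieldDemo {
    /** Overwrites the final field through reflection and returns the value read back. */
    static String overwriteName(Profile target, String newValue) {
        try {
            Field field = Profile.class.getDeclaredField("name");
            field.setAccessible(true); // also enables writes to final instance fields
            field.set(target, newValue);
            return (String) field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Profile profile = new Profile();
        System.out.println(profile.getName());
        overwriteName(profile, "changed");
        System.out.println(profile.getName());
    }
}

final class Profile {
    // Assigned in the constructor, so this is not a compile-time constant.
    private final String name;

    Profile() {
        this.name = "ybs";
    }

    String getName() { return name; }
}
```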

We also disable the FAIL_ON_EMPTY_BEANS setting.

.disable(SerializationFeature.FAIL_ON_EMPTY_BEANS)

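Note that FAIL_ON_EMPTY_BEANS is a SerializationFeature. By default, Jackson throws an error when asked to serialize an instance of a class for which it finds no properties (no accessors or visible fields), such as the empty Sample class below. Disabling FAIL_ON_EMPTY_BEANS makes Jackson write an empty object ({}) instead.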

public static class Sample {
}

The situation where this option is useful is as follows: data of types A, B, and C is received from a Kafka topic, but you want to ignore type C. While filtering can be done up front using Kafka header values, there are cases where you need to deserialize the data to determine its type. Since there's no need to define fields for a type C that you won't use, you can create an empty shell class and skip it when necessary. Deserializing into the shell class relies on unknown properties being ignored; FAIL_ON_EMPTY_BEANS comes into play if the shell object is later serialized, for example if it is written back out as JSON.

Type Cache

The LRUMap is used as a cache for the TypeFactory, allowing Jackson to internally cache information about the types during serialization and deserialization. This means that when searching for a required type, Jackson first checks the cache, retrieves it if found, and uses it.

return objectMapper.setTypeFactory(
    objectMapper.getTypeFactory()
        .withCache((LookupCache<Object, JavaType>)new LRUMap<Object, JavaType>(5120, 5120))
);

The default cache size is 200.

protected TypeFactory(LookupCache<Object,JavaType> typeCache, TypeParser p,
                      TypeModifier[] mods, ClassLoader classLoader)
{
    if (typeCache == null) {
        // initialEntries : 16
        // maxEntries : 200
        typeCache = new LRUMap<>(16, 200);
    }
}

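When the cache becomes full and a new type needs to be added, the LRUMap shown below (from the Jackson 2.12/2.13 line) clears itself entirely within a synchronized block and starts over from scratch. Since 200 slots can fill up quickly, the cache size was increased to 5120. Newer Jackson releases changed the LRUMap implementation, so check the behavior for the version you use.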

package com.fasterxml.jackson.databind.util;

public class LRUMap<K,V>
    implements LookupCache<K,V>, // since 2.12
        java.io.Serializable
{
    @Override
    public V put(K key, V value) {
        if (_map.size() >= _maxEntries) {
            // double-locking, yes, but safe here; trying to avoid "clear storms"
            synchronized (this) {
                if (_map.size() >= _maxEntries) {
                    clear();
                }
            }
        }
        return _map.put(key, value);
    }

    ...
}
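The effect of this put logic is easy to observe with a simplified stand-in. ClearOnFullCache below is a hypothetical class that copies only the clear-when-full check from the snippet above onto a ConcurrentHashMap; it is not Jackson's LRUMap.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in that mirrors only the clear-when-full behavior.
final class ClearOnFullCache<K, V> {
    private final int maxEntries;
    private final ConcurrentHashMap<K, V> map;

    ClearOnFullCache(int initialEntries, int maxEntries) {
        this.maxEntries = maxEntries;
        this.map = new ConcurrentHashMap<>(initialEntries);
    }

    V put(K key, V value) {
        if (map.size() >= maxEntries) {
            synchronized (this) {
                if (map.size() >= maxEntries) {
                    map.clear(); // every cached entry is dropped, not just the oldest one
                }
            }
        }
        return map.put(key, value);
    }

    V get(K key) { return map.get(key); }

    int size() { return map.size(); }

    public static void main(String[] args) {
        ClearOnFullCache<String, Integer> cache = new ClearOnFullCache<>(16, 3);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.put("d", 4);                   // the cache is full, so it is cleared first
        System.out.println(cache.size());    // 1
        System.out.println(cache.get("a"));  // null
    }
}
```

With a small limit, a steady stream of distinct keys keeps wiping the cache, which is the motivation for raising the size.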

Lastly, let's examine exactly what is being cached. In the LookupCache<Object, JavaType>, the key is an Object, and the value is a JavaType. JavaType (com.fasterxml.jackson.databind.JavaType) contains information about the type corresponding to the Object (key), including super class type, super interface type, and so on.

When deserializing the following JSON string into a YBS object, the items field of the YBS class is of type List interface, but the actual concrete type created is ArrayList.

// json = "{\"items\":[\"item1\",\"item2\"]}";

@Getter
@Setter
public class YBS {

    private List<String> items;

    @ConstructorProperties({"items"})
    public YBS(List<String> items) {
        this.items = items;
    }
}

As a result, a JavaType is resolved and cached for the types associated with ArrayList, including its superclasses and the interfaces it implements.

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable { ... }

Ultimately, it can be observed that various objects are cached for a single List field.
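To get a feel for how many types sit behind one List field, the following JDK-only sketch enumerates the class hierarchy of ArrayList. TypeHierarchy and allSupertypes are hypothetical names. Which of these types Jackson actually caches is not shown here; the sketch only lists the candidates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; it only walks the class hierarchy and does not touch Jackson's cache.
final class TypeHierarchy {
    /** Collects the class itself plus every superclass and interface reachable from it. */
    static Set<Class<?>> allSupertypes(Class<?> start) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        Deque<Class<?>> todo = new ArrayDeque<>();
        todo.add(start);
        while (!todo.isEmpty()) {
            Class<?> current = todo.poll();
            if (!seen.add(current)) {
                continue; // already visited through another path
            }
            if (current.getSuperclass() != null) {
                todo.add(current.getSuperclass());
            }
            for (Class<?> iface : current.getInterfaces()) {
                todo.add(iface);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        for (Class<?> type : allSupertypes(ArrayList.class)) {
            System.out.println(type.getName());
        }
    }
}
```

The exact list varies slightly between JDK versions, but it is always well above a handful of entries, which helps explain how 200 slots can fill up quickly.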
