L2 cache invalidation traffic across multiple nodes can be costly, and can reduce the performance of all nodes. In this article, I describe a possible solution to reduce this invalidation traffic.
As Oracle says:
“A second-level cache is a local store of entity data managed by the persistence provider to improve application performance. A second-level cache helps improve performance by avoiding expensive database calls, keeping the entity data local to the application. A second-level cache is typically transparent to the application, as it is managed by the persistence provider and underlies the persistence context of an application. That is, the application reads and commits data through the normal entity manager operations without knowing about the cache.”
In other words, a second-level (L2) cache is a system that keeps recently used entities in memory, so the application can avoid going to the database when it needs them again. With this mechanism, the application can reduce response time, CPU usage, and other resources (but not memory :-P).
To maintain the L2 cache with the right information, the persistence provider can implement different strategies. Hazelcast, the tool we’ll use in this example, offers the following options:
As the documentation says, the local configuration offers better performance by avoiding syncing the data across all the nodes. The local L2 option only sends invalidation messages asking the other nodes to remove the modified (invalidated) data from their local copies. The next time those nodes need the entity, they must go to the database to read the fresh data.
In geo-redundant environments, the communication delay between nodes can be a problem that hurts the performance of the service. In applications with an L2 cache, the cache system must handle the invalidation messages, and those messages can accumulate in the buffer, waiting to be sent or waiting for a reception confirmation.
In addition, if the invalidation message is not received in time, the data read from the L2 cache on the other nodes could be stale.
As we can see in Image 1, invalidating an entity requires at least three invalidation messages to reach all nodes: one local message (same zone) and two for the remote nodes.
Depending on the network quality, this could take dozens of milliseconds. If the application works with a model of hundreds of different entities, each external request to the application could imply several invalidation messages.
So, how can we avoid or reduce this invalidation effect?
Disconnect the L2 cache between zones. If business logic allows separating the data by regions, we could have a system that works only with a section of the data, so invalidation between different zones is no longer needed.
Partially disconnect the L2 cache between zones. If business logic doesn’t allow us to fully separate the data, because some entities are shared between zones, then we could use a mechanism that invalidates only those shared entities across all zones. See Image 2.
For the explanation, let’s imagine we have a service that offers films on demand, deployed in two geographic zones, with replicated and synchronized databases. See Image 3.
In addition to personal data, the Users entity can also store other data, such as:
- History of watched films, including the last minute watched.
- The list of films saved by the user.
- A list of personal suggestions tailored by the system.
If we had several million users, we could have hundreds of accesses per second on average. If the Users entity is updated every time a user accesses the system, we will have hundreds of invalidation messages between all the nodes.
If we are able to redirect users from each geographic zone to the application running in that zone, each user will connect only to that zone. So why not apply just a local invalidation? We could avoid those hundreds of messages and only invalidate the common entities like Movies, Categories, etc., which will surely be updated a number of times throughout the day.
As I mentioned before, we are going to use Hazelcast and Hibernate to implement this.
For the two invalidation systems, we need to configure Hazelcast twice: one instance for the local invalidation and another for the global invalidation.
For the Local Invalidation, we can start following the indications given in the official documentation.
Briefly, it consists of activating the L2 cache and indicating the class that will help Hibernate handle it. For this, we have to set the following properties in our configuration:
* hibernate.cache.use_second_level_cache = true
* hibernate.cache.region.factory_class = com.hazelcast.hibernate.HazelcastLocalCacheRegionFactory
If we want to customize the Hazelcast configuration, we can edit the “hazelcast.xml” file, where we can indicate the IPs along with other parameters.
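For instance, a minimal “hazelcast.xml” for the per-zone cache cluster might look like the following sketch (the member IPs are placeholders for this example):

```xml
<hazelcast xmlns="http://www.hazelcast.com/schema/config">
  <network>
    <join>
      <!-- Discover the other nodes of the SAME zone by explicit IP -->
      <multicast enabled="false"/>
      <tcp-ip enabled="true">
        <member>10.0.1.10</member>
        <member>10.0.1.11</member>
      </tcp-ip>
    </join>
  </network>
</hazelcast>
```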
To use our customized Hazelcast’s config file, we must set the following property:
* hibernate.cache.provider_configuration_file_resource_path =
And finally, we will set the following property, so that Hazelcast knows that it must cache all the entities (with the tag @Cacheable):
* javax.persistence.sharedCache.mode = ALL
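Putting the properties above together, the relevant part of a “persistence.xml” could look like this sketch (the persistence-unit name is just an example):

```xml
<persistence-unit name="films-pu">
  <properties>
    <property name="hibernate.cache.use_second_level_cache" value="true"/>
    <property name="hibernate.cache.region.factory_class"
              value="com.hazelcast.hibernate.HazelcastLocalCacheRegionFactory"/>
    <property name="hibernate.cache.provider_configuration_file_resource_path"
              value="hazelcast.xml"/>
    <property name="javax.persistence.sharedCache.mode" value="ALL"/>
  </properties>
</persistence-unit>
```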
For the Global Invalidation, in addition to having the extra Hazelcast instance, we need to implement some functionality to:
- Detect deleted or modified entities.
- Send and receive invalidation messages.
- Evict from the L2 cache the invalidated entities indicated by the messages.
We need to monitor the entities that we want to invalidate on all the nodes. In our example, with Users, Movies, Categories and News, we must monitor all of them except Users, which is going to be invalidated only locally.
We will ignore the entity-creation-event because there is nothing to invalidate in this case.
To capture the invalidation events we are going to use the EventListenerRegistry from Hibernate, which we are going to obtain through the SessionFactory.
Here you can see an example of our base listener HibernateEventListener and how we can register it in the registry of Hibernate to receive the notifications.
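A minimal sketch of such a listener could look like this, assuming Hibernate 5 (the wiring details and the notifyInvalidation helper are assumptions; only update and delete events are registered, since creations need no invalidation):

```java
import org.hibernate.SessionFactory;
import org.hibernate.engine.spi.SessionFactoryImplementor;
import org.hibernate.event.service.spi.EventListenerRegistry;
import org.hibernate.event.spi.EventType;
import org.hibernate.event.spi.PostDeleteEvent;
import org.hibernate.event.spi.PostDeleteEventListener;
import org.hibernate.event.spi.PostUpdateEvent;
import org.hibernate.event.spi.PostUpdateEventListener;
import org.hibernate.persister.entity.EntityPersister;

public class HibernateEventListener
        implements PostUpdateEventListener, PostDeleteEventListener {

    public static void register(SessionFactory sessionFactory) {
        EventListenerRegistry registry =
                ((SessionFactoryImplementor) sessionFactory)
                        .getServiceRegistry()
                        .getService(EventListenerRegistry.class);
        HibernateEventListener listener = new HibernateEventListener();
        // Creation events are ignored: nothing to invalidate.
        registry.appendListeners(EventType.POST_UPDATE, listener);
        registry.appendListeners(EventType.POST_DELETE, listener);
    }

    @Override
    public void onPostUpdate(PostUpdateEvent event) {
        notifyInvalidation(event.getEntity().getClass(), event.getId());
    }

    @Override
    public void onPostDelete(PostDeleteEvent event) {
        notifyInvalidation(event.getEntity().getClass(), event.getId());
    }

    private void notifyInvalidation(Class<?> entityClass, Object id) {
        // Hypothetical hook: publish the invalidation message, skipping
        // entities (like Users) that are invalidated only locally.
    }

    @Override
    public boolean requiresPostCommitHanding(EntityPersister persister) {
        return true; // notify only after the transaction commits
    }
}
```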
We use the extra Hazelcast instance not for its cache functionality but for its Topic service for communications. We could use another library if we wanted to, but this Topic works well for our purpose.
Here is an example of how to use this Topic service to send and receive messages. First a message listener is added to print the received message, and then a message is sent using the publish method of the Topic service.
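A sketch of that flow, assuming Hazelcast 4.x (the topic name and the message content are placeholders):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.topic.ITopic;

public class TopicExample {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<String> topic = hz.getTopic("invalidation");

        // Receive: print every message published on this topic.
        topic.addMessageListener(message ->
                System.out.println("Received: " + message.getMessageObject()));

        // Send: every subscribed node (including this one) gets the message.
        topic.publish("hello");
    }
}
```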
In our case, we are going to register a listener for each entity that we want to invalidate globally, using the entity canonical name:
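That registration could be sketched as follows, where globalHazelcast (the extra instance) and InvalidationListener (which evicts the entity from the local L2 cache) are hypothetical names:

```java
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.topic.ITopic;
import org.hibernate.SessionFactory;
import java.util.Arrays;
import java.util.List;

public class GlobalInvalidationSetup {
    // One topic per globally-invalidated entity, named after the
    // entity's canonical class name; Users is deliberately left out.
    static void register(HazelcastInstance globalHazelcast,
                         SessionFactory sessionFactory) {
        List<Class<?>> globalEntities =
                Arrays.asList(Movie.class, Category.class, News.class);
        for (Class<?> entityClass : globalEntities) {
            ITopic<Object> topic =
                    globalHazelcast.getTopic(entityClass.getCanonicalName());
            topic.addMessageListener(
                    new InvalidationListener(sessionFactory, entityClass));
        }
    }
}
```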
To invalidate the entry from the L2 cache, when we receive the invalidation message in our listener, we will invoke the evict method of the L2 cache.
In the next example, entityClass is the class the listener is registered for.
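A possible listener, sketched under the assumption that the message carries the entity id; sessionFactory.getCache().evict(...) removes the entry from this node’s local L2 cache:

```java
import com.hazelcast.topic.Message;
import com.hazelcast.topic.MessageListener;
import org.hibernate.SessionFactory;

public class InvalidationListener implements MessageListener<Object> {
    private final SessionFactory sessionFactory;
    private final Class<?> entityClass; // the class this listener is registered for

    public InvalidationListener(SessionFactory sessionFactory,
                                Class<?> entityClass) {
        this.sessionFactory = sessionFactory;
        this.entityClass = entityClass;
    }

    @Override
    public void onMessage(Message<Object> message) {
        Object id = message.getMessageObject();
        // Remove the stale entry from the local L2 cache; the next read
        // will go to the database for fresh data.
        sessionFactory.getCache().evict(entityClass, id);
    }
}
```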
To have the extra instance of Hazelcast, we need a new custom configuration file with the IPs of the nodes of all zones. Here is an example of how to create this new instance:
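A sketch of creating that instance, assuming the new file is named “hazelcast-global.xml” (a hypothetical name) and sits on the classpath:

```java
import com.hazelcast.config.ClasspathXmlConfig;
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class GlobalInstanceFactory {
    static HazelcastInstance create() {
        // This file lists member IPs from ALL zones, so the instance forms
        // a cluster independent of the per-zone L2 cache cluster.
        Config config = new ClasspathXmlConfig("hazelcast-global.xml");
        config.setInstanceName("global-invalidation");
        return Hazelcast.newHazelcastInstance(config);
    }
}
```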
Now it’s time to see the effect on traffic.
- Two nodes connected directly.
- Initial data: 1 category, 100 films, and 1000 users.
- Load: one Users update request every 10 milliseconds and one Film update every second, for 30 seconds.
In the normal configuration, with all nodes connected, we have about 130 packets per second on average.
In the global invalidation configuration, with the L2 cache Hazelcast disconnected and the global-invalidation Hazelcast connected, we have barely any messages.
As expected, by avoiding unnecessary invalidation messages we can reduce the traffic considerably. The solution depends entirely on how the model is designed and how you work with it.
In environments where we have a large number of invalidation messages per second or high network latency, we can use solutions like this to reduce the network load and congestion, which increases the performance of the applications.
If you like this article or find it useful, please share it to help others!