Yes, you read that right.
To give you some context: some time ago, my organization's Redis usage was untracked, meaning we didn't know why Redis was using as much memory as it was. Our 2.5GB ElastiCache Redis instance was close to full, and if it reached its limit, parts of our system would start to fail. Although fallbacks were in place, Redis could have become a bottleneck.
In this post, I'll explain how we reduced the storage occupied by the data by more than 50%. It is also a step-by-step guide that starts from the basics, so if you're only interested in the optimization itself, skip ahead to the Optimization section.
Basic Setup
I'll be using the latest version of Spring Boot from https://start.spring.io. First, select our two main dependencies: Spring Boot Web and Spring Data Reactive Redis.
You'll find these in the pom.xml file when you download the starter project.
Spring Boot Web is for building basic web applications with Spring Boot, whereas Spring Data Reactive Redis is used for connecting to and using Redis inside the application. By default, the Redis dependency uses the Lettuce Redis client, which the latest versions of Spring Boot support.
Note that I'm going to skip the installation of Redis, as there are other guides available for every operating system. You do need the Redis Server to be started for our application to work successfully.
After downloading the basic application, you'll need to extract and open it in your favourite IDE (my favourite one is IntelliJ IDEA).
In my case the project name is redis-util, and my base package is named com.darshitpp.redis.redisutil. This base package has a class called RedisUtilApplication, which in my case has the following configuration.
@SpringBootApplication
@ComponentScan(basePackages = {"com.darshitpp.redis.redisutil"})
public class RedisUtilApplication {
    public static void main(String[] args) {
        SpringApplication.run(RedisUtilApplication.class, args);
    }
}
I have manually added the @ComponentScan annotation to specify the top-level package under which Spring should look for defined Beans/Configurations.
To connect to Redis, I create a configuration class called LettuceRedisConfiguration, under a new package named configuration (note that this should be under the basePackages path defined above).
You could define the configuration in the RedisUtilApplication class itself, but I want this to be as "production-ready" as possible, so it's good practice to separate the different parts of your application.
My configuration class is:
@Configuration
public class LettuceRedisConfiguration {
    @Bean
    public LettuceConnectionFactory redisConnectionFactory() {
        return new LettuceConnectionFactory(new RedisStandaloneConfiguration("localhost", 6379));
    }
}
It is a very simple class that configures which host and port to connect to for Redis. In my case, the host is localhost, but in most production apps, it would be an external Redis server. Port 6379 is the default port on which the Redis server starts. This Bean returns a "factory" of Redis connections. Think of it as something that lets you connect to Redis when required.
At this point, my package structure looks like:
->src
  ->main
    ->java
      ->com.darshitpp.redis.redisutil
        ->configuration
Now that we know how to connect to a Redis server, we need to figure out what data we need to store in Redis. In our case, we will be storing User data. This is the "domain model" of our application (a domain model could be translated to a table in a database, but we don't have a table in our scenario). This User is stored in a package called domain.
The User has three fields, namely firstName, lastName, and birthday.
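The article does not show the User class itself, so here is a minimal sketch of what it could look like. The field types are assumptions: birthday is modelled as java.util.Date because the JSON stored in Redis later in this post holds an epoch-millisecond number. The class is package-private here only so that the snippet compiles as a single file; in the project it would be a public class in the domain package.

```java
import java.util.Date;

// Hypothetical sketch of the domain model; the field types are assumptions.
class User {
    private String firstName;
    private String lastName;
    private Date birthday;

    // No-argument constructor so that Jackson can instantiate the class.
    public User() {
    }

    public User(String firstName, String lastName, Date birthday) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.birthday = birthday;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public Date getBirthday() {
        return birthday;
    }

    public void setBirthday(Date birthday) {
        this.birthday = birthday;
    }
}
```

The getters and setters matter more than they look: Jackson uses them to discover the three properties when it converts a User to and from its stored form.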
Before storing the objects in Redis, it is a good idea to decide how you will store the data so that it's efficient to fetch it back. Because Redis is a simple key-value store, you need to identify the key you will store each value under. In our case, I am choosing firstName as the key. The data is stored in a hash, so the hashKey we select is the lastName, and the value mapped to the hashKey is the User object.
This is because Hashes in Redis have the following structure:
key1 --- hashKey1 === value1
     --- hashKey2 === value2
     --- hashKey3 === value3
key2 --- hashKey4 === value4
     --- hashKey5 === value5
.
.
.
You could also imagine it as a tree with the top-level nodes being the keys, the next level being the hashKeys, and the leaf nodes being the values. To access value2, you need both key1 and hashKey2.
Our example is slightly flawed: a User could have the same key (firstName) and hashKey (lastName) as another user, in which case Redis would overwrite the stored value. For brevity, however, we will assume that every User of our application is unique.
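To make the key / hashKey / value layout concrete, here is a small in-memory model of it using plain Java maps. It is only an illustration of the data structure, not how Redis is implemented; the class name RedisHashModel and the placeholder values are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: key -> (hashKey -> value), mirroring the layout above.
class RedisHashModel {
    private final Map<String, Map<String, String>> store = new HashMap<>();

    // Roughly what "HSET key hashKey value" does.
    void put(String key, String hashKey, String value) {
        store.computeIfAbsent(key, k -> new HashMap<>()).put(hashKey, value);
    }

    // Roughly what "HGET key hashKey" does; returns null when nothing is stored.
    String get(String key, String hashKey) {
        Map<String, String> hash = store.get(key);
        return hash == null ? null : hash.get(hashKey);
    }

    public static void main(String[] args) {
        RedisHashModel redis = new RedisHashModel();
        redis.put("Priscilla", "Haymes", "user-1");
        redis.put("Priscilla", "Smith", "user-2");  // same key, different hashKey: both are kept
        redis.put("Priscilla", "Haymes", "user-3"); // same key and hashKey: overwrites user-1
        System.out.println(redis.get("Priscilla", "Haymes")); // prints user-3
        System.out.println(redis.get("Priscilla", "Smith"));  // prints user-2
    }
}
```

The last put shows the overwrite described above: two users sharing both firstName and lastName would collide, which is why a real application would pick a unique identifier as the key or hashKey.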
We will now create a controller class called NormalController, which acts as the entry point for our API. We have named it NormalController for reasons that will become clear later in this article.
@RestController
@RequestMapping("/normal")
public class NormalController {
    private final NormalService normalService;

    @Autowired
    public NormalController(NormalService normalService) {
        this.normalService = normalService;
    }

    @GetMapping("/get")
    public User get(@RequestParam("firstName") String firstName, @RequestParam("lastName") String lastName) {
        return normalService.get(firstName, lastName);
    }

    @PostMapping("/insert")
    public void insert(@RequestBody User user) {
        normalService.put(user);
    }

    @PostMapping("/delete")
    public void delete(@RequestParam("firstName") String firstName) {
        normalService.delete(firstName);
    }
}
NormalController also has a service named NormalService, which is autowired.
The class should be defined in a new package named controller, after which the package structure looks like:
->src
  ->main
    ->java
      ->com.darshitpp.redis.redisutil
        ->configuration
        ->domain
        ->controller
Our basic operations are simple CRUD-like operations, which NormalService implements using a custom Operations interface.
public interface Operations {
    User get(String firstName, String lastName);
    void put(User user);
    void delete(String firstName);
}
To use Lettuce in our application, we need to do a couple more things. Just as a JdbcTemplate is provided for accessing JDBC, you use a RedisTemplate to operate on Redis. We must also define the format in which the data will be stored in Redis. By default, RedisTemplate uses Java (JDK) serialization. However, since you'll be storing User objects in Redis, you need a way to convert them into the stored representation and back into the appropriate type when you fetch them.
Think of this as talking with someone who doesn't know the same language as you do. If you want to communicate with someone who only speaks Spanish, you would need to find a translator who would convert English into Spanish for you. This process of conversion and recovery is known as Serialization and Deserialization.
English to Spanish = Serialization
Spanish to English = Deserialization
Thus, we need a translator, or a Serializer, in our case too. We will be using Jackson for this process. Jackson is a nifty library that Spring Boot supports out of the box for handling JSON.
We need to create a Serializer that implements RedisSerializer for our purposes. In our case, I have created a class JsonRedisSerializer inside a new package called serializer.
class JsonRedisSerializer<T> implements RedisSerializer<T> {
    public static final Charset DEFAULT_CHARSET;
    private final JavaType javaType;
    private ObjectMapper objectMapper = new ObjectMapper()
            .registerModules(new Jdk8Module(), new JavaTimeModule(), new ParameterNamesModule(JsonCreator.Mode.PROPERTIES))
            .configure(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS, true)
            .configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false)
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
            .setSerializationInclusion(JsonInclude.Include.NON_NULL);

    public JsonRedisSerializer(Class<T> type) {
        this.javaType = JavaTypeHandler.getJavaType(type);
    }

    @Override
    public T deserialize(@Nullable byte[] bytes) throws SerializationException {
        if (bytes == null || bytes.length == 0) {
            return null;
        } else {
            try {
                return this.objectMapper.readValue(bytes, 0, bytes.length, this.javaType);
            } catch (Exception ex) {
                throw new SerializationException("Could not read JSON: " + ex.getMessage(), ex);
            }
        }
    }

    @Override
    public byte[] serialize(@Nullable Object value) throws SerializationException {
        if (value == null) {
            return new byte[0];
        } else {
            try {
                return this.objectMapper.writeValueAsBytes(value);
            } catch (Exception ex) {
                throw new SerializationException("Could not write JSON: " + ex.getMessage(), ex);
            }
        }
    }

    static {
        DEFAULT_CHARSET = StandardCharsets.UTF_8;
    }
}
As you can see, it has two methods called serialize and deserialize. Each of these methods uses Jackson's ObjectMapper for conversion.
There is also a class named JavaTypeHandler, which helps you get the type of the object you're trying to serialize.
final class JavaTypeHandler {
    static <T> JavaType getJavaType(Class<T> clazz) {
        return TypeFactory.defaultInstance().constructType(clazz);
    }
}
Consequently, we also need a class that returns a RedisTemplate which uses this serializer. I will name this class RedisSerializationBuilder.
public final class RedisSerializationBuilder {
    public static <T> RedisTemplate<String, T> getNormalRedisTemplate(final LettuceConnectionFactory factory, final Class<T> clazz) {
        JsonRedisSerializer<T> jsonRedisSerializer = new JsonRedisSerializer<>(clazz);
        RedisTemplate<String, T> redisTemplate = new RedisTemplate<>();
        redisTemplate.setConnectionFactory(factory);
        redisTemplate.setDefaultSerializer(RedisSerializer.json());
        redisTemplate.setKeySerializer(RedisSerializer.string());
        redisTemplate.setValueSerializer(RedisSerializer.string());
        redisTemplate.setHashKeySerializer(RedisSerializer.string());
        redisTemplate.setHashValueSerializer(jsonRedisSerializer);
        redisTemplate.afterPropertiesSet();
        return redisTemplate;
    }
}
Notice that the above method uses generics to return a template specific to a particular domain model (in our case, the User). It also specifies which connection factory to use and which serializers to use for the key, value, hashKey, and hashValue.
Consequently, the NormalService looks like:
@Service
public class NormalService implements Operations {
    private final RedisTemplate<String, User> redisTemplate;
    private final HashOperations<String, String, User> hashOperations;

    public NormalService(LettuceConnectionFactory redisConnectionFactory) {
        this.redisTemplate = RedisSerializationBuilder.getNormalRedisTemplate(redisConnectionFactory, User.class);
        this.hashOperations = this.redisTemplate.opsForHash();
    }

    @Override
    public User get(String firstName, String lastName) {
        return hashOperations.get(firstName, lastName);
    }

    @Override
    public void put(User user) {
        hashOperations.put(user.getFirstName(), user.getLastName(), user);
    }

    @Override
    public void delete(String firstName) {
        // Remove the whole hash stored under this key.
        // HashOperations.delete(key, hashKeys...) needs at least one hashKey, so it cannot be used here.
        redisTemplate.delete(firstName);
    }
}
I then inserted a User, using the POST method and the URL localhost:8080/normal/insert
Request Body:
{
  "firstName": "Priscilla",
  "lastName": "Haymes",
  "birthday": "2020-04-12T11:15:00Z"
}
If I then run this application for 100 Users, I find the following stats for the memory usage in Redis (I used the memory stats command in redis-cli):
21) "keys.count"
22) (integer) 100
23) "keys.bytes-per-key"
24) (integer) 1044
25) "dataset.bytes"
26) (integer) 32840
Using the hgetall command for a key gives me:
127.0.0.1:6379>hgetall "Priscilla"
1) "Haymes"
2) "{\"firstName\":\"Priscilla\",\"lastName\":\"Haymes\",\"birthday\":1586690100000}"
Notice that 2) gives us the actual type of data stored in Redis: JSON!
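You may have noticed that the request body carried the birthday as an ISO-8601 string, while the stored JSON holds the number 1586690100000. That is the effect of WRITE_DATES_AS_TIMESTAMPS being enabled in the serializer. The following sketch only checks that the two representations describe the same instant; the class and method names are made up for illustration.

```java
import java.time.Instant;

// Illustrative check that the ISO string and the stored number are the same instant.
class BirthdayTimestampDemo {
    // ISO-8601 instant (as sent in the request body) to epoch milliseconds (as stored).
    static long toEpochMillis(String isoInstant) {
        return Instant.parse(isoInstant).toEpochMilli();
    }

    // Epoch milliseconds (as stored) back to an ISO-8601 instant.
    static String toIso(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis("2020-04-12T11:15:00Z")); // prints 1586690100000
        System.out.println(toIso(1586690100000L));                 // prints 2020-04-12T11:15:00Z
    }
}
```

Storing the date as a number rather than a formatted string is already a small saving per value, which becomes more visible once the format is binary.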
Our basic structure for further optimizations is in place! Yay!
Optimization
MessagePack is here to the rescue! As I said, you need a "translation" mechanism. What if the translator is an expert and converts your English into Spanish in as few words as possible? MessagePack does just that!
You need to add two more dependencies to your pom.xml file.
<dependency>
    <groupId>org.msgpack</groupId>
    <artifactId>msgpack-core</artifactId>
    <version>0.8.20</version>
</dependency>
<dependency>
    <groupId>org.msgpack</groupId>
    <artifactId>jackson-dataformat-msgpack</artifactId>
    <version>0.8.20</version>
</dependency>
We create a controller called MsgPackController and a service called MsgPackService, both very similar to NormalController and NormalService. We then create a MsgPackRedisSerializer to serialize using MessagePack.
class MsgPackRedisSerializer<T> implements RedisSerializer<T> {
    public static final Charset DEFAULT_CHARSET;
    private final JavaType javaType;
    private ObjectMapper objectMapper = new ObjectMapper(new MessagePackFactory())
            .registerModules(new Jdk8Module(), new JavaTimeModule())
            .configure(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS, true)
            .configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false)
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
            .setSerializationInclusion(JsonInclude.Include.NON_NULL);

    public MsgPackRedisSerializer(Class<T> type) {
        this.javaType = JavaTypeHandler.getJavaType(type);
    }

    @Override
    public T deserialize(@Nullable byte[] bytes) throws SerializationException {
        if (bytes == null || bytes.length == 0) {
            return null;
        } else {
            try {
                return this.objectMapper.readValue(bytes, 0, bytes.length, this.javaType);
            } catch (Exception ex) {
                throw new SerializationException("Could not read MsgPack JSON: " + ex.getMessage(), ex);
            }
        }
    }

    @Override
    public byte[] serialize(@Nullable Object value) throws SerializationException {
        if (value == null) {
            return new byte[0];
        } else {
            try {
                return this.objectMapper.writeValueAsBytes(value);
            } catch (Exception ex) {
                throw new SerializationException("Could not write MsgPack JSON: " + ex.getMessage(), ex);
            }
        }
    }

    static {
        DEFAULT_CHARSET = StandardCharsets.UTF_8;
    }
}
The only noticeable change is that an instance of MessagePackFactory is passed into the ObjectMapper. It makes Jackson read and write MessagePack's binary format instead of JSON text, acting as a bridge between the binary data stored in Redis and the objects in our Spring Boot application.
Testing our changes (after clearing the previously used storage from Redis) gives us the following:
127.0.0.1:6379> hgetall "Priscilla"
1) "Haymes"
2) "\x83\xa9firstName\xa9Priscilla\xa8lastName\xa6Haymes\xa8birthday\xcf\x00\x00\x01qn\x19\x8b "
127.0.0.1:6379> memory stats
.
.
.
21) "keys.count"
22) (integer) 100
23) "keys.bytes-per-key"
24) (integer) 876
25) "dataset.bytes"
26) (integer) 15976
Compare the dataset.bytes value from the current memory stats to the previously recorded one: 15976 bytes vs 32840 bytes, a reduction of just over 50% already!
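To see where the savings come from for a single value, the sketch below hand-encodes the same User as MessagePack and compares its size with the JSON text. It is not how jackson-dataformat-msgpack is implemented; it is a minimal illustration that only supports the three MessagePack types visible in the hgetall output above (fixmap, fixstr, and uint 64), and all class and method names are made up.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative hand-rolled encoder for one User; not a general MessagePack implementation.
class MsgPackSizeDemo {
    // fixstr: one header byte (0xa0 | length) followed by the UTF-8 bytes, up to 31 bytes.
    static void writeFixStr(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 31) {
            throw new IllegalArgumentException("fixstr holds at most 31 bytes");
        }
        out.write(0xa0 | utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // uint 64: header byte 0xcf followed by 8 big-endian bytes.
    static void writeUInt64(ByteArrayOutputStream out, long value) {
        out.write(0xcf);
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
        out.write(bytes, 0, bytes.length);
    }

    static byte[] encodeUser(String firstName, String lastName, long birthdayMillis) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x80 | 3); // fixmap header: a map with 3 entries
        writeFixStr(out, "firstName");
        writeFixStr(out, firstName);
        writeFixStr(out, "lastName");
        writeFixStr(out, lastName);
        writeFixStr(out, "birthday");
        writeUInt64(out, birthdayMillis);
        return out.toByteArray();
    }

    static byte[] jsonUser(String firstName, String lastName, long birthdayMillis) {
        String json = "{\"firstName\":\"" + firstName + "\",\"lastName\":\"" + lastName
                + "\",\"birthday\":" + birthdayMillis + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(jsonUser("Priscilla", "Haymes", 1586690100000L).length);   // prints 70
        System.out.println(encodeUser("Priscilla", "Haymes", 1586690100000L).length); // prints 55
    }
}
```

The quotes, colons, commas, and braces of JSON are replaced by one-byte length headers, and the 13-digit timestamp shrinks to 9 bytes. Note that the field names are still stored in full in every value, which is why shortening or restructuring the data can save even more.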
But wait, we can reduce it further. How, you ask? Compression! What if we compress the data and then store it? In our case it would work! This time, Snappy to the rescue!
Your first question after this would be: compression and decompression take time. Wouldn't that be detrimental in production? Snappy has the answer to this too.
It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression.
Using Snappy is as simple as adding the dependency to pom.xml and changing a couple of lines of code. Just call Snappy.compress during serialization and Snappy.uncompress during deserialization.
<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.1.7.3</version>
</dependency>
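The post does not show the modified serializer, so here is a rough sketch of the pattern: compress the bytes just before they are returned from serialize, and decompress them just before they are handed to the ObjectMapper in deserialize. To keep the snippet runnable without extra dependencies, it uses the JDK's Deflater and Inflater as a stand-in for Snappy; with snappy-java on the classpath, the two method bodies would instead call Snappy.compress(raw) and Snappy.uncompress(compressed). The class name CompressingCodec is made up.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Stand-in sketch: JDK Deflater/Inflater used in place of Snappy so it runs as-is.
class CompressingCodec {
    // Would wrap the result of objectMapper.writeValueAsBytes(value) in serialize().
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int written = deflater.deflate(buffer);
            out.write(buffer, 0, written);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Would run on the bytes read from Redis before objectMapper.readValue(...) in deserialize().
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int read = inflater.inflate(buffer);
                if (read == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Avoid looping forever on truncated or corrupt input.
                    throw new IllegalStateException("Truncated or invalid compressed data");
                }
                out.write(buffer, 0, read);
            }
        } catch (DataFormatException ex) {
            throw new IllegalStateException("Could not decompress data", ex);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

In the serializer itself, the existing null and empty-array checks would stay in front of these calls, so that an empty value is never passed to the decompressor. Keep in mind that very small values can come out slightly larger after compression, whichever library you use.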
Testing it again with the same inputs returns the following
127.0.0.1:6379> hgetall "Priscilla"
1) "Haymes"
2) "7\\\x83\xa9firstName\xa9Priscilla\xa8la\t\x13`\xa6Haymes\xa8birthday\xcf\x00\x00\x01qn\x19\x8b "
127.0.0.1:6379> memory stats
.
.
.
21) "keys.count"
22) (integer) 100
23) "keys.bytes-per-key"
24) (integer) 873
25) "dataset.bytes"
26) (integer) 15720
You can see that the size of the data set is smaller, 15720 bytes vs 15976 bytes, a marginal difference, but with larger amounts of data, this difference increases.
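For reference, the reductions implied by the three dataset.bytes readings in this post can be worked out with a few lines of arithmetic. The helper below is only illustrative; the numbers are the ones reported by memory stats above.

```java
import java.util.Locale;

// Illustrative arithmetic on the dataset.bytes values reported above.
class ReductionDemo {
    // Percentage by which "after" is smaller than "before".
    static double percentReduction(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long json = 32840;          // plain JSON
        long msgPack = 15976;       // MessagePack
        long msgPackSnappy = 15720; // MessagePack + Snappy

        System.out.println(String.format(Locale.ROOT, "%.1f", percentReduction(json, msgPack)));          // prints 51.4
        System.out.println(String.format(Locale.ROOT, "%.1f", percentReduction(msgPack, msgPackSnappy))); // prints 1.6
        System.out.println(String.format(Locale.ROOT, "%.1f", percentReduction(json, msgPackSnappy)));    // prints 52.1
    }
}
```

For these tiny, nearly unique values, almost all of the gain comes from MessagePack; compression has little repeated content to work with inside a single 55-byte value.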
In my case, by cleaning and restructuring the data and using the techniques above, we brought memory usage down from 2GB to less than 500MB.
The full code can be found on my Github for redis-util.
Special mention to Rahul Chopda (@_RahulChopda) for his guidance! You have been the best mentor anyone could ask for!
Top comments (2)
Nicely done! I appreciate all the code samples. The storage savings look very real. But I'm very curious, what kind of performance trade-offs did you see?
The only tradeoff is slightly higher CPU usage on the application nodes, caused by compression and decompression. You could choose another type of compression depending on your requirements.