Data structure in Mineral

Introduction

Data structures are at the heart of every IT project, playing a decisive role from the earliest design stages.

They define the way information is organised, stored and manipulated, directly influencing an application's performance, maintainability and scalability.

A well thought-out data structure makes it possible to solve complex problems efficiently, optimising resources and reducing execution times.
Conversely, an inappropriate choice can lead to major limitations and additional costs in the long term.

So it's essential to think carefully about data structures from the outset of a project, to lay the foundations for a robust, scalable solution.


Context

From the start, we wanted Mineral to offer data models ready for immediate use by the end developer.

So, following the approach of the Discord.js library, we chose to make all the data available for a Discord server accessible from any model.

This choice implies that each time we receive an event from the Discord websocket client, we have to serialise the data concerned by the event, as well as everything reachable from it, at any level of depth.

As each structure had to be traceable back to its parent (and vice versa), we quickly found ourselves serialising a huge amount of data for a single, simple structure.

An example will make this more concrete.


Issues

Let's say we receive the MessageCreate event from Discord's websocket client; we write our handler like this.

client.events.server.messageCreate((ServerMessage message) {
  // Our code
});

Serialisation is costly

Following this premise, we need to allow access to our Discord server from our message model; to do this, we chain something like message.channel.server.

Behind the simplicity of this chain, there are a number of implications:

  1. The message must contain a complete structure of the Channel into which it was sent.
  2. The Channel must be able to make accessible the complete structure of the Server in which it evolves.
  3. The Server model embeds a whole set of properties, including its own channels, members and roles... each of which must, in turn, be able to trace back to the server itself...

And so on...
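
To make the problem tangible, here is a rough sketch of the circular nesting described above; the class shapes are hypothetical illustrations, not Mineral's actual models. Every model carries its parent and the parent carries every child, so serialising one message means serialising the whole graph.

class Server {
  final String id;
  final List<Channel> channels; // each channel points back to this server
  final List<Member> members;   // each member points back as well

  Server(this.id, this.channels, this.members);
}

class Channel {
  final String id;
  final Server server; // back-reference to the parent server

  Channel(this.id, this.server);
}

class Member {
  final String id;
  final Server server; // back-reference to the parent server

  Member(this.id, this.server);
}

class ServerMessage {
  final String id;
  final Channel channel; // message -> channel -> server -> channels -> ...
  final Member author;

  ServerMessage(this.id, this.channel, this.author);
}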

All these structures, nested in a multitude of levels, generate enormous computational complexity and have a significant impact on the cost of serialising each data structure.

Our main concern is the serialisation cost that even the smallest data structure has to bear.

The computational complexity keeps growing, again and again.

It is important to note that the full, complex data structure is transmitted to the client even when the handler does not require any level of depth.

client.events.server.messageCreate((ServerMessage message) async {
  await message.reply(content: 'Hello World');
});

In this case, we only need the message instance and never use a message.channel.server chain; yet the server is still serialised and supplied to the client.

Cache dependency

Our philosophy is not to impose anything on developers.
One of our promises is not to impose the use of any caching solution.

Behind this promise, we want the framework to stand as a stable and viable solution for any project.

In this section, we will use the Discord.js library as a reference for comparison.

This library imposes the use of an in-memory caching solution, which lets it offer complex data structures such as those described above and gives developers access to almost any property or action available on the Discord server.

Imposing a caching solution actually hides a deeper problem: it leads to excessive memory consumption in the application.
Conversely, without an in-memory cache, Discord.js simply cannot deliver these complex data structures.

It is possible to envisage a system that introduces a notion of lifespan or life cycle of the data, but it is up to the developer to work this out.
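
As an illustration only, such a lifespan-based cache could look like the sketch below; the CacheEntry and TtlCache names are hypothetical and not part of Mineral.

class CacheEntry<T> {
  final T value;
  final DateTime expiresAt;

  CacheEntry(this.value, this.expiresAt);
}

class TtlCache<T> {
  final Duration ttl;
  final Map<String, CacheEntry<T>> _entries = {};

  TtlCache(this.ttl);

  // Store a value with an expiry date computed from the configured lifespan.
  void put(String key, T value) {
    _entries[key] = CacheEntry(value, DateTime.now().add(ttl));
  }

  // Return the value if it is still alive, otherwise evict it and return null.
  T? get(String key) {
    final entry = _entries[key];
    if (entry == null) return null;

    if (DateTime.now().isAfter(entry.expiresAt)) {
      _entries.remove(key);
      return null;
    }

    return entry.value;
  }
}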


Solution adopted

In order to overcome all the problems explained above, we decided to drastically reduce the size of our data structures to make them as minimalist and atomic as possible.

This choice has enabled us to:

  • reduce the dependency of each data structure on the others
  • reduce the size of our data structures
  • reduce serialisation time
  • reduce the overall complexity of using a resource
  • eliminate potential errors linked to the absence of properties required for complete serialisation of the complex data structure
  • eliminate dependency on a caching solution
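
To give an idea of the result, here is a rough sketch of what an atomic structure looks like after this change; the field names are illustrative, not Mineral's exact API. Instead of embedding the full Channel and Server graphs, the message only keeps the identifiers needed to fetch them later.

class ServerMessage {
  final String id;
  final String channelId; // identifier only, no nested Channel structure
  final String serverId;  // identifier only, no nested Server structure
  final String authorId;  // identifier only, no nested Member structure
  final String content;

  ServerMessage({
    required this.id,
    required this.channelId,
    required this.serverId,
    required this.authorId,
    required this.content,
  });
}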

Impact on DX

It is important to note that these profound changes will result in a modification of the development experience for the end user.

As each delivered structure is smaller, the developer will have to explicitly request the retrieval of certain resources required for their business context.

An example of this is the retrieval of a server from a Channel using a newly introduced function resolveServer().

We chose the term resolve for the specific case where our data model already contains every piece of information needed to construct an HTTP request (usually to retrieve a resource), without asking the developer for any additional parameters.
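
As a sketch of the idea only (the DataStore interface and fetchServer method below are assumptions made for the example, not Mineral's actual implementation), a resolve method simply turns an identifier the model already holds into the full resource:

class Server {
  final String id;
  final String name;

  Server(this.id, this.name);
}

abstract class DataStore {
  Future<Server> fetchServer(String serverId);
}

class Channel {
  final String id;
  final String serverId;
  final DataStore _dataStore;

  Channel(this.id, this.serverId, this._dataStore);

  // The channel only stores serverId; resolveServer turns it into a full
  // Server by delegating the HTTP request to the datastore.
  Future<Server> resolveServer() => _dataStore.fetchServer(serverId);
}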

We can now see a new way of accessing our data.

client.events.server.messageCreate((ServerMessage message) async {
  if (message.authorIsBot) {
    return;
  }

  final (channel, author) = await (
    message.resolveChannel(),
    message.resolveMember()
  ).wait;

  final str = 'Hello from ${author.username} in ${channel.name}';
  await message.reply(content: str);
});

Caching

We have already stated our desire to let the end developer dispense with any caching solution, but make no mistake: caching remains extremely useful and important in your applications.

Using one will drastically reduce the execution time of your code and the number of requests made to the Discord API, thereby lowering the risk of being rate-limited.

The current procedure is as follows.

Data flow

There are two possible cases:

Without cache

When no caching solution is used, the Datastore will make a direct HTTP request to the Discord API to obtain the result, then serialise it and send it back to the consumer.

With cache

In this case, there are two possible scenarios.

The cache has the data
The Datastore contacts the caching solution, retrieves the result, serialises it and sends it to the consumer.

The cache does not have the data
In this case, the Datastore makes an HTTP request to the Discord API to retrieve the result. The result is then pushed into the cache, serialised and sent back to the consumer.
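
Put together, the flow can be sketched as follows; the CacheProvider interface and the exact endpoint details are assumptions made for this example, not Mineral's real interfaces.

import 'dart:convert';

import 'package:http/http.dart' as http;

abstract class CacheProvider {
  Future<Map<String, dynamic>?> get(String key);
  Future<void> put(String key, Map<String, dynamic> value);
}

class Datastore {
  final CacheProvider? cache; // optional: the framework never imposes one
  final String token;

  Datastore({this.cache, required this.token});

  Future<Map<String, dynamic>> fetchChannel(String channelId) async {
    // With a cache configured: try it first and return immediately on a hit.
    final cached = await cache?.get('channels/$channelId');
    if (cached != null) {
      return cached;
    }

    // Without a cache (or on a miss): go straight to the Discord API.
    final response = await http.get(
      Uri.parse('https://discord.com/api/v10/channels/$channelId'),
      headers: {'Authorization': 'Bot $token'},
    );
    final payload = jsonDecode(response.body) as Map<String, dynamic>;

    // On a miss with a cache configured, push the result before returning it.
    await cache?.put('channels/$channelId', payload);
    return payload;
  }
}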

Credits
We would like to thank Lexedia and Abitofevrything from the Nyxx team for the discussions and advice that led to this result.

See more in the documentation
