Andrej Szalma


The mysteries of GraphQL clients' cache - The Showdown

Recently I completed an internship at Microsoft, where I had the privilege of working with people who are experts in their field, more specifically in GraphQL. I had numerous opportunities to learn from them, and now I would like to teach you something. Let me take you on a journey through the world of GraphQL clients and their caches.

This is a two-part series, where I'll talk about GraphQL client caches and compare the clients' cache performance.

Part 1: The mysteries of GraphQL clients' cache - The Introduction
Part 2: The mysteries of GraphQL clients' cache - The Showdown


As promised in the previous part (Part 1), I would like to show you a head-to-head comparison of the clients' cache performance, using a benchmark that I have been working on.

The Benchmark

So what is this benchmark I've been talking about? Simply put, it is a tool that compares the latency performance of different GraphQL clients using your example queries. Thanks to this benchmark, anyone can get performance data that can be used to make well-informed, data-driven decisions when choosing a GraphQL client for their project. Not only that, it can also be used for testing experimental cache implementations and finding out their strengths and weaknesses, so you know what needs to be optimised.
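To make "latency performance" concrete: a benchmark like this essentially times individual cache operations and aggregates the samples. Below is a minimal sketch of that idea; the names (`measure`, the operation callback) are purely illustrative and not the tool's actual API.

```typescript
// A hypothetical latency measurement helper, not the benchmark's real code.
// It times a synchronous cache operation many times and reports percentiles.
function measure(label: string, op: () => void, iterations = 1000): void {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    op(); // e.g. a cache read or a cache write
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const p50 = samples[Math.floor(samples.length * 0.5)];
  const p99 = samples[Math.floor(samples.length * 0.99)];
  console.log(`${label}: p50=${p50.toFixed(3)}ms, p99=${p99.toFixed(3)}ms`);
}
```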

[Image: the GraphQL Client Cache Benchmark interface]

If you're interested, feel free to check it out for yourself - GraphQL Client Cache Benchmark.

This tool was originally developed by Convoy. We have upgraded and expanded it to fit our purpose.

Methods

In this part of the post, I would like to reach deep into my pocket and find the academic part of me, to give you an overview of the experiment setup and testing methods, so you know what we are testing and why.

Experiment setup

To make it possible to reproduce our results, I'd like to share the conditions under which the experiment was run. However, regardless of the conditions, the benchmark results should show the same trends and patterns, only with different numbers.

  • OS: macOS Monterey 12.5
  • CPU: 2.4 GHz 8-Core Intel Core i9
  • GPU: Intel UHD Graphics 630 1536 MB
  • RAM: 32 GB 2667 MHz DDR4
  • Browser: Google Chrome - Version 103.0.5060.134 (Official Build) (x86_64)

Clients

The most important part of this whole project is the GraphQL clients themselves, and even though you might already know which clients we are going to compare, there is one more thing I need to explain before we continue. As I mentioned before, we have decided to compare two clients - Apollo Client and Relay. However, we have also decided that it would be beneficial to see the effect on Apollo's performance if we disabled the ResultCache, therefore you will see Apollo Client twice in the benchmark. This proved to be a great idea: as you will see later, depending on your specific scenario, it might be worth thinking about disabling it after all.
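If you want to reproduce the "ResultCache disabled" variant yourself, Apollo Client's InMemoryCache exposes a `resultCaching` option for exactly this. A minimal sketch, with a placeholder endpoint:

```typescript
import { ApolloClient, InMemoryCache } from "@apollo/client";

// Apollo Client with memoized result caching turned off via `resultCaching`.
// The endpoint below is a placeholder; only the cache option matters here.
const apolloWithoutResultCache = new ApolloClient({
  uri: "https://example.com/graphql",
  cache: new InMemoryCache({ resultCaching: false }),
});
```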

Example queries

As I have mentioned before, this benchmark uses arbitrary example queries and evaluates them with different clients throughout the whole test suite. These queries can include fragments as well, and they can represent your own real-life needs. Herein lies the beauty of this benchmark.

We have included a couple of example queries that you can use to play around with the benchmark, however, we expect you to add your own.
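For illustration, an example query with a fragment could look something like the one below; the schema and field names are made up and not taken from the benchmark's shipped examples.

```typescript
import { gql } from "@apollo/client";

// A made-up example query with a fragment; the schema is purely illustrative.
const CHAT_LIST_QUERY = gql`
  query ChatList {
    chats {
      id
      title
      members {
        ...MemberFields
      }
    }
  }

  fragment MemberFields on Member {
    id
    displayName
  }
`;
```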

Disclaimer: There is a built-in query editor in the tool, however, it had to be disabled since the newest Relay version needs pre-compiled GraphQL artefacts. This compilation is performed by a compiler written in Rust, and it does not have a JS API that could be used to compile artefacts at runtime. That being said, you will have to get your hands a little dirty and add your examples to the code manually. But don't worry, the documentation explains it very well.
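For reference, the Rust-based compiler is typically driven by a relay.config.js and run as a separate step; the paths below are assumptions you would adapt to your own setup.

```js
// relay.config.js - an assumed minimal configuration; adjust the paths
// to wherever your queries and schema actually live.
module.exports = {
  src: "./src",
  schema: "./schema.graphql",
  language: "typescript",
};
```

Running `npx relay-compiler` then generates the artefacts ahead of time.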

Data access patterns

In our benchmark, we have been testing three data access patterns - Read, Write and Update. In the following sections, I will provide you with tables that contain detailed descriptions of all the test cases, together with some example scenarios these might represent in real life.

Read

As you can see in the table below, we have tested reading from an empty/full cache and reading using identical/same-shape queries. The latter is especially important to demonstrate the clients' reliance on AST objects vs query strings. (Hint: Apollo's ResultCache relies on identical AST objects, so reading the same query with a different AST will not hit the ResultCache and will instead trigger a new write to it - see the small illustration after the table.)

| Read type | Notes | Example Scenario |
| --- | --- | --- |
| R1: Read (empty cache) | Setup: None. Test: Read from the empty cache. Comment: This demonstrates the client overhead for no-op queries. | N/A |
| R2: Read, fully cached, identical query written | Setup: Write query data to the cache. Test: Read the identical query. Comment: This demonstrates the performance of the first read after a write. An identical query is one whose AST object is strictly equal. | Explicitly warming the cache (e.g. with chat members, or slimcore data), where the identical query is used to read and write. |
| R3: Duplicate read, fully cached, identical query | Setup: Write query data to the cache, read the identical query. Test: Read the identical query. Comment: This demonstrates the second read of the identical query. | Repeated reads, e.g. pulling the chat list every time a user switches from Channels to chats. |
| R4: Duplicate read, fully cached, same query shape | Setup: Write query data to the cache, read the identical query. Test: Read a query with the same shape but a different AST. Comment: This demonstrates the client's reliance on the AST vs the query string in its caches. | Repeated reads from separate components that request the data with differently crafted queries. |
| R5: Read 50% of fields, fully cached | Setup: Write query data to the cache. Test: Read a query with half of the fields of the original query. Comment: This demonstrates how the client reads subsets of queries. | Explicitly warming the cache (e.g. with chat members, or slimcore data), while the data queried by components requires only a smaller set of fields. |
| R6: Random read, fully cached, expanded response | Setup: Write query data with a big number of items to the cache. Test: Read a fragment for a random item. Comment: This demonstrates the overhead of a big cache size on the client's read behaviour. | The behaviour of a client as the cache size grows over time during a Teams session. |
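To make the "identical AST vs same shape" distinction tangible, here is a small illustration. It assumes graphql-tag's documented behaviour of caching parsed documents by their source string, while graphql-js's `parse` returns a fresh AST every time.

```typescript
import { gql } from "@apollo/client";
import { parse } from "graphql";

const source = `query Chat { chat { id title } }`;

// graphql-tag memoizes parsed documents, so parsing the same string twice
// yields the very same AST object - reads with it can hit the ResultCache.
const docA = gql(source);
const docB = gql(source);
console.log(docA === docB); // true

// graphql-js's parse creates a new AST object each time: same query shape,
// different identity, so it will not match an existing ResultCache entry.
const docC = parse(source);
console.log(docA === docC); // false
```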

Write

Similar to the reads, we have again been testing writes to an empty/full cache, with identical/same-shape queries. Furthermore, you can also see observers here, which represent watched queries in Apollo and subscriptions in Relay. Conceptually, if you imagine multiple components on a page which are all watching a certain query or fragment, these are our observers (see the sketch after the table below). We tested with 1, 25, and 125 of them to see the growing overhead in the different clients. The writes in a fully cached state act as empty writes, since no data is updated; they only trigger the observers to re-read their queries.

| Write type | Notes | Example Scenario |
| --- | --- | --- |
| W1: Write, empty cache, no observers | Setup: None. Test: Write query data to the cache. | Writing a startup data response to the cache before any observers are set up, on a clean boot. |
| W2: Write, empty cache, 1, 25, 125 observers | Setup: Create observers. Test: Write query data to the cache. Comment: This demonstrates the overhead caused by observers when writing to an empty cache. | N/A |
| W3: Write, fully cached, 25, 125 observers | Setup: Write query data into the cache, set up observers for partial queries. Test: Write the main query data to the cache again. Comment: This demonstrates whether the client updates the observers even when the data didn't change. | Refetching a query from the network, e.g. people or presence data where no data has changed. |
| W4: Write, fully cached, 1, 25, 125 identical observers, identical query | Setup: Write query data into the cache, create identical observers. Test: Write the main query data to the cache again. Comment: This demonstrates the differences between the clients for different vs identical ASTs. | Refetching a query from the network for components which observe an identical query. |
| W5: Repeated write, fully cached, identical query, no observers | Setup: Write query data into the cache. Test: Write the same identical query data into the cache again. Comment: Compare against writing to the empty cache (W1) to see the overhead caused by the initial write vs a cached write. | Refetching a query from the network without any observers. |
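In Apollo terms, "25 observers" roughly corresponds to 25 watched queries subscribed to the cache, as in the hedged sketch below; the query, endpoint and data are placeholders, not the benchmark's actual fixtures.

```typescript
import { ApolloClient, InMemoryCache, gql } from "@apollo/client";

const client = new ApolloClient({
  uri: "https://example.com/graphql", // placeholder, never hit in this sketch
  cache: new InMemoryCache(),
});

const CHAT_QUERY = gql`
  query Chat {
    chat {
      id
      title
    }
  }
`;

// Set up N observers; each one behaves like a component watching the data.
for (let i = 0; i < 25; i++) {
  client
    .watchQuery({ query: CHAT_QUERY, fetchPolicy: "cache-only" })
    .subscribe((result) => {
      // In the benchmark, notifying every observer is part of the measured cost.
      void result;
    });
}

// A write to the cache now has to be broadcast to every matching observer.
client.writeQuery({
  query: CHAT_QUERY,
  data: { chat: { __typename: "Chat", id: "1", title: "Standup" } },
});
```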

Update

When it comes to updates, we have tested the same things as in the previous tests, only this time updating the data in the cache. However, it is important to mention that these update tests represent the absolute worst-case scenario in real life, where all the observers are updated. Normally only a small subset would be notified. This particularly affects Apollo Client's records in the ResultCache, which need to be rewritten on every observer update.

| Update type | Notes | Example Scenario |
| --- | --- | --- |
| U1: Update, fully cached, 1, 25, 125 observers | Setup: Write the main query into the cache, create 1, 25, 125 partial query observers. Test: Write the same query shape to the main query with an updated response and verify that all affected partial queries have had their responses updated. Comment: This demonstrates the degradation of performance with frequently updated data. | Presence, user details, or emotions are updated quite often in Teams, and this example could represent that. (Edge case, e.g. XL meetings.) |

Findings (🥁 Drumroll.. 🥁)

Without further ado, I'd like to present our findings.

[Image: results of the benchmark - Relay performs better by an order of magnitude nearly all across the board]

Okay, now that you've had time to digest that, let's analyze these results together a little. From the results of the benchmark, we have found that, generally, across the different examples and tests, the Relay client has been performing around 10x faster across the board. This raises questions about what they are doing better and what is stopping people from making Relay their first choice. One of the factors in decision-making is most probably ease of use, as Relay has a steeper learning curve compared to Apollo and others. Furthermore, the need to precompile GraphQL artifacts in Relay might be an inconvenience for some, but definitely not for many.

Patterns

Looking at the benchmark results, you can notice a few patterns happening again and again, no matter what type of query you test. I'd like to highlight these using the following image, and explain why they are occurring.

[Image: results of the benchmark with A, B1, B2 pattern markers]

A - This simply shows us that Relay being the best across the board, with just a few exceptions, is a recurring pattern.

B1, B2 - This pattern provides a more interesting insight, namely the effect of Apollo Client's ResultCache. At B1 we can see a duplicate read, fully cached, identical query test, which is exactly what the ResultCache is optimised for. To explain: when the first read from the cache after the initial write is performed, Apollo Client memoizes the query data and saves it to the ResultCache. After that, any time a query with an identical AST is read from the cache, instead of going through the denormalization process against the EntityStore, it is returned straight from the ResultCache. Very quickly. The same scenario occurs at B2, where we are testing empty writes. The ResultCache shines here again: when the observers are created, it memoizes all the query data that is being watched, and since the writes are empty and do not update any data, all the observers need to do is re-read the same data from the ResultCache. If you compare this to Apollo Client with the ResultCache disabled, you can see that the latency there is up to 10x slower at some points.
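To summarise the mechanism, the idea behind a result cache boils down to memoizing the denormalized result, keyed by the exact document object. The sketch below is a deliberately naive model of that idea, not Apollo's actual implementation (which also tracks fine-grained dependencies for invalidation).

```typescript
import type { DocumentNode } from "graphql";

// A deliberately naive model of a result cache: memoize the expensive
// denormalized read, keyed by the exact (strictly equal) document object.
class TinyResultCache {
  private results = new WeakMap<DocumentNode, unknown>();

  read(document: DocumentNode, denormalize: () => unknown): unknown {
    if (this.results.has(document)) {
      return this.results.get(document); // identical AST: cheap cache hit
    }
    const result = denormalize(); // walk the normalized store (expensive)
    this.results.set(document, result);
    return result;
  }

  // On writes that actually change data, affected entries must be dropped.
  // Apollo tracks dependencies for this; here we simply clear everything.
  invalidate(): void {
    this.results = new WeakMap();
  }
}
```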

Of course, as everywhere, exceptions occur and the results for each query are ever so slightly different. However, these patterns are repetitive and have been showing up in all of our tests, so I believe they give us a good representation of the results.

Closing thoughts

If you've reached this point, I'd like to thank you for reading my post. It has been an amazing experience learning all of this during the course of my internship, and now trying to teach you something new as well. But before the tears come out, let me leave you with some of my final conclusions and thoughts.

When it comes to Relay, it is a highly optimized React framework which provides the best and most reliable latency performance. No doubt about that. Apollo, on the other hand, has libraries for the whole stack and for a whole lot of languages, but they are definitely lagging behind on the performance side. However, it is important to mention that there have been experimental caches for Apollo Client previously which provided similar performance to Relay, so ... perhaps, instead of hating it after reading this post, let's try to come up with a way to optimise it. That's what open-source is about after all 😉
