
Rui Silva

GraphQL Fastify vs Apollo Server — Learnings from the real world


In this article, we’ll discuss a problem we faced when using Apollo GraphQL in a production environment and how we managed to solve it. To do so, we’ll compare two GraphQL server solutions, one built on ExpressJS and the other on Fastify, analyze the implementation details, and show you some benchmarks.

Apollo GraphQL

Apollo is a platform for building a unified graph, a communication layer that helps you manage the flow of data between your application clients (such as web and native apps) and your backend services. At the heart of the graph is a query language called GraphQL.
Apollo GraphQL

GraphQL Fastify Server

GraphQL Fastify Server is an implementation built on top of GraphQL. Out of the box, the package provides GraphQL integration with caching of both query parsing and validation, a JIT compiler to further optimize query execution, and a cache system. Unlike Apollo, this package only runs on Fastify.
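The caching of query parsing and validation boils down to memoizing the expensive per-query work once per unique query string. Here is a hedged stand-in sketch of that idea (no real GraphQL parsing involved; `compileQuery` is a placeholder for parse + validate + JIT compile, not the package’s actual API):

```javascript
// Memoize the per-query "plan" so parse/validate/compile runs once
// per unique query string instead of once per request.
const planCache = new Map();

function compileQuery(query) {
  // Placeholder for the expensive step (parse + validate + JIT compile).
  return { execute: (vars) => ({ query, vars }) };
}

function getPlan(query) {
  let plan = planCache.get(query);
  if (!plan) {
    plan = compileQuery(query); // pay the cost once...
    planCache.set(query, plan); // ...then reuse on every request
  }
  return plan;
}
```

Repeated requests with the same query string then skip the compile step entirely, which is where most of the per-request savings come from.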
GitHub - rpvsilva/graphql-fastify-server

Benchmark tools

Hey and Clinic Doctor

To compare the two servers, we used two tools. The first was ClinicJS, an open-source set of tools for diagnosing NodeJS performance issues, which also gives you suggestions and points you in the direction of a fix. The second was hey, a CLI tool for sending load to a server.
Below we show how we used them and the results of the comparisons.
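For reference, both tools are a one-line install, assuming you have the Node and Go toolchains available:

```shell
# ClinicJS ships as an npm package
npm install -g clinic

# hey is a Go tool
go install github.com/rakyll/hey@latest
```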

The problem

Some months ago, we had to build a GraphQL server to serve a customer-facing mobile application. The goal, from the moment we bootstrapped it, was to have something stable that could also scale. At the time, we found Apollo Server, and we chose it because of its community and the number of features and tools the platform has developed. Also, Apollo can run as a standalone GraphQL server or be paired with a NodeJS framework such as ExpressJS or Fastify. During development we had no problems with it, and the package was always enough to achieve our goal: delivering a good, fast product.


Once we reached the Minimum Viable Product (MVP), we deployed the first version to production on a Kubernetes cluster. After a while, traffic started to arrive on our server, as expected, and so did the problems.

We faced memory issues where the server kept hitting the Pod’s memory limit. After increasing the limit and adding horizontal autoscaling rules we hoped the problem was mitigated, but unfortunately it persisted; now the server crashed before even reaching the memory limit. 😔
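One thing worth checking in this situation (a general tip, not something specific to our setup): V8’s old-space heap has its own default cap, which is independent of the Pod’s memory limit, so the Node process can be killed by either bound. Setting the heap size explicitly keeps the two aligned:

```shell
# Cap V8's old-space heap explicitly so it tracks the pod limit
# (value in MiB; leave headroom for non-heap memory like buffers).
node --max-old-space-size=1536 dist/index.js
```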

Memory Usage by Container

Memory Usage by Pod

Cache

Since we were focused on delivering the MVP, we didn’t bother with a proper cache database like Redis; we prepared our server to support one, but stayed with the in-memory cache. Integrating a Redis cache was our next attempt to mitigate the persisting problem. It would reduce the number of read/write operations in memory and remove the data duplicated across pods, which is why we suspected the in-memory cache in the first place.
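The “prepared our server to support one” part can be sketched as a small cache interface: resolvers depend only on async `get`/`set`, so the backing store is swappable. This is a hedged sketch of the pattern, not our actual code; a Redis-backed class exposing the same two methods would drop in without touching resolver logic.

```javascript
// Minimal async cache interface with TTL. Resolvers call get/set only,
// so this in-memory Map can later be replaced by a Redis client
// wrapped in a class with the same shape.
class InMemoryCache {
  constructor() {
    this.store = new Map();
  }

  async get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt < Date.now()) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }

  async set(key, value, ttlMs = 60_000) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }
}
```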

We deployed a Redis database for each environment and started migrating to it. In the first days we monitored the progress and observed some improvements, as expected, since all GraphQL instances were now consuming from the same Redis database. We thought the issue was solved, but no… 😩


The problem wasn’t mitigated; the GraphQL instances just ran longer because memory didn’t grow as fast. At least this wasn’t a waste of time, as it was a big improvement in scalability.

Memory Usage by Container

Memory Usage by Pod

The research

Since the issue was not solved, we needed to keep researching to understand what was going on.

First, we observed memory usage under some simple load tests and noticed that, even on non-cacheable queries, memory grew constantly. We then updated the Apollo packages to the latest versions, tested again, and the problem persisted.

Next, we debugged with the Chrome DevTools, taking heap snapshots while running load tests, and we couldn’t find any memory leak in our code.
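For anyone reproducing this kind of investigation, the workflow is to start the server with the inspector enabled, attach via chrome://inspect, and diff heap snapshots taken before and after a load-test run:

```shell
# Expose the V8 inspector, then attach Chrome DevTools (chrome://inspect)
# and take heap snapshots between load-test runs to diff retained objects.
node --inspect=9229 dist/index.js
```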

After a while, we decided to create a Proof of Concept (PoC): a simple GraphQL server using Apollo, without any major external dependencies such as a cache system, plugins, or context logic. Running the same load tests against this bare project produced the same results: the same memory behavior, growing and crashing before reaching the limit.

The solution


So we started looking for alternatives, either to Apollo or to ExpressJS, the NodeJS framework we were using. We found Fastify and searched for benchmarks comparing it with ExpressJS. Part of the motivation for this article comes from that research, as we didn’t find any benchmarks specific to GraphQL. We then built a simple PoC, similar to the Apollo one, with the same queries and resolvers, and without caching, plugins, or context logic. Running the same load tests against it revealed some differences in memory behavior.

Using the clinic set of tools, we can get charts of CPU and memory usage over time just by running the following command:

```shell
clinic doctor --autocannon [ / --method POST ] -- node dist/index.js
```

Memory Usage on Apollo Server

Memory Usage on GraphQL Fastify Server

As you can see, the GraphQL Fastify Server is a lot more stable during the tests, and the PoC’s behavior matches what we saw in production. Clinic Doctor also shows a chart of the event loop delay, in milliseconds, on the Node thread. From the NodeJS documentation:

The event loop is what allows Node.js to perform non-blocking I/O operations — despite the fact that JavaScript is single-threaded — by offloading operations to the system kernel whenever possible.

Event Loop Delay on Apollo Server

Event Loop Delay on GraphQL Fastify Server

By analyzing the charts, we can see that Apollo Server has more spikes and a greater average delay than GraphQL Fastify Server.

Subsequently, we wanted to see how many requests per second each project could handle. This is where we used hey, running the command below against a query that computes the sum of two numbers.

```shell
hey -m POST -T "application/json" -d '{ "query": "{ add(x:123, y:321) }" }' http://localhost:8080
```
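For context, the benchmarked query needs nothing more than a one-field schema and a pure resolver. A sketch of what such a PoC resolver looks like (illustrative, not the actual project code or server wiring):

```javascript
// Minimal schema and resolver behind the benchmarked query.
const typeDefs = `
  type Query {
    add(x: Int!, y: Int!): Int!
  }
`;

const resolvers = {
  Query: {
    // A pure function: ideal for measuring server overhead
    // rather than resolver cost.
    add: (_parent, { x, y }) => x + y,
  },
};

console.log(resolvers.Query.add(null, { x: 123, y: 321 })); // 444
```

Because the resolver does almost no work, any difference in throughput between the two servers comes from the runtime itself: routing, parsing, validation, and execution overhead.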

As you can see in the charts below, there is also a big difference between the GraphQL Fastify Server and the Apollo Server. Additionally, hey reports the average time across all requests.

Requests per second

Average time in seconds

Production

After all this research and the PoC implementation, you may ask: “What were the results in production?” We also ran a test comparing both GraphQL implementations, and the results are stunning: while the Apollo Server crashed, the GraphQL Fastify Server instance kept running.

Comparison in production

In conclusion

The purpose of this article was to compare two GraphQL implementations after facing performance issues in production. We analyzed Apollo GraphQL in depth and found problems with its memory usage; we then researched alternatives and ran the same analysis on them. Given the results, we decided to migrate from Apollo to GraphQL Fastify Server, since it has the features we need and has proven to be more stable.

I also took this opportunity to show the package I recently published, GraphQL Fastify Server, where I focused on performance, optimization, and the developer experience of using and integrating it. Don’t forget to try it and give some feedback. 🤩

Thank you for reading. I hope you learned something from this article; I know I did 🚀

You can also find this article on Medium.
