When I started learning GraphQL, it was hard for me to understand the purpose of the tool, its use cases, and the problems we can run into with it. Because of this, I decided to write this short article to show you some points that I think are important before you adopt this excellent tool in your next project.
Points covered in this article:
- What is GraphQL.
- GraphQL use cases.
- Layer of complexity.
- "n+1" problem and how to fix it.
What is GraphQL?
GraphQL is an alternative to the common REST pattern. GraphQL and REST can use the same protocol for communication between server and client: HTTP/HTTPS. However, the way requests are made is a bit different.
When building a front-end web application, it's very common to consume an API. When this API is built with REST, we can send a GET request to the /users resource to get a list of users, a POST request to create a new user, and so on. GraphQL, in contrast, usually exposes a single endpoint that is typically called with POST, and instead of HTTP verbs the operations are divided into queries, mutations, and subscriptions.
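To make the difference concrete, here is a minimal sketch of how the two kinds of requests could be described in JavaScript. Nothing is sent over the network; the /graphql path, the users field, and the createUser mutation are assumptions made up for this illustration.

```javascript
// Sketch only: the endpoint paths and field names are assumptions.

// REST: the HTTP method and the path together say what you want.
function restListUsersRequest() {
  return { method: "GET", url: "/users" };
}

// GraphQL: one endpoint, and the operation travels in the request body.
function graphqlRequest(query, variables = {}) {
  return {
    method: "POST",
    url: "/graphql",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

const listUsers = graphqlRequest("query { users { id name } }");
const createUser = graphqlRequest(
  "mutation ($name: String!) { createUser(name: $name) { id } }",
  { name: "Ada" }
);

console.log(restListUsersRequest().url); // /users
console.log(listUsers.url === createUser.url); // true: same endpoint for both
```

Reading and writing go to the same URL; only the body tells the server whether it is a query or a mutation.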
GraphQL use cases
In an e-commerce context, it's common to have a web application and a mobile application offering products to the user. With a REST API, when both clients request /users they get the same information, and this can be a problem because the mobile client will not always need the same data as the web client. This is where GraphQL comes in! This amazing tool makes data fetching more dynamic by leaving the decision about which data is returned to whoever is asking (the client).
If you want to use GraphQL in your next API, you need to consider some points first. They are:
- Underfetching / Overfetching
- Layer of complexity
Underfetching / Overfetching
In a system where costs are based on the amount of data traffic, the quantity of information in each response matters, and it can go wrong in two ways: underfetching and overfetching. Underfetching means a request returns less data than the client needs, so the client has to make extra requests; overfetching means a request returns more data than the client needs.
For an example of traffic-based costs, see the AWS EC2 data transfer pricing.
As we saw above, GraphQL solves this by making the data returned by each request dynamic.
For example, if I have this entity:
I can choose which fields I need in the client, so both of these GraphQL requests will work:
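As a rough sketch of the idea, assume a hypothetical user entity with the fields id, name, email, and address (these names are invented for the illustration). The resolveUser function below is only a toy stand-in for a GraphQL server: it reads the requested field names with a naive regular expression, which is not how real GraphQL parsing works, and returns just those fields.

```javascript
// Hypothetical entity: the field names are assumptions, not a real schema.
const user = {
  id: 1,
  name: "Ada",
  email: "ada@example.com",
  address: "12 Example Street",
};

// Two requests that clients could send for the same entity.
const webQuery = "{ user { id name email address } }";
const mobileQuery = "{ user { name } }";

// Naive extraction of the field names inside `user { ... }` (toy parser).
function requestedFields(query) {
  const selection = query.match(/user\s*\{([^}]*)\}/)[1];
  return selection.trim().split(/\s+/);
}

// Toy stand-in for a GraphQL server: return only the requested fields.
function resolveUser(query) {
  const fields = requestedFields(query);
  return Object.fromEntries(fields.map((field) => [field, user[field]]));
}

console.log(resolveUser(webQuery)); // all four fields
console.log(resolveUser(mobileQuery)); // { name: 'Ada' }
```

The mobile client receives one field instead of four, without the server needing a second endpoint.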
Loading useless data on a slow network
If you have ever used a mobile app where each page loads for an eternity before showing its content, overfetching may be the cause. In a mobile context, the user will not always have a good internet connection, so it's very important for us developers to load just the information that the app needs, nothing more. For example, fetching a huge list consumes a lot of bandwidth! So even if you are developing an app with the best technologies, it can be very slow just because of the amount of data you are fetching.
Layer of complexity
When you adopt the GraphQL approach, you need to know about some difficulties it brings with it. I will mention a few below.
- File upload
File upload is not part of the GraphQL specification, so it isn't as easy as with a common REST API, but there are some solutions for this.
- You can encode the file as base64. If you do this, you need to consider the size of the request, which grows with the file. I don't recommend this approach.
- As I mentioned above, GraphQL is an alternative to REST, so we can use both to create a hybrid API, with GraphQL for the main resources and REST for file uploads. If you use NestJS, I will leave this article here for you to learn it step by step.
- Use a library that abstracts the file upload work in the API. You can use graphql-upload if you are using Node. The advantage of this library is that under the hood it uses another library called busboy, which parses the file into streams so we can process it on demand.
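To put a number on the request size concern with base64, here is a small sketch using Node's built-in Buffer. The 3,000,000-byte buffer of zeros is an arbitrary stand-in for an uploaded file.

```javascript
// Stand-in for an uploaded file: 3,000,000 bytes (the size is arbitrary).
const file = Buffer.alloc(3_000_000);

// Base64 turns every 3 bytes into 4 characters.
const encoded = file.toString("base64");

console.log(file.length); // 3000000
console.log(encoded.length); // 4000000
console.log((encoded.length / file.length).toFixed(2)); // 1.33
```

The payload is about a third larger than the file, and the whole string has to sit in memory before it can be sent, which is part of why a streaming approach is attractive for bigger files.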
- Cache on the network level
In a REST API, it's very easy to work with a cache at the network level because REST divides the resources of our application across many routes. Therefore, we can leave part of the caching responsibility to the browser.
With a GraphQL API it's more difficult because we have just one route, http://localhost/graphql, that serves all queries and mutations, and the queries differ depending on the data we want.
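The sketch below illustrates why the URL alone is not enough as a cache key for GraphQL. It assumes one possible scheme, hashing the query text together with its variables. The function names graphqlCacheKey and cachedExecute are hypothetical, and this is not how any particular client library implements its cache.

```javascript
const { createHash } = require("node:crypto");

// REST: "GET /users" and "GET /products" are already distinct cache keys.
// GraphQL: every request is "POST /graphql", so the key must come from the body.
// Assumption: hashing query + variables is just one possible scheme.
function graphqlCacheKey(query, variables = {}) {
  return createHash("sha256")
    .update(JSON.stringify({ query, variables }))
    .digest("hex");
}

const cache = new Map();

// Run `execute` only when this exact query + variables pair is not cached yet.
function cachedExecute(query, variables, execute) {
  const key = graphqlCacheKey(query, variables);
  if (!cache.has(key)) cache.set(key, execute(query, variables));
  return cache.get(key);
}

// Fake executor that counts how many times the "server" is actually hit.
let executions = 0;
const fakeExecute = () => {
  executions += 1;
  return { users: [] };
};

cachedExecute("{ users { name } }", {}, fakeExecute);
cachedExecute("{ users { name } }", {}, fakeExecute); // served from the cache
cachedExecute("{ users { name email } }", {}, fakeExecute); // different query
console.log(executions); // 2
```

Two queries that differ by a single field produce different keys, so this kind of caching has to live in the application or client layer rather than in a plain URL-based HTTP cache.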
For this situation, if you are using apollo-client, you can use the cache feature of this amazing tool.
"n+1" problem and how to fix it.
Before discussing this problem, we need to understand the syntax of GraphQL requests. Let's use this query as an example:
    query {
      authors { # first layer
        name
        books { # second layer
          title
        }
      }
    }
In the first layer of the query, authors, we are getting a list of authors. In the second layer, books, we are getting the books of the respective authors. This is pseudocode for getting the data for this query:
await ormConnection.getAuthors().getBooksFromAuthors()
The main behavior of this code is to run one SQL query to get all authors and then, for each author, another query to get that author's books. In this scenario, the number of executed queries depends on the number of authors in our DB. This is the "n+1" problem, or "1+n" for those who know it well: 1 query generates N other queries, and if you care about the performance of your application this situation will drive you crazy! However, we can do better and avoid it by batching the SQL queries.
If you like Prisma ORM, it has a built-in solution for this situation! You can read more about it in the official documentation.
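The counting can be made visible with a small sketch. It assumes an in-memory fake database where each helper function stands for one SQL query. The helper names and the author rows are made up for the illustration.

```javascript
// Fake in-memory "database"; rows and helper names are assumptions.
const authors = [
  { id: 1, name: "J. K. Rowling" },
  { id: 2, name: "Michael Crichton" },
];
const books = [
  { title: "Harry Potter and the Chamber of Secrets", author_id: 1 },
  { title: "Harry Potter and the Prisoner of Azkaban", author_id: 1 },
  { title: "Jurassic Park", author_id: 2 },
];

let queryCount = 0;

// Each helper stands for one SQL query.
function getAuthors() {
  queryCount += 1; // SELECT * FROM authors
  return authors;
}
function getBooksByAuthor(authorId) {
  queryCount += 1; // SELECT * FROM books WHERE author_id = ?
  return books.filter((book) => book.author_id === authorId);
}
function getBooksByAuthors(authorIds) {
  queryCount += 1; // SELECT * FROM books WHERE author_id IN (...)
  return books.filter((book) => authorIds.includes(book.author_id));
}

// Naive resolution: 1 query for the authors + 1 query per author.
function resolveNaive() {
  queryCount = 0;
  const result = getAuthors().map((author) => ({
    ...author,
    books: getBooksByAuthor(author.id),
  }));
  return { result, queries: queryCount };
}

// Batched resolution: 1 query for the authors + 1 query for all their books.
function resolveBatched() {
  queryCount = 0;
  const list = getAuthors();
  const allBooks = getBooksByAuthors(list.map((author) => author.id));
  const result = list.map((author) => ({
    ...author,
    books: allBooks.filter((book) => book.author_id === author.id),
  }));
  return { result, queries: queryCount };
}

console.log(resolveNaive().queries); // 3 (1 + N, with N = 2 authors)
console.log(resolveBatched().queries); // 2, however many authors there are
```

With 1,000 authors the naive version would run 1,001 queries, while the batched version would still run 2.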
Using dataloader to resolve the "1+n" problem.
The main responsibility of dataloader is batching: you give it a batch function that takes an array of keys, processes these keys, and returns an array of values, one per key and in the same order.
If you are working with Node, you can use the dataloader package, maintained in the GraphQL organization's repository on GitHub.
Example:
Now, using this dataloader lib, let's imagine that we have a fake database to resolve the authors and books query shown above:
Looking at the logs:
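A minimal sketch of this setup follows. To keep it runnable without installing anything, it uses TinyLoader, a hypothetical and heavily simplified stand-in written for this example, instead of the dataloader package itself. It only mirrors the basic shape (a batch function passed to the constructor, and a load(key) method that returns a promise) and leaves out features such as per-key caching. The fake rows are the books that appear in the logs.

```javascript
// Fake database; the rows are the books that appear in the article's logs.
const books = [
  { title: "Harry Potter and the Chamber of Secrets", author_id: 1 },
  { title: "Harry Potter and the Prisoner of Azkaban", author_id: 1 },
  { title: "Jurassic Park", author_id: 2 },
];

// TinyLoader is a hypothetical, simplified stand-in for the dataloader package:
// it collects every key requested in the same tick and calls batchFn once.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.queue = [];
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule a single flush when the first key of a batch arrives.
      if (this.queue.length === 1) process.nextTick(() => this.flush());
    });
  }

  async flush() {
    const batch = this.queue;
    this.queue = [];
    try {
      // batchFn must return one value per key, in the same order as the keys.
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, index) => item.resolve(values[index]));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}

let batchCalls = 0;

const booksLoader = new TinyLoader(async (authorIds) => {
  batchCalls += 1;
  console.log("Executed");
  // A real implementation would run one query here, e.g. WHERE author_id IN (...).
  return authorIds.map((id) => books.filter((book) => book.author_id === id));
});

async function main() {
  const [first, second] = await Promise.all([
    booksLoader.load(1),
    booksLoader.load(2),
  ]);
  console.log("Author #1 books:", first);
  console.log("Author #2 books:", second);
  return { first, second };
}

const done = main();
```

With the real package, the equivalent calls would be new DataLoader(batchFn) and loader.load(key); the batching across a single tick is what lets two load calls share one database round trip.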
    Executed
    Author #1 books:
    [
      {
        title: 'Harry Potter and the Chamber of Secrets',
        author_id: 1
      },
      {
        title: 'Harry Potter and the Prisoner of Azkaban',
        author_id: 1
      }
    ]
    Author #2 books:
    [ { title: 'Jurassic Park', author_id: 2 } ]
As you can see in the logs, the dataloader callback that we passed ran only once.
Considerations
This is my first time writing about something, and I hope you enjoyed reading it :)
I will be happy if you leave some feedback about this text here, and send me a message on my LinkedIn if you have any doubts about GraphQL.
Hope you have a good day :)