A post by Clinton Pahl and Nick Boers, PhD
Table Of Contents
- Introduction
- AssociationLoader and its Drawbacks
- Background for the SQL Window Loader
- Aggregate Functions
- Window Functions
- Memory-Efficient N+1 Resolution
- Introducing the WindowLoader
- Conclusion
Introduction
At Jobber, we're constantly evolving a modern GraphQL API to support our Web-based interface, mobile app interfaces, and third-party integrations. GraphQL allows these clients to specify the field structure and data they need in response to API queries. For example, consider the following GraphQL query:
query JobVisits {
jobs {
nodes {
visits {
nodes {
title
}
}
}
}
}
In this GraphQL query, the client has requested the title for each job visit. The response will contain an array of jobs, and for each job, an array of visits with titles. The structure of the JSON response is similar to the query:
{
"data": {
"jobs": {
"nodes": [
{
"visits": {
"nodes": [
{
"title": "Initial Assessment of Property"
},
...
]
}
},
...
]
}
}
}
In our Rails application, we use the popular graphql Ruby gem to resolve GraphQL queries. When used naively, it essentially resolves queries as a depth-first tree traversal, which leads to the N+1 problem in GraphQL.
GraphQL’s N+1 problem, which might be better thought of as the 1+N problem, refers to the number of fetches from a backend data store necessary to resolve a relationship. In the previous example, a single fetch can obtain all of the jobs for an account. After obtaining all of the jobs, a naive resolver fetches the visits for the first job, the visits for the second job, and so on. After one fetch to get all the jobs, N additional fetches get the visits. Given the depth of relationships possible in a GraphQL query, these fetches from the backend data store can quickly balloon and lead to poor performance.
The poor performance will specifically be seen in API response times. Let’s assume the above query returned 100 jobs and that fetching the visits for each job from the database takes 2 ms. In this example, the additional 100 fetches will add 200 ms to the response time for just one field.
AssociationLoader
and its Drawbacks
For the relationships subject to the N+1 problem, the graphql-batch Ruby gem and its AssociationLoader
provide some relief. This gem was developed by Shopify. Using Ruby promises (provided by the promise.rb Ruby gem), it fundamentally alters the order of field resolution for GraphQL queries, and in a sense, converts the query from a depth-first traversal to a breadth-first traversal.
As a breadth-first traversal, a call to resolve visits
for a single job doesn’t actually fetch data for the job’s visits. Instead, it returns a promise. The resolver returns a promise for each call to resolve visits. Once they're all batched, the data can be fetched from the backend in a single operation.
Under the hood, the AssociationLoader
leverages ::ActiveRecord::Associations::Preloader
. Resolving a field with promises involves collecting all of the records (e.g., jobs) where an association (e.g., visits) needs to be resolved. The Active Record Preloader
then goes ahead and fetches all of the data in a single data fetch operation. After loading the data, the individual promises are fulfilled using the Active Record data loaded into memory.
Using the AssociationLoader
and GraphQL Ruby field extensions, developers can easily configure a field to preload the required associations when it's resolved:
field :visits, Types::VisitType.connection_type, null: false, preload: :visits
Unfortunately, this approach has a weakness as soon as a GraphQL query includes pagination arguments. Consider the following slightly more complicated GraphQL query, which obtains the first three visits for each job.
query FirstThreeJobVisits {
jobs {
nodes {
visits(first: 3) {
nodes {
title
}
}
}
}
}
In this example, the Active Record Preloader
can still be used to satisfy the visits
association for each job. Suppose jobs
contains all of the jobs, resolving visits
in a single fetch might involve a call like
::ActiveRecord::Associations::Preloader.new.preload(jobs, :visits)
Unfortunately, this code will fetch all of the visits for each job, even though the GraphQL response will only include the first three visits for each job. In this approach, the Rails application will need to perform the pagination, and in the process, it fetches unnecessary data from the database, which consume unnecessary memory in the Rails application.
Background for the SQL Window Loader
Our initial solution to preloading was based on the AssociationLoader
, and it served us well to a point. It addressed the GraphQL N+1 problem, and it significantly reduced our API response times when compared to not addressing the N+1 problem. Unfortunately, as we increased the use of connection types (with their first
, last
, before
, and after
arguments), retrieving all associated records from the database only to paginate them in the Rails application was inefficient. For objects with many associated records, it consumes more memory than necessary.
After recognizing the problem, we brainstormed options to offload some of the work onto the database server to ultimately reduce the Rails application’s memory consumption. One particularly promising avenue involved SQL window functions. After deciding to pursue SQL window functions, we started our work by considering the WindowKeyLoader
example described in the graphql-batch repository.
Given that many readers may be unfamiliar with SQL window functions, the following subsections provide some background.
Aggregate Functions
An understanding of SQL aggregate functions will help in understanding SQL window functions. Consider a traditional SQL aggregate function such as count()
. In a statement involving such an aggregate function, the GROUP BY
clause groups records, each distinct group becomes a row in the result. The database management system (DBMS) applies the aggregate function to each group’s records to produce the function’s output.
For example, consider the following query:
SELECT
account_id,
count(user_id) AS user_count
FROM
users
GROUP BY
account_id
This query
- selects records from the
users
table, - groups those records by
account_id
and essentially flattens them so each row in the result has a distinctaccount_id
, and - computes the field
user_count
for each row in the result by applying the functioncount()
to the records in each group.
That (simplified) explanation of the GROUP BY
clause provides some background for understanding window functions.
Window Functions
When using a window function, records are conceptually grouped only for the context of the function. Those groups are not flattened in the result. The OVER
clause immediately following the function name will apply the function to a window, and the OVER
clause itself defines the groups of that window.
For example, consider the following query:
SELECT
account_id,
user_id,
login_count,
rank() OVER partition_by_account_id AS user_rank
FROM
users
WINDOW partition_by_account_id AS (
PARTITION BY account_id
ORDER BY
login_count DESC
)
In this example, the DBMS applies rank()
to subsets of the records, where PARTITION BY account_id
defines the subsets. For each unique account ID, the result includes a field user_rank
with values from 1 to n where n is the number of users for the unique account ID. The users are ranked by the number of times they have logged into the system.
Memory-Efficient N+1 Resolution
Recall the following GraphQL query from earlier, which obtains the first three visits for each job.
query FirstThreeJobVisits {
jobs {
nodes {
visits(first: 3) {
nodes {
title
}
}
}
}
}
If jobs typically have very few visits, the AssociationLoader
might be a reasonable solution. If jobs have many visits, it would load many visits from the database that would ultimately be discarded because of the first: 3
filter.
Using SQL window functions, it’s possible to apply the first: 3
filter at the query level and load only the necessary records into the Rails application. For example, the following query would only load the first 3 visits for each job.
SELECT
numbered_visits.*
FROM (
SELECT
*,
row_number() OVER partition_by_job_id AS row_number
FROM
visits
WHERE
job_id IN (...)
WINDOW partition_by_job_id AS (
PARTITION BY job_id
ORDER BY id ASC
)
) AS numbered_visits
WHERE
row_number <= 3
For direct associations (i.e., those where a foreign key links two tables), deriving this SQL is a rather mechanical process. It’s mechanical enough that we created a new WindowLoader
to make use of SQL window functions for resolving these associations.
Introducing the WindowLoader
When data access patterns suggest an SQL window function will improve the performance resolving a field, simply adding our new window_load
argument to the GraphQL field
method will cause the resolver to use SQL window functions when resolving the field. The new window_load
argument provides the name of the association, e.g.,
module GraphqlSchema
module Main
module Types
class JobType < GraphqlSchema::Common::Types::BaseObject
...
field :visits, Types::VisitType.connection_type, window_load: :visits
...
end
end
end
end
In our BaseField
, derived from the graphql Ruby gem’s Schema::Field
, the initializer accepts this window_load
argument. When the argument specifies an association, the BaseField
constructor adds a custom connection extension to the field (WindowConnectionExtension
).
Our WindowConnectionExtension
class inherits from the gem’s Schema::Field::ConnectionExtension
and Schema::FieldExtension
classes. This connection extension has two hooks that wrap field resolution: resolve
and after_resolve
. The former hook is called to resolve the field, and in this instance, it uses our WindowLoader
class to obtain a Ruby promise for the resolution of the field. The latter hook is called after field resolution and after the resolution of promises, and in this instance, it uses our WindowConnection
class.
The former class, GraphqlSchema::Common::Loaders::WindowLoader
, which inherits from GraphQL::Batch::Loader
, first records the foreign keys that will need to be used in the SQL query. In response to .load
calls, it returns promises. To finally resolve the promises, it generates and runs a single SQL query that uses the previously-described window functions. Most developers using this window loader are totally unaware these steps occur behind the scenes.
The latter class, GraphqlSchema::Common::Pagination::WindowConnection
, which inherits from GraphQL::Pagination::ArrayConnection
, produces a result with the expected fields for pagination, e.g., cursors and total counts.
Conclusion
Naively using the graphql Ruby gem to resolve GraphQL queries for a Ruby on Rails API leads to an implementation that suffers from GraphQL’s N+1 problem. Iterating on that solution with a gem such as graphql-batch with its AssociationLoader
can dramatically improve the situation by solving the N+1 problem and significantly reducing API response times. When a GraphQL query accepts arguments for pagination, a solution like the AssociationLoader
can lead to loading more data than necessary from the database, and as a result, higher than necessary memory consumption in the Rails server. With SQL window functions, it’s possible to offload the pagination onto the database server so that the Rails application does not receive more records than necessary. Given the flexibility of the graphql and graphql-batch gems, it’s possible to create an easy to use interface for loading data using SQL window functions.
About Jobber
Our awesome Jobber technology teams span across Payments, Infrastructure, AI/ML, Business Workflows & Communications. We work on cutting edge & modern tech stacks using React, React Native, Ruby on Rails, & GraphQL.
If you want to be a part of a collaborative work culture, help small home service businesses scale and create a positive impact on our communities, then visit our careers site to learn more!
Top comments (4)
Interesting. Could you share the extension code please?
@bruno We're working on open sourcing this and I'll share it here when it's available!
@crpahl Any chance you're still able to open source this? It would be incredibly useful for the Rails/GraphQL community!
That sounds smart. I need to check if it's possible to do something like that in Elixir/Absinthe