DEV Community

Juan P. Lima
Juan P. Lima

Posted on

Improving your App's Performance: Solving the N+1 Problem

In this article, we'll be using a simple NodeJS API and a Postgres DataBase, the DB will have the following model:
DB Model
We'll have the category table and the item table, and there can be many items associated with one category.
On our NodeJS API, we'll have a function that will get all categories and their following items:
The function that will be getting the categories
And we'll have a function that will be getting items by category:
The function that will be getting the items by category
for this article, we'll have: 
Category A with the items: Item 1, and Item 2
Category B with Item 3

The return from our getCategories function will be an array of categories, each one having a property called "items" that will be an array of items associated with that particular category:
The return from our getCategories function

The Problem

So, have you noticed any particular performance issue with the way we're getting our categories and their items? No? So let me explain…

The N+1 Problem

Have you ever heard about the N+1 problem? That's when we execute N queries when we could've executed just 1, executing more queries will not only decrease the performance but will also consume more resources, and that's something we don't want in our apps.
Going back to our example, we're executing 1 query statement to get all the categories, then we're executing another query for each category so we can get its items, in our case, we're running 3 queries, but as the amount of categories increases, so do our number of requests on the items table, making it a Big O(N).

Fixing our code

Now that we know where the problem is, let's get started with our fix, first things first, we no longer need to loop through all our categories, since the items property will be coming from our query, we can also return directly the rows returned from our query, and then we can start using the "LEFT JOIN" statement in our query to fetch all items from that category.

Our query would look something like this:SELECT * FROM category c LEFT JOIN item i ON i.category_id = c.id; , the problem with this query, is that it won't be returning the data in the format we're expecting, see an example from executing this query on DBeaver:
The data returned from DBeaver
It would be repeating the category on every item it has, that's not what we want, so we need to make one last change in our query, that is, selecting all category's properties, aggregate the values coming from the item table, and add a "Group By" clause with the id of our category on it, that would be our final query: SELECT c.*, json_agg(i) items FROM category c LEFT JOIN item i ON i.category_id = c.id GROUP BY c.id; , see how the result looks on DBeaver:
The data returned from DBeaver
Now going back to our code and making all the necessary changes, we'd end up with the following code:
The improved getCategories function
by making a request, we'd get the following result:
The return from our getCategories function
As you can see, it's the exact same result we had with our underperforming code, but using a way better approach.

Benefits

There are numerous benefits now that we are free from our N+1 problem, some of them are:
Improved performance: Now that we're no longer running multiple queries, the additional network latency plus the database processing that came with these other queries, will no longer exist.
Reduced network traffic: As mentioned above, we're running fewer queries, therefore, we won't be using as much traffic.
Scalability: By running fewer queries, we won't be overloading our database with multiple requests, so we can achieve more with fewer resources.

Conclusions

Hope you people could learn a little more about how to improve the performance of your APPs, both suggestions and corrections are welcome!

Top comments (0)