Thank you! How would first query plan change if you were to create an index I've suggested? I've got difficult time reading explain as I'm used to sql server plans.
But I see there is a scan and two sort operations.
With suggested index I wanted to eliminate any kind of sort operations as they're quite heavy on any relational database.
This is the part I'm not fully understanding. How is it that the same table gets hit so many times in a single transaction?
I believe you are looking at aggregates here, so these are all different types of queries based on different requests. This is one of the nuances of Skylight that are less intuitive IMO. A single transaction doesn't hit all of those (unless they all show 100% hits). @vaidehijoshi
correct me if I'm wrong.
articles most probably needs an index on published
Are indexes on boolean columns that beneficial? I'm all for adding it if they are.
tags most probably needs an index on lower(name)
I think there is one already or am I missing something?
articles definitely needs an index on path
Yes, definitely.
All in all I still want to bring up the question of performance vs memory. My thought was that the includes(:user) is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.
Queries that load users are perfectly fine, they use primary key to do so. Like you mentioned, I would just limit columns that it selects.
Now indexing a boolean column alone does not make much sense, but my intention was to create a filtered index (where published is true), the second column being published at with descending order meaning what within published articles, all of them will be already sorted in disk.
I have included rest of the columns that are in select clause so that database does not have to do extra lookups from the index to the actual table to get remaining columns.
Having such an index means it 100% satisfies published articles query.
Are indexes on boolean columns that beneficial? I'm all for adding it if they are.
The amount of "that" unfortunately needs to be measured. I don't think a full index would make a difference, but a partial one might (if published = false is significantly less used).
in reality this index might be ignored, I don't think the tags table is that big, PostgreSQL might do a sequential scan if it deems it faster
All in all I still want to bring up the question of performance vs memory. My thought was that the includes(:user) is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.
Have you tried the performance vs memory difference by forcing eager loading in the includes?
Thank you! How would first query plan change if you were to create an index I've suggested? I've got difficult time reading explain as I'm used to sql server plans.
But I see there is a scan and two sort operations.
With suggested index I wanted to eliminate any kind of sort operations as they're quite heavy on any relational database.
I believe you are looking at aggregates here, so these are all different types of queries based on different requests. This is one of the nuances of Skylight that are less intuitive IMO. A single transaction doesn't hit all of those (unless they all show 100% hits). @vaidehijoshi correct me if I'm wrong.
Are indexes on boolean columns that beneficial? I'm all for adding it if they are.
I think there is one already or am I missing something?
Yes, definitely.
All in all I still want to bring up the question of performance vs memory. My thought was that the
includes(:user)
is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.Queries that load users are perfectly fine, they use primary key to do so. Like you mentioned, I would just limit columns that it selects.
Now indexing a boolean column alone does not make much sense, but my intention was to create a filtered index (where published is true), the second column being published at with descending order meaning what within published articles, all of them will be already sorted in disk.
I have included rest of the columns that are in select clause so that database does not have to do extra lookups from the index to the actual table to get remaining columns.
Having such an index means it 100% satisfies published articles query.
The amount of "that" unfortunately needs to be measured. I don't think a full index would make a difference, but a partial one might (if
published
= false is significantly less used).Something like:
I'm trying to generate a massive
articles
table to see if I can have PostgreSQL activate the index.There is one, but it's on "name", not on lower(name):
with something like this:
in reality this index might be ignored, I don't think the tags table is that big, PostgreSQL might do a sequential scan if it deems it faster
Have you tried the performance vs memory difference by forcing eager loading in the
includes
?With: