Last summer we open sourced our platform.
But that was just the code. We're now opening up more of our operations by providing our Skylight dashboard to the community. This will help provide more context for everyone committing code to the project, and will also act as a learning opportunity for anyone who wants to peer into our operations as we grow. After all, there's always been something inherently meta about software designed for people to gather and talk about software.
👉 DEV Skylight Dashboard 📊
Skylight provides insights into response times, slow queries, inefficient memory allocation—and overall helps us paint a picture of what's happening in the aggregate as folks use the platform.
We make heavy use of our CDN to actually serve most of our traffic without having to hit the origin server at all. You can read about that here:
What it Takes to Render a Complex Web App in Milliseconds
Ben Halpern ・ Aug 18 '17
But, as we add new functionality and grow, we have more and more types of requests hitting the origin one way or another. And these are the insights that we can glean from Skylight.
How it works
Here is an example of Skylight in action. On this page in particular, here is the performance impact of the sticky sidebar of this post (visible if you're on a desktop device)
On fast requests
This diagram shows the performance of this part of the page for fast requests. You’ll see that this area of the page is rendered 24% of the time, and when it does render, it's fairly performant. This situation is a result of fragment caching, so 76% of the time this is served as stored HTML from memory as opposed to fresh queries into the database.
On slow requests
On the other hand, we can learn a lot by examining what is happening when the page is slow. Part of the slowness here is because we are serving this area cold more often and performing other cold queries underneath, resulting in a pretty big performance hit.
Tuning Rails performance is a lot about efficient queries and sensible caching strategies.
Russian doll caching in Rails
Jeff Kreeftmeijer for AppSignal ・ Apr 4 '18
Article No Longer Available
Skylight is fairly intuitive, but this stuff is inherently pretty complex, so it may take some time to understand just what you're looking at. Skylight's getting started guide is a great place to start.
Some of the sloppiest code in the application is a result of looking for performance boosts here and there, so it is always going to be a matter of give and take. You can check out our open source CodeClimate for some insights into some of the parts of the codebase that need the most work in terms of technical debt and code readability.
Our project, community and codebase are all in a much better place than when we first released the code open source. We're about half a year into this particular part of the journey and we cannot wait to keep going with you. Successes and failures of the project have rarely been strictly technical in nature, but we have had some really great technical wins as well. Every time we take a step like this, it is another step towards being the open and transparent tech company that we wish more were like.
Happy coding!
Top comments (21)
This is so cool! It helps to understand which parts could be prioritized as contributions 🚀🚀🚀
It's not a hair-on-fire problem, but memory allocation is an area that can always be improved in our type of application.
Here's an example of a type of query that happens all over the app:
I believe
includes(:user)
is a huge memory hog because all the user associations get loaded into memory in their entirety when we typically only need the username, profile pic, etc. I think this could be overcome with a different type of query but I haven't really had the time to look into it (and for whatever reason I've never been good at wrapping my head around query optimization or DB queries in general). I've also had lingering doubts about whether I understand this well enough to bother with this optimization.So I've spent most of my time elsewhere and thus far nobody else has swooped in to tackle this specific issue, but I'm pretty sure it could go a long way in terms of our memory allocation, which then will open us up to further parallelize our processes and generally be more performant.
We also have some lingering n+1 issues and other types of repeated queries. Within 2019 we'd love to get to a place where we can let other folks run the application as their own community app and a part of that is improving efficiency so people can run the app more cheaply and reliably.
I believe you, it's probably one of those cases where the DB behaves perfectly fine until you have lots and lots of data. It would be nice if we could create a script, separate from seeds.rb, that creates a DB with a size comparable to the real one. This way one could measure performance locally and see the query plan to understand what's going on. The DB can change how it executes a query depending on the estimated cost.
Another thing that could help is using the latest versions of PostgreSQL. PostgreSQL 10 and 11 execute more queries and joins in parallel, which could help a bit. But I would start by "explaining" the queries if that's the bottleneck (added to what you hinted at as Rails taking too much memory transforming rows into objects). Upping PostgreSQL's version, unless there's a specific reason to do it, won't make a huge difference
You can run
explain
like this in the console:This is a cool guide to interpret the info explain outputs: Reading a Postgres EXPLAIN ANALYZE Query Plan
Shouldn't n+1 be notified by the gem
bullet
? Is it because they are triggered only in production?Anyhow, I'm optimistic about the fact that devto can be optimized!
Yes, having an idea of how the app scales as the user base grows could be helpful for other communities.
I've got no Rails experience, but I've done quite a lot of database work. Am I looking at right place for article indexes?
github.com/thepracticaldev/dev.to/...
I'm not really seeing an index that would help this query which probably translates into this?:
It is usually a good idea to follow POCO (Partition, Order, COlumns) indexing convention. So in this case I'd probably go for code that would generate me such as:
This might be not exact syntax. But if I managed to translate ruby code well into what SQL query might look like, this could give you nice query plans.
oss.skylight.io/app/applications/K...
This dashboard has a sql query that probably lists columns from article table twice. It has "articles".* (all columns from articles table) and then lists more columns from the very same table.
Also this uses lower function. Usually if SQL uses functions, these can result in table scans. Unless you store normalized tag version, index function expression or your database is not case sensitive.
Also, dashboard says that the following query occurs in 100% cases:
I'm not sure if it has to select everything, but it also seems that there's no index on
path
column, hence - this query should scan the whole tableHi Evaldas, great analysis, in reality though this query:
translates to the following two queries:
followed by a
SELECT *
for the users with those ids:(I'm running it on my local copy)
If we examine the query plan for the first one, we can see it's pretty basic:
Running
EXPLAIN ANALYZE
on the query itself doesn't yield much more info, because on a small subset is not a costly operation:By examining the indexes on the table
articles
we can see that, although there's an index onpublished_at
which helps the query, there is no index on the other important filtering field:published
. Published articles are hit every single time they are queried from the DB (unpublished articles are visible only to their creator, which already is looking into a smallest sample of all articles, instead every public query filters onpublished = true
).My suggestion is to run
EXPLAIN ANALYZE
on the query, create an index, re-run theEXPLAIN ANALYZE
afterwards and see what happens. The estimated cost should lower.Not being a join, there's no foreign keys involved. The other query ActiveRecord does involves only the primary key so there's not much that can be done on that, for now.
This is the part I'm not fully understanding. How is it that the same table gets hit so many times in a single transaction? Is it because Skylight aggregates different paths/URLs in one single page? Otherwise it would be worrysome if a single page were to go back to the
articles
andusers
table so many times:Yeah, functions kill performance a little :D I suspect that is autogenerated by a library devto is using, it might require an investigation. There's an index on the
tags
table but it's not an index onlower
which as you say will result in a table scan instead :(Yep, there's no index on
articles.path
either.I'll recap:
articles
most probably needs an index onpublished
tags
most probably needs an index onlower(name)
articles
definitely needs an index onpath
All of these indexes can be tested on a copy of the production DB before.
These should help query performance, before diving in how to select less data or caching stuff or other tricks.
Pinging @ben here :)
Thank you! How would first query plan change if you were to create an index I've suggested? I've got difficult time reading explain as I'm used to sql server plans.
But I see there is a scan and two sort operations.
With suggested index I wanted to eliminate any kind of sort operations as they're quite heavy on any relational database.
I believe you are looking at aggregates here, so these are all different types of queries based on different requests. This is one of the nuances of Skylight that are less intuitive IMO. A single transaction doesn't hit all of those (unless they all show 100% hits). @vaidehijoshi correct me if I'm wrong.
Are indexes on boolean columns that beneficial? I'm all for adding it if they are.
I think there is one already or am I missing something?
Yes, definitely.
All in all I still want to bring up the question of performance vs memory. My thought was that the
includes(:user)
is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.Queries that load users are perfectly fine, they use primary key to do so. Like you mentioned, I would just limit columns that it selects.
Now indexing a boolean column alone does not make much sense, but my intention was to create a filtered index (where published is true), the second column being published at with descending order meaning what within published articles, all of them will be already sorted in disk.
I have included rest of the columns that are in select clause so that database does not have to do extra lookups from the index to the actual table to get remaining columns.
Having such an index means it 100% satisfies published articles query.
The amount of "that" unfortunately needs to be measured. I don't think a full index would make a difference, but a partial one might (if
published
= false is significantly less used).Something like:
I'm trying to generate a massive
articles
table to see if I can have PostgreSQL activate the index.There is one, but it's on "name", not on lower(name):
with something like this:
in reality this index might be ignored, I don't think the tags table is that big, PostgreSQL might do a sequential scan if it deems it faster
Have you tried the performance vs memory difference by forcing eager loading in the
includes
?With:
A few more specific things to note:
The
StoriesController#index
andStoriesController#show
endpoints are sort of re-used for multiple types of pages because we use top level routing in the controller, sodev.to/:user
anddev.to/:organization
for example, share a controller action. I'm not sure what the right way to do this but it makes the Skylight dashboards a bit more complicated.Any PRs to effectively refactor this stuff is welcome. @lightalloy you did some work here, but we never really touched on this elements. Feel free to give it some thought.
There's plenty we can do to modify and tune performance outside of the context of Skylight, so anyone outside the absolute core team will not get the whole picture, but we will be trying to offer more dashboards in the future, to really open things up. But we will be doing so conservatively, with a very strong bias towards principles of least privilege until we feel like it's better to have certain things totally in the open.
Ben, thanks for mentioning me.
I still think that the
StoriesController
has too many responsibilities, and they should be separated for better maintainability and profiling.I've worked on separating views to start with, but that's only a small part of the task.
I would love to continue work on the refactoring, but I suppose that's not the top priority right now. I'll return to this issue when it's possible, or I can help if there are other contributors who want to work on it.
I suppose it's possible to separate those routes using route constraints.
I saw
StoriesController#show
in drilled down , it seemsSELECT FROM articles
This query is getting hit multiple times (4 times, 2 with multiple parameters and 2 times with same parameters) in single request. (as you said, this might be due to using same endpoints for multiple routing)
Still average response time is better than
Comment
query with 2 calls and taking 171 ms average. (Definitely area for improvement compared to above)Overall very interesting insights. It will be interesting to see details on what benchmarks (SLA) you would create to further enhance the app
This is so wonderful! I'm really excited to see what contributors will focus on and fix using Skylight :)
I've used Skylight in the past for rails projects. How are you open sourcing the data? Can you just open a Skylight application to the public? (Like a public Trello board?)
I wasn't aware this was a feature.
Hi Sergio,
I work at Skylight so I can jump in here :)
We launched the Skylight for Open Source Program last year, which grants free Skylight accounts to open source apps, and provides them with a public dashboard. You can read more about it here, if you're interested!
That sounds terrific!
Skylight is so easy to use and makes your efforts LASER-POINTED to slowness in your rails apps.
Yay! I'm glad you've had a good experience using Skylight ☺️
Did this integration stopeed working?
Yeah, sorry we moved away from this and are currently in the process of getting our replacement open-sourced. We only switched off of skylight because we needed to aggregate some of our monitoring. We’ll be back with our open metrics in not too long.
That response was way quicker than I expected :) thanks for the clarification!
Cool, if I coordinate my bot network correctly this could become performance art!
:D