Ben Halpern

for The DEV Team

Posted on Feb 7, 2019

DEV Community Is Open-Sourcing Its Performance Metrics via Skylight

#meta #opensource #showdev #rails

Last summer we open sourced our platform.

dev.to is now open source

Ben Halpern ・ Aug 8 '18

#meta #opensource

But that was just the code. We're now opening up more of our operations by providing our Skylight dashboard to the community. This will help provide more context for everyone committing code to the project, and will also act as a learning opportunity for anyone who wants to peer into our operations as we grow. After all, there's always been something inherently meta about software designed for people to gather and talk about software.

👉 DEV Skylight Dashboard 📊

Skylight provides insights into response times, slow queries, inefficient memory allocation—and overall helps us paint a picture of what's happening in the aggregate as folks use the platform.

We make heavy use of our CDN to actually serve most of our traffic without having to hit the origin server at all. You can read about that here:

What it Takes to Render a Complex Web App in Milliseconds

Ben Halpern ・ Aug 18 '17

#webperf #meta #cdn #webdev

But, as we add new functionality and grow, we have more and more types of requests hitting the origin one way or another. And these are the insights that we can glean from Skylight.

How it works

Here is an example of Skylight in action. On this page in particular, here is the performance impact of the sticky sidebar of this post (visible if you're on a desktop device)

On fast requests

This diagram shows the performance of this part of the page for fast requests. You’ll see that this area of the page is rendered 24% of the time, and when it does render, it's fairly performant. This situation is a result of fragment caching, so 76% of the time this is served as stored HTML from memory as opposed to fresh queries into the database.

On slow requests

On the other hand, we can learn a lot by examining what is happening when the page is slow. Part of the slowness here is because we are serving this area cold more often and performing other cold queries underneath, resulting in a pretty big performance hit.

Tuning Rails performance is a lot about efficient queries and sensible caching strategies.

Russian doll caching in Rails

Jeff Kreeftmeijer for AppSignal ・ Apr 4 '18

#performance #caching #optimization #rails

Article No Longer Available

42 performance tips for Ruby on Rails

Magnus Skog ・ Jan 26 '19

#ruby #rails #performance

Skylight is fairly intuitive, but this stuff is inherently pretty complex, so it may take some time to understand just what you're looking at. Skylight's getting started guide is a great place to start.

Some of the sloppiest code in the application is a result of looking for performance boosts here and there, so it is always going to be a matter of give and take. You can check out our open source CodeClimate for some insights into some of the parts of the codebase that need the most work in terms of technical debt and code readability.

Our project, community and codebase are all in a much better place than when we first released the code open source. We're about half a year into this particular part of the journey and we cannot wait to keep going with you. Successes and failures of the project have rarely been strictly technical in nature, but we have had some really great technical wins as well. Every time we take a step like this, it is another step towards being the open and transparent tech company that we wish more were like.

Happy coding!

Top comments (21)

rhymes • Feb 7 '19

This is so cool! It helps to understand which parts could be prioritized as contributions 🚀🚀🚀

Ben Halpern • Feb 7 '19

It's not a hair-on-fire problem, but memory allocation is an area that can always be improved in our type of application.

Here's an example of a type of query that happens all over the app:

Article.where(published: true).
        includes(:user).
        select(:published_at, :processed_html, :user_id, :organization_id, :title, :path).
        order("published_at DESC").
        page(@page).per(12)

I believe includes(:user) is a huge memory hog because all the user associations get loaded into memory in their entirety when we typically only need the username, profile pic, etc. I think this could be overcome with a different type of query but I haven't really had the time to look into it (and for whatever reason I've never been good at wrapping my head around query optimization or DB queries in general). I've also had lingering doubts about whether I understand this well enough to bother with this optimization.

So I've spent most of my time elsewhere and thus far nobody else has swooped in to tackle this specific issue, but I'm pretty sure it could go a long way in terms of our memory allocation, which then will open us up to further parallelize our processes and generally be more performant.

We also have some lingering n+1 issues and other types of repeated queries. Within 2019 we'd love to get to a place where we can let other folks run the application as their own community app and a part of that is improving efficiency so people can run the app more cheaply and reliably.

rhymes • Feb 7 '19

I believe includes(:user) is a huge memory hog because all the user associations get loaded into memory in their entirety when we typically only need the username, profile pic, etc.

I believe you, it's probably one of those cases where the DB behaves perfectly fine until you have lots and lots of data. It would be nice if we could create a script, separate from seeds.rb, that creates a DB with a size comparable to the real one. This way one could measure performance locally and see the query plan to understand what's going on. The DB can change how it executes a query depending on the estimated cost.

Another thing that could help is using the latest versions of PostgreSQL. PostgreSQL 10 and 11 execute more queries and joins in parallel, which could help a bit. But I would start by "explaining" the queries if that's the bottleneck (added to what you hinted at as Rails taking too much memory transforming rows into objects). Upping PostgreSQL's version, unless there's a specific reason to do it, won't make a huge difference

You can run explain like this in the console:

query.explain

This is a cool guide to interpret the info explain outputs: Reading a Postgres EXPLAIN ANALYZE Query Plan

We also have some lingering n+1 issues and other types of repeated queries.

Shouldn't n+1 be notified by the gem bullet? Is it because they are triggered only in production?

Anyhow, I'm optimistic about the fact that devto can be optimized!

Within 2019 we'd love to get to a place where we can let other folks run the application as their own community app and a part of that is improving efficiency so people can run the app more cheaply and reliably.

Yes, having an idea of how the app scales as the user base grows could be helpful for other communities.

Evaldas Buinauskas • Feb 7 '19 • Edited

I've got no Rails experience, but I've done quite a lot of database work. Am I looking at right place for article indexes?

github.com/thepracticaldev/dev.to/...

I'm not really seeing an index that would help this query which probably translates into this?:

SELECT a.published_at, a.processed_html, a.organization_id, a.title, a.path, u.*
FROM articles AS a
INNER JOIN users AS u
    ON u.user_id = a.user_id
WHERE a.published = true
ORDER BY a.published_at DESC
LIMIT 12 OFFSET @page * 12;

It is usually a good idea to follow POCO (Partition, Order, COlumns) indexing convention. So in this case I'd probably go for code that would generate me such as:

CREATE INDEX a_very_fancy_index ON articles (published, published_at DESC, user_id, processed_html, organization_id, title, path)
WHERE published = true

This might be not exact syntax. But if I managed to translate ruby code well into what SQL query might look like, this could give you nice query plans.

oss.skylight.io/app/applications/K...

This dashboard has a sql query that probably lists columns from article table twice. It has "articles".* (all columns from articles table) and then lists more columns from the very same table.

SELECT "articles".*,
       "articles"."path",
       "articles"."title",
       "articles"."id",
       "articles"."comments_count",
       "articles"."positive_reactions_count",
       "articles"."cached_tag_list",
       "articles"."main_image",
       "articles"."main_image_background_hex_color",
       "articles"."updated_at",
       "articles"."slug",
       "articles"."video",
       "articles"."user_id",
       "articles"."organization_id",
       "articles"."video_source_url",
       "articles"."video_code",
       "articles"."video_thumbnail_url",
       "articles"."video_closed_caption_track_url",
       "articles"."published_at",
       "articles"."crossposted_at",
       "articles"."boost_states",
       "articles"."description",
       "articles"."reading_time"
FROM "articles"
WHERE (EXISTS
         (SELECT *
          FROM "taggings"
          WHERE "taggings"."taggable_id" = "articles"."id"
            AND "taggings"."taggable_type" = ?
            AND "taggings"."tag_id" IN
              (SELECT "tags"."id"
               FROM "tags"
               WHERE (LOWER("tags"."name") ILIKE ? ESCAPE ?
                      OR LOWER("tags"."name") ILIKE ? ESCAPE ?
                      OR LOWER("tags"."name") ILIKE ? ESCAPE ?
                      OR LOWER("tags"."name") ILIKE ? ESCAPE ?))))
  AND (comments_count > ?)
  AND "articles"."published" = ?
  AND ("articles"."id" != ?)
  AND ("articles"."user_id" != ?)
  AND (featured_number > ?)
ORDER BY RANDOM()
LIMIT ?

Also this uses lower function. Usually if SQL uses functions, these can result in table scans. Unless you store normalized tag version, index function expression or your database is not case sensitive.

Also, dashboard says that the following query occurs in 100% cases:

SELECT  "articles".*
FROM "articles"
WHERE "articles"."path" = ?
LIMIT ?

I'm not sure if it has to select everything, but it also seems that there's no index on path column, hence - this query should scan the whole table

rhymes • Feb 8 '19 • Edited

Hi Evaldas, great analysis, in reality though this query:

Article.
  includes(:user).
  where(published: true).
  select(:published_at, :processed_html, :user_id, :organization_id, :title, :path).
  order("published_at DESC").
  page(2).per(12)

translates to the following two queries:

SELECT  "articles"."published_at", "articles"."processed_html", 
  "articles"."user_id", "articles"."organization_id", "articles"."title", 
  "articles"."path" 
FROM "articles" 
WHERE "articles"."published" = 't'
ORDER BY published_at DESC LIMIT 12 OFFSET 12;

followed by a SELECT * for the users with those ids:

SELECT "users".* FROM "users" 
WHERE "users"."id" IN (4, 9, 1, 3, 2, 10, 6, 7)

(I'm running it on my local copy)

If we examine the query plan for the first one, we can see it's pretty basic:

                              QUERY PLAN
-----------------------------------------------------------------------
 Limit  (cost=8.02..8.05 rows=12 width=485)
   ->  Sort  (cost=7.99..8.07 rows=29 width=485)
         Sort Key: published_at DESC
         ->  Seq Scan on articles  (cost=0.00..7.29 rows=29 width=485)
               Filter: published

Running EXPLAIN ANALYZE on the query itself doesn't yield much more info, because on a small subset is not a costly operation:

PracticalDeveloper_development> explain analyze SELECT  "articles"."published_at", "articles"."processed_html",
   "articles"."user_id", "articles"."organization_id", "articles"."title",
   "articles"."path"
 FROM "articles"
 WHERE "articles"."published" = 't'
 ORDER BY published_at DESC LIMIT 12 OFFSET 12;
+------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                       |
|------------------------------------------------------------------------------------------------------------------|
| Limit  (cost=8.02..8.05 rows=12 width=485) (actual time=0.096..0.100 rows=12 loops=1)                            |
|   ->  Sort  (cost=7.99..8.07 rows=29 width=485) (actual time=0.094..0.096 rows=24 loops=1)                       |
|         Sort Key: published_at DESC                                                                              |
|         Sort Method: quicksort  Memory: 44kB                                                                     |
|         ->  Seq Scan on articles  (cost=0.00..7.29 rows=29 width=485) (actual time=0.017..0.059 rows=25 loops=1) |
|               Filter: published                                                                                  |
| Planning Time: 0.232 ms                                                                                          |
| Execution Time: 0.127 ms                                                                                         |
+------------------------------------------------------------------------------------------------------------------+

By examining the indexes on the table articles we can see that, although there's an index on published_at which helps the query, there is no index on the other important filtering field: published. Published articles are hit every single time they are queried from the DB (unpublished articles are visible only to their creator, which already is looking into a smallest sample of all articles, instead every public query filters on published = true).

My suggestion is to run EXPLAIN ANALYZE on the query, create an index, re-run the EXPLAIN ANALYZE afterwards and see what happens. The estimated cost should lower.

Not being a join, there's no foreign keys involved. The other query ActiveRecord does involves only the primary key so there's not much that can be done on that, for now.

This dashboard has a sql query that probably lists columns from article table twice. It has "articles".* (all columns from articles table) and then lists more columns from the very same table.

This is the part I'm not fully understanding. How is it that the same table gets hit so many times in a single transaction? Is it because Skylight aggregates different paths/URLs in one single page? Otherwise it would be worrysome if a single page were to go back to the articles and users table so many times:

Also this uses lower function. Usually if SQL uses functions, these can result in table scans. Unless you store normalized tag version, index function expression or your database is not case sensitive.

Yeah, functions kill performance a little :D I suspect that is autogenerated by a library devto is using, it might require an investigation. There's an index on the tags table but it's not an index on lower which as you say will result in a table scan instead :(

I'm not sure if it has to select everything, but it also seems that there's no index on path column, hence - this query should scan the whole table

Yep, there's no index on articles.path either.

I'll recap:

articles most probably needs an index on published
tags most probably needs an index on lower(name)
articles definitely needs an index on path

All of these indexes can be tested on a copy of the production DB before.

These should help query performance, before diving in how to select less data or caching stuff or other tricks.

Pinging @ben here :)

Evaldas Buinauskas • Feb 8 '19 • Edited

Thank you! How would first query plan change if you were to create an index I've suggested? I've got difficult time reading explain as I'm used to sql server plans.

But I see there is a scan and two sort operations.

With suggested index I wanted to eliminate any kind of sort operations as they're quite heavy on any relational database.

Ben Halpern • Feb 8 '19

This is the part I'm not fully understanding. How is it that the same table gets hit so many times in a single transaction?

I believe you are looking at aggregates here, so these are all different types of queries based on different requests. This is one of the nuances of Skylight that are less intuitive IMO. A single transaction doesn't hit all of those (unless they all show 100% hits). @vaidehijoshi correct me if I'm wrong.

articles most probably needs an index on published

Are indexes on boolean columns that beneficial? I'm all for adding it if they are.

tags most probably needs an index on lower(name)

I think there is one already or am I missing something?

articles definitely needs an index on path

Yes, definitely.

All in all I still want to bring up the question of performance vs memory. My thought was that the includes(:user) is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.

Evaldas Buinauskas • Feb 8 '19

Queries that load users are perfectly fine, they use primary key to do so. Like you mentioned, I would just limit columns that it selects.

Now indexing a boolean column alone does not make much sense, but my intention was to create a filtered index (where published is true), the second column being published at with descending order meaning what within published articles, all of them will be already sorted in disk.

I have included rest of the columns that are in select clause so that database does not have to do extra lookups from the index to the actual table to get remaining columns.

Having such an index means it 100% satisfies published articles query.

rhymes • Feb 8 '19

Are indexes on boolean columns that beneficial? I'm all for adding it if they are.

The amount of "that" unfortunately needs to be measured. I don't think a full index would make a difference, but a partial one might (if published = false is significantly less used).

Something like:

create index articles_published on articles (published) where published;

I'm trying to generate a massive articles table to see if I can have PostgreSQL activate the index.

I think there is one already or am I missing something?

There is one, but it's on "name", not on lower(name):

PracticalDeveloper_development> \d+ tags;
...
    "index_tags_on_name" UNIQUE, btree (name)

with something like this:

create index tags_lower_name on tags(lower(name));

in reality this index might be ignored, I don't think the tags table is that big, PostgreSQL might do a sequential scan if it deems it faster

All in all I still want to bring up the question of performance vs memory. My thought was that the includes(:user) is what is eating up a good deal of memory but is not necessarily killing the query. Its purpose here is to eager load user associations but in doing so without selecting columns, it loads the whole objects into memory, which can be costly.

Have you tried the performance vs memory difference by forcing eager loading in the includes?

With:

Article.eager_load(:user).where(published: true).select(:published_at, :processed_html, :user_id, :organization_id, :title, :path).order("published_at DESC").page(2).per(12)

Ben Halpern • Feb 7 '19

A few more specific things to note:

The StoriesController#index and StoriesController#show endpoints are sort of re-used for multiple types of pages because we use top level routing in the controller, so dev.to/:user and dev.to/:organization for example, share a controller action. I'm not sure what the right way to do this but it makes the Skylight dashboards a bit more complicated.

Any PRs to effectively refactor this stuff is welcome. @lightalloy you did some work here, but we never really touched on this elements. Feel free to give it some thought.

There's plenty we can do to modify and tune performance outside of the context of Skylight, so anyone outside the absolute core team will not get the whole picture, but we will be trying to offer more dashboards in the future, to really open things up. But we will be doing so conservatively, with a very strong bias towards principles of least privilege until we feel like it's better to have certain things totally in the open.

Anna Buianova • Feb 8 '19 • Edited

Ben, thanks for mentioning me.
I still think that the StoriesController has too many responsibilities, and they should be separated for better maintainability and profiling.
I've worked on separating views to start with, but that's only a small part of the task.
I would love to continue work on the refactoring, but I suppose that's not the top priority right now. I'll return to this issue when it's possible, or I can help if there are other contributors who want to work on it.
I suppose it's possible to separate those routes using route constraints.

Pritesh Usadadiya • Feb 8 '19 • Edited

I saw StoriesController#show in drilled down , it seems

SELECT FROM articles
This query is getting hit multiple times (4 times, 2 with multiple parameters and 2 times with same parameters) in single request. (as you said, this might be due to using same endpoints for multiple routing)

Still average response time is better than Comment query with 2 calls and taking 171 ms average. (Definitely area for improvement compared to above)

Overall very interesting insights. It will be interesting to see details on what benchmarks (SLA) you would create to further enhance the app

Vaidehi Joshi • Feb 7 '19

This is so wonderful! I'm really excited to see what contributors will focus on and fix using Skylight :)

deleteme deleteme • Feb 7 '19 • Edited

I've used Skylight in the past for rails projects. How are you open sourcing the data? Can you just open a Skylight application to the public? (Like a public Trello board?)

I wasn't aware this was a feature.

Vaidehi Joshi • Feb 7 '19

Hi Sergio,

I work at Skylight so I can jump in here :)

We launched the Skylight for Open Source Program last year, which grants free Skylight accounts to open source apps, and provides them with a public dashboard. You can read more about it here, if you're interested!

deleteme deleteme • Feb 7 '19

That sounds terrific!

Skylight is so easy to use and makes your efforts LASER-POINTED to slowness in your rails apps.

Vaidehi Joshi • Feb 7 '19

Yay! I'm glad you've had a good experience using Skylight ☺️

Franco Correa • Apr 27 '20

Did this integration stopeed working?

Ben Halpern • Apr 27 '20

Yeah, sorry we moved away from this and are currently in the process of getting our replacement open-sourced. We only switched off of skylight because we needed to aggregate some of our monitoring. We’ll be back with our open metrics in not too long.