DEV Community

Cover image for Test and Optimize Your Ruby on Rails Database Performance
Milap Neupane for AppSignal

Posted on • Originally published at blog.appsignal.com

Test and Optimize Your Ruby on Rails Database Performance

In this article, you will learn how to test database performance in Rails and solve some of the most common database performance issues.

When you develop a Rails application, ActiveRecord is the default tool that manages your database. ActiveRecord provides an easy and fast interface to query and insert data using commands like .where, .save, .create, and .update. Rails does the work of converting these commands to SQL queries, which is a good thing, but sometimes can cause performance issues. It is important that you understand some of the common issues and how to optimize performance.

A Quick Note on ActiveRecord in Ruby on Rails

Rails ActiveRecord is a layer in Model-View-Controller (MVC) that manages your database by representing it as a business object. The ActiveRecord pattern uses the ORM technique to connect objects of an application to the relational database table management system.

Now, let's get going!

3 Ways to Identify and Test Database Performance Issues in Rails

1. Run Explain on ActiveRecord Queries

An explain statement displays information about the execution plan of an SQL query - how a query will be executed, including how many rows will be scanned, what index will be used, and how tables are joined.

The execution plan helps us figure out what slows down query execution by looking at:

  • What index you should add to improve the performance of a query.
  • If tables are joined in an optimal order. You can use STRAIGHT_JOIN to force table order in a join statement for better performance.

Explain works on SELECT, DELETE, INSERT, REPLACE and UPDATE statements as well.

Using explain with ActiveRecord is very straightforward. Enter the following:

2.4.2 :004 > User.where(id: 1).explain

 => EXPLAIN for: SELECT "users".* FROM "users" WHERE "users"."id" = $1 [["id", 1]]
                                QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using users_pkey on users  (cost=0.14..8.16 rows=1 width=792)
   Index Cond: (id = 1)
(2 rows)

2.4.2 :005 >
Enter fullscreen mode Exit fullscreen mode

Adding .explain at the end of the command provides the query plan for the ActiveRecord commands. The above example is a very simple query that uses the id (primary key) to query the table. The output of the explain statement shows that it is using the pkey. From this, you can be sure that the statement is optimal and fast.

You can try adding an .explain command to your slow queries to uncover the order of execution, along with the index used. If the query plan shows Seq Scan, the index is not being used and requires a query change or a new index to be added.

Here's another example with join:

2.4.2 :062 > User.where(id: 1).joins(:collaborations).explain

 => EXPLAIN for: SELECT "users".* FROM "users" INNER JOIN "collaborations" ON "collaborations"."user_id" = "users"."uid" WHERE "users"."id" = $1 [["id", 1]]
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Hash Join  (cost=8.17..27.31 rows=4 width=792)
   Hash Cond: ((collaborations.user_id)::text = (users.uid)::text)
   ->  Seq Scan on collaborations  (cost=0.00..17.20 rows=720 width=32)
   ->  Hash  (cost=8.16..8.16 rows=1 width=792)
         ->  Index Scan using users_pkey on users  (cost=0.14..8.16 rows=1 width=792)
               Index Cond: (id = 1)
(6 rows)
Enter fullscreen mode Exit fullscreen mode

Here, the user table is joined with the collaborations table. When you look into the query plan, the collaboration table is using seq scan and is executing first — whereas the user table is using the pkey index and is executing later. You can add an index on the user_id column in the collaborations table to optimize this query. Explain helps in breaking down the query, so you can figure out where optimization is needed.

2. Measure Key Database Metrics

Query time is not the only metric to measure to see if a query is performant — look at several other database metrics, including:

  1. CPU usage
  2. Memory usage
  3. Disk queue for waiting IO
  4. Network bandwidth for inbound and outbound traffic
  5. Available disk space
  6. Throughput

The query may slow down when these metrics go over a certain threshold. You must look at a data point across a time range to understand performance issues.

The data point that needs to be measured depends on lots of factors, like:

  • The database type:

    • Relational
    • In-memory
    • No-SQL
    • Data-Warehouse
  • How the server is hosted:

    • On-premises
    • On the cloud

There is no single way to monitor database metrics — it depends on different factors.

3. Measure Rails App Performance using AppSignal

It can get difficult to manage all your performance metrics without a central place that gives you visibility over all queries. Adding performance code to every code block can be cumbersome and unmanageable.

With tools like AppSignal, you can easily integrate performance measurement into your application. AppSignal supports Rails out of the box. Learn about the simple AppSignal installation process from the 'AppSignal for Ruby' documentation.

Some of the critical metrics to keep an eye on are:

  • Slow queries
  • Database performance based on throughput
  • N+1 queries
  • Database latency
  • Number of active connections

Here's how an AppSignal dashboard might look:

demo

7 Ways to Optimize Ruby on Rails Database Performance

1. Eager Loading for N+1 Queries

N+1 queries are the most common database performance problem. Let us see an example of an N+1 query where you have two models — user and project:

class User < ActiveRecord::Base
  has_many :projects
end

class Project < ActiveRecord::Base
  belongs_to :user
end
Enter fullscreen mode Exit fullscreen mode

Now, if you want to find the user and project names, run the following code:

users = User.where(country: "Germany")

users.each do |user|
  puts "#{user.name} | #{user.project.name}"
end
Enter fullscreen mode Exit fullscreen mode

The code above will query the database with each loop and cause performance issues. The total number of queries executed will be the number of users + 1.

Eliminating this issue is very simple: eager load the association. Simply add .includes(:projects) at the end of the query:

users = User.where(country: "Germany").includes(:projects)
Enter fullscreen mode Exit fullscreen mode

Now, executing the loop will not query the database, as the query above eager loads the projects:

users.each do |user|
  puts "#{user.name} | #{user.project.name}"
end
Enter fullscreen mode Exit fullscreen mode

Rails 6.1 provides strict loading to ensure that the association is eager loaded before it is accessed. To enable strict loading, add the following line in the model:

class User < ApplicationRecord
  has_many :projects, strict_loading: true
end
Enter fullscreen mode Exit fullscreen mode

Now, when you try to access the projects without eager loading, Rails will throw an ActiveRecord::StrictLoadingViolationError exception.

If you don't have Rails 6.1, you can use gems like Bullet.

2. Use a Database Index

Databases provide indexes to help retrieve data faster. Using the explain command that we covered earlier, you will be able to figure out if a query is using a proper index.

You can change a slow query to use an already existing index or an added index that helps improve performance.

There are four different types of indexes in MySQL:

  1. Primary key - Index is automatically added to the primary key, which also ensures that it is unique
  2. Unique - Unique key index ensures that the items added in an attribute are always unique
  3. Index - Added to attributes other than the primary key
  4. Full text - Helps to query against character-based data

An index is stored in a B-Tree or a Hash format.

Indexes can be added to a single field or created as a composition of multiple fields. A composite index is useful to optimize queries that include multiple fields. When only one index is used, it requires a large dataset scan.

For example, in the following query, there are two fields:

User.where(project: "abc", country: "Germany")
Enter fullscreen mode Exit fullscreen mode

There could be a lot of users in the project "abc" and the country field will be scanned. This can be a slow process because of the large result dataset. In this case, you can add a composite index to both the project and country fields to improve performance.

You can add the index to Rails with the following ActiveRecord migration commands:

Single index:

class AddIndexOnProjectToUsers < ActiveRecord::Migration[6.0]
  def change
    add_index :users, :project
  end
end
Enter fullscreen mode Exit fullscreen mode

Composite index:

class AddIndexOnProjectAndCountryToUsers < ActiveRecord::Migration[6.0]
  def change
    add_index :users, [:project, :country]
  end
end
Enter fullscreen mode Exit fullscreen mode

3. Use Limits

The more records that are returned, the slower performance can get. It is better to do multiple queries than a single query that returns a large data set.

User.where(country: "Germany").limit(100)
Enter fullscreen mode Exit fullscreen mode

To fetch the next 100 batches, you can use the offset:

User.where(country: "Germany").limit(100).offset(100)
Enter fullscreen mode Exit fullscreen mode

This will significantly improve performance. One thing to keep in mind is that with a higher offset, a query gets slower. Add a limit to the offset.

4. Use find_each To Load a Large Number of Items

When iterating over records, batch them in Rails for better performance:

User.where(country: "Germany").each do |user|
  puts user
end
Enter fullscreen mode Exit fullscreen mode

This will query all the records in the database once and cause memory and database performance issues.

Using find_each or find_in_batches will help improve performance by doing the same operation in a batch:

User.where(country: "Germany").find_each do |user|
  puts user
end
Enter fullscreen mode Exit fullscreen mode

By default, find_each queries result in a batch of 1,000. You can change the batch size by defining it as an argument:

User.where(country: "Germany").find_each(:batch_size: 5000)
Enter fullscreen mode Exit fullscreen mode

You can also use find_in_batches based on the operation you need to perform. The difference between find_in_batches and find_each is that find_in_batches yields the result as an array of models instead of individual records.

5. Select Your Required Field Using Pluck

The Pluck command directly converts a query's result to an array instead of an ActiveRecord object.

If a query returns large results, using Pluck will improve code performance. Pluck will only select a required field from a database:

User.pluck(:id)
# SELECT "users"."id" FROM "users"
Enter fullscreen mode Exit fullscreen mode

The result is fetched from an index, not the main table, and is more effective with queries involving sorts.

6. Use Bulk Operations

Bulk Delete
A delete operation looping over the ActiveRecord object will delete records one at a time:

users = User.where(country: "Germany")

users.each do |user|
  user.delete
end

# >
# DELETE FROM users WHERE id = 1;
# DELETE FROM users WHERE id = 5;
Enter fullscreen mode Exit fullscreen mode

Deleting each record requires many queries to be made to the database. Instead, it's optimal to use a single bulk delete_all query:

users = User.where(country: "Germany")

users.delete_all

# >
# DELETE FROM users WHERE users.country = 'Germany';
Enter fullscreen mode Exit fullscreen mode

Bulk Create
Some people don't realize that similar to bulk delete, you can also perform a bulk insert with ActiveRecord. This can reduce the n number of queries to only one. The ActiveRecord::Base create method accepts an array of hashes as input:

users = [
  {name: "Milap", email: "milap@country.com", country: "Germany"},
  {name: "Aastha", email: "aastha@country.com", country: "Germany"}
]

User.create(users)

# INSERT INTO users (name, email)
# VALUES
#   ('Milap', 'milap@country.com', 'Germany'),
#   ('Aastha', 'aastha@country.com', 'Germany')
Enter fullscreen mode Exit fullscreen mode

7. Use In-Memory Calculation if Needed

In some instances, an in-memory calculation is preferable to querying. Suppose we want to find countries in our database that do not have a record of users:

countries = [
  "Germany",
  "UK",
  "Norway",
  "Netherlands"
]

countries.each do |country|
  unless User.where(country: country).exists?
    puts country
  end
end
# SELECT 1 AS one FROM `users` WHERE users`.`country` = 'Germany' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE users`.`country` = 'UK' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE users`.`country` = 'Norway' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE users`.`country` = 'Netherlands' LIMIT 1
Enter fullscreen mode Exit fullscreen mode

The above query required N queries to get the result. Instead of this, we can write a single query to find the users in the given countries and do the other calculation in memory:

existing_countries = User.distinct.pluck(:countries)
puts countries - existing_countries

# SELECT DISTINCT `users`.`countries` FROM `users`
Enter fullscreen mode Exit fullscreen mode

You can cache to reuse the request-response cycle and reduce database load in some cases. Rails provides three types of caching techniques: page, action, and fragment caching (fragment caching is offered by default).

Wrap Up: Optimize Your Ruby on Rails Performance with ActiveRecord and AppSignal

Okay, time to recap! In this post, we covered three ways to identify and test database performance issues in Rails by:

  • Running Explain on ActiveRecord queries
  • Measuring key database metrics
  • Measuring Rails app performance using AppSignal

And seven ways to optimize your database performance, including the use of:

  • Eager loading for N+1 queries
  • A database index
  • limits
  • find_each to load a large number of items
  • Pluck to select a required field
  • Bulk operations
  • In-memory calculation

Rails makes it super easy and fast to develop an application. ActiveRecord helps with the productivity, reusability, and maintainability of database code. Understanding how ActiveRecord queries translate to SQL queries and execute is important.

However, the most crucial thing you need to optimize your Rails database performance is visibility over performance data. Performance issues are common, but you can resolve them once you have this visibility. You need a proper monitoring tool that provides database metrics. We like AppSignal ;)

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!

Top comments (0)