In this article, you will learn how to test database performance in Rails and solve some of the most common database performance issues.
When you develop a Rails application, ActiveRecord is the default tool that manages your database. ActiveRecord provides an easy and fast interface for querying and inserting data using commands like .where, .save, .create, and .update. Rails does the work of converting these commands to SQL queries for you, which is convenient, but it can sometimes cause performance issues. It is important to understand the most common issues and how to optimize for them.
A Quick Note on ActiveRecord in Ruby on Rails
ActiveRecord is the model layer in Rails' Model-View-Controller (MVC) architecture: it manages your database and represents your data as business objects. The Active Record pattern uses object-relational mapping (ORM) to connect an application's objects to tables in a relational database management system.
Now, let's get going!
3 Ways to Identify and Test Database Performance Issues in Rails
1. Run Explain on ActiveRecord Queries
An explain statement displays the execution plan of an SQL query: how the query will be executed, including how many rows will be scanned, which index will be used, and how tables are joined.
The execution plan helps you figure out what slows down query execution by showing:
- Which index you should add to improve the performance of a query.
- Whether tables are joined in an optimal order. You can use STRAIGHT_JOIN to force the table order in a join statement for better performance.
Explain works on SELECT, DELETE, INSERT, REPLACE, and UPDATE statements.
Using explain with ActiveRecord is very straightforward. Enter the following:
2.4.2 :004 > User.where(id: 1).explain
=> EXPLAIN for: SELECT "users".* FROM "users" WHERE "users"."id" = $1 [["id", 1]]
QUERY PLAN
--------------------------------------------------------------------------
Index Scan using users_pkey on users (cost=0.14..8.16 rows=1 width=792)
Index Cond: (id = 1)
(2 rows)
2.4.2 :005 >
Adding .explain at the end of the command provides the query plan for the ActiveRecord command. The above example is a very simple query that uses the id (primary key) to look up the record. The output of the explain statement shows that it uses the primary key index (users_pkey). From this, you can be sure that the statement is optimal and fast.
You can add .explain to your slow queries to uncover the order of execution, along with the indexes used. If the query plan shows Seq Scan, no index is being used, and the query needs to be changed or a new index added.
Here's another example with a join:
2.4.2 :062 > User.where(id: 1).joins(:collaborations).explain
=> EXPLAIN for: SELECT "users".* FROM "users" INNER JOIN "collaborations" ON "collaborations"."user_id" = "users"."uid" WHERE "users"."id" = $1 [["id", 1]]
QUERY PLAN
--------------------------------------------------------------------------------------
Hash Join (cost=8.17..27.31 rows=4 width=792)
Hash Cond: ((collaborations.user_id)::text = (users.uid)::text)
-> Seq Scan on collaborations (cost=0.00..17.20 rows=720 width=32)
-> Hash (cost=8.16..8.16 rows=1 width=792)
-> Index Scan using users_pkey on users (cost=0.14..8.16 rows=1 width=792)
Index Cond: (id = 1)
(6 rows)
Here, the users table is joined with the collaborations table. Looking at the query plan, the collaborations table uses a Seq Scan and is executed first, whereas the users table uses the users_pkey index and is executed later. You can add an index on the user_id column in the collaborations table to optimize this query. Explain helps in breaking down the query, so you can figure out where optimization is needed.
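For example, a migration along the following lines (the migration class name is just an assumed example) would add that index:
class AddIndexOnUserIdToCollaborations < ActiveRecord::Migration[6.0]
  def change
    # Index the foreign key used in the join so the planner can avoid a Seq Scan
    add_index :collaborations, :user_id
  end
end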
2. Measure Key Database Metrics
Query time is not the only metric that tells you whether a query is performant. Look at several other database metrics as well, including:
- CPU usage
- Memory usage
- Disk queue for waiting IO
- Network bandwidth for inbound and outbound traffic
- Available disk space
- Throughput
Queries may slow down when these metrics cross a certain threshold. To understand performance issues, you need to look at these data points across a time range, not in isolation.
Which data points you need to measure depends on several factors, like:
- The database type:
  - Relational
  - In-memory
  - NoSQL
  - Data warehouse
- How the server is hosted:
  - On-premises
  - In the cloud
There is no single way to monitor database metrics; it depends on these factors.
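If you want a quick, application-side starting point before reaching for a full monitoring stack, you can subscribe to ActiveRecord's instrumentation events and log slow queries. The following is a minimal sketch; the 100 ms threshold and the initializer location are arbitrary choices:
# config/initializers/slow_query_logger.rb
ActiveSupport::Notifications.subscribe("sql.active_record") do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  # Log any query slower than 100 ms (threshold is an arbitrary example value)
  if event.duration > 100
    Rails.logger.warn("Slow query (#{event.duration.round(1)} ms): #{event.payload[:sql]}")
  end
end
This only captures query time, not CPU, memory, or disk metrics, but it is a cheap first step.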
3. Measure Rails App Performance using AppSignal
It can get difficult to manage all your performance metrics without a central place that gives you visibility over all queries. Adding performance code to every code block can be cumbersome and unmanageable.
With tools like AppSignal, you can easily integrate performance measurement into your application. AppSignal supports Rails out of the box. Learn about the simple AppSignal installation process from the 'AppSignal for Ruby' documentation.
Some of the critical metrics to keep an eye on are:
- Slow queries
- Database performance based on throughput
- N+1 queries
- Database latency
- Number of active connections
Here's how an AppSignal dashboard might look:
7 Ways to Optimize Ruby on Rails Database Performance
1. Eager Loading for N+1 Queries
N+1 queries are the most common database performance problem. Let's look at an example of an N+1 query where you have two models, User and Project:
class User < ActiveRecord::Base
  has_many :projects
end

class Project < ActiveRecord::Base
  belongs_to :user
end
Now, if you want to print each user's name along with their project names, you might run the following code:
users = User.where(country: "Germany")
users.each do |user|
  puts "#{user.name} | #{user.projects.map(&:name).join(', ')}"
end
The code above queries the database once per iteration of the loop and causes performance issues. The total number of queries executed will be the number of users + 1 (one query to load the users, plus one per user to load their projects).
Eliminating this issue is very simple: eager load the association. Simply add .includes(:projects) to the query:
users = User.where(country: "Germany").includes(:projects)
Now, executing the loop will not trigger any additional queries, as the query above eager loads the projects:
users.each do |user|
  puts "#{user.name} | #{user.projects.map(&:name).join(', ')}"
end
Rails 6.1 provides strict loading to ensure that the association is eager loaded before it is accessed. To enable strict loading, add the following line in the model:
class User < ApplicationRecord
  has_many :projects, strict_loading: true
end
Now, when you try to access the projects without eager loading them, Rails will raise an ActiveRecord::StrictLoadingViolationError exception.
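Strict loading can also be turned on per query instead of on the association, which is handy while auditing existing code:
# Raises ActiveRecord::StrictLoadingViolationError if any association
# on these records is lazy loaded later
users = User.strict_loading.where(country: "Germany")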
If you're not on Rails 6.1 yet, you can use gems like Bullet to detect N+1 queries.
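As a rough sketch (assuming Bullet sits in your Gemfile's development group), the gem is typically enabled in the development environment like this:
# config/environments/development.rb
config.after_initialize do
  Bullet.enable        = true
  Bullet.bullet_logger = true # writes detections to log/bullet.log
  Bullet.rails_logger  = true # also logs warnings to the Rails logger
end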
2. Use a Database Index
Databases provide indexes to help retrieve data faster. Using the explain command that we covered earlier, you will be able to figure out if a query is using a proper index.
You can change a slow query to use an already existing index, or add a new index to improve its performance.
There are four different types of indexes in MySQL:
- Primary key - Index is automatically added to the primary key, which also ensures that it is unique
- Unique - Unique key index ensures that the items added in an attribute are always unique
- Index - Added to attributes other than the primary key
- Full text - Helps to query against character-based data
An index is stored in a B-Tree or a Hash format.
Indexes can be added to a single field or created as a composite of multiple fields. A composite index is useful for optimizing queries that filter on multiple fields; when only one of those fields is indexed, the database may still have to scan a large intermediate result set.
For example, in the following query, there are two fields:
User.where(project: "abc", country: "Germany")
There could be a lot of users in project "abc", and with an index on project alone, all of those rows have to be scanned to apply the country condition. This can be slow because of the large intermediate dataset. In this case, you can add a composite index on both the project and country fields to improve performance.
You can add the index to Rails with the following ActiveRecord migration commands:
Single index:
class AddIndexOnProjectToUsers < ActiveRecord::Migration[6.0]
  def change
    add_index :users, :project
  end
end
Composite index:
class AddIndexOnProjectAndCountryToUsers < ActiveRecord::Migration[6.0]
  def change
    add_index :users, [:project, :country]
  end
end
3. Use Limits
The more records a query returns, the slower it can get. It is often better to run multiple smaller queries than a single query that returns a huge data set:
User.where(country: "Germany").limit(100)
To fetch the next 100 batches, you can use the offset:
User.where(country: "Germany").limit(100).offset(100)
This will significantly improve performance. One thing to keep in mind is that the higher the offset, the slower the query gets, because the database still has to step over all the skipped rows, so keep offsets within reasonable limits.
4. Use find_each to Load a Large Number of Items
When iterating over a large number of records, avoid loading them all at once. The following loop loads every matching record into memory with a single query:
User.where(country: "Germany").each do |user|
puts user
end
This queries all the matching records from the database at once and can cause memory and database performance issues.
Using find_each or find_in_batches will help improve performance by doing the same operation in batches:
User.where(country: "Germany").find_each do |user|
puts user
end
By default, find_each queries records in batches of 1,000. You can change the batch size by passing it as an argument:
User.where(country: "Germany").find_each(batch_size: 5000)
You can also use find_in_batches, depending on the operation you need to perform. The difference between find_in_batches and find_each is that find_in_batches yields the result as an array of models instead of individual records.
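For example, the following sketch processes users in arrays of up to 500 records at a time:
User.where(country: "Germany").find_in_batches(batch_size: 500) do |batch|
  # batch is an Array of up to 500 User records
  puts batch.map(&:name)
end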
5. Select Your Required Field Using Pluck
The pluck method converts a query's result directly into an array instead of instantiating ActiveRecord objects.
If a query returns a large result set, using pluck will improve performance, because it selects only the required fields from the database:
User.pluck(:id)
# SELECT "users"."id" FROM "users"
When the selected column is covered by an index (like id here), the database can often serve the query from the index without touching the main table, which also helps queries that involve sorting.
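pluck also accepts multiple columns and returns an array of arrays (the SQL and result values below are illustrative):
User.where(country: "Germany").pluck(:id, :email)
# SELECT "users"."id", "users"."email" FROM "users" WHERE "users"."country" = 'Germany'
# => [[1, "milap@country.com"], [2, "aastha@country.com"]]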
6. Use Bulk Operations
Bulk Delete
A delete operation that loops over ActiveRecord objects deletes the records one at a time:
users = User.where(country: "Germany")
users.each do |user|
  user.delete
end
# >
# DELETE FROM users WHERE id = 1;
# DELETE FROM users WHERE id = 5;
Deleting each record separately requires many round trips to the database. Instead, it's optimal to use a single bulk delete_all query. Keep in mind that delete_all issues one SQL DELETE and skips ActiveRecord callbacks; if you need callbacks to run, use destroy_all, which is slower:
users = User.where(country: "Germany")
users.delete_all
# >
# DELETE FROM users WHERE users.country = 'Germany';
Bulk Create
Some people don't realize that, similar to bulk delete, you can also batch record creation with ActiveRecord. The ActiveRecord::Base create method accepts an array of hashes as input, letting you build all the records in one call, although it still issues one INSERT per record:
users = [
  {name: "Milap", email: "milap@country.com", country: "Germany"},
  {name: "Aastha", email: "aastha@country.com", country: "Germany"}
]
User.create(users)
# INSERT INTO users (name, email, country) VALUES ('Milap', 'milap@country.com', 'Germany')
# INSERT INTO users (name, email, country) VALUES ('Aastha', 'aastha@country.com', 'Germany')
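If you are on Rails 6 or newer, insert_all takes the same array of hashes and compiles it into a single multi-row INSERT; note that, unlike create, it skips validations and callbacks (the SQL shown is simplified):
User.insert_all(users)
# INSERT INTO users (name, email, country)
# VALUES
# ('Milap', 'milap@country.com', 'Germany'),
# ('Aastha', 'aastha@country.com', 'Germany')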
7. Use In-Memory Calculation if Needed
In some instances, an in-memory calculation is preferable to querying. Suppose we want to find which countries from a given list have no users in our database:
countries = [
  "Germany",
  "UK",
  "Norway",
  "Netherlands"
]
countries.each do |country|
  unless User.where(country: country).exists?
    puts country
  end
end
# SELECT 1 AS one FROM `users` WHERE `users`.`country` = 'Germany' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE `users`.`country` = 'UK' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE `users`.`country` = 'Norway' LIMIT 1
# SELECT 1 AS one FROM `users` WHERE `users`.`country` = 'Netherlands' LIMIT 1
The code above required one query per country to get the result. Instead, we can write a single query to find which countries have users and do the rest of the calculation in memory:
existing_countries = User.distinct.pluck(:country)
puts countries - existing_countries
# SELECT DISTINCT `users`.`country` FROM `users`
In some cases, you can also cache results to avoid repeating work across the request-response cycle and reduce database load. Rails provides three types of caching techniques: page, action, and fragment caching (fragment caching is available by default).
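As a small low-level caching sketch (the cache key and expiry below are arbitrary choices), you could keep the distinct-countries result from the previous example out of the hot path:
# Reuse the cached list for 10 minutes instead of querying on every request
existing_countries = Rails.cache.fetch("users/existing_countries", expires_in: 10.minutes) do
  User.distinct.pluck(:country)
end
puts countries - existing_countries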
Wrap Up: Optimize Your Ruby on Rails Performance with ActiveRecord and AppSignal
Okay, time to recap! In this post, we covered three ways to identify and test database performance issues in Rails by:
- Running Explain on ActiveRecord queries
- Measuring key database metrics
- Measuring Rails app performance using AppSignal
And seven ways to optimize your database performance, including the use of:
- Eager loading for N+1 queries
- A database index
- Limits
- find_each to load a large number of items
- Pluck to select a required field
- Bulk operations
- In-memory calculation
Rails makes it super easy and fast to develop an application. ActiveRecord helps with the productivity, reusability, and maintainability of database code. Understanding how ActiveRecord queries translate to SQL queries and execute is important.
However, the most crucial thing you need to optimize your Rails database performance is visibility over performance data. Performance issues are common, but you can resolve them once you have this visibility. You need a proper monitoring tool that provides database metrics. We like AppSignal ;)
P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!