Building web applications in Rails is a rewarding journey. However, as your audience grows and your platform gains traction, a new challenge appears on the horizon: the need for performance optimization. With more users accessing your application, its responsiveness and scalability become essential to maintaining a seamless user experience.
In this post, we'll delve into the world of Performance Optimization in Rails Applications, exploring the best practices, tools, and techniques to make your application fast and capable of handling increased user traffic with ease.
I work as a Rails performance consultant, helping businesses optimize their Rails applications and fixing everything from simple N+1 queries to complex memory issues. If you're looking to improve your app's performance, the strategies discussed in this post are a great starting point. Performance optimization isn't just about faster load times and smoother interactions; it's about enhancing user satisfaction, boosting search engine rankings, and, ultimately, solidifying your position in an ever-competitive digital landscape. This post guides Rails developers through performance optimization strategies, where performance means much more than just speed: it includes scalability, reliability, and efficiency. It focuses on avoiding problems through best practices and early-stage optimizations, steering clear of anything that would count as over-engineering or premature optimization; everything covered here is treated as best practice in the Rails community.
In my experience, performance optimization in Rails is usually a mix of three factors:
- Database optimization
- Memory optimization
- Use of the correct algorithm
Database optimization
The database is often the primary bottleneck behind lags in a Rails application. Optimizing it can greatly improve the speed and efficiency of your app. These are some of the common issues to consider.
N+1 Query problem
The issue
The N+1 query problem is a common performance issue that can significantly slow down Rails applications. It occurs when you pull in associated data for each record individually instead of retrieving it all at once. For instance, when fetching a list of posts and their comments, Rails might make one database call for the posts and then another for each post's comments.
The solution
Use eager loading techniques like includes, preload, and eager_load. These methods allow you to load all the associated records in one go, thus reducing the number of queries.
Let's consider an example where we have a User model and a Post model. Each User has many Posts.
Without eager loading, if we were to get all posts for all users, we might write something like:
users = User.all
users.each do |user|
  puts "User: #{user.name}"
  user.posts.each do |post|
    puts "Title: #{post.title}"
  end
end
This code has the N+1 query problem. The User.all query pulls all users, and then for each user, we're making another query to fetch their posts with user.posts. If we have 100 users, this results in 101 queries - one for fetching users and 100 for fetching the posts of each user.
To solve this N+1 query problem, we can use eager loading with the includes method:
users = User.includes(:posts)
users.each do |user|
  puts "User: #{user.name}"
  user.posts.each do |post|
    puts "Title: #{post.title}"
  end
end
Now, only two queries are executed, regardless of the number of users. One query fetches all users and another fetches all posts associated with those users. This is a significant improvement, especially as the number of users increases.
Database indexing
Database indexing is a powerful tool in a developer's arsenal that can greatly boost application performance when implemented correctly.
Think of a database as a massive library and the data as its collection of books. Without a system of organization, finding a specific book means going through the entire collection, which is not efficient. This is where an 'index' comes into play. In a library, the index is the catalogue system that lets you locate a book by its title, author, or subject. The same concept applies to databases.
In a database, an index allows the database server to find and retrieve specific rows much faster. However, creating an index requires additional disk space, and maintaining it can slow down data-writing operations. This is why it's crucial to use indexes deliberately, adding them only to columns that are searched or sorted frequently, or where you have evidence that indexing will boost performance significantly.
Ruby on Rails provides support for database indexing out of the box through ActiveRecord migrations. When you create a new migration to add an index, Rails will generate the SQL necessary to add the new index to your database.
To add an index, you can use the add_index method in your migration:
add_index :table_name, :column_name
Indexes are not only for individual columns. Rails also supports composite (multi-column) indexes. These can be useful for queries that filter or sort by multiple columns.
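As a sketch, suppose your application frequently fetches a user's orders sorted by date. A composite index covering both columns could be added in a migration like this (the table, column names, and Rails version are illustrative, not from the original post):

```ruby
# Hypothetical migration: index orders on user_id and created_at together.
# This helps queries such as:
#   Order.where(user_id: id).order(created_at: :desc)
class AddUserCreatedAtIndexToOrders < ActiveRecord::Migration[7.0]
  def change
    add_index :orders, [:user_id, :created_at]
  end
end
```

Note that column order matters in a composite index: the index above can also serve queries filtering on user_id alone, but not queries filtering only on created_at.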
Effective indexing requires a solid understanding of the data and the queries your application makes. An index might greatly speed up data retrieval for one type of query but might be useless or even detrimental for another.
Knowing the distribution and characteristics of the data in your table is essential. If a column has a high degree of uniqueness, it's usually a good candidate for indexing. Conversely, if a column has very low uniqueness, an index on this column may not be beneficial.
While indexes can speed up data retrieval, they come at a cost. Every time a write operation (INSERT, UPDATE, DELETE) occurs on a table, all indexes on that table must also be updated. Therefore, while adding more indexes can speed up read operations, it can also slow down write operations. You need to strike a balance based on your application's specific read-write patterns.
Rails gives us an option to explain our queries via the #explain method. See this guide for more information. You can use it to determine whether new indexes are needed and whether the database is actually using your existing indexes.
Common query optimizations
select and pluck
When you only need specific fields, you can use the select and pluck methods to increase performance.
The select method limits the retrieved data to specified columns:
User.select(:name, :email)
The pluck method retrieves the specified columns and converts them into an array of values, bypassing the creation of ActiveRecord objects. It must come at the end of your query chain, since it returns an array and no further chaining is possible:
User.pluck(:name, :email)
Batch processing
ActiveRecord’s all and where methods load all matching records into memory at once. While this is fine for small datasets, it can lead to high memory usage when dealing with large tables. The find_each and find_in_batches methods offer a way out by retrieving records in batches. Both methods below fetch only 1,000 records at a time:
User.find_each(batch_size: 1000) do |user|
  # Process individual user
  puts user.name
end
User.find_in_batches(batch_size: 1000) do |group|
  # Process a batch of users
  group.each do |user|
    puts user.name
  end
end
You can avoid callbacks if you know what you are doing
ActiveRecord callbacks can cause a significant performance hit, especially when processing large batches of records. If you're performing a mass update/insert and the model has complex callbacks, consider bypassing them with methods like update_all or insert_all. However, while avoiding callbacks can improve performance, particularly for bulk operations, it's crucial to understand when doing so could harm your application: these methods skip business logic, data validations, and important side effects implemented in callbacks like before_update.
Try to minimize db access
ActiveRecord is powerful and flexible. However, it's easy to write inefficient queries without realizing it. So, when writing queries, aim to reduce both the number of database hits and the amount of data loaded:
# Don't do this
# It loads every post into memory and filters in Ruby
Post.all.select { |p| p.published? }.each { |p| puts p.title }

# This filters in the database and loads only the published posts
Post.where(published: true).each { |p| puts p.title }
Counter cache
When you frequently need to get the count of associated records, consider using counter_cache, which stores the count in a column on the parent record.
# In your migration
add_column :posts, :comments_count, :integer, default: 0
# In your model
class Comment < ActiveRecord::Base
  belongs_to :post, counter_cache: true
end
This eliminates the need for a count query each time you need the number of associated records, saving memory and processing time.
Caching
Proper use of caching can drastically improve the speed and responsiveness of your application. There are multiple types of caching in Rails (fragment caching, Russian-doll caching, low-level caching, and more), and the official Rails guide on caching covers them well.
Pagination
The basic idea behind pagination is to only load a subset of data at a time, rather than trying to load all records into memory at once, which can have a significant impact on performance.
Rails provides built-in support for simple pagination using the limit and offset methods:
# This will retrieve the first 20 records
@posts = Post.limit(20)
# This will retrieve the next 20 records
@posts = Post.limit(20).offset(20)
There are also gems like kaminari and will_paginate that provide more powerful and customizable pagination features, including generating pagination links in views and handling edge cases.
Advanced: Read replicas
When your application scales, a single database server can become a bottleneck due to the increasing number of read operations. In this case, use read replicas to distribute the load. A read replica is a copy of the primary database that handles read traffic, which can greatly improve the performance of your Rails application. You can easily configure ActiveRecord to direct read queries to the replica:
production:
  primary:
    adapter: postgresql
    database: my_primary_database
  replica:
    adapter: postgresql
    database: my_replica_database
    replica: true
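With the roles defined in database.yml, the model layer also needs to know about them. A minimal configuration sketch using the Rails 6+ multi-database API (the role names assume the primary/replica entries above):

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  # Map the writing role to the primary and the reading role to the replica
  connects_to database: { writing: :primary, reading: :replica }
end

# Queries inside this block are sent to the replica:
ActiveRecord::Base.connected_to(role: :reading) do
  Post.where(published: true).to_a
end
```

Rails can also switch roles automatically per request via automatic connection switching, but the explicit block above is the simplest way to route a known-safe read.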
There are even more advanced strategies, such as sharding and denormalization, but they are outside the scope of this post.
Memory optimization
Many of the techniques discussed earlier, like avoiding N+1 queries, using pluck and select, and batch processing, also help optimize memory usage. Here are some other strategies.
Use the latest stable versions of Ruby and Rails
First and foremost, ensure you're using a recent version of Ruby. Ruby's garbage collector, responsible for freeing up memory that is no longer in use, has been significantly improved in recent versions, which helps reduce memory usage and enhance performance.
Utilize lazy loading when working with large collections
Ruby evaluates enumerables eagerly by default: each step in a chain like map or select processes the entire collection and builds an intermediate array before the next step runs. When working with very large (or infinite) collections, this can lead to substantial memory consumption, or the computation may never finish at all.
# This never finishes: map tries to realize the entire infinite range first. Tip: hit Ctrl+C
(1..Float::INFINITY).map { |n| n * 2 }.first(50)

# The lazy version computes only the 50 values that are actually needed
(1..Float::INFINITY).lazy.map { |n| n * 2 }.first(50)
Use symbols instead of strings where appropriate
In Ruby, symbols are immutable and unique, which makes them memory efficient. When the same symbol is used multiple times, Ruby points to the same object in memory, whereas with strings, a new object is created every time, consuming more memory.
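A quick irb experiment makes the difference visible (the string result assumes frozen string literals are not enabled for the file):

```ruby
# The same symbol literal always refers to a single object in memory...
a = :status
b = :status
puts a.object_id == b.object_id   # true

# ...while each string literal allocates a brand-new object
s1 = "status"
s2 = "status"
puts s1.object_id == s2.object_id # false (unless frozen string literals are enabled)
```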
Limit session data
Storing excessive data in sessions can consume significant server-side memory, especially if you're using a server-side session store. Try to limit session data to only what's necessary and avoid storing large objects or data sets.
Use background jobs for long running processes
Long running processes can significantly consume memory and CPU. If the task does not need to be completed immediately, consider moving it to a background job. Background jobs can be a great way to move resource-intensive tasks, such as sending emails or processing images, out of the request-response cycle. This ensures your users aren't left waiting for these tasks to complete and your application remains responsive.
Rails has built-in support for creating background jobs through Active Job, which acts as a common interface over queueing backends like Sidekiq. For advanced use cases, it can be better to use Sidekiq directly.
Use of the correct algorithm
Proper algorithm selection and implementation can have profound effects on the performance of your application. Remember that a well-optimized application not only provides a better user experience but also consumes fewer resources, which leads to cost-effectiveness in the long run.
Use built-in ruby methods
When it comes to improving performance, one of the most important steps you can take is to use built-in Ruby methods wherever possible. Ruby has an extensive library of built-in methods that are optimized for performance and memory usage.
Built-in methods are largely implemented in C and heavily optimized, making them faster than equivalent hand-written Ruby. They also tend to be more reliable and less prone to errors, as they've been extensively tested and refined over the years.
A few examples of built-in Ruby methods include sort, map, select, and reject. These methods can be used for a wide range of tasks, including sorting arrays, transforming data, and filtering data.
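As a small illustration, here is a hand-rolled filtering loop next to its built-in equivalent, plus a chained transformation (the data is made up for the example):

```ruby
numbers = [5, 3, 8, 1, 9, 2]

# Hand-rolled: manual accumulator loop
evens = []
numbers.each { |n| evens << n if n.even? }

# Built-in: shorter, and the iteration happens in optimized C code
evens_builtin = numbers.select(&:even?)

# Built-in methods also chain naturally: filter, transform, then sort
doubled_sorted = numbers.select(&:even?).map { |n| n * 2 }.sort
puts doubled_sorted.inspect # [4, 16]
```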
Memoization in Rails
Memoization is a powerful technique that involves storing the results of expensive function calls and reusing the cached result when the same inputs occur again. This technique can significantly improve the performance of your Rails application by avoiding redundant calculations or database queries.
In Rails, you can achieve memoization using the ||= operator.
def expensive_method
  @result ||= ExpensiveOperation.new.call
end
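One caveat worth knowing: ||= re-runs the computation whenever the cached value is nil or false. If those are legitimate results, guarding with defined? avoids repeated work. A sketch with a hypothetical expensive check that returns false:

```ruby
class FeatureCheck
  def initialize
    @compute_calls = 0
  end

  attr_reader :compute_calls

  # With @enabled ||= compute, compute would run on EVERY call here,
  # because the cached value is false. defined? caches it properly.
  def enabled?
    return @enabled if defined?(@enabled)
    @enabled = compute
  end

  private

  def compute
    @compute_calls += 1
    false # hypothetical expensive check that legitimately returns false
  end
end

check = FeatureCheck.new
check.enabled?
check.enabled?
puts check.compute_calls # 1
```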
Avoiding nested loops
Nested loops can be a major source of performance issues in your Rails application. Each additional level of nesting multiplies the work performed, leading to quadratic (or worse) growth in execution time as your data sets increase in size.
While there are situations where nested loops are necessary, they should be avoided whenever possible. If you find yourself writing nested loops, it may be worth looking for a different approach or using a more efficient data structure.
For example, you can often replace nested loops with Ruby's built-in methods or use a hash table to look up values in constant time, rather than looping through an array.
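For instance, matching records from two lists by id with a nested loop is O(n×m); indexing one list into a hash first reduces each lookup to constant time. A sketch with made-up data:

```ruby
users    = [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }]
payments = [{ user_id: 2, amount: 50 }, { user_id: 1, amount: 30 }]

# Nested-loop version: scans the whole users array for every payment
slow = payments.map do |pay|
  user = users.find { |u| u[:id] == pay[:user_id] }
  [user[:name], pay[:amount]]
end

# Hash-lookup version: one pass to build the index, then O(1) lookups
users_by_id = users.each_with_object({}) { |u, h| h[u[:id]] = u }
fast = payments.map { |pay| [users_by_id[pay[:user_id]][:name], pay[:amount]] }

puts slow == fast # true
```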
How to find the issue with a slow endpoint
If you're truly facing this problem and want to learn more about fixing performance issues in Rails apps, it's a long topic that won't fit in one or two blog posts. I have a book recommendation: The Complete Guide to Rails Performance by Nate Berkopec.
When testing locally, use rack-mini-profiler. It makes issues like N+1 queries and slow queries easy to find, and offers other functionality such as memory profiling and flame graphs. It can also run in production if needed, and you can even use it locally with Rails running in production mode.
You can make use of application performance monitoring (APM) tools like New Relic in production to continuously monitor your performance metrics.
Improving the performance of a specific slow SQL query
Sometimes you get alerts about a specific database query timing out. In most such cases for web apps, indexing will improve performance significantly. You can check where to add an index using the #explain method. I have a detailed two-part blog series on this topic. You can start here: https://haseebeqx.com/posts/rails-explain-analyze-explained/
Conclusion
Rails performance optimization is a vital aspect of developing efficient and scalable applications. It's a continuous process that requires periodic review and adjustment. Remember, it's not about creating an application that merely functions, but one that flourishes under load, provides a seamless user experience, and stands the test of time.