DEV Community

Cover image for From Slow to Lightning Fast: Optimizing String Concatenation in Ruby on Rails
Pimp My Ruby
Pimp My Ruby

Posted on

From Slow to Lightning Fast: Optimizing String Concatenation in Ruby on Rails

Developers working with Ruby on Rails often face the challenge of concatenating database fields to create complex strings. Whether you need to generate reports, build search strings, or format data for display, there are several methods to accomplish this task. In this article, we will explore six different ways to concatenate fields using Ruby on Rails.

For this article, i used a repository that you can find here.

Introduction

Our need is as follows: we have a large number of accounts in the database, and we want to easily extract basic account information. The extraction format is as follows:

{FirstName} {LastName} ({Role}), can be contacted at {Email} or {Phone}

Let's explore how to achieve this together!

Data Set Overview

Here is the table we need to export:

# db/schema.rb
create_table "accounts", force: :cascade do |t|
    t.string "first_name"
    t.string "last_name"
    t.string "phone"
    t.string "email"
    t.string "role"
  end
Enter fullscreen mode Exit fullscreen mode

For our test data, we will use the following seed:

# db/seeds.rb
FactoryBot.define do
  factory :account do
    first_name { Faker::Name.first_name }
    last_name { Faker::Name.last_name }
    phone { Faker::PhoneNumber.phone_number }
    email { Faker::Internet.email }
    role { 'user' }
  end
end

FactoryBot.create_list(:account, 1_000_000)
Enter fullscreen mode Exit fullscreen mode

Everything is set up and ready to start working!

1. Using .map

The .map method is the simplest way to concatenate fields. It involves loading all records from the "Account" table into memory and then using a loop to iterate over each record.

Account.all.map do |account|
  "#{account.first_name} #{account.last_name} (#{account.role}), can be contacted at #{account.email} or #{account.phone}"
end
Enter fullscreen mode Exit fullscreen mode

However, this approach is the least optimized as it loads all records from the "Account" table into memory.

Honestly, it's the worst way to do it.

2. Using .pluck

The .pluck() method is similar to using .map(), but it is more memory-efficient. It extracts only the values of the fields we need using a single SQL query.

This reduces the amount of data loaded into memory, which is more efficient.

Account.all.pluck(:first_name, :last_name, :role, :email, :phone).map do |info|
  "#{info[0]} #{info[1]} (#{info[2]}), can be contacted at #{info[3]} or #{info[4]}"
end
Enter fullscreen mode Exit fullscreen mode

This method is much more efficient than the first one! But beware, the methods to follow will be even more efficient.

3. Using Raw SQL Queries

By using raw SQL queries, we can concatenate fields directly at the database level. We use the SQL "CONCAT" function to create the concatenated string.

Account.select(
  "CONCAT(first_name, ' ', last_name, ' (', role, '), can be contacted at ', email, ' or ', phone)
   AS insql_concatenated_string"
).map(&:insql_concatenated_string)
Enter fullscreen mode Exit fullscreen mode

This method is very performance-efficient because concatenation is performed at the database level. However, it requires the use of raw SQL, which can make the code less readable and less portable.

4. Using Arel

Using Arel, a Ruby-based SQL query construction language, allows us to create complex SQL queries.

# A little helper needed
def to_sql(string)
  Arel.sql(string)
end

# We will use the account table, so I'll extract it
account_table = Account.arel_table

# Each step is essentially what you can do in a raw SQL query, but it's sugar syntax thanks to Arel.
concatenate_arel = account_table[:first_name].concat(to_sql("' '"))
                                             .concat(account_table[:last_name])
                                             .concat(to_sql("' ('"))
                                             .concat(account_table[:role])
                                             .concat(to_sql("'), can be contacted at '"))
                                             .concat(account_table[:email])
                                             .concat(to_sql("' or '"))
                                             .concat(account_table[:phone])
                                             .as('arel_concatenated_string')

# Now we need to run our select over all our records and grab `arel_concatenated_string`
Account.select(concatenate_arel).map(&:arel_concatenated_string)
Enter fullscreen mode Exit fullscreen mode

This approach offers a high degree of customization and allows us to generate complex SQL queries using Ruby code. So we can expect performance close to that of a raw SQL query.

The downside of Arel, in my opinion, is its verbosity. As you can see, the query is quite long. However, it is more portable than raw SQL.

5. Using SQL Views with Scenic

Using SQL views with the Scenic gem is an interesting approach for concatenation.

Before we begin, we need to add the gem to our project and generate our SQL view.

bundle add scenic
rails generate scenic:model account_information
Enter fullscreen mode Exit fullscreen mode
# db/views/account_informations_v01.sql

SELECT CONCAT(first_name, ' ', last_name, ' (', role, '), can be contacted at ', email, ' or ', phone) as concatenated_string
FROM accounts

# app/models/account_information.rb

class AccountInformation < ApplicationRecord
  def readonly?
    true
  end
end
Enter fullscreen mode Exit fullscreen mode

By doing this, our view is accessible via the associated model:

AccountInformation.pluck(:concatenated_string)
Enter fullscreen mode Exit fullscreen mode

This approach is very powerful! SQL views are a concept not natively exploited by Ruby on Rails. In this case, the query is very simple; we're just concatenating a string. But if your query involves joins and searching, using SQL views becomes even more relevant!

6. Using Virtual Columns

Using virtual columns is an interesting approach that involves adding an "account_information" column to the "Account" table using a migration.

# db/migrate/20231018094315_add_virtual_column_on_accounts.rb
class AddVirtualColumnOnAccounts < ActiveRecord::Migration[7.0]
  def change
    add_column :accounts, :account_information, :virtual, type: :string,
               as: "first_name || ' ' || last_name || ' (' || role || ') can be contacted at ' ||  email || ' or ' || phone",
               stored: true
  end
end
Enter fullscreen mode Exit fullscreen mode

Afterward, we have access to our information using:

Account.pluck(:account_information)
Enter fullscreen mode Exit fullscreen mode

This approach is very powerful because the virtual column is synchronized with the fields in its expression. If you update the value of first_name, the account_information column is automatically updated without the need for manual updates or additional transactions.

Take into account the stored: true argument, which allows saving the value in the database. On my version of PostgreSQL, stored: false is not supported. When stored is set to false, the column's value is recalculated with each call.

Benchmark Analysis

Now that we know all these methods, let's find out which one is the fastest! You can find the benchmark here.

Here is the time it takes to display our concatenated string for 650K records. Make your bets!

Results

Label User System Total Real
Using .map 10.848756 1.556344 12.405100 12.750942
Using .pluck 1.501343 0.159192 1.660535 1.809598
Using Raw SQL 2.774949 0.409105 3.184054 3.425237
Using Arel 2.602465 0.726804 3.329269 3.509664
Using Scenic 0.358593 0.030825 0.389418 0.628602
Using Virtual Column 0.295781 0.041697 0.337478 0.452611

🥇Virtual Columns (28x faster than .map)

🥈Scenic views (20x faster than .map)

🥉Pluck (7x faster than .map)

To be honest, I'm quite surprised that .pluck is so fast at scale!

Interpretation

  • .map: This method is the slowest, taking approximately 12.75 seconds to process 650,000 records. This is due to loading all ActiveRecord records into memory.
  • .pluck: Compared to .map, this method is about 7 times faster, taking only 1.81 seconds for the same task. By selecting only the necessary columns, it significantly reduces memory consumption.
  • Raw SQL Query: This method performs concatenation directly at the database level. It is faster than .map but slower than .pluck. However, this solution excels for complex queries requiring joins or the use of other SQL functions.
  • Using Arel: Using Arel allows the generation of complex SQL queries in Ruby. While more verbose, it offers good performance, slightly behind raw SQL.
  • Using Scenic Views: Using SQL views with Scenic is approximately 20 times faster than .map. This demonstrates the power of SQL views for improving concatenation performance. In general, using SQL views is a good practice for very large databases.
  • Using Virtual Column: Virtual columns proved to be the fastest, being approximately 28 times faster than .map. Virtual columns offer the advantage of automatically maintaining data synchronized with source fields, without requiring manual updates or additional transactions.

Recommendations

  • For very simple read-only queries, .pluck is more than sufficient.
  • If your requirements are even more extensive and require minimal processing, use Virtual Columns.
  • For very complex needs, involving multiple tables and requiring regular updates, use SQL Views.

And above all, never use .map for string processing from the database!

Conclusion

Concatenating fields in a database is a common task, but there are several ways to accomplish it in Ruby on Rails. Each of the methods we've explored has its advantages and disadvantages. By understanding the differences between these approaches, you can choose the one that best suits your project. Whatever method you choose, Ruby on Rails provides the flexibility needed to handle your data efficiently.

Learn More

Top comments (1)

Collapse
 
maprangsoft profile image
Maprangsoft

thank you.