<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Daniela Baron</title>
    <description>The latest articles on DEV Community by Daniela Baron (@danielabar).</description>
    <link>https://dev.to/danielabar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1343718%2F5b5e169f-7d04-4a30-81e0-57c5ac23d8cd.png</url>
      <title>DEV Community: Daniela Baron</title>
      <link>https://dev.to/danielabar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/danielabar"/>
    <language>en</language>
    <item>
      <title>Efficient Database Queries in Rails: A Practical Approach</title>
      <dc:creator>Daniela Baron</dc:creator>
      <pubDate>Sun, 10 Mar 2024 20:08:20 +0000</pubDate>
      <link>https://dev.to/danielabar/efficient-database-queries-in-rails-a-practical-approach-12hl</link>
      <guid>https://dev.to/danielabar/efficient-database-queries-in-rails-a-practical-approach-12hl</guid>
      <description>&lt;p&gt;This post will walk through a step-by-step approach to PostgreSQL query enhancement in Rails applications. From indexing strategies to running migrations without downtime, to efficient column selection, you'll learn some techniques to ensure optimal query performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;To get started quicker with a Rails application, schema design, and data, I've forked the &lt;a href="https://github.com/andyatkinson/rideshare" rel="noopener noreferrer"&gt;Rideshare Rails application&lt;/a&gt; on GitHub. This is used for exercises in the book &lt;em&gt;High Performance PostgreSQL for Rails&lt;/em&gt;. I had the opportunity to provide a technical review for the beta version of this book. Many insights shared in this post are derived from the lessons learned during that review process. If you're interested in the book, it can be &lt;a href="https://pragprog.com/titles/aapsql/high-performance-postgresql-for-rails/" rel="noopener noreferrer"&gt;purchased here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing Rideshare
&lt;/h2&gt;

&lt;p&gt;Rideshare is an API-only Rails application that implements a portion of a fictional rideshare service. Think of Uber or Lyft. The core Rideshare models are Drivers, Riders, Trips, Trip Requests. The database is PostgreSQL. The schema is shown below and was generated with the &lt;a href="https://github.com/voormedia/rails-erd" rel="noopener noreferrer"&gt;rails-erd&lt;/a&gt; gem:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foq4xf3z8t075g4ohd1l9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foq4xf3z8t075g4ohd1l9.png" alt="Rideshare erd"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Single Table Inheritance (&lt;a href="https://api.rubyonrails.org/classes/ActiveRecord/Inheritance.html" rel="noopener noreferrer"&gt;STI&lt;/a&gt;) is used to represent both Drivers and Riders in the Users table. This means that rows where &lt;code&gt;users.type = 'Driver'&lt;/code&gt; are Drivers, and can be joined to Trips on &lt;code&gt;driver_id&lt;/code&gt;. Rows where &lt;code&gt;users.type = 'Rider'&lt;/code&gt; are Riders, and can be joined to TripRequests on &lt;code&gt;rider_id&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here are the corresponding model classes, only focusing on the associations between them:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Trip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:trip_request&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:driver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;class_name: &lt;/span&gt;&lt;span class="s1"&gt;'User'&lt;/span&gt;
  &lt;span class="n"&gt;has_many&lt;/span&gt; &lt;span class="ss"&gt;:trip_positions&lt;/span&gt;

  &lt;span class="n"&gt;delegate&lt;/span&gt; &lt;span class="ss"&gt;:rider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;to: :trip_request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;allow_nil: &lt;/span&gt;&lt;span class="kp"&gt;false&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Location&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TripRequest&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:rider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;class_name: &lt;/span&gt;&lt;span class="s1"&gt;'User'&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:start_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;class_name: &lt;/span&gt;&lt;span class="s1"&gt;'Location'&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:end_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;class_name: &lt;/span&gt;&lt;span class="s1"&gt;'Location'&lt;/span&gt;
  &lt;span class="n"&gt;has_one&lt;/span&gt; &lt;span class="ss"&gt;:trip&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Generating Data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rideshare comes with a task to generate sample data &lt;code&gt;bin/rails data_generators:generate_all&lt;/code&gt;. I've modified it to generate 1000 Drivers and Riders, and 50,000 Trips and Trip Requests. I've also modified it to generate random variability in the &lt;code&gt;trips.completed_at&lt;/code&gt; date from anywhere between 1 and 90 days ago. The data generator can be re-run at any time, but first run the truncate task &lt;code&gt;bin/rails db:truncate_all&lt;/code&gt; to quickly remove all data.&lt;/p&gt;

&lt;p&gt;While generating fake data for testing purposes can provide a valuable simulation of a sizeable database, seeded data tends to be very efficiently laid out, sequentially. This results in an unrealistically high number of rows in fewer pages. When the data is packed so efficiently, performance can be good for sequential scans. An equivalent range of data that grew and churned organically in production over time via &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt;, statements will result in a more dispersed distribution of data across pages.&lt;/p&gt;

&lt;p&gt;Therefore, when addressing a specific performance issues encountered in a production database, it might be necessary to replicate the environment in a test setup, by copying relevant portions of production data to a separate testing environment. This may also require anonymizing sensitive data during the copy process. The &lt;a href="(https://pragprog.com/titles/aapsql/high-performance-postgresql-for-rails/)"&gt;High Performance PostgreSQL for Rails&lt;/a&gt; book goes through this exercise with an example &lt;code&gt;users&lt;/code&gt; table.&lt;/p&gt;

&lt;p&gt;However, for the purpose of this blog post, we'll be focused on optimizing a query to meet a new business requirement, so we don't need to worry about the complexities of a live environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building an Admin Report
&lt;/h2&gt;

&lt;p&gt;We have a business requirement to create a dashboard type view for the admin team that manages Rideshare. They would like to have a view that shows recently completed trips. Each trip should display:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trip completed date&lt;/li&gt;
&lt;li&gt;Rating given by the rider&lt;/li&gt;
&lt;li&gt;Driver that provided the trip&lt;/li&gt;
&lt;li&gt;Rider that took the trip&lt;/li&gt;
&lt;li&gt;Location where the trip started&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Being an API-only application, we will only focus on the query part of this logic, and assume a separate front end application will take care of rendering the report.&lt;/p&gt;

&lt;h2&gt;
  
  
  First Attempt
&lt;/h2&gt;

&lt;p&gt;Before getting into the complexity of joins to get all the data, let's first focus on the main requirement, which is to show recently completed trips. We can use the &lt;code&gt;completed_at&lt;/code&gt; timestamp column on the &lt;code&gt;Trip&lt;/code&gt; model in a scope as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Trip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:recently_completed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'completed_at &amp;gt;= ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;week&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;# ...&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Let's try this out in a Rails console &lt;code&gt;bin/rails c&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Trip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recently_completed&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT "trips"."id", "trips"."trip_request_id", "trips"."driver_id", "trips"."completed_at", "trips"."rating", "trips"."created_at", "trips"."updated_at"&lt;/span&gt;
&lt;span class="c1"&gt;# FROM "trips" WHERE (completed_at &amp;gt;= '2024-01-10 12:29:18.260820')&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 559&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attributes&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;# {"id"=&amp;gt;6011,&lt;/span&gt;
&lt;span class="c1"&gt;#  "trip_request_id"=&amp;gt;6031,&lt;/span&gt;
&lt;span class="c1"&gt;#  "driver_id"=&amp;gt;61038,&lt;/span&gt;
&lt;span class="c1"&gt;#  "completed_at"=&amp;gt;Wed, 10 Jan 2024 07:12:42.960652000 CST -06:00,&lt;/span&gt;
&lt;span class="c1"&gt;#  "rating"=&amp;gt;nil,&lt;/span&gt;
&lt;span class="c1"&gt;#  "created_at"=&amp;gt;Wed, 10 Jan 2024 02:12:42.960652000 CST -06:00,&lt;/span&gt;
&lt;span class="c1"&gt;#  "updated_at"=&amp;gt;Wed, 10 Jan 2024 07:12:42.960652000 CST -06:00}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In development mode, the console output displays the SQL query that ActiveRecord generated when the scope was executed. To understand the efficiency of this query, we need to view the query execution plan. To do this, launch a database console with &lt;code&gt;bin/rails dbconsole&lt;/code&gt; and then enter the SQL query that was generated by ActiveRecord, preceded by the &lt;code&gt;EXPLAIN (ANALYZE)&lt;/code&gt; command at the psql prompt:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;bin/rails dbconsole&lt;/code&gt; uses the database connection information specified in &lt;code&gt;config/database.yml&lt;/code&gt; to make a connection to the database your project is using. Since this project uses PostgreSQL, it's equivalent to running &lt;code&gt;psql postgresql://role:password@host:port/database_name&lt;/code&gt;. If &lt;code&gt;password&lt;/code&gt; isn't specified in the database url, then Postgres will prompt for it or you can save the password in &lt;code&gt;~/.pgpass&lt;/code&gt;. For even less typing, the alias &lt;code&gt;bin/rails db&lt;/code&gt; can be used.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"trip_request_id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"driver_id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"completed_at"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"rating"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"created_at"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"updated_at"&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 12:29:18.260820'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

--                               QUERY PLAN
-- ------------------------------------------------------------------------------------------------------
--  Seq Scan on trips  (cost=0.00..1106.00 rows=3521 width=48) (actual time=0.013..11.413 rows=3863 loops=1)
--    Filter: (completed_at &amp;gt;= '2024-01-04 12:20:54.270716'::timestamp without time zone)
--    Rows Removed by Filter: 46137
--  Planning Time: 1.193 ms
--  Execution Time: 11.880 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In PostgreSQL, the &lt;code&gt;EXPLAIN&lt;/code&gt; statement is used to analyze and show the execution plan of a SQL query. It shows how the database engine is planning to execute the query, such as which indexes will be used and estimated costs. The cost is a unit-less value representing the estimated computational effort required for the operation. Lower values are generally better.&lt;/p&gt;

&lt;p&gt;While &lt;code&gt;EXPLAIN&lt;/code&gt; can be run on its own, its typically used together with the &lt;code&gt;ANALYZE&lt;/code&gt; option. This option tells PostgreSQL to actually execute the query, so the output will include statistics such as the number of rows processed, and execution time.&lt;/p&gt;

&lt;p&gt;From the output above we can see that a sequential scan was performed to retrieve rows from the &lt;code&gt;trips&lt;/code&gt; table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Seq Scan on trips  (cost=0.00..1106.00 rows=3521 width=48) (actual time=0.013..11.413 rows=3863 loops=1)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It also shows that the query took nearly 12 ms to execute, which is relatively slow for such a simple query:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Execution Time: 11.880 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A sequential scan means that PostgreSQL reads the entire table, and &lt;em&gt;then&lt;/em&gt; filters the rows out that match the query conditions. The number of rows removed by filtering are shown in the execution plan:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Filter: (completed_at &amp;gt;= '2024-01-04 12:20:54.270716'::timestamp without time zone)
Rows Removed by Filter: 46137


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;There's more nuance here in that this particular query doesn't have a &lt;code&gt;LIMIT&lt;/code&gt; clause. When that's present, PostgreSQL will "short circuit" and stop processing when it finds enough matches. See this post that digs deeper into &lt;a href="https://andyatkinson.com/blog/2024/01/25/PostgreSQL-rows-removed-by-filter-meaning" rel="noopener noreferrer"&gt;Rows Removed By Filter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To understand why a slow sequential scan is being used to filter &lt;code&gt;trips&lt;/code&gt; rows based on &lt;code&gt;completed_at&lt;/code&gt;, we can take a look at the &lt;code&gt;trips&lt;/code&gt; table schema. Still in the database console, the &lt;code&gt;\d table_name&lt;/code&gt; meta-command can be used to display any table schema:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

=&amp;gt; \d trips

                                           Table "rideshare.trips"
     Column      |              Type              | Collation | Nullable |              Default
-----------------+--------------------------------+-----------+----------+-----------------------------------
 id              | bigint                         |           | not null | nextval('trips_id_seq'::regclass)
 trip_request_id | bigint                         |           | not null |
 driver_id       | integer                        |           | not null |
 completed_at    | timestamp without time zone    |           |          |
 rating          | integer                        |           |          |
 created_at      | timestamp(6) without time zone |           | not null |
 updated_at      | timestamp(6) without time zone |           | not null |
Indexes:
    "trips_pkey" PRIMARY KEY, btree (id)
    "index_trips_on_driver_id" btree (driver_id)
    "index_trips_on_rating" btree (rating)
    "index_trips_on_trip_request_id" btree (trip_request_id)
Check constraints:
    "chk_rails_4743ddc2d2" CHECK (completed_at &amp;gt; created_at) NOT VALID
    "rating_check" CHECK (rating &amp;gt;= 1 AND rating &amp;lt;= 5)
Foreign-key constraints:
    "fk_rails_6d92acb430" FOREIGN KEY (trip_request_id) REFERENCES trip_requests(id)
    "fk_rails_e7560abc33" FOREIGN KEY (driver_id) REFERENCES users(id)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;Indexes&lt;/code&gt; section of the output above lists all the indexes that are available on the &lt;code&gt;trips&lt;/code&gt; table. Notice there is no index on &lt;code&gt;completed_at&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Indexes:
  "trips_pkey" PRIMARY KEY, btree (id)
  "index_trips_on_driver_id" btree (driver_id)
  "index_trips_on_rating" btree (rating)
  "index_trips_on_trip_request_id" btree (trip_request_id)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;While the existing query may be sufficient for the current scale, this operation will become progressively slower as the number of trips increases. This means that the overall efficiency and performance of the application may be negatively impacted over time.&lt;/p&gt;

&lt;p&gt;Improving the scalability of the application involves optimizing queries to be less resource-intensive, and adding an index to the &lt;code&gt;completed_at&lt;/code&gt; column, is a step towards achieving this goal. Scaling, in this context, refers to the ability of the database system to handle a growing workload efficiently. By addressing the performance bottlenecks, such as slow sequential scans, the application can better leverage the capabilities of the PostgreSQL instance as the workload increases.&lt;/p&gt;

&lt;p&gt;So the next step in improving this query is to add an index, which can help with accessing the data at a lower cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Second Attempt: Add Index
&lt;/h2&gt;

&lt;p&gt;To add an index to an existing table in a Rails application, start by generating a database migration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

bin/rails generate migration AddIndexToTripsCompletedAt


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Fill in the &lt;code&gt;change&lt;/code&gt; method, specifying what table the index should be added to, and on what column:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="c1"&gt;# db/migrate/20240111132403_add_index_to_trips_completed_at.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AddIndexToTripsCompletedAt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;7.1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;change&lt;/span&gt;
    &lt;span class="n"&gt;add_index&lt;/span&gt; &lt;span class="ss"&gt;:trips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:completed_at&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run the migration with &lt;code&gt;bin/rails db:migrate&lt;/code&gt;. However, in the Rideshare application, the migration is not applied due to the following error:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Migrating to AddIndexToTripsCompletedAt (20240111132403)
== 20240111132403 AddIndexToTripsCompletedAt: migrating =======================
  TRANSACTION (0.7ms)  BEGIN
   (1.6ms)  SHOW server_version_num
   (0.7ms)  SET statement_timeout TO 3600000
   (0.7ms)  SET lock_timeout TO 10000
  TRANSACTION (0.7ms)  ROLLBACK
   (0.8ms)  SELECT pg_advisory_unlock(5537362570877065845)
bin/rails aborted!
StandardError: An error has occurred, this and all later migrations canceled: (StandardError)

=== Dangerous operation detected #strong_migrations ===

Adding an index non-concurrently blocks writes. Instead, use:

class AddIndexToTripsCompletedAt &amp;lt; ActiveRecord::Migration[7.1]
  disable_ddl_transaction!

  def change
    add_index :trips, :completed_at, algorithm: :concurrently
  end
end


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This error is coming from the &lt;a href="https://github.com/ankane/strong_migrations" rel="noopener noreferrer"&gt;Strong Migrations&lt;/a&gt; gem that is included in Rideshare. This gem prevents "dangerous" migrations from being added to the project.&lt;/p&gt;

&lt;p&gt;What makes this particular migration risky is that adding an index will lock the table that it's being added to, which means the application won't be able to read or write from/to the &lt;code&gt;trips&lt;/code&gt; table while this migration is in progress. Since adding an index can take a long time on large tables, this will result in application errors.&lt;/p&gt;

&lt;p&gt;Since access to the &lt;code&gt;trips&lt;/code&gt; table is critical to the functioning of the application, we can avoid potential downtime, by following the advice generated by the &lt;a href="https://github.com/ankane/strong_migrations" rel="noopener noreferrer"&gt;Strong Migrations&lt;/a&gt; gem. Notice the error message included how the migration should be written. Here is the corrected version:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AddIndexToTripsCompletedAt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;7.1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;disable_ddl_transaction!&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;change&lt;/span&gt;
    &lt;span class="n"&gt;add_index&lt;/span&gt; &lt;span class="ss"&gt;:trips&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:completed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;algorithm: :concurrently&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Specifying &lt;code&gt;algorithm: :concurrently&lt;/code&gt; tells PostgreSQL to add the index without locking the table. This means the &lt;code&gt;trips&lt;/code&gt; table can continue to be queried while the index is being added, which keeps the application responsive. The trade-off with &lt;code&gt;:concurrently&lt;/code&gt; is that it makes the operation take about twice as long.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;disable_ddl_transaction!&lt;/code&gt; disables the transaction for DDL (Data Definition Language) statements, because the concurrent approach can't be executed inside a transaction.&lt;/p&gt;

&lt;p&gt;With this change in place, &lt;code&gt;bin/rails db:migrate&lt;/code&gt; completes successfully and the index is now added.&lt;/p&gt;

&lt;p&gt;For a small and less busy table, creating the index in the standard way (i.e., without using &lt;code&gt;:concurrently&lt;/code&gt;) may be fine. The &lt;code&gt;strong_migrations&lt;/code&gt; gem issues warnings for potentially risky database operations, prompting developers to assess the situation and decide whether additional precautions, like &lt;code&gt;:concurrently&lt;/code&gt;, are necessary.&lt;/p&gt;

&lt;p&gt;Now we can, once again, examine the execution plan for the completed trips query after adding the index:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"trip_request_id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"driver_id"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"completed_at"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"rating"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"created_at"&lt;/span&gt;
  &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"updated_at"&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 12:29:18.260820'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
--  Bitmap Heap Scan on trips  (cost=67.58..592.59 rows=3521 width=48) (actual time=0.724..3.849 rows=3863 loops=1)
--    Recheck Cond: (completed_at &amp;gt;= '2024-01-04 12:20:54.270716'::timestamp without time zone)
--    Heap Blocks: exact=481
--    -&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at  (cost=0.00..66.70 rows=3521 width=0) (actual time=0.603..0.604 --rows=3863 loops=1)
--          Index Cond: (completed_at &amp;gt;= '2024-01-04 12:20:54.270716'::timestamp without time zone)
--  Planning Time: 0.242 ms
--  Execution Time: 4.229 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This time the query planner is using the index on &lt;code&gt;completed_at&lt;/code&gt; instead of a sequential scan. This is shown with the &lt;code&gt;Bitmap Index Scan&lt;/code&gt; which indicates that the &lt;code&gt;index_trips_on_completed_at&lt;/code&gt; index is being used:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

-&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at...
      Index Cond: (completed_at &amp;gt;= '2024-01-04 12:20:54.270716'::timestamp without time zone)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And the overall time to complete this query is just over 4ms, a significant improvement over the previous ~12ms when using a sequential scan:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Execution Time: 4.229 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is because the index allows PostgreSQL to quickly locate and retrieve the relevant rows. Specifically, the index supports a faster filtering operation based on the query conditions. Note that since this isn't an &lt;a href="https://www.postgresql.org/docs/current/indexes-index-only-scans.html" rel="noopener noreferrer"&gt;index-only scan&lt;/a&gt;, a &lt;code&gt;Bitmap Heap Scan&lt;/code&gt; was still required to access data from the table, which is shown in the first line of the query plan:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Bitmap Heap Scan on trips  (cost=67.58..592.59 rows=3521 width=48) (actual time=0.724..3.849 rows=3863 loops=1)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The query plan output can list many different operations, only a small number of which are covered in this post. PgMustard, a paid tool that helps to review query plans hosts a very useful &lt;a href="https://www.pgmustard.com/docs/explain" rel="noopener noreferrer"&gt;EXPLAIN Glossary&lt;/a&gt; with definitions of many of the terms shown in PostgreSQL plans.&lt;/p&gt;

&lt;h2&gt;
  
  
  Third Attempt: Joins
&lt;/h2&gt;

&lt;p&gt;Now that use of &lt;code&gt;completed_at&lt;/code&gt; is optimized, we can continue development of the admin dashboard query. Recall we also need to show driver and rider information, as well as the trip location. To do this, we'll use the ActiveRecord &lt;a href="https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-joins" rel="noopener noreferrer"&gt;joins&lt;/a&gt; method.&lt;/p&gt;

&lt;p&gt;To get location information, the Trip model must be joined to &lt;code&gt;trip_request&lt;/code&gt;, which must then be joined to &lt;code&gt;start_location&lt;/code&gt; via a nested join.&lt;/p&gt;

&lt;p&gt;Getting rider and driver information requires joining the &lt;code&gt;users&lt;/code&gt; table twice, once for Riders (which are joined on &lt;code&gt;trip_request.rider_id&lt;/code&gt;), and another time for Drivers (which are joined on &lt;code&gt;trips.driver_id&lt;/code&gt;), due to use of Single Table Inheritance that models Drivers and Riders with the same underlying &lt;code&gt;users&lt;/code&gt; table. This requires the use of a string join rather than just specifying the model.&lt;/p&gt;

&lt;p&gt;The same &lt;code&gt;where&lt;/code&gt; clause as before is used to filter on recent trips.&lt;/p&gt;

&lt;p&gt;By default, the result of the ActiveRecord &lt;code&gt;joins&lt;/code&gt; method will only select the columns from the model class on which its invoked, &lt;code&gt;Trip&lt;/code&gt; in this case. But since the report needs to also show Driver, Rider, and Location information, this scope adds a &lt;code&gt;SELECT&lt;/code&gt; clause to get all fields from all tables using the ActiveRecord &lt;a href="https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-select" rel="noopener noreferrer"&gt;select&lt;/a&gt; method:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Trip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:recently_completed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;trip_request: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:start_location&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INNER JOIN users AS riders ON riders.id = trip_requests.rider_id"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INNER JOIN users AS drivers ON drivers.id = trips.driver_id"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"trips.completed_at &amp;gt; ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;week&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"trips.*, locations.*, drivers.*, riders.*"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here is what it looks like in a Rails console. Note that even though &lt;code&gt;results&lt;/code&gt; is a relation of Trip models, each instance contains attributes from trips, locations, and users tables. However, since both drivers and riders are actually from the same &lt;code&gt;users&lt;/code&gt; table, the last one pulled in (&lt;code&gt;riders&lt;/code&gt; in this case) "wins" and so the result only has the rider attributes. This is due to the order in which the attributes are selected in the SQL query and will be addressed in the next section.&lt;/p&gt;

&lt;p&gt;I've added &lt;code&gt;===&lt;/code&gt; comments to highlight which model each set of attributes comes from:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Trip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recently_completed&lt;/span&gt;
&lt;span class="c1"&gt;# Trip Load (2.5ms)  SELECT trips.*, locations.*, drivers.*, riders.*&lt;/span&gt;
&lt;span class="c1"&gt;# FROM "trips" INNER JOIN "trip_requests" ON "trip_requests"."id" = "trips"."trip_request_id"&lt;/span&gt;
&lt;span class="c1"&gt;# INNER JOIN "locations" ON "locations"."id" = "trip_requests"."start_location_id"&lt;/span&gt;
&lt;span class="c1"&gt;# INNER JOIN users AS riders ON riders.id = trip_requests.rider_id&lt;/span&gt;
&lt;span class="c1"&gt;# INNER JOIN users AS drivers ON drivers.id = trips.driver_id&lt;/span&gt;
&lt;span class="c1"&gt;# WHERE (trips.completed_at &amp;gt; '2024-01-10 12:20:01.192132')&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 559&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attributes&lt;/span&gt;
&lt;span class="c1"&gt;# {&lt;/span&gt;
&lt;span class="c1"&gt;#   === TRIP ATTRIBUTES ===&lt;/span&gt;
&lt;span class="c1"&gt;#   "id"=&amp;gt;61943,&lt;/span&gt;
&lt;span class="c1"&gt;#   "trip_request_id"=&amp;gt;56017,&lt;/span&gt;
&lt;span class="c1"&gt;#   "driver_id"=&amp;gt;60456,&lt;/span&gt;
&lt;span class="c1"&gt;#   "completed_at"=&amp;gt;Wed, 10 Jan 2024 07:21:26.462838000 CST -06:00,&lt;/span&gt;
&lt;span class="c1"&gt;#   "rating"=&amp;gt;5,&lt;/span&gt;

&lt;span class="c1"&gt;#   === LOCATION ATTRIBUTES ===&lt;/span&gt;
&lt;span class="c1"&gt;#   "address"=&amp;gt;"New York, NY",&lt;/span&gt;
&lt;span class="c1"&gt;#   "city"=&amp;gt;nil,&lt;/span&gt;
&lt;span class="c1"&gt;#   "state"=&amp;gt;"NY",&lt;/span&gt;
&lt;span class="c1"&gt;#   "position"=&amp;gt;#&amp;lt;struct ActiveRecord::Point x=40.7143528, y=-74.0059731&amp;gt;,&lt;/span&gt;

&lt;span class="c1"&gt;#   === USER AS RIDER ATTRIBUTES ===&lt;/span&gt;
&lt;span class="c1"&gt;#   "first_name"=&amp;gt;"Shantelle",&lt;/span&gt;
&lt;span class="c1"&gt;#   "last_name"=&amp;gt;"Kris",&lt;/span&gt;
&lt;span class="c1"&gt;#   "email"=&amp;gt;"Shantelle-Kris-522@email.com",&lt;/span&gt;
&lt;span class="c1"&gt;#   "type"=&amp;gt;"Rider",&lt;/span&gt;
&lt;span class="c1"&gt;#   "password_digest"=&amp;gt;"$2a$12...",&lt;/span&gt;
&lt;span class="c1"&gt;#   "trips_count"=&amp;gt;nil,&lt;/span&gt;
&lt;span class="c1"&gt;#   "drivers_license_number"=&amp;gt;nil&lt;/span&gt;
&lt;span class="c1"&gt;# }&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now this query can be analyzed in a database console session &lt;code&gt;bin/rails dbconsole&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="nv"&gt;"trip_requests"&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="nv"&gt;"trip_requests"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"trips"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"trip_request_id"&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="nv"&gt;"locations"&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="nv"&gt;"locations"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"trip_requests"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"start_location_id"&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rider_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 12:05:39.639423'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

--                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
--  Hash Join  (cost=1399.44..2692.88 rows=524 width=453) (actual time=11.736..24.510 rows=559 loops=1)
--    Hash Cond: (trips.driver_id = drivers.id)
--    -&amp;gt;  Nested Loop  (cost=529.22..1821.28 rows=524 width=294) (actual time=1.109..13.507 rows=559 loops=1)
--          -&amp;gt;  Hash Join  (cost=528.93..1643.10 rows=524 width=139) (actual time=1.085..12.227 rows=559 loops=1)
--                Hash Cond: (trip_requests.start_location_id = locations.id)
--                -&amp;gt;  Hash Join  (cost=527.89..1637.76 rows=524 width=56) (actual time=1.054..11.992 rows=559 loops=1)
--                      Hash Cond: (trip_requests.id = trips.trip_request_id)
--                      -&amp;gt;  Seq Scan on trip_requests  (cost=0.00..917.10 rows=50010 width=16) (actual time=0.004..4.868 rows=50010 loops=1)
--                      -&amp;gt;  Hash  (cost=521.34..521.34 rows=524 width=48) (actual time=1.028..1.032 rows=559 loops=1)
--                            Buckets: 1024  Batches: 1  Memory Usage: 53kB
--                            -&amp;gt;  Bitmap Heap Scan on trips  (cost=12.35..521.34 rows=524 width=48) (actual time=0.116..0.896 rows=559 loops=1)
--                                  Recheck Cond: (completed_at &amp;gt; '2024-01-10 12:05:39.639423'::timestamp without time zone)
--                                  Heap Blocks: exact=328
--                                  -&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at  (cost=0.00..12.22 rows=524 width=0) (actual time=0.060..0.060 rows=559 loops=1)
--                                        Index Cond: (completed_at &amp;gt; '2024-01-10 12:05:39.639423'::timestamp without time zone)
--                -&amp;gt;  Hash  (cost=1.02..1.02 rows=2 width=87) (actual time=0.019..0.021 rows=2 loops=1)
--                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
--                      -&amp;gt;  Seq Scan on locations  (cost=0.00..1.02 rows=2 width=87) (actual time=0.011..0.013 rows=2 loops=1)
--          -&amp;gt;  Index Scan using users_pkey on users riders  (cost=0.29..0.34 rows=1 width=159) (actual time=0.002..0.002 rows=1 loops=559)
--                Index Cond: (id = trip_requests.rider_id)
--    -&amp;gt;  Hash  (cost=595.10..595.10 rows=22010 width=159) (actual time=10.601..10.601 rows=22010 loops=1)
--          Buckets: 32768  Batches: 1  Memory Usage: 3221kB
--          -&amp;gt;  Seq Scan on users drivers  (cost=0.00..595.10 rows=22010 width=159) (actual time=0.021..4.849 rows=22010 loops=1)
--  Planning Time: 1.046 ms
--  Execution Time: 24.936 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The SQL explain plan above shows this is a more complex query than the previous attempts. It includes many more operations than can be covered in this post, so we'll only be focusing on the join operations.&lt;/p&gt;

&lt;p&gt;Execution time has gone up, even though the query is still using the index on &lt;code&gt;trips.completed_at&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Execution Time: 24.936 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is in part due to the multiple joins. Joins between tables are expressed in the plan as a combination of a &lt;code&gt;Hash&lt;/code&gt; node, a &lt;code&gt;Hash Join&lt;/code&gt; which implements the join details, and a &lt;code&gt;Hash Cond&lt;/code&gt;, which shows the conditions that were used to perform the join, using the &lt;code&gt;Hash&lt;/code&gt; node.&lt;/p&gt;

&lt;p&gt;For example, the join from the &lt;code&gt;trips&lt;/code&gt; table to the &lt;code&gt;trip_requests&lt;/code&gt; table is shown in the plan as:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

-&amp;gt;  Hash Join  (cost=527.89..1637.76 rows=524 width=56) (actual time=1.054..11.992 rows=559 loops=1)
      Hash Cond: (trip_requests.id = trips.trip_request_id)
      -&amp;gt;  Hash  (cost=521.34..521.34 rows=524 width=48) (actual time=1.028..1.032 rows=559 loops=1)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here's a definition of the Hash Join operation from the &lt;a href="https://www.pgmustard.com/docs/explain/hash-join" rel="noopener noreferrer"&gt;pgMustard glossary&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;An implementation of join in which one of the collections of rows to be joined is hashed on the join keys using a separate 'Hash' node. PostgreSQL then iterates over the other collection of rows, for each one looking it up in the hash table to see if there are any rows it should be joined to.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;CAUTION:&lt;/strong&gt; When using joins, there should be an index on each foreign key column. In the example above, these are: &lt;code&gt;trips.trip_request_id&lt;/code&gt;, &lt;code&gt;trip_requests.start_location_id&lt;/code&gt;, &lt;code&gt;trip_requests.rider_id&lt;/code&gt;, and &lt;code&gt;trips.driver_id&lt;/code&gt;. If you used &lt;code&gt;references&lt;/code&gt; when generating the migration with Rails, or specified &lt;code&gt;t.references :some_model, foreign_key: true&lt;/code&gt; in the Rails migration file, then Rails (as of 5.2) will automatically generate an index on that column in addition to the foreign key constraint. In the Rideshare schema, all foreign keys are indexed.&lt;/p&gt;

&lt;p&gt;This version of the query is also fetching more data due to the &lt;code&gt;SELECT&lt;/code&gt; clause specifying all columns on all tables in the joins:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"trips.*, locations.*, drivers.*, riders.*"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Another way to improve performance is to reduce the "width" of the results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fourth Attempt: Reduce Columns
&lt;/h2&gt;

&lt;p&gt;The previous query selected all columns from all of the joined tables. For example, each result was including user details such as password digest and drivers license number. But recall the business requirements were that we only need to display a subset of user and trip information.&lt;/p&gt;

&lt;p&gt;Let's "narrow" the amount of data by restricting the columns in the &lt;code&gt;SELECT&lt;/code&gt; clause to only what we need. This also supports string concatenation to distinguish between driver and rider names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Trip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:recently_completed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;trip_request: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:start_location&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'INNER JOIN users AS riders ON riders.id = trip_requests.rider_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'INNER JOIN users AS drivers ON drivers.id = trips.driver_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'trips.completed_at &amp;gt; ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;week&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"trips.completed_at, trips.rating, locations.address,
               CONCAT(drivers.first_name, ' ', drivers.last_name) driver_name,
               CONCAT(riders.first_name, ' ', riders.last_name) rider_name"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;# ...&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As of Rails 7.1, the &lt;code&gt;select&lt;/code&gt; method can &lt;a href="https://blog.kiprosh.com/rails-7-1-allows-activerecord-querymethods-select-and-reselect-to-receive-hash-values/" rel="noopener noreferrer"&gt;accept a hash of columns and aliases&lt;/a&gt; rather than raw SQL when specifying columns from the join tables. However, if the &lt;code&gt;select&lt;/code&gt; uses SQL functions as in this case where concatenation is used to generate the driver and rider names, then raw SQL is still required.&lt;/p&gt;

&lt;p&gt;Running this query in the Rails console shows the attributes contain exactly what's needed to display in the view, with no extraneous data.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Trip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recently_completed&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attributes&lt;/span&gt;
&lt;span class="c1"&gt;# {&lt;/span&gt;
&lt;span class="c1"&gt;#   "completed_at" =&amp;gt; Wed, 10 Jan 2024 07:12:42.960652000 CST -06:00,&lt;/span&gt;
&lt;span class="c1"&gt;#   "rating" =&amp;gt; nil,&lt;/span&gt;
&lt;span class="c1"&gt;#   "address" =&amp;gt; "New York, NY",&lt;/span&gt;
&lt;span class="c1"&gt;#   "driver_name" =&amp;gt; "Rasheeda Brakus",&lt;/span&gt;
&lt;span class="c1"&gt;#   "rider_name" =&amp;gt; "Corina Gorczany",&lt;/span&gt;
&lt;span class="c1"&gt;#   "id" =&amp;gt; nil&lt;/span&gt;
&lt;span class="c1"&gt;# }&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Again we can view the query execution plan:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;driver_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;rider_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trip_request_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_location_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rider_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 12:27:19.062944'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

--                                         QUERY PLAN
-- -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
--  Hash Join  (cost=1398.27..2693.98 rows=516 width=88) (actual time=8.572..20.188 rows=559 loops=1)
--    Hash Cond: (trips.driver_id = drivers.id)
--    -&amp;gt;  Nested Loop  (cost=528.05..1817.24 rows=516 width=41) (actual time=1.246..12.363 rows=559 loops=1)
--          -&amp;gt;  Hash Join  (cost=527.76..1641.78 rows=516 width=32) (actual time=1.224..11.156 rows=559 loops=1)
--                Hash Cond: (trip_requests.start_location_id = locations.id)
--                -&amp;gt;  Hash Join  (cost=526.71..1636.51 rows=516 width=24) (actual time=1.191..10.985 rows=559 loops=1)
--                      Hash Cond: (trip_requests.id = trips.trip_request_id)
--                      -&amp;gt;  Seq Scan on trip_requests  (cost=0.00..917.10 rows=50010 width=16) (actual time=0.005..4.199 rows=50010 loops=1)
--                      -&amp;gt;  Hash  (cost=520.26..520.26 rows=516 width=24) (actual time=1.167..1.174 rows=559 loops=1)
--                            Buckets: 1024  Batches: 1  Memory Usage: 38kB
--                            -&amp;gt;  Bitmap Heap Scan on trips  (cost=12.29..520.26 rows=516 width=24) (actual time=0.138..1.060 rows=559 loops=1)
--                                  Recheck Cond: (completed_at &amp;gt; '2024-01-10 12:27:19.062944'::timestamp without time zone)
--                                  Heap Blocks: exact=328
--                                  -&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at  (cost=0.00..12.16 rows=516 width=0) (actual time=0.081..0.081 rows=559 loops=1)
--                                        Index Cond: (completed_at &amp;gt; '2024-01-10 12:27:19.062944'::timestamp without time zone)
--                -&amp;gt;  Hash  (cost=1.02..1.02 rows=2 width=20) (actual time=0.020..0.021 rows=2 loops=1)
--                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
--                      -&amp;gt;  Seq Scan on locations  (cost=0.00..1.02 rows=2 width=20) (actual time=0.012..0.013 rows=2 loops=1)
--          -&amp;gt;  Index Scan using users_pkey on users riders  (cost=0.29..0.34 rows=1 width=21) (actual time=0.002..0.002 rows=1 loops=559)
--                Index Cond: (id = trip_requests.rider_id)
--    -&amp;gt;  Hash  (cost=595.10..595.10 rows=22010 width=21) (actual time=7.296..7.296 rows=22010 loops=1)
--          Buckets: 32768  Batches: 1  Memory Usage: 1492kB
--          -&amp;gt;  Seq Scan on users drivers  (cost=0.00..595.10 rows=22010 width=21) (actual time=0.007..3.716 rows=22010 loops=1)
--  Planning Time: 1.649 ms
--  Execution Time: 20.553 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This time the performance is slightly improved:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Execution Time: 20.553 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It still needs to perform all the joins, but less overall data is being fetched from the database. This can be observed by the smaller &lt;code&gt;width&lt;/code&gt; attribute in the &lt;code&gt;Hash Join&lt;/code&gt; operations. The &lt;code&gt;width&lt;/code&gt; refers to the total number of bytes required to store the output of the join operation, and is proportional to the number of columns in the &lt;code&gt;SELECT&lt;/code&gt; clause.&lt;/p&gt;

&lt;p&gt;For example, focusing on the top level Hash Join operation, this query has a width of 88:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Hash Join  (cost=1398.27..2693.98 rows=516 width=88) (actual time=8.572..20.188 rows=559 loops=1)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whereas the &lt;a href="//../rails-query-perf#third-attempt-joins"&gt;previous version of the query&lt;/a&gt; where all columns from all tables were being retrieved shows an overall width of 453:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Hash Join  (cost=1399.44..2692.88 rows=524 width=453) (actual time=11.736..24.510 rows=559 loops=1)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Selecting only the necessary columns can lead to performance improvements including reduced disk I/O and smaller memory requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fifth Attempt: Reduce Rows
&lt;/h2&gt;

&lt;p&gt;Another way to improve performance is to reduce the total number of rows returned by the query with additional filters (i.e. SQL &lt;code&gt;WHERE&lt;/code&gt; clauses). This could require some discussion between engineering and product teams to make sure everyone understands the business problem that is being solved.&lt;/p&gt;

&lt;p&gt;In this case, it turns out the Rideshare admins want to focus only on recent trips with &lt;em&gt;low ratings&lt;/em&gt;, so they can investigate what happened on some of those trips to improve customer experience. They've defined a "low" rating as 3 or fewer stars, based on a 5 star rating system. Up until now, they've been sorting the results client side by rating, and investigating those.&lt;/p&gt;

&lt;p&gt;Given this better understanding of the requirements, another &lt;code&gt;WHERE&lt;/code&gt; condition can be added to the scope, to only retrieve trips where the rating is less than or equal to 3.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CAUTION:&lt;/strong&gt; When adding further &lt;code&gt;WHERE&lt;/code&gt; clauses to reduce the amount of data being returned, always check whether there's an index on the column you're planning to filter by. Otherwise, you could inadvertently make performance worse while trying to improve it! In this case, there already is an index on &lt;code&gt;trips.rating&lt;/code&gt;. We saw this &lt;a href="//../rails-query-perf#first-attempt"&gt;earlier&lt;/a&gt; when inspecting the table schema with &lt;code&gt;\d trips&lt;/code&gt; in the database console:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

=&amp;gt; \d trips
...
Indexes:
  "index_trips_on_rating" btree (rating)
  ...


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here is the modified scope adding an additional &lt;code&gt;WHERE&lt;/code&gt; clause on &lt;code&gt;trip.rating&lt;/code&gt; to only retrieve low ratings&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Trip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:recently_completed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;trip_request: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:start_location&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'INNER JOIN users AS riders ON riders.id = trip_requests.rider_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'INNER JOIN users AS drivers ON drivers.id = trips.driver_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'trips.completed_at &amp;gt; ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;week&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'trips.rating &amp;lt;= ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"trips.completed_at, trips.rating, locations.address,
        CONCAT(drivers.first_name, ' ', drivers.last_name) driver_name,
        CONCAT(riders.first_name, ' ', riders.last_name) rider_name"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This time, the results are "shaped" the same as before because the &lt;code&gt;SELECT&lt;/code&gt; columns are the same, but the total number of results is smaller. This also ensures the query only returns trips with ratings, as before, we were also getting results with &lt;code&gt;nil&lt;/code&gt; ratings, i.e. where the customer didn't leave to leave a rating.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Trip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recently_completed&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 78&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attributes&lt;/span&gt;
&lt;span class="c1"&gt;# {&lt;/span&gt;
&lt;span class="c1"&gt;#   "completed_at"=&amp;gt;Mon, 22 Jan 2024 07:07:11.312416000 CST -06:00,&lt;/span&gt;
&lt;span class="c1"&gt;#   "rating"=&amp;gt;2,&lt;/span&gt;
&lt;span class="c1"&gt;#   "address"=&amp;gt;"New York, NY",&lt;/span&gt;
&lt;span class="c1"&gt;#   "driver_name"=&amp;gt;"Bobbi Cronin",&lt;/span&gt;
&lt;span class="c1"&gt;#   "rider_name"=&amp;gt;"Anibal Hammes",&lt;/span&gt;
&lt;span class="c1"&gt;#   "id"=&amp;gt;nil&lt;/span&gt;
&lt;span class="c1"&gt;# }&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Explain/analyze the query:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;driver_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rider_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trip_request_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_location_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rider_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 13:08:58.257990'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=97.88..1340.17 rows=75 width=88) (actual time=0.514..1.937 rows=78 loops=1)
   Join Filter: (trip_requests.start_location_id = locations.id)
   Rows Removed by Join Filter: 78
   -&amp;gt;  Seq Scan on locations  (cost=0.00..1.02 rows=2 width=20) (actual time=0.012..0.015 rows=2 loops=1)
   -&amp;gt;  Materialize  (cost=97.88..1336.34 rows=75 width=42) (actual time=0.247..0.937 rows=78 loops=2)
         -&amp;gt;  Nested Loop  (cost=97.88..1335.96 rows=75 width=42) (actual time=0.488..1.819 rows=78 loops=1)
               -&amp;gt;  Nested Loop  (cost=97.59..862.59 rows=75 width=33) (actual time=0.463..1.492 rows=78 loops=1)
                     -&amp;gt;  Nested Loop  (cost=97.30..837.09 rows=75 width=24) (actual time=0.449..1.240 rows=78 loops=1)
                           -&amp;gt;  Bitmap Heap Scan on trips  (cost=97.01..298.02 rows=75 width=24) (actual time=0.436..0.645 rows=78 loops=1)
                                 Recheck Cond: ((completed_at &amp;gt; '2024-01-10 13:08:58.25799'::timestamp without time zone) AND (rating &amp;lt;= 3))
                                 Heap Blocks: exact=72
                                 -&amp;gt;  BitmapAnd  (cost=97.01..97.01 rows=75 width=0) (actual time=0.419..0.421 rows=0 loops=1)
                                       -&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at  (cost=0.00..12.05 rows=501 width=0) (actual time=0.083..0.083 rows=559 loops=1)
                                             Index Cond: (completed_at &amp;gt; '2024-01-10 13:08:58.25799'::timestamp without time zone)
                                       -&amp;gt;  Bitmap Index Scan on index_trips_on_rating  (cost=0.00..84.67 rows=7518 width=0) (actual time=0.314..0.314 rows=7531 loops=1)
                                             Index Cond: (rating &amp;lt;= 3)
                           -&amp;gt;  Index Scan using trip_requests_pkey on trip_requests  (cost=0.29..7.19 rows=1 width=16) (actual time=0.007..0.007 rows=1 loops=78)
                                 Index Cond: (id = trips.trip_request_id)
                     -&amp;gt;  Index Scan using users_pkey on users riders  (cost=0.29..0.34 rows=1 width=21) (actual time=0.003..0.003 rows=1 loops=78)
                           Index Cond: (id = trip_requests.rider_id)
               -&amp;gt;  Memoize  (cost=0.30..6.55 rows=1 width=21) (actual time=0.004..0.004 rows=1 loops=78)
                     Cache Key: trips.driver_id
                     Cache Mode: logical
                     Hits: 8  Misses: 70  Evictions: 0  Overflows: 0  Memory Usage: 9kB
                     -&amp;gt;  Index Scan using users_pkey on users drivers  (cost=0.29..6.54 rows=1 width=21) (actual time=0.003..0.003 rows=1 loops=70)
                           Index Cond: (id = trips.driver_id)
 Planning Time: 1.168 ms
 Execution Time: 2.360 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This time the query execution time is just over 2ms, a significant improvement over the previous ~20ms. This is due to the additional filter condition &lt;code&gt;(rating &amp;lt;= 3)&lt;/code&gt; that reduces the number of rows that need to be processed in join operations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Execution Time: 2.360 ms


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;a href="https://www.pgmustard.com/docs/explain/bitmap-and" rel="noopener noreferrer"&gt;BitmapAnd&lt;/a&gt; operation shows that both the indexes on &lt;code&gt;completed_at&lt;/code&gt; and &lt;code&gt;rating&lt;/code&gt; are being used. This operation combines bitmaps generated by Bitmap Index Scans, allowing information from more than one index to be quickly aggregated together.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

-&amp;gt;  BitmapAnd  (cost=97.01..97.01 rows=75 width=0) (actual time=0.419..0.421 rows=0 loops=1)
  -&amp;gt;  Bitmap Index Scan on index_trips_on_completed_at  (cost=0.00..12.05 rows=501 width=0) (actual time=0.083..0.083 rows=559 loops=1)
        Index Cond: (completed_at &amp;gt; '2024-01-10 13:08:58.25799'::timestamp without time zone)
  -&amp;gt;  Bitmap Index Scan on index_trips_on_rating  (cost=0.00..84.67 rows=7518 width=0) (actual time=0.314..0.314 rows=7531 loops=1)
        Index Cond: (rating &amp;lt;= 3)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Visualize Query Plan
&lt;/h2&gt;

&lt;p&gt;Understanding query plans can be challenging, especially when dealing with complex queries. However, there's a free web-based tool that can help make sense of them: &lt;a href="https://explain.dalibo.com/" rel="noopener noreferrer"&gt;explain.dalibo.com&lt;/a&gt;. Here's how to use it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prepare Your Query&lt;/strong&gt;: Start by putting your query into a file. For instance, you can create a directory called &lt;code&gt;queries&lt;/code&gt; and save your SQL query in a file within it. In addition to the &lt;code&gt;ANALYZE&lt;/code&gt; argument that we've been using, also pass in &lt;code&gt;COSTS, VERBOSE, BUFFERS, FORMAT JSON&lt;/code&gt; to &lt;code&gt;EXPLAIN&lt;/code&gt; as shown below. This will include additional information in the query plan output, such as the amount of disk data that was fetched, and format the results as JSON:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;

&lt;span class="c1"&gt;-- queries/recently_completed_trips.sql&lt;/span&gt;
&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COSTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;VERBOSE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BUFFERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FORMAT&lt;/span&gt; &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;driver_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rider_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trip_request_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_location_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;riders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trip_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rider_id&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;driver_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-10 13:08:58.257990'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trips&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Run the Query&lt;/strong&gt;: Execute the query from your terminal using &lt;a href="https://www.postgresql.org/docs/current/app-psql.html" rel="noopener noreferrer"&gt;psql&lt;/a&gt;, ensuring to redirect the output to a file. The &lt;code&gt;-XqAt&lt;/code&gt; flags make the output machine-readable:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

psql &lt;span class="nt"&gt;-h&lt;/span&gt; 127.0.0.1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 5432 &lt;span class="nt"&gt;-U&lt;/span&gt; owner &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; rideshare_development &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-XqAt&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; queries/recently_completed_trips.sql &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; queries/fifth.json
&lt;span class="c"&gt;# copy the contents of `queries/recently_completed_trips.json`&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Visualize Your Plan&lt;/strong&gt;: Visit &lt;a href="https://explain.dalibo.com/" rel="noopener noreferrer"&gt;explain.dalibo.com&lt;/a&gt; and paste the generated plan text and query. Then, hit Submit. The tool will generate a visualization of your query plan. Here's an example of the visualization for the fifth attempt version of the query from this post. It shows the different types of scans that were used and how the data gets combined. The duration of each operation is also shown:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazcw8f935wq9hv2dxyim.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazcw8f935wq9hv2dxyim.png" alt="Fifth attempt visualized"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also click on the down caret icon beside any node to get more details about that operation. For example, focusing on the &lt;code&gt;Bitmap Index Scan&lt;/code&gt; for use of the rating index on the &lt;code&gt;trips&lt;/code&gt; table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmrr8we5q6lq0gmqfqtx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmrr8we5q6lq0gmqfqtx.png" alt="One explain node expanded"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this tool, deciphering the details of PostgreSQL query plans becomes a little more approachable. Just be mindful that using it involves sharing your query and its plan publicly. If privacy is a concern regarding sharing your query and its plan publicly, consider using &lt;a href="https://www.pgadmin.org/" rel="noopener noreferrer"&gt;pgAdmin&lt;/a&gt;, an open-source administrative tool for PostgreSQL that you can install on your local machine, which also offers a visualization feature for query plans.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This post has covered techniques to enhance PostgreSQL query performance in Rails applications. We began by examining a query execution plan using the &lt;code&gt;EXPLAIN (ANALYZE)&lt;/code&gt; feature to understand how PostgreSQL processes queries. We then explored indexing, covering the addition of an index for performance improvements and how to do so without causing downtime or table locking.&lt;/p&gt;

&lt;p&gt;Further into query optimization, we discussed the importance of careful column selection in the &lt;code&gt;SELECT&lt;/code&gt; statement, demonstrating how narrowing down the retrieved data can improve performance. We then covered aligning database queries with business needs, illustrating how additional filters can reduce the number of processed rows and enhance scalability. Finally we learned how to visualize query plans with a free web-based tool. These strategies provide developers with the tools to create efficient and resilient Rails applications with PostgreSQL.&lt;/p&gt;

&lt;p&gt;This post was originally published at: &lt;a href="https://danielabaron.me/blog/rails-query-perf/" rel="noopener noreferrer"&gt;https://danielabaron.me/blog/rails-query-perf/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>rails</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
