DEV Community

fabriciomsdev
fabriciomsdev

Posted on

If you see that a SQL SELECT query is slow - what would you do to improve it?

If a SQL SELECT query is slow, there are some simple steps we can take to identify the cause and improve its performance.

I will explain the above step by step:

1 Step - First see the plan for the execution of the query:

To do that you can use EXPLAIN or EXPLAIN ANALYZE to understand how this query is working behind the scenes.

For example, I built a simple book stores system, with 3 entities, Books, Authors, and Publisher, I ran a simple query with a merge between Books and Authors to show how it works

Image description

As you can see in the table Result 1 this command can show the query execution plan with all process in the query running it can be used to see the bottlenecks in the process, I will proposital add a bottleneck in the process for example adding a table in the query merge without foreign keys.

CREATE TABLE books_publisher (
    publisher_id SERIAL PRIMARY KEY,
    publisher_name VARCHAR(100) NOT NULL
);

ALTER TABLE books_book
ADD COLUMN publisher_id INTEGER;

INSERT INTO books_publisher (publisher_name) VALUES ('Penguin');
INSERT INTO books_publisher (publisher_name) VALUES ('Harper Collins');
INSERT INTO books_publisher (publisher_name) VALUES ('Oxford University Press');
INSERT INTO books_publisher (publisher_name) VALUES ('Pearson');

UPDATE books_book
SET publisher_id = (SELECT publisher_id FROM books_publisher ORDER BY RANDOM() LIMIT 1);
Enter fullscreen mode Exit fullscreen mode

Now I run the query doing a join with publisher:

Image description

Note: The time of execution is 11.180 ms

2 Step - Verify foreign keys and indexes:

I will add a foreign key index to improve the algorithm when we run the query:

ALTER TABLE books_book
ADD CONSTRAINT fk_publisher
FOREIGN KEY (publisher_id)
REFERENCES books_publisher(publisher_id);

CREATE INDEX idx_publisher_id
ON books_book (publisher_id);

CREATE INDEX books_book_btree_publisher_id ON public.books_book USING btree (publisher_id);
Enter fullscreen mode Exit fullscreen mode

After improving in DDL structure:

Image description

Note: The time of execution is 4.168 ms

3 Step - Verify what business needs, and filter data:

For example to count how many books a publisher send for each author we should do the query:

select
    books_publisher.publisher_name,
    books_author."name"  as author,
    count(*) qty
from books_book
inner join books_author 
on books_book.author_id = books_author.id
inner join books_publisher 
on books_publisher.publisher_id = books_book.publisher_id
group by books_publisher.publisher_name, books_author."name";
Enter fullscreen mode Exit fullscreen mode

This query took:

Image description

We can remove the books where we did not have publishers to avoid unnecessary loops I will add a filter:

  where books_book.publisher_id is not null
Enter fullscreen mode Exit fullscreen mode

It is the result:

Image description

It is a very small example, on the large datasets other things we can do are:

  • remove necessary data from attribute selection
  • do views or materialized views with your subqueries to preprocess data
  • organize data to follow normalization rules or unfollow (it can work too), depending on the kind of feature you want to provide to your customer
    • Verify resources of the database, such as: connections opened, connections idle, CPU and RAM use.

Sentry image

See why 4M developers consider Sentry, “not bad.”

Fixing code doesn’t have to be the worst part of your day. Learn how Sentry can help.

Learn more

Top comments (0)

A Workflow Copilot. Tailored to You.

Pieces.app image

Our desktop app, with its intelligent copilot, streamlines coding by generating snippets, extracting code from screenshots, and accelerating problem-solving.

Read the docs

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay