If someone suggests rewriting your SQL query to use EXISTS instead of an IN subquery, look at the execution plan. Below are two examples illustrating uncorrelated and correlated subqueries.
Example
I have two tables: "users" and "messages."
A message originates from a user, so we find the "user_id" in both as a primary key for "users" and a foreign key for "messages." It can be null when a message has no author. Users can belong to a country ("user_country"), and messages can originate from a country ("message_country"):
create extension if not exists pgcrypto;
create table users (
user_id uuid primary key default gen_random_uuid()
, user_country text
, active boolean default true
);
create table messages (
message_id uuid primary key default gen_random_uuid()
, user_id uuid references users
, message_country text
);
insert into users (active, user_country)
select
random()>0.1
,(ARRAY['CH','FR','ES'])[3*random()]
from generate_series(1,10)
;
insert into messages (user_id,message_country)
select
case when random()>0.1 then user_id end
,(ARRAY['CH','FR','ES'])[3*random()]
from users, generate_series(1,1000)
;
create index on users(user_country);
create index on messages(message_country asc, user_id asc);
analyze users, messages;
Uncorrelated Subquery
An uncorrelated subquery doesn't need to be executed for each row where it is evaluated because it does not reference the columns of those rows and will always produce the same result.
The following query lists the messages for which the user is active. It selects the active users in a subquery that returns the "user_id", and uses it in a where clause with IN ()
, = ANY ()
or = SOME ()
(those are equivalent):
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where user_id in (
select user_id
from users u
where active
)
;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=3.013..9.913 rows=8091 loops=1)
Hash Cond: (m.user_id = u.user_id)
-> Seq Scan on messages m (actual time=1.248..6.190 rows=10000 loops=1)
Storage Table Read Requests: 12
Storage Table Read Execution Time: 4.078 ms
Storage Table Rows Scanned: 10000
-> Hash (actual time=1.745..1.745 rows=9 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on users u (actual time=1.731..1.734 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.884 ms
Storage Table Rows Scanned: 10
Nine rows have been read from "users" and joined to "messages".
The same could have been written as a join:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
join users u
on (m.user_id=u.user_id)
where u.active
;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=2.469..9.494 rows=8091 loops=1)
Hash Cond: (m.user_id = u.user_id)
-> Seq Scan on messages m (actual time=1.274..6.371 rows=10000 loops=1)
Storage Table Read Requests: 12
Storage Table Read Execution Time: 4.188 ms
Storage Table Rows Scanned: 10000
-> Hash (actual time=1.178..1.179 rows=9 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on users u (actual time=1.127..1.131 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 1.052 ms
Storage Table Rows Scanned: 10
It can also be written with EXISTS:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where exists (
select from users u
where
m.user_id = u.user_id
and active
)
;
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (actual time=1.875..8.790 rows=8091 loops=1)
Hash Cond: (m.user_id = u.user_id)
-> Seq Scan on messages m (actual time=1.154..6.126 rows=10000 loops=1)
Storage Table Read Requests: 12
Storage Table Read Execution Time: 4.002 ms
Storage Table Rows Scanned: 10000
-> Hash (actual time=0.706..0.706 rows=9 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on users u (actual time=0.692..0.695 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.621 ms
Storage Table Rows Scanned: 10
The query planner transformed the subquery to a join in all cases, and they performed the same way.
Correlated Subquery
I'm adding a filter: the messages must originate from the user's country. As it filters to result in fewer rows, we could expect a lower response time, but this query takes five seconds instead of eight milliseconds:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where user_id in (
select user_id
from users u
where active
and u.user_country = m.message_country
)
;
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on messages m (actual time=4.163..5657.120 rows=2084 loops=1)
Filter: (SubPlan 1)
Rows Removed by Filter: 7916
SubPlan 1
-> Seq Scan on users u (actual time=0.559..0.560 rows=2 loops=10000)
Storage Filter: (active AND (user_country = m.message_country))
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.501 ms
Storage Table Rows Scanned: 11
The high response time results from loops=10000
: the subquery has been executed for each row to evaluate the new filter involving both tables.
A Filter with a subquery, indicated by Filter: (SubPlan 1)
, is similar to a nested loop join. The loops
is the indicator of the time complexity.
The PostgreSQL query planner couldn't transform this as a join because the subquery is correlated: one of the join conditions is in the query, while the other is in the projection to the IN clause.
It is not better if I revert the two predicates:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where message_country in (
select user_country
from users u
where u.user_id=m.user_id
and active
)
;
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on messages m (actual time=4.835..5720.221 rows=2084 loops=1)
Filter: (SubPlan 1)
Rows Removed by Filter: 7916
SubPlan 1
-> Seq Scan on users u (actual time=0.566..0.567 rows=1 loops=10000)
Storage Filter: (active AND (user_id = m.user_id))
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.507 ms
Storage Table Rows Scanned: 11
The query planner wasn't able to transform this into a join, but I can rewrite the query using a join:
yugabyte=# explain (analyze, dist, costs off, summary off)
select *
from messages m
join users u on (
m.user_id=u.user_id
and
m.message_country=u.user_country
and u.active
) ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
YB Batched Nested Loop Join (actual time=4.527..9.453 rows=2084 loops=1)
Join Filter: ((m.user_id = u.user_id) AND (m.message_country = u.user_country))
-> Seq Scan on users u (actual time=0.790..0.794 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.691 ms
Storage Table Rows Scanned: 10
-> Index Scan using messages_message_country_user_id_idx on messages m (actual time=3.544..7.153 rows=2084 loops=1)
Index Cond: (ROW(message_country, user_id) = ANY (ARRAY[ROW(u.user_country, u.user_id), ROW($1, $1025), ROW($2, $1026), ..., ROW($1023, $2047)]))
Storage Table Read Requests: 3
Storage Table Read Execution Time: 4.701 ms
Storage Table Rows Scanned: 2084
Storage Index Read Requests: 3
Storage Index Read Execution Time: 0.831 ms
Storage Index Rows Scanned: 2084
This plan is efficient because it joins the small table ("users") to the large one ("message") and can push down all the join predicates. In PostgreSQL, this would show nine loops (loops=9
), but YugabyteDB optimizes it further to one loop only with Batched Nested Loop so that the nine values from the outer table are pushed down as an array.
In this case, it is also possible to de-correlate the subquery, having all join columns in the subquery projection:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where ( m.message_country,m.user_id) in (
select u.user_country, u.user_id
from users u
where active
)
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
YB Batched Nested Loop Join (actual time=5.342..10.323 rows=2084 loops=1)
Join Filter: ((m.message_country = u.user_country) AND (m.user_id = u.user_id))
-> Seq Scan on users u (actual time=0.929..0.933 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.830 ms
Storage Table Rows Scanned: 10
-> Index Scan using messages_message_country_user_id_idx on messages m (actual time=4.231..7.894 rows=2084 loops=1)
Index Cond: (ROW(message_country, user_id) = ANY (ARRAY[ROW(u.user_country, u.user_id), ROW($1, $1025), ROW($2, $1026), ..., ROW($1023, $2047)]))
Storage Table Read Requests: 3
Storage Table Read Execution Time: 5.359 ms
Storage Table Rows Scanned: 2084
Storage Index Read Requests: 3
Storage Index Read Execution Time: 0.840 ms
Storage Index Rows Scanned: 2084
Another possibility is to use EXISTS and have all columns in the correlated subquery:
yugabyte=# explain (analyze, dist, costs off, summary off)
select m.*
from messages m
where exists (
select from users u
where
m.user_id = u.user_id
and
m.message_country=u.user_country
and u.active
)
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
YB Batched Nested Loop Join (actual time=4.356..10.501 rows=2084 loops=1)
Join Filter: ((m.user_id = u.user_id) AND (m.message_country = u.user_country))
-> Seq Scan on users u (actual time=0.767..0.770 rows=9 loops=1)
Storage Filter: active
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.658 ms
Storage Table Rows Scanned: 10
-> Index Scan using messages_message_country_user_id_idx on messages m (actual time=3.409..8.277 rows=2084 loops=1)
Index Cond: (ROW(message_country, user_id) = ANY (ARRAY[ROW(u.user_country, u.user_id), ROW($1, $1025), ROW($2, $1026), ..., ROW($1023, $2047)]))
Storage Table Read Requests: 3
Storage Table Read Execution Time: 5.606 ms
Storage Table Rows Scanned: 2084
Storage Index Read Requests: 3
Storage Index Read Execution Time: 0.790 ms
Storage Index Rows Scanned: 2084
Note that the column used in the subquery is declared unique, which is the primary key in this example, and that's the reason why a join is equivalent. If this were not the case, you cannot replace the subquery with a join clause because it would result in one row for each matching row from the subquery. The IN() or EXISTS() clauses are semi-joins and there's no semi-join clause.
You can test it with:
alter table messages drop constraint messages_user_id_fkey;
alter table users drop constraint users_pkey;
create index on users ( user_id );
analyze users;
You can run the query above (with EXISTS) and test all join methods with hints
Nested Loop will duplicate the rows from the subquery with HashAggregate
:
YB Batched Nested Loop Join (actual time=37.856..43.425 rows=2084 loops=1)
Join Filter: ((m.user_id = u.user_id) AND (m.message_country = u.user_country))
-> HashAggregate (actual time=0.926..0.928 rows=9 loops=1)
Group Key: u.user_id, u.user_country
-> Seq Scan on users u (actual time=0.913..0.916 rows=9 loops=1)
Storage Filter: active
Merge Join (/*+ MergeJoin(u m) */
) and Hash Join (/*+ HashJoin(u m) */
) shows the semi-join:
Merge Semi Join (actual time=12.213..33.015 rows=2084 loops=1)
Merge Cond: ((m.message_country = u.user_country) AND (m.user_id = u.user_id))
-> Index Scan using messages_message_country_user_id_idx on messages m (actual time=3.746..30.814 rows=8322 loops=1)
or
Hash Semi Join (actual time=2.113..9.574 rows=2084 loops=1)
Hash Cond: ((m.user_id = u.user_id) AND (m.message_country = u.user_country))
-> Seq Scan on messages m (actual time=1.194..7.073 rows=10000 loops=1)
The PostgreSQL query planner can transform some correlated subqueries into a join and benefit from more join methods, but not in all cases. When you have any doubt, look at the execution plan.
When you rewrite a query, you expect the same result. The execution plan can help validate that, especially with a reproducible test case. Include all possibilities in your test case, such as null values and integrity constraints. And remember that this behavior may evolve with new database engine versions.
Not all databases are equal regarding optimizer transformations. Chris Antognini has compared MySQL, PostgreSQL, and Oracle with different variations of subqueries: https://antognini.ch/2017/12/how-well-a-query-optimizer-handles-subqueries.
In PostgreSQL or YugabyteDB, do not use IN()
, =ANY
, or =SOME
for correlated subqueries where the outer table has many rows because it may run the subquery for each row. Replace it with EXISTS to provide a semi-join for the query planner to choose a better join method.
Top comments (0)