Akash Kava

Posted on Jun 24, 2019

Improving SQL Query by Adding conditions in Joins

#sql #joins #performance #showdev

Recently I was trying to optimize a query. This was the query:

SELECT ... 
FROM A
INNER JOIN B
   ON B.ID = A.ID
WHERE B.Status = 'Done' 
   AND B.DateCreated BETWEEN @Start AND @End
   AND NOT EXISTS (...)
   AND A. ... other conditions

Table B is a huge table, and the query took a couple of seconds, even though there was an index on (ID, Status, DateCreated).

When I changed the query to:

SELECT ... 
FROM A
INNER JOIN B
   ON B.ID = A.ID AND B.Status = 'Done'
   AND B.DateCreated BETWEEN @Start AND @End
   AND NOT EXISTS (...)
WHERE A. ... other conditions

Surprisingly, the query took only a few milliseconds. Upon further investigation, I found that moving NOT EXISTS inside the JOIN improved the speed.

Top comments (11)

Evaldas Buinauskas • Jun 24 '19

You were just lucky. The query optimizer has chosen a non-cached version of the plan to execute this query.

Akash Kava • Jun 24 '19

I don't think that was the case. The execution plans for the two queries are different. Also, these queries were executing frequently, and I saw the difference even when executing them in parallel (half of the queries the old way and half the new way, simultaneously).

Evaldas Buinauskas • Jun 24 '19

The plans obviously have to be different.

I see this is SQL Server, and it will cache a plan for the exact query text you run. By moving the date-time clause to the join condition, you force a new plan to be generated.

You should be able to get the same results by forcing plan recompilation.
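
One way to try this in SQL Server is the `OPTION (RECOMPILE)` query hint, which asks the optimizer to compile a fresh plan for that execution instead of reusing a cached one. The sketch below is only an illustration: the column list and the parameter values are hypothetical, and the elided `NOT EXISTS (...)` and other conditions from the post are left out because their contents were never shown.

```sql
-- Hypothetical parameter values; the post does not say what @Start and @End were.
DECLARE @Start datetime = '2019-06-01',
        @End   datetime = '2019-06-24';

-- Original WHERE-based form of the query, with a hypothetical column list.
SELECT A.ID
FROM A
INNER JOIN B
   ON B.ID = A.ID
WHERE B.Status = 'Done'
   AND B.DateCreated BETWEEN @Start AND @End
OPTION (RECOMPILE);  -- compile a new plan for this execution, ignoring any cached plan
```

If the WHERE-based form becomes as fast as the JOIN-based form once the hint is added, that would point to a stale cached plan rather than to the placement of the conditions.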

Akash Kava • Jun 24 '19

I looked at my query again and found that I had an extra NOT EXISTS clause, which was causing the difference in speed and plans. With NOT EXISTS omitted, there was no difference in speed between the two versions.

Jared Karney • Jun 24 '19

That's not luck. It would absolutely produce a new plan, because the text changed. Even adding a space or making a letter uppercase will generate a new plan.

Evaldas Buinauskas • Jun 24 '19 • Edited

Yes, this is correct.

I should have been clearer that moving a clause from WHERE to JOIN does not improve performance on its own; the newly generated plan did. Thanks for bringing this up!

Cade Roux • Jun 25 '19

In any decent optimizer, INNER JOIN and WHERE conditions are basically interchangeable.

Also, I'm suspicious of any index that starts with ID, because IDs are highly selective from the start.

I would like to see the execution plans of the two different queries.
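
For anyone who wants to capture that comparison in SQL Server, here is a minimal sketch using the standard `SET STATISTICS` session options. The two query variants themselves are not repeated here, since parts of them are elided in the post; they would be run where the placeholder comments are.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Return the actual execution plan as XML alongside each result set.
SET STATISTICS XML ON;

-- Run the WHERE-based variant of the query here.
-- Run the JOIN-based variant of the query here.

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the two plan XML documents and the read counts side by side shows whether the optimizer actually chose different operators or access paths for the two forms.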

Nathan S.R. • Dec 25 '23

Thanks for this very useful post. I just wanted to add that there is now a very easy way to test all the SQL described here, using the free and portable tools mentioned in my latest post here: dev.to/linuxguist/learn-sql-quickl...

Jared Karney • Jun 24 '19 • Edited

You also have to look at the possible 'reason for early termination' in the plan. If the plan could not be fully optimized and has a reason for early termination, it may have executed serially, and in such cases you may get a better plan with the filter in the join. However, your real problem is that the optimizer isn't able to complete. Potential causes are complex queries or auto-updating statistics, to name a couple.
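
As a rough sketch of how to look for this in SQL Server, the query below searches cached plans for the `StatementOptmEarlyAbortReason` attribute in the showplan XML, which is where this reason is recorded as far as I know. The `TOP (20)` limit and the column aliases are arbitrary choices for illustration, and scanning the whole plan cache can be slow on a busy server.

```sql
-- Find cached statements whose optimization ended early, and show the recorded reason.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text AS query_text,
    qp.query_plan.value(
        '(//StmtSimple/@StatementOptmEarlyAbortReason)[1]',
        'varchar(50)') AS early_abort_reason
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan.exist('//StmtSimple[@StatementOptmEarlyAbortReason]') = 1;
```

The same attribute is also visible in the properties of the root operator when an actual plan is opened in SQL Server Management Studio.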

rhymes • Jun 24 '19

Can you share the execution plans?

alexander-suprun • Jun 24 '19

That's just a very useless observation and doesn't prove anything.
Have you figured out why exactly this happened? What's in the NOT EXISTS condition? What are the execution plans?
