Common Table Expressions (CTEs) are underrated compared to the subquery. Here’s why I prefer to use CTEs when building SQL queries
In the example below I’ve used a CTE to:
- get all the data I need from the Email Sent table, told SQL to hold that in memory,
- then get what I need from the Unsubscribe table, told SQL to hold that in memory too,
- then joined them together with flags from the Customers table. Where you see a WITH, is the CTE starting and then I’m naming them 'sent' and ‘unsubs’ before starting to tell the CTE what I want to return:
; with sent as -- start the CTE with a semicolon to terminate anything above (select emailaddress, emailid, senddate from marketing.emailsent where senddate between '2018-01-01' and '2018-01-31' ), -- add a comma if you need to add a subsequent CTE unsubs as (select emailaddress, emailid, senddate from marketing.emailunsubscribe where senddate between '2018-01-01' and '2018-01-31' ) -- no comma for the last CTE select 'January' as monthdelivered, c.country, count(distinct sent.emailaddress) as countofdelivered, count(distinct unsubs.emailaddress) as countofunsubd from sent left join marketing.customers c on sent.email = unsubs.emailaddress left join unsubs on sent.email = unsubs.email and sent.emailid = unsubs.emailid
When I do the final 'join everything together' part I’m joining fields from the ‘sent’ dataset such as 'sent.email'.
The way I get my head around reading a subquery is by reading from the inside out. It's nesting everything you need together, but in my opinion, it tends to get ugly really quickly.
- The first step is the query in the centre starting 'SELECT accountID …' to get all orders greater than 30 from the "orderhistory" table.
- Then JOIN on the "account" table to look up which Accounts were from New Zealand.
- Then the top SELECT runs to return all the fields from the ord dataset and the three columns I want to see from the "account" table.
select ord.*, acc.country acc.city acc.createddate from ( select accountid, ordered, ordervalue from sales.orderhistory where ordervalue > '30' ) ord join sales.account acc on ord.accountid = acc.accountid where acc.country = 'New Zealand'
You can use them multiple times throughout your script and they are readable, you can return what you need then reference it later.
If you don’t have write permissions this may not be possible. And if it’s only used for this query your DBA might not be thrilled with you creating one-off tables.
CTEs don’t last forever and can only be used in the query you’re currently in, unlike temp tables or views that can survive outside the current script.
The database will always decide the best way to execute your query. If you ask your friendly DBA which strategy to use, they can fill you in on which strategy works best for your database. The CTE is all about readability, so if it works for you give it a try.
This post originally appeared on helenanderson.co.nz