DEV Community

Julien Dephix
Julien Dephix

Posted on • Updated on

Quiz: SQL query

Hello, coders! 💻

Today's quiz taken from a real world task is SQL related.

Database structure

There are two tables (very simplified for the sake of this quiz):

  1. companies with two columns: id and name
  2. users with two columns: company_id and is_pro (boolean)

Task

write a query to find all companies that only have pro users (is_pro=1).

My take on this

Click to see my solution using MySQL
SELECT
    c.name as company
FROM
    users u
JOIN companies c ON
    c.id = u.company_id
GROUP BY
    c.id
HAVING
    SUM(IF(is_pro, 0, 1)) = 0;
Enter fullscreen mode Exit fullscreen mode

IF(is_pro, 0, 1) means that I give 0 points to users that are pros (counter-intuitive but bear with me!). If all users are pros then the sum of points should be 0, right? That's what HAVING IF(is_pro, 0, 1)) = 0 does.

There you have it!

How would you have done it?

📝 Edit
As pointed out by @geraldew using SUM to COUNT is hacky and as such I'd better use COUNT and CASE:

SELECT
    c.name as company,
    COUNT(
        CASE WHEN can_use_app = 1 THEN 'x'
    END
    ) isPro,
    COUNT(
        CASE WHEN can_use_app = 0 OR can_use_app IS NULL THEN 'x'
    END
    ) isNotPro
FROM
    users u
JOIN companies c ON
    c.id = u.company_id
GROUP BY
    c.id
HAVING
    isPro > 0 and isNotPro = 0;
Enter fullscreen mode Exit fullscreen mode

Happy coding! ⌨️

Top comments (7)

Collapse
 
geraldew profile image
geraldew

I don't know which SQL dialects have such an IF function (none that I use do), so my question is: what happens when is_pro is null?

Collapse
 
joolsmcfly profile image
Julien Dephix • Edited

What dialect do you use? IF has been available in MySQL since at least v5.1(released in 2008!).
I should have mentioned MySQL in my solution, I'll edit.

in SQL Server you have IIF for instance.

If is_pro is null then it's not considered truthy so IF will return 1, which is what we want.

mysql> select IF(null, 0, 1);
+----------------+
| IF(null, 0, 1) |
+----------------+
|              1 |
+----------------+
Enter fullscreen mode Exit fullscreen mode

Official MySQL documentation on IF

Collapse
 
geraldew profile image
geraldew

Well my question comes more from the fact that there is no IF in the SQL standard, so any article not clearly saying it is about a particular dialect perhaps should not use it.

The standard has the CASE expression for that purpose (note: expression, not function).

But to answer your question, most of my work has been with Teradata, HiveQL, Impala and occasionally PostgreSQL and SQLite. In some of these, there is an "IF" construct but it is used outside of SQL statements to provide a kind of conditional execution that standard SQL itself lacks.

An obvious caveat is to say that all vendors vary the dialect and that's probably a good thing overall and has been part of the evolution of SQL.

The issue of what functions should do when encountering a Null is .. debatable - and really is a whole other topic. My personal preference is for Nulls to cascade except where a construct is explicit about handling them.

Similarly I don't ever write SQL code that counts by summing the number 1 - even though I see that done a lot. When I'm investigating problems, code that does this is an immediate red flag.

Thread Thread
 
geraldew profile image
geraldew

And having said all that, I probably should back that up with code. A first cut of how I'd tackle it (i.e. just air code as I'm now at home) would be:

SELECT
    c.id ,
    c.name AS company ,
    COUNT( CASE WHEN u.is_pro THEN 'x' END ) AS IsPro_Cnt ,
    COUNT( CASE WHEN NOT u.is_pro THEN 'x' END ) AS NotPro_Cnt ,
    COUNT( CASE WHEN u.is_pro IS NULL THEN 'x' END ) AS NullPro_Cnt
FROM
    companies AS c 
    INNER JOIN 
    users AS u ON
        u.company_id = c.id
GROUP BY
    c.id
HAVING
    IsPro_Cnt > 0
    AND
    NotPro_Cnt = 0
    /* depending on what we want to do about Nulls 
        for is_pro we might also make use of ..
        AND
        NullPro_Cnt = 0 */
;
Enter fullscreen mode Exit fullscreen mode

Where for clarity I've shown the subcounts as columns that are then quoted by name in the HAVING clause. As each is only used once, they could just be expressed in the HAVING clause and thus not output as columns.

  • Depending on context, such re-arrangements aren't needed - e.g. if used in defining a View it then makes no difference to a query planner when you don't ever use those named columns outside of the view definition.

Also, partly to make it clear - both to myself and a later reader of the code - I've put in the parts that would help deal with Nulls. This particularly helps when other requirements might prompt a change from using an INNER JOIN to using a LEFT OUTER JOIN, in which case you really do get a Null for the is_pro column via non-matches of the outer join. All the more reason to always be explicit about Null handling with SQL.

p.s. FWIW I don't personally find my THEN 'x' to be elegant but any non-null value will do there. I don't use the number 1 precisely to avoid confusions with places where people might use the SUM of 1 approach.

Thread Thread
 
joolsmcfly profile image
Julien Dephix

Thanks for your valuable input.

It's indeed more portable to use COUNT in conjunction with CASE. I wrote that query as a one-off, for stats, after wondering how many companies only had pro users. It's not production code. That being said, I'll keep COUNT and CASE in mind!

In my use case, NULL should be treated the same as 0 so I would write it as:

COUNT(
    CASE WHEN can_use_app = 0 OR can_use_app IS NULL THEN 'x'
END
) isNotPro
Enter fullscreen mode Exit fullscreen mode
Thread Thread
 
geraldew profile image
geraldew

Ah yes, because you said is_pro was Boolean I just used that. Which is ironic as many data engines and their SQL dialects don't have a Boolean type at all, including the ones I mostly use. The joys of a language standard that is largely observed in the breach.

Thread Thread
 
joolsmcfly profile image
Julien Dephix

Yeah by Boolean I meant 0 or 1 (up to you to decide if null is allowed or not).
Boolean type in MySQL is sugar syntax for TINYINT(1).

I’ll be more precise next time!