In SQL, the HAVING and WHERE clauses have a similar function with a key difference. Both functions allow a user to filter data with respect to a certain condition. The difference between the two clauses has to do with when each is used. Basically, the WHERE can only be used in non-aggregated data (ie. data that has not been aggregated by GROUP BY). On the other hand, HAVING is used after a GROUP BY. Let's walk through some examples.
WHERE
As mentioned earlier, WHERE is used on data before and aggregation/filtering is done.
SELECT Employee_Name, Employee_ID
FROM Employee
WHERE Employee_Age > 30
The code above selects the names and ID's of all employees over the age of 30. Note that the WHERE clause is used on non-aggregated data.
HAVING (and GROUP BY)
HAVING is often used following a GROUP BY, which aggregates data by a certain feature. In the following example, let's say you have game-by-game data for each player for the first five games of the season and, as a coach, you wanted to see which players on your team made more than 10 3-pointers.
SELECT Player_Name, SUM(3PM_Made) as total
FROM Team
GROUP BY Player_Name
HAVING total > 100
Here, we want the name and number of 3-pointers each player has made. The GROUP BY clause is essential here because if we had used a simple WHERE with no GROUP BY or HAVING, we would get 5 different numbers for each player, representing how many 3-pointers each player made in each game. We want the total that each player made, thus the GROUP BY is needed. The HAVING acts as a filter post-aggregation to get the desired range of total 3-pointers.
While there can always be exceptions, a good rule of thumb is to treat WHERE as the clause used on non-aggregated data, while treating HAVING as the clause used on aggregated data (often in conjunction with GROUP BY).
Top comments (0)