It is very often that we need to calculate averages in our data for various analysis tasks. It quite easy when it come to calculating Mean, Median and Mode in tools such as Excel with the help of inbuilt functions.
In this this blog we will discuss ways to calculate these values using just SQL.
Let us consider table below to find mean, median, and mode of salaries department wise for all the departments.
╔═══════╦════════╦════════╗
║ empid ║ deptid ║ salary ║
╠═══════╬════════╬════════╣
║ 1 ║ 1 ║ 12000 ║
╠═══════╬════════╬════════╣
║ 2 ║ 2 ║ 3400 ║
╠═══════╬════════╬════════╣
║ 3 ║ 3 ║ 45000 ║
╠═══════╬════════╬════════╣
║ 4 ║ 4 ║ 1320 ║
╠═══════╬════════╬════════╣
║ 5 ║ 4 ║ 5600 ║
╠═══════╬════════╬════════╣
║ 6 ║ 2 ║ 2700 ║
╠═══════╬════════╬════════╣
║ 7 ║ 1 ║ 12000 ║
╠═══════╬════════╬════════╣
║ 8 ║ 3 ║ 34000 ║
╠═══════╬════════╬════════╣
║ 9 ║ 4 ║ 4600 ║
╠═══════╬════════╬════════╣
║ 10 ║ 1 ║ 9000 ║
╠═══════╬════════╬════════╣
║ 11 ║ 1 ║ 13000 ║
╠═══════╬════════╬════════╣
║ 12 ║ 2 ║ 12000 ║
╠═══════╬════════╬════════╣
║ 13 ║ 4 ║ 1320 ║
╠═══════╬════════╬════════╣
║ 14 ║ 3 ║ 24000 ║
╠═══════╬════════╬════════╣
║ 15 ║ 3 ║ 45000 ║
╠═══════╬════════╬════════╣
║ 16 ║ 2 ║ 3400 ║
╠═══════╬════════╬════════╣
║ 17 ║ 2 ║ 1230 ║
╠═══════╬════════╬════════╣
║ 18 ║ 1 ║ 15000 ║
╚═══════╩════════╩════════╝
Finding mean is quite easy as it is just the average of all the salary for a department and we can use the AVG function for that.
SELECT deptid
,AVG(SALARY) AS mean_salary
FROM employee
GROUP BY deptid
Calculating mode of salary for each department will be a bit tricky as we don't have inbuilt SQL functions for that.
Steps for it are :
For this we start by calculating salary count for each unique salary in each department.
Then we rank the salary by its frequency (salary count) in each department.
salary_counts AS (
SELECT deptid,
salary,
COUNT(salary) AS salary_counts
FROM employee
GROUP BY deptid, salary
)
-- ranking salaries by frequency
, mode_salaries AS (
SELECT deptid,
salary,
RANK() OVER( PARTITION BY deptid ORDER BY salary_counts DESC ) AS salary_counts_rank
FROM salary_counts
)
SELECT deptid,
salary as mode_salary
FROM mode_salaries
WHERE salary_counts_rank = 1
For calculating median of salary for each department PERCENTILE_DISC SQL function along with GROUP can help us calculate that.
median_salaries AS (
SELECT deptid,
PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY salary) AS median_salary
FROM employee
GROUP BY deptid
)
SELECT deptid,
median_salary
FROM median_salaries
And that is how we can calculate mean, median, mode in SQL.
Top comments (0)