Sometimes you have the requirement to calculate percentages on some values of your data. There are multiple ways of doing it, of course, but often people are not aware that you do not have to calculate these percentages in the application itself or via a SQL statement that queries the same table multiple times to first calculate the denominator and then calculate the actual percentage. Instead, you can use SQL window functions to run a variety of complex calculations over different groups of data in a single pass! Window functions were introduced in the SQL:2003 standard back in 2003 and although the SQL standard calls these Window Functions, Oracle Database has them documented as Analytic Functions.
Let’s take a look.
Example Data
First, we need to have some data that we can run queries against. Let’s use movies and calculate the percentages of the actors’ earnings – fictional, of course. 🙂
CREATE TABLE movie_earnings (
movie_name VARCHAR(30),
actor_name VARCHAR(30),
earnings NUMBER
);
INSERT INTO movie_earnings VALUES
('Thor: Love and Thunder', 'Chris Hemsworth', 5000000);
INSERT INTO movie_earnings VALUES
('Thor: Love and Thunder', 'Natalie Portman', 2000000);
INSERT INTO movie_earnings VALUES
('Thor: Love and Thunder', 'Christian Bale', 1000000);
INSERT INTO movie_earnings VALUES
('Minions: The Rise of Gru', 'Steve Carell', 500000);
INSERT INTO movie_earnings VALUES
('Minions: The Rise of Gru', 'Pierre Coffin', 100000);
COMMIT;
Percentage Total Earnings per Actor
Back in the days before window functions were available, and unfortunately still too often today, you will find queries with the two-step approach that will read the same table twice, first to get the denominator and second to run the actual computation of the percentage using the denominator:
SELECT movie_name, actor_name,
ROUND(100 * earnings/sums.total, 2) AS percent
FROM movie_earnings, (
SELECT SUM(earnings) AS total
FROM movie_earnings
) sums
ORDER BY percent DESC;
MOVIE_NAME ACTOR_NAME PERCENT
------------------------ --------------- -------
Thor: Love and Thunder Chris Hemsworth 58.14
Thor: Love and Thunder Natalie Portman 23.26
Thor: Love and Thunder Christian Bale 11.63
Minions: The Rise of Gru Steve Carell 5.81
Minions: The Rise of Gru Pierre Coffin 1.16
While this produces the correct result, it has an unnecessary second query on the same table. With window functions, however – and once again, they have been around in some databases for more 20 years now and added to the SQL standard in 2003 – you can calculate the denominator of the percentage in a single go, no need to read the table twice:
SELECT
movie_name, actor_name,
ROUND(
100 * earnings / SUM(earnings) OVER (),
2) AS percent
FROM movie_earnings
ORDER BY percent DESC;
MOVIE_NAME ACTOR_NAME PERCENT
------------------------ --------------- -------
Thor: Love and Thunder Chris Hemsworth 58.14
Thor: Love and Thunder Natalie Portman 23.26
Thor: Love and Thunder Christian Bale 11.63
Minions: The Rise of Gru Steve Carell 5.81
Minions: The Rise of Gru Pierre Coffin 1.16
The window function in the above statement is the SUM(earnings) OVER ()
part. Concise, isn’t it?
Percentage of Earnings per Movie
Of course, calculating the percentage of earnings for all movies is interesting but perhaps more interesting is to know who the highest-paid actor in a given movie is. Prior to window functions, you may have written a statement like this:
SELECT
e.movie_name, e.actor_name,
ROUND(100 * e.earnings/s.total, 2) AS percent
FROM movie_earnings e, (
SELECT SUM(earnings) AS total, movie_name
FROM movie_earnings
GROUP BY movie_name) s
WHERE e.movie_name = s.movie_name
ORDER BY movie_name, percent DESC;
MOVIE_NAME ACTOR_NAME PERCENT
------------------------ --------------- -------
Minions: The Rise of Gru Steve Carell 83.33
Minions: The Rise of Gru Pierre Coffin 16.67
Thor: Love and Thunder Chris Hemsworth 62.5
Thor: Love and Thunder Natalie Portman 25
Thor: Love and Thunder Christian Bale 12.5
The benefit of using window functions is that we can reuse the same query from before and just change the way how we partition the data for our analysis. The previous statement had a window function with a trailing OVER()
clause. That OVER()
clause is there to tell the window function, among other things, what part of the retrieved data to apply the calculation on. Window functions use values from one or multiple rows to return a value for each row. This contrasts with aggregate functions, which return a single value for multiple rows. In short, window functions always have an OVER()
clause while any function without an OVER()
clause is not a window function, but rather an aggregate or single-row (scalar) function.
To calculate the percentages per movie, all you have to do is to tell the window function to look at the data on a per-movie basis. In other words, partition the retrieved data by the movie_name and apply the calculation for each partition or window of data:
SELECT
movie_name, actor_name,
ROUND(
100 * earnings / SUM(earnings)
OVER (PARTITION BY movie_name),
2) AS percent
FROM movie_earnings
ORDER BY movie_name, percent DESC;
MOVIE_NAME ACTOR_NAME PERCENT
------------------------ --------------- -------
Minions: The Rise of Gru Steve Carell 83.33
Minions: The Rise of Gru Pierre Coffin 16.67
Thor: Love and Thunder Chris Hemsworth 62.5
Thor: Love and Thunder Natalie Portman 25
Thor: Love and Thunder Christian Bale 12.5
Pay attention to the (now in a separate line for further illustration purposes) OVER (PARTITION BY movie_name)
clause. That’s all that needed to change to execute the percentage calculations per movie.
Percentage of Earnings per Movie and Total Earnings
The cool thing about window functions is that you can use multiple window functions in one SQL statement! So, for example, instead of executing the two statements above, you can just run one single query and retrieve both calculations in one go:
SELECT
movie_name, actor_name,
ROUND(
100 * earnings / SUM(earnings)
OVER (PARTITION BY movie_name),
2) AS ptc_movie,
ROUND(
100 * earnings / SUM(earnings)
OVER (),
2) AS ptc_total
FROM movie_earnings
ORDER BY movie_name, ptc_movie DESC;
MOVIE_NAME ACTOR_NAME PTC_MOVIE PTC_TOTAL
------------------------ --------------- --------- ---------
Minions: The Rise of Gru Steve Carell 83.33 5.81
Minions: The Rise of Gru Pierre Coffin 16.67 1.16
Thor: Love and Thunder Chris Hemsworth 62.5 58.14
Thor: Love and Thunder Natalie Portman 25 23.26
Thor: Love and Thunder Christian Bale 12.5 11.63
This result provides some interesting insights right away. Here you can quickly see that although Steve Carell got the most money from his movie, it is just a tiny fraction of what others have made with their movies. We know this already, of course, because of the queries earlier on, but here you have the numbers next to each other, making it much more apparent. And, of course, we didn’t need to read the table twice to get the per movie and total earnings percentages either.
Many functions to choose from
Just like with aggregate functions, there are many different window functions available, so always check out the documentation! It just happens to be that Oracle Database has a RATIO_TO_REPORT
function that computes the ratio of a value to the sum of a set of values. In other words, there already is a window function to calculate the percentages. So the above query can be further simplified to the following, saving you from having to do the division manually:
SELECT
movie_name, actor_name,
ROUND(
100 * RATIO_TO_REPORT(earnings)
OVER (PARTITION BY movie_name),
2) AS ptc_movie,
ROUND(
100 * RATIO_TO_REPORT(earnings)
OVER (),
2) AS ptc_total
FROM movie_earnings
ORDER BY movie_name, ptc_movie DESC;
MOVIE_NAME ACTOR_NAME PTC_MOVIE PTC_TOTAL
------------------------ --------------- --------- ---------
Minions: The Rise of Gru Steve Carell 83.33 5.81
Minions: The Rise of Gru Pierre Coffin 16.67 1.16
Thor: Love and Thunder Chris Hemsworth 62.5 58.14
Thor: Love and Thunder Natalie Portman 25 23.26
Thor: Love and Thunder Christian Bale 12.5 11.63
Conclusion
Window functions, or analytical functions as Oracle calls them, are a powerful way to execute complex calculations over multiple “windows” in a query result.
They are much more compact than other methods and only need to read the data once instead of multiple times, giving you a performance boost for running complex calculations on large data sets.
Window functions should be in the toolbox of any developer who regularly writes SQL!
If you want to learn more about window functions, check out the free Analytic SQL for Developers course from Oracle!
Top comments (0)