We have a database table TBLCARS as follows:
We are trying to expand each PRICE value according to the corresponding QUANTITY and calculate the PRICE median.
SQL written in SQL Server:
WITH median_dt AS (
SELECT CASE
WHEN sum(quantity) % 2 = 0 THEN sum(quantity) / 2
ELSE sum(quantity) / 2 + 1
END AS lower_limit
, CASE
WHEN sum(quantity) % 2 = 0 THEN sum(quantity) / 2 + 1
ELSE sum(quantity) / 2 + 1
END AS upper_limit
FROM t
),
data AS (
SELECT *, sum(quantity) OVER (ORDER BY price ASC) AS cum_sum
FROM t
),
rnk_val AS (
SELECT *
FROM (
SELECT price, row_number() OVER (ORDER BY d.cum_sum ASC) AS rnk
FROM data d
JOIN median_dt b ON b.lower_limit <= d.cum_sum
) x
WHERE x.rnk = 1
UNION ALL
SELECT *
FROM (
SELECT price, row_number() OVER (ORDER BY d.cum_sum ASC) AS rnk
FROM data d
JOIN median_dt b ON b.upper_limit <= d.cum_sum
) x
WHERE x.rnk = 1
)
SELECT avg(price) AS median
FROM rnk_val
Not difficult at all. We just need to extend the PRICE value by copying it QUANTITY times, concatenate the result records, and calculate the PRICE median. As SQL sets are unordered and the language does not have an median function, we can only invent indexes using a complex method, concatenate result records, and then calculate the median.
It is easy to write the algorithm in the open-source esProc SPL:
Based on the ordered set, SPL is a professional data computing engine that offers all-round functions for performing set-oriented calculations and a wealth of mathematical functions. It combines merits of Java and SQL, making it convenient to handle the dynamic expansion cases.
Top comments (0)