DEV Community

Michael
Michael

Posted on • Originally published at gbase.cn

GBase 8a OLAP Window Functions in Practice: Ranking, Running Totals, MoM, and Ratio Analysis

GBase 8a fully supports OLAP window functions, making it a powerful gbase database for analytical workloads. This guide uses real sales scenarios to demonstrate ROW_NUMBER/RANK, moving aggregates, LAG/LEAD for period‑over‑period comparisons, ROLLUP subtotals, and how these functions execute in an MPP environment.

Window Function Syntax

function_name() OVER (
    [PARTITION BY column]   -- window group
    [ORDER BY column]       -- ordering within window
    [ROWS|RANGE BETWEEN ... AND ...]  -- frame definition
)
Enter fullscreen mode Exit fullscreen mode

Unlike GROUP BY, window functions do not collapse rows. Every row remains in the result set while still being able to “see” other rows in the window.

Ranking Functions

ROW_NUMBER / RANK / DENSE_RANK

SELECT
    dept_id,
    salesperson,
    sale_amount,
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY sale_amount DESC) AS row_num,
    RANK()       OVER (PARTITION BY dept_id ORDER BY sale_amount DESC) AS rnk,
    DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY sale_amount DESC) AS dense_rnk
FROM sales
WHERE sale_date >= '2024-01-01';
Enter fullscreen mode Exit fullscreen mode

Sample output:

dept_id salesperson sale_amount row_num rnk dense_rnk
101 Zhang San 95000 1 1 1
101 Li Si 95000 2 1 1
101 Wang Wu 82000 3 3 2
101 Zhao Liu 71000 4 4 3

Top 3 Sales per Department

WITH ranked AS (
    SELECT dept_id, salesperson, sale_amount,
           ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY sale_amount DESC) AS rn
    FROM sales WHERE YEAR(sale_date) = 2024
)
SELECT dept_id, salesperson, sale_amount
FROM ranked WHERE rn <= 3
ORDER BY dept_id, rn;
Enter fullscreen mode Exit fullscreen mode

NTILE: Bucketing

SELECT dept_id, salesperson, sale_amount,
       NTILE(4) OVER (PARTITION BY dept_id ORDER BY sale_amount DESC) AS quartile
FROM sales WHERE YEAR(sale_date) = 2024;
Enter fullscreen mode Exit fullscreen mode

Cumulative and Moving Aggregates

Year‑to‑Date

SELECT dept_id, sale_month, monthly_amount,
       SUM(monthly_amount) OVER (
           PARTITION BY dept_id ORDER BY sale_month
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS ytd_amount
FROM (
    SELECT dept_id,
           DATE_FORMAT(sale_date, '%Y-%m') AS sale_month,
           SUM(sale_amount) AS monthly_amount
    FROM sales
    GROUP BY dept_id, DATE_FORMAT(sale_date, '%Y-%m')
) t
ORDER BY dept_id, sale_month;
Enter fullscreen mode Exit fullscreen mode

Moving Average (3‑month)

SELECT dept_id, sale_month, monthly_amount,
       ROUND(AVG(monthly_amount) OVER (
           PARTITION BY dept_id ORDER BY sale_month
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ), 2) AS moving_avg_3m
FROM monthly_sales
ORDER BY dept_id, sale_month;
Enter fullscreen mode Exit fullscreen mode

Frame Syntax Quick Reference

-- Cumulative to current row
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
-- N preceding + current
ROWS BETWEEN N PRECEDING AND CURRENT ROW
-- Centered window
ROWS BETWEEN N PRECEDING AND N FOLLOWING
-- Entire partition
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
Enter fullscreen mode Exit fullscreen mode

Period Comparisons: LAG / LEAD

LAG: Month‑over‑Month

SELECT dept_id, sale_month, monthly_amount,
       LAG(monthly_amount, 1) OVER (PARTITION BY dept_id ORDER BY sale_month) AS prev_month_amount,
       ROUND(
           (monthly_amount - LAG(monthly_amount, 1) OVER (PARTITION BY dept_id ORDER BY sale_month))
           / NULLIF(LAG(monthly_amount, 1) OVER (PARTITION BY dept_id ORDER BY sale_month), 0) * 100, 2
       ) AS mom_growth_pct
FROM monthly_sales
ORDER BY dept_id, sale_month;
Enter fullscreen mode Exit fullscreen mode

LEAD: Next Period

SELECT sale_month, monthly_amount,
       LEAD(monthly_amount, 1) OVER (ORDER BY sale_month) AS next_month_amount,
       LEAD(monthly_amount, 3) OVER (ORDER BY sale_month) AS next_quarter_amount
FROM monthly_sales;
Enter fullscreen mode Exit fullscreen mode

Year‑over‑Year

SELECT sale_month, monthly_amount,
       LAG(monthly_amount, 12) OVER (PARTITION BY dept_id ORDER BY sale_month) AS same_month_last_year,
       ROUND(
           (monthly_amount - LAG(monthly_amount, 12) OVER (PARTITION BY dept_id ORDER BY sale_month))
           / NULLIF(LAG(monthly_amount, 12) OVER (PARTITION BY dept_id ORDER BY sale_month), 0) * 100, 2
       ) AS yoy_growth_pct
FROM monthly_sales
ORDER BY dept_id, sale_month;
Enter fullscreen mode Exit fullscreen mode

Percentage of Total

SELECT dept_id, sale_month, monthly_amount,
       ROUND(monthly_amount * 100.0 / SUM(monthly_amount) OVER (PARTITION BY sale_month), 2) AS pct_of_total,
       ROUND(monthly_amount * 100.0 / SUM(monthly_amount) OVER (PARTITION BY dept_id), 2) AS pct_of_dept_annual
FROM monthly_sales
ORDER BY sale_month, dept_id;
Enter fullscreen mode Exit fullscreen mode

ROLLUP and CUBE

ROLLUP: Subtotals and Grand Total

SELECT COALESCE(dept_id, 'Total') AS dept,
       COALESCE(sale_month, 'Subtotal') AS month,
       SUM(sale_amount) AS total_amount
FROM sales WHERE YEAR(sale_date) = 2024
GROUP BY ROLLUP(dept_id, sale_month)
ORDER BY dept_id, sale_month;
Enter fullscreen mode Exit fullscreen mode

CUBE: Cross‑dimensional Subtotals

SELECT COALESCE(CAST(dept_id AS CHAR), 'All') AS dept,
       COALESCE(region, 'All') AS region,
       SUM(sale_amount) AS total
FROM sales
GROUP BY CUBE(dept_id, region)
ORDER BY dept_id, region;
Enter fullscreen mode Exit fullscreen mode

First and Last Value in a Window

SELECT dept_id, sale_month, monthly_amount,
       FIRST_VALUE(monthly_amount) OVER (
           PARTITION BY dept_id ORDER BY monthly_amount ASC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS hist_min,
       FIRST_VALUE(monthly_amount) OVER (
           PARTITION BY dept_id ORDER BY monthly_amount DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS hist_max
FROM monthly_sales;
Enter fullscreen mode Exit fullscreen mode

How Window Functions Execute in the MPP Engine

Each gnode scans and pre‑aggregates locally → Hash Redistribute on PARTITION BY column → each gnode performs the window computation (sort + frame) → the gcluster merges the results. When the PARTITION BY column matches the table distribution key, no data shuffle is required, yielding the best performance. Multiple window functions sharing the same partition and ordering reuse a single redistribution step.

Quick Reference

Requirement Function
Unique rank within group ROW_NUMBER()
Rank with gaps / without gaps RANK() / DENSE_RANK()
Top N per group ROW_NUMBER() + outer filter
Cumulative sum (YTD) SUM() OVER (ORDER BY ...)
Moving average (N periods) AVG() OVER (ROWS ...)
Previous period (MoM) LAG(col, 1)
Same period last year (YoY) LAG(col, 12)
Percentage of total col / SUM(col) OVER (PARTITION BY ...)
Subtotals & grand total GROUP BY ROLLUP(...)
Cross‑dimensional subtotals GROUP BY CUBE(...)
First / last value in partition FIRST_VALUE() / LAST_VALUE()

Whether you are building a daily sales dashboard or a complex financial report, mastering these window functions in your gbase database will help you deliver insights faster and with less code.

Top comments (0)