DEV Community

Hui Zheng (I/Trust/You)
Hui Zheng (I/Trust/You)

Posted on

BigQuery SQL Tip: QUALIFY clause

In any SQL statement, There are two types of columns.

  • columns available via the FROM clause
  • columns as SELECT list aliases

I learn something new about QUALIFY clause in BigQuery lately on how it could access these columns differently based on the position of window definition.

Please see example below.

WITH a AS (
    SELECT 
        *
    FROM UNNEST([STRUCT(1 AS val, 'in' AS canopy),
    (2, 'out'),
    (3, 'above'),
    (1, 'in')
    ])
)

-- Approach 1: QUALIFY only have access to columns available via the FROM clause

-- Case 1.1: 

-- SELECT
--     CASE WHEN canopy = 'above' THEN 'in'
--     ELSE canopy
--     END AS mapped_canopy,
--     val,
--     ROW_NUMBER() OVER w1 as rn,
-- FROM a
-- WHERE val IS NOT NULL
-- QUALIFY rn = 1
-- WINDOW w1 AS (PARTITION BY canopy)

-- Case 2: Invalid SQL

-- SELECT
--     CASE WHEN canopy = 'above' THEN 'in'
--     ELSE canopy
--     END AS mapped_canopy,
--     val,
--     ROW_NUMBER() OVER w1 as rn,
-- FROM a
-- WHERE val IS NOT NULL
-- QUALIFY rn = 1
-- WINDOW w1 AS (PARTITION BY mapped_canopy)

-- Approach 2: QUALIFY have access to both columns via the FROM clause and columns as SELECT list aliases

-- Case 3:

SELECT
    CASE WHEN canopy = 'above' THEN 'in'
    ELSE canopy
    END AS mapped_canopy,
    val
FROM a
WHERE val IS NOT NULL
QUALIFY ROW_NUMBER() OVER (PARTITION BY canopy) = 1

-- Case 4

-- SELECT
--     CASE WHEN canopy = 'above' THEN 'in'
--     ELSE canopy
--     END AS mapped_canopy,
--     val
-- FROM a
-- WHERE val IS NOT NULL
-- QUALIFY ROW_NUMBER() OVER (PARTITION BY mapped_canopy) = 1
Enter fullscreen mode Exit fullscreen mode

Top comments (0)