We have a database table HEADER, which has data as follows:
There is no connection between ENG1, ENG2 and ENG3, and there are null values under them. We can record them as ENGX. We are trying to sort the table by UPDATEDDATE, and, if ENGX in the current record is null, enter ENGX in the previous record in. Below is the desired result table:
SELECT H.*, MAX(ENG1) OVER (PARTITION BY NAME, GRP1 ) AS IMPUTED_ENG1
, MAX(ENG2) OVER (PARTITION BY NAME, GRP2 ) AS IMPUTED_ENG2
, MAX(ENG3) OVER (PARTITION BY NAME, GRP3 ) AS IMPUTED_ENG3
FROM (
SELECT H.*, COUNT(ENG1) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP1
, COUNT(ENG2) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP2
, COUNT(ENG3) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP3
FROM HEADER H
) H;
It is easy to approach the task using the natural way of thinking. Sort rows by UPDATEDDATE, and for each null ENGX, assign the previous ENGX value to it. As SQL sets are unordered, we need the window function to mark ENGX null and non-null values with different identifiers and populate desired values to nulls. This is complex because we need two SELECTs.
It is simple to implement the algorithm using the open-source esProc SPL:
SPL sets are ordered. It is easy for it to implement computations between neighboring values/records.
Top comments (1)
Learn more about open source SPL:github.com/SPLWare/esProc