DEV Community

Cover image for Calculate a Pair of Minimum Values that Meet the Criteria within the Group — From SQL to SPL #12
Judith-Data-Processing-Hacks
Judith-Data-Processing-Hacks

Posted on

2 1 1 1 1

Calculate a Pair of Minimum Values that Meet the Criteria within the Group — From SQL to SPL #12

Problem description & analysis:

A table stores events that occur for multiple accounts on multiple dates.

source table

Task: Now we need to find a pair of events that meet the criteria under each account, namely: event a with the earliest date, and event b with the earliest date among events that are more than 30 days away from event a.

expected table

Solutions:

We can sort by date and group by account, get the first record a from each group, then filter out all records that are more than 30 days away from event a, and also get the first record b. It is not too difficult for SQL to extract a using window functions, but it cannot keep the subset and continue filtering b. This requires a shift in thinking. After getting a, use the original table and a to join and calculate all events that are more than 30 days away from event a. Then use a similar method to find a to obtain b. Finally, union a and b and sort it to obtain the result. SQL involves multiple CTE clauses, which is cumbersome. There are two times to retrieve the first record within the group, and SQL does not have a natural sequence number, and it is also troublesome to create an additional sequence number before retrieving it.

SPL Solution:

After SPL grouping, the grouped subsets can be retained, so this task can be implemented using the previous approach. SPL naturally supports sequence numbers, which can easily select the first record, the first record within the group, and the first record of the filtered result.

SPL code

A1: Load data from the database and sort it by date.

A2: Group by account using the group function, with #2 representing the second field.

A3: Select the first record from each group, and then filter out records that are more than 30 days away from the first record, and also select the first record; Union 2 records; Finally, union the calculation results of each group. A2.~1 represents the first record within the group, and can be abbreviated as ~1 if there is no misunderstanding about its scope of action. The select function is used for filtering, where @1 represents getting the first filtering result.


esProc SPL is free to download👉🏻: esProc SPL FREE Download

Image of Quadratic

Free AI chart generator

Upload data, describe your vision, and get Python-powered, AI-generated charts instantly.

Try Quadratic free

Top comments (1)

Collapse
 
judith677 profile image
Judith-Data-Processing-Hacks

Feel free to download esProc SPL and leave your feedback!

👩🏻‍💻Discord
👨‍💻Reddit

AWS Security LIVE!

Tune in for AWS Security LIVE!

Join AWS Security LIVE! for expert insights and actionable tips to protect your organization and keep security teams prepared.

Learn More

👋 Kindness is contagious

DEV is better (more customized, reading settings like dark mode etc) when you're signed in!

Okay