DEV Community

Judy

Query a CSV File According to Specified Conditions #eg67

Problem description & analysis

Below is part of the CSV file books.csv:

"book_id","books_count","authors","original_publication_year","title","work_text_reviews_count","ratings_1","ratings_2","ratings_3","ratings_4","ratings_5","ratings_count","work_ratings_count","average_rating"

1,321,"author1",2013,"title1",81786,171184,194374,347116,148563,179728,1040965,1095481,2.82

2,399,"author2",2018,"title2",90118,184637,195711,325457,120472,153599,979876,988025,2.84

3,269,"author3",2007,"title3",81776,155035,145240,308206,142679,168599,919759,977895,2.85

4,229,"author4",2016,"title4",116917,184987,159633,389774,119616,142820,996830,1038375,2.76

5,356,"author5",2001,"title5",117952,167016,102358,345956,148374,131290,894994,924198,2.88

...

Tasks:

  1. List titles of all books whose ratings_5 values are greater than 150000.

  2. List titles of all books whose review count is at least 100000 and whose 3-star ratings account for at least 40% of their ratings.
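
To make the two conditions concrete, here is a minimal Python sketch (it is not the esProc script this post uses) that applies them to the five sample rows shown above. It assumes that "reviews count" means the work_text_reviews_count column and that the 3-star share is ratings_3 divided by ratings_count; the post does not state either explicitly. The helper names meets_task1 and meets_task2 are hypothetical.

```python
# The five sample rows from books.csv, reduced to the columns the tasks need.
# Tuple layout: (title, work_text_reviews_count, ratings_3, ratings_5, ratings_count)
SAMPLE = [
    ("title1", 81786, 347116, 179728, 1040965),
    ("title2", 90118, 325457, 153599, 979876),
    ("title3", 81776, 308206, 168599, 919759),
    ("title4", 116917, 389774, 142820, 996830),
    ("title5", 117952, 345956, 131290, 894994),
]


def meets_task1(ratings_5):
    """Task 1: more than 150000 5-star ratings."""
    return ratings_5 > 150000


def meets_task2(reviews, ratings_3, ratings_count):
    """Task 2: at least 100000 reviews and a 3-star share of at least 40%.

    Assumption: the share is ratings_3 / ratings_count. The test
    5 * ratings_3 >= 2 * ratings_count is the same inequality written
    with integers, so an exact 40% is not lost to floating-point rounding.
    """
    return reviews >= 100000 and 5 * ratings_3 >= 2 * ratings_count


task1_sample = [t for t, rev, r3, r5, total in SAMPLE if meets_task1(r5)]
task2_sample = [t for t, rev, r3, r5, total in SAMPLE if meets_task2(rev, r3, total)]

print(task1_sample)
print(task2_sample)
```

On these five rows, Task 1 selects title1, title2 and title3. No sample row satisfies Task 2: title4 and title5 pass the review threshold, but their 3-star shares are about 39%.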

Below are the desired results:

Task 1:

title1

title2

title3

title7

title8

Task 2:

title28

title33

title34

title39

title41

Solution

Write the following script p1.dfx in esProc:

[Image: the p1.dfx script, cells A1 to A3]

Explanation:

A1  Import the CSV file, reading the first row as column headers.

A2  Find titles of all books that get more than 150000 5-star ratings.

A3  Find titles of all books whose review count is at least 100000 and whose 3-star ratings account for at least 40% of their ratings.
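
For readers without esProc at hand, the three steps can be approximated in Python with the standard csv module. This is a rough equivalent under stated assumptions, not a transcription of p1.dfx: the function names are hypothetical, "reviews count" is taken to be the work_text_reviews_count column, and the 3-star share is taken to be ratings_3 divided by ratings_count.

```python
import csv


def load_books(path):
    """Step A1: read the CSV, using the first row as column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def titles_task1(books):
    """Step A2: titles of books with more than 150000 5-star ratings."""
    return [b["title"] for b in books if int(b["ratings_5"]) > 150000]


def titles_task2(books):
    """Step A3: titles with >= 100000 reviews and a 3-star share >= 40%.

    Assumption: share = ratings_3 / ratings_count, written here as an
    integer comparison (5 * ratings_3 >= 2 * ratings_count) so that no
    division or floating-point rounding is involved.
    """
    return [
        b["title"]
        for b in books
        if int(b["work_text_reviews_count"]) >= 100000
        and 5 * int(b["ratings_3"]) >= 2 * int(b["ratings_count"])
    ]
```

With books.csv in the working directory, titles_task1(load_books("books.csv")) and titles_task2(load_books("books.csv")) return the two title lists. csv.DictReader yields every field as a string, hence the int() conversions before comparing.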

To integrate the script into a Java program, see How to Call an SPL Script in Java.

SPL open source address

Download
