How to execute SQL on CSV files with esProc SPL

esProc SPL supports simple SQL and can directly execute SQL statements on structured text files such as CSV, allowing for SQL calculations without the need for a database.
Download esProc SPL first: https://www.esproc.com/download-esproc/
Recommend the standard version, find the corresponding version and download it to install.
Prepare a CSV file of standard format.

Start esProc IDE, create a new script, and write SQL in cell A1:

$select * from d:/Orders.csv

Note that the preceding $ indicates that this statement is SQL, otherwise it is assumed to be a native SPL statement by default.
Press Ctrl-F9 to execute, and on the right side, you can see that the file is read in the form of a data table:

The SPL code is written in these cells, and after each cell is executed, there will be a value that can be seen in real-time on the right side, which is very convenient for debugging.
TAB separated txt files can also be supported, such as this file:

Write in A2:

$select * from d:/Orders.txt

And press Alt-Enter to execute the cell, esProc IDE allows for single step execution of a certain statement, which is particularly convenient for interactive data analysis.
You will see the same result on the right:

The SQL of esProc considers txt to be TAB separated and csv to be comma separated, and will automatically process them based on the extension. If the extension is written incorrectly, the result may be read incorrectly.
Some files do not have a title line:

Here, SPL native syntax is required to read it. Write in A3:

$select * from {file("d:/OrdersNT.csv").import@c()}

In the code, {…} is SPL native syntax, which means reading a text file without a title line and separated by commas.
After execution, the results can be seen:

esProc will automatically represent column names with sequence numbers.
Similarly, text with different separators can also be read using SPL native syntax.

Then, we can try calculating.
First, look at WHERE and write in A4:

$select * from d:/Orders.csv where Amount>=100 and Client like '%s%'

The result after execution:

Filter out records with Amount greater than 100 and Client containing s.
Date calculation:

A5:$select * from d:/Orders.csv where (OrderDate<date('2020-01-01') and Amount<=100) or (OrderDate>=date

CASE WHEN is also supported:

A6: $select case year(OrderDate) when 2010 then 'this year' when 2010 then 'last year' else 'previous years' end from d:/Orders.csv

Then take a look at GROUP BY:

A7:$select Client,year(OrderDate) y,sum(Amount) amt from d:/Orders.csv group by Client,year(OrderDate)

HAVING is also supported:

A8:$select Client,year(OrderDate) y,sum(Amount) amt from d:/Orders.csv 
group by Client,year(OrderDate) having sum(Amount)>1000

It also supports JOIN statements:

A9: $select o.OrderId,o.Client,e.Name e.Dept,e.EId from d:/Orders.csv o 
left join d:/Emp.csv e on o.SellerId=e.Eid

It seems that the support is quite comprehensive. The official claim is that it supports SQL92, and even WITH can be freely supported, but it does not yet support window functions. However, it’s not a big problem. Writing window functions is already quite cumbersome. The native SPL syntax of esProc is much simpler than window functions, so there’s not much need to support window functions anymore.

esProc can also execute SQL in a command-line manner, such as the GROUP BY mentioned above:

In addition to text files, esProc’s SQL also supports data sources such as XLS, MongoDB, RESTful and JSON, with similar usage. Interested parties can visit the official website.

Your AI Code Assistant

Implement features, document your code, or refactor your projects.
Built to handle large projects, Amazon Q Developer works alongside you from idea to production code.

Get started free in your IDE