DEV Community

Cover image for Set operations of Excel inter row data (intersection, union, difference)
esProc Desktop
esProc Desktop

Posted on

Set operations of Excel inter row data (intersection, union, difference)

In the workplace, it is common to encounter tasks that require merging and comparing data from two or more spreadsheets, where duplicate data needs to be picked out. This is essentially a conventional set operation, but Excel cannot directly support it and requires a series of functions to implement it, which is cumbersome and not easy to understand.

Here is a useful plugin that can directly perform set operations.

This plugin is called SPL, which excels in handling various complex Excel operations and is used to assist Excel calculations, doubling the work efficiency in seconds.

For example, there are product names and salesperson names in the top ten sales rankings for January and February:

Sales table of Jan.

Sales table of Feb.

We use SPL to calculate intersection, union, and difference based on key columns and entire row data respectively.

1. Set operations based on key columns

(1)Intersect

Using the Product Name as the key column, identify the product data that entered the top ten for both January and February (just list the salesperson’s name for January).

The operation is simple, enter the following code in the blank cell:

=spl("=\[E(?1),E(?2)\].merge@oi(ProductName)",Jan!A1:B11,Feb!A1:B11)
Enter fullscreen mode Exit fullscreen mode

As shown in the figure:

The code

Then ctrl-enter returns the result directly, obtaining the intersection of product names in two sheets.

Result table of intersection

Code explanation: =spl()indicates calling the SPL plugin, ?1 and ?2 represents the data parameters involved in the calculation, here referring to Jan!A1:B11 and Feb!A1:B11. Product Name is the specified key column, and [].merge()represents merging the tables within [], the letter after the symbol @ represents the method of merging, i represents merging after intersection, and o represents regarding the key column ProductName to be unordered. Therefore, the meaning of the entire code means to intersect the tables of January and February according to ProductName.

Similarly, to implement union and difference, you only need to change the letter option after @.

(2)Union

Using the Product Name as the key column, identify the product data that entered the top ten once or more in January and February (just list the salesperson’s name for January).

To calculate union, just change the letter after @ to u, and the code is as follows:

=spl("=[E(?1),E(?2)].merge@ou(ProductName)",Jan!A1:B11,Feb!A1:B11)
Enter fullscreen mode Exit fullscreen mode

Result table of union

(3) Difference

Identify product sales data that entered the top ten in January but did not enter the top ten in February.

The difference is represented by the letter d

=spl("=\[E(?1),E(?2)\].merge@od(ProductName)",Jan!A1:C11,Feb!B1: B11)
Enter fullscreen mode Exit fullscreen mode

Result table of difference

If there are two or more key columns, simply add them in parentheses. For example, merge by the intersection of the ProductName and Name fields: merge@oi(ProductName,Name)

Of course, the merge function can also implement merging based on whether the entire row of data is the same, simply removing the field names in parentheses, such as Example 2 below.

2. Set operations based on entire row data

(1)Intersect

Identify the products and salesperson data that both entered the top ten in January and February.

=spl("=\[E(?1),E(?2)\].merge@oi()",Jan!A1:B11,Feb!A1:B11)
Enter fullscreen mode Exit fullscreen mode

Result table of intersection

(2)Union

Identify data on products and salespeople who have entered the top ten once or more.

=spl("=\[E(?1),E(?2)\].merge@ou()",Jan!A1:B11,Feb!A1:B11)
Enter fullscreen mode Exit fullscreen mode

Result table of union

(3)Difference

Identify the products and salespeople data that entered the top ten in January but did not enter the top ten in February.

=spl("=\[E(?1),E(?2)\].merge@od()",Jan!A1:B11,Feb!A1:B11)
Enter fullscreen mode Exit fullscreen mode

Result table of difference

3. Set operations on multiple sheets

Sometimes, there may be situations where multiple spreadsheets are merged and compared, such as the sales data of top10Sales.xlsx for multiple months, and the number of months may increase over time.

top10Sales.xlsx

We now need to perform set operations on data from multiple months, and of course, we can continue to use the writing method of [E(?1),E(?2),E(?3),……].merge() in the above examples. However, when there are many sheets, it is still a bit troublesome. At this point, we can operate more efficiently in the SPL ide.

(1)Intersecting multiple sheets

Identify the list of products that have all entered the top ten in recent months.

Enter the following code in the IDE:

    A
1   =file("top10Sales.xlsx").xlsopen()
2   =A1.(A1.xlsimport@t(;stname)).merge@oi(ProductName)
Enter fullscreen mode Exit fullscreen mode

A1 represents reading an Excel file and returning the names, number of rows and columns of all sheets in the file:

Result of A1 coding

A2 opens all data in the sheets based on the sheet name stname, and then calculates the intersection

Result of A2 coding

Similarly, you can calculate union and difference.

(2)Union

Identify the products that have made it to the top ten once or more:

    A
1   =file("top10Sales.xlsx").xlsopen()
2   =A1.(A1.xlsimport@t(;stname)).merge@ou(ProductName)
Enter fullscreen mode Exit fullscreen mode

Result table of union

(3) Difference

Identify the list of products that made it to the top ten in January but did not make it to the top ten in other months:

    A
1   =file("top10Sales.xlsx").xlsopen()
2   =A1.(A1.xlsimport@t(;stname)).merge@od(ProductName)
Enter fullscreen mode Exit fullscreen mode

Result table of difference

For complex set operations in Excel, with the help of SPL, all can be done with a merge() function!!!

In SPL, there are many flexible data processing functions that can handle various complex Excel operations, and the syntax is also simple and very user-friendly.

And SPL also comes with rich reference cases Desktop and Excel Data Processing Cases, where 90% of Excel problems in the workplace can be solved. The code in the book can be basically copied and then used with slight modifications.

SPL download address: esProc Desktop Download

Plugin Installation Method: SPL XLL Installation and Configuration

Top comments (0)