vindarel

Common Lisp's groupBy is Serapeum:assort

This is the second time I have searched for such a function, so here it is: the functional "group by" utility you are looking for is Serapeum's assort.

Example

I have a list of sell objects:

((:|isbn| "9782290252499" :|quantity| 2 :|price| 6.5d0 :|vat| 5.5d0
  :|distributor| "UNION DISTRIBUTION - UD" :|discount| 35.0d0 :|type_name|
  "Livre" :|type_vat| 5.5d0 :|price_bought| "price_bought" :|price_sold|
  5.915d0 :|quantity_sold| 15 :|sold_date| "2024-04-03 10:31:56")
 (:|isbn| "9791034742752" :|quantity| 1 :|price| 23.5d0 :|vat| 5.5d0
  :|distributor| "MDS" :|discount| 35.0d0 :|type_name| "Livre" :|type_vat|
  5.5d0 :|price_bought| "price_bought" :|price_sold| 22.325d0 :|quantity_sold|
  1 :|sold_date| "2024-04-03 08:41:09")
  
)

This is a list of plists: a plist (property list) is a list that alternates keys (here, symbols) and values.

I can have more than one plist for the same ISBN number (the "978…"). I want to group all of them together, so that it will be easier to work with them (I need to sum the total sold for each unique ISBN).

I can write my own loop, but I can also just use serapeum's assort:

CL-USER> (assort *SELLS* :key #'isbn)
(((:|isbn| "9782290252499" :|quantity| 2 :|price| 6.5d0 :|vat| 5.5d0
   :|distributor| "UNION DISTRIBUTION - UD" :|discount| 35.0d0 :|type_name|
   "Livre" :|type_vat| 5.5d0 :|price_bought| "price_bought" :|price_sold|
   5.915d0 :|quantity_sold| 15 :|sold_date| "2024-04-03 10:31:56")
  (:|isbn| "9782290252499" :|quantity| 1 :|price| 6.5d0 :|vat| 5.5d0
   :|distributor| "UNION DISTRIBUTION - UD" :|discount| 35.0d0 :|type_name|
   "Livre" :|type_vat| 5.5d0 :|price_bought| 0 :|price_sold|
   5.915d0 :|quantity_sold| 3 :|sold_date| "2024-04-03 10:55:56"))
  
)
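The isbn function used in the call is not shown in this post. Here is a minimal sketch of what it could look like, together with the per-ISBN sum mentioned earlier. The accessor body, the trimmed sample data, the total-sold-by-isbn helper, and the choice of :|quantity_sold| as the value to sum are all assumptions made for illustration. The sketch also passes :test #'equal explicitly: as far as I know assort compares keys with eql by default, which would not match two distinct but equal strings.

```lisp
;; Assumes Serapeum is loaded, e.g. with (ql:quickload "serapeum").

(defun isbn (sell)
  "Hypothetical accessor: return the ISBN string of a sell plist."
  (getf sell :|isbn|))

;; Trimmed-down sample data, for illustration only.
(defparameter *sample-sells*
  '((:|isbn| "9782290252499" :|quantity_sold| 15)
    (:|isbn| "9791034742752" :|quantity_sold| 1)
    (:|isbn| "9782290252499" :|quantity_sold| 3)))

(defun total-sold-by-isbn (sells)
  "Return an alist of (isbn . total quantity sold), one entry per group."
  (mapcar (lambda (group)
            (cons (isbn (first group))
                  (reduce #'+ group
                          :key (lambda (sell)
                                 (getf sell :|quantity_sold|)))))
          ;; ISBNs are strings, so compare them with EQUAL.
          (serapeum:assort sells :key #'isbn :test #'equal)))

(total-sold-by-isbn *sample-sells*)
;; => (("9782290252499" . 18) ("9791034742752" . 1))
```

Each group is a list of plists sharing one ISBN, so the ISBN can be read from the first element and the quantities summed with reduce.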

Yes, the result starts with a triple (((, so we have to follow the nesting: a list of groups, where each group is a list of plists. That's why it is sometimes easier to define a class and look at printed object representations, or to use hash-tables (aka dictionaries). Serapeum has the great dict helper, in case you don't know it; I have included it in my workflow. But so far I can follow.
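For readers who have not met dict yet, here is a minimal sketch of the helper. It builds a hash-table (with an equal test by default) from alternating keys and values. The variable name and the string keys below are made up for the example.

```lisp
;; Assumes Serapeum is loaded, e.g. with (ql:quickload "serapeum").

;; DICT takes alternating keys and values and returns a hash-table.
;; The keys used here are illustrative, not taken from the real data.
(defparameter *sell-dict*
  (serapeum:dict "isbn" "9782290252499"
                 "quantity_sold" 15))

;; Plain GETHASH works, since the result is an ordinary hash-table.
(gethash "isbn" *sell-dict*)           ; => "9782290252499", T
(gethash "quantity_sold" *sell-dict*)  ; => 15, T
```

String keys work here because the table is created with an equal test.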

assort's full docstring

(assort seq &key key test start end hash)

Return SEQ assorted by KEY.

 (assort (iota 10)
         :key (lambda (n) (mod n 3)))
 => '((0 3 6 9) (1 4 7) (2 5 8))

Groups are ordered as encountered. This property means you could, in principle, use assort to implement remove-duplicates by taking the first element of each group:

 (mapcar #'first (assort list))
 ≡ (remove-duplicates list :from-end t)

However, if TEST is ambiguous (a partial order), and an element could qualify as a member of more than one group, then it is not guaranteed that it will end up in the leftmost group that it could be a member of.

(assort '(1 2 1 2 1 2) :test #'<=)
=> '((1 1) (2 2 1 2))

The default algorithm used by assort is, in the worst case, O(n) in the number of groups. If HASH is specified, then a hash table is used instead. However TEST must be acceptable as the :test argument to make-hash-table.
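To make two points of the docstring concrete, here is a small sketch: the remove-duplicates equivalence, and a call with :hash t. The helper names firsts-of-groups and group-strings and the sample inputs are invented for the illustration.

```lisp
;; Assumes Serapeum is loaded, e.g. with (ql:quickload "serapeum").

(defparameter *numbers* '(1 2 1 3 2))

(defun firsts-of-groups (list)
  "Take the first element of each group, in order of first appearance."
  (mapcar #'first (serapeum:assort list)))

(firsts-of-groups *numbers*)                ; => (1 2 3)
(remove-duplicates *numbers* :from-end t)   ; => (1 2 3)

(defun group-strings (strings)
  "Group equal strings using the hash-table based algorithm."
  ;; With :hash t, TEST must be a valid hash-table test
  ;; (EQ, EQL, EQUAL or EQUALP).
  (serapeum:assort strings :test #'equal :hash t))

;; Two groups: one with both "a" strings, one with "b".
(group-strings (list "a" "b" "a"))
```

The hash-based variant looks like a natural fit for string keys such as ISBNs, since equal is an acceptable hash-table test.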


Serapeum's frequencies also lets us count how many times each EAN13 appears:

CL-USER> (frequencies *SELLS* :key #'isbn)

 (dict  
  "9782290252499" 2
  "9791034742752" 1
  "9782361936150" 1
  "9782956296348" 1
  "9782846405287" 1
  "9782492939075" 1
  "9782889755462" 1
  "9791034747979" 1
  "9782203226692" 1
  "9791092752953" 1
  "9782874263699" 1 
 ) 
11

Look at this "dict" representation: it is a hash-table, printed in a human-readable form that can also be read back in by the Lisp reader (if you serialize it, for instance). You know this already if you read the CL Cookbook.
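Since the value returned by frequencies is an ordinary hash-table, the counts can be looked up with gethash. Below is a small sketch with made-up sample data and a hypothetical isbn accessor (the post does not show its definition). The :test argument is passed explicitly to be safe; I believe equal is already the default here.

```lisp
;; Assumes Serapeum is loaded, e.g. with (ql:quickload "serapeum").

(defun isbn (sell)
  "Hypothetical accessor: return the ISBN string of a sell plist."
  (getf sell :|isbn|))

;; Trimmed-down sample data, for illustration only.
(defparameter *sample-sells*
  '((:|isbn| "9782290252499" :|quantity_sold| 15)
    (:|isbn| "9791034742752" :|quantity_sold| 1)
    (:|isbn| "9782290252499" :|quantity_sold| 3)))

;; Extra keyword arguments are handed to the underlying hash-table,
;; so string keys are compared with EQUAL.
(defparameter *isbn-counts*
  (serapeum:frequencies *sample-sells* :key #'isbn :test 'equal))

(gethash "9782290252499" *isbn-counts*)  ; => 2, T
(gethash "9791034742752" *isbn-counts*)  ; => 1, T
```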

That's all, googlers o/

