Brandon C

Posted on Feb 1, 2023

Intro to Enumerables in Ruby, and an example with Active Record

#beginners #ruby #tutorial #codenewbie

As a language often used for server-side applications, Ruby frequently handles large amounts of data, and that data usually needs to be parsed. Ruby's Enumerable methods, available on arrays and hashes, can find, organize, or filter specific pieces of data, especially when they are chained together.

If you're familiar with JavaScript, map, filter, and find reappear in Ruby, where they also go by the alternate names collect, select, and detect, and they behave much like their JavaScript counterparts. map/collect creates a new array from the return values of the block; filter/select/find_all creates a new array of the elements for which the block returns a truthy value; find/detect returns the first element for which the block returns a truthy value.

p ["cat", "dog", "bird"].map{|i| i.upcase }
# => ["CAT", "DOG", "BIRD"]
p [1,2,3,4,5].filter{|i| i > 2 }
# => [3, 4, 5]
p [1,2,3,4,5].find{|i| i > 2 }
# => 3
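As a quick check of the aliases described above, here is a minimal sketch that compares each pair of names on the same array. The `aliases_agree?` helper is a hypothetical name made up for this illustration, not part of Ruby.

```ruby
# Hypothetical helper: returns true when every alias gives the same result.
def aliases_agree?(numbers)
  # map and collect both build a new array from the block's return values.
  mapped = numbers.map { |n| n * 2 } == numbers.collect { |n| n * 2 }

  # filter, select, and find_all all keep the elements whose block is truthy.
  filtered = numbers.filter { |n| n > 2 } == numbers.select { |n| n > 2 } &&
             numbers.select { |n| n > 2 } == numbers.find_all { |n| n > 2 }

  # find and detect both return the first matching element.
  found = numbers.find { |n| n > 2 } == numbers.detect { |n| n > 2 }

  mapped && filtered && found
end

p aliases_agree?([1, 2, 3, 4, 5])      # => true
p([1, 2, 3, 4, 5].find { |n| n > 10 }) # => nil, because nothing matches
```

Note that find/detect returns nil rather than an empty array when no element matches, which is worth keeping in mind before chaining another method onto the result.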

count returns the number of elements in the array, similar to the length property in JavaScript. Given an argument or a block, count returns the number of elements that equal the argument or for which the block returns true. tally creates a hash that counts how many times each element appears in the array: each element becomes a key and its count becomes the value. any? returns true if at least one element passes the condition, while all? returns true only if every element passes. none? returns true if no element passes the block's condition, while one? returns true if exactly one element passes. max returns the element with the maximum value, min returns the element with the minimum value, and minmax returns both as a two-element array with the minimum first.

p ["a","a","b","b","b","c"].count
# => 6
p ["a","a","b","b","b","c"].count("a")
p ["a","a","b","b","b","c"].count{|i| i == "a" }
# => 2 (for both lines above)
p ["a","a","b","b","b","c"].tally
# => {"a"=>2, "b"=>3, "c"=>1}
p [3,5,1,4,2].any?{|i| i > 3}
# => true
p [3,5,1,4,2].all?{|i| i > 3}
# => false
p [3,5,1,4,2].all?{|i| i > 0}
# => true
p [3,5,1,4,2].none?{|i| i > 5 }
# => true
p [3,5,1,4,2].one?{|i| i > 4 }
# => true
p [3,5,1,4,2].max
# => 5
p ["kitty", "dog", "bird"].max{|a,b| a.length <=> b.length }
# => "kitty"
p [3,5,1,4,2].min
# => 1
p ["kitty", "dog", "bird"].min{|a,b| a.length <=> b.length }
# => "dog"
p [3,5,1,4,2].minmax
# => [1, 5]
p ["kitty", "dog", "bird"].minmax{|a,b| a.length <=> b.length }
# => ["dog", "kitty"]
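To show how these methods can work side by side, the sketch below gathers several of them into one summary of a word list. The `summarize` method and its hash keys are hypothetical names chosen for this example, and `tally` assumes Ruby 2.7 or newer.

```ruby
# Hypothetical helper that summarizes an array of words.
def summarize(words)
  {
    total: words.count,                              # number of elements
    long_words: words.count { |w| w.length > 3 },    # elements passing the block
    tally: words.tally,                              # occurrences of each word
    any_long: words.any? { |w| w.length > 4 },       # at least one passes
    all_lowercase: words.all? { |w| w == w.downcase }, # every element passes
    shortest_and_longest: words.minmax { |a, b| a.length <=> b.length }
  }
end

summary = summarize(["kitty", "dog", "bird", "dog"])
p summary[:total]                # => 4
p summary[:long_words]           # => 2
p summary[:shortest_and_longest] # => ["dog", "kitty"]
```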

chunk walks through the elements and groups consecutive elements that share the same block return value. Each group is represented as a two-element array: the first element is the block's return value, and the second is a nested array of the consecutive elements that produced it. chunk itself returns only an Enumerator, whose groups can be read with other enumerable methods such as each or map.

array = [3,3,5,1,4,2].chunk{|a| a }
array.each{|i|p i}
# prints, one per line: [3, [3, 3]] [5, [5]] [1, [1]] [4, [4]] [2, [2]]
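Because chunk only groups neighbouring elements, the order of the input matters. The following sketch contrasts an unsorted array with a sorted one; `chunk_identical` and `chunk_by_parity` are hypothetical helper names used only here.

```ruby
# Hypothetical helper: groups consecutive equal numbers.
def chunk_identical(numbers)
  numbers.chunk { |n| n }.to_a
end

# Hypothetical helper: the block can return any key, such as even or odd.
def chunk_by_parity(numbers)
  numbers.chunk { |n| n.even? }.to_a
end

unsorted = [3, 5, 3, 1, 5, 3]

# Repeated values that are not adjacent end up in separate chunks.
p chunk_identical(unsorted)
# => [[3, [3]], [5, [5]], [3, [3]], [1, [1]], [5, [5]], [3, [3]]]

# Sorting first puts equal values next to each other, giving one chunk per value.
p chunk_identical(unsorted.sort)
# => [[1, [1]], [3, [3, 3, 3]], [5, [5, 5]]]

p chunk_by_parity([2, 4, 1, 3, 6])
# => [[true, [2, 4]], [false, [1, 3]], [true, [6]]]
```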

Similar to chunk, chunk_while splits the elements into arrays, but it decides where to split by passing each pair of adjacent elements to the block. A chunk keeps growing while the block returns true, and a new chunk starts when the block returns false. Both chunk methods become more useful when combined with sorting methods.

array2 = [3,7,3,5,1,4,2].chunk_while{|a,b| a > b }
array2.each{|i|p i}
# prints, one per line: [3] [7, 3] [5, 1] [4, 2]
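One common use of chunk_while together with sort is finding runs of consecutive integers. This is a small sketch of that idea; `consecutive_runs` is a hypothetical helper name.

```ruby
# Hypothetical helper: sorts the numbers, then keeps extending a chunk
# while each number is exactly one more than the number before it.
def consecutive_runs(numbers)
  numbers.sort.chunk_while { |a, b| b == a + 1 }.to_a
end

p consecutive_runs([7, 1, 2, 9, 3, 8])
# => [[1, 2, 3], [7, 8, 9]]

p consecutive_runs([4, 1, 2, 6])
# => [[1, 2], [4], [6]]
```

Without the sort call, the same block would split the first array at almost every element, since neighbours such as 7 and 1 are not consecutive.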

There are two commonly used sorting methods: sort and sort_by. sort compares two elements in a block and orders them by the result: a negative return value places the first element ahead of the second, a positive value places it behind, and zero treats the two as equal. sort_by passes one element at a time to the block, stores the return values in a temporary array, and orders the elements by those values. That makes it a better fit for larger, more detailed data, especially when the comparison would otherwise call the same methods repeatedly. Because sort_by takes one element at a time, it also works well with hashes. To reverse the order of sort_by, call reverse on the result or, when the block returns a number, put a minus sign in front of that return value.

p [3,5,1,4,2].sort{|a,b| a <=> b  }
p [3,5,1,4,2].sort{|a,b| a - b  }
# => [1, 2, 3, 4, 5] (for both lines above)
p [3,5,1,4,2].sort{|a,b| b <=> a  }
p [3,5,1,4,2].sort{|a,b| b - a  }
# => [5, 4, 3, 2, 1] (for both lines above)

array = [["a","b","c"], ["x","j","k","i"] , [1,8,2,4,6,4]]
p array.sort_by{|a| a.count }
# => [["a", "b", "c"], ["x", "j", "k", "i"], [1, 8, 2, 4, 6, 4]]
p array.sort_by{|a| a.count }.reverse
# => [[1, 8, 2, 4, 6, 4], ["x", "j", "k", "i"], ["a", "b", "c"]]
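The sketch below illustrates two points from the paragraph above: negating a numeric key to reverse sort_by, and the fact that sort_by computes its key once per element while a sort block recomputes it for every comparison. `longest_first` and `key_calls` are hypothetical helpers written for this illustration.

```ruby
# Hypothetical helper: the minus sign sorts from longest to shortest.
def longest_first(words)
  words.sort_by { |word| -word.length }
end

# Hypothetical helper: counts how often length is called by each approach.
def key_calls(words)
  sort_calls = 0
  words.sort do |a, b|
    sort_calls += 2 # length is called once for each side of a comparison
    a.length <=> b.length
  end

  sort_by_calls = 0
  words.sort_by do |word|
    sort_by_calls += 1 # length is called once per element
    word.length
  end

  { sort: sort_calls, sort_by: sort_by_calls }
end

words = ["kitty", "dog", "bird", "hamster"]
p longest_first(words) # => ["hamster", "kitty", "bird", "dog"]

calls = key_calls(words)
p calls[:sort_by]      # => 4, one call per element
p calls[:sort]         # larger, and it grows faster as the array gets longer
```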

Some enumerables can also be called on hashes, where the block takes two arguments for each key-value pair: the first is the key and the second is its value. map on a hash still returns an array of the block's return values, while filter returns a hash of the key-value pairs that pass the condition. sort_by on a hash returns an array of arrays, each containing a key and value from the original hash. This means you can sort by the values of a hash to find which key corresponds to the highest value. By comparison, sort called on a hash without a block compares the [key, value] pairs, which in practice orders them by key, and a sort block receives two pairs to compare rather than a single key and value. You can sort the values on their own, but then the keys are missing from the returned array. Enumerating hashes with blocks gives more flexibility when searching for data inside nested arrays and hashes.

hash = {a: 5, b: 3, c: 4, d: 2, e: 1}
p hash.map{|k,v| v * 10 }
# => [50, 30, 40, 20, 10]
p hash.filter{|k,v| v > 3 }
# => {:a=>5, :c=>4}
p hash.sort_by{|k,v| v}
# => [[:e, 1], [:d, 2], [:b, 3], [:c, 4], [:a, 5]]
p hash.sort_by{|k,v| k}
# => [[:a, 5], [:b, 3], [:c, 4], [:d, 2], [:e, 1]]
hash2 = {d: 5, e: 3, a: 4, c: 2, b: 1}
p hash2.sort
# => [[:a, 4], [:b, 1], [:c, 2], [:d, 5], [:e, 3]]
p hash2.values.sort
# => [1, 2, 3, 4, 5]
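For comparison, here is a sketch of sorting a hash by value both ways. With sort, the block receives two [key, value] pairs that have to be destructured by hand; with sort_by, it receives one key and one value. The three method names are hypothetical, chosen for this example.

```ruby
# Hypothetical helper: sort receives two pairs, destructured in the block.
def pairs_by_value_with_sort(hash)
  hash.sort { |(_key1, value1), (_key2, value2)| value1 <=> value2 }
end

# Hypothetical helper: sort_by receives one key and value at a time.
def pairs_by_value_with_sort_by(hash)
  hash.sort_by { |_key, value| value }
end

# Hypothetical helper: the last pair after sorting holds the highest value.
def key_with_highest_value(hash)
  key, _value = hash.sort_by { |_key, value| value }.last
  key
end

scores = { d: 5, e: 3, a: 4, c: 2, b: 1 }
p pairs_by_value_with_sort(scores)
# => [[:b, 1], [:c, 2], [:e, 3], [:a, 4], [:d, 5]]
p pairs_by_value_with_sort(scores) == pairs_by_value_with_sort_by(scores)
# => true
p key_with_highest_value(scores)
# => :d
```

If a hash is needed again after sorting, calling to_h on the sorted pairs rebuilds one in the sorted order.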

max, min, and minmax have counterparts max_by, min_by, and minmax_by, which take a single element at a time and are better suited to hashes and detailed data, for the same reasons sort_by is.

hash2 = {d: 5, e: 3, a: 4, c: 2, b: 1}
p hash2.max_by{|k,v| v }
# => [:d, 5]
p hash2.min_by{|k,v| v }
# => [:b, 1]
p hash2.minmax_by{|k,v| v }
# => [[:b, 1], [:d, 5]]
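The same methods are handy for "detailed data" such as an array of hashes. The sketch below uses made-up pet records (the names and ages are sample data, not from the original examples) and hypothetical helper names.

```ruby
# Hypothetical helper: max_by only needs the value to compare by.
def oldest_with_max_by(pets)
  pets.max_by { |pet| pet[:age] }
end

# Hypothetical helper: max needs a full two-element comparison instead.
def oldest_with_max(pets)
  pets.max { |a, b| a[:age] <=> b[:age] }
end

# Hypothetical helper: minmax_by returns [minimum, maximum] in one pass.
def age_extremes(pets)
  youngest, oldest = pets.minmax_by { |pet| pet[:age] }
  { youngest: youngest[:name], oldest: oldest[:name] }
end

# Sample data invented for this example.
pets = [
  { name: "Rex", age: 7 },
  { name: "Tom", age: 3 },
  { name: "Kiwi", age: 12 }
]

p oldest_with_max_by(pets)[:name]                    # => "Kiwi"
p oldest_with_max_by(pets) == oldest_with_max(pets)  # => true
p age_extremes(pets)[:youngest]                      # => "Tom"
```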

chunk can also be called on a hash and behaves much like it does on an array: it groups consecutive key-value pairs by the block's return value and converts each pair into a [key, value] array.

hash = {a: 5, b: 1, c: 4, d: 2, e: 5}
chunk_hash = hash.chunk{|k,v| v > 1 }
chunk_hash.each{|i| p i}
# [true, [[:a, 5]]]
# [false, [[:b, 1]]]
# [true, [[:c, 4], [:d, 2], [:e, 5]]]

Here's an example of using enumerables to find data. Let's say we have two ActiveRecord models backed by SQL tables: Passenger and Booking. A passenger has many bookings, but the passengers table consists only of a primary key id and a string name column, while each booking belongs to one passenger through a foreign key.

Passengers table:

Bookings table:

We want to find the highest number of bookings held by a single passenger, based on how many times each passenger foreign key appears across all the bookings. We can load all the bookings with Booking.all, then chain map to return an array of the passenger foreign keys, which the bookings table stores as passenger_id.

Booking.all.map{|booking| booking.passenger_id}
# => [6, 34, 26, 30, 29, 3, 36, 16, 24, 10, 5, 19, 13, 39, 32, 37, 45, ....]

So we have a large, unorganized array of passenger_id values, and we need to count how many times each one appears. From here, we can use tally to create a hash in which the keys are the passenger_id values and the values are the number of times each appears in the array.

tally_bookings = Booking.all.map{|booking| booking.passenger_id}.tally
# => {6=>4,   34=>5,  26=>3,  30=>3,  29=>4,  3=>5,  36=>6,  16=>6,  24=>4,  10=>5,  5=>5, ...}
# passenger_id 34 shows up five times in the array

So now we have a hash containing the information we need but it isn't ordered. Here, we can use max_by to see the highest tally count.

highest_booking = tally_bookings.max_by{|k,v| v}
# => [36, 6]

We get an array with the passenger_id at index 0 and the highest tally count at index 1. But what if more than one passenger has 6 bookings? There are different ways to find the other passengers. Since we have highest_booking, we can filter tally_bookings to find the matching passenger_ids.

highest_booking_passengers = tally_bookings.filter{|k,v| v == highest_booking[1]}
# => {36=>6, 16=>6, 44=>6, 46=>6}

From here, we can use map or each to retrieve the passengers, using the built-in ActiveRecord method find to match the foreign keys in our hash with the primary keys of the Passenger records.

highest_booking_passengers.map{|k,v| Passenger.find(k)}
# => [#<Passenger:0x0000563b635eade0 id: 36, name: "Chae Jacobson">,
#     #<Passenger:0x0000563b63596970 id: 16, name: "Mary Erdman V">,
#     #<Passenger:0x0000563b6358ef40 id: 44, name: "Lauryn Gusikowski">,
#     #<Passenger:0x0000563b6358d258 id: 46, name: "Ricky Strosin DC">]
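To try the steps above without a database, here is a plain-Ruby sketch that stands in for the two models with arrays of hashes. Everything in it is an assumption made for illustration: the `top_passengers` helper, the sample names, and the use of `Array#find` in place of ActiveRecord's `Passenger.find`.

```ruby
# Hypothetical helper that mirrors the map, tally, max_by, filter, find steps.
def top_passengers(bookings, passengers)
  # Count how many bookings each passenger_id has.
  tally = bookings.map { |booking| booking[:passenger_id] }.tally

  # max_by returns a [passenger_id, count] pair; only the count is needed here.
  _id, highest_count = tally.max_by { |_passenger_id, count| count }

  # Keep every passenger_id that shares the highest count.
  top_ids = tally.filter { |_passenger_id, count| count == highest_count }.keys

  # Array#find stands in for Passenger.find(id) in this sketch.
  top_ids.map { |id| passengers.find { |passenger| passenger[:id] == id } }
end

# Sample rows invented for this example.
passengers = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Ben" },
  { id: 3, name: "Cy" }
]
bookings = [1, 3, 2, 1, 3].map { |passenger_id| { passenger_id: passenger_id } }

p top_passengers(bookings, passengers).map { |passenger| passenger[:name] }
# => ["Ada", "Cy"]
```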

An alternate way of finding the data we need is by using sort_by to arrange the tally into an array of key value pair arrays.

sorted_tally = tally_bookings.sort_by{|k,v| v}.reverse
# => [[36, 6],  [16, 6],  [44, 6], [46, 6], [34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], ...]

Then, we can run filter on the array.

sorted_tally_max = sorted_tally.filter{|b| b[1] == sorted_tally[0][1]}
# => [[36, 6], [16, 6], [44, 6], [46, 6]]

We can use chunk to organize the data into a series of arrays containing nested arrays based on tally count.

chunked_tally = sorted_tally.chunk{|i| i[1]}
chunked_tally.map{|i| i}
# => [[6, [[36, 6], [16, 6], [44, 6], [46, 6]]],
#     [5, [[34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], [17, 5], [14, 5]]],
#     [4, [[6, 4], [29, 4], [24, 4], [13, 4], [45, 4], [35, 4], [49, 4], [41, 4], [48, 4], [21, 4], [31, 4], [28, 4], [23, 4], [11, 4], [33, 4]]],
#     [3, [[26, 3], [30, 3], [32, 3], [38, 3], [1, 3], [2, 3], [43, 3], [9, 3], [7, 3], [25, 3], [47, 3], [27, 3]]],
#     [2, [[22, 2], [12, 2], [4, 2], [18, 2], [40, 2], [42, 2], [15, 2]]],
#     [1, [[50, 1], [8, 1], [52, 1]]]]

To only return arrays of sorted passenger_id grouped together based on tally, we can even use a nested map. First, we select only the nested arrays of key value pairs with i[1], then we use map again to only return the passenger_id with j[0]. We get an array of arrays.

chunked_tally.map{|i| i[1].map{ |j| j[0] }}
# => [[36, 16, 44, 46],
#     [34, 3, 10, 5, 19, 39, 37, 17, 14],
#     [6, 29, 24, 13, 45, 35, 49, 41, 48, 21, 31, 28, 23, 11, 33],
#     [26, 30, 32, 38, 1, 2, 43, 9, 7, 25, 47, 27],
#     [22, 12, 4, 18, 40, 42, 15], [50, 8, 52]]

We can even use chunk_while to group the data into arrays by booking tally, without the block's return value as the first element of each outer array.

chunked_while_tally = sorted_tally.chunk_while{|a,b| a[1] == b[1]}
chunked_while_tally.map{|i| i}
# => [[[36, 6], [16, 6], [44, 6], [46, 6]],
#     [[34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], [17, 5], [14, 5]],
#     [[6, 4], [29, 4], [24, 4], [13, 4], [45, 4], [35, 4], [49, 4], [41, 4], [48, 4], [21, 4], [31, 4], [28, 4], [23, 4], [11, 4], [33, 4]],
#     [[26, 3], [30, 3], [32, 3], [38, 3], [1, 3], [2, 3], [43, 3], [9, 3], [7, 3], [25, 3], [47, 3], [27, 3]],
#     [[22, 2], [12, 2], [4, 2], [18, 2], [40, 2], [42, 2], [15, 2]],
#     [[50, 1], [8, 1], [52, 1]]]

We can extract the ids again with a nested map, as with the other chunked array. This time we don't need i[1] to select the nested array, so we call the nested map on i directly.

chunked_while_tally.map{|i| i.map{|j| j[0]} }
# => [[36, 16, 44, 46],
#     [34, 3, 10, 5, 19, 39, 37, 17, 14],
#     [6, 29, 24, 13, 45, 35, 49, 41, 48, 21, 31, 28, 23, 11, 33],
#     [26, 30, 32, 38, 1, 2, 43, 9, 7, 25, 47, 27],
#     [22, 12, 4, 18, 40, 42, 15],
#     [50, 8, 52]]
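The sort_by, chunk, and chunk_while steps can be reproduced on a small hand-made tally, as in the sketch below. The helper names and the sample tally are hypothetical. One deliberate difference from the steps above: the sort key here is [-count, passenger_id], because sort_by makes no promise about the order of equal keys, and breaking ties by id keeps the output predictable.

```ruby
# Hypothetical helper: [count, ids] pairs, built with chunk.
def ids_by_count(tally)
  tally
    .sort_by { |passenger_id, count| [-count, passenger_id] } # highest count first
    .chunk { |_passenger_id, count| count }                   # group by count
    .map { |count, pairs| [count, pairs.map { |pair| pair[0] }] }
end

# Hypothetical helper: only the grouped ids, built with chunk_while.
def ids_grouped_by_count(tally)
  tally
    .sort_by { |passenger_id, count| [-count, passenger_id] }
    .chunk_while { |a, b| a[1] == b[1] }                      # split when count changes
    .map { |group| group.map { |pair| pair[0] } }
end

# Sample tally invented for this example (passenger_id => number of bookings).
small_tally = { 6 => 2, 34 => 3, 26 => 1, 30 => 3, 29 => 2 }

p ids_by_count(small_tally)
# => [[3, [30, 34]], [2, [6, 29]], [1, [26]]]
p ids_grouped_by_count(small_tally)
# => [[30, 34], [6, 29], [26]]
```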

The many enumerables Ruby provides are powerful tools that give us many ways to organize and find the data we're looking for, especially when working with large and growing databases.

Resources

-https://flatironschool.com/
-https://ruby-doc.org/2.7.7/Enumerable.html
