As a language often used for server-side applications, Ruby frequently has to handle large amounts of data, and that data usually needs to be parsed. Ruby's Enumerable methods, available on arrays and hashes, can be used to find, organize, or filter pieces of data, especially when the methods are used alongside each other.
If you're familiar with JavaScript, map, filter, and find reappear in Ruby, where they also have the alternate names collect, select, and detect and behave much like their JavaScript counterparts. map/collect creates a new array from the return values of the code block, filter/select/find_all creates a new array of the elements for which the block returns a truthy value, and find/detect returns the first element for which the block returns a truthy value.
p ["cat", "dog", "bird"].map{|i| i.upcase }
# ["CAT", "DOG", "BIRD"]
p [1,2,3,4,5].filter{|i| i > 2 }
# [3, 4, 5]
p [1,2,3,4,5].find{|i| i > 2 }
# 3
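As a quick check of the aliases described above, this sketch runs each group of names on the same made-up array and compares the results. The sample data and variable names are illustrative only, and filter assumes Ruby 2.6 or newer.

```ruby
# Illustrative data, not taken from the examples above.
numbers = [1, 2, 3, 4, 5]

# map and collect are two names for the same operation.
doubled_map     = numbers.map     { |n| n * 2 }
doubled_collect = numbers.collect { |n| n * 2 }

# On an array, filter, select, and find_all keep the elements whose block is truthy.
big_filter   = numbers.filter   { |n| n > 2 }
big_select   = numbers.select   { |n| n > 2 }
big_find_all = numbers.find_all { |n| n > 2 }

# find and detect both return the first match.
first_find   = numbers.find   { |n| n > 2 }
first_detect = numbers.detect { |n| n > 2 }

p doubled_map == doubled_collect                          # true
p big_filter == big_select && big_select == big_find_all  # true
p first_find == first_detect                              # true
```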
count returns the number of elements in the array, similar to the length property in JavaScript. Given an argument or a code block, count returns the number of elements that equal the argument or satisfy the block. tally creates a hash that counts how many times each element appears in the array: each element becomes a key and its count becomes the value. any? returns true if at least one element passes the condition, while all? returns true only if every element passes. none? returns true if no element passes the condition, while one? returns true if exactly one element passes. max returns the element with the maximum value, min returns the element with the minimum value, and minmax returns both as a two-element array of [min, max].
p ["a","a","b","b","b","c"].count
# 6
p ["a","a","b","b","b","c"].count("a")
p ["a","a","b","b","b","c"].count{|i| i == "a" }
# 2
p ["a","a","b","b","b","c"].tally
# {"a"=>2, "b"=>3, "c"=>1}
p [3,5,1,4,2].any?{|i| i > 3}
# true
p [3,5,1,4,2].all?{|i| i > 3}
# false
p [3,5,1,4,2].all?{|i| i > 0}
# true
p [3,5,1,4,2].none?{|i| i > 5 }
# true
p [3,5,1,4,2].one?{|i| i > 4 }
# true
p [3,5,1,4,2].max
# 5
p ["kitty", "dog", "bird"].max{|a,b| a.length <=> b.length }
# "kitty"
p [3,5,1,4,2].min
# 1
p ["kitty", "dog", "bird"].min{|a,b| a.length <=> b.length }
# "dog"
p [3,5,1,4,2].minmax
# [1, 5]
p ["kitty", "dog", "bird"].minmax{|a,b| a.length <=> b.length }
# ["dog", "kitty"]
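Since these methods are most useful in combination, here is a small sketch that chains a few of them on an invented list of words. The word list and variable names are assumptions made for illustration, and tally assumes Ruby 2.7 or newer.

```ruby
# Invented sample data.
words = ["Cat", "dog", "cat", "Bird", "dog", "cat"]

# Normalize case with map, then count occurrences with tally.
counts = words.map { |w| w.downcase }.tally

# count with a block also works on the hash: how many words appear more than once?
repeated = counts.count { |word, n| n > 1 }
p repeated                           # 2

# Predicate and min/max methods over the tallied values.
p counts.values.all? { |n| n >= 1 }  # true
p counts.values.one? { |n| n == 3 }  # true
p counts.values.minmax               # [1, 3]
```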
chunk walks through the elements in order and groups consecutive elements that share the same block return value. Each group is a two-element array: the first element is the block's return value and the second is a nested array of the consecutive elements that produced it. When the return value changes, a new group starts. chunk itself returns an Enumerator, which can be read with other enumerable methods like each or map.
array = [3,3,5,1,4,2].chunk{|a| a }
array.each{|i| p i }
# [3, [3, 3]] [5, [5]] [1, [1]] [4, [4]] [2, [2]]
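To make the consecutive part of chunk easier to see, this sketch groups an invented array by whether each number is even. The same block value shows up in more than one group because the odd runs are separated by an even run.

```ruby
# Invented sample data: an odd run, an even run, then another odd run.
numbers = [1, 3, 5, 2, 4, 7]

# chunk returns an Enumerator; to_a turns it into an array of [key, elements] pairs.
groups = numbers.chunk { |n| n.even? }.to_a

groups.each { |key, elements| p [key, elements] }
# [false, [1, 3, 5]]
# [true, [2, 4]]
# [false, [7]]
```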
Similar to chunk, chunk_while splits the elements into arrays, but its block compares each pair of adjacent elements in array order. A chunk keeps growing while the block returns true for the current element and the next one, and a new chunk starts when the block returns false. Both chunk methods become more useful when combined with sorting methods.
array2 = [3,7,3,5,1,4,2].chunk_while{|a,b| a > b }
array2.each{|i| p i }
# [3] [7, 3] [5, 1] [4, 2]
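One way to see why sorting pairs well with chunk_while is to look for runs of consecutive integers. The sketch below uses invented data: it sorts first so that neighbouring numbers sit next to each other, then starts a new chunk wherever the next number is not exactly one higher.

```ruby
# Invented sample data, deliberately out of order.
numbers = [10, 1, 4, 2, 11, 9, 15, 12]

# Without sorting, no neighbour is exactly one higher, so every element is alone.
unsorted_runs = numbers.chunk_while { |a, b| b == a + 1 }.to_a
p unsorted_runs.length  # 8

# After sorting, consecutive integers are adjacent and get grouped together.
runs = numbers.sort.chunk_while { |a, b| b == a + 1 }.to_a
p runs                  # [[1, 2], [4], [9, 10, 11, 12], [15]]
```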
There are two commonly used sorting methods: sort and sort_by. sort compares two elements in a code block and orders them by the result: a negative return value places the first element ahead of the second, while a positive return value places it behind. sort_by calls its block once per element, stores the results in a temporary array, and sorts by those results. This makes it better suited to larger, more detailed data, especially when the comparison would otherwise call the same methods repeatedly. Because sort_by takes one element at a time, it also works well with hashes. The order of sort_by can be reversed by chaining reverse or, when the block returns a number, by negating that return value with a minus sign.
p [3,5,1,4,2].sort{|a,b| a <=> b }
p [3,5,1,4,2].sort{|a,b| a - b }
# [1, 2, 3, 4, 5]
p [3,5,1,4,2].sort{|a,b| b <=> a }
p [3,5,1,4,2].sort{|a,b| b - a }
# [5, 4, 3, 2, 1]
array = [["a","b","c"], ["x","j","k","i"] , [1,8,2,4,6,4]]
p array.sort_by{|a| a.count }
# [["a", "b", "c"], ["x", "j", "k", "i"], [1, 8, 2, 4, 6, 4]]
p array.sort_by{|a| a.count }.reverse
# [[1, 8, 2, 4, 6, 4], ["x", "j", "k", "i"], ["a", "b", "c"]]
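The following sketch puts sort and sort_by side by side on an invented word list and shows the minus-sign trick for reversing a numeric sort_by. The words were chosen so that no two have the same length, which keeps the output unambiguous.

```ruby
# Invented sample data with distinct lengths (5, 3, 4, 7).
words = ["kitty", "dog", "bird", "hamster"]

# sort compares two elements at a time; the block runs for every comparison.
by_length_sort = words.sort { |a, b| a.length <=> b.length }

# sort_by calls the block once per element and sorts by the results.
by_length_sort_by = words.sort_by { |w| w.length }

# Negating a numeric block value reverses the order without calling reverse.
longest_first = words.sort_by { |w| -w.length }

p by_length_sort     # ["dog", "bird", "kitty", "hamster"]
p by_length_sort_by  # ["dog", "bird", "kitty", "hamster"]
p longest_first      # ["hamster", "kitty", "bird", "dog"]
```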
Some enumerable methods can also be called on hashes, where the block takes two arguments for each key-value pair: the first is the key and the second is the value. map on a hash still returns an array of the block's return values, while filter returns a hash of the key-value pairs that pass the condition. sort_by on a hash returns an array of arrays, each containing a key and value from the original hash. This means you can sort by the values of a hash to find which key corresponds to the highest value. By comparison, sort without a block orders the pairs by key first, and a block given to sort receives two [key, value] pairs rather than a separate key and value. You can sort the values on their own, but then the keys will be missing from the returned array. Enumerating hashes with code blocks allows for more flexibility when searching for data inside nested arrays and hashes.
hash = {"a":5, "b":3, "c":4, "d":2, "e":1} # "a": creates the symbol key :a
p hash.map{|k,v| v * 10 }
# [50, 30, 40, 20, 10]
p hash.filter{|k,v| v > 3 }
# {:a=>5, :c=>4}
p hash.sort_by{|k,v| v}
# [[:e, 1], [:d, 2], [:b, 3], [:c, 4], [:a, 5]]
p hash.sort_by{|k,v| k}
# [[:a, 5], [:b, 3], [:c, 4], [:d, 2], [:e, 1]]
hash2 = {d: 5 , e: 3 , a:4 , c: 2 , b:1}
p hash2.sort
# [[:a, 4], [:b, 1], [:c, 2], [:d, 5], [:e, 3]]
p hash2.values.sort
# [1, 2, 3, 4, 5]
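As a sketch of the difference in block arguments, the example below sorts the same hash by value with both methods. It reuses the hash2 data from above; the other variable names are made up for illustration. It also shows to_h, which turns the resulting array of pairs back into a hash.

```ruby
hash2 = { d: 5, e: 3, a: 4, c: 2, b: 1 }

# With sort, each block argument is a [key, value] pair, so index 1 is the value.
by_value_sort = hash2.sort { |a, b| a[1] <=> b[1] }

# With sort_by, the block receives the key and value separately.
by_value_sort_by = hash2.sort_by { |k, v| v }

p by_value_sort == by_value_sort_by  # true

# Both results are arrays of pairs; to_h rebuilds a hash in the sorted order.
p by_value_sort_by.to_h.keys         # [:b, :c, :e, :a, :d]
```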
max, min, and minmax have counterparts max_by, min_by, and minmax_by, which take one element at a time and are better suited to hashes and detailed data for the same reasons sort_by is.
hash2 = {d: 5 , e: 3 , a:4 , c: 2 , b:1}
p hash2.max_by{|k,v| v }
# [:d, 5]
p hash2.min_by{|k,v| v }
# [:b, 1]
p hash2.minmax_by{|k,v| v }
# [[:b, 1], [:d, 5]]
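The same methods also work on more detailed data such as an array of hashes. The records below are invented for illustration, with distinct ages and name lengths so that there are no ties.

```ruby
# Invented records.
pets = [
  { name: "Rex",    species: "dog",  age: 7 },
  { name: "Milo",   species: "cat",  age: 3 },
  { name: "Tweety", species: "bird", age: 12 }
]

# The block picks the single value each record is compared by.
oldest   = pets.max_by { |pet| pet[:age] }
youngest = pets.min_by { |pet| pet[:age] }

# minmax_by returns [min, max], which can be destructured into two variables.
shortest_name, longest_name = pets.minmax_by { |pet| pet[:name].length }

p oldest[:name]         # "Tweety"
p youngest[:name]       # "Milo"
p shortest_name[:name]  # "Rex"
p longest_name[:name]   # "Tweety"
```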
chunk can also be called on a hash and behaves much like it does on an array: it groups consecutive key-value pairs by the block's return value and converts each pair into a [key, value] array.
hash = {a:5, b:1, c:4, d:2, e:5}
chunk_hash = hash.chunk{|k,v| v > 1 }
chunk_hash.each{|i| p i}
# [true, [[:a, 5]]]
# [false, [[:b, 1]]]
# [true, [[:c, 4], [:d, 2], [:e, 5]]]
Here's an example of using enumerables to find data. Let's say we have two ActiveRecord models backed by SQL tables: Passenger and Booking. A passenger has many bookings, but the passengers table consists only of a primary key id and a name string column, while each booking belongs to one passenger through a foreign key.
[Image: passengers table]
[Image: bookings table]
We want to find the highest number of bookings held by a single passenger, based on how many times each passenger foreign key appears across all the bookings. We can access all the bookings with Booking.all, then chain map to return an array of the passenger foreign keys, stored in the bookings table as passenger_id.
Booking.all.map{|booking| booking.passenger_id}
# [6, 34, 26, 30, 29, 3, 36, 16, 24, 10, 5, 19, 13, 39, 32, 37, 45, ....]
Now we have a large, unorganized array of passenger_id values, and we need to count how many times each passenger_id appears in it. From here, we can use tally to create a hash of key-value pairs: the keys are the passenger_id values and the values are the number of times each one shows up in the array.
tally_bookings = Booking.all.map{|booking| booking.passenger_id}.tally
# {6=>4, 34=>5, 26=>3, 30=>3, 29=>4, 3=>5, 36=>6, 16=>6, 24=>4, 10=>5, 5=>5, ...}
# passenger_id 34 shows up five times in the array
Now we have a hash containing the information we need, but it isn't ordered. Here, we can use max_by to find the highest tally count.
highest_booking = tally_bookings.max_by{|k,v| v}
# [36, 6]
We get an array with the passenger_id at the first index and the highest tally count at the second. But what if more than one passenger has 6 bookings? There are different ways to find the other passengers. Since we have highest_booking, we can filter tally_bookings to find the matching passenger_id values.
highest_booking_passengers = tally_bookings.filter{|k,v| v == highest_booking[1]}
# {36=>6, 16=>6, 44=>6, 46=>6}
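The steps so far can be tried without a database. The sketch below is a plain-Ruby stand-in, not ActiveRecord: the booking_row Struct and its sample ids are invented, and an ordinary array takes the place of Booking.all.

```ruby
# Hypothetical stand-in for a Booking record: a Struct holding only the foreign key.
booking_row = Struct.new(:passenger_id)

# Invented rows standing in for Booking.all.
bookings = [6, 34, 36, 16, 36, 34, 16, 6, 36, 16, 3].map { |id| booking_row.new(id) }

# Step 1: collect the foreign keys.
passenger_ids = bookings.map { |booking| booking.passenger_id }

# Step 2: count how often each id appears.
tally_bookings = passenger_ids.tally

# Step 3: find one [id, count] pair with the highest count.
highest_booking = tally_bookings.max_by { |id, count| count }

# Step 4: keep every id that shares that highest count.
highest_booking_passengers = tally_bookings.filter { |id, count| count == highest_booking[1] }

p highest_booking[1]               # 3
p highest_booking_passengers.keys  # [36, 16]
```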
From here, we can use map or each to retrieve the passengers, using the built-in ActiveRecord method find to match the foreign keys in our hash with the primary keys of the Passenger records.
highest_booking_passengers.map{|k,v| Passenger.find(k)}
# [#<Passenger:0x0000563b635eade0 id: 36, name: "Chae Jacobson">,
#  #<Passenger:0x0000563b63596970 id: 16, name: "Mary Erdman V">,
#  #<Passenger:0x0000563b6358ef40 id: 44, name: "Lauryn Gusikowski">,
#  #<Passenger:0x0000563b6358d258 id: 46, name: "Ricky Strosin DC">]
An alternative way to find the data we need is to use sort_by to arrange the tally into an array of [key, value] arrays.
sorted_tally = tally_bookings.sort_by{|k,v| v}.reverse
# [[36, 6], [16, 6], [44, 6], [46, 6], [34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], ...]
Then, we can run filter on the array.
sorted_tally_max = sorted_tally.filter{|b| b[1] == sorted_tally[0][1]}
# [[36, 6], [16, 6], [44, 6], [46, 6]]
We can use chunk to organize the data into a series of arrays, each containing a nested array of pairs that share the same tally count.
chunked_tally = sorted_tally.chunk{|i| i[1]}
chunked_tally.map{|i| i}
# [[6, [[36, 6], [16, 6], [44, 6], [46, 6]]],
#  [5, [[34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], [17, 5], [14, 5]]],
#  [4, [[6, 4], [29, 4], [24, 4], [13, 4], [45, 4], [35, 4], [49, 4], [41, 4], [48, 4], [21, 4], [31, 4], [28, 4], [23, 4], [11, 4], [33, 4]]],
#  [3, [[26, 3], [30, 3], [32, 3], [38, 3], [1, 3], [2, 3], [43, 3], [9, 3], [7, 3], [25, 3], [47, 3], [27, 3]]],
#  [2, [[22, 2], [12, 2], [4, 2], [18, 2], [40, 2], [42, 2], [15, 2]]],
#  [1, [[50, 1], [8, 1], [52, 1]]]]
To return only arrays of passenger_id values grouped by tally, we can use a nested map. First, we select the nested array of key-value pairs with i[1], then we use map again to return only the passenger_id with j[0]. We get an array of arrays.
chunked_tally.map{|i| i[1].map{ |j| j[0] }}
# [[36, 16, 44, 46],
#  [34, 3, 10, 5, 19, 39, 37, 17, 14],
#  [6, 29, 24, 13, 45, 35, 49, 41, 48, 21, 31, 28, 23, 11, 33],
#  [26, 30, 32, 38, 1, 2, 43, 9, 7, 25, 47, 27],
#  [22, 12, 4, 18, 40, 42, 15], [50, 8, 52]]
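Because chunk keeps the tally count as the first element of each group, one possible variation is to turn the groups into a hash keyed by that count. The sketch below uses a small invented tally rather than the real bookings data, and sorts the ids inside each group because sort_by does not promise a particular order for ties.

```ruby
# Invented tally: passenger_id => number of bookings.
tally_bookings = { 6 => 2, 34 => 2, 36 => 3, 16 => 3, 3 => 1 }

# Highest counts first, using a negated count.
sorted_tally = tally_bookings.sort_by { |id, count| -count }

# chunk yields [count, pairs]; keep the count as the key and only the ids as the value.
ids_by_count = sorted_tally.chunk { |pair| pair[1] }.map do |count, pairs|
  [count, pairs.map { |pair| pair[0] }.sort]
end.to_h

p ids_by_count[3]    # [16, 36]
p ids_by_count.keys  # [3, 2, 1]
```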
We can also use chunk_while to group the data into arrays by booking tally, without the block's return value as the first element of each outer array.
chunked_while_tally = sorted_tally.chunk_while{|a,b| a[1] == b[1]}
chunked_while_tally.map{|i| i}
# [[[36, 6], [16, 6], [44, 6], [46, 6]],
#  [[34, 5], [3, 5], [10, 5], [5, 5], [19, 5], [39, 5], [37, 5], [17, 5], [14, 5]],
#  [[6, 4], [29, 4], [24, 4], [13, 4], [45, 4], [35, 4], [49, 4], [41, 4], [48, 4], [21, 4], [31, 4], [28, 4], [23, 4], [11, 4], [33, 4]],
#  [[26, 3], [30, 3], [32, 3], [38, 3], [1, 3], [2, 3], [43, 3], [9, 3], [7, 3], [25, 3], [47, 3], [27, 3]],
#  [[22, 2], [12, 2], [4, 2], [18, 2], [40, 2], [42, 2], [15, 2]],
#  [[50, 1], [8, 1], [52, 1]]]
We can extract the ids with a nested map, as with the chunk result. This time there is no need to select the nested array with i[1], so we call the nested map directly on i.
chunked_while_tally.map{|i| i.map{|j| j[0]} }
# [[36, 16, 44, 46],
#  [34, 3, 10, 5, 19, 39, 37, 17, 14],
#  [6, 29, 24, 13, 45, 35, 49, 41, 48, 21, 31, 28, 23, 11, 33],
#  [26, 30, 32, 38, 1, 2, 43, 9, 7, 25, 47, 27],
#  [22, 12, 4, 18, 40, 42, 15],
#  [50, 8, 52]]
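The sorting and chunking steps can likewise be tried on a small hand-written tally. The hash below is invented; the rest follows the same sort_by, chunk_while, and nested map sequence, with a negated count in place of reverse and a sort inside each group so the output does not depend on how ties are ordered.

```ruby
# Invented tally: passenger_id => number of bookings.
tally_bookings = { 6 => 2, 34 => 2, 36 => 3, 16 => 3, 3 => 1 }

# Highest counts first.
sorted_tally = tally_bookings.sort_by { |id, count| -count }

# Start a new chunk whenever the count changes between neighbouring pairs.
chunked = sorted_tally.chunk_while { |a, b| a[1] == b[1] }

# Keep only the ids from each [id, count] pair.
grouped_ids = chunked.map { |group| group.map { |pair| pair[0] }.sort }

p grouped_ids  # [[16, 36], [6, 34], [3]]
```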
The many enumerables Ruby provides are powerful tools that give us many ways to organize and find the data we're looking for, especially when working with large and growing databases.