
Derk-Jan Karrenbeld for XP Bytes

Originally published at xpbytes.com

Control flow in reduce/inject (ruby)

reduce (inject) is one of the most powerful methods on the Enumerable
module, which means it is available on instances of any class that includes
that module, including Array, Hash, Set and Range.

reduce can be used in a MapReduce process, is often the basis for
comprehensions, and is a great way to group values or to reduce a set of
values to a single value.

This article quickly shows you how to skip values or conditionally return
values during a reduce iteration, and how to break early, returning a
different value and stopping iteration.


Recap 💬

From the documentation, given an instance enum (an Enumerable), calling
enum.reduce:

# Combines all elements of enum by applying a binary
# operation, specified by a block or a symbol that names a
# method or operator.

An example of using reduce would be to write a function that sums
all the elements in a collection:

##
# Sums each item in the enumerable (naive)
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  sum = 0
  enum.each do |item|
    sum += item
  end
  sum
end

##
# Sums each item in the enumerable (reduce block)
#
# Each iteration, the result of the block is passed in as previous_result
# for the next iteration.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce do |previous_result, item|
    previous_result + item
  end
end

##
# Sums each item in the enumerable (reduce method)
#
# Each iteration the :+ symbol is sent as a message to the current result with
# the next value as argument. The result is the new current result.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce(:+)
end

##
# Alias for enum.sum
#
def summation(enum)
  enum.sum
end

reduce takes an optional initial value. When given, it is used as the
starting result instead of the first item of the collection, and the first
item is then passed to the block like any other.
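To make the difference concrete, here is a minimal sketch of the same sum with and without an initial value. The method names are made up for this illustration; the empty-collection case is where the two behave differently.

```ruby
# Hypothetical helper names, used only for this illustration.

# Without an initial value, the first element seeds the result.
def sum_without_initial(enum)
  enum.reduce { |sum, n| sum + n }
end

# With an initial value, every element (including the first) is yielded.
def sum_with_initial(enum, initial = 0)
  enum.reduce(initial) { |sum, n| sum + n }
end

sum_without_initial([1, 2, 3])  # => 6
sum_with_initial([1, 2, 3], 10) # => 16

# An empty collection has no first element to seed the result with.
sum_without_initial([]) # => nil
sum_with_initial([])    # => 0
```

Passing an initial value is usually the safer choice when the collection might be empty.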

How to control the flow?

When working with reduce you might find yourself in one of two
situations:

  • you want to conditionally return a different value for the iteration (which is used as base value for the next iteration)
  • you want to break out early (stop iteration altogether)

next ⏭

The next keyword allows you to return early from the block for the current
iteration, which works in any enumeration. The value passed to next becomes
the return value of the block.

Let’s say you want the sum of a set of numbers, but with half of any even
number, and double of any odd number:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    result + i * (i.even? ? 0.5 : 2)
  end
end

Not too bad. But now another business requirement comes in to skip any number
under 5:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    if i < 5
      result
    else
      result + i * (i.even? ? 0.5 : 2)
    end
  end
end

Ugh. That’s not very nice Ruby code. Using next, it could look like:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
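One thing to watch out for: inside reduce, next has to carry the result along. A bare next makes the block return nil, and that nil becomes the result for the following iteration. The sketch below shows both variants; the method names are hypothetical and only exist for this example.

```ruby
# Hypothetical method names, used only for this illustration.

# Broken: a bare `next` makes the block return nil, so `sum` is nil
# on the following iteration.
def sum_skipping_twos_broken(enum)
  enum.reduce(0) do |sum, n|
    next if n == 2
    sum + n
  end
end

# Working: `next sum` passes the current result through unchanged.
def sum_skipping_twos(enum)
  enum.reduce(0) do |sum, n|
    next sum if n == 2
    sum + n
  end
end

sum_skipping_twos([1, 2, 3]) # => 4

begin
  sum_skipping_twos_broken([1, 2, 3])
rescue NoMethodError
  # Raised because the block tried to call + on nil.
end
```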

next works in any enumeration, so if you’re just processing items using
.each, you can use it too:

(1..10).each do |num|
  next if num.odd?
  puts num
end
# 2
# 4
# 6
# 8
# 10
# => 1..10

break 🛑

Instead of skipping to the next item, you can completely stop iteration of an
enumerator using break.

If we have the same business requirements as before, but we have to return the
number 42 if the item is exactly 7, this is what it would look like:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    break 42 if i == 7
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
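The value given to break becomes the return value of the whole reduce call, not just of the block. A small sketch with hypothetical method names shows this, along with what happens when break is used without a value:

```ruby
# Hypothetical method names, used only for this illustration.

# Stops at the first number above the limit and returns the sum so far.
def sum_until_over(enum, limit)
  enum.reduce(0) do |sum, n|
    break sum if n > limit
    sum + n
  end
end

# A bare `break` makes reduce itself return nil.
def sum_or_nil_when_over(enum, limit)
  enum.reduce(0) do |sum, n|
    break if n > limit
    sum + n
  end
end

sum_until_over(1..10, 4)        # => 10
sum_or_nil_when_over(1..10, 4)  # => nil
sum_or_nil_when_over(1..3, 4)   # => 6
```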

Again, this works in any loop. So if you’re using find to try to find
an item in your enumeration and want to change the return value of that
find, you can do so using break:

def find_my_red_item(enum)
  enum.find do |item|
    # items are hashes, so read their values by key
    break item[:name] if item[:color] == 'red'
  end
end

find_my_red_item([
  { name: "umbrella", color: "black" },
  { name: "shoe", color: "red" },
  { name: "pen", color: "blue" }
])
# => 'shoe'

StopIteration

You might have heard about or seen raise StopIteration.
It is a special exception that you can use to stop iteration of an enumeration,
as it is rescued by Kernel#loop, but its use cases are limited as
you should not try to control flow using raise or fail. The
airbrake blog has a good article about this
use case.
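For reference, this is roughly the situation where StopIteration shows up on its own: an external enumerator raises it when next is called past the last item, and Kernel#loop rescues it and ends the loop. The drain method below is a made-up name for this sketch.

```ruby
# Hypothetical helper, used only for this illustration.
# Pulls every value out of an external enumerator.
def drain(enumerator)
  seen = []
  loop do
    # Enumerator#next raises StopIteration once the items run out;
    # Kernel#loop rescues that exception and stops looping.
    seen << enumerator.next
  end
  seen
end

drain([1, 2, 3].each) # => [1, 2, 3]
```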

When to use reduce

If you need a guideline for when to use reduce, look no further. I
use these four rules to determine if I need to use reduce or
each_with_object or something else.

I use reduce when:

  • reducing a collection of values to a smaller result (e.g. 1 value)
  • grouping a collection of values (use group_by if possible)
  • changing immutable primitives / value objects (returning a new value)
  • you need a new value (e.g. new Array or Hash)
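As a small illustration of the "returning a new value" case, here is a sketch that merges a list of hashes. Hash#merge does not mutate its receiver, so every iteration hands a fresh Hash to the next one. The method name merge_all is an assumption made for this example.

```ruby
# Hypothetical helper, used only for this illustration.
# Each iteration returns a new Hash; none of the inputs are mutated.
def merge_all(hashes)
  hashes.reduce({}) { |merged, hash| merged.merge(hash) }
end

merge_all([{ a: 1 }, { b: 2 }, { a: 3 }]) # => { a: 3, b: 2 }
```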

Alternatives 🔀

When the use case does not match the guidelines above, most of the time I
actually need each_with_object. It has a similar signature, but instead of
building a new value from the return value of the block, it iterates the
collection with a predefined “object” that is passed to every iteration,
making it much easier to use logic inside the block:

doubles = (1..10).each_with_object([]) do |num, result|
  result << num * 2
  # same as result.push(num * 2)
end
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

doubles_over_ten = (1..10).each_with_object([]) do |num, result|
  result << num * 2 if num > 5
end
# => [12, 14, 16, 18, 20]

Use each_with_object when:

  • building a new container (e.g. Array or Hash). Note that you’re not really reducing the current collection to a smaller result, but instead conditionally or unconditionally mapping values.
  • you want logic in your block without repeating the result value (because you must provide a return value when using reduce)

My use case

I looked into control flow using reduce because I was iterating
through a list of value objects that represented a migration path. Without using
lazy, I wanted an elegant way of representing when these
migrations should run, so I used semantic versioning. The migrations enumerable is
a sorted list of migrations with a semantic version attached.

migrations.reduce(input) do |migrated, (version, migration)|
  migrated = migration.call(migrated)
  next migrated unless current_version.in_range?(version)
  break migrated
end

The function in_range? determines if a migration is executed, based on the
current “input” version, and the semantic version of the migration. This will
execute migrations until the “current” version becomes in-range, at which point
it should execute the final migration and stop.
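To try the snippet above in isolation, here is a runnable sketch with stand-ins for the parts the article does not show. CurrentVersion, its in_range? rule (plain integer comparison instead of semantic versions), the run_migrations wrapper and the lambdas are all assumptions made for this illustration, not the real implementation.

```ruby
# Hypothetical stand-in: the real in_range? rule is not shown here,
# so this one simply compares integers.
CurrentVersion = Struct.new(:number) do
  def in_range?(migration_version)
    migration_version >= number
  end
end

# Hypothetical wrapper around the reduce call from the article.
def run_migrations(input, migrations, current_version)
  migrations.reduce(input) do |migrated, (version, migration)|
    migrated = migration.call(migrated)
    next migrated unless current_version.in_range?(version)
    break migrated
  end
end

# Sorted list of [version, migration] pairs; each migration returns a new value.
migrations = [
  [1, ->(data) { data + ['v1'] }],
  [2, ->(data) { data + ['v2'] }],
  [3, ->(data) { data + ['v3'] }]
]

run_migrations([], migrations, CurrentVersion.new(2)) # => ["v1", "v2"]
```

With these stand-ins the third migration never runs, because break stops the iteration as soon as the second one is in range.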

The alternatives were less favourable:

  • take_while, select and friends are able to filter the list, but they require multiple iterations of the migrations collection (filter, then “execute”);
  • find would be a good candidate, but I needed to change the input, which would require a bookkeeping variable to keep track of “migrated”. Bookkeeping variables are almost never necessary in Ruby.

