DEV Community

nwdunlap17


An Introduction to Yield in Ruby, and filtering class instances

Keep your code DRY by avoiding WET

Everyone who has been programming for a while knows the value of keeping code DRY (Don’t Repeat Yourself). The last thing you want is WET (Write Everything Twice) code. That’s right, it’s such an important concept that there are two acronyms for it!

There are many benefits to keeping code as DRY as possible.

  • It’s usually faster to turn a chunk of repeated code into a method. The next time you need it, you can call it in one line.

  • If you need to change your code, you only need to do so in one spot.

  • Your code is far easier to understand if every section is unique. Otherwise you risk reviewers looking at a section, saying ‘I’ve seen this before’, and not noticing that this time it’s a tiny bit different.

So it’s always good practice to put any repeated code into a method that can be called repeatedly. But what about those times when you do want to change your code block very slightly? For those instances, you have the yield keyword.

How Yield Works

The yield keyword lets a function hand control to a block of code supplied by the caller, partway through its own execution, and then carry on once the block finishes. Essentially, you can build a function that has a yield in it, and then supply different code for that ‘yield’ every time you call the function.

When you call a function with a yield, you include a block of code, surrounded by curly braces (or do...end), that runs wherever the yield keyword appears. The following two functions return the same value.

    def hello()
        return "Hello World!"
    end
    hello() #=> "Hello World!"


    def hello2()
        return yield
    end
    hello2(){"Hello World!"} #=> "Hello World!"

Multiple yields can be used in the same function, and each one will be replaced by the same code block.

    def double()
        yield
        yield
    end
    double(){print "Hi!"} #=> prints "Hi!Hi!"

Of course, code blocks would be fairly useless without the ability to pass variables through the yield. In the below code snippet, the name variable is passed to the yield block, where it is called person.

    def greet(name)
        print "You see #{name}. "
        puts yield(name)
    end
    greet("Dave"){|person| "Hi #{person}!"} #=> prints "You see Dave. Hi Dave!"
    greet("Dave"){|person| "#{person} says hello!"} #=> prints "You see Dave. Dave says hello!"

You may have noticed in the last example that yield made life a bit easier for us. Notice that in the above two examples, we put the name at either the beginning or end of the second sentence. If we were passing the rest of the sentence into a function, we would need a bit of extra logic to determine where exactly the name should go. But thanks to yield giving us direct access to the variable inside the function, we can easily put it where we want.

Functions vs Yield Blocks

  • The same function can be called many times, but a block literal must be written out in full at every call site (unless you store it in a proc or lambda and pass it with &). Therefore it is best practice to put as much as possible in the function, and only use yield for the parts that vary.

  • Yield blocks receive values the way functions do: whatever is passed to yield arrives as the block's parameters. Unlike a method, a block is also a closure, so it can read local variables from the place where the block was written. Arguments are passed as object references. Reassigning a block parameter (for example n += 1 on an integer) does not change the caller's variable, but mutating the object in place (appending to an array, hash, or unfrozen string) does.

  • Writing return yield ends the function at that line, so anything after the yield will not execute. A return inside the block itself goes further: it tries to return from the method where the block was written, not just from the block. You’ll probably want to avoid both in most cases, so just use implicit return.

  • Within the function containing yield, you can treat yield as a function for the purposes and syntax of passing it values and retrieving values from it. For example, if a yield block takes in a string and returns an array, you can use yield(string)[3] to pass in the string and get the 4th element of that array. You can also set variables equal to the output of yield: foo = yield.
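The points above can be checked with a short sketch. The helper names (fetch_fourth, apply) and the sample values are made up for illustration and are not part of the article's code.

```ruby
# Hypothetical helper: treats yield like a method call and indexes its result.
def fetch_fourth(string)
  yield(string)[3]
end

# Hypothetical helper: hands its argument to the block and returns the block's value.
def apply(value)
  result = yield(value)
  result
end

fourth = fetch_fourth("a,b,c,d,e") { |s| s.split(",") }

# Reassigning a block parameter only rebinds the block's own name.
counter = 1
apply(counter) { |n| n += 1 }

# Mutating the object in place is visible to the caller.
list = [1, 2]
apply(list) { |l| l << 3 }

# A block is a closure, so it can read locals from where it was written.
greeting = "Hi"
message = apply("Dave") { |name| "#{greeting} #{name}!" }

puts fourth   # d
puts counter  # 1
p list        # [1, 2, 3]
puts message  # Hi Dave!
```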

Using Yield with Class Iterators

One of the most common uses of yield is adapting classes for use in iterators. Imagine that we have a Person class, and that each instance of this class has various attributes: name, age, career, etc. If we wanted to make functions that could easily filter out specific people, we might have a few functions that look like this.

    def filterPeopleByAge(minAge,maxAge)
        people = Person.all.filter do |person|
            person.age >= minAge && person.age <= maxAge
        end
        return people
    end
    filterPeopleByAge(18,26) 
    #Returns everyone between 18 and 26

    def filterPeopleByName(name)
        people = Person.all.filter do |person|
            person.first_name == name
        end
        return people
    end
    filterPeopleByName("Dave")
    #Returns everyone named Dave


    def filterPeopleByOccupation(occupation_array)
        people = Person.all.filter do |person|
            occupation_array.include?(person.occupation)
        end
        return people
    end
    filterPeopleByOccupation(["Doctor","Lawyer","Professor"])
    #Returns all doctors, lawyers, and professors

But that certainly seems like a lot of repeated code. We’re only changing a single thing across each of those functions: the filter condition. We can’t simply write a function that can handle searching across every possible attribute (at least, not one that’s tidy or easily legible). But by using a yield, we can directly control the condition statement. The function and yield below can perform the role of the above functions in a much shorter and more legible manner.

    def filterPeople()
        people = Person.all.filter do |person|
            yield(person)
        end
        return people
    end
    filterPeople(){|person| person.age >= 18 && person.age <= 26}
    # Returns all people between 18 and 26
    filterPeople(){|person| person.first_name == "Dave"}
    # Returns all people whose first name is Dave
    filterPeople(){|person| ["Doctor","Lawyer","Professor"].include?(person.occupation)}
    # Returns all doctors, lawyers, and professors

This generic function is even more powerful than the previous set of functions for three reasons. First, it's still just as valid even if we add additional attributes to the person class. Second, it only takes one line to call and create new search criteria. And finally, we can easily use it to combine search criteria.

    filterPeople() {|person| person.age < 40 && person.occupation == 'Lawyer'}
    #Returns all lawyers under 40
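To try the generic filter without a database, here is a minimal sketch. The article never shows Person, so this version assumes a plain Struct with a hand-written Person.all and made-up sample data. filter_people is the same idea as filterPeople, spelled in snake_case and using select (filter is an alias for it in Ruby 2.6 and later).

```ruby
# Assumed stand-in for the article's Person class.
Person = Struct.new(:first_name, :age, :occupation) do
  # Assumed sample data; a real app would load these from storage.
  def self.all
    [
      new("Dave", 25, "Doctor"),
      new("Anna", 35, "Lawyer"),
      new("Dave", 52, "Professor"),
      new("Kim", 19, "Artist")
    ]
  end
end

# The block supplies the condition; the method supplies the iteration.
def filter_people
  Person.all.select { |person| yield(person) }
end

daves = filter_people { |person| person.first_name == "Dave" }
young_lawyers = filter_people do |person|
  person.age < 40 && person.occupation == "Lawyer"
end

p daves.map(&:age)                 # [25, 52]
p young_lawyers.map(&:first_name)  # ["Anna"]
```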

An even MORE generic solution

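The previous search function looks pretty good on paper, but there is one thing to watch. A block is a closure: it can read local variables that are in scope where the block is written, but it has no access to the local variables of the function that yields to it. Anything that lives inside the function has to be passed through yield.

    def filterPeople(args)
        people = Person.all.filter do |person|
            yield(person,args)
        end
        return people
    end
    minAge = 18
    maxAge = 40
    filterPeople(){|person|person.age > minAge && person.age < maxAge}
    #THIS DOES NOT WORK, but not because of minAge and maxAge: the block can read them, since they are locals in the scope where the block is written. The call raises ArgumentError because filterPeople requires args and none was given.

If we want the criteria to travel with the call instead of being captured from the surrounding scope (for example, to reuse one stored block with different criteria), we need to allow the function to accept some parameters and make them optional. We'll create a generic array args that we can use to hold our criteria.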

    def filterPeople(args=nil)
        people = Person.all.filter do |person|
            yield(person,args)
        end
        return people
    end

Now we are freely capable of passing parameters to the yield through the function. And because the function and yield block are all called together, it should be fairly legible as to what args[foo] is referring to.

    minAge = 14
    maxAge = 18
    filterPeople([minAge,maxAge]){|person,args|person.age > args[0] && person.age < args[1]}


    minAge = 30
    occupations = ['Artist','Race Car Driver','Athlete']
    filterPeople([minAge,occupations]){|person,args|person.age > args[0] &&
        args[1].include?(person.occupation)}

Now we have a function which allows us to create any arbitrary search criteria that we want for the Person class! And we can call it on a single line no less!
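A runnable sketch of the args version follows. Person here is an assumed Struct with made-up sample data, since the article does not define it, and age_between is a hypothetical name. It also shows one situation where passing args pays off: a single stored predicate can be reused with different criteria by handing it over with &.

```ruby
# Assumed stand-in for the article's Person class, with made-up data.
Person = Struct.new(:first_name, :age, :occupation) do
  def self.all
    [
      new("Dave", 16, "Student"),
      new("Anna", 35, "Artist"),
      new("Kim", 45, "Athlete")
    ]
  end
end

# Same shape as filterPeople(args=nil): args is forwarded to the block.
def filter_people(args = nil)
  Person.all.select { |person| yield(person, args) }
end

# One stored predicate, reused with different criteria via &.
age_between = lambda do |person, args|
  person.age > args[0] && person.age < args[1]
end

teens = filter_people([14, 18], &age_between)
thirties = filter_people([30, 40], &age_between)

p teens.map(&:first_name)     # ["Dave"]
p thirties.map(&:first_name)  # ["Anna"]
```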

Top comments (1)

Brandon Weaver

A few asides before I get into it:

  • You can use ruby after triple backticks to get Ruby syntax highlighting instead of just white text. It helps a lot for readability.
  • parens and explicit return statements aren't always necessary
  • DRY can be overdone, everything in moderation, and premature abstraction can bite you.
  • Ruby comments start with # instead of //
  • Ruby tends towards snake_case_names over camelCaseNames

Now then, article wise.

It may be good to mention that yield is effectively the same as this:

def hello2(&block)
  block.call
end

yield just implies it. block_given? is also a nice feature for checking if the caller actually gave a block function or not.
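As a small sketch of block_given?, the two methods below fall back to a default string when the caller supplies no block. hello3 and hello4 are hypothetical names that extend the hello2 example.

```ruby
# Implicit form: block_given? reports whether the caller passed a block.
def hello3
  if block_given?
    yield
  else
    "Hello World!"
  end
end

# Explicit form: the block arrives as a Proc object, or nil if absent.
def hello4(&block)
  block ? block.call : "Hello World!"
end

puts hello3                  # Hello World!
puts hello3 { "Hi there!" }  # Hi there!
puts hello4                  # Hello World!
puts hello4 { "Hi there!" }  # Hi there!
```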

Functions vs Yield blocks

Point One

Not sure what you mean here? A yielded block is a function, and can be reused. You may want to clarify that function is really a method, because function can be more commonly understood to mean proc or lambda instead in Ruby.

Point Two

yielded functions are closures, which means they can see the context where they were created.

def testing_one
  a = 1
  yield
end

b = 2
testing_one { { a: defined?(a), b: defined?(b) } }
# => {:a=>nil, :b=>"local-variable"}

Now when you mention not being able to change Integers that's more because they're primitive values. This will work the same across Procs, blocks, lambdas, methods, and anything else in Ruby.

For Strings though, those are mutable unless you have frozen strings on:

s = "foo"
# => "foo"
testing_one { s << "bar" }
# => "foobar"
s
# => "foobar"

s.freeze
# => "foobar"
testing_one { s << "baz" }
# FrozenError: can't modify frozen String

Point Three

return will end any function, whether or not a yield is involved. Now inside a function, return does some real interesting and potentially confusing things depending on what type it is:

a = proc { return 1; 2 }
# => #<Proc:0x00007f81c33877b8@(pry):22>
a.call
# LocalJumpError: unexpected return

b = lambda { return 1; 2 }
# => #<Proc:0x00007f81c0549ea0@(pry):24 (lambda)>
b.call
# => 1

a2 = proc { next 1; 2 }
# => #<Proc:0x00007f81c1d52920@(pry):26>
a2.call
# => 1

Can't say I ever really understood that one myself, keeps catching me whenever I start using proc for some reason in my code so I tend to go for lambda / -> when possible. lambda also is more explicit about arguments and arity than a proc is, so it's easier to tell something broke.
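The arity difference can be seen in a few lines. This is only a sketch; loose and strict are made-up names.

```ruby
loose = proc { |a, b| [a, b] }
strict = lambda { |a, b| [a, b] }

p loose.call(1)        # [1, nil]  (missing argument becomes nil)
p loose.call(1, 2, 3)  # [1, 2]    (extra argument is dropped)
p strict.call(1, 2)    # [1, 2]

# A lambda checks its argument count the way a method does.
raised =
  begin
    strict.call(1)
    false
  rescue ArgumentError
    true
  end
p raised               # true

p loose.lambda?        # false
p strict.lambda?       # true
```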

Remember though that block functions are Procs:

testing_two {}
# => Proc
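testing_two is not defined anywhere above. A minimal sketch, assuming it simply captures the block with &block and returns its class, would be:

```ruby
# Assumed definition: capture the block explicitly and report its class.
def testing_two(&block)
  block.class
end

klass = testing_two {}
p klass  # Proc
```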

Point Four

This comes back to why, for me, I prefer the explicit passing of a block. Less magic and more easily understandable from a glance. Just remember that's a stylistic choice for me, do what makes more sense for your team.

Using Yield with Class Iterators

Enumerable is a very common use of this:

class Collection
  include Enumerable

  def initialize(items)
    @items = items
  end

  def each
    @items.each { |item| yield item }
  end
end

...which gets us all of the fun Enumerable methods like map and others.
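A quick usage sketch, repeating the Collection class so it runs on its own (the sample numbers are arbitrary):

```ruby
class Collection
  include Enumerable

  def initialize(items)
    @items = items
  end

  # Enumerable builds map, select, sort and the rest on top of each.
  def each
    @items.each { |item| yield item }
  end
end

numbers = Collection.new([3, 1, 2])

p numbers.map { |n| n * 2 }  # [6, 2, 4]
p numbers.select(&:odd?)     # [3, 1]
p numbers.sort               # [1, 2, 3]
p numbers.include?(2)        # true
```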

Now to your example:

def filterPeople()
  people = Person.all.filter do |person|
    yield(person)
  end
  return people
end

A few quick cleanups and we have:

def filterPeople()
  Person.all.filter { |person| yield(person) }
end

filter will return an Array making the assignment redundant. The name for a function used to filter is a predicate.

Now a more pragmatic example of this might be to use the Enumerable trick from above:

class Person
  class << self
    include Enumerable

    def each
      all.each { |person| yield person }
    end
  end
end

This will allow you to directly control the People you get back:

Person.filter { |person| person.age > 18 }
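Here is a self-contained sketch of that idea. The attributes and the fixed list behind Person.all are assumptions for illustration, since the article does not show how Person stores its instances.

```ruby
class Person
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  class << self
    include Enumerable

    # Assumed storage: a fixed list stands in for whatever Person.all really returns.
    def all
      @all ||= [new("Dave", 25), new("Kim", 17), new("Anna", 40)]
    end

    def each
      all.each { |person| yield person }
    end
  end
end

adults = Person.select { |person| person.age > 18 }

p adults.map(&:name)       # ["Dave", "Anna"]
p Person.map(&:name).sort  # ["Anna", "Dave", "Kim"]
p Person.count             # 3
```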

Second Implementation

Your second implementation makes this a bit more complicated:

def filterPeople(args=nil)
  people = Person.all.filter do |person|
    yield(person,args)
  end
  return people
end

minAge = 14
maxAge = 18
filterPeople([minAge,maxAge]){|person,args|person.age > args[0] && person.age < args[1]}

...but you got real close to something fun in Ruby in the process. Let's assume we still have an Enumerable person class out there like above.

We're going to get into some metaprogramming, so buckle up, it's a trip.

For this one we'll need Object#send, ===, to_proc, and Hash to do some fun things with kwargs. If you've ever seen ActiveRecord, this will look familiar:

Person.where(age: 18..40, occupation: 'Driver')

We can do that using === and filter in Ruby by making our own class which responds to ===:

class Query
  def initialize(**conditions)
    @conditions = conditions
  end

  # Define what it means for a value to match a query
  def ===(value)
    # Do all of our conditions match?
    @conditions.all? do |match_key, match_value|
      # We use the key of the hash as a method name to extract a value,
      # then `===` it against our match value. That means the match value
      # could be anything like a `Regexp`, `Range`, or a lot more.
      match_value === value.send(match_key)
    end
  end

  # Make `call`, like `block.call`, point to `===`
  alias_method :call, :===

  # This will make more sense in the example below, but we can treat our class
  # like a function by doing this
  def to_proc
    -> value { self.call(value) }
  end
end

Why the to_proc and the ===? It allows us to do both of these:

# using `to_proc` via `&` to treat our Query class as a function
Person.select(&Query.new(age: 18..40, occupation: 'Driver'))

# using `===` to call our query for a case statement:
case person
when Query.new(age: 18..30)
  # something
when Query.new(occupation: 'Driver')
  # something else
else
  # something else again
end
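Putting the pieces together, the sketch below runs end to end. Person is an assumed Struct and the people array is made-up data; the Query class is the one described above.

```ruby
# Assumed stand-in for the Person class, with made-up sample data below.
Person = Struct.new(:name, :age, :occupation)

class Query
  def initialize(**conditions)
    @conditions = conditions
  end

  # A value matches when every condition === the attribute it names.
  def ===(value)
    @conditions.all? do |match_key, match_value|
      match_value === value.send(match_key)
    end
  end

  alias_method :call, :===

  # Lets & turn a Query into a block for select, map, and friends.
  def to_proc
    ->(value) { call(value) }
  end
end

people = [
  Person.new("Dave", 25, "Driver"),
  Person.new("Anna", 52, "Driver"),
  Person.new("Kim", 30, "Artist")
]

drivers = people.select(&Query.new(age: 18..40, occupation: "Driver"))
p drivers.map(&:name)  # ["Dave"]

label =
  case people[2]
  when Query.new(age: 18..29) then "aged 18 to 29"
  when Query.new(occupation: /art/i) then "artist"
  else "someone else"
  end
puts label  # artist
```

Because each condition is compared with ===, a Range, a Regexp, or a plain value all work as match values without extra code.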

This pattern is super powerful, and ended up expanded in a gem I wrote called Qo. The problem is you have to do some more lifting to get it to work with Array and Hash values to match against them.

Ruby has a lot of potential for flexibility, especially when you know about === and to_proc and how to leverage them.

Other thoughts

Keep writing, it takes a bit to really get into all the fun stuff, and feel free to ask questions on any of that, I got a bit carried away in responding.