Ruby has supported pattern matching on objects since version 2.7, and with it came some of the strangest syntax in the language.
In this article I aim to start small and build up to explaining how
{year: Integer => year_of_birth}
is valid Ruby (in pattern matching).
This article, some bonus content and all the code examples are available in a runnable gist; feel free to play with them after reading.
Intro to pattern matching
Pattern matching allows us to compare the classes of objects.
Using case object; when Condition
relies on a simple Condition === object check. For a Proc, === ends up the same as .call,
so we can pass lambdas or symbols converted to procs, which helps switch on more complex logic.
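As a minimal sketch of that `when` behaviour (the `describe` method and its return strings are made up for illustration), each branch below is tested with `condition === value`:

```ruby
# Each `when` branch is checked with `condition === value`.
def describe(value)
  case value
  when Integer       # Integer === value, a class check
    "an integer"
  when :nil?.to_proc # Proc#=== calls the proc, so this runs value.nil?
    "nil"
  when ->(v) { v.respond_to?(:empty?) && v.empty? } # a lambda is called the same way
    "empty"
  else
    "something else"
  end
end

puts describe(42)      # an integer
puts describe(nil)     # nil
puts describe([])      # empty
puts describe("hello") # something else
```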
Using case object; in Pattern
is very different.
To look at how the pattern definitions have changed case statements I'll take you on the same journey I went on, and in the process, explain that strange syntax from the first paragraph.
Thinking of patterns as a different kind of object from normal Ruby code
helps us to reason about them correctly.
After the in
keyword the Ruby parser acts differently, and we should think about it differently. My mental model is that it creates some sort of object, let's make up a PatternMatching::Pattern instance, and then it checks pattern_matching_pattern.match?(object_at_start_of_case_statement).
I have a picture of how this (fictional) .match?
method works, which so far seems to match reality.
A familiar pattern
Using in
we can do essentially identical class comparisons:
value = 1
case value
in Integer # comparable to value.is_a?(Integer)
  puts "value is an integer"
else
  raise "failed to check class properly?!"
end
# in this case `in` is very similar to `when`
case value
when Integer # Integer === value
  puts "value is _still_ an integer"
else
  raise "failed to check class properly?!"
end
But we can also match on the structure of an object and its constituent parts:
array = [1, 2, 3]
case array
in [Integer, Integer, Integer]
  puts "there are three integers in the array"
else
  raise "what the heck is in there?!"
end
# we can even *collect some of the contents in the same way as method *args and **kwargs.
case array
in [Integer, *tail]
  puts "the array starts with an Integer"
  puts "the rest of the array is #{tail}"
else
  raise "what the heck is in there?!"
end
This is the first variable assignment done by a pattern, which in itself feels quite powerful. It is also the first piece of the puzzle for our strange syntax in the opening paragraph.
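To see those assignments at work, here is a small sketch (the `summarize` method is a made-up helper) where bare names in an array pattern bind to elements and a splat collects whatever sits in between:

```ruby
# Bare identifiers in a pattern are bound to the matching elements.
def summarize(list)
  case list
  in []
    "empty"
  in [only]
    "one element: #{only}"
  in [first, *middle, last]
    "starts with #{first}, ends with #{last}, #{middle.length} in between"
  else
    "not an array"
  end
end

puts summarize([])        # empty
puts summarize([7])       # one element: 7
puts summarize([1, 2, 3]) # starts with 1, ends with 3, 1 in between
puts summarize(5)         # not an array
```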
Going deeper
There are two methods that help with pattern matching on the internal state of objects,
deconstruct and deconstruct_keys,
along with two ways of declaring a pattern for an object:
you can use an array pattern or a hash pattern.
Thing = Data.define(:thing) # Data (Ruby 3.2+) already has the methods we need defined
my_thing = Thing.new(1)
case my_thing
in Thing
  puts "my_thing is a Thing instance"
end
Here the pattern is just the class Thing.
Matching the pattern here is the same as above: it will just check that my_thing is an instance of Thing,
something like:
return false unless my_thing.is_a? Thing
true
now for some fun
Deconstruct
case my_thing
in Thing(Integer)
  puts "my_thing holds an integer"
end
To build this pattern it works from the outside in,
recursively checking that the class of the object is correct,
then checking that object.deconstruct matches the internal pattern.
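The same two-step check works for any class that defines `deconstruct` itself. A minimal sketch, assuming a made-up `Coordinate` class that is not part of the article's gist:

```ruby
# Hypothetical class: all it needs for array patterns is #deconstruct.
class Coordinate
  def initialize(x, y)
    @x = x
    @y = y
  end

  # Array patterns are matched against whatever this returns.
  def deconstruct
    [@x, @y]
  end
end

def describe_coordinate(object)
  case object
  in Coordinate(Integer, Integer) # class check first, then [Integer, Integer] against deconstruct
    "integer coordinate"
  in Coordinate(Float, Float)
    "float coordinate"
  in Coordinate                   # class check only
    "some other coordinate"
  else
    "not a coordinate"
  end
end

puts describe_coordinate(Coordinate.new(1, 2))     # integer coordinate
puts describe_coordinate(Coordinate.new(1.5, 2.5)) # float coordinate
puts describe_coordinate(Coordinate.new("a", 1))   # some other coordinate
puts describe_coordinate([1, 2])                   # not a coordinate
```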
To aid the mental model I want to define a fictional class, PatternMatching::Pattern.
N.B. I'm unaware of the actual details of the pattern matching internals.
This is a fictional class I use as a mental model, a product of my own tinkering, failures and successes.
I picture the process like this:
PatternMatching::Pattern.new(
  klass: Thing,
  internal_pattern: [PatternMatching::Pattern.new(klass: Integer, internal_pattern: nil)]
)
Then each PatternMatching::Pattern checks whether the object's class matches the klass, then defers to internal_pattern.match?(object.deconstruct)
until we reach the end of the tree.
We could also see this as nested checks and case statements.
E.g. this is equivalent to
my_thing.is_a?(Thing) && case my_thing.deconstruct
in [Integer]
  puts "my_thing holds an integer"
end
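To make the mental model concrete, here is a rough sketch of the fictional class. To be loud about it: `PatternMatching::Pattern` does not exist in Ruby, `match?` is invented for this article's mental model, and the `Wrapper` struct is a stand-in defined only for this example. The sketch covers nothing more than the class check and the positional `deconstruct` check described above.

```ruby
# Purely illustrative: Ruby has no PatternMatching::Pattern class.
module PatternMatching
  class Pattern
    def initialize(klass:, internal_pattern: nil)
      @klass = klass
      @internal_pattern = internal_pattern
    end

    def match?(object)
      return false unless @klass === object        # the class check
      return true if @internal_pattern.nil?        # nothing nested, we are done
      return false unless object.respond_to?(:deconstruct)

      parts = object.deconstruct
      # Without a splat, an array pattern must match every element, one to one.
      parts.length == @internal_pattern.length &&
        @internal_pattern.zip(parts).all? { |pattern, part| pattern.match?(part) }
    end
  end
end

Wrapper = Struct.new(:value) # hypothetical stand-in; Struct defines #deconstruct

pattern = PatternMatching::Pattern.new(
  klass: Wrapper,
  internal_pattern: [PatternMatching::Pattern.new(klass: Integer)]
)

puts pattern.match?(Wrapper.new(1))     # true
puts pattern.match?(Wrapper.new("one")) # false
puts pattern.match?(1)                  # false
```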
I have discovered no limit to the nesting:
OtherThing = Struct.new(:value)
other_thing_1 = OtherThing.new
other_thing_2 = OtherThing.new(other_thing_1)
other_thing_1.value = other_thing_2
case other_thing_1
in OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing]]]]]]]
  puts "at each stage it compares the internal pattern to `object.deconstruct`"
end
I feel like I can hear your next thoughts, something along the lines of:
"But that has all the disadvantages of positional args. With complex state this will be a pain to manage,
there must be a better way!"
deconstruct_keys to the rescue.
Deconstruct keys
deconstruct_keys can be defined on any object and, as with deconstruct, it is already defined on Struct and Data.
This method receives an array of keys and responds with a hash, matching those keys to values.
if my_thing.deconstruct_keys([:thing]) == {thing: 1}
  puts "we can specify the kwargs we care about and discard the rest"
end
This is the second part of our puzzling syntax: we are able to specify the class of the value for each key! Strategy patterns on complex objects, validation of user input, flexing on people holding onto Ruby < 2.7: this is the next game-changing power-up that pattern matching affords us.
case my_thing
in Thing[thing: Integer]
  puts "this will check the class of my_thing matches Thing"
  puts "and my_thing.deconstruct_keys([:thing])[:thing] matches Integer"
end
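To watch `deconstruct_keys` being called, here is a small sketch with a hand-written class. `Reading` and its `requested_keys` accessor are made up for this illustration; the accessor just records the argument the pattern passes in.

```ruby
# Hypothetical class that records which keys a hash pattern asks for.
class Reading
  attr_reader :requested_keys

  def initialize(degrees, unit)
    @degrees = degrees
    @unit = unit
  end

  def deconstruct_keys(keys)
    @requested_keys = keys # the keys named in the pattern
    { degrees: @degrees, unit: @unit }
  end
end

reading = Reading.new(21, :celsius)

result =
  case reading
  in {unit: :celsius} # only :unit is mentioned, :degrees is ignored
    "metric"
  else
    "unknown"
  end

puts result              # metric
p reading.requested_keys # [:unit]
```

Returning extra keys from `deconstruct_keys` is harmless; the pattern only looks at the ones it names.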
To this end, in a similar vein to deconstruct, we can compare deeply nested structures.
N.B. This is not an actual suggested way to represent data, just a toy example.
It's also not that deep.
json_stuff = {
  person: {
    name: "Michael Keene",
    birthday: {
      year: 1992,
      month: 4,
      day: 19
    }
  }
}
case json_stuff
in {person: {birthday: {year: year_of_birth}}}
  # we apparently only care about this guy's birthday
  puts "this person was born in #{year_of_birth}"
  # this assignment by the pattern is cool
  # but how do we compare to an existing and defined value?
end
Comparing to specific values - Caret pins
Comparing the structure of classes and the keys from deconstruct_keys is fine and dandy, and together with the ability to assign some of the constituent parts to variable names (assuming a match) it already affords us some strong possibilities. However, an obvious question soon bubbles up: can I compare to existing variables? These values could come from an external yml configuration or only be set at runtime, so there would be no way to hard-code them. Are we doomed to compare these things with only < 2.7 syntax?
Thankfully that is not the case :)
N.B. This doesn't connect much to the initial strange syntax, other than being a natural extension of what we have learned so far.
A naive attempt would be to just use the same variable in the same place:
nonesense = "ASdansdflnlnn"
cool_guy_name = "Michael Keene"
case json_stuff
in {person: {name: nonesense}}
  puts "The person's name has now been assigned to the variable `nonesense`"
  puts "this is because we need to pin the value with a caret ^"
end
But much like regular reassignment
var = "foo"
var = "bar"
the pattern will assign the value for the key to the variable name, regardless of what it was.
So let's see the difference if we try once more, but with a caret:
nonesense = "ASdansdflnlnn" # reset value of nonesense
case json_stuff
in {person: {name: ^nonesense}}
  raise "that name is too stupid"
in {person: {name: ^cool_guy_name}}
  puts "what a cool guy: #{cool_guy_name}"
end
This correctly identifies me as a cool guy, as the caret before the variable name indicates that the pattern should compare against the value of the already defined variable. Very nice stuff indeed.
So now we see that our mental model, PatternMatching::Pattern, has a series of different checks it can do, as well as variable assignment. Checking the docs for accepted patterns, we see the list includes:
is the object an instance of klass (klass === object)?
does object.deconstruct match the positional internal pattern?
does object.deconstruct_keys match the kwarg internal pattern?
does the pinned value match the object (pinned_value === object)?
We can also use | to offer alternative patterns for the same case branch.
There are some extra details with ranges that I won't cover in depth: literal ranges such as 1..10 can be written directly, but a range built from variables has to be bracketed and pinned, ^(min..max), to be understood by the PatternMatching::Pattern builder (this needs Ruby 3.1 or later).
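A short sketch of those last few checks together (`classify` and its symbols are invented for illustration). It sticks to syntax available since 2.7: alternatives with `|`, a literal range, and a pinned local variable that happens to hold a range, which is compared with `===`.

```ruby
# `allowed` is expected to hold a Range; pinning it compares with `allowed === value`.
def classify(value, allowed)
  case value
  in nil | ""  # alternatives: either pattern may match
    :blank
  in 0..9      # a literal range can be written directly
    :single_digit
  in ^allowed  # pinned variable, not a new binding
    :allowed
  else
    :other
  end
end

p classify(nil, 50..60) # :blank
p classify(7, 50..60)   # :single_digit
p classify(55, 50..60)  # :allowed
p classify(100, 50..60) # :other
```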
There is still a hole in the checks we can do.
Going back up to this example:
case json_stuff
in {person: {birthday: {year: year_of_birth}}}
  # we apparently only care about this guy's birthday
  puts "this person was born in #{year_of_birth}"
  # this assignment by the pattern is cool
  # but how do we compare to an existing and defined value?
end
We know we can assign values to variables, and we know we can check the class of a value, but how do we both assign to a variable (in this case year_of_birth) and check its class? We could use something like
in {person: {birthday: {year: Integer}}}
and then dig(:person, :birthday, :year) but that feels inefficient, and in the clean clear world of pattern matching that just won't do.
No! There must be a better way. If you look back to the article intro, there is.
case json_stuff
in {person: {birthday: {year: Integer => year_of_birth}}} # this is valid Ruby!
  puts "this person was born in #{year_of_birth} and it's an Integer!"
end
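The same `Pattern => name` binding can appear at several places in one pattern. A minimal sketch, with `describe_person` as a made-up helper, that falls through to a looser branch when the year is not an Integer:

```ruby
def describe_person(payload)
  case payload
  in {person: {name: String => name, birthday: {year: Integer => year_of_birth}}}
    "#{name} was born in #{year_of_birth}"
  in {person: {name: String => name}}
    "#{name} has no usable birth year"
  else
    "not a person"
  end
end

puts describe_person({person: {name: "Michael Keene", birthday: {year: 1992}}})
# Michael Keene was born in 1992
puts describe_person({person: {name: "Michael Keene", birthday: {year: "1992"}}})
# Michael Keene has no usable birth year
puts describe_person({})
# not a person
```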
Well gang, we got there! To me this looks strange, as hash rockets typically go {key => value} and are no longer required with symbol keys, {key: value}. Having both feels peculiar, {key: Value => other_value}, like a three-layered fraction. It doesn't yet sit right with me, but I'm sure that will ease with time. Inside a pattern, though, => is not a hash rocket at all: it binds whatever matched the pattern on its left to the variable on its right.
To me this really solidifies the fact that the Ruby parser does something very different after the in keyword.
In our new shining world of pattern matching and hash value omission, I almost forgot to consider the other place
where values are assigned to variables based on the key that they have: method definitions!
Yes friends, I can see a world where one day this syntax could be used in a similar fashion with method args and kwargs,
to offer clear, explicit type requirements in method definitions.
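Until that day, something close can be approximated inside the method body. This is only a sketch of one possible workaround, not a language feature, and `register` is a made-up method: it rebuilds the kwargs into a hash and matches it against a hash pattern.

```ruby
# Workaround sketch: check kwarg types with a hash pattern in the method body.
def register(name:, year_of_birth:)
  case {name: name, year_of_birth: year_of_birth}
  in {name: String, year_of_birth: Integer}
    "registered #{name} (#{year_of_birth})"
  else
    raise ArgumentError, "expected name: String and year_of_birth: Integer"
  end
end

puts register(name: "Michael Keene", year_of_birth: 1992)
# registered Michael Keene (1992)

begin
  register(name: "Michael Keene", year_of_birth: "1992")
rescue ArgumentError => e
  puts e.message
  # expected name: String and year_of_birth: Integer
end
```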