Michael Keene

Posted on Jul 30, 2024

The Strangest Ruby Syntax: PatternMatching

#ruby #rails #programming

Ruby has supported pattern matching objects since version 2.7, and with it introduced some of the strangest Ruby syntax.

Through this article I aim to start small and build to explain how

 {year: Integer => year_of_birth}

is valid Ruby (in pattern matching).

This article, with some bonus content and all the code examples, is available in a runnable gist. Feel free to play with them after reading.

Intro to pattern matching

Pattern matching allows us to compare objects by their class, and by a lot more besides.

A classic case object; when Condition compares using a simple Condition === object. For a Proc, === ends up the same as .call, so we can pass lambdas or symbols converted to procs, which helps us switch on more complex logic.
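As a minimal sketch of that === behaviour (describe_value and its branches are made up for illustration, they are not from the gist), a when branch can take a class, a lambda, or a symbol converted to a proc:

```ruby
# Hypothetical helper showing the three kinds of `when` condition.
def describe_value(value)
  case value
  when Integer                                      # Integer === value
    "integer"
  when ->(v) { v.respond_to?(:empty?) && v.empty? } # lambda === value calls the lambda
    "empty"
  when :nil?.to_proc                                # calls value.nil?
    "nil"
  else
    "something else"
  end
end

puts describe_value(1)   # integer
puts describe_value("")  # empty
puts describe_value(nil) # nil
puts describe_value(:x)  # something else
```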

Using case object; in Pattern is very different.
To look at how the pattern definitions have changed case statements I'll take you on the same journey I went on, and in the process, explain that strange syntax from the first paragraph.

Thinking of patterns as a different kind of object from normal Ruby code
helps us to reason about them correctly.

After the in keyword the Ruby parser acts differently, and we should think about it differently. My mental model is that it creates some sort of object, let's make up a PatternMatching::Pattern instance, and then it checks pattern_matching_pattern.match?(object_at_start_of_case_statement).
I have a picture of how this (fictional) .match? method works, which so far seems to match reality.

A familiar pattern

Using in, we can do essentially identical comparisons of class:

value = 1

case value
in Integer # comparable to value.is_a?(Integer)
  puts "value is an integer"
else
  raise "failed to check class properly?!"
end

# in this case `in` is very similar to `when`

case value
when Integer # Integer === value
  puts "value is _still_ an integer"
else
  raise "failed to check class properly?!"
end

But we can also match on the structure of an object and its constituent parts:

array = [1, 2, 3]

case array
in [Integer, Integer, Integer]
  puts "there are three integers in the array"
else
  raise "what the heck is in there?!"
end

# we can even *collect some of the contents in the same way as method *args and **kwargs.

case array
in [Integer, *tail]
  puts "the array starts with an Integer"
  puts "the rest of the array is #{tail}"
else
  raise "what the heck is in there?!"
end

This is the first variable assignment done by a pattern, and this in itself feels quite powerful. It is also the first part of the puzzle for the strange syntax in the opening paragraph.
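To give a feel for why that assignment is useful, here is a small sketch (sum_list is a hypothetical helper of mine, not part of the gist) that splits an array into its first element and the rest, then recurses on the rest:

```ruby
# Hypothetical helper: sums an array by repeatedly peeling off the head.
def sum_list(list)
  case list
  in []             # an empty array matches only the empty array pattern
    0
  in [head, *tail]  # head is bound to the first element, tail to the rest
    head + sum_list(tail)
  end
end

puts sum_list([1, 2, 3]) # 6
```

The names head and tail only exist because the pattern matched, which is the same mechanism the final syntax in this article relies on.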

Going deeper

There are two methods that help with pattern matching on the internal state of objects,
deconstruct and deconstruct_keys,
along with two ways of declaring the pattern of an object:
you can use an array pattern or a hash pattern.

Thing = Data.define(:thing) # Data (Ruby 3.2+) already defines the methods we need

my_thing = Thing.new(1)

case my_thing
in Thing
  puts "my_thing is a Thing instance"
end

Here the pattern is just the class Thing. Matching works the same as above: it will just check that my_thing is an instance of Thing,
something like:
return false unless my_thing.is_a? Thing
true

now for some fun

Deconstruct

case my_thing
in Thing(Integer)
  puts "my_thing holds an integer"
end

To check this pattern it will work from the outside in,
recursively checking that the class of the object is correct,
then checking that object.deconstruct matches the internal pattern.

To aid in the mental model I want to define a fictional class, PatternMatching::Pattern.

N.B. I'm unaware of the actual details of the pattern matching internals.
This is a fictional class I use as a mental model, a product of my own tinkering, failures and successes.

I picture the process like this:

PatternMatching::Pattern.new(
  klass: Thing,
  internal_pattern: [PatternMatching::Pattern.new(klass: Integer, internal_pattern: nil)]
)

Then each PatternMatching::Pattern checks whether the object's class matches the klass, then defers to internal_pattern.match?(object.deconstruct) until we reach the end of the tree.
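To make that picture concrete, here is a toy version of the fictional class. Everything in it is an assumption of mine: PatternMatching::Pattern does not exist in Ruby, and Wrapper is a hypothetical Struct used so the sketch runs on its own. It only models class checks and positional nesting.

```ruby
module PatternMatching
  # Fictional class: a toy model of the mental picture, not how Ruby works internally.
  class Pattern
    def initialize(klass:, internal_pattern: nil)
      @klass = klass
      @internal_pattern = internal_pattern # nil, or an array of nested Patterns
    end

    def match?(object)
      return false unless @klass === object       # check the class first
      return true if @internal_pattern.nil?       # nothing nested, so we are done

      parts = object.deconstruct
      return false unless parts.size == @internal_pattern.size

      # defer to each nested pattern, position by position
      @internal_pattern.zip(parts).all? { |pattern, part| pattern.match?(part) }
    end
  end
end

# Wrapper is a hypothetical Struct; Struct defines deconstruct for us.
Wrapper = Struct.new(:value)

holds_integer = PatternMatching::Pattern.new(
  klass: Wrapper,
  internal_pattern: [PatternMatching::Pattern.new(klass: Integer)]
)

puts holds_integer.match?(Wrapper.new(1))     # true
puts holds_integer.match?(Wrapper.new("one")) # false
```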

We could also see this as nested checks and case statements.
For example, this is equivalent to:

my_thing.is_a?(Thing) && case my_thing.deconstruct
                         in [Integer]
                           puts "my_thing holds an integer"
                         end
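Data and Struct get deconstruct for free, but any class can opt in by defining it. The sketch below assumes a made-up Point class and describe_point helper, neither of which is from the gist:

```ruby
# Hypothetical plain class that opts in to array patterns.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Array patterns are compared against whatever this returns.
  def deconstruct
    [x, y]
  end
end

def describe_point(point)
  case point
  in Point(0, 0)
    "origin"
  in Point(Integer, 0)
    "on the x axis"
  in Point(x, y)
    "at #{x}, #{y}"
  else
    "not a point"
  end
end

puts describe_point(Point.new(0, 0)) # origin
puts describe_point(Point.new(3, 0)) # on the x axis
puts describe_point(Point.new(1, 2)) # at 1, 2
puts describe_point(5)               # not a point
```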

I have discovered no limit to the nesting:

OtherThing = Struct.new(:value)

other_thing_1 = OtherThing.new
other_thing_2 = OtherThing.new(other_thing_1)

other_thing_1.value = other_thing_2

case other_thing_1
in OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing[OtherThing]]]]]]]
  puts "at each stage it compares the internal pattern to `object.deconstruct`"
end

I feel like I can hear your next thoughts, something along the lines of:
"But that has all the disadvantages of positional args. With complex state this will be a pain to manage.
There must be a better way!"
deconstruct_keys to the rescue.

Deconstruct keys

deconstruct_keys can be defined on any object and, as with deconstruct, it is already defined on Struct and Data.
This method receives an array of keys (or nil when the pattern wants all of them) and responds with a hash matching those keys to values.

if my_thing.deconstruct_keys([:thing]) == {thing: 1}
  puts "we can specify the kwargs we care about and discard the rest"
end
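As with deconstruct, we can define deconstruct_keys ourselves. This is a sketch under my own assumptions: Temperature and describe_reading are hypothetical names, and slicing the hash by the requested keys is just one reasonable way to honour the keys argument.

```ruby
# Hypothetical class that opts in to hash patterns.
class Temperature
  def initialize(celsius)
    @celsius = celsius
  end

  # keys is the array of keys the pattern asked for, or nil for "everything".
  def deconstruct_keys(keys)
    all = { celsius: @celsius, fahrenheit: @celsius * 9.0 / 5 + 32 }
    keys ? all.slice(*keys) : all
  end
end

def describe_reading(reading)
  case reading
  in Temperature(celsius: 0)
    "freezing point"
  in Temperature(fahrenheit:) # binds the value to a local called fahrenheit
    "#{fahrenheit} degrees Fahrenheit"
  end
end

puts describe_reading(Temperature.new(0))   # freezing point
puts describe_reading(Temperature.new(100)) # 212.0 degrees Fahrenheit
```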

This is the second part of our puzzling syntax: we are able to specify the class of the value for each key! Whether for strategy patterns on complex objects, validation of user input, or just to flex on people holding onto Ruby < 2.7, this is the next game-changing power-up that pattern matching affords us.

case my_thing
in Thing[thing: Integer]
  puts "this will check the class of my_thing matches Thing"
  puts "and my_thing.deconstruct_keys([:thing])[:thing] matches Integer"
end

To this end, in a similar vein to deconstruct, we can compare deeply nested structures.

N.B. This is not a suggested way to represent data, just a toy example.
It is also not that deep.

json_stuff = {
  person: {
    name: "Michael Keene",
    birthday: {
      year: 1992,
      month: 4,
      day: 19
    }
  }
}

case json_stuff
in {person: {birthday: {year: year_of_birth}}}
  # we apparently only care about this guy's birthday
  puts "this person was born in #{year_of_birth}"
  # this assignment by the pattern is cool
  # but how do we compare to an existing and defined value?
end

Comparing to specific values - Caret pins

Comparing the structure of classes and the keys from deconstruct_keys is fine and dandy, and along with the ability to assign some of the constituent parts to variable names (assuming a match)
it already affords us some strong possibilities. However, an obvious question soon bubbles up: can I compare to existing variables? These variables could come from an external yml configuration or only be set at runtime, so there would be no way to hard code them. Are we doomed to compare these things with only < 2.7 syntax?
Thankfully that is not the case :)

N.B. This doesn't connect much to the initial strange syntax, other than it is a natural extension of what we have learned so far.

A naive attempt would be to just use the existing variable in the pattern:

nonesense = "ASdansdflnlnn"
cool_guy_name = "Michael Keene"

case json_stuff
in {person: {name: nonesense}}
  puts "The person's name has now been assigned to the variable `nonesense`"
  puts "this is because we need to pin the value with a caret ^"
end

But much like regular reassignment:

var = "foo"
var = "bar"

The pattern will assign the value for the key to the variable name, regardless of what it held before.

So let's see the difference if we try once more, but with a caret:

nonesense = "ASdansdflnlnn" # reset value of nonesense

case json_stuff
in {person: {name: ^nonesense}}
  raise "that name is too stupid"
in {person: {name: ^cool_guy_name}}
  puts "what a cool guy: #{cool_guy_name}"
end

This correctly identifies me as a cool guy, as the caret before the variable name indicates that the pattern should compare against the value of the already defined variable. Very nice stuff indeed.
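Pinning is most useful when the value arrives at runtime, for example as a method argument. A small sketch, with greet and its messages invented for illustration:

```ruby
# Hypothetical helper: expected_name is only known at runtime.
def greet(payload, expected_name)
  case payload
  in {person: {name: ^expected_name}} # pinned: compare, do not assign
    "welcome back, #{expected_name}"
  in {person: {name:}}                # unpinned: bind whatever name is there
    "sorry #{name}, I was expecting #{expected_name}"
  else
    "no name given"
  end
end

puts greet({person: {name: "Michael Keene"}}, "Michael Keene") # welcome back, Michael Keene
puts greet({person: {name: "Bob"}}, "Michael Keene")           # sorry Bob, I was expecting Michael Keene
```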

So now we see that our mental model, PatternMatching::Pattern, has a series of different checks it can do, as well as variable assignment. Checking the docs for accepted patterns, we see the list includes:

is the object an instance of the klass (klass === object)?
does the object.deconstruct match the positional internal pattern?
does the object.deconstruct_keys match the kwarg internal pattern?
does this pinned value match the object (pinned_value === object)?
we can also use | to split up different patterns for the same case branch

There are some extra things with ranges that I won't cover in detail, but basically a literal range such as 1..10 can be written straight into a pattern, while a range built from an expression has to be bracketed and pinned, as in ^(low..high) (Ruby 3.1+), to be understood by the PatternMatching::Pattern builder.
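A quick sketch of alternatives and ranges together. classify_age and its categories are hypothetical, and it relies on a literal range being usable directly in a pattern while a range held in a variable is pinned with a caret:

```ruby
# Hypothetical helper combining a literal range, a pinned range and `|`.
def classify_age(age, teen_range)
  case age
  in 0..12            # literal range, compared with (0..12) === age
    "child"
  in ^teen_range      # range held in a variable, so it is pinned
    "teenager"
  in Integer | Float  # either class is accepted
    "adult"
  else
    "not a number"
  end
end

puts classify_age(7, 13..19)    # child
puts classify_age(15, 13..19)   # teenager
puts classify_age(40.5, 13..19) # adult
puts classify_age("x", 13..19)  # not a number
```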

There is now a hole in the checks we can do.
Going back up to this example:

case json_stuff
in {person: {birthday: {year: year_of_birth}}}
  # we apparently only care about this guy's birthday
  puts "this person was born in #{year_of_birth}"
  # this assignment by the pattern is cool
  # but how do we compare to an existing and defined value?
end

We know we can assign values to variables, and we know we can check the class of a value, but how do we both assign to a variable (in this case year_of_birth) and check its class? We could use something like
in {person: {birthday: {year: Integer}}}
and then dig(:person, :birthday, :year), but that feels inefficient, and in the clean, clear world of pattern matching that just won't do.

No! There must be a better way. If you look back to the article intro, there is.

case json_stuff
in {person: {birthday: {year: Integer => year_of_birth}}} # this is valid Ruby!
  puts "this person was born in #{year_of_birth} and it's an Integer!"
end
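Wrapped in a method, the same pattern becomes a tiny validator. birth_year is a hypothetical helper; it returns the year only when the structure is present and the value really is an Integer:

```ruby
# Hypothetical helper: extract the year only if it is an Integer.
def birth_year(payload)
  case payload
  in {person: {birthday: {year: Integer => year}}} # check the class, then bind
    year
  else
    nil
  end
end

p birth_year({person: {birthday: {year: 1992}}})   # 1992
p birth_year({person: {birthday: {year: "1992"}}}) # nil
p birth_year({person: {}})                         # nil
```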

Well gang, we got there! To me this looks strange, as hash rockets typically go {key => value} and are no longer required with symbol keys {key: value}. Having both feels peculiar, {key: Value => other_value}, like a three-layered fraction. It doesn't yet sit right with me, but I'm sure that will ease with time.
To me this really solidifies the fact that the Ruby parser
does something very different after the in keyword.

In our new shining world of pattern matching and hash value omission, I almost forgot to consider the other place
where values are assigned to variables based on the key that they have: method definitions!
Yes friends, I can see a world where one day this syntax could be used in a similar fashion with method args and kwargs,
to offer clear, explicit type requirements for method arguments in method definitions.
