
Brandon Weaver


Ruby 3 - Set Literal

NOTE: This change was not implemented in Ruby 3

Next in our series on new features in Ruby 3, we'll be looking at the new Set literal syntax recently discussed on the bug tracker.

Quick Reference

What's it look like?

set = { 1, 2, 3 }
set.include?(3) # => true

Without the literal, the same thing is written with Set[]:

set = Set[1, 2, 3]
set.include?(3) # => true

Details

Now, Set has some interesting use cases, so consider this a quick rundown of Set as well as an introduction to the new syntax.

It should be noted that Set inclusion is O(1) on average, thanks to its hash-based lookup, rather than O(n) as it is for an Array. That means these examples can be much quicker than searching an entire array.
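
A rough way to see the difference is to time repeated lookups. This is a minimal sketch rather than a rigorous benchmark: the `time_it` helper, the collection size, and the `needle` value are all made up for illustration, and the timings will vary by machine.

```ruby
require 'set'

# Hypothetical helper: returns the seconds taken to run the given block
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical data, chosen only for illustration
array  = (1..100_000).to_a
set    = array.to_set
needle = 100_000 # the last element: the worst case for a linear Array scan

array_time = time_it { 100.times { array.include?(needle) } }
set_time   = time_it { 100.times { set.include?(needle) } }

puts format('Array: %.5fs  Set: %.5fs', array_time, set_time)
```

Both lookups return the same answer; only the amount of work needed to find it differs.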

What Can We Use It For?

Triple Equals - ===

=== is real fun, and for Set it's an alias of include? (also available as member?). The core docs use this example:

case :apple
when Set[:potato, :carrot]
  "vegetable"
when Set[:apple, :banana]
  "fruit"
end
# => "fruit"
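
Under the hood, each when clause calls === on its pattern with the value being matched, so the case expression above is roughly equivalent to the explicit calls in this sketch (the `vegetables` and `fruits` names are only for illustration):

```ruby
require 'set'

vegetables = Set[:potato, :carrot]
fruits     = Set[:apple, :banana]

# `when pattern` is evaluated as `pattern === value`
veg_match   = vegetables === :apple   # => false
fruit_match = fruits === :apple       # => true

# For Set, === gives the same answer as include?
same_answer = fruit_match == fruits.include?(:apple) # => true

puts veg_match, fruit_match, same_answer
```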

With the new syntax we can change it to this:

case :apple
when { :potato, :carrot }
  "vegetable"
when { :apple, :banana }
  "fruit"
end
# => "fruit"

Remember that several Enumerable methods in Ruby now leverage === for matching:

text = "The rain in spain falls mainly on the plane"
words = text.split
fillers = Set['The', 'the', 'in', 'on']
# With the proposed syntax this would be:
# fillers = { 'The', 'the', 'in', 'on' }

new_words = words.grep_v(fillers)
# => ["rain", "spain", "falls", "mainly", "plane"]

words.any?(fillers)
# => true

words.slice_after(fillers).to_a
# => [["The"], ["rain", "in"], ["spain", "falls", "mainly", "on"], ["the"], ["plane"]]

That means you get methods like any?, all?, none?, one?, grep, slice_before, slice_after, and grep_v to work with.
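
Here's a runnable sketch of a few more of those methods, using the Set[] constructor available today and the same words and fillers as above (the result variable names are just for illustration):

```ruby
require 'set'

words   = 'The rain in spain falls mainly on the plane'.split
fillers = Set['The', 'the', 'in', 'on']

matches = words.grep(fillers)   # => ["The", "in", "on", "the"]
any     = words.any?(fillers)   # => true
all     = words.all?(fillers)   # => false
none    = words.none?(fillers)  # => false
one     = words.one?(fillers)   # => false (four words match, not exactly one)

# slice_before starts a new slice in front of each matching word
slices = words.slice_before(fillers).to_a
# => [["The", "rain"], ["in", "spain", "falls", "mainly"], ["on"], ["the", "plane"]]

p matches, any, all, none, one, slices
```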

Granted, I have opinions about find taking === patterns instead of an ifnone argument, and about select / reject / filter also accepting patterns, but that's a matter for another issue on the bug tracker, like this one I submitted recently.

Duplicate Prevention

Set cannot have repeated elements, and that can be real useful for inline definition:

def example_method(a, b, c)
  { a, b, c }
end

example_method(1, 1, 1)
# => { 1 }

I do wonder if one can splat inside these:

def example_method(*args)
  { *args }
end

example_method(1, 1, 1, 1, 1)
# => { 1 }
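
For comparison, the same de-duplication already works in Ruby today: Set[] accepts a splatted argument list, and any Enumerable can be converted with to_set. A small sketch (the `unique_of` method name is hypothetical):

```ruby
require 'set'

# Hypothetical helper: collects its arguments into a Set, dropping repeats
def unique_of(*args)
  Set[*args]
end

from_splat = unique_of(1, 1, 1, 1, 1)
from_array = [1, 1, 2, 2, 3].to_set

p from_splat.to_a  # => [1]
p from_array.to_a  # => [1, 2, 3]
```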

...but that might get confusing after a while, no? Especially given the ambiguity with Hashes and keyword splatting:

a = { a: 1 }
b = { **a, b: 2 }

We'll get more into overloaded syntax issues in a bit though.

Compromises and Issues

New Syntax

I like expressive new syntax, but this does add another layer of concern around Hash vs Set by overloading the braces. It could cause the same kind of confusion as the Class.[] constructors, such as Hash[], and how those work.

That said, I like this syntax and think it in general makes sense. It's just a balancing act whenever you add new syntax to not break intuitiveness.

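I do wonder what would happen if we had something like %s{ 1, 2, 3 } instead. While it's two more characters, this does fit into the convention of %w, %i, and others (though %s itself is already taken by the Symbol literal, so another letter would be needed).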
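
For reference, this is what the existing percent literals evaluate to in Ruby today, along with one way to build a Set from them. It's a small sketch; the variable names are only for illustration.

```ruby
require 'set'

words   = %w[a b c]    # => ["a", "b", "c"]
symbols = %i[a b c]    # => [:a, :b, :c]
symbol  = %s{1, 2, 3}  # => :"1, 2, 3" (a single Symbol, not a Set)

# One way to get a Set out of the existing literals today
word_set = %w[a b a].to_set

p words, symbols, symbol, word_set.to_a
```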

Punning

Now the bad part about this syntax is that it closes the door on a very, very powerful feature from JavaScript called object punning (shorthand property names):

const a = 1;
const b = { b: 1, c: 2 };

const c = { a, ...b }
// {a: 1, b: 1, c: 2}

I could see that being extremely useful for Ruby if it were to do something along the lines of this:

def move_north(x:, y:)
  { x, y + 1 }
end

move_north(x: 1, y: 2)
# => { x: 1, y: 3 }

There's probably some way to shim that behavior in, sure, but with the new Set literal that appears to be unlikely. That said I don't disagree with the syntax either, and actually quite like it, so I'm a bit torn on this one.
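
For reference, without punning the hash has to spell out its keys. Here's a sketch of the method above as it can be written in Ruby today. Ruby 3.1 later added a shorthand that omits values whose names match local variables or methods; it appears only in a comment here so the code also runs on older versions.

```ruby
def move_north(x:, y:)
  # Explicit keys; Ruby 3.1+ also accepts `{ x:, y: y + 1 }`
  { x: x, y: y + 1 }
end

result = move_north(x: 1, y: 2)
puts(result == { x: 1, y: 3 }) # prints true
```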

Block Ambiguity

This could provide some fun with Block ambiguity as well. How so? Consider the trick of {} vs do ... end for blocks and what that does to parens:

describe 'something' do
  # This works
end

describe 'something' {
  # This won't
}

Most of these are already inherent issues with Hashes, sure, but they will need to be watched out for with Set as well. I do wonder about methods like any?, which take either a block or a single pattern argument, and how they might respond:

[1, 2, 3].any? { 1, 2 }

There could be some ambiguity here, especially without parens, and even more so in a case like this:

[1, 2, 3].any? { 1 }

Is that a block or a Set? I could see it being interpreted as a block very easily, which may create issues. It may be a fair compromise to just call this one an invalid case.
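
In Ruby as it exists today, braces directly after a method call like this are parsed as a block, so that last example is a block returning 1 for every element. This sketch contrasts the block reading with passing a Set explicitly as a pattern argument (the variable names are only for illustration):

```ruby
require 'set'

numbers = [1, 2, 3]

# Block reading: the block returns the truthy value 1 for every element
block_any = numbers.any? { 1 }  # => true
block_all = numbers.all? { 1 }  # => true

# Pattern reading: each element is tested with Set#===
pattern_any = numbers.any?(Set[1, 2])  # => true
pattern_all = numbers.all?(Set[1, 2])  # => false (3 is not in the Set)

p block_any, block_all, pattern_any, pattern_all
```

The two readings agree for any? here but disagree for all?, which is the sort of quiet difference that makes the ambiguity worth watching.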

Final Thoughts?

I like it. I think it's succinct and it makes sense; I just have a few concerns over the ambiguity issues noted above.

I can see some good uses for a few things I've used day-to-day as far as querying data and de-duping lists of items, but will probably need to write a Set tutorial later to refresh myself on what the rest of them might be and give it a bit more of a pragmatic edge.

Looking forward to what Ruby 3.0 brings, and I'll be writing on new features as I see them.

Notice one that I haven't? Feel free to DM me on Twitter @keystonelemur and I'll take a look into it.

Top comments (4)

Josh • Edited

Plus, the %<char> sigil syntax is a conversion operation on a string literal (and following the percent sign immediately by a paired closure, e.g., %["Let's make s'mores", Cathy said] creates an interpolated string literal where neither the " nor ' characters need to be escaped). It's a clever idea, but it would be a nontrivial amount of reworking to have MRI begin treating the enclosed contents of this syntax as a discrete set of typed data for the case of a Set object.

I might as well also say it here, to avoid double-commenting: The block syntax as you have it written in

[1, 2, 3].any? { 1 }

would return true even if you called #all? with that block, as the expression evaluates to a truthy value (an instance of Numeric), not as a value against which each value should be compared

Brandon Weaver
[1, 2, 3].any? { 1 }

Regarding this, yes, that was the point of that example. It's ambiguous. Is it a set? Is it a block? For me I know it'll probably be called a block but it's an edge case to watch out for.

 
Josh

Apologies for the confusion

rhymes

{ 1, 2 } reminds me of Python :D

>>> {1, 2}
{1, 2}
>>> type(_)
<class 'set'>
>>> {1, 2} | {2, 3}
{1, 2, 3}