Brandon Weaver

Posted on Sep 12, 2021

Ruby 3.1 – Shorthand Hash Syntax – First Impressions

#ruby #rails

It's that time of year again, and with it comes a bundle of new Ruby 3.1 features getting approved and merged ahead of the December release.

This series will be covering a few of the interesting ones I see going across the bug tracker. Have one you think is interesting? Send it my way on Twitter @keystonelemur or comment here.

Shorthand Hash Syntax - First Impressions

Now this, this is something I've wanted for a very long time. Shorthand Hash Syntax, also known as Punning in JavaScript, is an incredibly useful feature that allows you to omit values where the variable name is the same as the key:

a = 1
b = 2

{ a:, b: }
# => { a: 1, b: 2 }

In JavaScript this would look like so:

const a = 1
const b = 2

const obj = { a, b }
// => { a: 1, b: 2 }

So you can see some of the resemblance between the two.

Commits and Tracker

You can find the relevant diff here and the bugtracker issue here.

Exploring the Testcases

Let's take a look into the specs from that diff (slightly abbreviated):

def test_value_omission
  x = 1
  y = 2

  assert_equal({x: 1, y: 2}, {x:, y:})
  assert_equal({one: 1, two: 2}, {one:, two:})
end

private def one = 1
private def two = 2

Now as mentioned before, this allows value omission as specified in the first assertion:

x = 1
y = 2

assert_equal({x: 1, y: 2}, {x:, y:})

This is what I would expect the feature to do, and already opens up a lot of potential which I'll get into in a moment, but there's something else here in the second:

assert_equal({one: 1, two: 2}, {one:, two:})

# ...

private def one = 1
private def two = 2

...it's also working on methods, which opens up a whole new realm of interesting potential. I'll be breaking these into two sections to address what I think the implications of each are.
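To make the two cases concrete, here is a minimal sketch, assuming Ruby 3.1 or later. The `Point` class and its method names are my own, made up for illustration. As far as I can tell `{ x: }` is simply read as `{ x: x }`, so the usual name lookup applies: a local variable wins over a method of the same name.

```ruby
# Requires Ruby 3.1 or later. `Point` is a hypothetical class for illustration.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  # `{ x:, y: }` reads as `{ x: x, y: y }`; with no locals in scope,
  # `x` and `y` resolve to the private readers below.
  def to_h
    { x:, y: }
  end

  # A local variable named `x` shadows the method of the same name.
  def shifted_h(offset)
    x = @x + offset
    { x:, y: }
  end

  private

  attr_reader :x, :y
end

Point.new(1, 2).to_h          # => { x: 1, y: 2 }
Point.new(1, 2).shifted_h(10) # => { x: 11, y: 2 }
```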

Implications of Punning

Let's start with general implications, as I believe those alone are interesting enough to write on.

Pattern Matching

Consider with me pattern matching:

case { x: 1, y: 2 }
# I do still wish `y` was captured here without right-hand assignment
in x:, y: ..10 => y then { x:, y: y + 1 }
# ...
end

This allows us to abbreviate the idea of leaving one value unchanged, and working with another value explicitly. In another case with a JSON response we might want to extract a few fields while changing one:

case response
in code: 200, body: { messages:, people: }
  { messages:, people: people.map { Person.new(_1) } }
in code: 400.., body: { error:, stacktrace: }
  { error:, stacktrace: stacktrace.first(5) }
# ...
end

It allows us to be more succinct in what we're simply extracting and what we're transforming.
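For anyone who wants to run the idea, here is a self-contained sketch, assuming Ruby 3.1 or later. `Person`, `summarize`, and the sample responses are hypothetical, and I've used a closed range for the error codes to keep the pattern simple.

```ruby
# Requires Ruby 3.1 or later. `Person` and `summarize` are hypothetical names.
Person = Struct.new(:name)

def summarize(response)
  case response
  in { code: 200, body: { messages:, people: } }
    # `messages` passes through untouched; `people` is transformed.
    { messages:, people: people.map { |name| Person.new(name) } }
  in { code: 400..599, body: { error:, stacktrace: } }
    # `error` passes through; the stacktrace is trimmed.
    { error:, stacktrace: stacktrace.first(2) }
  end
end

ok = summarize({ code: 200, body: { messages: ["hi"], people: ["Ada"] } })
# => { messages: ["hi"], people: [Person.new("Ada")] }

failed = summarize({ code: 500, body: { error: "boom", stacktrace: %w[a b c] } })
# => { error: "boom", stacktrace: ["a", "b"] }
```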

Note: I have not tried combining this with regular Hash syntax, but I assume this will work given the parse.y code. More experimentation needed.
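As a quick check of that assumption, here is a small sketch, assuming Ruby 3.1 or later, that mixes shorthand keys with explicit `key: value` pairs and a hash-rocket String key in one literal. The variable names are made up for illustration.

```ruby
# Requires Ruby 3.1 or later. Names are hypothetical.
name = "widget"
count = 3

# Shorthand, explicit value, and hash-rocket String key in one literal.
mixed = { name:, count: count * 2, "source" => "inventory" }
# => { name: "widget", count: 6, "source" => "inventory" }
```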

Keyword Arguments

Keyword arguments in general are a huge value-add to Ruby for readability and error checking:

# (A tinge contrived, yes)
def json_logger(level:, message:, limit: nil)
  new_message = limit ? message.first(limit) : message
  Logger.log({ level: level, message: new_message }.to_json)
end

With one argument that's not too bad, but the biggest annoyance of keyword arguments is constantly doing this:

some_method(a: a, b: b, c: c, d: d + 4)

It feels repetitive and doesn't add extra value. Punning in JavaScript elides that repetition for arguments that are forwarded without transformation, and Ruby's shorthand hash syntax does the same. We get the benefits of keyword arguments without all of the extra code:

some_method(a:, b:, c:, d: d + 4)

Now we can quickly see only d is being modified, allowing us to more clearly see the intent of the code, while also getting all of the benefits of keyword arguments around name checks, value checks, and more easily understood methods.
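Here is a runnable sketch of that forwarding, assuming Ruby 3.1 or later. The methods `area` and `scaled_area` are hypothetical and exist only to show the call site.

```ruby
# Requires Ruby 3.1 or later. `area` and `scaled_area` are hypothetical.
def area(width:, height:, scale: 1)
  width * height * scale
end

def scaled_area(width, height, scale)
  # Same as area(width: width, height: height, scale: scale + 1);
  # only `scale` is transformed on the way through.
  area(width:, height:, scale: scale + 1)
end

scaled_area(2, 3, 1) # => 12
```

The name checks still apply: forwarding a keyword that `area` does not accept raises an ArgumentError, just as it would with the long form.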

I see this being the highest value add.

Implications of Including Methods

This one is a bit more unusual, but I like the idea.

Configuration Hashes

There are a lot of cases where I have to assemble larger hashes of configuration. By putting parts of it into methods I've made it easier to manage:

def configuration
  { a:, b:, c: }
end

private def a = {}
private def b = {}
private def c = {}

Those could be things like, say, logger configs, AWS keys, or library configuration, and very interestingly (and I don't know if this would work) optional configuration cascades:

def configuration
  {
    **({ logger:, stack_limit:, tracing: } if logging.enabled?),
    **({ shards:, rw:, clusters: } if db.sharded?),
    # ...
  }
end

Of course those are theoretical and I would need to spend substantial time with this feature after nightly builds are out to be sure on this, but if it does work it opens the door for some very interesting cascading configuration styles in the future.
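One caveat I'm fairly sure of: on Ruby 3.1, double-splatting `nil` raises a TypeError, so the `if` form would fail whenever a condition is false. A minimal sketch of the same cascade with an explicit empty-hash fallback, assuming Ruby 3.1 or later, might look like this. The method signature and all values are hypothetical.

```ruby
# Requires Ruby 3.1 or later. All names and values are hypothetical.
def configuration(logging_enabled:, sharded:)
  logger = "stdout"
  stack_limit = 20
  shards = 4

  {
    # Fall back to an empty hash so the splat never receives nil.
    **(logging_enabled ? { logger:, stack_limit: } : {}),
    **(sharded ? { shards: } : {}),
  }
end

configuration(logging_enabled: true, sharded: false)
# => { logger: "stdout", stack_limit: 20 }

configuration(logging_enabled: true, sharded: true)
# => { logger: "stdout", stack_limit: 20, shards: 4 }
```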

Wrap Up

This is a first impressions article on the shorthand hash syntax, and I expect as I have time to play with it I'll come up with some new ideas. Until then, I'll be watching the bug tracker for fun new features coming up soon.

Top comments (12)

tantle • Sep 12 '21

So, what is the symbol syntax in Ruby 3.1 now? Do they start with a colon, or do they end with a colon?

Brandon Weaver • Sep 12 '21

Same as it has been. This plays off the JSON style of Hash keys.

tantle • Sep 13 '21

I've been using Ruby for 15 years and it is by far my favorite programming language! As I introduce friends and colleagues to Ruby, the two things that seem to constantly trip them up are the symbol syntax vis-a-vis keyword arguments, and hash keys being either symbols or strings.

In my view, it's deeply unfortunate that as of Ruby version 1.9 there are two different ways of declaring symbols that can be quite confusing to people just coming to the language. This is compounded by the fact that keyword arguments share the same syntax, but aren't really symbols per se.

For example, the new hash syntax seems innocuous and both of these examples are equivalent:

hash = { :foo => 'bar', :bar => :bar }      # => {:foo=>"bar", :bar=>:bar }
other = { foo: 'bar', bar: :bar }           # => {:foo=>"bar", :bar=>:bar }

However, when looking at the result, there is no way to determine which syntax was used when the hash was declared. This impacts providing guidance in error messages, because we may use the new syntax in the message, when the user wrote the code using the old syntax.

The new hash literal declaration syntax can only be used when keys are symbols. For example, would we expect the bug in the code below to be obvious to language newcomers?

obj = Object.new                                        # => #<Object:0x00007f8c4f9d1ad8>
hash = { :foo => 'bar', :bar => :bar, 'obj' => obj }    # => {:foo=>"bar", :bar=>:bar, "obj"=>#<Object:0x00007f8c4f9d1ad8>}
other = { foo: 'bar', bar: :bar, 'obj': obj }           # => {:foo=>"bar", :bar=>:bar, :obj=>#<Object:0x00007f8c4f9d1ad8>}

The ability to coerce a string into a symbol already exists in Ruby, but it uses the original "legacy syntax", which I believe makes the intention much more clear.

string = 'hello'    # => "hello"
symbol = :'hello'   # => :hello

Furthermore, both the old and new syntax may be freely combined when declaring hash literals, which can lead to some odd-looking declarations. For example, in order to fix the bug in the earlier example, should we write the following?

other = { foo: 'bar', bar: :bar, 'obj' => obj }         # => {:foo=>"bar", :bar=>:bar, "obj"=>#<Object:0x00007f8c4f9d1ad8>}

When evaluating the validity of the expression above, one must bring several different syntaxes to mind.

The alternative syntax may only be used safely when all hash keys are symbols, and in all other cases the original syntax must be used.

The new syntax effectively doubles the search space when scanning code for references to a given hash key that is a symbol because it might be declared in one of two different forms (leading colon or trailing colon).

The ActiveSupport::HashWithIndifferentAccess class exists in part to deal with some of the shortcomings of dealing with Ruby hashes. I really wish that we didn't need such a class and that Ruby would address these issues in a fundamental manner that would help make the language more approachable to newcomers.

Brandon Weaver • Sep 13 '21

I mean I agree with you on the general premise that Symbol and String intermingling is confusing, but there's zero chance that gets changed due to the way Ruby works, and I've resigned myself to that. I've also had that argument several times, but originally in Ruby Symbols and Strings were much more different in terms of GC, memory, and identity. Because of that past it's impossible to change.

As far as this being the straw that broke the camel's back? I would disagree. Keyword arguments have done this for a long time, the only difference now is that they can be used on "write" (creating a hash or calling a kwarg function) rather than "read" (kwarg function argument definition).

Victor Goff • Sep 16 '21

Even further than that, it is not only symbol and string as keys, but anything that responds to hash can be used as a key.

Omri Gabay • Sep 12 '21

That is SUCH a tiny diff for a big change, wow

Brandon Weaver • Sep 13 '21

I may add more tests for edge cases, I can see some potential for issues later without a more formal spec.

Morten Grum • Sep 13 '21

Cool. Including methods will open for a new (and simpler) way to create polymorphism.

Brandon Weaver • Sep 13 '21

Oh? Have a few examples of ideas there?

Aivils Štoss • Sep 20 '21

This creates copies of the object. The pointer is lost and at the smallest change the code must be changed 101 times. At least it's under javascript. I do not see any progress.

Brandon Weaver • Sep 20 '21

What are you referring to?

Aivils Štoss • Sep 21 '21

Old time javascript "foo = {}; foo.bar = 123;" . When passing foo as an argument to a function, it was passed as a pointer. In the old days javascript to create an object copy was cumbersome "foo_copy = JSON.stringify (foo);". By modifying the code accordingly, it was quite safe to rely on adding the foo attribute, which will be available in the called functions. Modern javascript makes it very easy to create a copy of an object "const {not_my_variable1, not_my_variable2} = foo;". But when I changed the code I wrote "foo.bar = 123;" and in modern javascript I have to check every call to a function to see if a copy of the object has been created and if there I also have to type an extra "bar" variable so that it can be passed to the next function.
This is an unwanted side effect for the syntax sugar.
