DEV Community

2colours

Blessing the diamond problem - Raku's Allomorphs are a Bad Idea

In the recent Rakudo Weekly post, there was an illustration of the Allomorph type as an educational tangent. This is the perfect inspiration to write a full post about the topic. TL;DR: Allomorphs are a net negative feature that could be easily and predictably removed from the language (given the intent, of course).

This topic has two advantages:

  • Raku knowledge is enough to understand it; in fact, knowledge of another class-based scripting language might also be enough
  • it's a plain high-level feature that could be implemented in user space so we won't be dealing with odd runtime considerations, just type theory and user-facing implications

The article states Allomorph classes should be considered a common raku scenario. I agree with this statement. Let's look into the reasons and consequences!

Why Allomorphs exist

This question is half sincere and half rhetorical.

Sincere question - a hypothesis

First, let me recommend reading about the expression problem; my rationalization is related to this broader topic.

  1. Perl, as the ancestor of Raku, is strong on having dedicated operations for different types: as an example, it has . for string concatenation and + for numeric addition. This approach has certain merits: it's a fairly predictable system that works well with unstructured data.
  2. Raku took this a step further: it encourages you to "bring your own operations" (to existing types), as exemplified by the ability to define new operators, and coercive type constraints. This allows for a fairly extensive (and extensible!) system of coercive operations (if you know another language that has this, please inform me).
  3. Raku is, however, also a class-based object oriented language and this framework doesn't work well with coercions - you cannot just call a method on an unrelated object and have it coerced to your class because the method won't be available for starters... a class has to proactively plan for supporting unrelated types to be able to coerce into them.
  4. Raku does take on this OO challenge by having very fat base classes (Mu, Any, Cool) - in particular, Any covers most of the interface of List and Cool covers[1] most of Numeric and Str.
  5. Now, I think Allomorphs were envisioned to take these "object-oriented coercions" a step further, by using actual inheritance which means full coverage of the interfaces, utilizing inherited behavior and passing type checks.
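
To make points 2 and 4 a bit more tangible, here is a minimal sketch. The avg operator is made up for this illustration (it is not a built-in); Int() is the coercive constraint form mentioned above.

```raku
# Hypothetical user-defined operator: "avg" is invented for this sketch.
# Int() is a coercive constraint: the argument is converted to Int
# (via its .Int method) before the body runs.
sub infix:<avg>(Int() $a, Int() $b) { ($a + $b) / 2 }

say "10" avg 20;   # 15

# The fat Cool base class lets numbers answer string methods and vice versa.
say 1234.chars;    # 4
say "16".sqrt;     # 4
```

Note that neither "10" nor 1234 had to be a subtype of anything else for this to work: the coercion happens at the call boundary or inside the Cool method.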

Rhetorical question (you could answer regardless!)

... now, having said that, I don't know of a real-life use case of Allomorphs, other than being pretty much forced to deal with them ("common raku scenario"). If you have such examples, that would help estimate the costs and benefits. In my understanding, a string passing e.g. an Int type constraint is unnecessary when you could have a coercive constraint Int() to allow for any sort of type to be coerced into Int, or Int(Str) to specifically allow this particular coercion.
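
As a small sketch of that alternative (describe is a hypothetical helper, not something from the Rakudo Weekly post), a coercive constraint turns a plain string into a plain Int at the signature, with no dual-natured value involved:

```raku
# describe is a made-up helper for this sketch.
# Int(Str) accepts an Int as-is, or a Str that it first coerces via .Int.
sub describe(Int(Str) $n) { "{$n.^name}: $n" }

say describe("21");   # Int: 21
say describe(21);     # Int: 21
```

Inside the sub, $n is an ordinary Int either way, so all the usual Int invariants hold.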

I'm confident that decent alternatives could be suggested to all existing uses of Allomorphs but the only real way to know it is to see the use cases. I for one basically only bumped into them when they caused problems for me (spoiler: it was a Bool coercion ("empty check") on a "string" obtained from prompt).

"Red flags" of the Allomorph class

These are properties of the Allomorph class that are more symptoms of an underlying issue than being issues on their own.

First, it's questionable why something of the name "allomorph" is tied to Str by inheriting from it - if these multi-faceted types are useful, why make one of the facets mandatorily a string?
Second, if we look into the Rakudo implementation of the class, we can observe that it prepares to delegate a whole bunch of calls to Numeric - so the hidden assumption is that the other type has to be some sort of Numeric.
Third, despite inheriting from Str, this type has to bring its own Str coercion method - actually, Str($allomorph) and $allomorph.Str don't even behave the same way. This inconsistency might be an issue in itself but it's not the fault of the hard-wired Str() coercion: indeed, why should something that "is a Str" return anything but itself when coerced into a Str? That odd feeling is probably because of...
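
A quick way to poke at these properties yourself is sketched below. It builds the value with IntStr.new so that no literal syntax gets in the way; I deliberately leave the result of the last line for you to compare, since that is the inconsistency in question.

```raku
# The same kind of value that <42> or val("42") would give you.
my $a = IntStr.new(42, "42");

say $a ~~ Str;       # True
say $a ~~ Int;       # True
say $a.^name;        # IntStr
say $a.Str.^name;    # Str
say Str($a).^name;   # compare this with the line above
```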

The Liskov Substitution Principle

The L in the SOLID principles, the Liskov Substitution Principle (LSP) assigns semantics to subtyping, stating that a subtype of type T must be usable instead of T without breaking the program. We will see that Allomorphs break the LSP inevitably.

In a class-based type system, inheritance is a means of subtyping one can describe with the is-a relation. The LSP gives semantics to this relation: if a Horse is an Animal, then a Horse has to behave in our model exactly as an Animal would, fulfilling all invariants.

The Rakudo Weekly post highlighted an obstacle and provided an explanation to it. I think it misses the point and it shows no awareness of the LSP. (It's worth clarifying that something "that inherits from Int" is an Int.)
Posting the code example here:

class Animal {} 
class Horse is Animal {}
class Human is Animal {}
sub cantread(Animal $a) { "{$a.^name}s can't read" }
my $h = Human.new;
say cantread $h; #Humans can't read

It's not clear whether this is meant to be a good example, and if it isn't, what it proves or illustrates. This code example is perfectly consistent with the LSP - we might find it subjectively odd that "humans can't read" but if it's an invariant of animals that they can't read, then our choice is to either a) not model humans as animals, or b) accept that by extension, humans also cannot read. This is a modeling question: the two commonsensical resolutions seem to be a) discard "animals cannot read" as it isn't an invariant, or b) don't model humans as animals.
Anyway, the code is easy to reason about and with a more practical interface (e.g. a can-read method or subroutine), one could even sensibly define results for different subtypes.
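
One possible shape of that "more practical interface", as a sketch only: the can-read method and the describe sub are my own names, and this takes resolution a) from above, i.e. "cannot read" is a default rather than an invariant of Animal.

```raku
class Animal {
    # A default, not an invariant: subtypes are allowed to answer differently.
    method can-read(--> Bool) { False }
}
class Horse is Animal {}
class Human is Animal {
    method can-read(--> Bool) { True }
}

sub describe(Animal $a) {
    my $verb = $a.can-read ?? "can" !! "can't";
    "{$a.^name}s $verb read"
}

say describe(Horse.new);   # Horses can't read
say describe(Human.new);   # Humans can read
```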

The problem with Allomorphs isn't the mere fact of the subtyping but, as we will see, the subtype of Int that we get (IntStr) is indeed not quite an Int as per the LSP - and even more notoriously, it "is a" Str but doesn't behave like a Str.

The diamond problem and its incompatibility with the LSP

There are many variants of the diamond problem even within OO - for our case, this is the important part:

If there is a method in A that B and C have overridden, and D does not override it, then which version of the method does D inherit: that of B, or that of C?

Raku is not unique in the regard that it allows users to "invent" this problem for themselves and uses the C3 algorithm to resolve these conflicts - Python does the same (Perl as well). However, Raku specifically encourages this problem by introducing it to the core language in a somewhat custom way, bringing utilities to create values like this.
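
For reference, this is what the user-made version of the diamond looks like in Raku. The class names follow the quoted description; the hello method is just a placeholder.

```raku
class A         { method hello { "A" } }
class B is A    { method hello { "B" } }
class C is A    { method hello { "C" } }
class D is B is C {}   # D does not override hello

# C3 linearization puts B before C, so B's version wins.
say D.new.hello;   # B
say D.^mro;        # ((D) (B) (C) (A) (Any) (Mu))
```

Swapping the order to `is C is B` would flip the answer, which is exactly the kind of decision that somebody has to make on behalf of every D.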

For simplicity's sake, I'm going to talk about IntStr in particular - the conclusions apply to other allomorphic types as well.

What we ignored so far is that there is in fact an overlap between the interfaces of Int and Str, and since they do behave differently, there is absolutely no way there could be a type that is simultaneously an Int, a Str and respects Liskov's principle. The previously mentioned Rakudo sources don't make it a secret where the overlaps are: the overlaps will be typically resolved[2] in the numeric direction.

This means that you can have a Str - really, something that passes all type checks, coercive or not! - that will not behave like a Str when you boolify it, smartmatch it against another such value, or step it.

It would be easy to downplay these issues[3] as "who would do something like that" but as I said, after a bunch of type checks, you wouldn't know where the value came from, and built-ins like prompt don't ask twice to give you these incompatible values.
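
Here is a minimal sketch of two such surprises. The is-nonempty sub is a hypothetical "empty check" of the kind I bumped into, and the values are built with IntStr.new to stand in for what prompt or val would hand you.

```raku
# Hypothetical "empty check": the signature only promises a Str.
sub is-nonempty(Str $s --> Bool) { so $s }

my $plain = "0";
my $allo  = IntStr.new(0, "0");   # stands in for val("0") or prompt input

say is-nonempty($plain);      # True
say is-nonempty($allo);       # False
say is-nonempty($allo.Str);   # True

# Stepping: the plain string stays a string, the allomorph goes numeric.
say "007".succ;                   # 008
say IntStr.new(7, "007").succ;    # 8
```

Both arguments pass the Str constraint, so nothing in is-nonempty can tell them apart short of upcasting with .Str first.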

If we take the LSP a little bit more seriously, IntStr isn't a proper Int either: for an Int value with 10 <= $x < 100 you might expect $x.chars == 2, but $x.chars can be any number when you have an IntStr with leading zeroes[4].
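
The same point as a sketch, again using IntStr.new to construct the value explicitly:

```raku
# Passes the Int constraint on the variable, yet carries "0042" as its string.
my Int $x = IntStr.new(42, "0042");

say 10 <= $x < 100;   # True
say $x.chars;         # 4
say $x.Int.chars;     # 2
```

The last line shows the opt-out: once the value is upcast with .Int, the expected relation between the range and the number of characters holds again.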

Now we can see why Allomorphs bring their own implementations of coercion methods, forcefully upcasting them into Str, Int, Real, and so on: this is a way - actually, the only way I know about - to forcefully opt out of Allomorphs and restore the invariants of these types. You got a string? Next time, remember to either call .Str on it, or take notes: you aren't supposed to boolify such strings, smartmatch them against each other, or step them. I'm not sure, though, if this list is exhaustive...

Apropos upcasting to Str: this modification hints that certain methods of Str take care to be invariant (return instances of Str rather than the subtype). This is not only a "red flag" (why was this necessary if Allomorphs are so great?) but also a deliberate breaking change for sure - one more to the list in my previous post.

Bonus issue: <> doesn't even always call val()

This is a random thing I observed: while <i> is a Str (fair enough I suppose) and <1i> is a ComplexStr (the usual val() scenario), for some reason, <0+1i> is outright Complex, unlike both val('0+1i') and qw:v<0+1i>, as taken from the calendar post linked in Rakudo Weekly.
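
If you want to check this on your own Rakudo, a sketch like the following prints the type names side by side. I only annotate the val() line; the rest may well depend on your version.

```raku
say <i>.^name;
say <1i>.^name;
say <0+1i>.^name;
say val('0+1i').^name;   # ComplexStr
```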

This is on Ubuntu 24.04 with Rakudo 2025.12 and it seems like a mere bug. I don't know whether it would be better for this to be a bug or a feature.

Conclusion

I think there is plenty of reason to not have Allomorphs in the core language, or at least avoid them, ranging from the fact that they are trivial to implement outside of the compiler to the fact that you really don't want strings that don't always behave like strings scattered around in your code, coming from external sources.

The conclusion highly depends on some sort of benefit or legitimate use cases of Allomorphs - if you know about those, let me know about them in the comments, and maybe we can iterate on this post, or at least point out some irreconcilable difference of value judgements.


  1. This in itself is not a problem as long as there are no overlaps between the interfaces (method signature clashes), although I think it could have been implemented better: instead of de facto covering interfaces, these interfaces should have been separated into some sort of mixins, possibly roles. 

  2. I'm not sure why so much code needed to be written but the point would stand with automatic, C3-based method resolution as well. 

  3. I'm not sure if this downplaying is probable, though: I don't see much reason to take something for granted that makes the language harder to implement, specify, document, learn. There would have to be strong positive evidence in favor of it. 

  4. At this point you may wonder whether Allomorphs allow for alternate numeric formats from Unicode. They do, although in my experience less than the actual source code would allow. This would be another story but it does mean that there are in fact many more strings that have the same values. 
