While I was writing puzzle game solvers in Crystal with Z3, one issue I was running into more than anything else was the existence of the `Char` type.
Most modern languages don't have a character type. `"foo"[0]` in most languages is `"f"` - a `String` that just so happens to be one character long.
Having a separate `Char` hugely complicates APIs. I can see why it could be a useful thing for performance, but the complexity cost is real.
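To make the friction concrete, here's a minimal Crystal snippet (variable names are mine) showing that indexing a `String` yields a `Char`, and what you have to do to get a one-character `String` back:

```crystal
# In Crystal, indexing a String with a single Int32 returns a Char,
# not a one-character String.
c = "foo"[0]
puts c.class       # Char
puts typeof(c)     # Char

# To get a one-character String you need an explicit conversion,
# or a (start, count) / range index:
puts "foo"[0].to_s   # "f" (String)
puts "foo"[0, 1]     # "f" (String)
puts "foo"[0..0]     # "f" (String)
```

So any code that treats `"foo"[0]` like a tiny string - as it would be in Ruby, Python, or JavaScript - has to be rewritten around the `Char` type.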
Why a character type is problematic in general
The main reason is that in the Unicode world, a lot of operations you might intuitively think would work on characters actually don't. But they work on strings just fine.
Just one such operation out of many is upper-casing something. Here's Crystal:

```crystal
puts "ß".upcase # outputs correctly uppercased SS
puts 'ß'.upcase # outputs lowercase ß
```

Ouch!
There are a lot of situations where upper-casing a length-1 string results in a string longer than 1. So a language that has a separate character type faces a choice: either don't support any such operations on characters (which would be a huge pain), or implement them not quite correctly (like Crystal does).
Crystal-specific issues
In Crystal I was writing a lot of code like `c == "."` or `c =~ /[0-9]/`. The problem here is that these simply return `false` or `nil`, without raising any type issues. So I have code that looks perfectly fine, that would run perfectly fine in Ruby and most other languages, and that the type checker isn't complaining about in any way - and yet it is statically wrong.
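Here's a minimal reproduction of the trap (the variable name `c` is mine; any `Char` pulled out of a `String` behaves the same):

```crystal
c = "2024"[0]   # Char, not String

# Both checks compile without any diagnostic, yet neither can ever succeed:
puts c == "2"       # false - a Char is never equal to a String
puts c =~ /[0-9]/   # nil   - Object#=~ always returns nil

# What you actually have to write in Crystal:
puts c == '2'            # true  - compare against a Char literal
puts c.to_s =~ /[0-9]/   # 0     - convert to String first, match at index 0
```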
So here are some questions:
Should Crystal have exposed `Char` in the first place? If I were designing a language, I wouldn't add such a type, or I'd have an internal one not exposed in regular APIs - but obviously that would be a huge change, so I doubt it would even be a consideration at this point.
Should `"a" == 'a'`? Sure, they're different types, but `420 == 420.0` is true even though they're different types too, so it's not inherently impossible. I'm not sure what the implications would be here.
Should `Char =~ Regexp` match as if the Char were a length-one `String`? I'd say probably yes to this one - at least I'm not seeing a big downside, it has a very obvious meaning, and it's difficult to express otherwise.
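This semantics is easy enough to retrofit as a user-level patch - a sketch of the idea (delegating to the Char's one-character String), not anything Crystal ships:

```crystal
# Hypothetical extension: make Char =~ Regex behave like a length-1 String.
# Reopening Char like this is a user-level monkey-patch, not stdlib behavior.
struct Char
  def =~(regex : Regex)
    to_s =~ regex
  end
end

c = "foo"[0]
puts c =~ /[a-z]/   # 0   - match at position 0, same as "f" =~ /[a-z]/
puts c =~ /[0-9]/   # nil - no match
```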
Should `==` or `=~` with mismatching types pass the type check? Obviously yes, due to union types. If `x` is `String | Nil`, then `x == nil` must compile, which means `"foo" == nil` must be valid code. And the same argument applies to `=~`.
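The union-type argument in code, using `ARGV[0]?` as an example of a `String | Nil` value:

```crystal
x = ARGV[0]?    # typeof(x) is (String | Nil)

# This comparison is meaningful and must compile:
if x == nil
  puts "no argument given"
end

# But because == has to accept String-vs-Nil for the union case,
# this statically-impossible comparison compiles too:
puts "foo" == nil   # false, always
```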
Should `==` or `=~` with types that cannot match produce a warning? Now here's an interesting question. If we statically know that `a == b` or `a =~ b` will be `false`/`nil` due to the types of `a` and `b`, the odds are good that it's a programmer error, not intended code. And it doesn't seem like a terribly complicated analysis to do. So should Crystal warn in such cases? Like with all warnings, that's mainly a question of the false positive rate, as overly aggressive linters are a huge pain.
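A toy sketch of the overlap check such a warning would need - type names as plain strings here; a real implementation would work on the compiler's type graph, and all names below are mine:

```crystal
# Hypothetical lint check: warn when two union types share no member,
# because == between them is then statically always false.
def overlap?(a : Set(String), b : Set(String)) : Bool
  !(a & b).empty?
end

lhs = Set{"String"}          # typeof(ARGV[0])  is String
rhs = Set{"Nil"}             # typeof(nil)      is Nil
puts overlap?(lhs, rhs)      # false -> warn: comparison is always false

lhs = Set{"String", "Nil"}   # typeof(ARGV[0]?) is String | Nil
puts overlap?(lhs, rhs)      # true  -> no warning, comparison is meaningful
```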
Coming next
OK, that's enough Crystal for now. In the next episode we'll try another technology as promised.
Top comments (5)
Thank you! This is a very well-put article and I agree with most of what you say. I opened a forum discussion about this to gather some ideas: forum.crystal-lang.org/t/fair-crit... . I don't think we can change things due to Crystal's 1.0 backwards compatibility promise, but we can think about these things for 2.0, and maybe at least allow matching a Char against a regex.
I'd not count any of the "Better C++" languages as modern, even if they were released yesterday. "Better C++" languages all intentionally sacrifice productivity for other goals like performance (just how many string types does Rust have? It feels like at least 10).

My list would be more like (the latest major versions of) Ruby, Python, JavaScript, Raku, etc. I checked a bunch of what I considered modern languages, and Julia and Crystal seem to be the only ones with a separate Char type.
Anyway, what do you think the false positive rate would be if Crystal had a warning for statically type-impossible `==` or `=~`? I think it's relatively safe with a more traditional type system, but maybe what Crystal is doing makes this impossible.

Given that Crystal has union types, it's essentially impossible to make `==` and `=~` type safe. For example, say we want to restrict comparing numbers against numbers only, never strings. But now you have a variable of type `Int32 | String`. You want to check if that's equal to "hello". The compiler won't let you write that program because it will say "I can't compare Int32 with String". You'd have to write something like `x.is_a?(String) && x == "hello"`. So it will lead to incredibly verbose code. The same argument applies to letting any type only be comparable to some other type.
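What this stricter rule would force on every union-typed comparison, in code (a sketch; today's Crystal accepts the short form directly):

```crystal
arr = [42, "hello"]   # Array(Int32 | String)
x = arr.first         # typed (Int32 | String); value is 42

# Today's Crystal: compiles as-is, which is the point of the tradeoff.
puts x == "hello"                      # false

# Under a hypothetical "only comparable types" rule, you'd have to write:
puts x.is_a?(String) && x == "hello"   # false, and much noisier
```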
With `=~` or `===` we could maybe make it more strict, not sure. But for example, if you have a variable of type `String | Nil` and we'd like to disallow `=~` for `nil` (it will never match), then you'd have to write `!value.nil? && value =~ "..."`.

In the end, this was just a tradeoff between verbosity vs. how common it is to fall into this trap. If you know that `String#[Int32]` returns a Char, and that you can't compare a Char with a Regex, then you won't do it. So I think it's just a matter of how much exposure you had to the language. (Every language has some of this; it's inevitable.)
I didn't mean it as a type check, I meant it more as a linter warning if `==` is statically impossible. A few languages and linters have some sort of "Condition is always false" warning.

In Crystal's case it would be based on type overlap. So the idea is that `ARGV[0] == nil` would trigger this linter rule (`String` and `Nil` don't overlap, so it's always `false`), but `ARGV[0]? == nil` wouldn't (`String | Nil` and `Nil` overlap).

I'm not sure how practical that would be.
I'm almost sure that there would be a lot of false positives, based on how the compiler and language work (or put another way: the type system). Maybe there's a rule for this already in Ameba (a popular Crystal linter).