Vsevolod

Posted on May 17, 2023 • Edited on Sep 26, 2023

Optionality in Java 8 and beyond

#java #typetheory

This is a theoretical look at optionality of values. We'll discuss different approaches to handling "the billion-dollar mistake". I'll be using extremely simple (mostly) JVM examples. Thus, to better feel my pain, at least basic Java knowledge is recommended.

Note that I'm not pretending to be useful here. These are just my structured thoughts on the matter. However, I would be happy if someone finds them valuable or at least entertaining. Without further ado, let's start!

Java before 8

In Java 7 and below, all objects are nullable:

$$ Object = \textcolor{red}{\lbrace null \rbrace \cup} U $$

By $U$ here, I mean the universal set of all possible objects in Java.

As a more specific example, let's take the Integer type:

$$ \begin{aligned} & int = \lbrace~ x~ |~ -2^{31} \le x \le 2^{31}-1 ~\rbrace \cr & Integer = \lbrace null \rbrace \cup int \end{aligned} $$

Because of this, the compiler doesn't help us when a null may show up in place of a value. For example (and yes, a primitive int could be used here, but I want to keep the code as simple as possible):

Integer square(Integer i) {
    return i * i;
}

Later in the code, someone mistakenly calls this function with a null value:

square(null);

Boom! A runtime exception for our clients: NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because "<parameter1>" is null. We've all seen this one.
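
To make the "runtime only" point concrete, here is a minimal sketch. The class name SquareDemo and the squareChecked helper are my own illustrative assumptions, not part of the example above. It shows that the call with null compiles without complaint, and that the usual pre-Java-8 workaround, Objects.requireNonNull (available since Java 7), only improves the error message: the check still happens at runtime.

```java
import java.util.Objects;

// Illustrative sketch; SquareDemo and squareChecked are hypothetical names.
class SquareDemo {
    // Same function as in the text: unboxing a null Integer throws at runtime.
    public static Integer square(Integer i) {
        return i * i;
    }

    // Defensive variant: fails fast with a clearer message,
    // but this is still a runtime check, not a compiler guarantee.
    public static Integer squareChecked(Integer i) {
        Objects.requireNonNull(i, "i must not be null");
        return i * i;
    }

    public static void main(String[] args) {
        System.out.println(square(3)); // prints 9

        try {
            square(null); // compiles fine, fails only when executed
        } catch (NullPointerException e) {
            System.out.println("NPE at runtime");
        }

        try {
            squareChecked(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // prints "i must not be null"
        }
    }
}
```

Either way, the compiler accepts square(null); the only thing we control is how early and how clearly the program fails.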

Haskell

In theory, a compiler can distinguish nullable values from non-nullable ones. Back in the days of Java 7, one example of such a compiler was GHC (the Glasgow Haskell Compiler). In Haskell, types aren't nullable by default, and a special Maybe type is declared for possibly absent values:

data Maybe a = Nothing | Just a

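In such a system, the compiler guarantees that Nothing can never be passed in place of a value, since Int and Maybe Int are two distinct types. Using Int as an example (the Haskell report only guarantees the range below as a minimum; GHC's Int is machine-word sized, so it is wider on 64-bit platforms):

$$ \begin{aligned} & Int \supseteq \lbrace ~x~ |~ -2^{29} \le x \le 2^{29}-1 ~\rbrace \cr & Nothing \notin Int \end{aligned} $$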

The same square function as before:

square :: Int -> Int
square = (^ 2)

And later in the code we try to pass Nothing to it:

square Nothing

Our code doesn't compile: Couldn't match expected type ‘Int’ with actual type ‘Maybe a0’. Wow!

Java 8

Today, all devs in the Java world know "the best way" to deal with nullable types: the new (eh, not really new in 2023) and shiny Optional class. Since we already know the Maybe type, we can see a clear similarity between the two. Let's try using it in our simple example:

Optional<Integer> square(Optional<Integer> o) {
    return o.map(i -> i * i);
}

And then:

square(null);

Huh: NullPointerException: Cannot invoke "java.util.Optional.map(java.util.function.Function)" because "<parameter1>" is null.

The obvious problem here is that we want our Optional<Integer> to be:

$$ Optional~Integer = \lbrace empty() \rbrace \cup int $$

But in reality it is:

$$ Optional~Integer = \textcolor{red}{\lbrace null \rbrace \cup} \lbrace empty() \rbrace \cup int $$

Not good!

I'm not saying that Optional is bad. It's good. It's a good convention. By using optionals, library developers clearly communicate to library users where they can provide or receive no value. However, in statically typed languages, we generally want more. We want dat compiler guarantee (at least I do)!
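
Here is a short sketch of what that convention looks like in practice. The class name OptionalDemo and the variable names are made up for illustration. It uses Optional.ofNullable at the boundary with nullable code, and shows that a null Optional reference is still a perfectly legal value as far as the compiler is concerned.

```java
import java.util.Optional;

// Illustrative sketch; OptionalDemo is a hypothetical name.
class OptionalDemo {
    // Same function as in the text.
    public static Optional<Integer> square(Optional<Integer> o) {
        return o.map(i -> i * i);
    }

    public static void main(String[] args) {
        // The two inhabitants we actually want: a value or empty().
        System.out.println(square(Optional.of(4)));   // prints Optional[16]
        System.out.println(square(Optional.empty())); // prints Optional.empty

        // ofNullable turns a possibly-null reference into one of the two.
        Integer maybeNull = null;
        System.out.println(square(Optional.ofNullable(maybeNull))); // prints Optional.empty

        // The third, unwanted inhabitant: the Optional reference itself is null.
        Optional<Integer> notReallyOptional = null; // compiles without a warning
        try {
            square(notReallyOptional);
        } catch (NullPointerException e) {
            System.out.println("NPE: the Optional itself was null");
        }
    }
}
```

So the convention works only as long as everyone follows it; nothing in the type system enforces it.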

Scala 2

When Java devs want something from Haskell, where do they go? Right. They check out Scala. So did I.

Scala's Option is closer to Haskell's Maybe, since it's an algebraic sum type: None | Some a, in Haskell terms. Thus, you can do pattern matching and other cool things with it. But...

def square(o: Option[Int]): Option[Int] = 
  o match {
    case Some(i) => Some(i * i)
    case None => None
  }

square(null) // java.lang.ExceptionInInitializerError: Caused by: scala.MatchError: null

Runtime exception due to the same problem:

$$ Option~Int = \textcolor{red}{\lbrace null \rbrace \cup} \lbrace None \rbrace \cup Int $$

Project Valhalla

Several years ago, when I checked Project Valhalla for the first time, the code worked like so:

inline class Point {
    private int x;
    private int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int x() { return x; }
    public int y() { return y; }

    public Point add(Point other) {
        return new Point(x + other.x, y + other.y);
    }
}

point.add(null); // error: compilation failed: 
                 // incompatible types: <null> cannot be converted to Point

Compiler guarantees were finally there!

However, in the newest version, even though the Point itself looks neater:

value record Point(int x, int y) {
    public Point add(Point other) {
        return new Point(x + other.x, y + other.y);
    }
}

We are back with our good ol' NPE:

point.add(null); // NullPointerException: Cannot read field "x" because "<parameter1>" is null

Frankly, I don't know why; the reasoning is probably buried somewhere in Valhalla-related discussions. But the sad fact is that even with Valhalla in place, we're still left with no null-safety guarantees from the compiler.

The Problem with Maybe

For now, it looks like Haskell's Maybe is right as rain. But it has the following problem:

$$ Int \textcolor{red}{\not\subset} Maybe~ Int $$

Ideally, Maybe Int would be the set $\lbrace Nothing \rbrace \cup Int$, but it isn't: Just 5 is a different value from 5. Because of this, changes that are compatible in theory become incompatible in Haskell. Let's say we have the following (strange) square function:

square :: Int -> Maybe Int
square 0 = Nothing
square i = Just $ i ^ 2

At some point in time, we decide to return Int:

square :: Int -> Int
square = (^ 2)

Or accept Maybe Int:

square :: Maybe Int -> Maybe Int
square = fmap (^ 2)

Both changes relax the requirements on our clients (we either promise more or demand less), so in theory they should be backward compatible. But in Haskell they aren't: compilation breaks for our clients.
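
The same effect can be reproduced on the JVM with Optional, which is also a wrapper rather than a union (Integer is not a subtype of Optional<Integer>). Below is a minimal Java sketch of the first case. The names WrapperCompat, squareV1 and squareV2 are hypothetical and stand for two releases of the same library function.

```java
import java.util.Optional;

// Illustrative sketch; all names here are hypothetical.
class WrapperCompat {
    // Release 1: may return no value.
    public static Optional<Integer> squareV1(int i) {
        return i == 0 ? Optional.empty() : Optional.of(i * i);
    }

    // Release 2: always returns a value, which is a stronger guarantee.
    public static Integer squareV2(int i) {
        return i * i;
    }

    public static void main(String[] args) {
        // Client code written against release 1.
        int a = squareV1(2).map(x -> x + 1).orElse(0);

        // The same client code against release 2 does not compile,
        // because Integer has no map method:
        // int b = squareV2(2).map(x -> x + 1).orElse(0);

        // The client has to be rewritten, even though the library
        // only strengthened its promise.
        int b = squareV2(2) + 1;

        System.out.println(a + " " + b); // prints "5 5"
    }
}
```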

The idea for this chapter was stolen from the Maybe Not talk by Rich Hickey.

The Union Way

Can something be better than Haskell? For our use case, yes.

Returning to the JVM, we can find the type we were looking for: Kotlin's Int?.

$$ Int? = \lbrace null \rbrace \cup Int $$

Firstly, non-nullable Int gives us compiler guarantees that it is actually $\lbrace~ x~ |~ -2^{31} \le x \le 2^{31}-1 ~\rbrace$ :

fun square(i: Int): Int = i * i

square(null) // error: null can not be a value of a non-null type Int

Secondly, relaxing the requirements works without breaking our clients (they'll only get warnings from the compiler):

fun square(i: Int): Int? = when (i) {
    0 -> null
    else -> i * i
}

Guaranteeing to return a value:

fun square(i: Int): Int = i * i

Or accepting nulls as well as values:

fun square(i: Int?): Int? = i?.let { it * it }

Doesn't break clients. For example, square(2)?.let { it + 1 } works for all three functions.

Scala 3

While I was deciding whether to publish this or not, Scala 3 was released (yes-yes, the first draft of this post was written several years ago). Dotty has built-in support for union types and an opt-in flag, -Yexplicit-nulls, that enables null safety.

My previous example from Scala 2 now (in version 3.2.2) gives a compile-time error: Found: Null, Required: Option[Int].

Backward compatibility is in place as well:

square(3).nn + 1 // works for all examples below

def square(i: Int): Int | Null = i match
  case 0 => null
  case _ => i * i

def square(i: Int): Int = i * i

def square(i: Int | Null): Int | Null = i match
  case null => null
  case _ => i * i

Yep! Exactly what we wanted.

Although... Compare the Kotlin code for the Int? -> Int? function with the Int | Null -> Int | Null definition in Scala above. A one-liner has turned into a match/case expression. Scala lacks operators like ?. or ?:, which makes working with nullable types awkward. Also, since the feature is new and optional (pun intended), it's far less widespread in Scala codebases. So, for now, I would give a point to Kotlin using a pen and to Scala using a pencil. That said, the future of proper null safety looks bright in the Scala world.

Today, we've reviewed existing ways of handling nulls in different languages (mostly on the JVM). To sum up, let's assign points to each approach discussed:

No compiler guarantees at all (Java).
Compiler guarantees (Haskell).
Proper union type (Scala, Kotlin).
- Kotlin gets an extra 0.5 for better standard null-handling utilities.

Am I promoting Kotlin here? Probably not. Encouraging Java developers to try it and make their own informed decision? Definitely yes.

In the following article (if it ever gets published), I am planning to discuss the cons of Kotlin's implementation by leveraging such powerful means as abstraction and composition. Thanks for reading!
