Martin Häusler

Posted on Dec 10, 2017

Modernizing Java - A language feature wish list (Part 2)

#java #javascript #ruby

In this second part of the "Modernizing Java" series, we will consider language features from C#, JavaScript (ES6) and others. The first part of the series can be found here.

Features from Ruby

Ruby is a scripting language, particularly well-known for the web framework "Ruby on Rails". It's quite a clean language as far as scripting languages go, and it was the first language where I encountered the yield keyword and the concept of a coroutine. Basically, yield allows you to exit the current control flow of a function, and when it is called the next time, you continue where you left off:

// this is how coroutines could look in Java
public Iterator<Number> powersOfTwo(){
   int current = 1;
   while(true){
      yield current;  // note the new "yield" keyword here
      current *= 2;
   }
}

The example above is a generator for an infinite sequence. Note that we do not burn CPU cycles with our while(true) loop here. Since we exit the control flow in each iteration, only one iteration is executed for each call to iterator.next(). The returned iterator is implicit, you don't need to define it. This concept has also been adopted by ES6, Python, C# and many other languages, and people are putting it to great use (hello, Redux Saga!). Like many other features in this blog series, this is a quality-of-life enhancement and can be "emulated" in standard Java. However, I really think that this would be very useful.
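For comparison, here is a minimal sketch of what such an emulation could look like in standard Java today. The class name PowersOfTwo and the helper firstN are made up for this illustration. The point is only that the state which yield would keep implicitly has to be stored in a field by hand:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hand-written stand-in for the hypothetical "yield" version above.
class PowersOfTwo implements Iterator<Integer> {

    // the state that "yield" would preserve between calls
    private int current = 1;

    @Override
    public boolean hasNext() {
        return true; // infinite sequence (int overflow is ignored in this sketch)
    }

    @Override
    public Integer next() {
        int result = current;
        current *= 2;
        return result;
    }

    // Hypothetical helper: collects the first n elements of the sequence.
    static List<Integer> firstN(int n) {
        Iterator<Integer> iterator = new PowersOfTwo();
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(iterator.next());
        }
        return values;
    }
}
```

This works for a simple loop, but the manual bookkeeping grows quickly once the generator has several exit points or nested loops, which is where a yield keyword would pay off.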

Features from C#

Programmers often label C# as "the next evolution of Java". Indeed, the two languages share many commonalities, and if it weren't for differences in the standard library, one could quite possibly write a transpiler that converts a C# source code file into a Java source code file and vice versa. A full-blown comparison is beyond the scope of this article, but C# offers a number of interesting language features that do not exist in Java.

Partial classes

In C#, you can label a class as partial. This allows you to split one class across multiple files, but the compiler treats them as one:

// in file "myClassPart1.cs"
public partial class MyClass {

}

// in file "myClassPart2.cs"
public partial class MyClass {

}

It's different from an import statement, because in the end, there is only one class in the binary files. "Why would somebody want to do that?" you may ask. The primary reason why this is useful is code generation. For example, there are powerful WYSIWYG UI builders that produce C# source code files (e.g. one is integrated in Visual Studio). If you ever had the questionable pleasure of dealing with code generation, then you will know the pain of having to manually edit auto-generated files. The problem is: once you re-run the generator, your manual changes are lost. In the Java world, there have been efforts to "mark" sections of the hand-written code as such, so that the generator will leave them alone (see, for example, the code generation facilities of EMF).

With partial classes, those pains are gone for good. The generator controls one file (one part of the class) while your hand-written code goes into an entirely different file, which just happens to be another part of the very same class. You can be sure that your hand-written changes will not be overwritten by some automated generator, because they reside in a different file of which the generator is unaware.

This is a feature that only concerns the Java compiler; the runtime remains untouched because in the end, only a single *.class file is produced. Java is a popular target for code generation, and having partial classes would help to ease the pain with generated code a lot.
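Until something like this exists, one workaround in plain Java is to let the generator own an abstract base class and keep the hand-written code in a subclass (sometimes called the "generation gap" pattern). The sketch below uses made-up names (PersonBase, Person) and does not reflect how any particular generator works:

```java
// File "PersonBase.java": owned by the (hypothetical) generator,
// overwritten on every generator run.
abstract class PersonBase {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// File "Person.java": written by hand, never touched by the generator.
class Person extends PersonBase {
    public String greeting() {
        return "Hello, " + getName() + "!";
    }
}
```

Unlike partial classes, this leaves two classes in the binary, uses up the single inheritance slot, and gives the hand-written part no access to private members of the generated part.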

The `event` keyword

This is a comparatively small detail of C#, yet one that I personally enjoy: the event keyword. How often did you write code like this in Java:

private Set<EventListener> eventListeners = new HashSet<>();

public void registerEventListener(EventListener listener){
   this.eventListeners.add(listener);
}

public void removeEventListener(EventListener listener){
   this.eventListeners.remove(listener);
}

public void fireEvent(Event event){
   // notify every registered listener
   for(EventListener listener : this.eventListeners){
      listener.onEvent(event);
   }
}

It's really repetitive. If you have a class that deals with 5 different event classes, then the code above has to be duplicated and adapted four more times. In C#, you get all of the code above like this:

public event MyEvent MyEvent;

If you want to add event listeners:

myClass.MyEvent += myListener;

... and to fire the event internally:

this.MyEvent(e);  // "event" itself is a reserved word in C#, so the argument is named "e"

Look Ma, no for loop! This is a really small thing, but it eliminates a lot of boilerplate code. Whether using the observer pattern is a good idea in general is an entirely different discussion.
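In current Java, the duplication can at least be reduced by moving the listener bookkeeping into a reusable generic class. The following is a minimal sketch under that assumption. EventSource and Button are made-up names, and java.util.function.Consumer stands in for a dedicated listener interface:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper: one instance replaces the listener set plus the
// register/remove/fire methods for a single kind of event.
class EventSource<E> {

    private final Set<Consumer<E>> listeners = new LinkedHashSet<>();

    public void add(Consumer<E> listener) { listeners.add(listener); }

    public void remove(Consumer<E> listener) { listeners.remove(listener); }

    public void fire(E event) {
        // notify listeners in registration order
        for (Consumer<E> listener : listeners) {
            listener.accept(event);
        }
    }
}

class Button {
    // one line per kind of event instead of a field plus three methods
    public final EventSource<String> clicked = new EventSource<>();

    public void click() { clicked.fire("click"); }
}
```

A caller would write button.clicked.add(listener). Note that this sketch is not thread-safe, and unlike a C# event it lets any caller fire the event, not just the declaring class.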

Tuples

In a recent version of C#, native support for tuples was added. This allows you to easily construct, pass along, and deconstruct pairs, triples, quadruples, you name it. Here's what it looks like:

(int count, double sum, double sumOfSquares) = ComputeSumAndSumOfSquares(sequence);

What happened here? ComputeSumAndSumOfSquares returns a triple, containing the count, the sum, and the sum of squares. We receive all three values in a single method call. In case we are not interested in some of those three values, we can replace the corresponding variable declarations with _:

(_, double sum, _) = ComputeSumAndSumOfSquares(sequence);

It's simple, it's elegant, it's a shame that it does not exist in Java.
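To illustrate the gap, this is roughly what the same method requires in Java today: a dedicated result class. The sketch below is my own illustration. The class name Stats and the exact signature are assumptions, not taken from any library:

```java
import java.util.List;

// Hypothetical result type: stands in for the C# tuple (count, sum, sumOfSquares).
final class Stats {
    final int count;
    final double sum;
    final double sumOfSquares;

    Stats(int count, double sum, double sumOfSquares) {
        this.count = count;
        this.sum = sum;
        this.sumOfSquares = sumOfSquares;
    }

    static Stats computeSumAndSumOfSquares(List<Double> sequence) {
        double sum = 0;
        double sumOfSquares = 0;
        for (double value : sequence) {
            sum += value;
            sumOfSquares += value * value;
        }
        return new Stats(sequence.size(), sum, sumOfSquares);
    }
}
```

The caller then has to unpack the values one by one (stats.count, stats.sum, stats.sumOfSquares), which is exactly the ceremony that tuple syntax removes.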

`nameof`

A good programming habit is writing preconditions to make sure that the parameters you receive indeed match the specification. This allows your methods to fail fast and provide precise error messages. Now, if you consider this code:

public long sum(Iterator<Long> values){
   if(values == null) { throw new IllegalArgumentException("Argument 'values' must not be NULL!"); }
   // ...
}

... you will notice that values appears twice: once as a parameter name, and once inside a string literal. That's fine in and of itself, but what happens if I rename the variable? The string literal won't change, because the IDE is unaware of the semantic correlation between the two (you could enable replacement inside strings too, but that has other issues...). C# offers an elegant solution:

public long Sum(IEnumerator<long> values){
   if(values == null) { throw new ArgumentException("Argument '" + nameof(values) + "' must not be NULL!"); }
   // ...
}

As you can see, nameof eliminates the need to hard-code variable names into string literals. nameof produces the name of the passed variable as a string. Another small thing, but a useful one, in particular for error messages.

Features from JavaScript (in particular ES6)

ES6 has a couple of very neat enhancements for JavaScript regarding the syntax.

Object Destructuring

One of the most useful ones is called object destructuring. How often did you write code like this in Java:

MethodResult result = someMethod();
int size = result.size();
byte[] data = result.getData();
User author = result.getAuthor();

ES6 eliminates a lot of ceremony here:

const { size, data, author } = someMethod();

This is similar to C# tuples, but not quite the same. ES6 looks for equally named fields in the result object of someMethod, and assigns them to new local variables. The syntax can actually do a lot more (such as renaming and assigning default values in case of absence), but that's for another blog post. While this won't work as smoothly in Java (because getters need to be identified and called etc.), having something along these lines would definitely be useful.

Implicit conversion from Object to Boolean

When writing JavaScript code, as much as I loathe implicit conversions in general, there is one construct that I enjoy using:

if(this.header){
   // render header
}

Note that header in the code above is not a boolean, it's a data structure. By using it in an if statement, we check that it is not null (or undefined, but that's another story). This implicit conversion from Object to boolean by checking null-ness is definitely useful. However, it does have some issues in JavaScript when it comes to working with numeric values, because the number 0 also implicitly converts to false; a convention that should never have reached beyond low-level languages like C in my opinion. Checking for null-ness is a very common task in Java, and making it quicker and easier to do seems like a good idea.
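The closest thing Java 8 offers today is probably java.util.Optional. The sketch below shows how the header check could be written with it. The Page class and its render method are made up for this illustration:

```java
import java.util.Optional;

class Page {
    private final String header; // may be null

    Page(String header) {
        this.header = header;
    }

    String render() {
        StringBuilder out = new StringBuilder();
        // rough Java equivalent of "if(this.header) { /* render header */ }"
        Optional.ofNullable(header)
                .ifPresent(h -> out.append("<h1>").append(h).append("</h1>"));
        out.append("<p>body</p>");
        return out.toString();
    }
}
```

Unlike the JavaScript version, this checks for null only: an empty string or a zero still counts as present. It is also arguably no shorter than a plain if(header != null), which is why a dedicated language construct would be nice.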

Features from C/C++

Did you ever run into a situation in Java where you want to write a cache of configurable size (in megabytes)? Well, then you're in deep trouble. In Java, you don't know how big an object actually is. Usually you don't need to care, but if you run into such a corner case, those issues will be back with a vengeance. You can estimate the size of an object via reflection, but this is a slow and expensive operation. Alternatively, you can use Java instrumentation via an agent, but that complicates the deployment of your application and in general feels wrong, given that you only want to do something as simple as measuring the size of an object in memory. What I would really like to see in Java is what C/C++ provide out-of-the-box, which is the sizeof keyword. I realize that this is not an easy task to do in the JVM, but it's nigh impossible for programmers writing "clients" on the JVM.

Features from Haskell

Haskell is a purely functional language with lazy evaluation, heavily influenced by Miranda and the ML family of languages (to which OCaml also belongs).

List comprehension

Generating lists is a common task in programming. Haskell makes this aspect really easy by introducing list comprehensions. For example:

[(i,j) | i <- [1,2], j <- [1..4] ]

... will produce the pairs [(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4)]. Try that with nested for loops and you will see why the syntax above is great.
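For comparison, here is one way to get the same result with Java 8 streams instead of nested for loops. This is only a sketch. The class name Pairs is made up, and because Java has no tuple type, each pair is represented as a two-element list:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class Pairs {
    // Roughly [(i,j) | i <- [1,2], j <- [1..4]]
    static List<List<Integer>> pairs() {
        return IntStream.rangeClosed(1, 2).boxed()
                // for every i, produce the pairs (i,1) .. (i,4)
                .flatMap(i -> IntStream.rangeClosed(1, 4)
                        .mapToObj(j -> Arrays.asList(i, j)))
                .collect(Collectors.toList());
    }
}
```

It does the job, but the intent is buried under boxed, flatMap, mapToObj and collect, whereas the Haskell version fits on one short line.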

Partial Application

In Haskell, you can partially apply functions, producing new ones in the process. For example:

add x y = x + y
addOne = add 1
add 3 4 -- produces 7
addOne 6 -- also produces 7

addOne is now a function with one argument that adds the constant 1. You can do something similar in Java today too:

BiFunction<Integer, Integer, Integer> add = (a,b) -> a + b;
Function<Integer, Integer> addOne = (a) -> add.apply(1, a);

... except that you need a lot more ceremony. This is also similar to the bind function in JavaScript, and default value parameters (found in several languages). Even though partial application is most widely used in functional programming, it's an aspect that is easy to "extract", because it does not depend on the other characteristics of functional programming (such as lazy evaluation). It theoretically works in any language that allows function (or method or procedure or...) calls. I don't have an explanation why there is so little adoption of this neat feature.
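Some of that ceremony can be hidden behind a small generic helper. The following sketch assumes a made-up utility class Partial. Nothing like it ships with the JDK as far as I know:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

class Partial {
    // Hypothetical helper: fixes the first argument of a two-argument function
    // and returns the remaining one-argument function.
    static <A, B, R> Function<B, R> partial(BiFunction<A, B, R> function, A first) {
        return second -> function.apply(first, second);
    }

    static final BiFunction<Integer, Integer, Integer> ADD = (a, b) -> a + b;

    // the Java counterpart of "addOne = add 1"
    static final Function<Integer, Integer> ADD_ONE = partial(ADD, 1);
}
```

With this in place, Partial.ADD_ONE.apply(6) evaluates to 7. The helper only covers two-argument functions though. Every additional arity needs its own interface and its own overload, which Haskell gets for free.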

Conclusion

I hope that you enjoyed this tour of language features. Java is a very good language in many ways, but it needs to continue evolving. In this blog series, I tried to give an overview of "what everybody else is doing". Did I miss something significant? Are there any other language features that you would like to see in Java that were not covered in this series at all? Let me know in the comments :)

Thanks for reading!

Top comments (2)

Alex Reilly • Dec 12 '17

Swift enums are super powerful (automatic raw values, associated values, pattern matching, enforced exhaustive switch statements, breaking switch cases by default), and the ecosystem around optionals is fantastic.

mbtts • Jan 30 '18 • Edited

Another one from ES6/Scala/PHP and maybe a few others - string template literals.

"Expected " + obj.getValue() + ", but actually " +  other.getValue() + "."

Versus:

"Expected ${obj.getValue()}, but actually ${other.getValue()}."