OverOps, an Israeli company that helps developers understand what happens in production, researched which Java exceptions are the most common in production. Want to guess which one is in #1 place? NullPointerException.
Why is this exception so frequent? I argue (as does Uncle Bob in his book Clean Code 😉) that it is not because developers forget to add null checks.
The reason: developers use nulls too often.
In C# and Java, all reference types can point to null. We can get a reference to point to null in the following ways:
“Uninitialized” reference-type variables: variables that are initialized with null and assigned their real value afterward. A bug can cause them to never be reassigned.
Uninitialized reference-type class members.
Explicit assignment to null, or returning null from a function.
Here are some patterns I noticed in functions returning null:
Returning null when the input is invalid. This is one way of returning error codes. I think it is an old school programming style, originating in the times when exceptions didn’t exist.
An entity’s property can be optional. When there is no data for an optional property, it returns null.
In hierarchical models, we can usually navigate up and down. When we are at the top, we need a way to say so, and we usually do it by returning null.
When we want to find an entity by criteria in a collection, we return null as a way to say the entity was not found.
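The patterns above can be sketched in Java roughly like this. Employee, nickname, parseAge, and findById are hypothetical names I made up for illustration; they do not come from any real codebase:

```java
import java.util.List;

// Hypothetical types, used only to illustrate the patterns above.
final class Employee {
    final int id;
    final String nickname;  // optional property: null when there is no data
    final Employee manager; // hierarchy: null when this employee is at the top

    Employee(int id, String nickname, Employee manager) {
        this.id = id;
        this.nickname = nickname;
        this.manager = manager;
    }
}

final class NullReturningPatterns {
    // Null used as an error code for invalid input.
    static Integer parseAge(String text) {
        if (text == null || !text.matches("\\d{1,3}")) {
            return null;
        }
        return Integer.valueOf(text);
    }

    // Null used to say "not found".
    static Employee findById(List<Employee> employees, int id) {
        for (Employee employee : employees) {
            if (employee.id == id) {
                return employee;
            }
        }
        return null;
    }
}
```

In every case the caller has to know, from somewhere other than the signature, that null is a possible result.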
The code in which the NullPointerException is raised can be very far from where the bug is. This makes tracing the real problem harder, especially if the code is branched.
In the following code example, there is a bug somewhere in class A that causes entity to be null, but the NullPointerException is raised inside a function of class B. Real-life code can be much more complicated.
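Here is a minimal Java sketch of that situation. The class bodies, the map lookup, and the misspelled key are my own assumptions, chosen only to make the distance between the bug and the exception visible:

```java
import java.util.HashMap;
import java.util.Map;

final class Entity {
    final String name;

    Entity(String name) {
        this.name = name;
    }
}

// Class A: the bug lives here.
final class A {
    private final Map<String, Entity> entities = new HashMap<>();

    A() {
        entities.put("invoice", new Entity("Invoice"));
    }

    Entity load() {
        // Bug: the key is misspelled, so Map.get quietly returns null.
        return entities.get("invoce");
    }
}

// Class B: nothing is wrong here, yet this is where the exception is raised.
final class B {
    static int nameLength(Entity entity) {
        return entity.name.length(); // NullPointerException when entity is null
    }
}
```

The stack trace of `B.nameLength(new A().load())` points at class B, while the line that needs fixing is in class A.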
I often encounter null checks that look as if the developer was thinking:
“I know I should check for null but I don’t know what it means when the function returns null and I don’t know what to do with it,” or
“I think this cannot be null but just to make sure, I don’t want it to blow up in production.”
It usually looks like this:
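A minimal Java sketch of such a check follows. TicketService, SeatRepository, and Seat are hypothetical names I invented for illustration, and the repository always returns null to stand in for "something went wrong upstream":

```java
import java.util.ArrayList;
import java.util.List;

final class Seat {
    final int number;

    Seat(int number) {
        this.number = number;
    }
}

final class SeatRepository {
    // Stand-in for a lookup that failed for some reason and returned null.
    Seat findFreeSeat(String showId) {
        return null;
    }
}

final class TicketService {
    final List<String> issuedTickets = new ArrayList<>();
    private final SeatRepository seats = new SeatRepository();

    String buyTicket(String showId) {
        Seat seat = seats.findFreeSeat(showId);
        if (seat != null) { // the "just to make sure" null check
            issuedTickets.add(showId + ":" + seat.number);
        }
        // No ticket was issued, but the caller is still told everything went fine.
        return "Success";
    }
}
```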
Those kinds of null checks cause some logic to be skipped silently, with no way to know it happened. Code written this way lets part of a flow fail while the flow as a whole reports success. It can also cause a bug in other functionality that assumes the skipped code did its job.
Imagine you buy a ticket to a show online. You get a success message! The day of the show finally arrives: you leave work early, arrange a babysitter, and go to see the show. When you arrive, you discover you don’t have tickets, and there are no empty seats. You return home upset and confused 😠. Can you see how this kind of null check can cause this situation?
It also makes the code branched and ugly 😢
In C# and Java, reference types can always point to null. This means we cannot tell, by looking at a function signature, whether null is a valid input or output. I believe most functions neither return nor accept null.
Because it is hard to know whether a function returns null (unless it is documented), developers either insert null checks where they are not needed or skip them where they are needed. And yes, sometimes they put null checks exactly where they are needed 😉.
This poor design choice causes the problems I described before in “Hidden errors” and a lot of NullPointerException errors, of course. Lose-lose situation. 😢
There are languages like Kotlin that aim to eliminate NullPointerException errors by differentiating between nullable references and non-nullable references. This lets the compiler catch null being assigned to a non-nullable reference and makes sure developers check for null before dereferencing a nullable reference, all at compile time.
Microsoft is adopting the same approach by introducing Nullable Reference Types in C# 8.
Robert C. Martin, who is widely known as “Uncle Bob,” wrote one of the most famous books about clean code called (surprisingly) “Clean Code”. In this book, Uncle Bob claims that we should not return null and should not pass null to a function.
I want to propose some technical patterns for eliminating null usage. I am not saying this is the best solution for every scenario — just options.
The option type is a different way to represent an optional value. This type lets you ask whether a value exists and, if so, access it. Trying to access a value that doesn’t exist raises an exception right away. This solves the problem of a NullPointerException being raised in code far away from the bug. In Java there is the Optional class. In C# (until C# 7) there is the Nullable type, which works only for value types, but you can create your own option type or use a library.
A straightforward approach is to replace a reference that can be null (by logic) with this type:
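In Java, it could look roughly like this. The Employee class and its two factory methods are hypothetical names of mine; only Optional itself comes from the standard library:

```java
import java.util.Objects;
import java.util.Optional;

final class Employee {
    private final String name;
    private final Optional<Employee> manager; // empty when there is no manager

    private Employee(String name, Optional<Employee> manager) {
        this.name = name;
        this.manager = manager;
    }

    static Employee topLevel(String name) {
        return new Employee(name, Optional.empty());
    }

    static Employee reportingTo(String name, Employee manager) {
        // Optional.of rejects null, so a missing manager cannot sneak in here.
        return new Employee(name, Optional.of(Objects.requireNonNull(manager)));
    }

    String name() {
        return name;
    }

    // The signature now says out loud that the manager may be absent.
    Optional<Employee> manager() {
        return manager;
    }
}
```

A caller that is unsure uses `employee.manager().isPresent()`. A caller that assumes a manager exists calls `employee.manager().get()`, which throws NoSuchElementException on the spot if the assumption is wrong.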
Each function that returns null is converted into two functions. The first has the same signature but throws an exception instead of returning null. The second returns a boolean that says whether it is valid to call the first. Let’s see an example:
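Here is a sketch of the idea in Java. The methods hasManager() and manager() stand in for HasManager and Manager; the Employee class, its factory methods, and the choice of IllegalStateException are my own assumptions:

```java
import java.util.Objects;

interface IEmployee {
    // True when it is valid to call manager().
    boolean hasManager();

    // Throws IllegalStateException instead of returning null.
    IEmployee manager();
}

final class Employee implements IEmployee {
    // Null is confined to this private field and never leaves the class.
    private final IEmployee manager;

    private Employee(IEmployee manager) {
        this.manager = manager;
    }

    static Employee topLevel() {
        return new Employee(null);
    }

    static Employee reportingTo(IEmployee manager) {
        return new Employee(Objects.requireNonNull(manager));
    }

    @Override
    public boolean hasManager() {
        return manager != null;
    }

    @Override
    public IEmployee manager() {
        if (manager == null) {
            throw new IllegalStateException("This employee has no manager");
        }
        return manager;
    }
}
```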
If the code holding an IEmployee instance assumes this employee has a manager, it should call Manager. If it cannot make that assumption, it should call HasManager and handle the two possible outputs.
Let’s see another example:
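A Java sketch of the same split for a lookup. The in-memory map is an assumption of mine, standing in for whatever storage is really used, and the method names follow Java casing rather than the names used in the text:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class Employee {
    final int id;
    final String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class EmployeeRepository {
    private final Map<Integer, Employee> byId = new HashMap<>();

    void add(Employee employee) {
        byId.put(employee.id, employee);
    }

    // Answers "is it valid to call findEmployeeById with this id?"
    boolean containsEmployeeById(int id) {
        return byId.containsKey(id);
    }

    // Throws instead of returning null when the employee does not exist.
    Employee findEmployeeById(int id) {
        Employee employee = byId.get(id);
        if (employee == null) {
            throw new NoSuchElementException("No employee with id " + id);
        }
        return employee;
    }
}
```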
The logic of ContainsEmployeById is the same as that of FindEmployeById, but without returning the employee. Now let’s say that those functions reach the DB: calling both means two round trips, so we have a performance problem here. Let’s introduce a similar but different pattern: the boolean function, when returning true, will also return the data we searched for. It looks like this:
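Java has no out parameters, so in this sketch I hand the found employee to a callback instead. The name tryFindEmployeeById, the Consumer parameter, and the in-memory map are all assumptions of mine, not an established API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class Employee {
    final int id;
    final String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class EmployeeRepository {
    private final Map<Integer, Employee> byId = new HashMap<>();

    void add(Employee employee) {
        byId.put(employee.id, employee);
    }

    // One lookup only: returns true and passes the employee to onFound
    // when it exists, returns false and never calls onFound otherwise.
    boolean tryFindEmployeeById(int id, Consumer<Employee> onFound) {
        Employee employee = byId.get(id);
        if (employee == null) {
            return false;
        }
        onFound.accept(employee);
        return true;
    }
}
```

The caller still has to handle both outcomes, but the storage is reached only once.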
The fact that I can split a function into two functions, each with its own usages, is a sign that returning null is a code smell for violating the Single Responsibility Principle.
A practical guideline we can derive from the Liskov Substitution Principle is that a class must implement all the functions of an interface it implements. Returning null or throwing an exception are ways to not implement a function. So returning null is a code smell for violating the Liskov principle.
If a class can’t implement a specific function of an interface, we can move that function to another interface, and each class will implement only the interfaces it can.
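A Java sketch of that move, where instanceof plays the role of C#'s is. IManagedEmployee, Ceo, Engineer, and OrgChart are hypothetical names I chose for illustration:

```java
interface IEmployee {
    String name();
}

// Only employees that really have a manager implement this interface.
interface IManagedEmployee extends IEmployee {
    IEmployee manager();
}

final class Ceo implements IEmployee {
    private final String name;

    Ceo(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

final class Engineer implements IManagedEmployee {
    private final String name;
    private final IEmployee manager;

    Engineer(String name, IEmployee manager) {
        this.name = name;
        this.manager = manager;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public IEmployee manager() {
        return manager;
    }
}

final class OrgChart {
    static String describe(IEmployee employee) {
        // The type check replaces both the null check and a hasManager() call.
        if (employee instanceof IManagedEmployee) {
            IManagedEmployee managed = (IManagedEmployee) employee;
            return employee.name() + " reports to " + managed.manager().name();
        }
        return employee.name() + " has no manager";
    }
}
```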
Now instead of asking employee.HasManager (which we would do if we used the first approach, “Splitting the function into two”), we ask employee is
In existing codebases, there is a lot of code returning reference types. We cannot know if null is a valid output or not.
The first quick win I suggest is to change your coding conventions so that null is not a valid input or output of a function. Or, at least, when you decide that null is a valid output, use the Option type.
Tools such as ReSharper and NullGuard can help enforce this convention. I guess, although I haven’t tried this yet, that you can also add a custom rule to SonarQube that alerts when the word null appears.
I would love to know what you think. Are you going to embrace this convention? And if not, why? What’s holding you back?
If you encounter a scenario in which you think returning null is the right design choice, or the patterns I suggested are not good, I would love to know.
Thanks to Mark Kazakov for the funny meme, Alex Zhitnitsky from OverOps for answering my questions, Baot for organizing a great writing event for new bloggers, and Itzik Saban, Amitay Horwitz and Max Ophius for giving me feedback.