
Has type safety gone too far?

Donald Sebastian Leung ・ 2 min read

Types undeniably reduce, and perhaps even eliminate, certain classes of bugs, such as misinterpreting the data stored in a variable/register/etc., so I see how type safety is important in this respect. Just compare C to Java: it is rather easy to accidentally cause a segfault in the former if you're not careful enough, while the most you'll get in the latter is a NullPointerException if you're really unlucky. However, in my opinion, modern programming languages like Swift have taken it way too far.

What do I mean? In modern programming languages like Swift, Kotlin and Go where type safety is emphasized above everything, you can't even multiply an integer by a double without explicitly converting the integer to a double first. This makes code that mixes integral and floating-point types in numerical calculations unnecessarily verbose when it should be very simple and straightforward. I get why implicit conversion from double to integer (for example) is prohibited in languages like Java (since it's a lossy conversion and the programmer may not have intended it), but preventing implicit conversion the other way round is just absurd. What bugs can you possibly make by "accidentally" multiplying an integer by a double?
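
For concreteness, here is a minimal Swift sketch of the complaint (the variable names are mine):

let count: Int = 3
let price: Double = 2.5

// let total = count * price        // does not compile: '*' cannot mix Int and Double

let total = Double(count) * price   // explicit conversion required: 7.5
print(total)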

Another thing that I find really annoying is how Swift and Kotlin (and possibly Go?) treat nullable and non-nullable types completely differently. At least in Swift, everything is non-nullable by default (even reference types!). This means that if you want to define a recursive data structure such as a linked list, you have to explicitly mark the link as nullable or the code won't compile. And when you try to dereference a nullable type, you can't even use plain dot notation (e.g. someObject.somePropertyOrMethod) - you have to "safely" dereference it by writing either a ? or a ! before the dot. All this means the programmer has to remember an extra set of rules and do more typing just for the sake of "type safety". I get that this is intended to reduce the chances of a "null pointer exception" occurring, but if an inexperienced programmer abuses the ! dereferencing just to get their code to compile, then a null pointer exception will happen anyway!
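
To make that concrete, here is a rough Swift sketch (the names are mine) of a nullable link in a list and of the two dereferencing styles:

// The link must be declared Optional, otherwise there is no way to end the list.
class Node {
    var value: Int
    var next: Node?                       // nullable by explicit choice

    init(value: Int, next: Node? = nil) {
        self.value = value
        self.next = next
    }
}

let head = Node(value: 1, next: Node(value: 2))

// let v = head.next.value               // error: plain dot notation is rejected on an Optional
let safe = head.next?.value              // Int? - nil if there is no next node
let forced = head.next!.value            // Int - crashes at runtime if next is nil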

My point is that adequate type safety is important, especially for inexperienced programmers (hence they should learn a safer language such as Python or Java before they learn C/C++), but the way it is done in modern "type-safe" languages like Swift is more akin to "helicopter parenting": it tries to shield you from every possible source of danger, but does so in such a controlling manner that it harms the developer in the long run. What do you think?

Posted by Donald Sebastian Leung (@donaldkellett), a Year 2 Computer Science and Engineering undergraduate at The Hong Kong University of Science and Technology.

Discussion

 

I get that this is intended to reduce the chances of a "null pointer exception" occurring, but if an inexperienced programmer abuses the ! dereferencing just to get their code to compile, then a null pointer exception will happen anyway!

Languages should not base decisions on whether or not something inconveniences an inexperienced or lazy programmer. Null is the billion-dollar mistake. Literal decades of experience have shown that just trusting developers to get it right doesn't work. Preventing this mistake does not harm anyone in the long run.

Everything should be non-nullable by default, because 'null' can mean too many things. Did I forget to initialize this? Did a memory allocation fail? Did some function that was meant to create an instance of this fail? Does it just mean there is no value and that should be considered normal? You don't know because 'null' has no context, it could mean any of those things. So how do you fix it if you don't know the reason why it broke?

And when you try to dereference a nullable type, you can't even use plain dot notation (e.g. someObject.somePropertyOrMethod) - you have to "safely" dereference it by writing either a ? or a ! before the dot.

This is the "Optional" concept found in many functional programming languages. It essentially wraps your type in a box to make it explicitly known whether something has a value or not. That's why you can't just use dot notation, because that's not your thing - your thing is inside of the box. The language forces the programmer to handle the case of None/Nil, instead of just hoping that they will. This is a Good Thing (TM), because now the compiler can instantly point out when you aren't doing this correctly for you. The other members on your team might miss it in a code review, your compiler never will.

Everyone should strive to make things more explicit. After all, the majority of the time we spend as programmers is reading code, not the actual physical act of writing it. Explicitness shows intent: sure, it's more verbose, but you know what the programmer meant. Implicitness shows doubt: was the implicit behavior intended, or was it a mistake?

Lastly, a moment on type safety and modern languages. Swift is far, far away from the outer reaches of type systems. For that you should look at languages like Haskell and Idris, where the type system is so much richer and more expressive. Idris has the concept of dependent types, which blew my mind when I first read about it. A dependent type is a type that depends on a value. Take your standard linked list type in whatever language you want. Now imagine that a list of 3 ints was a separate type from a list of 4 ints!

Sounds ridiculously restrictive, right? Actually, it's quite powerful. Consider that a head function fails on an empty list. You as the programmer have to handle that case yourself and check to make sure the list isn't empty beforehand. With dependent types, you can define head's type as List[n] where n > 0. Append/insert's type would be something like List[int, n] -> int -> List[int, n + 1]. Remove's type would be List[n] -> List[n - 1] where n > 0. Given all of this information, the compiler can now track the list's usage. If you never called append, the size of the list never would have changed, so it can prove n = 0, and thus calling head becomes a compile-time error. This doesn't work all the time, and sometimes you have to construct a mathematical proof to convince the compiler, but the concept itself is as groundbreaking as going from no types to types. This can move a whole bunch of run-time errors to compile time, and that's always a good thing.
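
Swift has nothing like full dependent types, but the spirit of "head is only callable when the compiler knows the list is non-empty" can be loosely approximated by encoding non-emptiness in a separate type. A sketch (my own, and much weaker than Idris):

// Not dependent types - just a type that cannot be empty, so head is total.
struct NonEmptyList<Element> {
    var head: Element                     // always present, can never fail
    var tail: [Element]

    init(head: Element, tail: [Element] = []) {
        self.head = head
        self.tail = tail
    }

    // Appending preserves non-emptiness.
    func appending(_ element: Element) -> NonEmptyList<Element> {
        return NonEmptyList(head: head, tail: tail + [element])
    }
}

let xs = NonEmptyList(head: 1).appending(2)
print(xs.head)                            // 1, with no optional and no runtime check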

 

What do I mean? In modern programming languages like Swift, Kotlin and Go where type safety is emphasized above everything

I don't think a language where interface{} (the moral equivalent of void *) is the idiomatic way to achieve type-independent code reuse is emphasizing type safety above everything...

 

What bugs can you possibly make by "accidentally" multiplying an integer by a double?

Well, in real-world mathematics, this is not a problem at all. But even though we usually think of floating point numbers as real numbers, they are actually discrete numbers approximating a real value. This can lead to strange effects when you add or subtract two numbers of largely different orders of magnitude. Here is a good example:

stackoverflow.com/questions/210049...

To be honest, it often is not a problem in everyday code, but it still is not a safe operation. So there is a good reason why the compiler says that you have to be explicit about it. (The syntax should be as easy as possible so it doesn't clutter the code, though.)
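
A small Swift illustration of both effects - the magnitude problem and the fact that an integer-to-double conversion is not always lossless (my own example):

let big: Double = 1e16
print(big + 1 == big)                        // true: the added 1 is lost entirely

let precise = 9_007_199_254_740_993          // 2^53 + 1, a perfectly ordinary Int
print(Int(Double(precise)) == precise)       // false: Double cannot represent it exactly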

So I don't think languages take type-safety too far and I am often happy to rely on a compiler to avoid stupid things instead of leaving this to the runtime. But in the end, that's a personal preference as well.

 

But even though we usually think of floating point numbers as real numbers, they are actually discrete numbers approximating a real value. This can lead to strange effects when you add or subtract two numbers of largely different orders of magnitude.

Right, but that's a problem with how we represent numbers in computers, not with a lack of type enforcement. A better reason might be to get you to think about why you're trying to combine floats and integers in arithmetic operations, which seems like a weird thing to do. I'm open to examples in which that's "proper", though.

 

Modern type systems put a strong emphasis on guaranteeing invariants. By limiting the places in the code where you can create instances of a type or modify its inner fields, you can promise users of that type that all instances of it will have certain properties, depending on the correctness of only a small part of the code (and not the entire codebase). Because that part is small and contained, it can be verified and tested both manually and automatically.

For example - you can have a Date type, and guarantee that all instances of it will have valid dates. No February 31, for example. Calling new Date(2018, 2, 31) will yield an error (however error handling works in that language) and so will new Date(2018, 2, 1).setDay(31). Actions that can result in invalid dates may fail, but actions that only fail if given an invalid date are guaranteed to succeed - because all instances of Date are guaranteed to be valid dates. So date.getWeekday() will always succeed. Without that guarantee new Date(2018, 2, 31).getWeekday() will have to be an error, and thus date.getWeekday() can not guarantee to always succeed.
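
A minimal Swift sketch of such an invariant-guarding type (my own names, not Foundation's Date, and leap years are ignored for brevity):

struct SimpleDate {
    let year: Int
    let month: Int
    let day: Int

    // All validation lives in this one failable initializer.
    init?(year: Int, month: Int, day: Int) {
        let daysInMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
        guard (1...12).contains(month), (1...daysInMonth[month - 1]).contains(day) else {
            return nil                        // constructing February 31 simply fails
        }
        self.year = year
        self.month = month
        self.day = day
    }

    // Every SimpleDate is valid, so this can never fail.
    func weekdayIndex() -> Int {
        return (year + month + day) % 7       // placeholder formula; the signature is the point
    }
}

print(SimpleDate(year: 2018, month: 2, day: 31) == nil)    // true: the invalid date is rejected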

Another example is regular expressions. Your Regex object is actually a representation of a finite automaton as a byte array (because everything is byte arrays). Not all byte arrays are valid automatons, but if you managed to create an instance of Regex you are guaranteed that the byte array inside it is a valid one, and you can safely use it.

Of course - the date module and the regex module may have bugs, but because they are contained they can be unit-tested. If the internals of Date and Regex would be modified from other places, these tests will not be able to guarantee that Date and Regex always hold their invariants.

So what does null have to do with it?

null puts a crack on (almost) any guarantee you can have about the type. Date is guaranteed to be a valid date. Or... it can be null. Regex is guaranteed to be a valid finite automaton. Or... it can be null.

Why are these guarantees important? Because they allow us to write total functions - functions that can not fail. For example, consider these two functions:

/* C */
int is_smaller(int a, int b) {
    return a < b;
}

# Python
def is_smaller(a, b):
    return a < b

At first glance, they are identical - both take two numbers a and b and return true (1/True) if a is smaller than b and false (0/False) if a is greater than or equal to b. And yet - they are different. The C function is a total function, while the Python function is a partial function! While the C function will return 1 or 0 for any pair of values you pass to it, the Python one may not return anything - for some values, it can fail!

How so? Consider this: is_smaller(1, "2"). It will fail with a TypeError. This cannot happen with the C version, because the compiler only let us pass integers to it, but with Python we can pass anything we want - and fail at runtime.

C's type system is not the most expressive one, but it allows us - in this simple case - to limit the domain of the function to a set of values that it can handle without failing. But what if it'd get pointers?

int is_smaller(int* a, int* b) {
    return *a < *b;
}

Now is_smaller is no longer a total function - it's a partial function. It can fail. And even if we checked for nulls - what would we return? It would be neither true nor false, and if we return some other magic number it puts the burden of divining its meaning on the caller. The caller is unlikely to be able to do anything meaningful with that result either, so it'll have to pass it to its caller, and so on.

The process of checking for erroneous values and passing them to the caller can, of course, be replaced with exceptions. But many programming languages - especially the ones that follow functional programming, or are at least inspired by it - prefer to go the other way around. Instead of making partial functions easier to use, they make it possible to write total functions! And as it turns out, once we add encapsulation that allows us to guarantee invariants, the one thing that prevents so many partial functions from being total is null values!

For example - our date.getWeekday(). Date instances are already guaranteed to be valid dates, so the only way it could fail to return an answer is if date is null. Remove nullness - and getWeekday() becomes a total function! And if it's a total function, its caller may also become a total function (since it does not have to deal with getWeekday() failing), and the caller of its caller too, and so on - most of the code can become total functions that fail only if there is some actual bug in the code.

But this cannot be achieved if you allow null values. While valid values usually make sense, most functions wouldn't know what to do with null values and will have to fail. Or even worse - just return null themselves, and good luck receiving null from some top-level function and trying to figure out what went wrong where.

 

Thanks for the insight - that is probably one of the best cases for non-nullable types I have heard so far.

 

The problem is that null has the same type as the thing you actually want to have. Take this Java code:

void doSomething(MyClass ref) {
    if (ref != null) {
        // ...
    } else {
        // what to do now? Why is it null?
    }
}

This forces you to put null checks everywhere, in contrast, without null:

Maybe<MyClass> res = createRef();
match(res) {
    Just(ref): doSomething(ref) // doSomething always gets a value
    Nothing: throw "Error while creating ref"; // Now we know what happened
}

Nullable types force you to do defensive programming.

You can't pass res directly to doSomething because the types don't match.
You have to use match to handle the null case explicitly.
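
For reference, the same shape in Swift, switching over an Optional (createRef and doSomething are stand-ins, not real APIs):

class MyClass {}

func createRef() -> MyClass? {               // may or may not produce a value
    return MyClass()
}

func doSomething(_ ref: MyClass) {           // always receives a real value, no null check needed
    // ...
}

switch createRef() {
case .some(let ref):
    doSomething(ref)                         // doSomething always gets a value
case .none:
    fatalError("Error while creating ref")   // the failure is handled right here, with context
}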

checks inside consumer methods are code smell

You do not always control the consumer and not every code base follows best practices the whole time.

Also, checking for nulls inside consumer methods is a code smell. One should not check anything, and if a null came from the outside, raising/throwing a NullPointerException (or whatever it is called in Java) is the correct behaviour.

If checking for null is a code smell, passing null should also be discouraged. I mean, you are not supposed to catch NullPointerExceptions. Wouldn't it be better then to not have null in the first place?

 

No.

:-)

We're human and we make mistakes. The job of a programming language and its compiler or interpreter is to help as much as possible to detect these early on. The longer a bug goes undetected, the more it is built upon and the more it costs to fix.

Type safety is pretty much the first line of defense in statically typed languages. That it kicks in almost as soon as you start typing is a feature, not an annoyance. If it's annoying you, then it's doing its job and you may not be.

Of course, if you feel like having no type safety at all, there are plenty of languages out there that let you shoot yourself in the foot with a duck, or a fish, or a planet. Just don't come running (on one foot) back to complain once your customer loses all their data into what should have been an open file handle but was actually an array of integers, nulls and assorted fish.

As others have pointed out, making values non-nullable by default is actually a fantastic idea, and in practice it helps rather than hinders. You spend less time worrying about (and, if you're being a good careful programmer, coding against) "is this value null?". When nullability is tracked in the type system, the compiler will enforce this safety and it'll reduce bugs. This is a very good thing indeed.

For extra bonus points, exceptions (as in Java and C#) are actually a bad idea for the same reason. Really you should be using error types. It turns out that has similar results: safer programs, a compiler that shouts at you more, but code that is ultimately easier to program with ("does this method or any that it calls throw an exception? Not sure, so I shall wrap it in a try/catch just in case!").
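
A small Swift sketch of the "error types" idea using the standard Result type (the config example is mine):

enum ConfigError: Error {
    case missingFile
}

// The possibility of failure is part of the return type, so the caller cannot forget it.
func loadConfig(path: String) -> Result<String, ConfigError> {
    return .failure(.missingFile)            // pretend the file was not found
}

switch loadConfig(path: "app.conf") {
case .success(let contents):
    print(contents)
case .failure(let error):
    print("could not load config: \(error)")
}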

As to the issue with linked lists: a decent strong type system will have something called a Bottom Type, often called Nothing, which is useful for "never returns" but also allows you to say this:

let nil of type List<Nothing> = [... some singleton value ...]

Due to the wonders of type systems, List<Nothing> is a sub-type of any List<T> (assuming the library designer made it covariant in T) and nil is therefore a valid value usable in any list to signify "end of list". No nullability required. In fact you tend to use this Null Object design pattern more often, expressing what null means for certain concepts rather than hoping nobody will notice the NullPointerExceptions.
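
Swift does not have the covariant List<Nothing> trick described above, but the Null Object pattern it mentions looks roughly like this in Swift (my own example):

protocol Logger {
    func log(_ message: String)
}

struct ConsoleLogger: Logger {
    func log(_ message: String) { print(message) }
}

// The "null" case is an ordinary value that safely does nothing, so callers never nil-check.
struct NoLogger: Logger {
    func log(_ message: String) {}
}

func run(logger: Logger) {
    logger.log("starting")                   // always safe, never a null pointer
}

run(logger: NoLogger())                      // explicitly "no logging", rather than nil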

As to "Is it bad to multiply an integer by double?", as others have said, yes it is a bad idea, unless you know what you're doing. Which is why the compiler should by default shout at you unless you write "I REALLY know what I'm doing" (which is probably not true).

Sorry this is all a bit complicated, but it's for very good reason. Language designers are super paranoid on purpose: Turns out we're all rubbish programmers. This means that compilers and programming languages will annoy you.

(FYI - I was so annoyed by this that I'm now one of those paranoid compiler and programming language designers. In reality I sometimes use things like python to get things done, just don't tell anyone I said that, I'll be thrown out of the Guild Of Compiler Writers Who Must Make Programmers Suffer in shame.)

 


What bugs can you possibly make by "accidentally" multiplying an integer by a double?

Pseudocode:

int x = 2;
double y = 3.0;
print("Answer is: ", x / y);

// Alternatively
int x = 2;
double y = 1.0 / 3.0;
print("Answer is: ", x * y);

It's unclear which answer you want out of that. Since one of the operands is an integer, it's quite likely you want to perform integer math and get an integer result (0), but you might instead want a result rounded to the nearest integer (1), or you might want a double (~0.6666667).

Better be explicit than have the compiler guess for you.
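
In Swift (my own rendering of the pseudocode above), each of those intents has to be spelled out, so a reader can tell which one was meant:

let x = 2
let y = 3.0

// print(x / y)                              // refuses to compile: the compiler won't guess

print(x / Int(y))                            // integer division: 0
print(Int((Double(x) / y).rounded()))        // rounded to the nearest integer: 1
print(Double(x) / y)                         // floating-point division: 0.666...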

 

Forcing you to properly handle Optional types not only makes you remember when a value can be null, but also incentivises you to handle the null case as quickly as possible, reducing the number of paths in your program.

Either you handle the possibility of errors now, or you pass optionals all over the place. The latter is supposed to feel bad.

This all assumes a functional style of program flow. I agree that having a class with changing optional member variables feels bad, but again, that incentivises rewriting so that illegal states are not representable.

Finally, readability increases when you have fewer possible states.
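
A quick Swift sketch of "handle the null case as quickly as possible" (my own example), using guard to deal with the missing value before any further paths branch:

func greet(_ name: String?) {
    guard let name = name else {             // the nil path ends immediately...
        print("Hello, stranger")
        return
    }
    print("Hello, \(name)")                  // ...so the rest only sees a non-optional String
}

greet("Ada")                                 // Hello, Ada
greet(nil)                                   // Hello, stranger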

 

I get what you are saying, but in my opinion, type safety provides self-documentation. So even though it is a pain in the butt to maintain, someone new coming to the project will come up to speed faster, because everything is inherently documented within the source code with no comments.

This is plain wrong. Nothing prevents the language from pattern matching on nulls, as e.g. Erlang does. A monad is not needed here at all.

Erlang is dynamically typed. The billion dollar mistake is irrelevant for dynamically typed languages, because any variable can potentially be of any type anyways.

I have used both null and Maybe/Option extensively.

In code bases that use null, I typically arrive at an arrangement where objects are null-checked before being passed into business logic. That way I can avoid null checks as a preamble to every method. However, sometimes properties of business objects or reference data can still be null, which requires some selective null checks. Because nullability is implicit, later devs reading the code made false assumptions about how nullability was used, leading to regressions and/or defensive null checks everywhere, and generally confused logic.

I now primarily use Maybe/Option/Nullable as a way to specify "you have to handle the null case". That way it is never even a question that anyone has to ask: "should I check this parameter for null?" The type itself contains the explicit information about its nullability. That information can also be used by JSON parsers, for example, to catch missing information without me writing code for it. There is nothing emotional about my usage here. It is purely pragmatic -- it avoids confusion, which saves time.

Although, like everything, you can take Maybe/Option too far and create a lot of overhead for yourself. But a little practice fixes that.

Depending on usage, the Null Object Pattern can also be an alternative to using nulls.

No marketing magic. I tried it for a while before sharing these experiences and perspectives.

Yes, the big problem with null is that you can't see, based on the type, whether you have to check for null. So in a big codebase with multiple developers, you will always check for null just to be sure. If you have the nullable/non-nullable distinction at the type level, you know where you have to check for null and where you don't. The compiler will force you to check where you need it.

 

never 2 much of type safety :D

 
 

Unit tests should be for proving that a section of code does what it should, and for helping detect when a future change in the codebase would alter the functionality of that section of code. Strong, statically typed languages replace the insane number of value tests you (should) see in a carefully written dynamic language.