DEV Community

Sam Ferree


Into the Unknown - Working with Missing Data in C# Part 1: Being Explicit


This post assumes you have some knowledge of C-style object-oriented languages, ideally C#, as well as generics.

The Problem

public int Length(string str)
{
  return str.Length;
}

This function hides a pretty nasty bug. (Ignore its redundancy for now.) It claims to take a string and return its length.

But that's not the whole truth. Unless you're working in C# 8+ with nullable reference types enabled, the compiler will happily let you pass this function a value that isn't a string, or anything at all:

int length = Length(null); // compiles, then throws NullReferenceException at runtime

And even if you are working in C# 8+ with nullable reference types enabled, the compiler only gives you a warning, which can be suppressed with the ! operator.

Now, in C# you can throw a slightly more helpful error:

public int Length(string str)
{
  if (str is null)
    throw new ArgumentNullException(nameof(str));

  return str.Length;
}

This will let the caller know which parameter wasn't allowed to be null, and of course C# 8 will give you a little warning at compile time. Note that this warning doesn't actually prevent null from being passed in. It just gently pokes you at compile time that something may be brewing.

Now the C# 8 feature I've been referring to is called nullable reference types, and it's a way to be more explicit about which types may be null. If, for instance, I had wanted to express that this function could handle a null value, I could have written it this way:

public int Length(string? str)
{
  if (str is null)
    return 0;

  return str.Length;
}

This would let the caller know that the Length function accepts null. This is a great feature that makes null reference errors less likely to pop up, but it doesn't really offer the kind of bulletproof compiler safety I'd like out of a strongly typed language.
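To make that gap concrete, here is a minimal sketch of the three behaviours described so far. The names NullableDemo, LengthOrZero and SuppressedNullStillThrows are made up for illustration, and the sketch assumes a C# 8+ compiler.

```csharp
#nullable enable
using System;

// Hypothetical demo class; none of these names come from the article.
public static class NullableDemo
{
    // Declared as non-nullable: passing null is a compile-time warning, not an error.
    public static int Length(string str) => str.Length;

    // Declared as nullable: the signature tells callers that null is handled.
    public static int LengthOrZero(string? str) => str is null ? 0 : str.Length;

    // null! silences the warning, but the null still reaches Length at runtime.
    public static bool SuppressedNullStillThrows()
    {
        try
        {
            Length(null!);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

The annotation documents intent and produces warnings, but nothing stops the suppressed call from compiling and failing at runtime.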

The Goal

What we're looking for is not just a way to inform fellow coders that some information could just plain be missing, but to have the compiler force them to deal with that possibility. It's the failure to handle the null case that causes errors, and it's the compiler's inability to force the coder to handle it that lets those errors surface at runtime.

But this problem is actually solved, just not in C#. At least not in the language itself.

Let's look at arguably the most strongly typed language, Haskell. Here's how I would write the type of a list of numbers that could possibly have no value (not even an empty list). You don't commonly have to specify types in Haskell, but here it is anyway:

Num t => Maybe [t]

This basically tells Haskell that we might have a list of values whose type belongs to the Num typeclass.

So how do you work with a Maybe? The most trivial thing you can do is pattern match on it. If you're trying to get the count of negative numbers from a list which might be Nothing, the type forces you to unwrap the Maybe before you can touch the list, and GHC can flag a match that leaves out the Nothing case (with incomplete-pattern warnings enabled), so you end up providing instructions for what to do in that case.

Can we do this in C#? As it turns out we can, and in a number of ways.

A Solution

I'll define just one way of implementing the Maybe type here. To fit in with the wider .NET ecosystem, I'll call it an Option to match F#'s built-in type for this. From here on out I will refer to our custom "Maybe" type as an "Option" type.

public struct None { }

public static class Option
{
  public static None None { get; } = new None();
}

public struct Option<T>
{
  public static implicit operator Option<T>(None none) => new Option<T>();
  public static implicit operator Option<T>(T value) => new Option<T>(value);

  private readonly bool _hasValue;
  private readonly T _value;

  private Option(T value)
  {
    _hasValue = value is { }; // 'is not null' is planned for C# 9.0
    _value = value;
  }
}

Some things to note here. The None and static Option types are helpers; they are going to make this more readable when we actually use it. The constructor of the Option type is private. Since it is a struct, there is still a parameterless constructor available, and it produces an Option with no value. Being a struct also means an Option can never itself be null, which is what we're trying to avoid in the first place. The preferred way to instantiate an Option is to use one of the implicit conversions: you can convert either an instance of None or an instance of T. Note too that the constructor which takes an instance of T checks whether it's null, and if it is, behaves as though the Option was converted from an instance of None. This stops a programmer from "forcing" a null into an Option by typing something like Option<string> s = (string)null;

Now as it stands, this Option type is like a sealed box. The value may or may not be in that box, so how do you get the value out? There's a pretty good argument that you shouldn't actually try to do that, and I'll cover more about what that means and how that works in a later post. For now, let's give ourselves a way to get the value out of the Option type that forces us to say what to do when the value is missing entirely.

Let's add the following method to the Option type.

public TOut Match<TOut>(Func<T, TOut> Some, TOut None) =>
  _hasValue ? Some(_value) : None;

This method takes two parameters: first, a function from T to TOut, which will be executed if the option contains a value, and second, a TOut value, which will be returned if the option does not contain a value. So if we wanted to use this to calculate the length of a string which we only might have, it could look something like this.

public int Length(Option<string> strOption)
{
  return strOption.Match(
    Some: str => str.Length,
    None: 0);
}

Here I am using named arguments to make it a little clearer, but we're matching Some string to string.Length, and None to 0. C# can infer the generic TOut parameter as int, so it also helps us out a little.

Here are some of the values our Option-accepting function can safely handle:

Length("hello"); // string gets implicitly converted to Option<string>
Length(Option.None); // also implicitly converted
Length(null); // compiler warning in C# 8 with nullable enabled. null is converted to string, then caught by the null check in the conversion.
Length(null!); // as above, but the compiler warning is suppressed.
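For anyone who wants to try this out, the fragments above can be assembled into one file roughly as follows. This is a sketch rather than a definitive implementation: the OptionDemo class and its LengthOfDefault method are hypothetical additions for illustration, and the file assumes C# 8+ with nullable reference types switched on.

```csharp
#nullable enable
using System;

public struct None { }

public static class Option
{
    public static None None { get; } = new None();
}

public struct Option<T>
{
    public static implicit operator Option<T>(None none) => new Option<T>();
    public static implicit operator Option<T>(T value) => new Option<T>(value);

    private readonly bool _hasValue;
    private readonly T _value;

    private Option(T value)
    {
        // A null value is recorded as "no value", exactly like None.
        _hasValue = value is { };
        _value = value;
    }

    // Returns Some(_value) when a value is present, otherwise the None fallback.
    public TOut Match<TOut>(Func<T, TOut> Some, TOut None) =>
        _hasValue ? Some(_value) : None;
}

// Hypothetical wrapper class so the article's Length function has somewhere to live.
public static class OptionDemo
{
    public static int Length(Option<string> strOption) =>
        strOption.Match(Some: str => str.Length, None: 0);

    // default(Option<T>) never runs the private constructor, so _hasValue stays false.
    public static int LengthOfDefault() => Length(default(Option<string>));
}
```

With this in place, OptionDemo.Length("hello") takes the Some branch, while Option.None, a suppressed null and a default Option<string> all fall through to the None value of 0.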

Next time, we'll circle back to where I mentioned that in most cases you don't actually want to "get the value out of the box", and what would be better to do instead.
