loading...

Smart Enum: What Jon Skeet Got Wrong

entomy profile image Patrick Kelly ・5 min read

Back, oh, 6-8 years ago, there's was this kick in the .NET world about Java-style enums through a particular pattern. They never settled on a name for it, so it's been called everything from Smart Enum, to Type Safe Enum, to Fat Enum, and more. You don't really hear much about it anymore. Why is a complex matter, but let's talk about a particular reason, and why it's wrong.

In 2004, Jon Skeet posted Violating the "Smart Enum" Pattern in C# where he describes a shockingly dangerous issue. He then ascribes the problem to be the pattern.

It's not the pattern. It's that for years, very smart C# devs like Jon Skeet, Eric Lippert, Steve Smith, and more, overengineered the everliving fuck out of this concept, and created the problem that Skeet then ascribed to be because of the pattern. See, it's not that "enum + data" is a problem, it's how they implemented it.

So, before anything else, let's talk about what the intent is. In Java, where the inspiration for this came from, enums are rather fat objects. Sure, they can be used just like enums in other languages, but they also can carry additional data. Here's an example lifted straight from Oracle:

public enum Planet {
    MERCURY (3.303e+23, 2.4397e6),
    VENUS   (4.869e+24, 6.0518e6),
    EARTH   (5.976e+24, 6.37814e6),
    MARS    (6.421e+23, 3.3972e6),
    JUPITER (1.9e+27,   7.1492e7),
    SATURN  (5.688e+26, 6.0268e7),
    URANUS  (8.686e+25, 2.5559e7),
    NEPTUNE (1.024e+26, 2.4746e7);

    private final double mass;   // in kilograms
    private final double radius; // in meters
    Planet(double mass, double radius) {
        this.mass = mass;
        this.radius = radius;
    }
    private double mass() { return mass; }
    private double radius() { return radius; }

    // universal gravitational constant  (m3 kg-1 s-2)
    public static final double G = 6.67300E-11;

    double surfaceGravity() {
        return G * mass / (radius * radius);
    }
    double surfaceWeight(double otherMass) {
        return otherMass * surfaceGravity();
    }
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java Planet <earth_weight>");
            System.exit(-1);
        }
        double earthWeight = Double.parseDouble(args[0]);
        double mass = earthWeight/EARTH.surfaceGravity();
        for (Planet p : Planet.values())
           System.out.printf("Your weight on %s is %f%n",
                             p, p.surfaceWeight(mass));
    }
}

Can't implement that with C# enums, that's for sure!

Now, there's an immensely better way to implement this in C# than what the aforementioned individuals, and others, were trying to do. Implementing something just because another language does it is foolish. But hey, I've done plenty of ridiculous language experiments too, so, I get it.

In his blog post, the specifics of the pattern he described would create something like this (I haven't wrote these in the editor, so apologies for mistakes):

public class Planet {
    public static readonly Planet MERCURY = new Mercury();
    public static readonly Planet VENUS = new Venus();
    public static readonly Planet EARTH = new Earth();
    public static readonly Planet MARS = new Mars();
    public static readonly Planet JUPITER = new Jupiter();
    public static readonly Planet SATURN = new Saturn();
    public static readonly Planet URANUS = new Uranus();
    public static readonly Planet NEPTUNE = new Neptune();

    private readonly double mass;   // in kilograms
    private readonly double radius; // in meters

    private Planet(double mass, double radius) {
        this.mass = mass;
        this.radius = radius;
    }

    // Implement the other stuff here...

    // Now create the "enumerations"
    private class Mercury {
        private Mercury() : base(3.303e+23, 2.4397e+6) { }
    }
    private class Venus {
        private Venus() : base(4.869e+24, 6.0518e+6) { }
    }
    // And so on...
}

Looks... verbose, right? Absolutely. Also, if you want complete Enum support, you've got to add three interfaces to this, and one of them is... not pleasant to implement. Realistically, it means you're probably including other information, like the backing integer value, and a string for the name (unless you want to use reflection, but then names are super slow). All of this means bloat. Not just bloat to the code, but bloat to the objects too.

But there's a problem John Payson noticed in this method of implementing the pattern, that Jon Skeet describes. Please note that the mechanism for creating the recursive constructor chains is no longer possible (or, is at least correctly flagged as an error, I haven't tried suppressing it). Through cleverly using recursive constructor chains and the finalizer resurrection anti-pattern, you can get an object back, despite the hidden constructor. Naturally, serialization and reflection have tricks around this as well.

But the thing is, the majority of this wasn't even needed in the first place!

Consider what it is we're trying to do. An enum, with additional data and potentially methods on top of that. I'm not concerned with the methods part, because, extension methods exist now. So let's just consider the "enum + data" part.

Well, let's take this super literally.

public sealed class Planet {
    public static readonly Planet MERCURY = new Planet(Base.Mercury, 3.303e+23, 2.4397e6);
    public static readonly Planet VENUS = new Planet(Base.Venus, 4.869e+24, 6.0518e6);
    // And so on...

    private readonly Base @base;
    private readonly double mass;   // in kilograms
    private readonly double radius; // in meters
    private Planet(Base @base, double mass, double radius) {
        this.@base = @base;
        this.mass = mass;
        this.radius = radius;
    }

    private enum Base {
        Mercury,
        Venus,
        // And so on...
    }

    // Implement the other stuff here ...
}

Despite the public API looking the same, this bought us several useful things. Firstly, all the Enum API's can easily be adapted to still work, because... it's still an Enum!. In fact, you can easily support all the interfaces Enum implements, and VisualStudio will even generate the entire thing for you through @base!. But furthermore, this type can now be sealed, which prevents someone from even trying the tricks like Jon Skeet showed off. Reflection won't be super useful here either, as getting access to the constructor still won't let you create anything particularly unique. Since equality and comparisons would still be defined through @base, a new instance with the same underlying value would still evaluate the same way, essentially having value semantics. And creating a new object by abusing the serializer? Well, you'll just get the default(Base) which happens to be Base.Mercury in this case. You're always left with a valid object.

Of course there's still one way I can see which can create an invalid object through trickery. It does involve reflection, but also something else. Use reflection to get the constructor, and then cast an invalid enum value to Base, and pass it into the constructor. I'm not even really considering this a fault of my approach to the design however, because this is a known problem with any integer backed enum, which C# uses, as well as almost every single enum implementation out there.

To my fellow programmers out there, heed this post as a warning. Cleverness often gets you into trouble. Don't be any more clever than you have to be. Because when the code becomes more clever than you, you'll misunderstand it.

But seriously, don't use this pattern. There's better ways to implement anything you could possibly think you need this for.

Discussion

pic
Editor guide
 

Interesting read. I haven't come across this "Smart Enum" pattern before. It is interesting to see how things change over time and how smart people are able to find and explain things like these years later.

Thanks for your investment and your dedication to let us all know about it.