
Paul J. Lucas

Go-tcha: When nil != nil

In my first project in Go, I implemented my own Error struct to communicate detailed error information beyond the simple error message that Go’s built-in error interface provides.

When I first implemented it, I did it as a struct. I’ve since learned that it’s better as an interface:

package myerr

type Code string

type Error interface {
    error            // implementations must satisfy error interface
    Code() Code      // project-specific error code
    Cause() Error    // the Error that caused this Error, if any
    OrigErr() error  // the original error
    // ...
}

Aside on Error Codes

I chose a string for error codes rather than an integer because the only advantage of integers is that they make it easy for C-like code to switch on them. Integer codes, however, have a few disadvantages, including:

  1. They’re not good for human readers: you have to look up the codes or the error provider has to include string descriptions (in which case the provider might as well have made the codes a string in the first place).
  2. You have to ensure that the codes from different subsystems don’t conflict.

Yes, there’s a small performance penalty for comparing strings as opposed to integers, but:

  1. Presumably, handling errors is exceptional, so it doesn’t matter that the code is a bit slower since it’s not critical-path.
  2. If you’re doing a REST application where error codes are sent as part of a JSON response via HTTP, then you’re already taking a performance hit for itoaing and atoiing the integer code (not to mention parsing the entire JSON document) since it’s sent over the wire as text. Hence, a few extra string comparisons are noise.
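As a rough illustration of the second point, here is a minimal sketch of a string-typed code travelling in a JSON body. The apiError type, its field names, and the user_not_found code are hypothetical and are not taken from the original project; only Code mirrors the article.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Code mirrors the string-based error code type from the article.
type Code string

// CodeUserNotFound is a hypothetical code used only for this illustration.
const CodeUserNotFound Code = "user_not_found"

// apiError is an assumed JSON payload shape, not the article's actual type.
type apiError struct {
	Code    Code   `json:"code"`
	Message string `json:"message"`
}

// encode marshals an apiError; the code goes over the wire as readable text.
func encode(e apiError) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

// decode parses a JSON payload back into an apiError.
func decode(s string) (apiError, error) {
	var e apiError
	err := json.Unmarshal([]byte(s), &e)
	return e, err
}

func main() {
	s, err := encode(apiError{Code: CodeUserNotFound, Message: "no such user"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"code":"user_not_found","message":"no such user"}

	e, err := decode(s)
	if err != nil {
		panic(err)
	}
	// A plain string comparison; no lookup table is needed to read the code.
	fmt.Println(e.Code == CodeUserNotFound) // true
}
```

The code is already text on the wire, so the receiving side compares strings without any integer conversion step.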

Despite now being an interface, Error still has to be implemented by a struct:

type MyError struct {
    code    Code
    origErr error
    cause   *MyError
}

func (e *MyError) Cause() Error {
    return e.cause
}

func (e *MyError) Code() Code {
    return e.code
}

func (e *MyError) OrigErr() error {
    return e.origErr
}

That all seems fairly straightforward. (Note: other functions for creating errors are elided here since they’re not relevant for this article; but they’re equally straightforward.)

The Problem

I was then writing some negative unit tests (those that ensure an error is detected and reported correctly), so I needed a way to compare two Errors. Go has reflect.DeepEqual() in its standard library, but, while it works, it only tells you whether two objects are equal. For testing, if two objects aren’t equal, it’s helpful to know what specifically is not equal, so I wrote ErrorDeepEqual() that returns an error describing this:

func ErrorDeepEqual(e1, e2 myerr.Error) error {
    if e1 == nil {
        if e2 != nil {
            return errors.New("e1(nil) != e2(!nil)")
        }
        return nil
    }
    if e2 == nil {
        return errors.New("e1(!nil) != e2(nil)")
    }
    if e1.Code() != e2.Code() {
        return fmt.Errorf("e1.Code(%v) != e2.Code(%v)", e1.Code(), e2.Code())
    }
    // ...
    return ErrorDeepEqual(e1.Cause(), e2.Cause())
}

Since an Error can have a cause, ErrorDeepEqual() ends with a recursive call to check that the causes are equal. The problem was that one test ended with a nil-pointer panic on this line:

    if e1.Code() != e2.Code() {  // panics here because e1 is nil

But the if lines before this line check for nil, so neither e1 nor e2 can be nil here, right? How can e1 == nil be false yet e1 be nil?
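The situation can be reproduced in isolation. The following is a minimal sketch, not the original test: Coder, impl, typedNil, and callPanics are hypothetical stand-ins for the article's Error, MyError, and test code.

```go
package main

import "fmt"

// Coder is a hypothetical stand-in for the article's Error interface.
type Coder interface {
	Code() string
}

// impl is a hypothetical stand-in for MyError.
type impl struct {
	code string
}

// Code dereferences its receiver, so it panics when e is a nil pointer.
func (e *impl) Code() string {
	return e.code
}

// typedNil returns a nil *impl wrapped in a Coder interface value.
func typedNil() Coder {
	var p *impl
	return p
}

// callPanics reports whether calling c.Code() panics.
func callPanics(c Coder) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = c.Code()
	return false
}

func main() {
	c := typedNil()
	fmt.Println(c == nil)      // false: the nil check does not catch it
	fmt.Println(callPanics(c)) // true: the method call still dereferences nil
}
```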

The Reason

It turns out that, in Go, this can happen, and it confuses enough people that there’s an FAQ about it. A quick summary is that an interface value has two parts: a concrete type and a value of that type. An interface value is nil only if both the type and the value are unset:

var i interface{}                // i = (nil,nil)
fmt.Println(i == nil)            // true
var p *int                       // p = nil
i = p                            // i = (*int,nil)
fmt.Println(i == nil)            // false: (*int,nil) != (nil,nil)
fmt.Println(i == (*int)(nil))    // true : (*int,nil) == (*int,nil)
fmt.Println(i.(*int) == nil)     // true : nil == nil

For now, we’ll set aside the rationale as to why Go works this way (and whether the rationale is a good rationale).
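Before moving on, the two parts of an interface value can be observed directly with fmt's %T and %v verbs. This is only a small sketch; describe is a hypothetical helper name.

```go
package main

import "fmt"

// describe formats the dynamic type and value held by an interface value.
func describe(i interface{}) string {
	return fmt.Sprintf("(%T,%v)", i, i)
}

func main() {
	var i interface{}
	fmt.Println(describe(i), i == nil) // (<nil>,<nil>) true

	var p *int
	i = p
	fmt.Println(describe(i), i == nil) // (*int,<nil>) false
}
```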

The Fix

In the meantime, I still had to figure out how I was getting a non-nil Error with a nil value. It turns out that this is the culprit:

func (e *MyError) Cause() Error {
    return e.cause               // WRONG!
}

The problem is that whenever you assign to an interface (either explicitly via assignment or implicitly via return value), it takes on both the value and the concrete type. Once it takes on a concrete type, it will never compare equal to (the typeless) nil. The fix is:

func (e *MyError) Cause() Error {
    if e.cause == nil {          // if typed pointer is nil ...
        return nil               //     return typeless nil explicitly
    }
    return e.cause               // else return typed non-nil
}

That is: when a function’s return type is an interface and you can return a nil pointer, you must check for nil and return the literal nil if the pointer is nil.
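The difference between the two versions can be checked with a trimmed-down, self-contained sketch. Error and MyError here keep only the Cause part of the article's types, and buggyCause is a hypothetical helper that reproduces the original behaviour for comparison.

```go
package main

import "fmt"

// Error is a trimmed-down stand-in for the article's interface.
type Error interface {
	Cause() Error
}

// MyError keeps only the field needed for this demonstration.
type MyError struct {
	cause *MyError
}

// Cause applies the fix: a nil pointer becomes an untyped nil interface.
func (e *MyError) Cause() Error {
	if e.cause == nil {
		return nil
	}
	return e.cause
}

// buggyCause is a hypothetical helper reproducing the original behaviour.
func buggyCause(e *MyError) Error {
	return e.cause // wraps a nil *MyError in a non-nil interface
}

func main() {
	e := &MyError{}
	fmt.Println(buggyCause(e) == nil) // false
	fmt.Println(e.Cause() == nil)     // true
}
```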

The Rationale

There are several reasons why Go works this way. There’s a long discussion thread on the go-nuts mailing list. I’ve gone through it and distilled the main reasons.

The first is that Go allows any user-defined type to have methods defined for it, not just struct types (or classes found in other languages). For example, we can define our own type of int and implement the Stringer interface for it:

type Aint int                    // accounting int: print negatives in ()

func (n Aint) String() string {
    if n < 0 {
        return fmt.Sprintf("(%d)", -n)
    }
    return fmt.Sprintf("%d", n)
}

Now, let’s use a Stringer interface variable:

var n Aint                       // n = (Aint,0)
var s fmt.Stringer = n           // s = (Aint,0)
fmt.Println(s == nil)            // false: (Aint,0) != (nil,nil)

Here, s is non-nil because it refers to an Aint — and the value just happens to be 0. There’s nothing noteworthy about a 0 value since 0 is a perfectly good value for an int.
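Put together as a runnable program, the Aint example behaves as follows. The isNilStringer helper is a hypothetical addition so that the nil check can be exercised on its own.

```go
package main

import "fmt"

// Aint is the accounting int from the article: negatives print in ().
type Aint int

func (n Aint) String() string {
	if n < 0 {
		return fmt.Sprintf("(%d)", -n)
	}
	return fmt.Sprintf("%d", n)
}

// isNilStringer reports whether a Stringer interface value is nil.
func isNilStringer(s fmt.Stringer) bool {
	return s == nil
}

func main() {
	var n Aint
	fmt.Println(isNilStringer(n))   // false: the interface holds (Aint,0)
	fmt.Println(isNilStringer(nil)) // true: no type, no value
	fmt.Println(n, Aint(-5))        // 0 (5)
}
```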

The second reason is that Go has very strong typing. For example:

fmt.Println(s == 0)              // error: can't compare (Aint,0) to (int,0)

is a compile-time error since you can’t compare s (that refers to an Aint) to 0 (since a plain 0 defaults to type int, which doesn’t implement Stringer). You can, however, use a conversion or type assertion:

fmt.Println(s == Aint(0))        // true: (Aint,0) == (Aint,0)
fmt.Println(s.(Aint) == 0)       // true: 0 == 0

The third reason is that Go explicitly allows methods having pointer receivers to be called on nil pointers:

type T struct {
    // ...
}

func (t *T) String() string {
    if t == nil {
        return "<nil>"
    }
    // ...
}

Just as 0 is a perfectly good value for Aint, nil is a perfectly good value for a pointer. Since nil doesn’t point to any T, it can be used to implement default behavior for T. Repeating the earlier example for Aint but now for *T:

var p *T                         // p = (*T,nil)
var s fmt.Stringer = p           // s = (*T,nil)
fmt.Println(s == nil)            // false: (*T,nil) != (nil,nil)
fmt.Println(s == (*T)(nil))      // true : (*T,nil) == (*T,nil)
fmt.Println(s.(*T) == nil)       // true : nil == nil

we get analogous results for pointers — almost. (Can you spot the inconsistency?)
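To make the nil-receiver behaviour concrete, here is a small runnable sketch. The name field and the non-nil branch of String are assumptions made purely for illustration, since the article elides the body of T.

```go
package main

import "fmt"

// T is a hypothetical type with one field so String has something to print.
type T struct {
	name string
}

// String tolerates a nil receiver and supplies default text for it.
func (t *T) String() string {
	if t == nil {
		return "<nil>"
	}
	return t.name
}

func main() {
	var p *T
	fmt.Println(p.String()) // <nil>: the method runs despite the nil receiver

	var s fmt.Stringer = p
	fmt.Println(s == nil)   // false
	fmt.Println(s.String()) // <nil>

	fmt.Println((&T{name: "gopher"}).String()) // gopher
}
```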

Where Go Goes Wrong

In the earlier example with Aint, comparing s (holding an Aint) to 0 (of type int) was an error:

fmt.Println(s == 0)              // error: can't compare (Aint,0) to (int,0)

whereas comparing s (holding a *T) to nil (which has no type of its own) is OK:

fmt.Println(s == nil)            // OK to compare

The problem is that, in Go, nil is overloaded with two different meanings depending on context:

  1. nil refers to a pointer with 0 value.
  2. nil refers to an interface with no type and 0 value.

Suppose Go had another keyword nilinterface for case 2 and nil were reserved exclusively for case 1. Then the above line should have been written as:

fmt.Println(s == nilinterface)   // false: (*T,nil) != nilinterface

and it would be crystal clear that you were checking the interface itself for no type and a 0 value and not checking the value of the interface.

Furthermore, using the original nil would result in an error (for the same reason you can’t compare an Aint to a 0 without a conversion or type assertion):

fmt.Println(s == nil)            // what-if error: can't compare (*T,nil) to (nil,nil)
fmt.Println(s == (*T)(nil))      // still true: (*T,nil) == (*T,nil)
fmt.Println(s.(*T) == nil)       // still true: nil == nil

The confusion in Go is that when new programmers write:

fmt.Println(s == nil)            // checks interface for empty, not the value

they think they’re checking the interface’s value but they’re actually checking the interface itself for no type and 0 value. If Go had a distinct keyword like nilinterface, all this confusion would go away.

Hacks

If you find all this bothersome, you can write a function to check an interface for a nil value ignoring its type (if any):

func isNilValue(i interface{}) bool {
    return i == nil || reflect.ValueOf(i).IsNil()
}

While this works for values whose kind can be nil (reflect’s IsNil() panics for other kinds, such as a plain int), it’s slow because reflection is slow. A much faster implementation is:

func isNilValue(i interface{}) bool {
    return (*[2]uintptr)(unsafe.Pointer(&i))[1] == 0
}

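This checks the value part of the interface directly for zero, ignoring the type part. It depends on the runtime representing an interface as two machine words, which the language itself doesn’t guarantee.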
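Because reflect's Value.IsNil() panics when the value's kind can never be nil, a more defensive take on the reflection-based version might guard on Kind first. The sketch below is one possible variant, not the author's code, and isNilValueSafe is a hypothetical name.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNilValueSafe is a hypothetical variant of the reflection-based helper
// that returns false, rather than panicking, for kinds that cannot be nil.
func isNilValueSafe(i interface{}) bool {
	if i == nil {
		return true
	}
	v := reflect.ValueOf(i)
	switch v.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface, reflect.Map,
		reflect.Ptr, reflect.Slice, reflect.UnsafePointer:
		return v.IsNil()
	}
	return false
}

func main() {
	var p *int
	x := 1
	fmt.Println(isNilValueSafe(nil)) // true
	fmt.Println(isNilValueSafe(p))   // true
	fmt.Println(isNilValueSafe(&x))  // false
	fmt.Println(isNilValueSafe(0))   // false, where IsNil() alone would panic
}
```

This keeps the cost of reflection, so it trades speed for not relying on the interface's memory layout.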

Top comments (7)

Alex Miasoiedov • Edited

Welcome aboard to the land of runtime panics and using reflect package in a statically typed language!!!

This seems more explicit:

func isNilValue(i Error) bool {
    return i == nil || i == Error(nil)
}
Paul J. Lucas

The panic I got wasn’t due to using reflection. As to your definition of the function, it’s supposed to be for any interface — it has nothing to do with Error.

Alex Miasoiedov • Edited

why do you need to generalize it to interface{} if you need in to work with the concrete Error interface? What is the point of static types if you eventually still using reflect or unsafe?

Paul J. Lucas • Edited

[W]hy do you need to generalize it to interface{} if you need in to work with the concrete Error interface?

Because I prefer general solutions to narrow solutions. Suppose you have not one but N interfaces where N >> 1. Do you really want to write N different functions, one for each interface, that all do the same thing? If so, feel free — in your code-base.

But the actual fix (as I wrote) is not to let the interface become (*MyError,nil) in the first place. I added the Hacks section as a general solution to show it’s possible to check for a nil-valued interface in Go, but I never said it was preferable.

What is the point of static types if you eventually still using reflect or unsafe?

I never said I was using either. Again, the actual fix is as I described. I am not using the functions I described in the Hacks section.

Please read what I actually wrote and do not assume anything I did not write explicitly.

Alex Miasoiedov • Edited

(*MyError,nil) Nil as a value in interface if a hack of language design

Do you really want to write N different functions, one for each interface

I don't think that a good idea to have N different nil interfaces in the first place

If so, feel free — in your code-base.

Please read what I actually wrote and do not assume anything I did not write explicitly.

I admit I have not read your post after seeing the code examples because of obvious reasons.

Paul J. Lucas

(*MyError,nil) Nil as a value in interface if [sic] a hack of language design

Take it up with the Go authors, not me. I only described the way Go is and what the authors’ rationale for it being that way is. As I mentioned, it does actually have a use. But I'm not going to argue either for or against it — that’s outside the scope of my post.

I admit I have not read your post ...

In general, perhaps actually read posts before commenting on them. Specifically for my posts, you must read them before commenting on them; or please don’t comment on them at all. Thanks for your cooperation.

after seeing the code examples because of obvious reasons.

I have no idea what that means.

Alex Miasoiedov

Specifically for my posts, you must read them before commenting on them; or please don’t comment on them at all. Thanks for your cooperation.

Absolutely! That's your right sir. I promise I won't ever either read of comment your valuable content.
