
Derek Hammer


Self documenting code [a rant]

I am going to start off with an example. This particular interaction is one that I've had in multiple situations throughout my career. When talking about comments, well-factored code, clean code, or anything along similar lines, someone will often make this argument:

Self documenting code is a myth. Developers that argue for self documenting code are just lazy and don't want to do the work necessary to keep code maintainable.

This comment is the source of my rant.

Commentless?

The absence of comments is not the defining characteristic of self-documenting code. Instead, self-documenting code is code that makes otherwise necessary comments unnecessary through clean, straightforward source. This has a number of benefits:

  • It reduces the total lines of code (with comments).
  • It reduces the scattering of knowledge across multiple lines when one line would do.
  • It reduces the memory overhead of the reader.
  • It reduces the chance of a comment and code disagreeing with each other due to bit rot.

All of these add up to better understanding of the code, meaning faster and better changes over time.

Ugly, undocumented code

For an illustrative example, let's put some code down that I might have written in college.

class Animal {
  public int kills(Boolean pred) {
    int r = 0;
    for(int i = 0; i < k.size(); i++) {
      if(new Boolean(k[i][0].get('kill')) && (!pred || k[i][0].get('type') == 0)) {
        r += 1;
      }
    }
    return r;
  }
}

This ugly, undocumented code could, for sure, be improved through some comments. Let's add them.

class Animal {
  // gets the number of other animal kills
  // 
  // Boolean pred if true, return only predator kills
  public int kills(Boolean pred) {
    // k is the instance variable for all kills, plants and animals
    // the resulting number of kills
    int r = 0;
    for(int i = 0; i < k.size(); i++) {
      if(
       // is this interaction an animal kill?
       new Boolean(k[i][0].get('kill')) && 
       // are we not searching for only predator kills?
       // or is this an animal kill type?
       (!pred || k[i][0].get('type') == 0)
      ) {
        r += 1;
      }
    }
    return r;
  }
}

I think it would be hard to argue that, in this case, the comments aren't necessary (assuming no changes to code). In fact, I will often do exactly this style of commenting if I'm looking at a hairball of code that I do not understand, encoding my understanding into the code as I go along.

The big tools

Creating self-documenting code has many layers, but you could cover about 95% of changes with just two tools: Rename Variable/Method and Extract Method. Let's apply each of those here, leaving the comments in. For this exercise, I'm going to add a /* Refactoring: Comment */ to explain what I did.

class Animal {
  // gets the number of other animal kills
  // 
  // Boolean pred if true, return only predator kills
  /* Rename: kills -> getAnimalKills for convention following and clarity */
  /* Rename: pred -> onlyIncludePredatorKills for clarity */
  public int getAnimalKills(Boolean onlyIncludePredatorKills) {
    // k is the instance variable for all kills, plants and animals
    /* Rename: k -> allPlantAndAnimalKills for clarity */
    // the resulting number of kills
    /* Rename: r -> result for clarity */
    int result = 0;
    for(int i = 0; i < allPlantAndAnimalKills.size(); i++) {
      if(
       // is this interaction an animal kill?
       /* Extract Method: isAnimalKillAt(i) for clarity */
       isAnimalKillAt(i) &&
       // are we not searching for only predator kills?
       // or is this an animal kill type?
       /* Extract Method: isPredatorKillAt(i) for clarity */
       (!onlyIncludePredatorKills || isPredatorKillAt(i))
      ) {
        result += 1;
      }
    }
    return result;
  }

  private Boolean isAnimalKillAt(int index) {
    return new Boolean(allPlantAndAnimalKills[index][0].get('kill'));
  }

  private Boolean isPredatorKillAt(int index) {
    return allPlantAndAnimalKills[index][0].get('type') == 0;
  }
}

And, now let's remove all of the comments, just to see how it goes.

class Animal {
  public int getAnimalKills(Boolean onlyIncludePredatorKills) {
    int result = 0;
    for(int i = 0; i < allPlantAndAnimalKills.size(); i++) {
      if(isAnimalKillAt(i) && (!onlyIncludePredatorKills || isPredatorKillAt(i))) {
        result += 1;
      }
    }
    return result;
  }

  private Boolean isAnimalKillAt(int index) {
    return new Boolean(allPlantAndAnimalKills[index][0].get('kill'));
  }

  private Boolean isPredatorKillAt(int index) {
    return allPlantAndAnimalKills[index][0].get('type') == 0;
  }
}

Here, we have a method that is now "self documenting." However, this code is not necessarily clean. Questions I would continue to ask:

  • Why is there a magic number of 0 in the allPlantAndAnimalKills data structure?
  • Why is type an integer?
  • Why does kill == true mean that it is an animal kill?

These questions could potentially be resolved through further refactoring. For example, maybe we just rename the 'kill' key to 'animalKill'. But others may be too costly to fix right now, or there may be a complicated reason why they need to be that way. The answers to these questions are where comments are useful.
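The magic number is probably one of the cheaper questions to address in code. Here is a minimal sketch, assuming that a 'type' of 0 really does mean "predator kill" (the example only implies this). The `KillTypes` class, its `PREDATOR` constant, and the `isPredatorKill` helper are hypothetical names for illustration, not part of the original system.

```java
import java.util.Map;

// Hypothetical holder for the codes stored under the legacy 'type' key.
final class KillTypes {
  // Assumption: 0 is the code the legacy structure uses for a predator kill.
  static final int PREDATOR = 0;

  private KillTypes() {}

  // True when the map's 'type' entry is present and holds the predator code.
  static boolean isPredatorKill(Map<String, Integer> kill) {
    Integer type = kill.get("type");
    return type != null && type == PREDATOR;
  }
}
```

The name answers "what does 0 mean?" at the call site. It still cannot answer why type is an integer in the first place, so that question would remain a candidate for a comment.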

Comments are to explain why

We use comments to explain why something is out of the ordinary or why something is very complicated. These explanatory comments should answer the question: why is the code written in this particular way? There can be some helpful "what is this?" text, but to me the why is far more important. Let's use our example one last time.

class Animal {
  /*
   * This data structure represents kills within the ecosystem
   * that belong to this animal.
   *
   * [
   *   // kills array
   *   [
   *     { 'kill' => 0, 'type' => 0, 'x' => 0, 'y' => 0 },
   *   ],
   *   // metadata array
   *   [
   *     { 'killsInDesert' => 0, 'killsInSea' => 0 }
   *   ]
   * ]
   *
   * This is a very complicated data structure that undergirds
   * our entire system. We've extracted most of the data out of
   * this into better typed data structures, but we have not yet
   * eliminated this for all use cases.
   *
   */
  private List<List<HashMap<String, Integer>>> allPlantAndAnimalKills;

}

Here, the comment is doing something that the code is not. It is describing the motivation of the author/editor of the code. They want this complicated data structure to go away, but haven't been able to push it over the edge just yet. This could be the catalyst that a new developer touching the code for the first time needs to invest in just getting rid of the whole bad data structure in favor of something that is better.

Note that there is also a what. This what is valuable in this case because of the type system abuse. In theory, you could encode all of that information in some kind of pseudo type system that operates only on this data structure. I won't go into much detail here, except to say that it is possible to get rid of the what even in this very complicated case. But no code is ever going to be able to capture the why.
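To give a rough idea of what such a pseudo type system might look like, here is a minimal sketch of a typed view over one entry of the kills array. `KillView` and its methods are hypothetical names. The sketch also assumes that any non-zero 'kill' value marks an animal kill and that a 'type' of 0 marks a predator kill, neither of which the example states outright.

```java
import java.util.List;
import java.util.Map;

// Hypothetical read-only view over one map from the legacy kills array.
final class KillView {
  private final Map<String, Integer> raw;

  KillView(Map<String, Integer> raw) {
    this.raw = raw;
  }

  // Assumption: any non-zero 'kill' value marks an animal kill.
  boolean isAnimalKill() {
    Integer kill = raw.get("kill");
    return kill != null && kill != 0;
  }

  // Assumption: a 'type' of 0 marks a predator kill.
  boolean isPredatorKill() {
    Integer type = raw.get("type");
    return type != null && type == 0;
  }

  // Counts animal kills, optionally restricted to predator kills.
  static int countAnimalKills(List<Map<String, Integer>> kills, boolean onlyPredatorKills) {
    int count = 0;
    for (Map<String, Integer> raw : kills) {
      KillView kill = new KillView(raw);
      if (kill.isAnimalKill() && (!onlyPredatorKills || kill.isPredatorKill())) {
        count++;
      }
    }
    return count;
  }
}
```

With a wrapper like this, the shape of each entry lives in method names rather than in a block comment. The note about wanting the whole structure gone still has nowhere to go but a comment.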
