Jason McCreary

Posted on Feb 15, 2018 • Originally published at jason.pureconcepts.net

Removing Comments

#programming #refactoring #practices

After my post on Boyscouting I found most of the discussion focused on code comments. Specifically, that part of boyscouting was removing comments. Tips on removing comments was also one of my most popular code cleanup tweets.

Recently removing comments came up again in a code review:

// filter out contacts that have unsubscribed
$contacts = $unsubscribedFilter->filter($contacts);

// filter out duplicate contacts
$contacts = $duplicateFilter->filter($contacts);

The reviewer asked why I "removed the documentation" when I condensed the code to:

$contacts = $unsubscribedFilter->filter($contacts);
$contacts = $duplicateFilter->filter($contacts);

I'll come back to the difference between comments and documentation. For now, there is a difference and what I removed were indeed comments.

First, let me acknowledge all the team leads and enginering managers crying out:

Remove Comments?!!

Yes, I am suggesting removing comments from your code.

Surely you don't mean removing useful comments?

Well, what's a useful comment?

Let's back up and address the distiction between comments and documentation. Documentation is formatted comment blocks (e.g. DocBlock, JavaDoc, etc). Comments are everything else.

A comment should relay why, not what or how. Said another way, if there is something that can't be relayed by reading the code, then a comment might be needed.

Going back to the code review, there was nothing the comments relayed that the code did not. I can infer from the assignment and method names we are "filtering duplicate contacts". So the code comment above is not useful. In fact, I wasted time reading it.

For me, removing comments is about acheiving code that clearly communicates. One could even refactor the code to improve readability. Consider:

$contacts->filterUnsubscribed();

Comments can not only be useful, they can also be misleading. I continually come across outdated comments that have not evolved as the code has changed. I recently needed to fix the following legacy code which was prematurely ending.

foreach ($items as $item) {
    if ($item->published) {
        // we've hit the most recent item before this push, so stop looping
        exit;
    }
}

The comment says to stop looping, but the code exits. I wasted several minutes debating which to trust. Given the bug, I updated the code to follow the comment and stop looping. Regardless, this bug would have been solved without the comment. Combined with the buggy code, it did more harm than good.

That's really what it's about – doing good. Leaving something better than you found it. That's why it's called Boyscouting. If you come across a comment that you can remove, remove it. If you can't remove it, see if you can refactor the code so you can remove the comment. Future developers will thank you. Even if that future developer is you.

I eventually came across the following passage from Rob Pike regarding comments which, quite effectively, summarizes this entire post.

[comments are] a delicate matter, requiring taste and judgment. I tend to err on the side of eliminating comments, for several reasons. First, if the code is clear, and uses good type names and variable names, it should explain itself. Second, comments aren’t checked by the compiler, so there is no guarantee they’re right, especially after the code is modified. A misleading comment can be very confusion. Third, the issue of typography: comments clutter code.

Want more cleanup tips? I'm writing a "field guide" with real-world practices to help you write code that truly lasts. Sign up for early access.

Top comments (18)

Jason C. McDonald • Feb 15 '18

I also agree with commenting intent, but maybe not with some of the "removal" suggestions herein.

My company has actually seen significant gains in coding speed, acclimation time, and code quality after adopting the Commenting Showing Intent standard, wherein every logical block/statement has an intent comment, mandatory. If someone went through and "boyscouted" our code, it would do serious damage to our code base, and probably get them banned. ;)

Take, for example, that bug you mentioned finding because of the comment. Yes, you might have found that on your own, but keep in mind the sheer MASS of code you're looking at. Usually, by time one encounters that, their eyes have already glazed over. What stops them? The discrepancy between the comment and the code. It encourages further investigation, and has led to a LOT of bugs being caught and fixed, even by complete newbies to the language/code.

Thus...while one should definitely prune bad comments, I'd say a better thing to do would be to tidy the comments up.

Make sure they're commenting intent, not "what" or "how" (unless you're explaining an especially bizarre algorithmic choice or whatnot).
Check for discrepancies, and after resolving them, ensure the comments are accurate.
Remove cruft.
Fix grammar (it's more important than it sounds).

In other words, just make it more readable. But PLEASE don't go "boyscouting" in CSI-standardized code!

Jason McCreary • Feb 15 '18 • Edited

Agreed, don't go removing comments, or any other boyscouting practice, from any standardized codebase.

With that said, CSI seems to say the same thing I do:

CSI comments state WHY

I have no problem with comments relaying why where the code can not.

Jason C. McDonald • Feb 15 '18

Except, in my experience, code rarely ever states WHY. It states WHAT. You almost always need intent-comments, because...

$contacts = $unsubscribedFilter->filter($contacts);
$contacts = $duplicateFilter->filter($contacts);

...does NOT tell me why you're doing that, only what you're doing. Therein lie a host of bugs and logic errors.

Jason McCreary • Feb 15 '18 • Edited

Code is not a "cooking show" where you are guided line by line, step by step through commentary. There is a basic level of trust and skill required to be a programmer. If you can't glean enough about the code through reading it, then it should be refactored - not commented. There is a limit to this. So at some point, it's the programmers responsibility to understand. Let's not lower the bar with comments everywhere.

Jason C. McDonald • Feb 15 '18 • Edited

No, that's not the point at all. The reason for CSI, for intent-commenting in general, is that intent cannot be reconstructed easily. You can figure out WHAT with any degree of conmpentency, but NOT why. This is a major reason we have so much bad code; we spend so much of our time trying to guess what we and other devs were intending, we waste hours trying to grok what a few comments could have helped us grok in minutes.

See, mind-reading is not part of the programmer's skill set. You cannot even necessarily remember your own intent six months after you wrote the code; if anything, you are intent-commenting for your own good.

Now, to clarify...that code you posted almost certainly doesn't appear alone in the code. You aren't facing a two-line block, you're facing a wall of code. What's the intent in context? Perhaps this is in a function that loads your contact list...

// We only want to see contacts we're subscribed to.
$contacts = $unsubscribedFilter->filter($contacts);
$contacts = $duplicateFilter->filter($contacts);

See? Now it's clear why the code is there. We can tell we're filtering out unsubscribed and duplicate contacts, but we now know why we're doing it - to filter the displayed results. Now if you are finding that you're seeing duplicates in the results, and go through, you might realize "oops, I'm storing the results in $contacts, but I should be storing them in $display! That's only easily grokked because of the intent content; ordinarily, you'd probably have looked right past it (code blindness is nothing to mess around with!)

And, for the record, I have encountered those sorts of situations in my own code for years.

Jason Lotito • Feb 18 '18

Why do we only want to see contacts we're subscribed to? That comment just says what is being done, and I can't even be sure that's what the code is doing.

Well, you should refactor that, because as of right now, you have code that does something very specific (return a list of contacts you are subscribed to, not display that as the comment implies, btw). A better refactoring of that would be to do something like this:

function getContactsSubscribedTo($contacts) {
    $contacts = $unsubscribedFilter->filter($contacts);
    $contacts = $duplicateFilter->filter($contacts);
}

There. We've removed the unnecessary, confusing, "what's it doing" comment with a clearly defined and testable function.

Jason C. McDonald • Feb 18 '18

...I can't even be sure that's what the code is doing.

Ergo, the intent comment. What is my intention? Why am I writing that code there? If you can't be sure that's correct, then perhaps something is wrong.

I know we can parse semantics here, but I'm not talking theory. The ROI on the commenting standard I describe is actually observable within my own company.

So, yes, you might be describing what you intend for it to be doing. And, yes, there are rare cases where a comment is pointless. But the problem with examples like the above is that they're oversimplified. In the real world, those two little lines of code appear within the context of hundreds or thousands of lines of code, and hundreds of functions with potentially similar names. In practice, the benefit of those intent comments becomes apparent when you begin using it in a real-world context.

Jason Lotito • Feb 18 '18

So how do I verify your intent without you and automatically?

Jason C. McDonald • Feb 18 '18 • Edited

...how do I verify your intent with you...

The same as how you verify my code without me. Read it.

...and automatically...

The same as how you verify my code automatically. On a large scope, either that's what happens in the tests, or it isn't. On a smaller scope, you don't.

Beyond this, I really doubt this conversation is going to be productive. I know intent-comments work because I actually use them in production. But, I have learned that there are camps who believe that comments are a waste of time, and who will continue to believe they're a waste of time regardless of what I have to say about it. ;) So, I think we'd better agree-to-disagree at this point.

Ben Sinclair • Feb 15 '18

I'm torn on this, too. Your example of "unsubscribedFilter" is ambiguous without either the comment or reinforcement from the similarly-used "duplicateFilter". If it was on its own I would probably interpret it as a filter which only let unsubscribed items through, not one which filtered them out.

The better solution for me would be to change the filter or method names and then remove the comments - which is too much change for this "boyscouting" idea.

I know what you're saying, and I know it might seem that I'm being picky, but I'm saying this to back up my point, which is that even if you think something is clear without comments, it might not be to someone else. It might be a good idea to ask the person at the next desk whether they agree with you that something's obvious.

Jason McCreary • Feb 15 '18

Agreed, removing comments is not the only step here. It must have the accompanying step to improve the readability of code without the comment.

Burdette Lamar • Feb 15 '18

Agree: burdettelamar.wordpress.com/2018/0...

Mihail Malo • Feb 8 '19 • Edited

Would you say this comment is justified?

  buffer(length: number) {
    // has to come first so we don't allocate on error
    const start = this.fwd(length)
    const newBuf = Buffer.allocUnsafe(length)
    this.buf.copy(newBuf, 0, start, this.pointer)
    return newBuf
  }

It's there to prevent someone "optimizing" this method to the following:

  buffer(length: number) {
    const newBuf = Buffer.allocUnsafe(length)
    this.buf.copy(newBuf, 0, this.fwd(length), this.pointer)
    return newBuf
  }

Which offers no performance improvement, but if an error causes the bombardment of this method with yuuuuuge numbers, orders of magnitude larger than the underlying Buffer, it will cause lag, instead of just throwing the errors nicely.

For reference, this is what the .fwd method does:

  /**
   * Sets the buffer pointer, the address at which next parse will begin.
   * @throws {RangeError} if this will put pointer outside of `0..=len` range.
   * @param {number} addr What index to go to
   * @returns {number} the previous pointer, before setting
   */
  private goTo(addr: number): number {
    const { pointer, len } = this
    const newPointer = addr
    if (newPointer > len) throw new RangeError("Index past Buffer end")
    if (newPointer < 0) throw new RangeError("Index before Buffer beginning")
    this.pointer = newPointer
    return pointer
  }
  /**
   * Increments the buffer pointer, the address at which next parse will begin.
   * @throws {RangeError} if this will put pointer outside of `0..=len` range.
   * @param {number} dist How far to wind the buffer pointer. Can be negative to rewind and overwrite.
   * @returns {number} the previous pointer, before winding
   */
  fwd(dist: number): number {
    return this.goTo(this.pointer + dist)
  }

The JSDoc might seem excessive, but it is useful because tooling displays it in tooltips wherever it is referenced.

Jason McCreary • Feb 9 '19

I don't consider "doc blocks" to be comments. So I wouldn't consider those "excessive".

As far as the inline comment regarding the optimization, these are definitely the gray area where a comment does relay non-obvious implementation details. This could potentially be mitigated with a name such as optimizedBuffer or by pushing the buffer optimizations lower to a createFastBuffer.

Mihail Malo • Feb 9 '19 • Edited

I didn't mean the JSDoc as part of the question, it was just a literal copy paste to show what fwd does.
I don't see how it could be extracted ergonomically, since the invariant is that this.fwd(length) returning as the return value and this.pointer the two arguments to copy runs before allocate

In most cases inlining them is perfectly acceptable. These are the cases around buffer:

  doublele() {
    return this.buf.readDoubleLE(this.fwd(8))
  }
  slice(length: number) {
    return this.buf.slice(this.fwd(length), this.pointer)
  }
  buffer(/**/){/**/}
  string(length: number, encoding: Encoding) {
    return this.buf.toString(encoding, this.fwd(length), this.pointer)
  }

But yeah, hopefully this class will be single-responsibility enough that no one will come edit it at all. And the comment is definitely staying.

One more good thing about deleting comments is that the few that do remain are seen as that much more important.

Matt Lancaster • Feb 15 '18

Nice article Jason.

If anyone is looking to delve deeper into proper coding techniques then I suggest you pick up "Clean Code" by Robert Martin (Uncle Bob). It's a fantastic read and will help you think in a more concise thought process when writing code.

Frank Carr • Feb 15 '18

I'm kind of torn on this issue.

First of all, I like the code itself to be self-documenting with no inline comments. Comments that describe implementation, such as the "stop looping" one you mentioned, can quickly go out of date.

However, in C# components and microservices, I use documentation comments to allow other developers to see Intellisense hints in Visual Studio and to build documentation files using tools like Sandcastle. These comments should describe what the method does, such as "Search for unpaid invoices for specified customer", but not go into any implementation details like "Query XYZ.Invoices database table for unpaid customer invoices".

Also, since I frequently deal with legacy code going back 20+ years in some cases, I want to keep comments made by programmers who left the company many years ago. This gives me insight into what they were thinking when they originally developed the code. I may add my own comments (initialed) to indicate that the comments are no longer applicable if that's the case.

Jason McCreary • Feb 15 '18 • Edited

Agreed, there is definitely value in widely shared codebase to have a higher level of comments. I would still strive for readable code without comments in any codebase. That is, the comments should augment.

View full discussion (18 comments)