Don’t document your code. Code your documentation.

Dani Morillas on January 25, 2017

This is one of the great discussions among developers: document or not document your code? Is it worth writing documentati...

Well, I think modern coders have confused the reason we should be documenting and commenting. My policy is pretty simple:

  • You should always be able to tell WHAT the code is and does by how you write it (your point.)

  • The WHY (the intent of the code) is rarely something one can figure out in the best of circumstances. This is where commenting comes in - but comment WHY, not WHAT.

  • Documentation, in the literal sense, should be hand-written for your end users. HOW do you use the library/application? I hate it when projects call API docs "documentation," and don't bother with anything else, as it is nigh impossible to learn to use an unfamiliar library from the API docs.

In short, we MUST do all three: code your documentation (WHAT), comment your code (WHY), and document your project (HOW).
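As a rough illustration of the three layers, here is a minimal Python sketch. The function name, the 30-day term and the business rule behind it are all invented for the example; only the WHAT/WHY/HOW split comes from the comment above.

```python
from datetime import date, timedelta

# WHY: hypothetical business rule, made up for this sketch. Invoices fall due
# 30 days after issue because that is what the (imaginary) contract says.
PAYMENT_TERM_DAYS = 30


def invoice_due_date(issued_on: date) -> date:
    """HOW to use it: pass the issue date, get back the payment due date."""
    # WHAT is carried by the names, so no comment restates it.
    return issued_on + timedelta(days=PAYMENT_TERM_DAYS)
```

The names say what happens, the comment on the constant says why the value is what it is, and the docstring is the small piece of the HOW that lives next to the code. The full HOW would still belong in hand-written user documentation.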

I was about to write exactly the same post :)

The only thing is that it's extremely complex to teach that to youngsters. Code reviewing doc from juniors takes a lot of trips back and forth in the comments.

I guess that a clear doc comes with a clear vision of what you're doing.

I always try to use good naming conventions for variables, functions, classes, etc.., but I still believe in docstrings for almost every function/method/class. It takes almost no time, it can help you remain focused, and it will help others quickly see what is happening in the code. I avoid using comments within a function/method, unless it is unavoidably convoluted.

I completely agree with this. It makes it so much easier to add someone new to a project.

I have strong opinions on this subject. I've seen the ease and welcoming warmth that documented code gives to people new to the code base (and by new I don't mean people with no experience).

I use both: prose, and long, clear variable names that try to convey the intention. I can pick up the code rather easily, but my experience is that people don't want to be reading code for the most part unless they absolutely have to.

So as a Python dev I felt that self documented code was the best... but after working in a totally unknown code base in a new language to me... not having prose documentation is hellish.

Do the right thing: try to make comments and variables as simple as possible but not any simpler, keeping in mind that simplicity is hard and expensive to attain.

Hell, sometimes prose and code don't even matter if people won't read either. At that point you even have to consider making video tutorials, talks, conferences, whatever you have to do just to get your point across.

Documentation is hard, but not because of the medium. Some people digest code, others prose, others talks, others classes and a mentor.

And our job is to consider every single one, that is if you want your code to survive.

I agree strongly, especially with the prose part. It's probably a matter of preference, but I just love it if I can jump through a codebase, gaining a good idea of what it does by only reading the comments and some really trivial lines of code.

A problem with depending on your function names for documentation is that it makes your code rather verbose. (Anyone who has been a Java dev for any length of time knows exactly what I'm talking about!) Following the logic of Clean Code, I suppose that map should have been named applyFunctionToAllElements or something. Fortunately, though, that isn't the case, and after reading the documentation you can remember what map means and be spared all that verbosity.
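To make the trade-off concrete, here is a small Python sketch. `apply_function_to_all_elements` is a hypothetical helper written only for this comparison; `map` is the real built-in.

```python
def apply_function_to_all_elements(function, elements):
    """Hypothetical long-named equivalent of the built-in map()."""
    return [function(element) for element in elements]


names = ["ada", "grace"]

# The verbose name explains itself at every call site...
verbose = apply_function_to_all_elements(str.upper, names)

# ...while the short name relies on the reader having read the docs once.
concise = list(map(str.upper, names))
```

Both calls produce the same list; the only difference is how much of the explanation is repeated every time the function is used.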

I'm not against favoring code over comments though. If it's a private method that has a very specific purpose and isn't called from many places then I'm happy to not give it a comment header. As ever, "it depends".

Whenever I found naming hard, I realised I had an incomplete idea of the domain and goal.

Why am I mapping this, why am I transforming this?

While I agree with the general gist of the article, I would like to address this part:

"Extract as much code as you can to methods. Even if you end up having a method with only 3 or 4 lines. Each method should do one thing and only one thing. "

I am not a fan of this part of Clean Code. The advice is too simplistic: extraction is a good tool, but blanket recommendations like this don't work.

Let me explain.

I understand where you are coming from, and I did the same for years, but I eventually concluded that my behaviour was leading to worse code, not better. Now, instead, where I used to extract a function, I just tidy it up into a commented block.

In theory, aggressive extraction sounds really good, but pushing for it will lead to premature (i.e. incorrect) abstraction. (Also, since we are talking about a class here, it will also tend to lead to harder-to-track state because it's now spread out over many methods.)

The reason why it leads to premature abstraction is that extracting is really hard to do correctly. It's pretty hard to find the border of what your "one" thing is (and naming it is even harder; sometimes an elegant name obscures), so you are going to fail at it a large percentage of the time. The more you extract, the more failed extractions you are going to have, so it's important to be careful about extracting and not to use it wildly.

John Carmack, the Oculus CTO, writes about a similar thing here:
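A minimal Python sketch of the two styles being contrasted, under made-up assumptions: the order-summary task, the helper names and the flat 10% tax rate are all invented for illustration.

```python
TAX_RATE = 0.10  # assumption for this example only


# Style 1: aggressive extraction, one tiny function per step.
def _subtotal(lines):
    return sum(price * quantity for price, quantity in lines)


def _tax(subtotal):
    return round(subtotal * TAX_RATE, 2)


def summarize_order_extracted(lines):
    subtotal = _subtotal(lines)
    tax = _tax(subtotal)
    return subtotal, tax, subtotal + tax


# Style 2: one function, tidied into commented blocks.
def summarize_order_inline(lines):
    """Return (subtotal, tax, total) for a list of (price, quantity) pairs."""
    # Subtotal: price times quantity, summed over every line.
    subtotal = sum(price * quantity for price, quantity in lines)

    # Tax: flat rate applied to the subtotal, rounded to cents.
    tax = round(subtotal * TAX_RATE, 2)

    # Total.
    return subtotal, tax, subtotal + tax
```

Both versions compute the same result. The extracted one commits early to two helper boundaries and names; the inline one keeps the flow in one place and leaves extraction for when a second caller actually needs a piece of it.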
number-none.com/blow/john_carmack_...

I've seen what happens to comments in code: Nothing. And that's bad. The first coder writes a comment, yeah, great. Then someone makes a quick change to the code... But guess what, the comment doesn't get adjusted. And suddenly the code and the comment are telling two different stories. Happens all the time.

This is one of the reasons I also prefer the clean code approach: Let the code tell the story and restrict the code documentation to the "why". What a method does should be obvious, but the question will be, why it does that. What is the purpose? Why -12 and not -123? etc. The less redundant documentation inside your code, the better.

Of course, if we are talking about a public API or something similar, documentation for the user (which can take the form of many examples) is a great way to get started, no doubt about that. It's also not a bad idea to document the structure of your project, the "big picture", to help people find their way around.

In your experience, do devs who discontinue the comments really have the discipline to continue the "clean code" paradigm?

I recently read Clean Code for JavaScript; it's quite big, but I apply it in my coding practices. I think it's for the best: we devs write code and projects day in, day out, so it's best to write code that reads like a story anyone can follow!

github.com/ryanmcdermott/clean-cod...

No. While I love some well-written code, clear in meaning, it is by no means a substitute for documenting your code. I shouldn't need to dig through what is sometimes hundreds of files and thousands of lines of code just to find out how to work with a project. It doesn't matter how clear code is, it cannot describe your entire code base, or the far-reaching side effects.

This is forgetting the (few) times you have to document why you are doing something non-obvious. I otherwise agree that documenting the "what" is generally useless and redundant, and quickly becomes stale and error-prone. However, I also agree with other commenters that it can sometimes be much easier, and no less readable (possibly more), to add a one-line comment rather than extract a piece of code into its own function/method.

Documenting the why, rather than the what, can be absolutely crucial. As for comments being easier to understand than code being extracted into separate functions; I'd say there is far more chance of a comment being out of date than a function name (although the latter is also entirely a possibility).

I feel disappointed in this article, as I thought it would be about writing a program that takes an AST of your code as input and then outputs literate "pseudocode" that a human can read. For example, if you were to write some FizzBuzz code and feed it into the machine, you would then get the following info:

  1. Set the variable "integer" to equal the input.
  2. Check if "integer" is divisible by both 3 and 5; if it is, return "fizzbuzz".
  3. Check if "integer" is divisible by 3; if it is, return "fizz".
  4. Check if "integer" is divisible by 5; if it is, return "buzz".
  5. Return "integer".

It might not be good documentation, but it would be interesting documentation.
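For what it's worth, a toy version of such a machine is not hard to sketch with Python's standard `ast` module (this needs Python 3.9+ for `ast.unparse`). The `describe` function and the tiny subset it handles (a single function whose body is only `if ...: return ...` and `return ...` statements) are assumptions of this example, not an existing tool.

```python
import ast

SOURCE = '''
def fizzbuzz(integer):
    if integer % 15 == 0:
        return "fizzbuzz"
    if integer % 3 == 0:
        return "fizz"
    if integer % 5 == 0:
        return "buzz"
    return integer
'''


def describe(source):
    """Turn a very small subset of Python into numbered English steps."""
    # Assumes the source holds exactly one function definition.
    function = ast.parse(source).body[0]
    steps = []
    for statement in function.body:
        if isinstance(statement, ast.If):
            # Assumes the if-body is a single return statement.
            condition = ast.unparse(statement.test)
            outcome = ast.unparse(statement.body[0].value)
            steps.append(f"Check if {condition}; if it is, return {outcome}.")
        elif isinstance(statement, ast.Return):
            steps.append(f"Return {ast.unparse(statement.value)}.")
    return [f"{number}. {step}" for number, step in enumerate(steps, start=1)]


for line in describe(SOURCE):
    print(line)
```

The output is a mechanical restatement of the WHAT, one step per statement, which is exactly why it would be interesting rather than good documentation: nothing in the AST knows the WHY.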

As for your broader point, I tend to lean towards "Readme Driven Development" - writing the documentation first and then writing the code to match your documentation. You start off with your clean, awesome documentation (knowing what your code is supposed to do beforehand)...meaning you can then focus your time on writing the clean code that matches what your documentation promises.

It could also be "bug driven development". Your README is a specification, and you never do anything that is NOT in the documentation first.

I took this approach in my project here: github.com/ScalaWilliam/eventsourc...

Also you can add some automation to this:
For example, if you make a change to the README, you can say in the commit description "New Issue: thing X is not implemented, does not match readme", and a git bot that I made will automatically create an issue based on that. Here's a commit and an issue:
github.com/ScalaWilliam/git-work/c... github.com/ScalaWilliam/git-work/i...
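The parsing half of that kind of automation could look roughly like the Python sketch below. Only the "New Issue:" prefix is taken from the description above; the function name, the regular expression and the return shape are hypothetical and are not how the linked bot is actually implemented.

```python
import re

# Matches lines such as "New Issue: thing X is not implemented".
# The "New Issue:" prefix comes from the comment; the rest is an assumption.
ISSUE_PATTERN = re.compile(r"^New Issue:[ \t]*(.+)$", re.MULTILINE)


def issues_from_commit_message(message):
    """Return the issue titles requested in a commit message."""
    return [title.strip() for title in ISSUE_PATTERN.findall(message)]


example_message = (
    "Document thing X in README\n"
    "\n"
    "New Issue: thing X is not implemented, does not match readme\n"
)
titles = issues_from_commit_message(example_message)
```

A real bot would then hand each title to the issue tracker's API; that step is left out here because it depends on the hosting service.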

Side note:
The fizz buzz example is that of imperative code; it's really better to read the code in that case. Where I'm thinking some sort of visual would be useful is in generating a code architecture view automatically, like a dependency tree within your code.

There are some cases where I like to put a short summary at the top of a class indicating what it's supposed to do, or what the scope is. Especially if (the name of) the class bears no reference to some familiar pattern in the team.

And I'm all for proper naming and method extraction to avoid comments.

I like to document my code inline as soon as it becomes non-obvious at first glance. I specifically write a small comment on each path the code may take, introduced by if/else. Generally, my code should be easily understandable by people who have mastered the language it's written in but have no idea what the software itself is about.

If my method names can be self-explanatory, I like it and I'll make use of it. But as far as I have tried it, more often than not this is not possible without introducing the lengthy gibberish names we have to deal with in Java or C#.

Writing documentation is a pretty thankless job. Nobody will ever read what you write and no matter how much of it you write people will find something you did not document and complain about it. It's always somebody else that ought to document something. And somehow it is frequently implied that would be me.

So, my attitude to documentation is "by exception". Most code should be obvious and not in need of any documentation. But once in a while you do something non-trivial, and then it matters to document wtf you were thinking. A simple link to a Stack Overflow post to explain the totally non-obvious workaround for some issue, the weird set of circumstances that caused you to add some null check or other condition that you chased down, etc. Those are the things that need documenting. All the rest is obvious.

I love little hints that outline the reasoning behind bits of non obvious hackery, algorithm choices, concurrency handling and other bits of code where somebody really went the extra effort to get things done the right way.

Great post!

When I first got into development I was really confused by the concept of not adding documentation to code. In school I was taught to always comment everything and always have documentation in code. On top of that, I worked as an automation engineer for a while, where each line had to be commented to explain what it was doing, which, in hindsight, was terrible to read and scan.

I understand this concept now, and really it's just easier this way. The only time I've really seen comments added to a project is when a special workaround is added for a bug in a library (external dependency), which includes a TODO to come back and update the code when the bug is fixed.

For me and the company I work for, we document our functionality and how APIs are supposed to work in a common space that isn't in code.
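A minimal Python sketch of that kind of comment, under an invented scenario: the config library bug described here is imaginary, and `parse_port` is a hypothetical name.

```python
def parse_port(raw_value):
    """Convert a port read from configuration into an int, or None."""
    # WHY: hypothetical scenario. Assume an external config library returns
    # the string "None" instead of a real None when the key is missing.
    # TODO: drop the string comparison once the (imaginary) upstream bug is
    # fixed and the dependency is upgraded.
    if raw_value is None or raw_value == "None":
        return None
    return int(raw_value)
```

Without the comment, the comparison against the string "None" looks like a mistake that a well-meaning reviewer would delete; with it, the next reader knows both why it exists and when it can go.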

Both! Use meaningful variables to help define your constructs but also explain your thought process in comments. As someone who has written code and fixed other people's code, nothing helps more than "// this loop examines each item in the array looking for the process flag". W00T! I now know why this loop exists!

There are a few reasons why I consider comments important:

  • Languages are ambiguous, or lacking in semantic clarity. A perfectly formed algorithm may not always reveal what it's attempting to do, only how it is done.
  • How something is done, or what is happening, is not always enough to understand why the function would be called in the first place. Though I agree with nice function names, I disagree with names that are 80-100 characters long. Short of those lengths there will always be some uncertainty in the function's purpose.
  • Functions don't live on their own, they exist in a framework. Describing how this function fits into the framework is something that code alone cannot always do. This becomes more important as frameworks become more generic or abstract.

I do both. It has saved me many more times than I care to remember.

Sure, name functions and variables in a meaningful way, but put a comment block at the top of the function to explain the what, why and how.

Taken further, try this method (it's what I do, and it is often commented upon favourably):

  1. Name your function and input variables in a useful way.
  2. Now write individual comments explaining the process.
  3. Now write the actual code under each comment.
  4. Review whether the line of code is easily understandable on its own. If so, remove the comment.
  5. Function all good and understandable to anyone? You're done. If not, put in a comment block at the top of the function explaining the why and how.

Like tests, comments are best written before the actual code but they must be reviewed after the code is written to ensure their value.
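The steps above can be sketched on a toy Python function. The function itself is invented for the example; the draft comments from step 2 are shown first so the review in step 4 is visible.

```python
# Draft after steps 1-2 (comments only, no code yet):
#   - split the sentence into words
#   - an empty sentence has no words: return 0.0 instead of dividing by zero
#   - average the word lengths


def average_word_length(sentence):
    """Return the mean length of the whitespace-separated words."""
    words = sentence.split()
    # Kept after review: an empty sentence has no words, so return 0.0
    # instead of dividing by zero.
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)
```

After step 4, the first and third draft comments were dropped because the lines under them read fine on their own. Only the comment explaining the early return survived, since it carries the reason rather than the mechanics.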

I absolutely agree with that. But you could take that to the next level. E.g. if you need to document REST interfaces, you could just write a document, wiki page or whatever. But that's similar to writing code comments. You could use some annotation-based documentation framework, but you're still able to annotate a response code that doesn't match what the code actually returns.
So we wrote a little tool that captures requests, responses and additional textual information during test execution.
You're forced to have at least one test covering each use case, and when all tests are green, you've got up-to-date REST documentation as markdown.

I'm 50/50 on documentation within a method; if you need to document it, it's probably too complicated to begin with. Classes and methods always get XML style documentation, however, and I'm a sucker for a verbose, properly named variable / function.

I have become a really big fan of inline documentation via file and function header comments. Besides that I also like having a few sentences on project architecture, workflow and file organisation. Naming things well is important but not enough in my opinion.

I have a number of challenges for people who think code can generally be self-documenting. The first one is to code the quicksort algorithm in such a way as to make the average and worst-case time complexity obvious, and also explain the circumstances in which the worst case occurs.
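For reference, a plain Python quicksort shows the problem. The first-element pivot is a choice made for this sketch, and the complexity facts live in the docstring because nothing in the code itself states them.

```python
def quicksort(items):
    """Return a sorted copy of items, using the first element as the pivot.

    Average time is O(n log n). The worst case is O(n^2), and with this
    pivot choice it occurs on already-sorted or reverse-sorted input,
    because every partition then puts all remaining elements on one side.
    """
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [item for item in rest if item < pivot]
    larger = [item for item in rest if item >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)
```

Renaming `pivot` or `smaller` would not tell a reader any of what the docstring says; the analysis has to be written down somewhere outside the executable lines.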
