Eli Bierman

Posted on Jun 13, 2018 • Edited on Jul 7, 2020 • Originally published at elib.dev

3 Essential Components of Great Documentation

#documentation #nicedocs #opensource

I originally wrote this as a response to a post by @nonlinearnygma that is no longer available.

It got really long and I felt a bit rude leaving such a long comment, so I decided to make it its own post. I clearly have a lot of opinions about documentation! This post focuses on what makes documentation great, and how a beginner to a project can get involved in contributing to its documentation.

Update: After writing this post I saw @ojkelly has a very similar post describing documentation in layers:

The Four Layers to Great Documentation

Owen Kelly ・ Jun 7 '18 ・ 5 min read

#documentation #docs #marketing #writing

I really love how he describes the layers building on each other, so be sure to check out that post.

But Nobody Likes Writing Docs...

If you feel this way you are not alone. Writing documentation doesn't always feel like a priority when you also need to code. Good documentation makes code more reusable and understandable, so the code and other coders will thank you for it. Writing documentation at the same time as you write code can make it a more manageable and enjoyable task.

Writing Docs Helps You

Writing documentation is a great way to think through the details of a technical subject. It can help you reframe issues in ways that lead to insights about the project, or reveal areas for improvement that you may not have thought of before. It can even help in the planning stages of a project or new feature. Some people even practice Documentation-Driven Development!

Writing Docs Helps the Community

Documentation is the first place most coders go to try out or learn about a new project, or when they come across an issue or try solving a new problem with a project they already are familiar with. That means that documentation contributions can make even more of a positive impact on a project's community than contributions to the code.

GitHub's 2017 Open Source Survey found that the biggest problem encountered in open source was "incomplete or confusing documentation." Their top insight from the survey was this:

Documentation is highly valued, frequently overlooked, and a means for establishing inclusive and accessible communities.

Any contribution (small or large) is always greatly appreciated by the project maintainers and its users.

Aspirational Documentation

Some projects with documentation that I try to learn from:

VueJS
SQLite
FreeBSD (especially compared with Linux)
Bunjil (I haven't used it, but this GraphQL server by @ojkelly that he shared in his post has really #nicedocs)
Docker (contributed by @presto412)
AWS (contributed by @technologymop)

All of these projects include the components I describe below. Let me know in the comments if there are any projects with documentation you love. I'm always looking for more documentation inspiration!

My Ideal Documentation

In my opinion the ideal documentation usually has 3 components.

The Why / Goals: the context and goals of the project
The What / API / Reference: detailed technical documentation of the programming interface
The How / Examples / Guides: example-based guides for accomplishing specific tasks

The Why / Goals

What was the motivation for this project being built in the first place?
What are some similar projects and how are they different?
What types of projects would this project be a good fit for, and when might something else be better?

Usually this is best answered by the authors of the project. If it's possible, it might be helpful to get their point of view on this to add to the documentation if it's not there already. Sometimes people leave out explaining where the project might not be the best fit, but it is very useful and appreciated by users.

The What / API / Reference

What are the different high-level components and how do they fit together?
What are the different low-level data types and functions and what do they do?

There is usually some of this already. If there isn't any, it can be very difficult to get started at all. This can be a good place for suggesting changes where things are unclear to you as you are learning how to use the project. Maintainers can sometimes be picky about this area of the documentation since it's viewed as the authoritative source of information, but it can be a good opportunity to learn about the nitty-gritty details of the project.

The format of this type of documentation is often language-specific, since most languages come with some kind of built-in system for generating docs from comments in the source code. That is usually the format that people coding in that language expect to see this type of documentation in.
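As a rough illustration of what "generating docs from comments" means, here is a minimal Python sketch. The `resize` function and the `render_reference` helper are hypothetical names made up for this example; a real project would normally rely on its language's standard documentation tool rather than a hand-rolled renderer like this one.

```python
import inspect


def resize(width, height, scale=1.0):
    """Return a (width, height) tuple scaled by `scale`.

    Hypothetical example function; it exists only to carry a docstring.
    """
    return (round(width * scale), round(height * scale))


def render_reference(func):
    """Build a tiny plain-text reference entry from a signature and docstring."""
    signature = f"{func.__name__}{inspect.signature(func)}"
    doc = inspect.getdoc(func) or "(undocumented)"
    return f"{signature}\n{doc}"


if __name__ == "__main__":
    print(render_reference(resize))
```

The point is only that the reference text lives next to the code it describes, so the two are more likely to be updated together.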

The How / Examples / Guides

How do you install the project and get some basic code running?
What steps do you take to build a simple application using this project?
What steps do you take to address common problems the project addresses?

This is usually the area with the most room for improvement, and the easiest one for someone new to the project to jump into.

This part of the documentation holds a user's hand and walks them through each step in a clear way, leading them to the nirvana of working code that solves a real issue (or at least makes it clear how it could be applied to one).

You can write this type of documentation by building a small thing using the project and taking careful notes at every step about what you're doing, so that somebody else could follow along by just copy-pasting. You can easily turn this into a guide that takes users from total beginner to a small win and an aha-moment.
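One way to keep copy-paste steps like these honest is to make them executable. The sketch below uses Python's standard `doctest` module; the `slugify` function is a hypothetical stand-in for whatever the project you are documenting actually provides.

```python
import doctest
import re


def slugify(title):
    """Turn a post title into a URL-friendly slug.

    The examples below double as documentation and as tests:

    >>> slugify("3 Essential Components of Great Documentation")
    '3-essential-components-of-great-documentation'
    >>> slugify("  Hello,   World!  ")
    'hello-world'
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


if __name__ == "__main__":
    # Re-runs every `>>>` example in this file and reports any mismatch.
    print(doctest.testmod())
```

If a later change breaks one of the examples, running the file flags it, so a beginner following the guide is less likely to hit a step that no longer works.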

Most people coming to a project are total beginners in it, so writing for other beginners from a beginner's perspective is immensely valuable. People with more experience in the project may actually have trouble seeing from a beginner's perspective so they usually really appreciate this type of contribution. I think this type of documentation goes a long way in making projects more approachable.

I'd love to read other people's opinions on what makes good documentation here or on @nonlinearnygma's post. Do you have a love-hate relationship with documentation? Any favorite #nicedocs?

Latest comments (31)

Patrick Ziegler • Sep 28 '19

I love how this is written in the perspective of writing documentation as a user instead of a maintainer or regular contributor!

Eli Bierman • Jun 14 '18

I'm glad it helped :) I didn't expect it to be such a popular topic; I'm glad you brought it up for discussion! It's been so cool seeing different people's perspectives on it.

Juan Koss • Jun 14 '18

Good job, bro! It's a really helpful post for all readers.

Rubén Martín Pozo • Jun 14 '18

In my opinion, one of the main problems of documentation is how fast it gets outdated. In the past I tried things like Specification by example and "Living documentation" but it tends to be too expensive (time and money) to be effectively used in real projects.

Right now, we're using Guru as our documentation platform. It keeps the documentation small (very important to prevent it from getting old quickly) and reminds authors to check their articles every now and then to see if they are up to date. I recommend giving it a try. We're pretty happy with this solution.

Eli Bierman • Jun 14 '18

Thanks for sharing your experience! I've tried the Literate Programming technique in the past, but also found it very difficult to maintain.

Guru seems like a really interesting way of doing it; I'll check it out. My first impression is that it seems like it focuses on the experience of writing and using the documentation, which I like.

Cat • Jun 14 '18

I think the problem is that devs write docs the way they understand it. It's like someone giving you their class notes, but they write in shorthand or in analogies you don't exactly understand.

I absolutely appreciate the tag #explainlikeimfive, because I personally need things to be explained in the most basic layman's terms possible.

Sometimes people don't want to debase themselves to that of a child, mostly due to pride (and selfishness).

There are some designers (like me) who want to understand how to code better so that we can collaborate with engineers to streamline building a new, functional, and user-friendly product.

I have yet to find a #nicedoc. :(

Thank you for writing this post! It was a great read!

Eli Bierman • Jun 14 '18

I think the problem is that devs write docs the way they understand it.

Totally! Thanks for pointing that out. Writing documentation requires a shift in perspective and some detachment from the internal implementation of the code. Even the documentation links I included I think all suffer from this issue.

I wish developers would be more open to inviting other people into the documentation process. The best documentation can only be written by involving people that don't have an intimate familiarity with the codebase. Ideally the terminology used in the code itself reflects the way users actually talk about the concepts involved.

I hope you find a #nicedoc some day!

Cat • Jun 14 '18

You just gave me a brilliant idea. Brb brainstorming.

Eli Bierman • Jun 14 '18

Excited to see what you come up with :)

Jannis Jorre • Jun 13 '18

I really like your approach to documentation. I have to disagree on one point though: comment-generated docs. Generally I think that comments in code are a big no-go. I've never come across a case where good naming and clean code can't replace them.

Don't get me wrong, on projects others have to use I highly value good guides - but method-by-method explanations are either necessary in case of "dirty" code, or unnecessary in case of clean code.

That's my opinion. :)

Eli Bierman • Jun 13 '18

I know a lot of people agree with you on the comments too. :) I think comments in the code that repeat the code can be a burden to maintain, but I like to basically have a comment for each public function or type/object that briefly explains what you would use it for. Or to describe any non-standard code or design.

Jannis Jorre • Jun 14 '18

In my experience I have never come across a case where a comment was necessary. Feel free to give me an example. I like to challenge this thought. There's one exception that I can imagine, that is "unclean" code, that is necessary for performance purposes. But other than that...? Open for a challenge! ;)

Matheus Mohr • Jun 14 '18

I had the same opinion for a very long time, until I started paying attention to some places where a simple Javadoc saves a good amount of unnecessary research. Taking 2 quick examples that come to my mind:

Wildfly Swarm, an application server meant for micro-services, in its prior implementation where you had to create a class to configure your server, had a "swarm.createDefaultDeployment()" method. It is a method that makes absolute sense if you consider that it does exactly that, but... what is a default deployment? What would this method's name be if I explained its details using its name?
Jsoup, a library to parse HTML in Java, has a method that returns an element's index within the element list it's part of. Considering the domain it belongs to, the method "elementSiblingIndex()" makes sense on its own, especially if you're familiar with the terminology used in HTML handling, but again, how self-explanatory could/should it be without getting a paragraph?

I'm always against any kind of extreme measures, which goes for both "no docs in methods" and "document every public method you have". Documentation is meant to make everything clearer and help other developers, and I think that the "Clean code doesn't need docs" idea, although great on its own, ends up leading to confusing code that, even though it made sense when it was created, now leads to the "what is a default deployment after all?" kind of question.

And yes, I always love debating this, so let's go haha

Jannis Jorre • Jun 14 '18

I just want to clarify upfront: I didn't say no docs are necessary. I just said that method-by-method docs are unnecessary, if the naming and code is done well.

Here's my attempt to "fix" the above issues:

I think, while a method like this is very nice to have, it actually does not state its intent very well. I don't know what it actually does, but a name like createDeploymentWithXAndY() would describe it well. If you absolutely need a method like the above, it can simply "proxy" a method named like mine. When looking at the code of createDefaultDeployment(), you then know that it creates a deployment with X and Y. Another perk: you have another way to call it, which will enhance the usability of the interface.
This one is a bit more complicated, as I am not sure what it's supposed to do. The first improvement that came to mind was: getElementIndexAsSibling(). This still doesn't make me happy. So I'd suggest: getElementIndexInContainingElementList(). This name is relatively long, but since it describes very well what it's doing, it fulfills its main task very well. Some people might add an "s", making it "...ElementsIndex...", but that is fine-tuning based on preference, I'd say.
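A minimal sketch of the proxy idea from the first point, written in Python although the thread is about Java. The two features below are placeholders for "X" and "Y" and are assumptions made purely for illustration; they are not what Wildfly Swarm's default deployment actually contains.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    # Hypothetical fields standing in for the "X" and "Y" in the comment.
    includes_app_classes: bool
    includes_web_resources: bool


def create_deployment_with_app_classes_and_web_resources():
    """The long, descriptive name carries the explanation."""
    return Deployment(includes_app_classes=True, includes_web_resources=True)


def create_default_deployment():
    # The one-line body tells a reader of the source what "default" means.
    return create_deployment_with_app_classes_and_web_resources()
```

Note that this only helps someone who opens the source of `create_default_deployment`; a user reading generated reference docs or IDE hints would still see just the short name.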

If you have any issues with these solutions I'd be happy to be challenged with those. :)

Matheus Mohr • Jun 14 '18

Indeed, and I agree that method-by-method documentation makes as much sense as no documentation (I know that searchUserByName searches a user by its name ffs..)

Considering the examples, your solution to the second case is great, can't argue there, but when it comes to the deployment scenario, things get ugly.

Your ideas there lead to a spaghetti code structure that I've seen multiple times and is mainly what led me to question the statement "Clean code makes method-docs unnecessary".

Think about what's involved in a deployment and the amount of tweaks you can do. Now let's consider the defaultDeployment option. I'm not saying you CAN'T do exactly what you said, but how many proxy methods with 1-2 lines would you need to create in order to "keep it clean"? And how much better would your code actually be after that? Would that amount of code and methods do a better job for your project than, say, a comment that explains what a single method does? Also, is your increased interface usability actually going to be of any use? Maybe your 3 new methods, all with a very clear signature, are never going to be used again (just like tons of generic code out there). And I say that based on multiple projects I've seen and worked on, with exactly that kind of structure, where a proxy method was created for clarification, but 2 weeks later a new method was created, proxying the former, and eventually you end up with a 10-method chain, with 8 of them having fewer than 3 lines in them, and that's where I ask myself "is this really better than a 1-2 line doc on a single method with, say, 3 parameters?". So far, my answer is no.

Considering my last 3 months of work (brand new project, from scratch), I've written very few method docs, but when I consider the amount of proxy methods I'd have to create in order to clarify what my 20-30 words doc does (including some parameter options), I can't see how that would be better.

Summing it up, I think that renaming your methods should always be considered ("are the method names clear about their intentions" was actually one of our PR checklist items), but it shouldn't be used as an argument to completely avoid method-specific docs, especially considering complex domain-specific scenarios.

Also wanna mention that indeed, you never actually said "Nobody should document their methods, ever", but your statement on "In my experience I have never come across a case where a comment was necessary." tends to lead less experienced developers to go full no code-docs, and that shouldn't be a thing (trust me, it has happened before with "Clean Code" book).

Eli Bierman • Jun 14 '18 • Edited

I think it's really important with documentation (just like with any writing) that you are really clear with who your audience is.

Usually the only people that will see code comments are maintainers of the code, so I think anything that you would want to tell another maintainer that you would work with should be communicated in comments. This is a personal preference that I think is based on not always knowing who would work on my code after me. But if code is successful, hopefully somebody you don't know will try to extend it and not have to ask you questions about it directly.

Comments on public functions or methods are usually for users of the library, who probably don't want to look into the source code to find out what a function does. The function name might be descriptive, but what about the parameters it takes, or the side effects it might have? Strongly typed languages can help somewhat with this, but I think these important details should always be communicated somehow to the user in the documentation.
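Here is a small sketch of what I mean, in Python. The `save_report` function is hypothetical and written only for this comment: its name is reasonably descriptive, but the name alone cannot tell a user what the parameters expect, that it touches the file system, or when it fails.

```python
def save_report(rows, path, overwrite=False):
    """Write `rows` to `path` as tab-separated text and return the row count.

    Parameters:
        rows: iterable of sequences of strings, one sequence per output line.
        path: destination file path.
        overwrite: if False (the default), refuse to replace an existing file.

    Side effects:
        Creates the file at `path`, or replaces it when `overwrite` is True.

    Raises:
        FileExistsError: if `path` already exists and `overwrite` is False.
    """
    mode = "w" if overwrite else "x"  # "x" fails if the file already exists
    count = 0
    with open(path, mode, encoding="utf-8") as handle:
        for row in rows:
            handle.write("\t".join(row) + "\n")
            count += 1
    return count
```

A type checker could describe the shape of `rows`, but the overwrite behavior and the error case still need to be written down somewhere the user will see them.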

Users often don't ask questions about things they are confused about; they just move on to something easier to understand.

An example of a code commenting style I strive for can be found in Fossil SCM's implementation of its internal scripting language TH1. It starts with a brief overview of what the file contains. Each function has a comment that answers the question "why does this function exist?" Check out the comment on an internal Buffer type. Here's an example of a comment that you might see as redundant, but another maintainer might see it as saving one precious minute of their time trying to figure out the purpose of that code.

Jannis Jorre • Jul 2 '18

First of all, sorry for such a long silence, I've been meaning to answer for quite a long time now... But I wanted to do it right and properly.

@mohr023
My solution to the second problem was not ideal, I admit that. Your answer still makes a wrong assumption. Let me explain what I mean, and my attempt at a better solution.
Your answer relies on the assumption that by proxying some method, you end up creating a culture of proxying. But if everyone actually writes clean code (which, since this is a thought experiment, we can assume), somebody will notice the chain of method proxying and refactor the necessary code to be clean again. That would probably end up in my next proposed solution. You also mention that increased usability, or at least increased size of the interface, is not helpful or necessary. If this is the case, which it surely can be, I would still use proxying and simply call a private method in such a case. This still makes it instantly clear what the default method does (since the next method call explains it and is the only one in the default method's body).
You make a very good point about the many options though. My previously suggested solution does not scale very well, that's true. For such a highly configurable entity, I like to use the builder pattern. Using the builder pattern you always start off with a default state, but can customize whatever is necessary or desired. I would assume that in the case that even the builder pattern does not scale enough, I'd probably check whether I can somehow split the code further. Otherwise I'm out of ideas as well.
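A minimal sketch of the builder pattern described above, in Python rather than Java. The `Deployment` settings (`name`, `port`, `debug`) and their default values are assumptions invented for this example and do not come from any real application server.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Deployment:
    # Hypothetical settings; the defaults here define the "default deployment".
    name: str = "app"
    port: int = 8080
    debug: bool = False


class DeploymentBuilder:
    """Starts from the default state and lets callers override what they need."""

    def __init__(self):
        self._deployment = Deployment()

    def with_name(self, name):
        self._deployment = replace(self._deployment, name=name)
        return self  # returning self allows calls to be chained

    def with_port(self, port):
        self._deployment = replace(self._deployment, port=port)
        return self

    def with_debug(self, debug=True):
        self._deployment = replace(self._deployment, debug=debug)
        return self

    def build(self):
        return self._deployment


if __name__ == "__main__":
    default = DeploymentBuilder().build()
    custom = DeploymentBuilder().with_port(9090).with_debug().build()
    print(default)
    print(custom)
```

Each `with_...` call names exactly one deviation from the default, so the calling code reads as a list of what was changed, and everything not mentioned stays at its default.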

@eli Bierman
Audience is key. That is totally true. I was mostly talking about in-project code, not code that would be used as a library. But if a library is very well written, I usually don't have to look at any documentation and can simply use my IDE's autocompletion to do what I want. Anything else will be faster via Google and StackOverflow or similar anyway than reading the docs. At least for me. But a specific project should have full documentation either way. While programming languages should also be very intuitive in their naming, they do require docs with a full walk-through, just to mention one relatively obvious example. There are always edge cases, and in German we have a saying: "Keine Regel ohne Ausnahme", which translates to "No rule without an exception", which I find really fitting for situations like this, and everywhere for that matter.

Partly in response to your (@eli Bierman) answer, but also generally, I just want to mention that whether and how much documentation is necessary is highly language dependent. Some languages work on a higher level, which generally makes them more readable for us humans, while others work on a lower level. Those lower-level languages generally need more docs, as even experienced developers might spend a few seconds figuring out what exactly a method does, even if the naming is ideal.

I actually write a lot of comments. But I also delete most of them. Very rarely does one stay in the code after I have worked on it and somebody has reviewed it. By explaining what a method does in a comment, which feels less restricted than a name, I can refine the comment to derive a very good name for the method or whatever it is I am trying to name or comment on.

In the end it does come down to preference though. I myself would rather look at the code and understand what it does than read docs, while for others it might be the other way around. Both are valid. And if your target audience is bigger, I'd opt for documentation on public methods. Just so that everyone can be happy.

Matheus Mohr • Jul 2 '18

Exactly, I don't think we'll ever come to a single option that fits every case, but I guess that in the end, it's still valid to use comments only as a last resort, you know, not using them as an excuse to leave ugly code behind.

As you mentioned about preference, I actually prefer when a library I'm using has in-built docs, so that I can quickly ctrl+space and confirm that the input, process and output are what I expected, but that doesn't mean I'm filled with joy when I see a "createDefaultDeployment" method and have to read its docs piece by piece just to get going.

You make some very good points though, helped me consolidate a few thoughts on comment using and refactoring, thanks for taking your time going through them.

Eli Bierman • Jul 3 '18

This is such an interesting conversation. When I was working primarily with Objective-C in Xcode, I was very comfortable using other code that didn't have any documentation, just a descriptive name like pushVideoFrameToBuffer, and I felt prepared to use it.

Now I usually work in dynamically-typed languages without an IDE, so that definitely biases me towards a small number of concisely named public methods, because it is easier to remember what to type (and for me, easier to read too). Usually those methods are configurable by passing in arguments, like the builder pattern you mentioned @jeyj0 . It's interesting that design patterns are more constant across languages than naming patterns...

My current primary language Elixir lets you read each public method's comment from the interpreter, so it feels comfortable to refer to documentation to answer my questions about a method's usage. If I come across a library in Elixir with minimal documentation and long descriptive method names, it feels uncomfortable to use and I will probably look for a different one. The total opposite of what I expected in Objective-C.

Our code has to be a chameleon and blend into the surroundings. :)

Hiroto Fukui 🐶 • Jun 13 '18 • Edited

Most people coming to a project are total beginners in it, so writing for other beginners from a beginner's perspective is immensely valuable.

I often find myself unsure where to start explaining from.
People who come to read your document have different technical backgrounds, and I've never felt confident that I wrote a document in which they could all find the information they wanted.

Eli Bierman • Jun 13 '18

Yeah, it can be difficult to find where to start. I think examples and guides are usually a good place to start. If you or one other person has a question (even if it feels like a silly question) or isn't sure how to do something, you can basically guarantee that somebody else has the same question. Something is better than nothing, even if it's just adding a brief note.

At the beginning of a guide you can say what you expect readers to already know, and what the guide will show them from there. That way people know if it's relevant to them before they read the whole thing.

MOP Technology • Jun 13 '18

if you really wanna see crappy documentation - feel free to implement dynamic yield :-)

in terms of good documentation, i think amazon web services are currently state of the art.

Eli Bierman • Jun 13 '18

Thanks for the contribution, I added it to the list! AWS is a really great example of how it's hard to have too much documentation. Especially if they are organized as well as the AWS docs are. And now I know to stay away from dynamic yield :)