I've been writing code for 20 years. During that time I've worked with 17 teams coding in different languages to build hundreds of projects. These include everything from a simple blog site, to APIs supporting 3,000 requests/second, to top-selling apps.
From these experiences, combined with the books I've read, it's become apparent to me what matters most in code: readability.
On the surface, readability may seem subjective. Something which may vary between languages, codebases, and teams. But when you look underneath, there are core elements within all code which make it readable.
Many programmers are too close to the computer. If the code runs, nothing else matters. Although a common defense, it removes all of the human elements from what we do.
Over the last several months I've worked to distill these elements into 10 practices for writing code with a focus on improving readability and decreasing complexity. I've written about these in detail and applied them to real-world code snippets in BaseCode.
Many will unfortunately dismiss these as too trivial. Too fundamental. But I assure you, every bit of bad code I've encountered has failed to apply these practices. And in every bit of good code you find one, if not many, of these practices.
So much energy is wasted on formatting. Tabs versus spaces. Allman versus K&R. You'll reach a point where you realize formatting is not what matters in programming. Adopt a standard format, apply it to the codebase, and automate it. Then you can refocus that energy on actually writing code.
All those commented blocks, unused variables, and unreachable code are rot. They effectively say to the reader, "I don't care about this code". So a cycle of decay begins. Over time this dead code will kill your codebase. It's classic Broken Windows Theory. You must seek and destroy dead code. While it doesn't need to be your primary focus, always be a Boy Scout.
The foundation of nearly all code is logic. We write code to make decisions, iterations, and calculations. This often results in branches or loops which create deeply nested blocks of code. While this may be easy to track for a computer, it can be a lot of mental overhead for a human. As such, the code appears complex and unreadable. Unravel nested code by using guard clauses, early returns, or aspects of functional programming.
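As a minimal sketch of unraveling nested code with guard clauses and early returns, compare the two functions below. The order structure, the free-shipping threshold of 50, and the function names are hypothetical, chosen only for illustration.

```python
def shipping_cost_nested(order):
    # Nested version: the reader has to hold three conditions in mind
    # before reaching the one line that matters.
    if order is not None:
        if order["items"]:
            if order["total"] < 50:
                return 5.0
            else:
                return 0.0
        else:
            return 0.0
    else:
        return 0.0


def shipping_cost(order):
    # Guard clauses deal with each edge case up front and return early,
    # so the main path sits at the lowest indentation level.
    if order is None:
        return 0.0
    if not order["items"]:
        return 0.0
    if order["total"] >= 50:
        return 0.0
    return 5.0
```

Both functions return the same results. The second simply reads top to bottom without nesting.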
Despite the current era of Object Oriented Programming, we still have Primitive Obsession. We find this in long parameter lists, data clumps, and custom array/dictionary structures. These can be refactored into objects. Doing so not only formalizes the structure of the data, but provides a home for all that repeated logic which accompanies the primitive data.
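To illustrate, here is a rough sketch of a data clump (an amount and a currency passed around as two loose primitives) refactored into a small object. The Money class and its methods are hypothetical, not taken from any particular codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Replaces the (amount_cents, currency) pair that used to travel together."""

    amount_cents: int
    currency: str

    def add(self, other: "Money") -> "Money":
        # The currency check now lives in one place instead of at every call site.
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount_cents + other.amount_cents, self.currency)

    def format(self) -> str:
        # Formatting logic that was repeated wherever the primitives were printed.
        return f"{self.amount_cents / 100:.2f} {self.currency}"
```

A call such as `Money(1050, "USD").add(Money(250, "USD")).format()` then replaces several lines of arithmetic and string handling on raw integers and strings.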
While I don't adhere to hard numbers, code blocks can reach a critical length. When you determine you have a big block of code, there's an opportunity to recognize, regroup, and refactor the code. This simple process allows you to determine the context and abstraction level of the code block so you can properly identify the responsibilities and refactor the code into a more readable and less complex block.
Sure, naming things is hard. But only because we make it hard. There's a little trick which works well with many things in programming, including naming - deferral. Don't ever get stuck naming something. Just keep coding. Name a variable a sentence if you must. Just keep coding. I guarantee by the time you complete the feature or work a better name will have presented itself.
This single practice was the original game changer for me. It's what put me on the path of focusing on readability. Despite my efforts to explain, there's always at least one person who hates me for it. They have that one example where a comment was absolutely necessary. Sure, when the Hubble telescope telemetry system has to interface with a legacy adapter by returning 687 for unknown readings, then that may need to be communicated with a comment. But for pretty much everything else, you should challenge yourself to rewrite the code so it doesn't need a comment.
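One way to take up that challenge is sketched below: the intent that used to live in a comment moves into names. The user dictionary, the age threshold, and the function names are assumptions made up for this example.

```python
def is_eligible_before(user):
    # check that the user is an adult with a verified email
    return user["age"] >= 18 and user["email_verified"]


ADULT_AGE = 18  # hypothetical threshold, named instead of explained


def is_adult(user):
    return user["age"] >= ADULT_AGE


def has_verified_email(user):
    return user["email_verified"]


def is_eligible(user):
    # The condition now reads like the comment it replaced.
    return is_adult(user) and has_verified_email(user)
```

The behavior is unchanged. The difference is that the names cannot drift out of date the way a comment can.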
We return the oddest values for things. Especially for boundary cases. Values like null. In turn, a lot of code is written to handle these values. In fact, the creator of null calls it The Billion Dollar Mistake. You should aim to return a more reasonable value. Ideally something that allows the calling code to carry on even in the event of a negative path. If there are truly exceptional cases, there are better ways to communicate them than null.
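A small sketch of the idea, using Python's None in place of null: a lookup that returns an empty list lets the caller carry on without a special case. The order records and function names here are hypothetical.

```python
def find_orders_or_none(orders, customer_id):
    # Returns None when nothing matches, so every caller needs a None check.
    matches = [o for o in orders if o["customer_id"] == customer_id]
    return matches or None


def find_orders(orders, customer_id):
    # Returns an empty list when nothing matches.
    return [o for o in orders if o["customer_id"] == customer_id]


def order_total(orders, customer_id):
    # No special case needed: summing an empty list is simply 0.
    return sum(o["total"] for o in find_orders(orders, customer_id))
```

With the first version, `order_total` would need an extra branch just to avoid iterating over None.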
Rule of Three
Think of a mathematical series of numbers. I provide you with the number 2 and ask, "What's next?" Maybe it's 4, but maybe it's 2.1. In reality you have no idea. So, I provide another number in the series 2, 4 and ask, "What's next?" Maybe it's 16. Again, despite our increased confidence we don't really know. Now I provide another number in the series 2, 4, 16 and ask, "What's next?" Now with three data points our programmer brains see the squared series and determine the next number to be 256. That's the Rule of Three.
The example demonstrates, without distracting us with code, that we shouldn't predetermine an abstraction or design right away. The Rule of Three counteracts our need to fight duplication by deferring until we have more data to make an informed decision. In the words of Sandi Metz, "duplication is far cheaper than the wrong abstraction."
Now for the final practice and one which gives any bit of code that lasting touch of near poetic readability - symmetry. This is pulled straight from Kent Beck's Implementation Patterns which simply states:
Symmetry in code is where the same idea is expressed the same way everywhere it appears.
This is easier said than done. Symmetry embodies the creative side of writing. It underlies many of the other practices: naming, structure, objects, patterns. It may vary language to language, codebase to codebase, and team to team. As such, you could spend the term of your natural life pursuing it. Yet, once you start applying symmetry to your code, a purer form appears and the code takes shape quickly.
These were a high-level view of the practices within BaseCode. I encourage you to check out the resources linked in this post, watch screencasts applying these practices, or read about them in full detail applied to real-world code snippets in BaseCode.
Top comments (45)
You might find "Art of Readable Code" interesting. Slightly different take on readable code but related to your post.
Thank you for a book reference!
Nice. I'll definitely check it out.
Except for removing comments, the article reflects my thoughts pretty well, too. Code full of comments is arguably just as bad as code with no comments at all. But trickier parts of code need to be commented. No matter how obvious the implementation might be, it's important to document your intention. You never know when you'll get something wrong and if your intention is not written down, it will be impossible for the next person to determine when something goes wrong if it's because of the requirements or the implementation.
Also, while I have just found out about the rule of three, it turns out I have been advocating in its favour for years.
I've found a better way to pitch "remove comments": When you see a one-liner comment, there are two opportunities for its removal:
In this way our comments become reflected in the code itself, making the code more readable/understandable, more resistant to comment rot, and more meaningful to tools like IDEs.
One more statement: Not every comment can/should be removed, this is not absolutist advice.
I would challenge, "why is the code tricky?"
It sometimes just is.
You can have something as simple as:
return x / 3;
and in the absence of a comment you can't tell whether that was supposed to divide by 3 or by anything else.
Of course, in this particular case you can use a suitable method name and not use a magic number. But I think you get my drift, sometimes the business logic itself is not self-explaining (i.e. explaining why a null check is present where it's not obvious why and whatnot).
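For what it's worth, the method-name-plus-constant version of that line could look something like the sketch below. The installment scenario and both names are invented for illustration, since the original snippet gives no context for the 3.

```python
INSTALLMENT_COUNT = 3  # hypothetical business rule: payments are split three ways


def installment_amount(total):
    # The name and the constant say why we divide, not just that we do.
    return total / INSTALLMENT_COUNT
```

This covers the "what" of the division. Why the business chose three installments may still deserve a short comment.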
A good rule of thumb is to comment about the 'why' and not the 'how' or 'what'. This means that the comments that are left are the ones that aren't obvious from the code.
Of course, as with anything there are exceptions, such as dealing with magic values from external sources or one that I've had to do recently: comment what a series of regular expressions were doing!
Another favourite of mine is when the production code has checks for something that's injected only for tests. To add insult to injury, those tests are usually called... wait for it... unit tests!
Otherwise, yes, that looks like a sensible rule.
Just a quick question, let's assume we have a method such as getUser(email, password). The return type is User. Assuming that there is no user matching that email and password (a very possible situation), what would you return if not null, to indicate there is no user?
Depends. Generally, returning null for objects is common practice. Specifically, I'd want to know what a method like getUser(email, password) does. It seems like it has more to do with login, in which case there's a lot that could be improved for readability's sake (naming, returns, symmetry).
It was just an example, more generally what if the method searches for something in a database. What should it return if it can't find the required object?
Something that could possibly better represent emptiness. For example, many ORMs return an empty collection when a query yields no results.
I see. So basically, this is something that needs to be thought about from the design level up, and not a simple change that can be done to a method (in most cases at least). I am familiar with the practice of returning an empty list rather than null. Definitely reduces the number of null checks that have to be done in other parts of the code.
Methods that return collections should never return null, ever. At the very least, an empty collection is returned. For plain objects, employ the Null Object Pattern. In the case of Java, the Optional class is what you should be looking for. 🙂
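A minimal sketch of the Null Object Pattern for the plain-object case, written in Python rather than Java. The User and GuestUser classes, the in-memory USERS store, and find_user are all hypothetical stand-ins for a real lookup.

```python
class User:
    def __init__(self, email, name):
        self.email = email
        self.name = name

    def is_guest(self):
        return False


class GuestUser(User):
    """Null object: stands in for 'no matching user' with safe defaults."""

    def __init__(self):
        super().__init__(email="", name="Guest")

    def is_guest(self):
        return True


# Hypothetical in-memory store standing in for a database.
USERS = {"ada@example.com": User("ada@example.com", "Ada")}


def find_user(email):
    # Always returns a User, never None.
    return USERS.get(email, GuestUser())


def greeting(email):
    # The caller carries on without a None check.
    return f"Hello, {find_user(email).name}!"
```

Callers that do need to know whether a real user was found can still ask with `find_user(email).is_guest()`.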
This is exactly what I was looking for, how to handle plain objects. Thank you, Andre.
Will definitely be implementing these where possible.
There is the null object pattern for this situation, which returns something that shouldn't break your existing code and only requires minimal changes to your methods which return objects.
Remzi Arpaci-Dusseau, my operating systems professor, stressed that maintainable code should be self-explanatory as much as possible. This touches naming, formatting, architecture, block size, basically anything that contributes to code appearance. That principle definitely aligns with your list.
Sounds like a good professor.
Empathy driven development... When everyone cares about the next reader of the code, amazing things start to happen within a team/organization.
The goal should be to reduce debugging time, not increase writing speed.
"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." (M. Fowler)
Good quote from a good book (Refactoring).
"Data clump" is such a good term. So evocative 💯
I know I'm guilty of over-using null as a return value - and symmetry, especially in a large codebase, can be so elusive. Thanks for this high-level cheatsheet; good to keep these things top-of-mind! 🙏
I usually like languages that allow some optional type (like Rust, or C++ through union types). It's a good way to indicate that "this piece of code may return a value only when all the conditions for it are met". The same goes for other elements.
This article is good and it matches perfectly with my obsession with semantics: the meaning should be directly observable from the code.
Yes pretty much agree on all.
One of the reasons I like Go is that its compiler, linter and idiomatic rules solve most of these issues and more. It lets the dev focus on the business and less on these development issues that are long solved but still aren't applied everywhere.
Oh yeah, Go is awesome with the predefined formatting and more expressive grammar. Much like Python or Ruby in those respective areas.
Thank you for this article - I am working on a new project, implementing an API and was struggling with what to return for some of the operations. Your last practice - Symmetry in code is where the same idea is expressed the same way everywhere it appears - helped me settle on what I think will be the best approach.
My real estate agent told us when bidding on a house, as you're coming up and the seller is coming down, once you get the same price back from the seller three times, it's take it or leave it. The Power of Three...
Hi Jason, it might be a little bit off topic... but I'm glad to notify you that your article has reached the top 3 on the Daily Now live feed. It means that thousands of developers from all over the world are being exposed to your content every time they open a New Tab on either Chrome or Firefox. Good job!
In case you haven't heard of Daily Now yet, you can check your ranking here: dailynow.co
Cool man. Thanks for letting me know.
Nice article Jason!
I, too, think that readability is one of the most important qualities of the code, if not the topmost one.
I wrote an article about that.
Yes, yes, yes, and... Yes!
All I have to say is read the book 'code complete' amazon.com/Code-Complete-Practical...
Great book. Believe it's on The Reading List. But while there are elements in each, no single book discusses all these practices. Heck, that's why I wrote one. :)
This one's a staple as well (if you're applying OOP): amazon.com/Object-Oriented-Design-...
Great article -- I think some of the "junior" devs need to read up more on articles like these!
Pick up a copy of BaseCode for the team!
This is a great article! Can it be translated into Chinese?
👌 Thank you ! I will share the Chinese link.