EDIT: Rather like a code review itself, my peers have brought up some very good points on the comments section and Twitter. If you've already read this post, see my notes in the EDIT
sections herein.
Also, read Code Review Guidelines by Philipp Hauer
Nearly any healthy programming workflow will involve code review at some point in the process. This may be a Pull Request on GitHub, a Differential Revision on Phabricator, a Crucible Review on Atlassian, or any number of other review tools. But however you do it, not all code reviews are created equal.
At MousePaw Media, we have a strictly enforced workflow that includes a mandatory pre-commit code review. These have helped us catch many bugs and sub-optimal code.
Yet many interns are afraid to do code reviews, fearing they have little to contribute, especially when reviewing code written by developers who have been there much longer than they have! Meanwhile, the quality of code reviews - even my own - can vary greatly depending on many factors: familiarity with the code, time of day, time of day, you name it.
I've compiled thoughts and notes on code reviews from the last few years into a guide, which I published on our staff network documentation. I wanted to share the result (slightly adapted to dev.to).
I must give credit where credit is due! I drew a lot of inspiration from Top Ten Pull Request Review Mistakes by Scott Nonnenberg, Doing Terrible Things To Your Code by Jeff Atwood, and Giving and Receiving Great Code Reviews by Sam Jarman.
Who Can Review?
As I said, it can sometimes be daunting to review someone else's code, especially if that person has more experience, expertise, or seniority than you do. But don't be afraid! When everyone participates in code reviewing, everyone wins! Here are a couple of helpful things to remember.
First of all, everyone makes mistakes, and we know it! When a coder knows he or she will be code reviewed, it's like a safety net: they can more easily relax and code, knowing that another set of eyes will be reading this code before it's considered "done".
Second, everyone learns from a code review. The author gains additional insight on how to improve their code; the reviewer can learn new techniques and ideas from the code they're reviewing; the bystanders get the benefits of both.
In short, don't be afraid to contribute feedback! Code reviewing can be one of the most valuable contributions you can make to a project.
That said, what goes into a good review?
Principle #1
The first and foremost principle of a good review is this: if you commit to review code, review it thoroughly! Expect to spend a decent amount time on this. Be sure to read the code, don't just skim it, and apply thought to both the code and its style.
In general, if you can't find anything specific to point out, either the code is perfect (almost never true) or you missed something. Thus, you can use this as a fairly accurate measure of how well you reviewed the code.
Aim to always suggest at least one specific improvement to the code (not just style) on the initial review. Follow-up reviews may not require this; otherwise we'd never land code!
EDIT: Especially if the code change is small, virtual perfection is absolutely possible. There may be reviews where no changes are needed at all, but you should be confident you put in the effort to actually arrive at this conclusion.
Principle #2
This goes hand-in-hand with the second principle: aim to understand every changed line. Research things you don't understand. Ask questions.
There are three major reasons why this is important:
In truly elegant code, simple is usually better than complex. The only way to know if the best solution is being used is to understand the current solution.
Other people may need to read this code. If you are having trouble understanding the code, it may need to be refactored, cleaned, or better commented.
The more knowledge you have, the better your code and reviews will be!
When you're done, you should be able to answer two following questions for yourself:
"What goal does this code accomplish?"
"How does it accomplish this goal?"
If you cannot answer both questions, you don't fully understand the changes!
EDIT: You do NOT necessarily have to understand the whole code base. Generally, you should assume that unchanged code works, and merely glance back at it to confirm that it is being used correctly in the changed code.
Principle #3
Don't assume the code works - build and test it yourself! You should actually pull down the code and test it out. On Phabricator Differential, code submitted for pre-commit review includes a Test Plan from the author.
Of course, when testing code, make sure you're building correctly. If the project has a build system, you should be able to use it. If the Continuous Integration system reported successfully building the code, you should be able to as well.
On this note, if the CI build failed, you should require that the code successfully build before it can be landed!
Once you've compiled the code, actually test it. At MousePaw Media, most of our projects have a tester that provides space for arbitrary code; you can use this to try things out.
You should also run the included automatic tests, don't leave it at this. Try to break the code! If you wind up finding cases the automatic tests could cover better, suggest that these cases be accounted for in the tests.
Once again, take a look at Doing Terrible Things To Your Code by Jeff Atwood for good testing tips.
EDIT: I may not have emphasized this enough, but trust the CI. The purpose here is to test the code outside of the automatic unit tests; in short, you're testing what the CI cannot test. If this doesn't apply, and there is truly nothing to manually test, don't waste your time.
Principle #4
Commenting matters. MousePaw Media developed and uses the Commenting Showing Intent standard, which means that roughly every logical statement should have a comment describing the programmer's intention for it. (See my article Your Project Isn't Done Yet for an explanation of why intent comments are important.)
Assuming you're working on a project that follows this convention, if you don't see an intent comment, you should request one to be added into the code. Then look for it before you approve. (If the project doesn't follow the CSI standard or something similar, consider proposing adoption of the standard for all future code.)
In regards to comments, it isn't enough just to have something there. Intent comments should actually describe intent. You should address any of the following problems:
The intent comment doesn't match the logic. This indicates that the comment, code, or both are wrong. We've caught many potentially nasty bugs this way!
The intent comment doesn't make sense. If the comment is confusing, it's as useful as no comment at all.
The intent comment has major typos. Grammar and spelling are important to meaning, especially when one doesn't know the audience. Will the next reader be English-as-a-Second-Language? Dyslexic? Just learning to code? Proper English is always easiest to decipher.
Principle #5
Review temporary code as strictly as production code. It can be shocking just how often temporary "patch" code and workarounds make it into production, and how much of it is never actually replaced. This is just a reality of real-world programming. Thus, we should hold all code to the same standards and expectations.
In other words, even if the code's solution isn't ideal, the implementation should be clean, maintainable, and reasonably efficient.
To put it yet another way, there is never an excuse for kludgy code.
Principle #6
Consider how the code will work in production. Design is important, and integration matters. How will this code function in the real world? How will it handle bad input and user error? Will it play well with the rest of the code base? In short, be demanding of the code. (See Principle #3.)
This ties in with Principle #5. It can be tempting to request (as the author) or grant (as the reviewer) grace for "unfinished" code, but therein lies a serious danger of shipping broken code!
If the code is broken, the user generally should not have easy access to it! An unfinished class may be marked as "experimental" and documented as such, thereby preventing a user from mistaking it for finished code. By contrast, a broken function should not be exposed in a non-experimental class.
Another way to look at this matter is this: if the code was shipped to end-users on the next commit, it may be functionally incomplete, but it should NOT be broken. In reality, this goal is rarely achieved, but the perspective will help prevent bad code from landing to your repository.
Principle #7
Check documentation, tests, and build files. Good code doesn't just include code, it includes all of the trappings that go with it. (Again, see Your Project Isn't Done Yet.
At MousePaw Media, we expect that every revision will contain all of the following:
Tests covering the new code. Review these as strictly as you do the code itself, to ensure the test will fail if there is a problem.
Documentation for the new code. The best documentation is written in tandem with the code itself. Don't accept documentation later; it should be present within the revision itself!
Build files updated for the changes. Any time code files are added, removed, or renamed, the build files need to reflect those changes. Similarly, if any dependencies have changed, the build files should reflect that too. This is one more reason why you should build the changes yourself (Principle #3).
README changes. The markdown files, such as
README.md
,BUILDING.md
,CHANGELOG.md
, and so forth should reflect the latest changes. In reality, these rarely need to be changed, but you should be sure they're up-to-date.
Principle #8
When reviewing, keep priorities straight when making suggestions.
Code should be...
Functional first,
Clean and maintainable second, and
Optimized third.
Code should ultimately achieve all three, but the order is important. If the code doesn't work, don't worry about style yet. Similarly, if the code is broken or poorly styled, optimization is only going to make things worse.
Principle #9
Follow up on reviews. After suggesting changes, you should be prepared to review it again. Ensure the necessary changes were made, and any problems you found were reasonably resolved.
Be sure to devote just as much attention to the follow up review as to the original one! Apply all ten principles anew.
Principle #10
Reviewing can be daunting, so it helps to remember that reviewers are not perfect! Issues may slip past you, bugs may evade detection, performance flaws may make it to production...in short, broken code happens!
If you are not familiar with the code or concepts, you may want to request that an additional reviewer provide feedback, but don't shy away from doing the review yourself! Ultimately, four eyes are always better than two.
If you do realize you've made a mistake in a review, the best thing you can do is own up to it. Raise a concern on the post-commit review system if appropriate, or else file an issue/bug report.
EDIT: One Twitter commentator pointed out another angle on this principle: keep your ego out of reviews! This isn't an arena for oneupmanship. If you go in with the intent to show your brilliance, tear down another coder, or otherwise beat them over the head with your experience, do everyone a favor and don't bother reviewing the code at all. A code review with ego attached is far worse than no review at all.
One Thing More...
At MousePaw Media, we actually have a strict revision checklist. Everything is expected to meet all these goals. Obviously, this is tailored to our particular project, but you might be able to take some notes for it and come up with your own.
When we first developed this checklist, I hadn't yet found A Code Review Checklist Prevents Stupid Mistakes by Blaine Osepchuk, but it's well worth a read!
So, for us, every revision must...
(1) Accomplish the feature(s) it was designed to accomplish.
We follow a rule of one-feature-per-revision. In some cases, the feature itself may be dropped, and only bugfixes and/or optimizations landed instead.
(2) Have merged all changes from master
into itself, and all conflicts resolved.
(3) Have binaries and unnecessary cruft untracked and removed. (Keep an eye on .gitignore
!)
(4) Compile and run properly - this should be confirmed via the CI system (Harbormaster/Jenkins in our case).
(5) Be free of compiler errors and warnings.
To the aim of #5, we compile all our C++ code with with -Wall -Wextra -Werror
).
(6) Be Valgrind pure (no memory leaks detected).
Once again, this is specific to our C and C++ code, but many languages have equivalents.
(7) Comply with the company's (or project's) Coding and Technical standards.
(8) Be free of linter errors.
We use cppcheck
for C++, and pylint
for Python.
(9) Be fully CSI commented.
Once again, see our Commenting Showing Intent standard.
(10) Have an up-to-date build script (CMake in our case) if relevant.
(11) Contain relevant tests.
(12) Have a Test Plan to aid reviewers in making sure your code works.
(13) Be reviewed, built, tested, and approved by at least one trusted-level reviewer.
(14) Have up-to-date (Sphinx) documentation, which compiles with no warnings.
We get the best results by not putting this off until later!
(15) Have all reviewer comments processed and marked "Done".
This is a side-effect of our particular review tool, Phabricator Differential, but you might request that all suggested changes be read and considered.
In Summary
These principles aren't actually new to MousePaw Media's workflow - we've been implicitly following them for some time - but I hope that by crafting this guide, we can achieve a more consistent application of them.
What code review principles does your project or organization follow?
EDIT: Please read the comments section on this, especially the superb comment by edA-qa mort-ora-y (and the conversation precipitating thereof). There are some valuable notes and alternative views on this topic that warrant consideration.)
Top comments (22)
Most of what we do is pretty ad hoc. Bookmarking this for the purpose of leveling up our game.
I'm going to agree in general, and if somebody finds they're lacking a process, this is a decent starting point.
My opinions differ on a few points:
+1 This is exactly what automated testing is such a powerful tool.
I think you make some valid points, and perhaps our processes better fit our organization than your project. At the same time, I would like to point out that "trusting the contributor" is very treacherous water indeed, because we get code blind. By way of example, I am the most senior developer at MousePaw Media, and the most familiar with the code, but I can point to many cases where an intern found a major flaw in my code, that would have been MUCH harder to catch had the code landed and shipped.
I know that reviewer time and effort is not inexhaustible (as Idan pointed out), but neither is the coder's time and effort. These practices are an investment. Far more time is spent trying to catch and fix shipped bugs than is spent catching them in pre-commit review to begin with.
As to the building step, remember that I said to trust the CI. Basic build problems should be caught there, and if it can build, any build problems on your end are basically your own. The reason I say to test is because automatic tests aren't perfect. Perhaps this is because, right now, we're mainly working in library and API design, but I have found in many cases that there is a MASSIVE gap between "passes unit tests" and "works in real life". These problems are only caught if someone actually tries to use the code.
I want to agree with and amend one other thing you pointed out - we can't all understand the entire code base. I certainly don't! But there is a difference between understanding the changes and understanding all the code. One can aim to understand all the changed code, while taking the unchanged code "for granted". Retrospect, I should have made this clearer.
For example, let's imagine the following is the only change in a file:
We might glance at the code for
cityDB.get()
to be sure it returns a pointer to something with the functionsname()
andtemp()
, but for the most part, we can just assume that these things are defined and work correctly. There's no need to fully read and understand this code to see that it is being used correctly.However, in fully understanding the change, we can spot an error: the third line of the loop block increments the loop iterator, meaning we're skipping every other city! Maybe this was translated from a
while
loop, or maybe the programmer's brain just ate a SPARC, but we can spot a problem that we'd have missed if we "trusted" the contributor too much.Furthermore, what if a casual glance at
cityDB
revealed an actual iterator class built into it? Using that would be far more efficient, and that would also warrant a helpful comment here.It's a tricky balance in practice. We don't have the time to understand everything. Yet, at our company, one project got indefinitely tabled because only one developer actually understood the code. That's never a good position to get one's organization into.
--
Lastly, yes, I know "find something to comment on" may be slightly overkill, but I hope basic discernment can speak into this. If we can actually say "this code needs no improvement," then we should do so and move on; however, we should be certain our comprehension of the code yields that conclusion, and we're not just jumping to it because we're lazy/tired/whatever. If we can't review it properly, we shouldn't be reviewing it at all.
I understand your concern about the product being useful. It's unfortunately common that programmer's produce things that don't actually work as intended, often because they didn't test it themselves, or there is a high-level compatibility. I have two approaches to get out of that environment:
I understand the problem you're solving with your approach. I agree you need a solution to the problem.
I guess trust depends on how well you know them. I primarily deal with a team I know. Our 3rd party contributions get a more rigourous review. But I don't mean about small details here, I mainly mean about the purpose of the fix. Certainly, even for code where I don't undrestand the goal I can still check several details of how it works. I can catch obvious failures even if I don't know.
For new vs. old code, yes, by all means assume the old code works. It's unfair to penalize pull requests because the old stuff needs improvement -- I even let some bad style slip through if it mimics the existing style. But there is some code that I just don't expect others to understand.
This is domain specific, and deals a lot with specialty algorithms usually. For example, I recently found a bug in the code I used to measure the length of vector paths. It took me a long time researching and finding the algorithms to begin with. Unless we want a reviewer to do the same research, and better, they simply would not have found the issue. They could understand the method names, and surroudning code, but the core algorithms present a bit of problem when it comes to reviewing.
For the same reason I just to have accept hacky workarounds #1 throuhg #7 on an Android target for our product. I assume the submitter did testing and research. I can verify the code is technically correct, ensure there's a manual test bit, but without spending lots of time I really can't say for sure if it's the correct approach, or even valid.
Idar Arye brings up a good point baout ROI as well. It's an unfortunate reality, that often it's more efficient, as a business, to ship buggy features (refer to Are we forever cursed with buggy software?. This is not an excuse though. The decision to trade priorites shouldn't be haphazard or done without thought.
Again, and this bears repeating: I agree code review should have rules and goals. I'm arguing only about some of the fine details here. Nobody should read this and come to the conclusion that the process is wrong. There are some details where I have alternate solutions, or have [hopefully] well reasoned objections.
Quality assurance is either a constant battle or it's being done wrong. :)
I concur. These practices may help catching problems, but they seem to have a very low RoI. The reviewrs' time and effort are not inexhaustible resources.
By the way (and separate from my rebuttal), I really do think you make some very good points that bear consideration - I just edited the post to draw attention to this thread.
Hi Jason,
thanks for sharing your principles. It's really interesting to see how others are doing code reviews.
You have many valid points. But we are less strict when it comes to documentation (code should be self-expressive without comments; only comment when it adds value to the code) and the build-and-test-it-yourself-thing (as you already pointed out, CI systems help here).
However, I experienced that the human aspects of code reviews are extremely important. Giving feedback can be hard and can lead to hurt feeling and relationships when done wrong. Constructive code reviews require a certain mindset and phrasing techniques. I experienced this multiple times in my carrier. That's why I focused on those points in our Code Review Guidelines. Check it out, they may be a good addition to your principles.
Excellent guidelines, @philipp_hauer ! I'll include a link to that in the edit section of the article, in fact. It's worth linking to.
Unrelated, but "self-expressive" code is only ever capable of expressing what it does, never the programmer's intentions (the code's "why"). That's why I recommend CSI so strongly. In years of using it in production, I've seldom encountered an intent-comment which did not add value to the code. In other words, "why" comments are practically always useful, while "what" comments are virtually never useful.
I know I keep using that word, but good code and good code review should focus on maintainability.
Code that is at a metaphorical 90% of perfect quality already gives you a high maintainbility, and that can usually be achieved with only a reasonable amount of effort. Putting more effort into it can get it up to 99% quality, but the ROI of that addditional (and typically not slight) effort tends to be far lower. Most importantly, given that most code serves a business need, the low ROI of perfectionism tends to harm the economic maintainability of that business.
Maybe so, but when you're working in open source software, all those dynamics get turned upside-down. If outside contributors can't understand the code, it isn't maintainable. (And that's what we're aiming for with this.)
Oops! I retract that comment, Alain. You are 100% correct. I read this backwards.
me takes a long swig of coffee
The only downside to relying on tests for this is that you have to leave the source to work it out, which greatly reduces your speed at learning the code.
I cover all these topics, including 'what vs. why' and 'comments vs. naming,' exhaustively in...
To Comment Or Not To Comment?
Jason C. McDonald
Wonderful article, I absolutely share It!
The only point that I disagree is principle 4 because I don't like comment, your code needs to be clear to all, clean for a good code review.
Don't use comments please
There are people who disagree w/ commenting in general, but the proof is in the pudding. Our ROI on intent-commenting has been tremendous: we've saved so much time and caught many bugs using them.
Read Your Project Isn't Done Yet for a full explanation of why intent comments are so vital to good code.
My boss told me, on the subject of code reviews: "I always assume you're wrong. It's up to you to prove otherwise."
For some excellent continued reading, see...
10 Lessons Learned Conducting Code Reviews
Jacque Schrag
Was this duplication done on purpose for irony? If so, π.
It actually wasn't! But maybe it should have been... :P
Well written and covers the topic nicely!
Some comments may only be visible to logged-in visitors. Sign in to view all comments.