loading...
Cover image for How to create effective Pull Requests

How to create effective Pull Requests

mpermar profile image Martín Pérez Updated on ・10 min read

Pull Requests, from now on referred to as PRs within this article. That unsexy but necessary step that sits between your wonderful shiny new commit and your master branch. PRs are unsexy because we actually know that our code is neat and tidy, and we don't want those evil code reviewers to come up with all those useless and nasty comments about our beloved codebase, or do we?

Of course, we do. Code reviews are not evil, in general :) Code reviews are a way to spread knowledge across the team and also help to get several extra pair of eyes into your code. Some people may argue that pairing is an alternative to a code review. And that is right, as long as you are good with only one person of the team knowing about your piece of code. And yet, you might pair with someone for a few hours and then that person can still come up with something totally different to what you worked initially in. So, code reviews are an opportunity to make sure everyone is on the same page and also to get quality feedback for your work.

Pull requests' quality is often overlooked. A badly written pull request can not only cause friction when it comes to code review ( arguments, fights, time wasted and ultimately burn down), but it also decreases productivity and increases the chances to introduce issues in production. Clean and tidy Pull Requests contribute to a faster pipeline, quicker code review cycles and increase the overall quality of our product.

So, here are some tips on how to write effective Pull Requests ( Note that the samples in here are totally fictitious).

Make sure your PR is supported by tests

This is a big one for me. Probably the most important. If you have to pick one tip, pick this one.

All of us, are busy people. Right? We have tons of stuff to do. Then, have you ever received a Pull Request that is changing code (i.e. not a trivial typo-fixing type of thing) and has no tests? I have. Plenty of times. And these days, my reaction to those is pretty much the same: "Please, can you include some tests that prove that the change you are doing is indeed verified".

Why am I being so harsh? Well, really because, with no tests, I am actually forced into doing a big effort trying not only to understand the logic but also to figure out any minor thing and dependency that could actually break the code. This might happen especially if you are the subject expert in some area. In such case, there is a considerable chance that you are always going to get these type of PRs where the submitter actually just wants you to verify the overall approach and the source code before even bothering to write any tests for it. Needless to say that this is bad. Quite bad. Don't take me wrong, there is nothing bad on asking for help, but there are better ways to do it like asking for a pair programming session or doing an unofficial code review in pair. That way, the reviewer's time will be used more effectively than with an out of the blue totally asynchronous PR.

By writing tests along with your PR you are providing the reviewer actual proof that you have gone through the exercise of testing the code yourself. The tests you write should go beyond the usual happy-path scenarios. Again, by providing negative testing, you are giving proof that your homework is done and that you have been thinking about what might happen when things go wrong. Tests will also help you with your logic and they will surface hidden issues that might go unnoticed.

Most importantly, tests will provide not only proof but also great documentation for your reviewer about what you are trying to prove and why you think you are indeed proving it. And both things, the why and the how provide fundamental information to decide whether the PR is valid or whether it might be incomplete or just missing some part or be missing entirely the point.

Write a meaningful PR title

This sounds kind of trivial but it's actually common and overlooked a lot of times. Especially when working on a single-person project when you are rushing and don't put that much attention on the details. But again, if you think about the future, chances are people will join later and if you don't put much attention on the titles then you end up with things like this:

Can you tell what is going on this PRs?

Can you tell what I was doing there? Well, you can tell I was refactoring some code, doing some bug fixing, but you have absolutely no clue about what the changes were. Now, that is a fictitious example. But, as a general practice, something like that implies having to click on each and every PR to figure out what is going on. Imagine that you are new to the project. Exhausting.

A title needs to be short, concise but at the same, it needs to communicate the overall purpose of the change. If you cannot summarize your PR in a single title, then it's very likely that your change is doing too many things and at that point, you should reconsider whether you should be splitting your work into multiple PRs.

Write a meaningful description

PR's descriptions should be meaningful. Bear in mind, that after the PR title, the next thing the reviewer will see is the description. A good PR description should tell us:

. What you are trying to do.
. How you are doing it.
. Why you decided to do it that way.
. How you are supporting this PR.

For example, compare these:

PR with no description
PR description extending title
PR that plainly explains nothing

With this one below:

Good PR description

So whereas on the first examples there was no description, or some cropped description from the PR title or a totally useless description pointing to some demo happening in the future, here, the engineer gives some good context about the problem, explains why this PR was needed and why it has been done that way. And finally, it also provides some information about why the PR is safe and the tests attempted. Probably not perfect, but certainly more useful and explanatory than the other examples.

Avoid huge and tiny PRs when possible

Who does not fear the mythical 100+ files PR? I've done some of those and I'm not proud of it. Huge PRs are usually the result of long-lived branches, big refactorings, heavy new functionality or simply could come from extremely creative contributors that have that ability to create complex code and tests very quickly. However, none of those are real reasons justifying huge PRs. Even if we are fortunate enough and have highly skilled engineers that can create code as fast as a kid can devour a candies box, that does not mean our reviewers are going to have that same mental ability to process PRs with hundreds of modifications.

On the other hand, if every code change is going to be a single PR, then that puts a lot of burden in the code and PR review process and might impact development with constant interruptions. Unlike with huge PRs, there are occasions in which there is no other choice than having a tiny PR as maybe we need to change just one file and very little of it. That's totally fine. What we should not do is, for example, to create five different PRs with one file each for some functionality that is cohesive enough and logically related and hence could fall within the same PR.

I will not set an ideal PR size as I don't think there is one, but I believe as soon as we go over 20 files it starts to be complex for a reviewer to actually have a mental map with all the changes and consequences that these changes carry over. Having tests alleviates this issue, but before sending a PR that is starting to feel large then we should ask ourselves if we would be comfortable reviewing that many files or not. In case we feel that it would be too much for ourselves then we should be considerate with our reviewers and divide the PR in smaller related chunks.

Apply Single Responsibility Principle to your PRs

Single Responsibility Principle is in some sense also applicable to PRs. Just as you want a module, class or function to have and to encapsulate a single responsibility, you want your PR to encapsulate and own a set of logically interrelated and encapsulated changes. In other words, you don't want to have a PR containing multiple logical changes that are not related.

One typical problem with merging non-related changes in the same PR is that if the code review takes longer than expected then all the functionalities that are enclosed within the PR will get blocked. So, imagine an urgent bug fix committed along with a non-related PR that has several parts that need review. The important bug fix will get blocked while the PR is fixed. Whereas if you submit two separate PRs then the quick and easy bug fix can proceed to the master branch while the longer and more conflictive PR gets reviewed.

Typically this issue might happen, many times involuntarily, when you are always working from the master branch on your fork. That is perfectly fine if you are going to deliver sequential changes. But as soon as you foresee multiple parallel changes then you should consider switching to temporal local branches on your fork which will allow you to submit multiple independent PRs.

Avoid void changes

Changes like new lines added or removed, spaces, tabs, etc. add which don't change the file at all, are just adding noise to your PR and make it look longer than it is. You should avoid this as such changes contribute to your PR to look messier and careless. And you are not careless, are you? Honestly, when I get a review where I see those patterns my internal self becomes immediately wary about the whole PR. I don't know why. It's probably one of those psychological patterns, like those that explain that the food that looks neat on a plate tastes better. In a similar way, PRs that look bad from their cleanliness look worse overall and raise alerts.

There are no changes on this PR section

Avoid formatting changes

Similar to the above point, at least certainly in the psychological aspect, you should avoid doing code formatting changes in your PRs. If you need to do changes in formatting then open a PR solely for that purpose and discuss it with your peers. Formatting changes like replacing tabs with spaces, or replacing spaces with tabs, or changing brackets positioning, or splitting a line into two or two lines into one, all contribute to diverting the attention from the reviewer into topics that are not related with the PR itself.

Guess what changed on this PR

Guess what changed above? Yes, you guessed it right. Nothing, or almost nothing, or certainly nothing relevant.

Remember, the reviewer is likely already doing a good mental effort to follow up with your PR and trying to understand what you are trying to do. The less you want to do is to add confusion by introducing irrelevant style changes.

Make sure your code is commented. Especially the difficult parts.

Communication is key for software engineers. I cannot say this often enough. And while I totally agree with clean and self-documented code principles there are many times that naming and structure are not enough and the engineer needs to provide a few lines explaining what the code is trying to do or why it was done that way.

Again, you need to put yourself into the skin of the reviewer and think: "If someone gives me this same code to review, would I burst into tears?", "Would I be able to understand this if I had absolutely no context of what is being done?". We probably don't have to go to extremes like "a toddler should understand it" but hopefully you get the idea.

The easier and the more effective we can communicate what we are doing, the quicker the PR will be reviewed and the quicker mistakes will be detected. So actually spending some time trying to communicate on an effective manner will have a huge return one way or the other. Needless to say, after this step the code ends documented, which is a huge bonus when new hires come in. Everyone loves an elegant and nicely documented piece of code.

Provide context providing own PR review comments in advance

Most source code management systems like Github allow you to review your own PRs. You can take advantage of this to provide feedback in advance. That is, information going beyond the PR description and that might be contextual to the task itself, or may be specific to certain files, or might be just general context, external comments, history, etc.

For example, comments like "This is the asynchronous task executor approach we discussed in the Start of Sprint the other day", or "The metrics aggregator thread that Joe suggested is implemented on this file", should not be in the code itself and might be too contextual to be on the description. However, adding these as comments can provide important and useful context for the review and we can put them just as our own code review attached either to the top of the PR itself or to individual files.

Leverage external tool integration

Some tools understand and can link pull requests with the existing tracking system. For example, in JIRA if you include the ticket id within the PR title then JIRA will auto include all the PR information within the ticket itself. This is extremely handy to track the different tasks and map them with actual engineering work.

JIRA Git integration

Most of the project management tools have some sort of integration with Git, so go ahead and make use of it!

Bonus: If you integrate your source code management tool ( e.g. Github ), with your collaboration tool (e.g. Slack, Webex Teams, Microsoft Teams...), and your PRs are tidy, then you get a wonderful team space containing all the code changes ready for your team to review, or to just be aware of, or just to revisit back after a few months to check something in the past. All for free, as long as your PRs are treated well.

Treat Pull Requests as valuable documentation

A PR can be a very important source of documentation when created in a responsible manner. It provides a meaningful description of what you are trying to do. It does contain a set of changes that are meant to achieve your task and it includes a set of tests that verify that indeed the requirements are met.

When done right, existing engineers or new hires can go easily through the list of closed PRs for a nicely historical segmented view of the most notable changes that have been introduced gradually into the project by the team or by an individual person over the last few months or years.

However, for being valuable, Pull Requests really need to be as good and tidy as our code is meant to be.

Hopefully, the tips above will help you with that. Thanks for reading.

Posted on Apr 9 '19 by:

mpermar profile

Martín Pérez

@mpermar

Engineering leader who works remotely and loves mountain biking, family and side projects.

Discussion

markdown guide
 

Great Post!

I agree that formatting and whitespace tweaks add to the overall noise of the diff, but I also think there can be appropriate times and a places to get the code tidy. If the PR Author knows a file is unchanged they could add a "heads up" comment for the reviewer. Alternatively, they could open a separate PR that is purely formatting/code cleanup.

 

I would add : rebase and squash your commits. This way when the PR is merged, the mainline is a clean series of atomic & significant commits

 

I disagree with squash on merge. I want the merge request to consist of atomic commits I can review.

Like the formatting requirement, I still think it needs more coordination, but if it is done in a different commit then the logic changes it being in the same merge request is not an issue.

If you touch a thousand file and say added argument documentation, that is still a quick review.

 

Squash is valid when there is a lot of back and forth within the commits themselves which sometimes happens. I do A, commit, change my mind, then do B and commit, then change my mind again and do C and commit and finally have to capitulate and get back to approach A and commit. Time to send a PR and I end up with a complete commit history but probably a history that I should clean up before submitting a PR.

I'm not saying there aren't times to take advantage of of it, but I have concerns because tools provide that as policy and I don't want people to take the recommendation that way.

There are always exceptions to the rule. I think with squash the most important thing is to cleanup before you push. Squash what should be squashed but never squash with something that has already been pushed.

Especially not after a (first) review/change-request! When you're rewriting history, I need to do over my entire review. Instead of just checking the changes since last the time.

I like the, add commits after review point. I intend to make the merge request a work in progress at that point and rebase when the review is complete.

My comment on squash was because gitlab can make every merge request squash before merge automatically.

 

This is totally right and great advice, especially as PRs tend to hide on UIs when merged and closed but commits stay more accessible. Thanks, Tony.

 
 

How would you recommend someone to "Avoid void changes" and "Avoid formatting changes"? I often see those on my PRs, but I'm not sure how to clean them up.

Thanks for the timely post by the way!

 

You can always append corrective commits when this happens.

If this is rather a problem of different members using different setups ( e.g. tabs, spaces or spaces as tabs) then you should collectively agree on a common format for your favorite development environment, store and share it.

Or alternatively you can resort into something like prettier.io/ which integrates with many languages and tools

 

+1 for prettier.

Also; if you still using the cmd line, please start using a graphical git client instead. Gitkraken, sourcetree, tower, there are enough. With a graphical client it's so much more obvious what you are going to commit.

Unstage/discard all 'garbage' and only commit after reviewing the changes. No more blind git add . && git commit -m.

I think that it's not a matter of whether someone uses the CLI or a graphical client for this, but more the overall experience, mindset and workflow of that person.

I personally always use a git diff to see what I changed and what I want to commit. If I'm happy with all unstaged changes being committed I use git add ., but if I see "garbage", I simply use git add -p .

After that I do a git diff --staged and if nothing went wrong I do a simple git commit -m with a descriptive commit message. When I'm done I do a git checkout -- <file> to get rid of the garbage that I didn't commit, or a simple git reset --hard if I really don't wanna keep anything.

What I think is that you can be as careless with both the CLI as you can be with a graphical client, it depends on how much effort you want to put into keeping your Git history clean and organized.

 

This is all good advice, and I prefer to take it further by asking for reasons in the commits themselves.

 

Great article Martín, thanks!

 
 

Great article Martin! Very good information!

 

I wanted to improve my PRs in my personal projects and this post was just perfect, easy to understand and very valuable

 

Btw, edited and added a section about Single Responsibility Principle on PRs.

Just experienced one of those today which recalled me that it is actually quite important.