Pull requests slow down continuous delivery when used as part of standard code changes in software development. If you make use of pull requests you should ask yourself:
- How much time are you spending waiting for pull request approvals every day?
- How confident are you that integration tests will pass when you open a pull request?
- How many times has code that doesn’t work as expected been approved?
Continuous Delivery Requires Integration Testing
When developing software we want to test and release software as frequently as possible so that we can get fast feedback and make rapid improvements.
In order for us to deliver software we need to run integration tests so that we can confirm the behaviour of the system and identify any problems.
But before that …
Integration Testing Requires Code Integration and Deployment
We need to integrate our code changes and deploy it to a test environment before we can run valid integration tests.
This process could be as simple as committing the change to the main branch and waiting for automatic deployment and testing. But for many teams this means:
- Creating a branch with a pull request.
- Messaging other team members asking for code review.
- Reviewing comments on the pull request.
- Making changes to the code.
- Repeating from step 2 until the right number of approvals are provided.
- Merging the code and waiting for deployment and testing.
How Pull Requests Get in the Way
Waiting for pull request approval of our code changes could take minutes, hours, or even days. This time directly impacts our ability to get feedback on the code changes because it occurs before our integration tests can be run.
Instead of wasting time waiting for approval of code that we haven’t fully tested, we should be aiming to get instant approval or complete the review once we have confidence in our code.
How Do We Move Faster?
There are two main options to prevent pull requests sabotaging your software development lifecycle by slowing you down.
Pair Programming
Both developers working together on the code whilst sharing a screen, keyboard, and/or computer can be used to guarantee minimal downtime between a pull request being opened and approved.
This works because both developers working together should understand the code, agree on the implementation, and be available at the same time.
Pre-Release Code Review
Completing code review after deployment and testing but before release can replace the need for pull requests. This still allows for the same level of review but gives developers the confidence and fast feedback of running integration tests first.
We can implement this code review process by adding some extra steps to our deployment pipelines.
First we need to diff the current code and new version, which can be done with git commands:
git diff e237udbas r2bghur392
Then require manual triggering of the release after reviewing the code changes.
What Do You Think?
These are just some examples of ways prevent pull requests from slowing us down when continuously delivering software.
Do you think these ideas could help improve your software development? Let me know on Twitter @BenTorvo or Email ben@torvo.com.au
Top comments (34)
Entirely agree up to this point:
It's entirely possible to setup a pipeline to run integration tests on pull requests, before they get reviewed and merged and you can continuously update the pull request until the integration test succeed. There's no reason why a pull request would have to slow you down...
You suggested solution to the non-existing problem is pair programming... How is pair programming less time consuming than a code review on a pull request? I mean... I'm not against the concept of pair programming, I just don't see any time benefit, compared to (remote) code reviews on pull requests...
This article is focused on testing that needs to be done after deployment, these tests can't all be done on a branch without other painful side effects.
The issue that pair programming (one of the suggestions) is solving is the delay in waiting for PRs which can be hours, days, or weeks. Having a short feedback loop by making changes, deploying them, testing them, and repeating is basically instant when having the review happen as part of the programming.
This means that the team has really fine setup in pair programming, which is quite rare. Typically, one participant gets tired after a few minutes and gets mentally sleeping.
Except cost of full test run which could be utterly high. I was reported that some projects require tens of hours even with good test setup.
For such cases I'd suggest marking in PR/commit (e.g. in commit message) what test subset to run automatically before passing to human check.
Well, pair programming is almost never a solution here, but the problem exists. It especially matters in some code types (e.g. "glue") where adding lowest-level functional ("unit") tests is useless because they will just copy the tested code.
Yes, everything depends on the test setup, but for pull request pipelines of the feature branches, only (~fast) unit tests should be sufficient. Full test run should only happen for merge requests to the staging environment (usually at the end of a sprint) where the human checks (user acceptance tests) will happen before going to production.
I don't think you can do anything about that until it actually becomes a problem... Humans are not designed to think ahead appearantly. We have to feel it first to learn from it, so we can forget it again and feel it again and so on. 😜 But in most cases, the quality of the unit tests will improve over time and the problem will go away, without the interference of time consuming pair programming strategies. 😅
The perceived problem might be bigger though, because these days the time to market is so short for new products, that test coverage and test quality havent had time to mature yet.
At my org, we took an easier approach IMO. Anyone can run integration tests on any branch locally. Because of this, integration tests are generally ran pre-commit. The feedback loop is about as fast as it can get. :) Additionally, we've segmented out the code so libraries are wholly independent with their own unit tests, and doing so means less overall integration tests needed for the whole application. By doing these splits, it also significantly reduced the execution time of the testing frameworks (library unit tests complete in ~1-2 seconds, full integration testing is less than 1 minute). Making them fast means they can run more often with less burden to developers.
What about the integration tests that rely on deployed infrastructure? Those are the most important and often overlooked integration tests.
In my particular case, integration tests can and do run on development infrastructure. We have parallel nearly identical infrastructure for development and production, with the main difference being production is scaled up to more nodes in more regions, but otherwise they're the same.
So the local branches are deploying development infrastructure per branch, all branches are deploying to the same infrastructure, or you aren't deploying the software before integration testing it?
One recent talk I’ve discovered talked about doing away with end to end tests entirely. They tend to be flakey, constantly being updated, and more time is spent fixing them then the actual value they provide. Here’s the talk that has me somewhat convinced to move away from end to end tests, and rely on frequent deployments. pushtrain.club/
What about when a dev pushes their code to Github and CI activates? Does CI run the full suite of tests?
Pull Requests Are Slowing You Down
Aren't PRs meant just for that? Can't see the problem yet 🤷🏽♂️😅
Pair programming as a solution... hmm I don't know Rick.
What if a team has an odd number of devs?
What if they both miss an error?
What if I have 2 seniors and 4 juniors?
What if some have a meeting but others don't?
Now being -more- serious, pair programming is OK in certain situations or as part of junior mentoring, but as a rule at work to use 8h a day it's mind-consuming and there's no reason for that, PRs are there for a good reason.
As advice, don't try to change the world without understanding how it works and why it works like that. Things are plenty of details when you look close to them and the major part had been studied.
Only raise new ways of doing when you understand something deeply, otherwise it's just posting your thoughts without taking the effort on learning about it.
Totally agree on this, even some of the non-serious items feel legit. It’s very likely that both the devs may miss a particular error/scenario which can cause problems down the line.
And honestly I would much rather utilise a senior dev’s time in something useful unless they are mentoring like you said.
These type of controversial takes feel very disconnected from reality, and in some cases solely focused on generating traffic.
100% agree, specially looking at the 3 posts of the OP. Can't find the sense on a single one of them.
I've read a few articles on proper setup of pair programming. Well, it's definitely not a piece of cake. Two rules mentioned there are that 1) pairs should be tested for psychological compatibility (using non-critical tasks) and 2) there should be more than one trained peer for each developer, and they shall be regularly regroupped. Some part of team could work on less important tasks individually, so, odd number of devs doesn't that matter.
But another thing here is that pair programming is exhausting: it spends mental energy much faster than individual work. Exceptions are rare and shan't be relied on. So, it should be used only for most quality-required tasks.
Well, there always shall be a reviewer who wasn't involved in this change development.
See above for answers to rest.
Looking at entire poster's blog, I'm certain he decided to follow the way of highly contoversial provocative posts. This shall be taken into account, and our task is to get what we need from such a discussion kind.
Agree. I wish there were more articles on this "code review after QA release". Even so, I still release many times a day if possible to prod without waiting. I've seen the scientific research, and PR's are great at finding bugs and making code better, but to your point, the cost is too high for non-open source projects, and pair programming or just occasional review is better. Sometimes it's hard to do a lot of pair programming if a developer wants to get in their own flow, or be left alone to work for a bit, or don't think they need to pair program.
It depends in the criticity of the project.
It's not the same pushing untested/unvalidated code into production for a silly landing page of your friend's barber shop than doing the same on a landing page of Adidas 10 minutes before a 5 million marketing campaing begins.
Not the same doing that in a community than doing it on a banking webApp.
You can't talk about "costs" without adding context. And this generalization or lack of context is applicable to any of the OP posts:
There's no real issue with any of this things IRL and those posts are either from a PoV of working on very little projects or just to gain viewers.
Of course each project has it's needs and is perfectly valid to have a side project in which you push directly into Master without automated testing, while having a more serious one that has tests but pushing code directly into Master, while having tests + PR at work on a more serious project.
Just make sure you choose the correct workaround for the project and be willing to change it whenever is necessary.
There is nuance, sure, but there is also a ton of compelling scientific evidence; this isn't "little projects". Check out any of Dave Farley's videos/posts on CICD, specifically the pair programming and feature branch ones. The state of devops and other scientific reports have studied these things and have super compelling numbers and quality increases from avoiding feature branches and mocks. If you want to say they're not compelling, you need to provide evidence to the contrary, not confidence.
Tests such as unit, integration, end to end: important and non-negotiable. PR's, for a team of devs working on the same project who trust each other, not needed. We're not saying remove tests or any quality gates in CICD, just replacing PR's with pair programming and small, frequent check ins.
Can you please link those studies with super compelling numbers?
By the way surveys are not "scientific reports" and "the state of devops" is a survey, which I read from 3 different sources (GitLab, CircleCI and Puppet) and none of them mention any of this.
As personal opinion I'd leave asap from a company that expects me to be on call the entire day for pair programming. Am I in a call center?
Avoiding PRs? Yes it can be OK if your team is composed entirely of seniors, I'm on that situation right now and PRs are a secondary thing. So it's a "yes... BUT".
Replacing PRs with pair programming is not trusting, is a waste of time. Sometimes you got good professionals to work with and sometimes you face people that need mentoring and supervision and I prefer PRs and pair programming when needed than being the entire day pair programming for no reason. They need to learn to be autonomous as well.
These are all problems I have dealt with in production environments of banks, media companies, and more. If you disagree that's fine but please provide reasons other than not providing enough context and that you don't think it's a good idea.
Sorry I thought it was self explanatory.
In security it's said that a system is as secure as it's weakest node.
Avoiding review steps in critical systems will speed up the process for sure but is not worthy in comparison with the estimated costs of any human error in the process.
Now imagine something big happens by a little mistake, and don't try to convince me about that, try to convince a judge about not using security checks because "they slow you down".
If I think of it you know what? Unit tests are slowing me down as well, already tested my feature/change and it works in my machinem, we can ditch them apart. Integration tests? Meh, tehre's QA environment, why bother? Meetings? That's much time, I rather code on my own without sync-ing with the rest of the teams.
Enough sarcasm for today. The pair programming thingy reminds me to those group exercises at college where some people won't do anything so IRL with that approach they'll be unnecessary resources.
The middle point on that is to apply branch permissions properly. If you have a team with 2 seniors in which you can trust and 2 juniors in which you can't, you can give those seniors permissions to merge/push directly into develop while keeping the PR process for those juniors, which is something more accurate with the reality and real project demands.
You rely a ton on your tests according to your posts. That's fine but remember that tests are done by people and also can include mistakes or not cover all the usecases.
We should remember that there are business around our code and we get paid to solve business needs. Now sum the fact that both in IT and business the thing is to be as much efficient and reliable as possible. A software engineer is paid 2 to 4 or 5 salaries worth of a low qualified job, that's because we can automate those jobs and the acceptance criteria for the product is usually being faster and more reliable than the humans it replaces.
Software is an investment for already existing business or a product itself for digital companies.
If you were right no single company would like to apply this PR process. But again in this platform (as almost every week) I need to remember that there's no [tech, methodology... whatever] to rule them all. You need to analyse your specific use case and choose wisely (If you're in a position to choose) because it's then your responsibility as well.
Cheers
I'm not saying don't do review, I'm saying don't use PRs for review as there are better alternatives.
The fact that you will suggest seniors should push to a develop branch indicates that you are avoiding PRs because they are slow. This article wasn't talking about how we should train juniors.
Yes, relying on tests is normal in modern software development.
You rely a ton on pull request reviews according to your responses. That's fine but remember that pull request reviews are done by people and also can include mistakes or not cover all the usecases.
Perfect, so we can agree on having PRs + Having tests are two checks and avoiding one of them will deal, allegedly, to a less reliable result consequently, let me explain why pair programming is not better than PRs:
First of all, pair programming means 2 resources on a single task during the entire development. Is much faster to check some development than developing it.
Now having a PR means you got the entire change history for that feature branch on a single point, being able to download it and execute it in your machine, checking the code in your IDE while keeping the possibility of calling the teammate for him to explain things if needed. You can double-check the tests, add some if you feel it necessary, execute them in that specific code version and so on.
What can be better than that?
If I'd to choose between both I'd rather pick PRs, but having PRs are not an exclusion for being able to pair programming or pair/peer review the code.
And no, relying on tests is not normal. Neither it is relying on PRs. We usually use both while adding several layers for quality assurance and security in the middle.
It's not me (or any dev) who's slowed down by PRs, it's the product, and past certain point, you need to choose between speed and reliability.
Ha ha
I'm actually curious about people's reaction on this :
I was recently presented with temporary environments. The idea behind this is to automate the creation of an environment at the creation of the PR, this environment getting updated with each new commit related to this PR. The developer is now autonomously able to test his/her changes. Once the PR is validated and merged, the environment is destroyed and the new version of the app is promoted to prod.
How does it answer the issue described in this article?
I have set up branch deployments before for teams of developers but in my experience the cost of doing so in complexity and extra infrastructure is not generally worth it. It can also be hard to get developers to manage the environments properly. It also increases the pr time if the builds are slow.
Thank you for your feedback!!
Even in a k8s infrastructure? I mean, there definitely is an overhead configuring it, but I'm wondering what would be the difficulty for developer managing the environments.
Yeah, even in k8s if you have 6 PRS then you need 6 versions deployed which takes up space and costs money. In that case there is less to manage but I don't recommend k8s based environments where the underlying infrastructure is managed separately to the application. It doesn't sit well with DevOps culture and I find that immutable infrastructure is a better approach.
Totally agree. I wish more people understand this. I've tried in many teams to introduce this and it has in some cases. But many people want to stick to the old habit of comment-ping-pong reviews.
My latest theory to speed up code review is to ask other developer to checkout the branch and fix something them self.
I agree completely, people behave differently when they are asked to fix the problems they see.
wow keep it up
Respectfully disagree.
So, to answer the questions :)
None. I create a PR when I'm done iterating and testing, and I move on to the next task until changes are requested or approval is given.
They should've already passed in your dev environment, or you should see it in the PR itself, before anything is merged.
Probably quite a few! Code review isn't meant to catch the same issues as a test suite or QA session. Tests should verify that the code works as expected. Code review should give you better confidence that the code is understandable, maintainable, extensible, debuggable, secure, and won't run into problems difficult or impossible to catch in a test suite - in other words, everything other than "does it technically work".
While reading this article the first thing that really surprized me is that Ben noted the variant to run tests after initial peer review approval as a something brand new. Well, the project in which I'm currently involved, uses this manner (with some complications). But, previously, most projects were using extensive tests that run immediately on review proposal. This difference is either due to domain mismatch (I had mainly been working at compact product companies) or some industry degradation, I haven't enough data to decide here. But, the change to the manner when reviewers have to spend their time before it's automatedly confirmed that a change seems being good, was an unpleasant surprize to me.
Pair programming is definitely not a stably usable variant because it doesn't avoid common errors and is extremely exhausting for participants. More so, 1 reviewer often isn't enough. In a usual case, 2 is minimal to provide enough quality. So I won't rely on pair programming here.
It is tightly depended on target domain and local language what is "integration test". I don't know bank specifics (Ben mentioned in comments), I'm largely in telecom and around. Usually we are able to add a unit test (i.e. lowest-level functional test) or cross-component functional test, for any component level, and such a test will be, besides counter-regression warranty, a piece of internal documentation. If even full set of functional ("integration") tests required to check correctness is unbearable for a every-hour run, a committer may add what test subset is required for the change. Then, CI might use it for the check before involving humans.
For used tools: I'd guess most corporative development is based on Atlassian stack which includes BitBucket (cloud or local). With my previous experience mainly with Gerrit and partially GitHub, I was nastily surprized how poor BitBucket is (at least in typical installations). For example, one usually can't compare newer PR commit version with older ones (it requires manual fetch of commits, some of which are already deleted from server). This adds its own contribution in development process disgrace.
There are a few tools on the market now (full disclosure: I work for Bunnyshell.com) that let you spin up ephemeral environments from a pull request. This will enable your continuous testing pipelines to test before merge. Pretty useful To reduce bottlenecks in that “single staging environment“ that all of us have seen at one time or another.