So, if you're reading this it means you're involved in a software project that's a steaming mess and you want to explore your options. Don't be embarrassed. This is a safe place; we've all googled "rewrite vs refactor" at some point.
The problem is that our profession is long on opinions and short on evidence for what to do with troubled projects. And the opinions we have are all over the map. So, it's hard to know what you should believe.
In this post, I'm going to simplify things for you and help you navigate the rewrite vs refactor debate.
Let's get started.
It's easy to fall for a good story. But anecdotes are not the same as evidence.
Think about it: what do you really know about the expertise of the person who wrote whatever it is that you read? The bar for publishing a blog post online is extremely low. How would you feel if you learned the article you read was written by the worst programmer you ever met? How about someone with less than one year of experience? How about someone who has only worked on one system, ever?
And it works the other way too. Do you automatically believe what Martin Fowler says on refactoring because he's famous and wrote a book about it (also not peer reviewed by the way)? How about Robert C. Martin who wrote that "[rewriting] is probably the worst thing you could do"? That seems pretty dogmatic. No exceptions? Ever?
Rewrite, refactor, or continue with business as usual are not your only choices. You have a whole spectrum of options.
Even within one system, the best solution for different parts of it might be different.
You'll see this over and over again as you read blog posts about rewriting vs refactoring software. "Rewrite", "restructure", "redesign", "re-architect", "untangle", "port", etc. are all ill-defined terms. They mean what the author decides they mean. If you're lucky, the author will tell you what that is, but that's rare. So, one person's refactoring can be, and often is, another person's rewriting.
This leads to unhelpful debates in the comments sections where people are arguing because they aren't even talking about the same thing.
If you focus only on technical considerations, you could reach a conclusion that is completely unacceptable in your larger context. Perhaps your employer has priorities, obligations, or plans you don't know about.
For example, your company might prefer a more predictable approach to improving your software, even if everyone agrees that a rewrite will probably be faster. They could have made secret promises to customers about future features of your software. They might be counting on cash from another project to keep the business out of bankruptcy and you can't do that project and a rewrite or refactor at the same time.
I know programmers generally hate politics but your decisions don't exist in a vacuum. Contemplating a rewrite or refactor of a non-trivial project is going to get people's attention and you can bet it's going to get political.
It should go without saying that rewriting a short script that calls ImageMagick and resizes the images in a directory is COMPLETELY DIFFERENT than rewriting all the software in the F-22 Raptor. While this is an extreme example, even projects that are superficially similar might be very different on the inside and call for different approaches to keeping them viable.
This is another problem with loosely defined terms. People write stuff like "we successfully rewrote our software from scratch in three months and it would have taken much longer to refactor it". What does that mean exactly? What defines success? 10% ROI? 15%? The project had a positive net present value before interest, depreciation, and taxes? Seriously, what? The details matter.
And let's not just accept the other part of that statement without some examination. How do they know that a refactor would have taken longer? The only way they could know for sure is if they did both the rewrite and the refactor, which I guarantee did not happen.
Another question: could that successful project just have traded one set of problems for another of equal or greater ugliness?
Finally, was the project a success for all stakeholders? I can imagine a scenario where the developers see the project as a success because they got to dump some of their technical debt. But the business people might be on the verge of nervous breakdowns because the market changed and their competitors stole the majority of the business's customers while the developers were busy completing their "successful" rewrite.
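One way to pin down the "success" questions above is to say exactly which number you mean. Here is a minimal sketch of two common ones, simple ROI and net present value. Every figure in it (the 300,000 cost, the 150,000 yearly savings, the discount rates) is a hypothetical chosen for illustration, not data from any real project.

```python
def roi(total_gain: float, total_cost: float) -> float:
    """Simple return on investment: net gain divided by cost."""
    return (total_gain - total_cost) / total_cost


def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value.

    cash_flows[0] happens today; cash_flows[t] happens at the end of year t.
    rate is the yearly discount rate (0.10 means 10%).
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical rewrite: costs 300,000 up front, saves 150,000 a year for 3 years.
cost = 300_000
savings = [150_000, 150_000, 150_000]

print(roi(sum(savings), cost))             # 0.5, i.e. a 50% ROI
print(round(npv(0.10, [-cost] + savings)))  # about 73028: positive at a 10% rate
print(round(npv(0.25, [-cost] + savings)))  # -7200: negative at a 25% rate
```

The same hypothetical project is a "success" by ROI and by NPV at one discount rate, and a failure by NPV at another. That is why a bare claim of success tells you very little.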
When is the software "done"? Let's look at a "successful" rewrite and see if we can agree on when it is done.
Is it the day:
- you deploy it?
- it has exactly the same features as the existing software?
- enough of the new defects are fixed that customers can use it without cursing the day you were born?
- all the defects are fixed?
- the money runs out and the customer is stuck with whatever you give them, even if it's buggier and has fewer features than the version you are replacing?
Any of those things could be considered "done", right? What's the chance that the author of the blog post you read is using the same definition of done as you?
Most people want to do a rewrite. It's very appealing to start from scratch. By contrast, people often see improving the existing system as the long, hard, ugly path to better software. So, the first thing you need to do is determine if a rewrite is even feasible.
Start by acknowledging that a large software rewrite is risky. Many fail, some seriously injure the company, and a few kill the company outright. But technical debt and obsolescence can be just as dangerous in their own way.
Sometimes factors outside your control force you to do a rewrite. For example, you might need to change the behavior of a system for which you have only a binary and not the source code. Or the software is tightly coupled to ancient hardware that you cannot reliably keep in service.
Whatever the case, if you've reached a tipping point on technical debt or obsolescence, everything you do will be risky, including continuing with business as usual. There are no good choices at that point. That's why you should do everything in your power not to allow your project to get that bad.
Certain things need to be in place before you can even contemplate a rewrite. Max Kanat-Alexander has written about this in his book Code Simplicity.
"...there are situations where a rewrite is acceptable. However, they are very rare. You should only rewrite if all of the following are true:"
It's quite a long list so I'm going to paraphrase Max's words:
- You have developed an accurate estimate that shows that rewriting the system will be a more efficient use of time than redesigning the existing system.
- You have a tremendous amount of time to spend on creating a new system.
- You are somehow a better designer than the designer of the original system.
- You fully intend to design this new system in a series of simple steps and have users who can give you feedback at each step along the way.
- You have the resources available to both maintain the existing system and design a new system at the same time.
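The important word in that quote is "all". As a rough sketch, the paraphrased list can be treated as a checklist where a single "no" rules the rewrite out. The class and field names below are hypothetical labels I made up for the five conditions; they are not from the book.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RewritePreconditions:
    """One flag per paraphrased condition (field names are hypothetical)."""
    estimate_favors_rewrite: bool               # accurate estimate says rewrite beats redesign
    ample_time_available: bool                  # tremendous amount of time to spend
    better_designer_than_original: bool         # you really are a better designer
    incremental_steps_with_user_feedback: bool  # simple steps, feedback at each one
    can_maintain_old_and_build_new: bool        # resources for both systems at once


def unmet_conditions(p: RewritePreconditions) -> list[str]:
    """Return the name of every condition that is not satisfied."""
    return [f.name for f in fields(p) if not getattr(p, f.name)]


def rewrite_is_worth_considering(p: RewritePreconditions) -> bool:
    """True only when every condition holds."""
    return not unmet_conditions(p)


# Hypothetical team that meets four of the five conditions.
team = RewritePreconditions(
    estimate_favors_rewrite=True,
    ample_time_available=True,
    better_designer_than_original=True,
    incremental_steps_with_user_feedback=True,
    can_maintain_old_and_build_new=False,
)

print(rewrite_is_worth_considering(team))  # False
print(unmet_conditions(team))              # ['can_maintain_old_and_build_new']
```

Four out of five is still a "no". Listing the unmet conditions is the useful part, because it tells you what would have to change before a rewrite is even on the table.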
But these are just the minimum requirements to consider a rewrite. The next section digs in a little deeper.
Can you and your stakeholders agree on the answers to these questions?
- Rewriting from scratch requires different skills than maintaining and extending a system. What evidence do you have that your people can complete this kind of project successfully? When was the last time they did it? What assurances do you have that you won't end up with a new system with just as many problems as the old system?
- The system you want to rewrite is complex, hard to understand, and hard to change. It is full of inconsistencies and defects, has low test coverage, and has low-quality documentation, if there's documentation at all. Correct? Would it be fair to say that nobody completely understands the existing system? If that's true, how are you going to figure out everything the existing system does and turn that into a set of explicit requirements for the new system *in a reasonable amount of time*? Can you even estimate with a reasonable degree of accuracy how many requirements there will be for the new system? Probably not. So, how can you estimate the implementation effort for the rewrite?
- Doesn't a big rewrite from scratch look a lot like a giant waterfall project where the requirements are scattered haphazardly in the source code of the existing project? Didn't we adopt agile and incremental feature delivery because waterfall and big up-front design were too risky?
- I've argued against simultaneous feature development because of the increased costs associated with delayed feature release. Isn't a rewrite an extreme version of this?
- Are you sure a rewrite is the best use of your people's time? There's a whole continuum of options between tiny refactorings and rewriting from scratch, with the rewrite being the most risky and most extreme. Is it possible that a less extreme plan might be more desirable?
A complete rewrite of a non-trivial system is a major undertaking that carries significant risks. If you decide to go ahead with one, do it with your eyes open, because you might be proposing the software world's equivalent of Jamaica attempting to conquer Russia. Via the North Pole! In winter!
Success is not automatically impossible, but it won't be easy. And without proper planning, the support of your entire organization, and adequate resources, you might be volunteering yourself and your team for a death march. So, make sure you have a plan that will work before you commit your team to a rewrite.
The rewrite vs refactor debate is largely the result of the following factors:
- lack of evidence backed by quality research
- lack of standardized engineering methods in software development
- the desire of programmers to avoid bad code
- the belief among software developers (often naive) that they can produce a better system if they start from scratch
- poorly defined terms
- oversimplifications of complexity and acceptance of anecdotes as evidence
- a highly visible record of failed rewrites
- and a narrow focus on technical concerns
While appealing, rewriting a non-trivial system from scratch is risky. Many people advise against it in all but the most dire of circumstances.
But in some cases, a rewrite is your best or only option. In those cases, you need a good plan to maximize your chances of achieving a successful outcome.
So, now that we've identified and cleared up some of these issues, you should be in a position to think logically about what you read, how well it applies to your situation, and see the rewrite vs refactor debate for what it really is.
Good luck with your decision.
Agree or disagree? I'd love to hear your thoughts in the comments section.
- great overview of the debate: https://techbeacon.com/17-opinions-resources-rewrites-vs-refactoring
- rewrite survival tips: https://timross.wordpress.com/2010/03/15/if-you-must-rewrite/
- argument for rewrites (I don't agree with everything here): https://softwareengineering.stackexchange.com/questions/6268/when-is-a-big-rewrite-the-answer