Folklore and common sense warn developers and teams against doing big rewrites.
To do or not to do
There are many reasons not to rewrite apps from scratch:
- rewrites take time
- the "legacy" app still needs to be supported and probably debugged
- rewriting the same exact app hoping to change the outcome could be an early sign of madness
- requirements will definitely change from when you start to when you finish
- the company already paid for the "legacy" app, now you want it to pay for the same thing twice
- management will probably be hard to convince
If the rewrite is justified though, there are some positive aspects:
- the time spent rewriting is time you're going to learn a lot
- the "legacy" app has become garbage fire (because of turnover, feature creep, bad design, lack of expertise, Saturn in opposition, whatever) and is slowing growth
- even if you think so, you won't end up rewriting the exact same thing
- the changes in requirements might result in a different, better, product
- if you're not a startup, the "legacy" app is usually funding the rewrite anyway
- management and your colleagues will trust you a lot in future years if you all manage to pull this off
- you get rid of all the tech debt just by deleting a folder (and you get to create brand new debt :-D, but let's not be picky)
Why I'm writing this
The other day I read two "old" posts about a successful "big rewrite".
In the first one, Against the Grain: How We Built the Next Generation Online Travel Agency using Amazon, Clojure, and a Comically Small Team, Colin Steele narrates a journey of moving from a giant pile of tech debt that was going to sink the company to a successful re-engineered product. In the fray there are mistakes made and... a succesful acquisition from another company.
The product is a hotel meta search engine.
The bulk of their story
Premise
- they initially had the wrong business model (quite common with startups, at least in my experience)
- the app was a spaghetti of monolithic PHP probably worked on by many hands
- the database was a mess
- there were no tests
- he was hired as a consultant and extracted a key feature using Ruby and async programming but the rest was too far gone in his opinion
Pre-execution
- he became CTO of such company and convinced management to attempt a rewrite
- they fired most of the existing devs and hired just a handful of seniors (another common theme in startups in damage control that are draining money)
- they started the rewrite while keeping the old product running
- they switched from hosted servers to cloud (keep in mind that this happened in 2010) which took convincing
Tech choices
- the frontend dev wrote a SPA with vanilla JS (again, in 2010)
- after thorough testing and some guts they settled on Clojure (even if he was a Ruby expert). Ruby was abandoned because it required more resources to scale and they had none and because of its builtin concurrency model
- Clojure was the right choice from them. As he wrote: as the CTO at a cash-strapped startup, Clojure was the answer to a prayer.
- Clojure was probably an easier sell than usual because how tight they were with time and resources and how well management trusted the team (it would be a though sell in 2018, imagine in 2010)
- the type of web app they had and the performance testing they performed justified the choice (and saved the company money)
Post execution
- they were acquired at the end of 2011
- all of the tech choices they made were questioned (why AWS, why Clojure)
- he says that they were able to "sell" the choice of Clojure to the new company because it sits on top of the JVM and because of the nice graphs about the performance of the system he showed them
End of the story
From the second post, 60,000% growth in 7 months using Clojure and AWS
Over the course of the last 7 months (we launched in January 2012), we’ve gone from about 1,000 uniques/day on hotelicopter’s site, to 600,000+/day on roomkey.com. That’s 60,000% growth in 7 months
So, the rewrite paid off.
Another thing to notice is the amount of trust management gave him and the team. Without that the rewrite would have probably failed or they would have run out of money or they would have had to incrementally refactor maybe taking more time. We'll never know.
If you want to read more about the tech choices and the stack read this second post.
An anectode from a solo rewrite I did
I once was hired to work on an unmaintanable app that had to be rewritten.
Coincidentally it was written in PHP as well and this too had a database structure that needed Sherlock Holmes to be deciphered. It took me at least a week of staring at MySQL tables with cryptic names and cryptic fields, googling PHP functions to figure out what happened to the data (most of the DB logic was in the app) and to design a new DB that was sane.
I ended up rewriting the app in a short time in Python (and migrate the data). It worked :D
The scope was smaller though and I had no choices to justify, they needed someone with expertise to bring a legacy app to a known stack and then hand it over.
The good thing about this rewrite is that knowledge of the previous stack wasn't ultimately required and I was happy to mostly ignore the app code and being able to focus on the data to bring along and the requirements.
Now that I'm writing this, I think the "legacy app" could also be used as an argument in favor of frameworks for less experienced developers working in small companies where they might not have seniors to interact with day to day. But I digress.
Conclusions
Keep in mind that there's not a single way to accomplish a rewrite, you might pull it off with a mixture of refactoring and rewrite, for more on this I defer to @bosepchuk:
The Rewrite vs Refactor Debate: 8 Things You Need to Know
Blaine Osepchuk ・ Mar 26 '18 ・ 9 min read
If you want to read another success story (seemingly less wild in its premises), read this by @jcreenaune:
Top comments (10)
You should avoid complete rewrites at all costs. Try to find one small aspect in your legacy application and rewrite that. Then integrate that newly rewritten part in the old system. If you have integration tests for the old system, you can use them with the new code.
This has several advantages:
The downsides are that you cannot do this if you basically need a completely new system
Yeah, I feel a complete rewrite should be the last resort. It made sense in the example I linked to because they had unmaintanable software with no tests and they were going to run out of money.
Usually in a more established company when you accrue tech debt, you can rewrite/refactor in steps as you said.
This is pretty much the micro service mantra, driven by a need to uncouple change in different areas of a system so the team can keep up with unpredictable demands (or at least only short term predictions), without throwing everything out from the monolith on day one. Martin Fowler (him again!) describes this as evolutionary architecture.
If the question did arise for rewrite, it needs rewrite !!
When apps grow, features are added and old features remain in code but are obsolete. Every other products where app connects also have new versions and better features, and your app contains code to overcome lack of those features.
While rewrite, you already have list of all useful and useless features so you can design it correctly. This brings benefits of unit testing, typically in young app there is no scope of unit testing when features are changing dynamically. Typically in product lifecycle, earlier age is spent in acquiring customers and product constantly changes due to feedback and internal code becomes mess !!!
After product has become mature, it needs rewrite because you have time and money to rewrite. Also you don't need to rewrite everything !! You can substitute every services with newly designed micro services.
Yes, that's a valid strategy and I think it relates to @phlashgbg 's comment about Carmack's quote.
Only one thing to keep in mind:
There are perfectly valid reasons for splitting a monolith into a microservices but it's not a given, in the sense that 20 microservices handled by 1 person are an anti-pattern.
The rewrite can be a modular monolith, a hybrid of a monolith with some serverless functions or a 100% microservice architecture, a totally serverless app or anything in between. It always depends on context and constraints.
I think rewrites/redesign/refactoring have their place. It is important to evaluate exactly why it is needed and whether it is cost effective in the long run. For instance if scaling is a major issue then switching to a faster language might be worth it.
At my company we have a single web page that was originally written about 4 years ago and was made very dynamically for this one paradigm by one person. Then we added many many new features to that page that did not fit anywhere in the predicted paradigm. Add to that that different people of varying skills, style and experience have added to those features thus making it a monster of at least 20k lines of code for one page (not including libraries or the like). We are considering rewriting that page so that we can add new features (which are added constantly) in a quick and maintainable fashion.
Is it like a dashboard type of page? I can't fathom a single web page having 20k lines unless they're autogenerated or something.
If you consider "rewriting", can you consider splitting it as well?
Because it might end up in the same spot in 4 years from the rewrite, maybe the issue is the page having too many responsibilities.
So it is an internal page that our employees use to generate most of our services. It certainly has way too many responsibilities and can definitely be split into multiple pages at this point. I think one person calculated over a million combinations of things you can do on that page.
There is what I used to know as 'Carmacks Law', but the Internet seems to have forgotten about it since the late 90's when I heard it, I must be getting old! I remember 'rewrite when you can, you will do a much better job the second time round without the baggage of old code', the nearest I can find is this:
brainyquote.com/quotes/john_carmac...
"One of the reasons microcomputers progressed so fast is people are willing to accept crashes. It's faster to build something and try it, even if it means you'll have to rebuild later... If you spent too much time building and massaging one vehicle, you don't learn anything"
Thanks Phil! I like building software prototypes (even those that are thrown away) so I definitely relate to that quote, and Carmack is a legendary engineer :D