Bertil Muth

Posted on Nov 8, 2017

(Big) Rewrites

#business #rewrite #legacy #coding

Maurice Hayward asked: "How to make a business case for switching to a new framework, using a different library, language, or technology?"
I will begin with a true story from a client.
Then I will tell you what I learned from it.
And finally, I will answer the question about the business case.
The article will go beyond the original question, to explain the context.

The disaster

A successful company I worked for decided to rewrite a core business system. (They operated several dozen different systems in parallel, serving millions of customers.)

The company started to develop in parallel: new features were implemented in the old system to serve existing customers, and implemented in the new system for future use.

The project lasted several years.

The company released the new system to the mass market. It broke down. It couldn't handle the number of requests and the verbose data transmission caused by the new system's architecture.

The company decided to scrap the project, keep a few parts of the new system, and keep operating the old systems.
The total project cost was 500 million Euros.

What the new system "must do"

Over the years, I have been part of several projects that tried to rewrite whole applications.

The most common requirement I heard was:
The new software must provide the same functionality as the old one.
Plus some extra features, of course.

Based on that requirement, they made their business case. Often the forecast was: several years of effort, a million Euro budget.

And exactly that tended to happen. Sometimes, the new system wouldn't even be released. It turned out to be impossible or too expensive to reimplement all the existing features.

What I learned: that business case is flawed

The requirement that the new software must do the same as the old one is normally based on several implicit assumptions:

1. Our customers will only accept the new system if it works the same way as the old one, or has even more features.
2. We need to develop the new system as a whole, and then release it. A big switch.
3. Functionality is the most important aspect to consider when doing the rewrite.

Here's my insight: if you make these assumptions without verifying them, you take a high risk. A risk that may cause the project to fail.

Assumption 1: Our customers will only accept the new system if it works the same as before

Do you still have the customer that originally requested a feature?
Even if you do: do you have data that proves which features the customer actually uses? And how often?

Even if the customer uses all features: could you make workflows easier, make the customer happy and reduce development effort at the same time?

Even if you can't make workflows simpler: how important is that customer to you? Maybe you have other customers that are more valuable. From time to time it may be OK to annoy a few people to please others.

Maybe you really need to reimplement everything. I am just saying that if you don't question it, you are taking a risk.
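One way to question this assumption with data is to count how often each feature is actually used. The following is a minimal sketch, assuming you can extract usage events as (customer, feature) pairs from your logs; the customer names, feature names and the `events` list are made up for illustration.

```python
from collections import Counter

# Hypothetical usage events as (customer, feature) pairs,
# for example parsed from access logs of the old system.
events = [
    ("acme", "export_pdf"),
    ("acme", "export_pdf"),
    ("globex", "bulk_import"),
    ("acme", "search"),
]

# Hypothetical list of all features the old system offers.
all_features = ["export_pdf", "bulk_import", "search", "fax_gateway"]


def feature_usage(events):
    """Count how often each feature was used, across all customers."""
    return Counter(feature for _customer, feature in events)


def unused_features(all_features, events):
    """Return the features that never show up in the usage events."""
    used = {feature for _customer, feature in events}
    return sorted(set(all_features) - used)


usage = feature_usage(events)
candidates_to_drop = unused_features(all_features, events)
```

Features that never appear in the data are candidates to discuss with the business before anyone reimplements them.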

Assumption 2: We need to develop the new system as a whole, and then release it.

That is often a huge risk, because if it fails, you lose all the money you invested in the rewrite. Try to find ways to migrate to the new system in an iterative and incremental way.

I am not going into details here, but one way to do it is the so-called Strangler Application pattern.
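To give a rough idea of the pattern: a facade sits in front of both systems and routes each feature to the new system once it has been migrated, and to the old system otherwise. This is only a simplified sketch; the class `StranglerFacade`, the two handler functions and the feature names are hypothetical, and a real setup would typically route at the HTTP or messaging level.

```python
class StranglerFacade:
    """Routes a feature to the new system once migrated, else to the legacy system."""

    def __init__(self, legacy_handler, new_handler):
        self._legacy = legacy_handler
        self._new = new_handler
        self._migrated = set()

    def migrate(self, feature):
        """Switch a single feature over to the new system."""
        self._migrated.add(feature)

    def rollback(self, feature):
        """Send a feature back to the legacy system if the new one misbehaves."""
        self._migrated.discard(feature)

    def handle(self, feature, payload):
        handler = self._new if feature in self._migrated else self._legacy
        return handler(feature, payload)


# Stand-ins for the two real systems (assumptions for illustration).
def legacy_system(feature, payload):
    return f"legacy:{feature}:{payload}"


def new_system(feature, payload):
    return f"new:{feature}:{payload}"


facade = StranglerFacade(legacy_system, new_system)
facade.migrate("invoicing")  # only this feature runs on the new system
```

Each migrated feature is a small release of its own, so a failure costs you one feature's worth of effort instead of the whole rewrite.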

When you develop a new application, already plan for how you can replace parts of it in the future, with new technology and frameworks. Examples of how to do that are microservices, self-contained systems or modular monoliths.

Assumption 3: Functionality is the most important aspect to consider when doing the rewrite

In my opinion, that assumption is the most critical one. And the least likely to be true.

Think about it for a minute: The new software must provide the same functionality as the old one.

THEN WHY FOR HEAVEN'S SAKE ARE YOU REIMPLEMENTING YOUR SOFTWARE? YOU CAN HAVE THAT TODAY! JUST STICK WITH THE OLD SYSTEM!

Well, no. The truth is:
A. We noticed that our development speed went down. It takes more and more time to develop new features, OR
B. Our existing system can't serve the growing number of users, OR
C. The usability is horrible, OR
D. The performance is bad, OR
E. Any combination of the above.

Now we're talking.

The business case for a rewrite

In most cases, the real reason to do a rewrite is: quality requirements.
Sometimes called "-ilities".

There are external qualities, like: usability, reliability, performance.
Those qualities can be observed by a user of your software. They can drive away customers to your competition. They can cause your product to fail on the market (see the first story).

Then, there are internal qualities.
How much time do you need to extend the software? Can you test it easily and automatically, even without the GUI? How much effort is it to replace one database with another one? Can teams deploy into production fast and without coordinating with other teams?

Those internal qualities are important as well. They affect the lead time to market. And developer happiness.

So my advice is: make the quality requirements measurable, and base your business case on them. Even if you are "just" migrating to a new framework or picking a new technology.
As developers, we tend to jump on the latest, shiny technology. Instead, you should be able to describe the benefits from a business perspective.
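What "measurable" could look like, as a small sketch: each quality requirement gets a name, a unit and a limit, and you compare measured values against it. The requirement names and all numbers below are invented for illustration; your own limits have to come from the business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRequirement:
    name: str
    unit: str
    limit: float  # the measured value must not exceed this limit

    def is_met(self, measured: float) -> bool:
        return measured <= self.limit


# Hypothetical requirements: one external quality, one internal quality.
requirements = [
    QualityRequirement("p95 response time", "ms", 300),
    QualityRequirement("lead time for a small change", "days", 2),
]


def unmet(requirements, measurements):
    """Return the names of requirements that are violated or not measured at all."""
    return [
        r.name
        for r in requirements
        if r.name not in measurements or not r.is_met(measurements[r.name])
    ]


# Hypothetical measurements taken from the existing system.
measurements = {"p95 response time": 280, "lead time for a small change": 6}
violations = unmet(requirements, measurements)
```

The violated requirements are the ones your business case should be built on.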

An example

Let's say you notice that over time, your software becomes hard to change.

You should ask yourself:

How important is a feature for your business?
How often are changes requested to this feature, or features that are related to it in your codebase?
What amount of development effort could you potentially save by rewriting the feature?

So, let's say a feature is less important for your business.
It has not changed in years, and your company has no plans to change it in the near future. You may decide to not rewrite that feature right now. That is a business decision.

Let's say an important feature has changed 5 times in the last year. Every time, it took your team an average of about 6 days to change the feature. If you only had proper configuration and automation, you could drive effort down to 1 day. Isn't that reason enough to rewrite it?

Well, let's do the math. 5 x 6 days is 30 days. Subtract 5 x 1 day, and you save 25 days per year.
That's your development budget for the rewrite for the next year, given that the rate of change stays the same. You need to estimate whether you will stay below that threshold.

There's your business case.
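The same arithmetic as a tiny sketch, so you can plug in your own numbers. The function names are my own, and the model only assumes what is stated above: a constant rate of change and a fixed effort per change before and after the rewrite.

```python
def rewrite_budget_days(changes_per_year, days_per_change_now,
                        days_per_change_after, years=1):
    """Development days saved by the rewrite over the given period."""
    saved_per_change = days_per_change_now - days_per_change_after
    return changes_per_year * saved_per_change * years


def rewrite_pays_off(estimated_rewrite_days, budget_days):
    """The rewrite pays off if its estimated effort stays below the savings."""
    return estimated_rewrite_days < budget_days


# Numbers from the example: 5 changes per year, 6 days now, 1 day afterwards.
budget = rewrite_budget_days(5, 6, 1)
print(budget)  # 25
```

If you are willing to wait longer for the payoff, pass a larger `years` value, but keep in mind that the rate of change is then even more of a guess.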

All models are wrong, some are useful

I know this is a simplistic model. You won't get exact results from it.
There are more sophisticated models, like Cost of Delay.

Also, I assumed that features are rewritten independently, which is often not the case - for example when migrating to a new framework.

What those calculations give you is a feeling: is it worth the effort?

And if you can't rewrite independently: focus on a small subset of the scope first. Create a prototype, rewrite a few features and get a feeling for whether your assumptions hold.

"Are we getting acceptable performance if we are using this new framework? Let's stress test one or a few features with more and more users and see when it breaks."
Then compare it to the actual number of your users.
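The idea of ramping up the load until it breaks can be sketched like this. Note that `simulated_latency_ms` is only a stand-in for a real load test against your prototype, and the latency limit and user numbers are made up.

```python
def find_breaking_point(measure_latency_ms, user_levels, max_latency_ms):
    """Return the first user level whose latency exceeds the limit, or None."""
    for users in sorted(user_levels):
        if measure_latency_ms(users) > max_latency_ms:
            return users
    return None


def simulated_latency_ms(users):
    # Stand-in for a real measurement: latency grows with the number of users.
    return 50 + users * users / 10_000


levels = [100, 500, 1000, 2000, 5000]
breaking_point = find_breaking_point(simulated_latency_ms, levels, max_latency_ms=300)

# Hypothetical number of users you actually have today.
actual_users = 1200
has_headroom = breaking_point is None or breaking_point > actual_users
```

In a real test, you would replace the simulated function with one that drives the prototype with the given number of users and reports a measured latency.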

When you decide whether to go for microservices, talk to business.
How much lead time to market for a feature do they really need?
Does the reduction in lead time justify investing in the additional technical complexity?

That's about it.

Maurice, you were the first to comment on my request. Thank you for that.
I hope my answer is useful to you.

To all readers: I hope you found it interesting.
I am glad to hear your comments and your own experiences.

Top comments (2)

Blaine Osepchuk • Nov 8 '17

Excellent post, Bertil.

We've got some really old systems at work and we couldn't justify the risk or the expense of attempting a rewrite. We decided to refactor instead. Refactoring allows us to add features and deliver business value without interruption, which is very important for these systems.

When we (software developers) are trying to make a business case for a rewrite, we should be converting the risks to dollar values and including them in the calculation. But from what I've seen and read, most rewrites are undertaken with little thought of the economics.

Moe • Nov 9 '17

Bertil. Thanks for answering my questions. This was truly helpful.