Lucas Medeiros Reis

Posted on Apr 6, 2017

Solve All Bugs Or Implement New Features?

#codequality #scope #agile #risk

This post was originally published on my personal blog.

Most projects have moments in which they have new features to implement and, at the same time, bugs to solve. A choice has to be made on whether to focus more time on solving those bugs and making sure they don't appear again - let's call this option quality - or implementing new features - scope.

A lot of arguments have been made in favor of quality. One of my favorites is an argument presented by Joel Spolsky in this already classic text.

In this post I'll make the argument that, even from business and financial points of view, focus on raising quality should always come a before focus on implementing new features.

The concepts

First of all, let me explain better what I mean when I talk about quality and scope. Let's say there's a demand for X new features to be implemented in a given period and we know there are currently Y open bugs. One could say:

"We are going to implement all X features and solve as many bugs as we can."

Or you could say:

"We are going to solve all Y bugs in their root causes, and we'll implement as many features as we can."

See the difference? It's a matter of explicitly stating what your priorities are. In the first example, we'll make sure everything we implement is well tested and well designed, but some bugs will not be considered due to time constraints. In most cases, bugs will be listed in a spreadsheet or on a board with a "priority" property, and the business will choose which bugs should be solved per sprint, based on a "financial" analysis (which I'll talk about in the next section).

In the second case, a team will focus all of its energy on, at least, investigating every bug that appears. Whenever there's a decision not to solve a bug right away, it's because the developers diagnosed that the bug is not a manifestation of something that would stop the business, or incur a big loss.

Those are what I'm calling in this post "a focus on scope" and "a focus on quality", respectively.

Losses due to lack of quality and scope

One of the main criteria needed in order to choose between focusing on solving a bug or implementing a new feature is an analysis of the financial loss that the business will incur by not doing these things.

It's impossible to know for sure the loss a business could have in the future, so we'll need to estimate it. And to do the estimation we'll need to deal with uncertainty. My point is that the uncertainty of a loss due to not solving a bug is qualitatively different from the uncertainty of a loss due to not implementing a new feature.

To explain this difference, I'll briefly present two concepts introduced by author Nassim Taleb in his book Black Swan. I suggest this book to anyone interested in having their mind turned inside out. :)

Mediocristan and Extremistan

Taleb argues that uncertainties can "come from two places". The first place is Mediocristan: observations do not vary very much, or vary in a somewhat predictable way. An example of an Mediocristan uncertainty is "the average height of a random group of a hundred people". After you measure five or six groups, you would have a very nice idea of what other averages will be. Even if one person in the group is the tallest man on Earth!

Then there is Extremistan, a place where one observation can be very, very different from the others, which makes predictions very difficult, sometimes even impossible. An example is "the average net worth of a random group of a hundred people". Every group will have a very different average. We could have a run of thirty, or even fifty groups with a very similar average, and then suddenly Bill Gates is in a group and "BAM".

So, here's my first claim:

Losses due to not implementing a new feature belong in Mediocristan.

A new feature is something that in the past did not exist in your business (duh). That means that not having a new feature, in the majority of cases, will not cause a big loss. Or at least it will not make the losses you already have any bigger.

Let's consider for example an e-commerce business that wants their users to be able to pay with a new payment option, let's say Bitcoins. It could be difficult to estimate how much the company would make, but it's relatively easy to see that there won't be a big loss if the Bitcoin feature is not implemented. Current users are buying with current payment options and will continue doing so even without a Bitcoin option.

And that leads to my second claim:

Losses due to not solving a bug belong in Extremistan.

Every bug, no matter how small, can be the tip of the iceberg to a much deeper problem. That's why it's so difficult to estimate the effort needed to solve a bug. And deep problems can stop a business in it's tracks.

A bug in the error feedback of a credit card form could explain the drop in conversion rates. Also, that bug could be a manifestation of a much deeper problem with server side validation, which, in turn, could start to appear in other forms.

Some errors can be solved quickly by "patching" them. But that means that errors could reoccur if someone touches the code, or uses those same modules elsewhere. A lot of times, a more complex refactoring has to be done carefully, and must be fully backed by unit, integration and end to end tests. With these tests, we can diminish the chance of a bug creeping back up on you.

But is the whole team always going to be working on bugs?

What about all the new features that we need to implement??? When will that happen?

By solving every bug at its root cause, we diminish the chances of new bugs appearing. That means that in the not-so-long-term, almost everybody in the team will be focusing on implementing new features. The project gains traction, and will deliver a larger scope with a higher quality. And that's what everyone wants, isn't it?

I'm not saying that all bugs have to be solved right away. I'm saying that all bugs should be investigated, and the developers should diagnose whether or not the root cause could give rise to other new, more dangerous problems.

A script for dealing with a bug

Every time a new bug is reported, and comes to the developers to be solved, ideally we should follow the following steps:

Investigate the root cause of the problem. Ideally this should be done in pairs, with at least one of the developers having already worked on that part of the code. That would make things quicker.
Choose whether or not to solve the bug right away. The answer to this question should be "yes" in most cases.
Write tests to reproduce the bug. Unit tests preferably, but don't be afraid to write new integration and/or end to end tests, if needed. By creating new tests, we try to guarantee that if anything brings the bug back, it will be caught by tests before going to production.
Solve the root cause as well as you can.

This approach is not new. It's actually one of the main components of Toyota's "Lean Manufacturing" revolution, called Jidoka.

Summing up

Losses due to not implementing a new feature are much more predictable than losses due to bugs not being solved. Every bug is a potential business-stopper. And that's why we should always give quality a higher priority than scope: to keep our businesses alive.

Top comments (10)

Fabio Catunda Marreco • Apr 7 '17

Although I agree on the concept, I think your arguments are fitted to a service oriented team. If you work on a company that sells custom software , or at least one that customize an existing software, you'll have clients paying specifically for those features. Bugs won't even be considered by sales. That's a problem down the road for customer's support, who wont have the same importance as sales in the eyes of management (unfortunately).

Lucas Medeiros Reis • Apr 8 '17

I can see it happening in situations where there's little "skin in the game"; when people who are selling are not the same ones who will build it, and/or the people who are buying are not the ones who will use it.

It's definitely a sad scenario!

Fabio Catunda Marreco • Apr 9 '17

True, or even when the people who are selling (or delivering) are not held responsible for the after sales. Situation far more common then we'd like to admit, unfortunately.

Charles Reace • Apr 6 '17

As Devil's Advocate, I'll argue that losses due to not implementing a new feature are not that simple, if you compare not implementing it versus projected additional income of doing so. In most cases, that prediction is a fuzzy best-guess, but it still should be part of the equation -- though in many/most cases it will probably be about as fuzzy an estimate as losses due to bugs.

Lucas Medeiros Reis • Apr 6 '17

Hi Charles!

I also don't think it's simple - but I argue that one is not as fuzzy as the other.

A bug can be a small manifestation of a much bigger problem that can at some point stop your company from making money - for instance, a small problem in card number validation could be a problem with the card information you are sending to the payment gateway, and that could have a huge impact.

I can't imagine a situation where not having a new feature would stop you from making money, unless you are in a really early stage of a startup or something. But even then, losses due to bugs could be fatal!

Mike Petty • Apr 6 '17

If car stereo OEMs didn't implement the new features of support for Apple or Android Auto support, or if you didn't add driver support for USB 3 as a silicon vendor, or if after Google photos auto upload Apple didn't follow suit, or if when Apple supported backgrounding or new default apps and you didn't add that feature to your apps, or if you don't update your scanner software to support new barcode types, or if you don't support new industry reporting regulations, or if, or if, or if you could lose business.

I actually have a hard time thinking of a competitive market where if you don't maintain feature parity with competitors that you don't end up losing business.

Lucas Medeiros Reis • Apr 6 '17

Those are good examples!

But one could still argue that it doesn't matter supporting a new protocol, if your product as it is today has serious defects.

Of course, in the long run both are important, and having an obsolete product will drag you out of business. This article is only trying to help people understand the trade-offs when choosing not to at least diagnose every bug as they appear.

Charles Reace • Apr 7 '17

And the really important thing is that you actually think about it, and don't assume you know which is the right answer until you've done that thinking with the best information you can gather at the time -- which I think is ultimately what we all are saying. :)

Lucas Medeiros Reis • Apr 7 '17

That's it! :D

Rajasegar Chandran • Jul 16 '22

I love the Mediocristan and Extemistan ideas from NNT