The price of software: Unit testing

#testing #architecture #tdd #microservices

Test automation is one of the best investments an engineering team can make when developing software. The industry accepts that unit testing is the fastest and simplest way to automate tests. But does that mean we should unit test our whole codebase? How much of our code should we cover with unit tests? How can we be most efficient?

Measuring Test Effectiveness

Finding bugs during development is cheaper than fixing them afterward. Handling user concerns, maintaining transaction consistency, isolating the defect, and then releasing a patch are all expensive.

The goal of testing is to reduce the cost of software.

The difference between the cost of the bugs we prevent and the price of the testing process is the real return on investment of writing tests.

Code Coverage

The traditional way to measure how much we are testing is code coverage: the percentage of source code that runs when executing tests.

In addition to the number of lines of code, different criteria can be used in the coverage calculation: function coverage, statement coverage, and so on. However, they all measure the same thing: the ratio of code executed when running tests to total code.
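A minimal sketch of the line-based variant of this ratio, assuming we already know which lines are executable and which ones ran. Real coverage tools collect these sets for us. The function name and the sample data are made up for illustration.

```python
def line_coverage(executable_lines: set[int], executed_lines: set[int]) -> float:
    """Percentage of executable lines that ran at least once during the tests."""
    if not executable_lines:
        # Assumed convention: a file with nothing to cover counts as fully covered.
        return 100.0
    covered = executable_lines & executed_lines
    return 100.0 * len(covered) / len(executable_lines)


executable = set(range(1, 21))  # a hypothetical file with 20 executable lines
executed = set(range(1, 16))    # the tests ran the first 15 of them
print(line_coverage(executable, executed))  # 75.0
```

Function or statement coverage follows the same shape, with functions or statements in place of line numbers.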

The assumption behind measuring code coverage is that a program that is entirely tested will have fewer defects. In practice, aiming for 100% coverage is not very realistic.

Testing Fatigue

The cost of writing unit tests increases exponentially as we get closer to 100% coverage. The number of defects detected decreases at the same rate.

Writing tests to get the test coverage from 90% to 100% is expensive and highly ineffective in preventing bugs.

Aiming for complete test coverage couples the tests to the implementation to a level where the development effort for a feature can even double. We slowly fall out of sync with product development when we spend too much time testing implementation details.

The Cost of Unit Testing

Because unit testing is the fastest way to test, we fall into the trap of overusing it. To increase code coverage, we test code that does not correspond to a specific unit but ties many units together. The effort of mocking and stubbing becomes higher than that of writing the business logic itself. At that point, we are writing integration tests using the unit test approach and tools.

The return on investment of those unit tests is minimal because they rely heavily on mocking their dependencies and will pass even if those dependencies have changed their interfaces. Making mocks and fakes aware of changes in the constructs they replace results in heavy maintenance work.
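The snippet below is a small illustration of that problem, using Python's standard `unittest.mock`. `PaymentGateway` and `checkout` are hypothetical names invented for this example. A plain `Mock` accepts any call, so the test keeps passing after the dependency's interface changes. A mock built with `create_autospec` checks calls against the real signature and catches the drift, at the price of tying the test setup to the real class.

```python
from unittest.mock import Mock, create_autospec


class PaymentGateway:
    """Hypothetical dependency whose interface changed: charge() now needs a currency."""

    def charge(self, amount: int, currency: str) -> bool:
        raise NotImplementedError("would talk to a real payment provider")


def checkout(gateway, amount: int) -> bool:
    # The caller was never updated and still uses the old one-argument call.
    return gateway.charge(amount)


# A plain Mock accepts any arguments, so the outdated call goes unnoticed.
loose_gateway = Mock()
loose_gateway.charge.return_value = True
assert checkout(loose_gateway, 100) is True

# An autospec'd mock validates the call against PaymentGateway.charge.
strict_gateway = create_autospec(PaymentGateway, instance=True)
strict_gateway.charge.return_value = True
try:
    checkout(strict_gateway, 100)
    drift_detected = False
except TypeError:
    drift_detected = True

print(drift_detected)
```

Even with spec'd mocks, only a test that exercises the real dependency tells us whether the two sides actually work together.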

Abusing unit testing adds overhead to the development process without reducing defect density.

It is usually much more cost-effective to write an integration test than several unit tests if we already have a decent integration test setup.

Using traditional code coverage as a synonym for program correctness or code health will lead to a negative return on test efforts and increase development time rather than decrease it.

The Effective Code Coverage

Tests are beneficial if they prevent the maximum number of bugs per unit of time invested. Code coverage is only useful if we include the context in the calculation. A better parameter to quantify our test efforts is something we can call the effective code coverage. It is the percentage of the code the development team decided should be unit tested that actually executes when running tests.

Depending on the project and the programming language, the code we choose to unit test will rarely be more than 20% to 40% of the total source code. That is because we target code that:

can be identified as a unit
is ideally reused
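To make the idea concrete, here is a minimal sketch that compares traditional coverage with effective coverage. The `Module` structure, the `unit_test_target` flag, and all module names and line counts are assumptions made up for this example. In practice, the flag is whatever the team agrees on.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    executable_lines: int
    executed_lines: int      # lines run by the unit tests
    unit_test_target: bool   # team decision: a reusable, well-defined unit?


def coverage(modules: list[Module]) -> float:
    """Traditional coverage: executed lines over all executable lines."""
    total = sum(m.executable_lines for m in modules)
    if total == 0:
        return 0.0
    return 100.0 * sum(m.executed_lines for m in modules) / total


def effective_coverage(modules: list[Module]) -> float:
    """Coverage computed only over the modules the team chose to unit test."""
    return coverage([m for m in modules if m.unit_test_target])


# Hypothetical project: 300 of 1000 lines (30%) are unit test targets.
modules = [
    Module("pricing_rules", 200, 180, True),
    Module("date_utils", 100, 90, True),
    Module("http_controllers", 400, 40, False),
    Module("dependency_wiring", 300, 0, False),
]

print(coverage(modules))            # 31.0
print(effective_coverage(modules))  # 90.0
```

In this made-up project, a traditional report would show 31% and look alarming, while the code the team actually wants unit tested is 90% covered. The controllers and wiring are left to integration or end-to-end tests.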

Because tests do not suffocate development, the return on investment of this method is much higher. The approach frees up testing time that would normally go into unit assertions, so we can invest it in integration or end-to-end tests instead.

This testing philosophy requires flexibility in thinking and relies heavily on context: the kind of software we are writing and its purpose, the language, the frameworks, and the architecture choices we make. Therefore, the parts of the source code that must be unit tested change with the same frequency as our product does.

Although counter-intuitive, decreasing the amount of unit testing can improve the overall code quality.

We should avoid writing unit tests that are too close to implementation details. Instead, focusing on reusable modules and their interfaces will maximize the number of defects identified per unit of time.

Approach unit testing in the same way you approach any form of automation. Traditional process analysis, including cost and time, helps you make a much better decision than absolute measures of coverage. The remaining testing time can be invested in more sophisticated testing methods that are closer to user expectations and change less frequently.

Summing up

Rather than relying on absolute metrics, use context in your decision making. Be intentional about your choices and always keep an open mind.

Testing setups with the highest return on investment are a mix of unit, integration, end-to-end, and manual tests. The perfect blend depends on your product, tech stack, and team.

Did you enjoy the content? Follow me on dev.to and LinkedIn to get more of this in your feed. Originally published on bogdanned.com.
