Software Quality (8 Part Series)
In this article I’ll discuss what Code Coverage is and its usefulness and limitations. I’ll advocate for a risk-aware approach to software quality and give a few practical examples in C# and F#.
In the development world, the term code coverage gets thrown around a lot. There are many tools built around code coverage (some absolutely fantastic), and it is a very valuable metric to have. However, code coverage becomes misleading when it is put on too high a pedestal.
Consider the following C# code, noting the coverage indicators in the left margin:
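A minimal sketch of the kind of method under discussion, assuming a small DamageType enum with a Corruption member. The other member names, the DamageText class, and the GetName method are hypothetical, and the comments stand in for the coverage indicators.

```csharp
using System;

// Hypothetical enum; only the type name and "Corruption" come from the article.
public enum DamageType
{
    Physical,
    Fire,
    Corruption
}

public static class DamageText
{
    // Maps a damage type to the text shown to the player.
    public static string GetName(DamageType type)
    {
        switch (type)
        {
            case DamageType.Physical:
                return "Physical";   // covered by a test
            case DamageType.Fire:
                return "Fire";       // covered by a test
            case DamageType.Corruption:
                return "Corruption"; // covered by a test
            default:
                // Assumed to be the one line no test reaches.
                return "Unknown";
        }
    }
}
```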
Only 1 of this method’s 6 lines is not covered by tests. Since 5/6 is roughly 83%, we can say that this method has 83% code coverage, counting lines of code only.
Is that good? Is that bad? It depends.
On the one hand, 83% is pretty high up there. It’s not 100% coverage, but we know with confidence that the majority of statements execute without experiencing issues severe enough to fail their tests.
However, without seeing the specific tests, we can’t know for sure that we’re testing anything regarding the actual result of this method, only that it was called during at least one test execution.
There are absolutists out there who say code needs to hit an arbitrary degree of coverage. I most often hear people demanding 80% or 100% code coverage.
I think, however, that we need to consider risk in our testing efforts.
What’s the actual risk that we encounter here? What’s likely to break over time?
The code isn’t likely to suddenly return a slightly different spelling for Corruption damage. Additionally, adding a new case to the switch isn’t likely to cause an error the compiler won’t pick up. So what are we really testing?
With the code above, the most likely failure, in my estimation, is that we add a new member to the DamageType enum and fail to add a case for it here. This would result in an incorrect string coming back to the caller due to a line not currently being present.
And here’s the problem with code coverage: even if we have 100% code coverage of this method, adding a new member to that enum will cause a potential bug unless we remember to add a matching line here.
Now, if we’re trying to minimize the risk of future defects in C#, we can do something like the following:
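As a sketch of that idea, the fallback branch can throw instead of quietly returning a placeholder string. The member names other than Corruption, the DamageText class, and the GetName method are assumptions for illustration.

```csharp
using System;

// Hypothetical enum; only the type name and "Corruption" come from the article.
public enum DamageType
{
    Physical,
    Fire,
    Corruption
}

public static class DamageText
{
    public static string GetName(DamageType type)
    {
        switch (type)
        {
            case DamageType.Physical:
                return "Physical";
            case DamageType.Fire:
                return "Fire";
            case DamageType.Corruption:
                return "Corruption";
            default:
                // A member without a case fails loudly the first time it is used,
                // rather than silently producing the wrong text.
                throw new ArgumentOutOfRangeException(
                    nameof(type), type, "No name is defined for this damage type.");
        }
    }
}
```

The failure still only shows up at run time, and only if something actually passes the new member into this method.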
By throwing an argument out of range exception, we maximize the chance that we’ll find the issue. However, there’s still a risk that we wouldn’t find this case.
The best way of reducing risk of introducing a new enum value would be to transform this into a functional programming type of scenario and put functions related to the enum next to the enum. Here’s the equivalent code in F#:
The F# match expression is like the C# switch statement, but more powerful. Of note here, the compiler checks whether it is exhaustive over all possible inputs. If we ever add a new DamageType member, we will be warned if we don’t have a case for it.
Yes, that’s a bit of a straw man argument. Most code is more complex than a simple switch statement. Let’s look at another example:
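A rough sketch of a method with this shape, where only the gameObject parameter and the idea of cells come from the article. The Level, Cell, and GameObject types, the coordinate-based lookup, and the AddObject name are all assumptions, and the comments mark the two branches treated as uncovered.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types used only for illustration.
public class GameObject
{
    public int X { get; set; }
    public int Y { get; set; }
}

public class Cell
{
    public List<GameObject> Objects { get; } = new List<GameObject>();
}

public class Level
{
    private readonly Dictionary<(int X, int Y), Cell> _cells =
        new Dictionary<(int X, int Y), Cell>();

    public int CellCount => _cells.Count;

    public void AddObject(GameObject gameObject)
    {
        // Branch 1 (uncovered): argument validation at the boundary.
        if (gameObject == null)
        {
            throw new ArgumentNullException(nameof(gameObject));
        }

        var key = (gameObject.X, gameObject.Y);
        if (!_cells.TryGetValue(key, out var cell))
        {
            // Branch 2 (uncovered): the requested cell did not exist,
            // so create it on the fly and add it to the collection.
            cell = new Cell();
            _cells[key] = cell;
        }

        cell.Objects.Add(gameObject);
    }
}
```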
In this case our coverage is pretty good at 60%. There are two branches we have 0% coverage on, however.
The first is in validating gameObject if it came in as null. This is typically the sort of line that I tend to leave uncovered, as the null check is present only to catch potential issues at the boundary of a major piece of code. For example, I’d rather get an argument-related error at the beginning of a method than a vague null reference exception later on.
Note: An exception to this is if you’re building a public API to be consumed by numerous other individuals and you want to make sure that your validation behavior remains consistent from release to release.
The second branch we don’t have coverage on is the portion where the requested cell did not exist. In this case, that cell is created on the fly and added back into the collection.
So, do these test cases matter or is 60% acceptable?
My personal inclination here would be to ignore testing the argument validation (unless I’m building a public facing API). I would, however, like to see the dynamic creation of the cell incorporated into a test case.
It’s not that I believe that the line in this method will someday fail, but rather the lack of code coverage for this statement tells me that I may be missing significant unit test scenarios in other methods as well.
This line tells me that I either have dead code that will never be hit, or I need to figure out the larger scenario in which that line is hit and wrap a unit test around that process.
Finally, let’s close with a short example of 100% code coverage:
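A minimal sketch of this situation, assuming a hypothetical Calculator.DescribeSum method and a hand-rolled check written as a plain method rather than with any particular test framework.

```csharp
using System;

public static class Calculator
{
    // Supposed to report the sum of x and y, but the result is hard-coded.
    public static string DescribeSum(int x, int y)
    {
        return "The answer is 42";
    }
}

public static class CalculatorTests
{
    // Executes every line of DescribeSum, so coverage reads 100%,
    // yet it only verifies the prefix and never the number.
    public static bool AnswerHasExpectedPrefix()
    {
        string result = Calculator.DescribeSum(2, 2);
        return result.StartsWith("The answer is ", StringComparison.Ordinal);
    }
}
```

Comparing the whole string against “The answer is 4” instead would make this check fail and expose the defect.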
Here, we call an addition method with 2 and 2. This method should give us back “The answer is 4”. Unfortunately, the developer didn’t really do addition and the method always returns “The answer is 42”.
Worse, the unit tests only check that the string starts with the expected prefix, so the actual value is never tested.
As a result, we have 100% passing tests and a blatantly incorrect method.
This is why you should take code coverage metrics with a grain of salt: just because a line is executed by a test doesn’t mean that the line is correct or accurately tested.
I’m not saying we stop tracking code coverage. I’ve actually integrated it into my builds with OpenCover recently, and have been very happy to watch my coverage climb.
What I am saying is that we should view code coverage only as a small part of the picture in ensuring quality over time.
When I add or change code, my expectation is that I have at least one safety net for catching issues with the changes before the code even makes it to quality assurance for review.
Sometimes the safety net is a unit test. In fact, this is my preferred way of adding a safety check to modified code.
Other times the safety net involves leaning on the compiler and/or source analysis tools to find blatant issues.
Manual testing by the developer is a good gate to go through as well, with a quick visual inspection or API call to verify that all is working well.
Keep testing your code, and keep tracking coverage, but keep in mind that what we’re after is not a magic number that climbs ever higher. What we’re after is an efficient, risk-aware software development process that reduces the risk of both adding code and maintaining it in the future.