Code coverage shows, as a percentage, to what degree an application is executed by a testing framework. It is used for quality assurance and "helps" to increase the quality of software.
In this article I will demonstrate why code coverage will not help to increase quality, how it can be faked and what you can do instead to ensure quality.
Most tools report coverage as the percentage of executed lines of code. Normally unit, integration and end-to-end tests increase the coverage, which means that in theory you have to write the tests yourself to get the code tested. But is that really the only way? Let's see if that is really the case with a C# example.
namespace ConsoleApp1
{
    public class SimpleClass
    {
        public string SimpleProperty { get; set; }

        public string SimpleMethod(bool value)
        {
            return value ? "simple yes" : "simple no";
        }

        public int SimpleMethod2(string value)
        {
            return value.Length;
        }

        public bool SimpleMethod3()
        {
            return true;
        }
    }
}
I have created a small class with a property and three methods doing basic things. A test written for a class like this might look like the code below.
[Test]
public void Test1()
{
    SimpleClass simpleClass = new SimpleClass();
    string value = simpleClass.SimpleMethod(true);
    Assert.IsTrue(value == "simple yes");
}
This simple test executes one method and asserts the output.
There is a more complex way to "test" a class: automating the execution of its methods, which means the code below will automatically "test" other classes too. The code is very basic and more of an example; it should only help to illustrate my point.
[Test]
public void Test2()
{
    Assembly assembly = Assembly.Load("ClassLibrary1");
    Type[] objectTypes = assembly.GetTypes();
    var parameterInstances = new Dictionary<Type, List<object>>();

    foreach (Type objectType in objectTypes)
    {
        // Only concrete classes with a parameterless constructor can be instantiated.
        if (!objectType.IsClass || objectType.IsAbstract)
        {
            continue;
        }
        ConstructorInfo objectConstructor = objectType.GetConstructor(Type.EmptyTypes);
        if (objectConstructor == null)
        {
            continue;
        }
        object objectInstance = objectConstructor.Invoke(new object[] { });

        var methodInfos = objectType.GetMethods();
        foreach (var methodInfo in methodInfos)
        {
            // Collect candidate values for every parameter type, cached per type.
            var parameterInfos = methodInfo.GetParameters();
            var methodParameters = new List<List<object>>();
            for (int i = 0; i < parameterInfos.Length; i++)
            {
                Type parameterType = parameterInfos[i].ParameterType;
                if (!parameterInstances.ContainsKey(parameterType))
                {
                    if (parameterType.IsValueType)
                    {
                        parameterInstances.Add(parameterType, CreateValPossibilities(parameterType));
                    }
                    else
                    {
                        parameterInstances.Add(parameterType, CreateRefPossibilities(parameterType));
                    }
                }
                methodParameters.Add(parameterInstances[parameterType]);
            }

            // Invoke the method with every permutation of the candidate values.
            List<List<object>> parameterPermutations = GetAllPossibleCombinations(methodParameters);
            foreach (var parameterPermutation in parameterPermutations)
            {
                methodInfo.Invoke(objectInstance, parameterPermutation.ToArray());
            }
        }
    }
}
To explain what this code does with the power of reflection:
- It loads the "ClassLibrary1" assembly and collects all types it contains
- It iterates through the type collection and creates an instance of every type that is a class
- It iterates through the methods of each class, creates parameters with specific values of the required types and builds a list of permutations to pass to those methods for execution (working somewhat like a fuzzer; a sketch of the helper methods follows below)
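The helper methods CreateValPossibilities, CreateRefPossibilities and GetAllPossibleCombinations are not shown above, so here is a minimal sketch of how they could look. The concrete candidate values are my own assumptions, and the last method needs using System.Linq.

private static List<object> CreateValPossibilities(Type type)
{
    // Boundary values for the most common value types; everything else
    // falls back to the type's default value.
    if (type == typeof(bool))
    {
        return new List<object> { true, false };
    }
    if (type == typeof(int))
    {
        return new List<object> { 0, 1, -1, int.MinValue, int.MaxValue };
    }
    return new List<object> { Activator.CreateInstance(type) };
}

private static List<object> CreateRefPossibilities(Type type)
{
    // Strings get sample values; other reference types an instance from
    // their parameterless constructor, or null as a last resort.
    if (type == typeof(string))
    {
        return new List<object> { string.Empty, "some text" };
    }
    if (type.GetConstructor(Type.EmptyTypes) != null)
    {
        return new List<object> { Activator.CreateInstance(type) };
    }
    return new List<object> { null };
}

private static List<List<object>> GetAllPossibleCombinations(List<List<object>> lists)
{
    // Cartesian product: start with one empty combination and extend it
    // with every candidate value of every parameter position.
    var combinations = new List<List<object>> { new List<object>() };
    foreach (var list in lists)
    {
        combinations = combinations
            .SelectMany(c => list, (c, value) => new List<object>(c) { value })
            .ToList();
    }
    return combinations;
}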
Guess the code coverage of SimpleClass …
In other scenarios this code sample will not be able to execute every line of code, since there are branches you cannot cover that simply. Sure, you can tailor the code to reach an even higher degree of line coverage, but this does not enhance quality. In theory we have now achieved "quality" when reading this KPI. Executing code without real context, just for the sake of executing it, does not make any sense. But what should you strive for instead?
Context! We need context to know about the environment, the users, the input and the processes. To explain why we need context, let's make an example with SimpleClass and take a look at SimpleMethod2.
public int SimpleMethod2(string value)
{
    return value.Length;
}
This method takes a string as parameter and returns its length. When we execute a test with a string, we have covered the complete method and reached 100% here. Cool? Well, when you look more closely, you will notice the problem. Do we really ensure "quality"?
The int here is an Int32 struct whose lowest possible value is -2147483648 and whose highest is 2147483647. Since the Length property of string is of type Int32, the theoretical maximum length of a string is the highest possible value of Int32. In practice you will not get anywhere near this number, since the Common Language Runtime (CLR) limits the size of a single object to 2 GB (you could in fact change that in the configuration, but normally there is no reason to). By default a string uses UTF-16, which needs two bytes per character, so in practice you can reach a length of roughly 1073741823.
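As a quick back-of-the-envelope check of those numbers:

// Int32 bounds and the rough practical string length limit, assuming the
// default 2 GB object size limit of the CLR and 2 bytes per UTF-16 char.
Console.WriteLine(int.MinValue);      // -2147483648
Console.WriteLine(int.MaxValue);      // 2147483647
Console.WriteLine(int.MaxValue / 2);  // 1073741823 characters fit below 2 GB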
Ok, the length of the string will not cause any problems, but what can? Strings are reference types, which means we have the power of null! When the parameter is null, the code will crash.
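A small test (a hypothetical Test3) makes the crash visible: the Length access inside SimpleMethod2 dereferences null and throws a NullReferenceException.

[Test]
public void Test3()
{
    SimpleClass simpleClass = new SimpleClass();
    // value.Length is evaluated on a null reference inside SimpleMethod2.
    Assert.Throws<NullReferenceException>(() => simpleClass.SimpleMethod2(null));
}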
My point? Exactly this. Even though we have a high code coverage in this method, we still have the possibility that the application crashes.
We need context to understand what can happen and to model our test cases accordingly. Instead of caring so much about the code coverage KPI, we should cover the different use cases and misuse cases.
Shortly explained: while a use case defines the interactions between an actor and a system as a set of steps to accomplish a certain goal, a misuse case describes malicious acts against a system. But how can we ensure quality with context?
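Applied to SimpleMethod2, this could translate into one test per case. The scenario names and tests are my own illustration:

[Test]
public void UseCase_UserEntersAName_LengthIsReturned()
{
    // Use case: a regular user passes ordinary input.
    SimpleClass simpleClass = new SimpleClass();
    Assert.AreEqual(5, simpleClass.SimpleMethod2("Alice"));
}

[Test]
public void MisuseCase_CallerPassesNull()
{
    // Misuse case: a (possibly malicious) caller omits the input. This test
    // documents the current crash; modelling the misuse case makes the gap
    // visible and drives a fix such as a guard clause.
    SimpleClass simpleClass = new SimpleClass();
    Assert.Throws<NullReferenceException>(() => simpleClass.SimpleMethod2(null));
}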
We have to take one step back and define what quality is and which quality characteristics are needed to ensure "quality."
Quality is conditional, subjective and understood differently by different people. While a consumer focuses on specification quality, producers have conformance quality in mind. Knowing the different focus of each group, we can dig deeper. To do this we can use the quality models defined in ISO 25010, which is part of the 25000 series, also known as SQuaRE (Systems and software Quality Requirements and Evaluation).
The first model is quality in use. It shows how the product meets the needs of the users to achieve specific goals; its characteristics are effectiveness, efficiency, satisfaction, freedom from risk and context coverage.
The second one is product quality. It consists of eight characteristics (functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability and portability), which in turn are composed of sub-characteristics.
Knowing these quality models helps us to see the usage and the product from different perspectives. With those characteristics in mind, we can enhance the requirements and refine them to be more specific about the needs. Together with the use cases and misuse cases, we can then test the code against different scenarios and against functional as well as non-functional requirements. Knowing the different scenarios also allows you to build your own KPIs. With high context coverage you will inherently reach high code coverage, but this does not work the other way around.
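As a sketch of what such a KPI could look like, here is a purely illustrative "context coverage" calculation; the scenario list is invented and the snippet requires using System.Linq.

// Hypothetical KPI: covered scenarios divided by all known scenarios.
var scenarios = new Dictionary<string, bool>
{
    ["use case: user enters a name"] = true,
    ["misuse case: caller passes null"] = true,
    ["misuse case: caller passes oversized input"] = false,
};
double contextCoverage = 100.0 * scenarios.Count(s => s.Value) / scenarios.Count;
Console.WriteLine($"Context coverage: {contextCoverage:F0}%"); // 67%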
Testing your code automatically is important, and working with effective KPIs helps you to focus on blind spots. Some KPIs are more effective than others, so keep in mind what data is collected and processed to produce the result, since a KPI can lead to false conclusions.