Rubén Rubio

Who tests the tests? Mutation testing with Infection in PHP

Introduction

In software engineering, writing tests is a good practice that allows us to increase confidence in our code. However, how can we be sure that the tests we write check the code correctly? How can we build trust in our tests?

One of the most well-known metrics is code coverage: the percentage of lines of our code that the tests execute. However, this metric alone is not enough.

For instance, suppose we have tests that execute all our code, but the only assertion we make in every test is assertTrue(true). In this case, code coverage is 100%, yet the tests verify nothing, so they are not trustworthy.

What is the alternative, then? What is known as mutation testing. It consists of the following steps:

  1. First, all tests are executed to check that they pass.
  2. The source code of our application is modified in a small way that should make some test fail. For example, replacing a > with a < in a comparison. Each modified version of the code is called a mutant.
  3. All tests are executed again.
  4. Here we have two possibilities:
    1. If there is a test that fails, it means that a mutant was killed. That is positive. To have a good test suite, all mutants should be killed.
    2. The tests keep passing without failure, which means the mutant has survived. This could be because of two things:
      • The mutated line of code is not covered by the tests
      • The tests for that line are not really useful.
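
The steps above can be sketched in a few lines of plain PHP. This is only an illustration of the idea, not how any real tool is implemented: the function under test, the hand-written mutants and the tiny "suite" are all hypothetical, and a real tool generates mutants by rewriting the source code.

```php
<?php

declare(strict_types=1);

// Hypothetical function under test: free delivery from 50 upwards (illustrative only).
$original = static fn (int $total): bool => $total >= 50;

// Hand-written mutants; a real tool generates these by rewriting the source code.
$mutants = [
    '>= replaced by >' => static fn (int $total): bool => $total > 50,
    '>= replaced by <' => static fn (int $total): bool => $total < 50,
    'always true'      => static fn (int $total): bool => true,
];

// A tiny "test suite": returns true when every check passes for the given implementation.
$suite = static fn (callable $impl): bool => $impl(100) === true && $impl(10) === false;

// Step 1: the suite must pass against the original code.
if (!$suite($original)) {
    throw new RuntimeException('The suite must be green before mutating.');
}

// Steps 2 to 4: run the suite against each mutant and classify it.
$killed = [];
$escaped = [];
foreach ($mutants as $name => $mutant) {
    if ($suite($mutant)) {
        $escaped[] = $name; // suite still green: the mutant survived
    } else {
        $killed[] = $name;  // suite went red: the mutant was killed
    }
}

echo 'Killed: ', implode(', ', $killed), PHP_EOL;
echo 'Escaped: ', implode(', ', $escaped), PHP_EOL;
```

In this sketch the mutant that replaces >= with > escapes, because the suite never checks the value 50 itself.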

Generating mutants by hand is not practical for a real code base. For that purpose, there are mutation testing utilities. For PHP, we have Infection.

In this post, we will see how it works and how to set it up in a Symfony project that uses hexagonal architecture.

Infection

Metrics

Infection uses the following metrics:

  • Mutation Score Indicator (MSI): the percentage of detected (killed) mutants out of all the mutants generated for our code. The higher this value, the more robust our tests are.
  • Mutation Code Coverage (MCC): the percentage of generated mutants that fall in code covered by our tests. It is usually close to code coverage.
  • Covered Code Mutation Score Indicator: the MSI computed only over the mutants in code that is actually covered by our tests.

The metric that we will use to measure the quality of our tests is the MSI.
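
As a rough sketch of how the three scores relate, the following plain PHP function computes them from raw mutant counts. The function and parameter names are illustrative and are not part of Infection's API; the counts in the usage example are made up. As far as I know, Infection also counts mutants that time out or cause errors as detected, which is why the parameter is called $defeated rather than $killed.

```php
<?php

declare(strict_types=1);

/**
 * Computes the three scores from raw mutant counts.
 * Names are illustrative; this is not Infection's internal API.
 *
 * @return array{msi: float, mutationCodeCoverage: float, coveredCodeMsi: float}
 */
function mutationMetrics(int $total, int $coveredByTests, int $defeated): array
{
    // Guard against division by zero when nothing was generated or covered.
    $percentage = static fn (int $part, int $whole): float => $whole === 0 ? 0.0 : $part * 100 / $whole;

    return [
        'msi' => $percentage($defeated, $total),
        'mutationCodeCoverage' => $percentage($coveredByTests, $total),
        'coveredCodeMsi' => $percentage($defeated, $coveredByTests),
    ];
}

// Hypothetical run: 10 mutants generated, 8 of them in covered code, 6 defeated.
$metrics = mutationMetrics(10, 8, 6);

printf("MSI: %.1f%%\n", $metrics['msi']);
printf("Mutation Code Coverage: %.1f%%\n", $metrics['mutationCodeCoverage']);
printf("Covered Code MSI: %.1f%%\n", $metrics['coveredCodeMsi']);
```

With these made-up counts, the MSI (60%) is lower than the Covered Code MSI (75%) because two mutants sit in code that no test executes.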

Example

Suppose we have the following method that checks if a number is positive or not (for simplicity, we will consider 0 as a positive number, even though that is not strictly correct):


final readonly class NumberChecker
{
    public static function isPositive(int $number): bool
    {
        return $number >= 0;
    }
}

And we have the following test for it:

use PHPUnit\Framework\TestCase;

final class NumberCheckerTest extends TestCase
{
    public function test_isPositive(): void
    {
        self::assertTrue(
            NumberChecker::isPositive(10),
        );

        self::assertFalse(
            NumberChecker::isPositive(-10),
        );
    }
}


The test passes. However, what is its MSI? We can get it by executing Infection for that class:

infection --threads=max --filter=NumberChecker.php --show-mutations

The metrics we got were the following:

Metrics:
         Mutation Score Indicator (MSI): 66%
         Mutation Code Coverage: 100%
         Covered Code MSI: 66%

The MSI is not very high, even though code coverage is 100%. We can see the escaped mutants in the output of the previous command:

Escaped mutants:
================


1) NumberChecker.php:11    [M] GreaterThanOrEqualTo

--- Original
+++ New
@@ @@
 {
     public static function isPositive(int $number) : bool
     {
-        return $number >= 0;
+        return $number > 0;
     }
 }

We are not correctly checking the limit of the comparison. When writing tests for intervals, we must always test the limits. From 0 upward, all numbers are positive, so testing 10 is equivalent to testing 987,654,321. The important value is 0.

The same applies on the other side of the limit: from -1 downward, all numbers are negative, so the relevant value is -1. Testing it is not strictly necessary in this case, but it is good practice.

Thus, for intervals, we must always test the limits because that is where the critical values are.
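
To see why only the limit matters, here is a small plain PHP sketch (not PHPUnit, and not something Infection produces) that compares the original comparison with the escaped mutant over several candidate inputs. An input can only kill the mutant if the two versions disagree on it. The list of candidate inputs is arbitrary.

```php
<?php

declare(strict_types=1);

$original = static fn (int $number): bool => $number >= 0; // code under test
$mutant   = static fn (int $number): bool => $number > 0;  // the escaped mutant

// An input can kill the mutant only if both versions return different results for it.
$killers = array_values(array_filter(
    [10, -10, 987_654_321, 0, -1],
    static fn (int $input): bool => $original($input) !== $mutant($input),
));

echo 'Inputs that kill the mutant: ', implode(', ', $killers), PHP_EOL;
// prints "Inputs that kill the mutant: 0"
```

Every other input gives the same result for both versions, so a test built only on those values cannot tell them apart.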

We can then rewrite the test as follows:

use PHPUnit\Framework\TestCase;

final class NumberCheckerTest extends TestCase
{
    public function test_isPositive(): void
    {
        self::assertTrue(
            NumberChecker::isPositive(0),
        );

        self::assertFalse(
            NumberChecker::isPositive(-1),
        );
    }
}


If we execute Infection again, we now get an MSI of 100%:

Metrics:
         Mutation Score Indicator (MSI): 100%
         Mutation Code Coverage: 100%
         Covered Code MSI: 100%

This is only an example. Infection generates plenty of mutants. You can check them all out in the documentation.

Configuration and execution

Normally, we want to execute Infection for our whole project. However, if we use hexagonal architecture, we would only want to execute Infection for the unit tests of our domain for two reasons.

First, if we generated mutants while executing non-unit tests (integration, acceptance, functional or end-to-end), we would run into timeouts. Infection runs tests again for each mutant it generates, so executing slow tests is not feasible.

Second, if we executed Infection for the infrastructure layer, we would generate mutants for code that integrates with third-party code that is not under our control. We would therefore spend effort on a layer that is irrelevant to our business rules.

Therefore, for a Symfony project that applies hexagonal architecture, we could have the following infection.json.dist configuration file in the root folder of our project:

{
    "$schema": "https://raw.githubusercontent.com/infection/infection/master/resources/schema.json",
    "source": {
        "directories": [
            "src"
        ],
        "excludes": [
            "{Infrastructure/.*}",
            "{Domain/Exception/.*}"
        ]
    },
    "logs": {
        "html": "var/log/infection/infection.html",
        "text": "var/log/infection/infection.log",
        "summary": "var/log/infection/infection-summary.log",
        "debug": "var/log/infection/infection-debug.log",
        "perMutator": "var/log/infection/infection-permutator.log"
    },
    "mutators": {
        "@default": true
    }
}

What we set up is:

  • Analyze the whole content of the src folder.
  • Exclude all domain exceptions, as it is code that we would not test directly. We also exclude the infrastructure layer, as we said previously.
  • Place all generated logs in the same folder where Symfony writes its own logs.

Suppose we have the following PHPUnit configuration, where we split our Unit and Functional suites:

<?xml version="1.0" encoding="UTF-8"?>
<!-- https://phpunit.readthedocs.io/en/latest/configuration.html -->
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="https://schema.phpunit.de/10.0/phpunit.xsd"
         backupGlobals="false"
         colors="true"
         bootstrap="tests/bootstrap.php"
         executionOrder="random"
         resolveDependencies="true"
         cacheDirectory=".phpunit.cache">
  <php>
    <ini name="display_errors" value="1"/>
    <ini name="error_reporting" value="-1"/>
    <server name="APP_ENV" value="test" force="true"/>
    <server name="SHELL_VERBOSITY" value="-1"/>
    <env name="KERNEL_CLASS" value="rubenrubiob\Infrastructure\Symfony\Kernel" />
  </php>
  <testsuites>
    <testsuite name="Unit">
      <directory>tests/Unit</directory>
    </testsuite>
    <testsuite name="Functional">
      <directory>tests/Functional</directory>
    </testsuite>
  </testsuites>
</phpunit>

We could then execute Infection only for the Unit suite in the following way:

infection --threads=max --min-msi=100 --test-framework-options="--testsuite=Unit"

With this command:

  • With --threads=max, we use as many threads as the machine makes available to speed up execution.
  • With the --min-msi option, we force a non-zero return code if the MSI does not reach 100%. This is useful for pipelines.
  • With the --test-framework-options option, we execute only the Unit suite.

With the configuration we set up, Infection generates an HTML log that helps track the mutants, both the killed ones and the escaped ones. We can see the summary for the example we saw:

[Image: Infection: general summary]

And the escaped mutant:

[Image: Infection: escaped mutant]

Conclusion

With this configuration, we can include Infection in our projects to increase the quality of our test suite. It is important to take into account that it may not be possible to achieve an MSI of 100% for all projects. And, in case we use a CI/CD pipeline, it may be useful to have a margin in case we need to deploy a hotfix some day. As always, it is important to adapt the configuration to the project.

Summary

  • We saw the importance of MSI as a quality metric for our tests and as an alternative to line coverage.
  • We explained the general operation of mutation testing.
  • We showed the main concepts Infection uses.
  • We reviewed an example of how to change our tests to kill mutants and increase the MSI.
  • We showed an Infection configuration for Symfony projects using hexagonal architecture.
