Jannis Gansen

🧨 Think twice before setting code coverage as a goal for your teams

Sometimes we focus so much on measurements and KPIs that we forget what we actually want to achieve. Code coverage is a case where architects and leaders often end up revolving around numbers instead of focusing on the outcomes they want. I'll share my view on code coverage and why I prefer not to set it as a goal for teams when the aim is to improve test awareness.


💥 When you don't test (...the right things)

Let's say a team in your product just rolled out a new release with a few features, nothing groundbreaking. Your users report bugs that “things are not working anymore”. Customer support is on it and discovers that a certain user role indeed cannot access the frontend anymore. A hotfix is on the way and will be rolled out tomorrow. Users report that it fixed the issue…but broke something else.

Sound familiar? Chances are that your team missed some important scenarios in its automated tests, or doesn't have any tests at all. This is more common than most companies want to admit. Maybe the team decided that tests only slow them down, maybe your culture didn't support testing when the system was originally developed, or maybe the developers just didn't want to test, or didn't know how to.

Indeed you can develop without testing, especially in the beginning of a new product. But sooner rather than later your performance and/or quality will drop. I sketched a fictional product lifecycle here:

When software scales, tests are necessary

A lack of tests is usually not a big deal until you hit the first red line. A developer can implement their feature and test it manually. Afterwards your product owner checks again and you can release quickly. Usually you don't have a large user base at this point, so even if something breaks you can roll out a fix quickly.
Things start to get nasty as soon as you implement more and more features, especially once features affect each other. For example, adding roles to your app suddenly requires you to make sure your product behaves correctly even when the current user has certain restrictions. Or you have a large form with fields whose values and validation depend on each other - it's repetitive, boring and cumbersome to test all combinations manually.

Things don't get better, but worse from here on. Over time you will have larger changes in your application, refactorings that don't change functionality, and changing team members who may not know the original intention of your features and code. Combine this with a large user base and a lack of purposeful automated tests, and you will most likely check a few of the following boxes:

  • Long release cycles due to extensive regression tests
  • Lots of meetings about possible risks, with detailed release coordination and code freezes
  • A high test-to-development ratio even for small changes: features are implemented in a matter of hours or days, but a rollout still takes weeks
  • Features that are implemented as workarounds and a continuous buildup of technical debt, as developers avoid larger changes

In the end it's always about confidence and trust: Do your developers have confidence that their changes work without issues? Can your organisation trust that your teams deliver tested and working features? If you don't build up automated regression tests that keep complexity in check and provide a safety net for human error, the answer to these questions will sooner or later be ‘no'.

One indicator of missing tests is code coverage. Code coverage tells you how much of your production code was ‘passed by' while running your tests. Here's a small visualisation in case you're not that familiar with the term:

Code Coverage in a nutshell

It's important to note that a test coverage of 50% means that half of your code is not tested at all - but it doesn't necessarily tell you whether the other 50% of your code is tested properly.
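To make this concrete, here is a minimal sketch (the function and test are invented for illustration). A coverage tool such as coverage.py or the pytest-cov plugin would report the non-premium branch below as uncovered, so this snippet sits at roughly 50% branch coverage even though its one test passes:

```python
# Hypothetical example: two branches, only one exercised by a test.
def discount(price: float, is_premium: bool) -> float:
    if is_premium:
        return price * 0.5  # executed by the test below
    return price            # never executed -> reported as uncovered


def test_premium_discount():
    # Run e.g. with "pytest --cov" (pytest-cov) to see the missed branch.
    assert discount(100.0, is_premium=True) == 50.0
```

Note that even the covered branch is only ‘passed by': the test asserts one value and says nothing about whether the discount rule itself is correct.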

Still, code coverage is an extremely valuable measurement. A classic reaction in tech management is therefore to introduce a 60-80% code coverage goal when there's a lack of testing. I would advise you to think twice before doing that.

📉 Setting a target for code coverage may not change behaviour the way you intended

Code coverage is a KPI that indicates how good the test awareness and discipline of your team are. This is extremely valuable: as soon as teams have more awareness and motivation to test, your coverage will go up. But Goodhart's law applies here: as soon as your code coverage becomes a target in your organization, it loses its value.

You won't measure test awareness anymore, but conformity to the rules.

Your teams won't suddenly become test aficionados. Their performance might drop (‘I'm finished with the story, but I just need to write tests') and the reason why they didn't write tests in the first place hasn't changed. Nowadays, they will probably just use Copilot or ChatGPT instead of thinking about purposeful tests, as testing becomes just a regulatory requirement.

Also be aware that we're not living in the 2000s/2010s anymore: many products are built by integrating existing solutions, and low-code/no-code may be used to implement parts of your core domain. As no code is written for these parts, you can't use code coverage to measure whether they are tested.

My personal experience is that code coverage goals often lead to codebases with 60-70% coverage and tests that simply mirror the implementation. Teams do the classic ‘let's block one month where we bump the code coverage to meet the goal'. The aftermath is that the tests don't add value and need constant attention, as they tend to verify not the behaviour, but the implementation of your code.
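To illustrate the difference, here is a small hypothetical sketch (OrderService and its repository are invented for this example). The first test mirrors the implementation and breaks on every refactoring; the second verifies observable behaviour and only breaks when the actual contract changes:

```python
from unittest import mock


class OrderService:
    def __init__(self, repository):
        self.repository = repository

    def place_order(self, order: dict) -> dict:
        order["status"] = "placed"
        self.repository.save(order)
        return order


def test_place_order_calls_save():
    # Implementation-mirroring: asserts *how* the code works internally.
    repo = mock.Mock()
    OrderService(repo).place_order({"id": 1})
    repo.save.assert_called_once()


def test_placed_order_is_persisted_with_status():
    # Behaviour-verifying: asserts *what* the caller can rely on -
    # a placed order ends up persisted with status "placed".
    saved = []
    repo = mock.Mock(save=saved.append)
    OrderService(repo).place_order({"id": 1})
    assert saved == [{"id": 1, "status": "placed"}]
```

Both tests produce the same coverage, but only the second one tells you something useful when it fails.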

Rather than misusing code coverage as leverage for increasing the amount of tests, I recommend an outcome-driven approach that focuses on the bigger picture. You don't want more tests. Period. What you want is to improve confidence, trust, reliability, efficiency, and so on. This is the impact you should aim for, and it should be the reason you keep thinking about tests in the first place.

🧪 Outcome oriented testing

Instead of demanding a specific code coverage, you should take a step back and ask yourself what you actually want to achieve. Let's start with the original intention - increasing tests. I consider this an output-centric approach. I'm taking inspiration from Joshua Seiden's book, Outcomes Over Output, here.

An output-driven approach

“We want the developers to write more tests and have a higher coverage”:
The reason why I don't like this approach is that tests are simply an output your team generates - they may or may not add value. But you don't want your team to just write tests. Tests without a purpose are overhead that consumes time and capacity without adding value. Imagine you run a customer portal that offers users self-service for your business needs. Your team spent a whole sprint writing unit and integration tests for the customer profile function. This functionality is quite old and didn't have any tests, but it is also rarely used, as users don't bother filling in an ‘about me'. This is an example of an output that doesn't provide any outcome.

This is obviously not what we wanted, let's rephrase our goal to be more outcome-oriented:

Focusing on a desired outcome

“We want to reduce the amount of manual regression testing”
This is way better, as it leaves it up to your team how to do this, and you can actually measure it and set a clear goal (“We want to shorten the regression testing phase from three days to one day”). This is an outcome that can be achieved by writing purposeful automated tests. Josh Seiden explains that an outcome is a change in behaviour that is assumed to create positive results. We want to change the behaviour of the team towards efficiency: people spend less time doing manual, repetitive testing.
Let's revisit our customer-portal example: if your team has the goal of reducing manual regression tests, they would not prioritise the profile function. Rather, they'd choose to implement tests for parts that cause a lot of manual effort, like the error handling in an order form with many combinations. But the team could also go another route: maybe they remove features that are not used by customers but take a lot of time to verify, and add tests for the remaining features. You kept the outcome open, which led to a far better solution than writing tests for unneeded features.
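As a sketch of what that could look like (validate_order and its rules are hypothetical, invented for this example), a single parameterized test can replace dozens of manual click-throughs of such an order form:

```python
import pytest


# Hypothetical validation logic for the order form mentioned above.
def validate_order(quantity: int, country: str, express: bool) -> list:
    errors = []
    if quantity < 1:
        errors.append("quantity must be positive")
    if express and country not in {"DE", "AT", "CH"}:
        errors.append("express shipping not available in this country")
    return errors


# Each tuple is a combination a tester would otherwise verify by hand.
@pytest.mark.parametrize(
    "quantity, country, express, expected_errors",
    [
        (1, "DE", True, []),
        (0, "DE", False, ["quantity must be positive"]),
        (2, "US", True, ["express shipping not available in this country"]),
        (0, "US", True, ["quantity must be positive",
                         "express shipping not available in this country"]),
    ],
)
def test_order_form_combinations(quantity, country, express, expected_errors):
    assert validate_order(quantity, country, express) == expected_errors
```

Adding another combination is one line in the table instead of another round of manual clicking, which is exactly the behaviour change the outcome asks for.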

Now that we've seen the benefits of focusing on outcomes rather than outputs, let's explore how to decide which outcomes to aim for. Why not simply aim for ‘remove all manual test effort'? Let's take a step back and look at the bigger picture.

The bigger picture: Defining the impact you want to make

“We want to be a reliable partner that customers can trust, known for reacting quickly to their needs”

“We want to be a place that offers teams a great development experience and enables them to focus on features without sacrificing quality”

This is the impact you want to make, the broader picture. This is hard, and it's nothing you can delegate, order or introduce by defining a goal. You need to check whether your outcomes actually change behaviour in a way that is beneficial for your goals. Therefore, it is important to measure whether your outcomes align with your assumptions. Does the behaviour of your teams change in the right way? If not, you need to reassess and adjust your approach.

Let's revisit the example one last time:
Maybe your team managed to cut the amount of manual regression testing by introducing tests. At the same time your team's efficiency dropped noticeably. While visiting one of their dailies you hear phrases like “I'm actually finished with the ticket, but I still need to fix the tests” or “I did it quick and dirty for now, otherwise I would have had to rewrite all our tests for a little change”. Not good!
You achieved one of your outcomes - reducing manual regression effort - but at the same time the maintenance overhead grew while efficiency and, most importantly, developer satisfaction degraded.
While you achieved a positive outcome by reducing manual regression effort, the overall impact was negative because your team's behaviour changed in a way that led to working around the tests. This is not the end of the world, but you have to act and work on measuring and improving your team's performance and satisfaction.

🛠️ Putting it in practice

So I wrote an awful lot, but how does all of this translate into the real world?

Let's summarize:

  • You don't sell yourself to your stakeholders as “we're the people that test a lot of stuff”, but rather as “we're reliable and you can trust our product” (…except when you're a company that sells products that test a lot of stuff!). Your actions should therefore count either towards increasing reliability or towards decreasing costs while keeping reliability at the same level.
  • Automated tests are not a goal you should aim for, but rather an output that counts towards building up trust and confidence and reducing repetitive, error-prone manual work. These are outcomes (changes in behaviour with a positive result) of writing tests with a purpose.
  • Code coverage is a valuable KPI that shows how much attention a team pays to testing - but only as long as you don't set a goal to reach a certain threshold. Once you do, your code coverage is tainted and developers may only work towards reaching a certain number, which means producing tests with the sole goal of having written tests.

Outcome oriented testing

So you observe that your developers are not writing tests and rely on manual checks of their work.
This observation should be backed by KPIs and facts: a combination of low code coverage, long release cycles, bug peaks after releases, long and delayed test phases, etc.

Your expectation is that these KPIs improve when developers write automated tests. Notice how ‘number of bugs after a release' is more relevant for your goals than ‘code coverage'.

Instead of telling the developers to write tests, I would rather ask them for the cause of their behaviour. There are many possibilities here, but let's assume the team tells you that they don't have any time for writing tests, as they need to deliver features and have a very high workload.

I would now try to measure (or rather ‘sample', as you shouldn't be too scientific here) where the team actually spends its time. This can happen by attending their dailies and asking questions afterwards (‘You've been working on that ticket for three days now without any updates, what impediments are you encountering?'), just make sure they feel safe and not judged.

Knowing the cause of their behaviour better, you can now make an assumption: maybe you notice that the developers struggle with a particularly complex form (for example an application process) and spend half of their day entering values and checking combinations and error values. You assume that you could free up their time and increase quality by writing automated tests for this form.

With this assumption, you define an outcome: together with the team you set a conservative goal of reducing the time spent testing forms during development by at least 30%.

The next step is to work on outputs that count towards the outcome. Maybe you can provide an additional developer for one sprint who bootstraps the tests and introduces reusable testing patterns (see the sketch below), or you train the developers. The team can use code coverage here to check whether they have blind spots that need fixing.
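A ‘reusable testing pattern' can be as simple as a test-data builder. Here is a hypothetical sketch (it reuses the invented validate_order function from the earlier example): defaults cover the common case, so each new test only states what differs:

```python
# A small test-data builder a "bootstrap" developer could introduce;
# validate_order is the hypothetical function from the earlier sketch.
def make_order(**overrides) -> dict:
    defaults = {"quantity": 1, "country": "DE", "express": False}
    return {**defaults, **overrides}


def test_default_order_is_valid():
    assert validate_order(**make_order()) == []


def test_express_outside_supported_countries_is_rejected():
    errors = validate_order(**make_order(country="US", express=True))
    assert "express shipping not available in this country" in errors
```

With a pattern like this in place, writing a new form test costs the team one line of setup instead of a copy-pasted block, which directly serves the 30% outcome.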

It's important that you verify your assumption: does a developer really save time once their functionality is covered by automated tests? Does a test-driven approach make them more efficient? Do not expect miracle changes here, but focus on realistic assumptions and targets.

Remember: You don't want your developers to write tests just for the sake of testing, but to improve the efficiency of your software development lifecycle and the reliability of your product. These are just examples and your mileage may vary.

🤔 Conclusion: There is no shortcut

While setting a goal for code coverage may seem like it will increase the amount of tests, it may not work towards the reliability of your product and may even have negative effects on your efficiency and motivation.

Code coverage is a valid way to measure your test awareness, but it's nothing you want to improve just for the sake of increasing a number. Rather than revolving around a single metric, I advise you to focus on the bigger picture and on the motivation and behaviour of your teams. There is no shortcut for building up a quality culture in your organisation.

(All images in this article are released under the CC-BY license).
