Writing great tests is a skill. Achieving comprehensive test coverage across a codebase is even harder. In this article, I will introduce a tool to boost your test coverage and help ensure that all your functions are tested thoroughly: Cover-Agent, an open-source tool by CodiumAI that leverages AI to generate tests aimed at increasing test coverage. Previously, I wrote about another tool from CodiumAI, the PR-Agent.
Understanding Test Coverage
Before diving into CodiumAI Cover-Agent, let’s discuss what test coverage entails, particularly code coverage. Code coverage measures how much of the code is executed by tests. Key aspects of code coverage include the following (a short example after the list makes them concrete):
- Function Coverage: The percentage of functions or methods called by the test suite.
- Statement Coverage: The percentage of executable statements run by the test suite.
- Branch Coverage: The percentage of branches (like if-else and switch-case) executed.
- Condition Coverage: The percentage of Boolean expressions evaluated as both true and false.
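To make these terms concrete, here is a small, hypothetical function with a test that exercises all four kinds of coverage (the function and tests are my own illustration, not part of Cover-Agent):

```python
def describe(n):
    # Two conditions combined in one branch.
    if n >= 0 and n % 2 == 0:
        return "non-negative even"
    return "other"

def test_describe():
    # Function coverage: describe is called at least once.
    # Statement coverage: both return statements execute.
    # Branch coverage: the if is both taken and skipped.
    # Condition coverage: each sub-expression (n >= 0, n % 2 == 0)
    # evaluates to both True and False across these calls.
    assert describe(4) == "non-negative even"  # both conditions True
    assert describe(-2) == "other"             # n >= 0 is False
    assert describe(3) == "other"              # n % 2 == 0 is False
```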
Benefits of Test Coverage
- Identifying Gaps in Testing: Helps developers spot untested parts of the code.
- Improving Software Quality: Comprehensive tests catch bugs and issues early.
- Maintaining Code Reliability: Regularly measuring test coverage maintains and enhances code reliability.
However, high test coverage alone does not guarantee high-quality tests. It is possible to have high coverage with tests that do not thoroughly validate the code's correctness. Therefore, while aiming for high test coverage, it’s crucial to ensure that the tests are meaningful and well-designed. Codium Cover-Agent addresses this by not only aiming for coverage but also generating quality test cases that test edge cases, invalid inputs, and errors.
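As a quick, hypothetical illustration of this pitfall (the function and tests below are my own, not generated by Cover-Agent), a test can execute every statement without verifying anything:

```python
def clamp(x, lo, hi):
    return max(lo, min(x, hi))

def test_clamp_weak():
    # Runs the only statement, so coverage reports 100%,
    # but a bug (e.g., swapping lo and hi) would still pass.
    clamp(15, 0, 10)

def test_clamp_meaningful():
    assert clamp(15, 0, 10) == 10  # upper bound enforced
    assert clamp(-5, 0, 10) == 0   # lower bound enforced
    assert clamp(5, 0, 10) == 5    # in-range value unchanged
```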
A Simple Example with CodiumAI Cover-Agent
To illustrate how CodiumAI Cover-Agent works, let’s start with a basic example. We will create a simple `calculator.py` file with functions for addition, subtraction, multiplication, and division.
```python
# calculator.py

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
```
Next, we write a test file `test_calculator.py` and place it in the `tests` folder.
```python
# tests/test_calculator.py
from calculator import add, subtract, multiply, divide

class TestCalculator:
    def test_add(self):
        assert add(2, 3) == 5
```
To see the test coverage, we need to install `pytest-cov`, a pytest extension for coverage reporting.
```bash
pip install pytest-cov
```
Run the coverage analysis with:
```bash
pytest --cov=calculator
```
The output shows:
```
Name            Stmts   Miss  Cover
-----------------------------------
calculator.py      10      5    50%
-----------------------------------
TOTAL              10      5    50%
```
This indicates that 5 of the 10 statements in `calculator.py` are not executed, resulting in 50% code coverage. For simple examples like this, adding more tests manually is straightforward. But let's see how Codium Cover-Agent can enhance this process.
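Before handing things over to the agent, it helps to know exactly which lines are untested. pytest-cov can list them directly with its `term-missing` report:

```bash
pytest --cov=calculator --cov-report=term-missing
```

The added Missing column should point at the bodies of `subtract`, `multiply`, and `divide`, since only `add` is tested so far.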
Setting Up CodiumAI Cover-Agent
To set up Codium Cover-Agent, follow these steps:

1. Install Cover-Agent:

```bash
pip install git+https://github.com/Codium-ai/cover-agent.git
```

2. Ensure `OPENAI_API_KEY` is set in your environment variables, as it is required for the OpenAI API.

3. Create the command to start generating tests:

```bash
cover-agent \
  --source-file-path "calculator.py" \
  --test-file-path "tests/test_calculator.py" \
  --code-coverage-report-path "coverage.xml" \
  --test-command "pytest --cov=. --cov-report=xml --cov-report=term" \
  --test-command-dir "./" \
  --coverage-type "cobertura" \
  --desired-coverage 80 \
  --max-iterations 3 \
  --openai-model "gpt-4o" \
  --additional-instructions "Since I am using a test class, each line of code (including the first line) needs to be prepended with 4 whitespaces. This is extremely important to ensure that every line returned contains that 4 whitespace indent; otherwise, my code will not run."
```
Command Arguments Explained
- `source-file-path`: Path of the file containing the functions for which tests need to be generated.
- `test-file-path`: Path of the file where the agent will write the tests. It’s best to create a skeleton of this file with at least one test and the necessary import statements.
- `code-coverage-report-path`: Path where the code coverage report is saved.
- `test-command`: Command to run the tests (e.g., pytest).
- `test-command-dir`: Directory where the test command should run. Set this to the root or the location of your main file to avoid issues with relative imports.
- `coverage-type`: Format of the coverage report to parse. Cobertura is a good default.
- `desired-coverage`: Coverage goal. Higher is better, though 100% is often impractical.
- `max-iterations`: Number of times the agent should retry generating test code. More iterations may lead to higher OpenAI token usage.
- `additional-instructions`: Prompts to ensure the code is written in a specific way. For example, here we specify that the code should be formatted to work within a test class.
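For context, Cobertura is an XML coverage report format; the `--cov-report=xml` part of the test command above produces a compatible `coverage.xml`. An abridged sketch of its shape (the attribute values here are illustrative, not taken from a real run) looks roughly like this:

```xml
<?xml version="1.0" ?>
<coverage line-rate="0.5" lines-valid="10" lines-covered="5">
  <packages>
    <package name=".">
      <classes>
        <class name="calculator.py" filename="calculator.py" line-rate="0.5">
          <lines>
            <line number="2" hits="1"/> <!-- executed by the tests -->
            <line number="5" hits="0"/> <!-- never executed -->
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>
```

The agent reads the `hits="0"` lines from this report to decide which code still needs tests.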
On running the command, the agent starts writing and iterating on the tests. This is the code it generates:
```python
import pytest
from calculator import add, subtract, multiply, divide

class TestCalculator:
    def test_add(self):
        assert add(2, 3) == 5

    def test_subtract(self):
        """
        Test subtracting two numbers.
        """
        assert subtract(5, 3) == 2
        assert subtract(3, 5) == -2

    def test_multiply(self):
        """
        Test multiplying two numbers.
        """
        assert multiply(2, 3) == 6
        assert multiply(-2, 3) == -6
        assert multiply(2, -3) == -6
        assert multiply(-2, -3) == 6

    def test_divide(self):
        """
        Test dividing two numbers.
        """
        assert divide(6, 3) == 2
        assert divide(-6, 3) == -2
        assert divide(6, -3) == -2
        assert divide(-6, -3) == 2

    def test_divide_by_zero(self):
        """
        Test dividing by zero, should raise ValueError.
        """
        with pytest.raises(ValueError, match="Cannot divide by zero"):
            divide(5, 0)
```
You can see that the agent also wrote tests for edge cases and error handling.
We can now check the coverage again:

```bash
pytest --cov=calculator
```
```
Name            Stmts   Miss  Cover
-----------------------------------
calculator.py      10      0   100%
-----------------------------------
TOTAL              10      0   100%
```
In this simple example, we reached 100% test coverage.
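One caveat: this 100% is statement coverage. If you also want the branch metric discussed earlier (e.g., the `if b == 0` check taken both ways), pytest-cov can measure that too:

```bash
pytest --cov=calculator --cov-branch
```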
Trying Cover-Agent in a Real Codebase
The previous example was simple; even a beginner could cover it completely by hand. Let’s see how the agent works on a more realistic application. I have a codebase with more than 15 files and approximately 1000 lines of code. This is the coverage of one module of the application:
```
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
maintenance\__init__.py                        0      0   100%
maintenance\events\degradation_event.py        6      2    67%
maintenance\events\event.py                   22      8    64%
maintenance\events\reach_machine.py            4      1    75%
maintenance\events\repaired_event.py           4      1    75%
maintenance\future_event_set.py               24     14    42%
maintenance\models\__init__.py                 0      0   100%
maintenance\models\action.py                  12      0   100%
maintenance\models\engineer.py                15      5    67%
maintenance\models\factory.py                 63     36    43%
maintenance\models\machine.py                 79      1    99%
maintenance\models\state.py                   14      0   100%
maintenance\policies\policy.py                12      1    92%
maintenance\policies\reactive_policy.py       27      8    70%
maintenance\simResults.py                     65     65     0%
maintenance\simulation.py                     43     43     0%
maintenance\tests\__init__.py                  0      0   100%
maintenance\tests\conftest.py                 25      0   100%
maintenance\tests\factory_test.py             12      0   100%
maintenance\tests\machine_test.py             63      0   100%
--------------------------------------------------------------
TOTAL                                        490    185    62%
```
The goal is to increase the coverage by writing tests for the `factory.py` file. Let’s do it.

As before, we create a skeleton for the test file. I have written one simple test here:
```python
import pytest
from maintenance.models.factory import Factory
from maintenance.tests.conftest import factory
from maintenance.models.action import Action, ActionType
from maintenance.policies.reactive_policy import ReactivePolicy
from maintenance.events.event import Event, EventType
import numpy as np

class TestFactory:
    def test_execute_action(self, factory: Factory):
        factory.execute_action(factory.engineers[0], 0, factory.policy.determine_action(factory.state)[0])
        assert factory.state.fse_available[0] == True
        assert factory.state.fse_location[0] == 1
```
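Note that the `factory` argument is supplied by a pytest fixture defined in `conftest.py`. The real fixture isn't shown in this article; purely as a hypothetical sketch of its shape (the `Factory` constructor arguments are my guess), it might look like:

```python
# maintenance/tests/conftest.py (hypothetical sketch)
import pytest
from maintenance.models.factory import Factory

@pytest.fixture
def factory():
    # Build a small, deterministic Factory instance for the tests.
    # The real constructor arguments are not shown in the article.
    return Factory()
```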
Then we run the following command:
```bash
cover-agent \
  --source-file-path "maintenance/models/factory.py" \
  --test-file-path "maintenance/tests/factory_test.py" \
  --code-coverage-report-path "coverage.xml" \
  --test-command "pytest --cov=maintenance --cov-report=xml --cov-report=term" \
  --test-command-dir "./" \
  --coverage-type "cobertura" \
  --desired-coverage 80 \
  --max-iterations 3 \
  --openai-model "gpt-4o" \
  --additional-instructions "Since I am using a test class, each line of code (including the first line) needs to be prepended with 4 whitespaces. This is extremely important to ensure that every line returned contains that 4 whitespace indent; otherwise, my code will not run."
```
The agent starts writing the code. This takes about a minute to finish. Once the code is generated, we can check the updated coverage report:
```
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
maintenance\__init__.py                        0      0   100%
maintenance\events\degradation_event.py        6      0   100%
maintenance\events\event.py                   22      3    86%
maintenance\events\reach_machine.py            4      1    75%
maintenance\events\repaired_event.py           4      0   100%
maintenance\future_event_set.py               24     13    46%
maintenance\models\__init__.py                 0      0   100%
maintenance\models\action.py                  12      0   100%
maintenance\models\engineer.py                15      3    80%
maintenance\models\factory.py                 63     10    84%
maintenance\models\machine.py                 79      0   100%
maintenance\models\state.py                   14      0   100%
maintenance\policies\policy.py                12      1    92%
maintenance\policies\reactive_policy.py       27      2    93%
maintenance\simResults.py                     65     65     0%
maintenance\simulation.py                     43     43     0%
maintenance\tests\__init__.py                  0      0   100%
maintenance\tests\conftest.py                 25      0   100%
maintenance\tests\factory_test.py             45      0   100%
maintenance\tests\machine_test.py             63      0   100%
--------------------------------------------------------------
TOTAL                                        523    141    73%
```
We have increased the total coverage for the module from 62% to 73%, and the coverage of `factory.py` from 43% to 84%. This is amazing, given almost no effort from our side. Below is the test code written by the agent. More edge cases could still be tested; we can add them manually or instruct the agent to handle specific ones (see the sketch after the listing).
```python
import pytest
from maintenance.models.factory import Factory
from maintenance.tests.conftest import factory
from maintenance.models.action import Action, ActionType
from maintenance.policies.reactive_policy import ReactivePolicy
from maintenance.events.event import Event, EventType
from maintenance.models.engineer import Engineer
from maintenance.models.machine import Machine
import numpy as np

class TestFactory:
    def test_execute_action(self, factory: Factory):
        factory.execute_action(factory.engineers[0], 0, factory.policy.determine_action(factory.state)[0])
        assert factory.state.fse_available[0] == True
        assert factory.state.fse_location[0] == 1

    def test_machine_from_id_invalid_id(self, factory: Factory):
        """
        Test the machine_from_id method with an invalid machine ID.
        Ensures that the method returns None for an invalid machine ID.
        """
        machine = factory.machine_from_id(-1)  # Invalid machine ID
        assert machine is None

    def test_process_event_degradation_machine_under_repair(self, factory: Factory):
        """
        Test the process_event method for the DEGRADATION event type when the machine is under repair.
        Ensures that the degradation event is ignored.
        """
        machine = factory.machines[0]
        machine.under_repair = True
        event = Event(EventType.DEGRADATION, machine.id, 0, intensity=1)
        factory.process_event(event)
        assert machine.degradation == 0  # Degradation should not change

    def test_process_event_degradation_machine_failed(self, factory: Factory):
        """
        Test the process_event method for the DEGRADATION event type when the machine has failed.
        Ensures that the degradation event is ignored.
        """
        machine = factory.machines[0]
        machine.degradation = machine.failure_theshold
        event = Event(EventType.DEGRADATION, machine.id, 0, intensity=1)
        factory.process_event(event)
        assert machine.degradation == machine.failure_theshold  # Degradation should not change

    def test_process_event_degradation_threshold_reached(self, factory: Factory):
        """
        Test the process_event method for the DEGRADATION event type when the degradation reaches the failure threshold.
        Ensures that the machine handles failure correctly.
        """
        machine = factory.machines[0]
        event = Event(EventType.DEGRADATION, machine.id, 0, intensity=machine.failure_theshold)
        factory.process_event(event)
        assert machine.degradation == machine.failure_theshold
        assert machine.has_failed()

    def test_process_event_repaired_machine_not_under_repair(self, factory: Factory):
        """
        Test the process_event method for the REPAIRED event type when the machine is not under repair.
        Ensures that the method handles this scenario gracefully.
        """
        machine = factory.machines[0]
        machine.under_repair = False
        event = Event(EventType.REPAIRED, machine.id, 0)
        factory.process_event(event)
        assert not machine.under_repair  # Machine should still not be under repair

    def test_get_results_zero_sim_time(self, factory: Factory):
        """
        Test the get_results method with zero simulation time.
        Ensures that the method handles division by zero gracefully.
        """
        sim_time = 0
        with pytest.raises(ZeroDivisionError):
            factory.get_results(sim_time)
```
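For example, a sketch of re-running the agent with extra edge-case guidance might look like this (the exact instruction wording and the suggested cases below are mine; the flags are the same ones used earlier):

```bash
cover-agent \
  --source-file-path "maintenance/models/factory.py" \
  --test-file-path "maintenance/tests/factory_test.py" \
  --code-coverage-report-path "coverage.xml" \
  --test-command "pytest --cov=maintenance --cov-report=xml --cov-report=term" \
  --test-command-dir "./" \
  --coverage-type "cobertura" \
  --desired-coverage 90 \
  --max-iterations 3 \
  --openai-model "gpt-4o" \
  --additional-instructions "Keep the 4-space indent for the test class. Also add tests for negative degradation intensities and for events that reference unknown machine IDs."
```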
Conclusion
CodiumAI Cover-Agent simplifies achieving high test coverage while generating meaningful, high-quality tests. By using AI to identify and address gaps in testing, it supports robust software development practices. Try CodiumAI Cover-Agent on your projects to see the difference it can make in your testing workflow.
You can check out the open-source repository for the cover agent at https://github.com/Codium-ai/cover-agent.
Top comments (2)
This is a fantastic tool for enhancing test coverage! Does CodiumAI Cover-Agent require any specific configuration for different programming languages, or is it primarily designed for Python?
No, there is no special configuration required for different languages. On their repo you can see examples of Go as well. You basically just point it to a source file (for which the tests have to be generated) and the output file (where the tests have to be written). It recognizes the programming language from the source file.
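For illustration, a hypothetical invocation for a Go project might look like the sketch below (file names are made up, and it assumes your test command emits a Cobertura-format report; for Go, a converter such as gocover-cobertura can produce one from the cover profile):

```bash
cover-agent \
  --source-file-path "calculator.go" \
  --test-file-path "calculator_test.go" \
  --code-coverage-report-path "coverage.xml" \
  --test-command "go test -coverprofile=coverage.out ./... && gocover-cobertura < coverage.out > coverage.xml" \
  --test-command-dir "./" \
  --coverage-type "cobertura" \
  --desired-coverage 80 \
  --max-iterations 3
```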