Vincent Davis Leonard
TL;DR
I applied three test design optimization techniques (Input Space Partitioning or ISP, Control Flow Graph or CFG Analysis, and Mutation Testing) to the four main features I manage in the GBIM project. These features include account registration, account activation via token, account verification by admin, and the approval or rejection of submissions. The results include 33 new tests in the backend, 3 commits in the frontend, an expanded mutation scope, and the addition of Stryker threshold enforcement to the CI pipeline.
Tools and Methods Used
1. Input Space Partitioning (ISP)
ISP (Ammann & Offutt, Introduction to Software Testing, 2016) is a technique that divides the input domain of a function into equivalence classes. These classes are groups of values expected to be treated similarly by the system. Instead of trying all combinations, we select one representative value per partition.
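I used the base-choice coverage strategy. One base partition (normally a valid value) is chosen for every characteristic to form the base test, and each further test changes exactly one characteristic to one of its other partitions while the rest stay at their base values. The number of tests therefore grows with the sum of the partition counts rather than their product, which avoids combinatorial explosion.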
Tools used
- Manual analysis of the source code (authentication/serializers.py, RegisterForm.tsx)
- Annotation # ISP <characteristic>.<partition> in each test for traceability
Examples of partitioned characteristics for the Register feature
| Characteristic | Partitions |
|---|---|
| Email | valid, no @, active duplicate, inactive duplicate, >254 char, whitespace |
| Password | <8 char, exactly 8 without digits, no uppercase, valid strong |
| Role | KAPRODI, GURU_BESAR, ADMIN (blocked), invalid enum |
| Activation Token | valid fresh, expired, already used, malformed, missing |
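To make the base-choice idea concrete, here is a minimal sketch that generates test frames from the email, password, and role partitions listed above. The choice of base value for each characteristic (the first entry in each list) and the function name are my own assumptions for illustration, not code from the project.

```python
import math
from typing import Dict, List

# Partitions for three Register characteristics. The first entry of each
# list is the assumed base choice (a valid value).
PARTITIONS: Dict[str, List[str]] = {
    "email": ["valid", "no @", "active duplicate", "inactive duplicate",
              ">254 char", "whitespace"],
    "password": ["valid strong", "<8 char", "exactly 8 without digits",
                 "no uppercase"],
    "role": ["KAPRODI", "GURU_BESAR", "ADMIN (blocked)", "invalid enum"],
}


def base_choice_frames(partitions: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return the base frame plus one frame per non-base partition."""
    base = {name: blocks[0] for name, blocks in partitions.items()}
    frames = [dict(base)]
    for name, blocks in partitions.items():
        for block in blocks[1:]:
            frame = dict(base)
            frame[name] = block  # vary exactly one characteristic
            frames.append(frame)
    return frames


frames = base_choice_frames(PARTITIONS)
all_combinations = math.prod(len(blocks) for blocks in PARTITIONS.values())
print(f"base-choice frames: {len(frames)}, all combinations: {all_combinations}")
```

With these three characteristics, base-choice needs 12 frames where the full cross product would need 96.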
2. Control Flow Graph (CFG) Analysis
CFG (Ammann & Offutt, ch.7) represents the program execution flow as a graph where each node is a basic block and each edge is a possible transfer of control between blocks. From the CFG we identify prime paths, which are simple paths (no node repeated, except that the first and last may coincide) that are not a proper subpath of any other simple path.
Target method: _validate_transition (pengajuan/services.py, lines 66-75)
The actual code
66: def _validate_transition(self, previous_status: str, new_status: str) -> None:
67:     if (previous_status, new_status) not in ALLOWED_TRANSITIONS:
68:         raise ValidationError(
69:             {
70:                 "status": (
71:                     f"Transisi status dari '{previous_status}' ke '{new_status}' "
72:                     "tidak diperbolehkan."
73:                 )
74:             }
75:         )
The CFG of this code (node = line of code, edge = execution flow)

Prime paths
| Path | Condition | Test |
|---|---|---|
| 1→2→3→5 | transition is not in ALLOWED_TRANSITIONS | test_disetujui_to_menunggu_raises, test_menunggu_to_menunggu_raises, etc. |
| 1→2→4→5 | transition is in ALLOWED_TRANSITIONS | test_menunggu_to_disetujui, test_menunggu_to_ditolak, etc. |
The CFG itself has only these two paths, but the state machine behind ALLOWED_TRANSITIONS has 4 legal transitions and at least 5 illegal ones, and every one of them must have a test.
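As a sanity check on the two prime paths, the sketch below enumerates prime paths for a small graph. The edge list is inferred from the two paths in the table, and I am assuming that nodes 1 to 5 stand for entry, the membership check, the raise, the implicit return, and exit. The helper names are hypothetical, and the brute-force enumeration is only meant for graphs this small.

```python
from typing import Dict, List, Tuple

Graph = Dict[int, List[int]]
Path = Tuple[int, ...]

# Assumed CFG of _validate_transition, inferred from the two table rows.
CFG: Graph = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}


def simple_paths(graph: Graph) -> List[Path]:
    """Enumerate every simple path (no repeated node, except first == last)."""
    found: List[Path] = []

    def extend(path: Path) -> None:
        found.append(path)
        if len(path) > 1 and path[0] == path[-1]:
            return  # a closed cycle cannot be extended further
        for nxt in graph.get(path[-1], []):
            if nxt not in path or nxt == path[0]:
                extend(path + (nxt,))

    for node in graph:
        extend((node,))
    return found


def is_proper_subpath(small: Path, big: Path) -> bool:
    """True when small occurs contiguously inside the longer path big."""
    if len(small) >= len(big):
        return False
    width = len(small)
    return any(big[i:i + width] == small for i in range(len(big) - width + 1))


def prime_paths(graph: Graph) -> List[Path]:
    """Simple paths that are not a proper subpath of any other simple path."""
    paths = simple_paths(graph)
    return sorted(
        p for p in paths if not any(is_proper_subpath(p, q) for q in paths)
    )


print(prime_paths(CFG))
```

For this graph the result is the same pair of paths as in the table, which is expected because the function contains a single branch and no loop.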
Tools
- Manual source code analysis + Mermaid diagram (planned in the cfg/ folder)
- Annotation # CFG: from_state→to_state in the tests
3. Mutation Testing
Mutation testing (Jia & Harman, IEEE TSE 2011) measures the quality of a test suite by injecting small defects or mutants into the source code. Examples include changing > to >=, or removing a condition. The process then checks if the test suite detects it (meaning the mutant is "killed"). The mutation score is calculated by dividing the killed mutants by the total mutants.
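The following toy example shows how killed mutants and the mutation score relate. Everything in it is hypothetical: the function can_submit, the hand-written mutants, and the two test suites are illustrations only, and real tools such as mutmut or Stryker generate and run mutants automatically.

```python
from typing import Callable, List, Tuple

Case = Tuple[Tuple[bool, bool], bool]  # ((is_active, is_verified), expected)
Impl = Callable[[bool, bool], bool]


def can_submit(is_active: bool, is_verified: bool) -> bool:
    """Original (hypothetical) code under test."""
    return is_active and is_verified


# Hand-written mutants that mimic common operators.
MUTANTS: List[Impl] = [
    lambda a, b: a or b,         # logical operator replaced (and -> or)
    lambda a, b: a,              # right-hand condition removed
    lambda a, b: b,              # left-hand condition removed
    lambda a, b: not (a and b),  # result negated
]

WEAK_SUITE: List[Case] = [((True, True), True), ((False, False), False)]
STRONG_SUITE: List[Case] = WEAK_SUITE + [
    ((True, False), False),
    ((False, True), False),
]


def is_killed(mutant: Impl, suite: List[Case]) -> bool:
    """A mutant is killed when at least one test case observes a wrong result."""
    return any(mutant(*args) != expected for args, expected in suite)


def mutation_score(mutants: List[Impl], suite: List[Case]) -> float:
    """Killed mutants divided by total mutants."""
    return sum(is_killed(m, suite) for m in mutants) / len(mutants)


weak_score = mutation_score(MUTANTS, WEAK_SUITE)
strong_score = mutation_score(MUTANTS, STRONG_SUITE)
print(f"weak suite: {weak_score:.2f}, strong suite: {strong_score:.2f}")
```

The weak suite executes every line of can_submit yet kills only one of the four mutants, while the two extra cases in the strong suite kill all of them.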
Tools
- Backend: mutmut (Python) with operators including AOR, LCR, ROR, and statement deletion
- Frontend: Stryker Mutator (JS/TS) with operators including arithmetic, logical, equality, string, and array
Both tools are complementary. The mutmut tool is more aggressive in statement deletion, while Stryker is richer in JS/TS level operators.
Application to the Project and Evidence of Improvement
ISP Application
Before this sprint, the test_register_serializers.py test only covered the happy path and one or two errors. After the ISP audit, the following changes were made.
New backend tests added
| File | New Partitions | Count |
|---|---|---|
| test_register_serializers.py | email whitespace, email >254 char, inactive duplicate email, password exactly 8, password no digit, password no uppercase, password whitespace, role ADMIN blocked, role null, telephone invalid format | 13 |
| test_activation_views.py | token malformed, account already active | 2 |
| test_views_admin_account_verification.py | filter role invalid enum, filter status invalid enum, search no match, pagination beyond max | 4 |
| test_views_admin_account_verification_detail.py | approve AKTIF (idempotent), approve DITOLAK (reactivation), reject DITOLAK, reject non-existent, unauthorized non-admin | 9 |
New frontend tests added
| File | New Partitions |
|---|---|
| RegisterForm.test.tsx | API 429 rate limit response (ISP ApiError.status.429) |
| useUpdateStatusPengajuan.test.ts | MENUNGGU to DITOLAK transition (CFG happy-path DITOLAK) |
CFG Application
Previously, StatusChangeService._validate_transition was only tested for 2 to 3 legal transitions. After the CFG analysis, I added tests for all 5 illegal transitions.
# CFG MENUNGGU to MENUNGGU (illegal self-loop)
def test_validate_transition_menunggu_to_menunggu_raises(self):
    ...

# CFG DISETUJUI to MENUNGGU (illegal backward)
def test_validate_transition_disetujui_to_menunggu_raises(self):
    ...
A total of 5 new CFG tests for illegal transitions were added along with annotations on the existing tests.
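A self-contained sketch of what such tests can look like is shown below. It is not the project's test file. The ValidationError class is a stand-in for the framework exception, ALLOWED_TRANSITIONS contains only the two legal transitions named in this article (the real table has four), and the service class is reduced to the one method under test.

```python
import unittest


class ValidationError(Exception):
    """Stand-in for the framework's ValidationError (assumption)."""


# Assumption: only the two legal transitions mentioned in this article.
ALLOWED_TRANSITIONS = {
    ("MENUNGGU", "DISETUJUI"),
    ("MENUNGGU", "DITOLAK"),
}


class StatusChangeService:
    """Reduced version of the service, keeping only the method under test."""

    def _validate_transition(self, previous_status: str, new_status: str) -> None:
        if (previous_status, new_status) not in ALLOWED_TRANSITIONS:
            raise ValidationError(
                {
                    "status": (
                        f"Transisi status dari '{previous_status}' ke "
                        f"'{new_status}' tidak diperbolehkan."
                    )
                }
            )


class ValidateTransitionTests(unittest.TestCase):
    def setUp(self) -> None:
        self.service = StatusChangeService()

    def test_illegal_transitions_raise(self) -> None:
        illegal = [
            ("MENUNGGU", "MENUNGGU"),   # CFG: MENUNGGU→MENUNGGU (self-loop)
            ("DISETUJUI", "MENUNGGU"),  # CFG: DISETUJUI→MENUNGGU (backward)
        ]
        for previous, new in illegal:
            with self.subTest(previous=previous, new=new):
                with self.assertRaises(ValidationError):
                    self.service._validate_transition(previous, new)

    def test_legal_transitions_pass(self) -> None:
        for previous, new in sorted(ALLOWED_TRANSITIONS):
            with self.subTest(previous=previous, new=new):
                self.assertIsNone(
                    self.service._validate_transition(previous, new)
                )


def run_suite() -> unittest.TestResult:
    """Run the tests programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateTransitionTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Using subTest keeps one test method per CFG path while still reporting each transition separately when it fails.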
mutmut Scope Expansion
Previously, the pyproject.toml file only mutated four view modules under pengajuan/views/, kegiatan/views/, and statistik_prodi/. Now it includes the following paths.
paths_to_mutate = [
    "pengajuan/views/views_admin.py",
    "pengajuan/views/views_kaprodi.py",
    "kegiatan/views/views_kegiatan.py",
    "statistik_prodi/views.py",
    "authentication/serializers.py",  # new addition
    "authentication/services.py",     # new addition
    "authentication/views.py",        # new addition
    "pengajuan/services.py",          # new addition
    "pengajuan/serializers.py",       # new addition
]
Stryker Threshold Enforcement
The following configuration was added to stryker.config.mjs.
thresholds: { high: 80, low: 70, break: 70 }
This means the frontend CI pipeline will automatically fail if the mutation score drops below 70 percent. This ensures the test quality does not regress.
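The sketch below models how I understand the three threshold values to be interpreted. A score at or above high is reported as good, a score between low and high as a warning, a score below low as danger, and a score below break makes the run exit with a non-zero code. The Python names are hypothetical and this is not Stryker's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Mirror of the thresholds used in this article ('break' is a Python keyword)."""
    high: float = 80
    low: float = 70
    break_: float = 70


def classify(score: float, t: Thresholds = Thresholds()) -> str:
    """Map a mutation score (0-100) to a reporting band."""
    if score >= t.high:
        return "ok"
    if score >= t.low:
        return "warning"
    return "danger"


def exit_code(score: float, t: Thresholds = Thresholds()) -> int:
    """Return 1 (fail the CI job) when the score is below the break value."""
    return 1 if score < t.break_ else 0


for score in (85.0, 75.0, 69.9):
    print(score, classify(score), exit_code(score))
```

Because low and break are both 70 here, every score that lands in the danger band also fails the pipeline.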
Benefits, Concrete Data, and Best Practices
Quantitative Data
| Metric | Before | After |
|---|---|---|
| Number of BE tests (auth and pengajuan scope) | ~26 tests in target files | +33 = ~59 tests |
| Number of FE tests (3 target files) | 97 tests | +2 = 99 tests |
| mutmut scope (paths_to_mutate) | 4 paths (views only) | 9 paths (views, services, serializers) |
| Stryker threshold FE | none | break 70, high 80 |
| Documented ISP partitions (BE) | ~5 (implicit) | 28 explicit and annotated |
| CFG paths covered (StatusChangeService) | 2 to 3 legal | 4 legal and 5 illegal |
Connection to Literature
ISP by Ammann & Offutt (2016)
Ammann and Offutt define Input Space Partitioning as the division of an input domain into partitions where each must be represented by at least one test. The base-choice coverage strategy I used is their recommendation for balancing coverage with efficiency.
Mutation Testing by Jia & Harman (2011)
The survey by Jia and Harman presents the mutation score as a measure of test suite adequacy that is stricter than statement coverage, since a statement can be executed without its faults being detected. They also document that equivalent mutants (mutants that are semantically identical to the original code) are a major challenge. I document these cases when found.
Petrović et al. (ICSE 2021) on Mutation Testing
This paper reports the results of deploying mutation testing at scale at Google. It found that developers who are shown mutants tend to write more tests, and more effective ones, over time. The Stryker threshold I implemented follows this principle by maintaining the mutation score as an automatic quality gate.
Meszaros, xUnit Test Patterns (2007)
Test smells such as Assertion Roulette (multiple assertions without messages) and Obscure Test (tests that are difficult to understand) are anti-patterns I avoid. Every new test has one clear assertion and an ISP or CFG comment.
Petrović & Ivanković on the State of Mutation Testing at Google (2018)
The approach described at Google focuses on surfacing actionable mutants in changed code during code review rather than on the raw mutation score. This means prioritizing mutants in frequently changing code, which aligns with the auth and pengajuan features in this sprint.
Best Practices Followed
- Each-choice minimum with base-choice for crucial parameters (Ammann & Offutt recommendation)
- Mutation score as a quality gate rather than an optional metric (Google engineering practice)
- Test isolation via override_settings and locmem cache for rate limiter tests to avoid flakiness
- Annotation-based traceability (# ISP, # CFG) to allow coverage auditing without reading all the code
Critique of Previous Testing and Measurable Improvements
Anti-Patterns Found in the Old Test Suite
Anti-Pattern 1: Happy-Path-Only Register Serializer Test
Location: authentication/tests/test_register_serializers.py (before this sprint)
Problem: The registration test only verified that valid data passed the serializer. There were no tests for the following scenarios.
- Passwords with a length of exactly 7 (boundary condition that should fail) versus exactly 8 (should pass)
- Emails that are already registered but not yet activated (which behave differently from active ones)
- The ADMIN role, which should not be able to self-register
Why this is weak: A mutant changing len(password) < 8 to len(password) <= 8 or len(password) < 7 would survive because no test supplied a password of exactly 7 or exactly 8 characters, the only inputs on which those mutants behave differently from the original. This is a classic Boundary Value Analysis gap based on Myers' The Art of Software Testing (1979).
Fix: Added test_password_exactly_seven_chars_invalid, test_email_already_registered_inactive, and test_role_admin_cannot_register, each annotated with its ISP partition (for example # ISP password.length_boundary).
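The boundary pair can be sketched as follows. The validate_password_length function is a hypothetical stand-in for the serializer's length rule, and the second test name is my own; only test_password_exactly_seven_chars_invalid comes from the project.

```python
import unittest

MIN_PASSWORD_LENGTH = 8


def validate_password_length(password: str) -> None:
    """Hypothetical stand-in for the serializer's length check."""
    if len(password) < MIN_PASSWORD_LENGTH:
        raise ValueError("password too short")


class PasswordBoundaryTests(unittest.TestCase):
    # ISP password.length_boundary
    def test_password_exactly_seven_chars_invalid(self) -> None:
        # Kills the "< 7" mutant, which would accept a 7-character password.
        with self.assertRaises(ValueError):
            validate_password_length("Abcdef1")

    # ISP password.length_boundary
    def test_password_exactly_eight_chars_valid(self) -> None:
        # Kills the "<= 8" mutant, which would reject an 8-character password.
        self.assertIsNone(validate_password_length("Abcdefg1"))


def run_boundary_suite() -> unittest.TestResult:
    """Run the two boundary tests and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PasswordBoundaryTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_boundary_suite()
```

Each test sits on one side of the boundary, so together they pin the comparison operator and the constant.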
Anti-Pattern 2: StatusChangeService Did Not Test Illegal Transitions
Location: pengajuan/tests/test_services.py (before this sprint)
Problem: There were only tests for legal transitions (MENUNGGU to DISETUJUI, and MENUNGGU to DITOLAK). There were no tests verifying that backward transitions (DISETUJUI to MENUNGGU) or self-loops (MENUNGGU to MENUNGGU) raise an exception.
Why this is weak: Legal-transition tests never reach the error branch, so a mutant that deletes the raise statement or otherwise disables the membership check against ALLOWED_TRANSITIONS would survive. A state machine that is not tested exhaustively could allow status transitions that corrupt data integrity.
Following the principles from Meszaros (xUnit Test Patterns), tests must verify error behavior just as rigorously as happy-path behavior.
Fix: Added 5 CFG tests for illegal transitions with assertions that ensure exceptions are raised.
Anti-Pattern 3: Frontend Tests Did Not Cover API Error Variants
Location: tests/features/authentication/components/RegisterForm.test.tsx
Problem: The registration form tests only mocked the success scenario (201) and generic errors. There were no tests for the following situations.
- HTTP 429 rate limit response, which should display a throttling message
- HTTP 400 with field-specific errors, which should map to the correct fields
Why this is weak: A Stryker mutant changing the HTTP status check condition would survive. It is also a case of over-mocking, since overly generic mocks do not exercise the real branch logic.
Fix: Added test_register_form_shows_rate_limit_error with the annotation // ISP ApiError.status.429.
Measurable Improvements (Before and After)
| Dimension | Before | After | Delta |
|---|---|---|---|
| Annotated ISP partitions | 0 (implicit) | 28 (explicit) | +28 |
| Covered CFG illegal transitions | 0 | 5 | +5 |
| Mutmut scope (auth and pengajuan) | 0 paths | 5 new paths | baseline capture enabled |
| Stryker threshold FE | none | break 70, high 80 | active CI guard |
| Test methods for auth and pengajuan BE | ~26 | ~59 | +33 |
| Test methods FE (3 target files) | 97 | 99 | +2 |
Connection to Industry Standards
Google researchers (Petrović et al., ICSE 2021) found that mutation testing is most effective when integrated into the developer workflow as automated feedback rather than just a final report. The Stryker threshold I set implements this pattern by automatically failing the pipeline of any merge request to staging whose mutation score falls below the break threshold.
The Stryker Mutator whitepaper (2023) recommends setting coverageAnalysis to "perTest" (which is already active in the config) so that each mutant is run only against the tests that cover it. This mainly reduces execution time.
Commit Links
be-gbm MR !158
| Commit | Message |
|---|---|
| b7564fc5 | chore(testing) expand mutmut scope to authentication and pengajuan services |
| 0c774595 | [GREEN] test(auth) add ISP partitions for register serializer, activation, and admin verification |
| a2def037 | [GREEN] test(pengajuan) add CFG prime path coverage for StatusChangeService state machine |
MR Link: [https://gitlab.cs.ui.ac.id/ppl-fasilkom-ui/2026/kelas-d/group1-gb/be-gbm/-/merge_requests/158]
fe-gbm MR !142
| Commit | Message |
|---|---|
| f18590fe | chore(testing) enforce Stryker mutation threshold at 80% high and 70% break |
| 42d9b028 | [GREEN] test(auth) add ISP partitions for RegisterForm and useActivation hook |
| 19328728 | [GREEN] test(pengajuan) add CFG branch coverage for useUpdateStatusPengajuan |
MR Link: [https://gitlab.cs.ui.ac.id/ppl-fasilkom-ui/2026/kelas-d/group1-gb/fe-gbm/-/merge_requests/142]
References
- Ammann, P. & Offutt, J. (2016). Introduction to Software Testing (2nd ed.). Cambridge University Press. (ISP ch.6, Graph Coverage ch.7).
- Jia, Y. & Harman, M. (2011). "An Analysis and Survey of the Development of Mutation Testing." IEEE Transactions on Software Engineering, 37(5), 649-678.
- Petrović, G., Ivanković, M., Fraser, G., & Just, R. (2021). "Does Mutation Testing Improve Testing Practices?" ICSE 2021.
- Petrović, G. & Ivanković, M. (2018). "State of Mutation Testing at Google." ICSE-SEIP 2018.
- Meszaros, G. (2007). xUnit Test Patterns: Refactoring Test Code. Addison-Wesley.
- Stryker Mutator. (2023). "Mutation Testing in Practice." stryker-mutator.io.
- Myers, G.J. (1979). The Art of Software Testing. Wiley. (Boundary Value Analysis).
