Kiran Kotari

How I Reduced My Test Suite by 72% Using TestIQ

The Problem: AI Generates Tests Fast, But Redundantly

AI coding assistants (GitHub Copilot, Cursor, ChatGPT) are game-changers for test generation. But there's a critical issue: they don't check if tests duplicate existing coverage.

Here's what happened when I let AI generate tests for a simple calculator:

# AI cheerfully created all 5 of these:
def test_addition(): 
    assert Calculator().add(2, 3) == 5

def test_add_numbers():
    assert Calculator().add(2, 3) == 5

def test_sum_calculation():
    assert Calculator().add(2, 3) == 5

def test_calculate_sum():
    assert Calculator().add(2, 3) == 5

def test_basic_addition():
    assert Calculator().add(2, 3) == 5

Different function names, slightly different formatting, but identical coverage. Each test executes the exact same code paths.

The Scale of the Problem

I analyzed the complete calculator test suite:

  • 54 tests generated by AI
  • 39 exact-duplicate pairs (identical coverage)
  • 168 subset relationships (a test's coverage fully contained in another's)
  • 45 similar test pairs (significant overlap)

Only 15 unique tests were actually needed.

That's a 72% redundancy rate! 🤯

Why Text Similarity Doesn't Work

My first instinct: "Just compare the test code!"

But these tests look different yet do the same thing:

# Test 1: Simple assertion
def test_addition():
    result = Calculator().add(2, 3)
    assert result == 5

# Test 2: unittest style, same coverage
class TestCalc(unittest.TestCase):
    def test_sum(self):
        calc = Calculator()
        total = calc.add(2, 3)
        self.assertEqual(total, 5)

# Test 3: One-liner, same coverage
def test_calc_add():
    assert Calculator().add(2, 3) == 5

Text similarity would rate these as "different" - but they execute identical code paths.
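You can see this concretely with the stdlib's difflib. The two test bodies from Test 1 and Test 3 above score as noticeably dissimilar text, even though their coverage is identical (a quick illustration of the problem, not how TestIQ works internally):

```python
import difflib

# Bodies of Test 1 and Test 3 above: same coverage, different text.
test_1 = "result = Calculator().add(2, 3)\nassert result == 5"
test_3 = "assert Calculator().add(2, 3) == 5"

ratio = difflib.SequenceMatcher(None, test_1, test_3).ratio()
print(f"text similarity: {ratio:.2f}")  # noticeably below 1.0 despite identical coverage
```

Any threshold strict enough to catch these as duplicates would also flag genuinely different tests that merely share setup code.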

The Solution: Coverage Analysis

I built TestIQ to analyze execution patterns instead of text:

  1. Collect coverage - Track which lines each test executes
  2. Compare patterns - Find tests with identical/overlapping coverage
  3. Visualize results - Interactive HTML reports with side-by-side comparison
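A minimal version of step 1 can be sketched with the stdlib: sys.settrace records which lines run while a test executes. This is a toy stand-in for TestIQ's pytest plugin, which handles whole files and full suites; the function names here are hypothetical:

```python
import sys

def lines_executed(test_fn, target_fn):
    """Run test_fn and record which lines execute inside target_fn.
    A toy stand-in for per-test coverage collection."""
    target_code = target_fn.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only record line events for frames running the target's code.
        if event == "line" and frame.f_code is target_code:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        test_fn()
    finally:
        sys.settrace(None)
    return executed
```

Two tests with identical behavior produce identical line sets, no matter how differently they're written - that set is the fingerprint the duplicate analysis compares.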

Types of Duplicates Detected

Exact Duplicates - Identical coverage

test_addition → lines [9, 11, 12]
test_add_numbers → lines [9, 11, 12]  ❌ DUPLICATE

Subset Duplicates - One test covered by another

test_basic_add → lines [9, 11]
test_comprehensive_add → lines [9, 11, 15, 20, 25]  ⚠️ SUBSET

Similar Tests - Significant overlap (configurable)

test_addition → lines [9, 11, 15]
test_subtraction → lines [9, 11, 20]  ⚠️ 66% similar
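All three relationships reduce to simple set operations on per-test line sets. Here's a hedged sketch (the overlap metric and function name are my assumptions; TestIQ's actual scoring may differ):

```python
def classify(cov_a, cov_b, similarity_threshold=0.3):
    """Classify the relationship between two tests' covered-line sets."""
    a, b = frozenset(cov_a), frozenset(cov_b)
    if a == b:
        return "exact duplicate"
    if a < b or b < a:          # proper subset in either direction
        return "subset duplicate"
    # Overlap relative to the smaller test (an assumed metric).
    overlap = len(a & b) / min(len(a), len(b))
    if overlap >= similarity_threshold:
        return f"similar ({int(overlap * 100)}%)"
    return "distinct"

print(classify({9, 11, 12}, {9, 11, 12}))      # exact duplicate
print(classify({9, 11}, {9, 11, 15, 20, 25}))  # subset duplicate
print(classify({9, 11, 15}, {9, 11, 20}))      # similar (66%)
```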

How It Works

1. Install TestIQ

pip install testiq

2. Collect Coverage Data

pytest --testiq-output=coverage.json

TestIQ includes a built-in pytest plugin that tracks per-test coverage. No configuration needed!
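Conceptually, the output maps each test to the lines it executed per file - something like this (an illustrative shape, not TestIQ's actual schema):

```python
# Hypothetical shape of coverage.json (the real schema may differ):
coverage = {
    "test_addition": {"calculator.py": [9, 11, 12]},
    "test_add_numbers": {"calculator.py": [9, 11, 12]},  # identical -> duplicate
}
```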

3. Analyze for Duplicates

testiq analyze coverage.json --format html --output report.html

Get an interactive HTML report showing:

  • All duplicate test relationships
  • Side-by-side coverage comparison
  • Recommended actions (keep/remove)

(Screenshot: TestIQ HTML report)

4. Compare Tests Visually

Click any test pair to see a split-screen comparison:

(Screenshot: side-by-side coverage comparison)

Real-World Results

Calculator Example (AI-Generated)

  • Before: 54 tests, ~10 minute CI run
  • After: 15 tests, ~3 minute CI run
  • Savings: 70% faster CI, 39 fewer tests to maintain
  • Coverage: 100% maintained ✅

Production Django Application

  • Before: 847 tests
  • After: 612 tests (removed 235 duplicates)
  • Savings: 28% faster CI runs
  • Coverage: No reduction ✅

CI/CD Integration

Prevent duplicate tests from merging:

# .github/workflows/test-quality.yml
name: Test Quality Check

on: [pull_request]

jobs:
  check-duplicates:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install testiq pytest pytest-cov
          pip install -r requirements.txt

      - name: Collect coverage
        run: pytest --testiq-output=coverage.json

      - name: Check for duplicates
        run: |
          testiq analyze coverage.json \
            --quality-gate \
            --max-duplicates 0 \
            --format html \
            --output testiq-report.html

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: testiq-report
          path: testiq-report.html

Builds fail if duplicates are detected. No manual review needed!

Configuration

Fine-tune duplicate detection:

# .testiq.toml
[analysis]
similarity_threshold = 0.3    # 30% overlap = similar
min_coverage_lines = 5        # Ignore trivial tests
max_results = 100             # Limit report size

[reporting]
format = "html"
split_similar_threshold = 0.5  # Split similar tests

[performance]
enable_parallel = true
max_workers = 8

Or use YAML:

# .testiq.yaml
analysis:
  similarity_threshold: 0.3
  min_coverage_lines: 5

reporting:
  format: html

performance:
  enable_parallel: true

See full configuration docs.

Try It Yourself

TestIQ includes a demo with the actual AI-generated calculator example:

pip install testiq
testiq demo

This generates and opens an HTML report showing all 207 redundancies in 54 tests.

Key Takeaways

  1. AI tools generate tests fast - but don't check for coverage duplication
  2. Coverage analysis beats text similarity - catches duplicates that look different
  3. Significant reduction possible - 72% in AI-generated suites, 28% in production
  4. No coverage loss - Remove duplicates while maintaining 100% coverage
  5. CI/CD ready - Block PRs with duplicate tests automatically

What's Next?

TestIQ is open source (MIT) and actively developed.

Contributions welcome! Issues, PRs, and feedback appreciated.


Have you dealt with test suite bloat from AI-generated code? Share your experiences in the comments! 👇

Want to try it? pip install testiq && testiq demo
