DEV Community

Cover image for How I Used Set Theory to Catch Bugs That Unit Tests Miss
Yaniv
Yaniv

Posted on

How I Used Set Theory to Catch Bugs That Unit Tests Miss

Most test automation tutorials teach you to test layers in isolation: UI tests check buttons, API tests check status codes, DB tests check records. But the bugs that actually cost money in production? They live between the layers.

I learned this the hard way while building a test automation framework for a financial expense tracker. This post is about one specific technique — Set Theory validation — that catches data integrity bugs that no single-layer test will ever find.

The Problem: Everything Passes, But Data Is Wrong

Imagine this scenario:

  1. A user creates an expense for $100 through the Web UI
  2. The UI shows a success message ✅
  3. The API returns 201 Created
  4. The database has... $0. Or two records. Or nothing.

Every individual layer test passes. The UI test confirms the success message appeared. The API test confirms the status code. But nobody verified that the actual data made it through the entire pipeline correctly.

In financial applications, this is not a cosmetic bug — it's a silent data inconsistency that can go unnoticed until an audit.

The Approach: Database State as a Mathematical Set

Instead of checking "does a record exist?", I treat the entire database table as a mathematical set, and use set difference to prove exactly what changed.

Here's the concept:

# Step 1: Capture DB state BEFORE the action
old_set = {(id, name, amount) for each row in expenses}
old_sum = SUM(amount) from expenses

# Step 2: Perform the action (create expense via UI or API)

# Step 3: Capture DB state AFTER the action
new_set = {(id, name, amount) for each row in expenses}
new_sum = SUM(amount) from expenses

# Step 4: Validate using set difference
isolated_record = new_set - old_set

assert len(isolated_record) == 1          # Exactly ONE new record
assert new_sum - old_sum == expected_amount  # Amount is correct
Enter fullscreen mode Exit fullscreen mode

This is powerful because:

  • It's operation-independent. Whether the expense was created via UI, API, or direct SQL — the validation is the same.
  • It catches duplicates. If a bug causes two records to be inserted, len(isolated_record) will be 2.
  • It catches phantom data. If some other process modified the table during the test, the set difference will include unexpected records.
  • It catches amount drift. If $100 was entered but $99.99 was stored (floating point issues, rounding bugs), the sum check catches it.

Real Implementation

In my framework, this looks like this across the stack:

DB helper that captures state:

@staticmethod
@allure.step("DB: Get all expenses as set")
def get_all_expenses_as_set(cursor):
    cursor.execute("SELECT id, expense_name, amount FROM expenses")
    return {(row[0], row[1], row[2]) for row in cursor.fetchall()}

@staticmethod
@allure.step("DB: Get sum of amounts")
def get_sum_of_amounts(cursor):
    cursor.execute("SELECT COALESCE(SUM(amount), 0) FROM expenses")
    return cursor.fetchone()[0]
Enter fullscreen mode Exit fullscreen mode

Cross-layer E2E test that uses it:

def test_api_create_reflects_in_db(self):
    # Capture pre-state
    old_set = DBActions.get_all_expenses_as_set(cursor)
    old_sum = DBActions.get_sum_of_amounts(cursor)

    # Act: Create expense via API
    response = APIActions.post(session, url, payload)
    APIVerification.verify_status_code(response, 201)

    # Capture post-state
    new_set = DBActions.get_all_expenses_as_set(cursor)
    new_sum = DBActions.get_sum_of_amounts(cursor)

    # Validate integrity
    diff = new_set - old_set
    assert len(diff) == 1, f"Expected 1 new record, got {len(diff)}"
    assert new_sum - old_sum == expected_amount
Enter fullscreen mode Exit fullscreen mode

The same pattern applies to update (set difference shows one record with changed values) and delete (old_set - new_set shows the removed record).

Where This Caught Real Bugs

While building this, the Set Theory approach caught two issues that single-layer tests completely missed:

1. MySQL CHECK constraint silently rejecting negative amounts.
The UI happily accepted -50 as an expense amount. The API returned 201. But MySQL's CHECK (amount >= 0) constraint blocked the INSERT — so the record never existed in the DB. The set difference was empty when it should have contained one record. Without the cross-layer test, this would have looked like a perfectly passing test suite.

2. VARCHAR(255) overflow truncation.
A 300-character expense name was entered through the UI. The API accepted it. MySQL truncated it to 255 characters silently. The set difference caught the mismatch because the stored record didn't match the expected data.

The Full Picture

This technique is one piece of a larger framework I built with 53 tests across 4 layers (Web/Playwright, API/Flask, Mobile/Appium, Database/MySQL). The cross-layer E2E tests that use Set Theory are a small percentage of the total test count, but they catch the highest-risk bugs.

The architecture enforces strict separation:

Tests → Workflows → Actions/Verifications → Page Objects + Data
Enter fullscreen mode Exit fullscreen mode

Every layer has one job. Tests never call raw UI or API actions directly — they go through workflows that compose actions into business flows. This keeps the set theory validation reusable across different test scenarios.

When You Should (and Shouldn't) Use This

Use it when:

  • Your application handles financial data, inventory, or any domain where data accuracy matters more than UI polish
  • Data flows through multiple systems (frontend → backend → database → reporting)
  • You've had production bugs where "the UI said X but the DB had Y"

Don't use it when:

  • You're testing a static website or content-only app
  • The DB is behind a well-tested ORM with strong constraints and you trust the abstraction
  • Test execution time is critical and you can't afford the extra DB queries

Try It Yourself

The full framework is open source:

GitHub: Financial-Integrity-Ecosystem

You can run the entire suite with a single command:

git clone https://github.com/Yaniv2809/Financial-Integrity-Ecosystem.git
cd Financial-Integrity-Ecosystem
docker-compose up --build
Enter fullscreen mode Exit fullscreen mode

This spins up MySQL, Flask, JSON Server, and Playwright — runs all 37 non-mobile tests automatically.

The cross-layer E2E tests are in tests/api/test_e2e_api_db_expense.py and tests/test_e2e_web_api_db.py if you want to see the set theory pattern in action.


I recently shared this project on r/QualityAssurance and got valuable feedback that led to several improvements, including adding a testing philosophy section that explains the business risk behind each layer. If you have feedback or have used similar patterns in production, I'd genuinely like to hear about it.

Yaniv Metuku — QA Automation Engineer

Top comments (0)