DEV Community

Nico Reyes

My validation caught everything. Except what mattered.

Built data validation for a product feed importer. CSV comes in, checks run, issues get flagged. Clean data goes to the database.

Spent a week on it. Checked for empty fields, invalid URLs, price format, date format, SKU format. Passed every test.

First real import. 2000 products. Zero validation errors. Perfect.

Client checks dashboard next morning. "Why do 400 products say $0.00?"

Turns out zero is valid

Prices were there. Format was valid. Just all zeros.

My validation:

# ValidationError is defined or imported elsewhere (not shown in this post).
def validate_price(price):
    if not price:
        raise ValidationError("Price required")

    try:
        float_price = float(price)
    except ValueError:
        raise ValidationError("Price must be number")

    if float_price < 0:
        raise ValidationError("Price cannot be negative")

    return True

Looks fine. Checks for empty, checks format, checks negative.

What could go wrong?

Everything, apparently.

Zero is technically valid. Not empty. Not negative. Passes all checks. But no product costs $0. Obviously wrong data.

Went back through CSV. Supplier had encoding issues. Cells that should have been "$49.99" came through as "$0" after their export broke.

Validator saw valid numbers. Database accepted valid numbers. Dashboard showed valid numbers.

All wrong.

The one line fix

Tightened the lower bound. Zero now gets rejected along with negatives. You're selling something, it costs something.

def validate_price(price):
    if not price:
        raise ValidationError("Price required")

    try:
        float_price = float(price)
    except ValueError:
        raise ValidationError("Price must be number")

    if float_price <= 0:  # Changed: was "< 0", which let zero through
        raise ValidationError("Price must be greater than zero")

    return True

Reimported. 400 products flagged immediately. Got corrected feed from supplier. Imported clean.
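
A check like this is cheap to pin down with a few assertions. The sketch below restates the fixed validator so it runs on its own. The `ValidationError` class is a stand-in (the post never shows where the real one comes from), and `is_valid_price` is a hypothetical helper added only for the example.

```python
class ValidationError(Exception):
    """Stand-in for whatever ValidationError the importer really uses (assumption)."""


def validate_price(price):
    if not price:
        raise ValidationError("Price required")

    try:
        float_price = float(price)
    except ValueError:
        raise ValidationError("Price must be number")

    if float_price <= 0:
        raise ValidationError("Price must be greater than zero")

    return True


def is_valid_price(price):
    """Hypothetical helper: True if validate_price accepts the value."""
    try:
        return validate_price(price)
    except ValidationError:
        return False


# Well-formed numbers that make no sense as a price.
for bad in ["0", "0.00", "-5"]:
    assert not is_valid_price(bad)

# An ordinary price still passes.
assert is_valid_price("49.99")
```

One more gap worth knowing about: `float()` also accepts strings like "nan" and "inf", and both would pass the check above. Rejecting them with `math.isfinite` is one option.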

Other nonsense I found

Started checking for stuff that looks valid but isn't:

Dates far in the future. Product availability date set to 2099.

Weights of 0. Physically impossible unless you're selling air.

Descriptions that are just spaces. Technically not an empty string.

Categories that don't exist in the system. String validation passed. A reference check against the real category list would not have.

None of these broke format rules. All broke business rules.
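
Roughly what those four checks could look like, as a sketch rather than the actual importer code. The field names, the `MAX_DAYS_AHEAD` limit, and the `KNOWN_CATEGORIES` set are all assumptions made up for illustration. Real limits depend on the business.

```python
from datetime import date, timedelta

# Assumed values for illustration only.
MAX_DAYS_AHEAD = 365
KNOWN_CATEGORIES = {"shoes", "bags", "hats"}


def check_business_rules(row, today=None):
    """Return a list of problems for a row that already passed format checks."""
    today = today or date.today()
    problems = []

    # A date can be well-formed and still absurd.
    if row["available_date"] > today + timedelta(days=MAX_DAYS_AHEAD):
        problems.append("available_date is too far in the future")

    # Zero parses fine as a number, but nothing weighs nothing.
    if row["weight"] <= 0:
        problems.append("weight must be greater than zero")

    # "   " is not an empty string, so strip before testing.
    if not row["description"].strip():
        problems.append("description is blank")

    # A valid string is not the same as a category that exists.
    if row["category"] not in KNOWN_CATEGORIES:
        problems.append("category does not exist")

    return problems


bad_row = {
    "available_date": date(2099, 1, 1),
    "weight": 0.0,
    "description": "   ",
    "category": "unicorns",
}
problems = check_business_rules(bad_row, today=date(2024, 1, 1))
```

Returning a list instead of raising on the first problem means one bad row reports everything wrong with it in a single pass.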

Validator has two layers now. Format layer catches malformed data. Logic layer catches nonsense data that happens to be formatted correctly.
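
A minimal sketch of that split, using price as the only field. The function names and the `ValidationError` stand-in are hypothetical, not the real importer. The point is the ordering: the logic layer only ever sees values the format layer has already parsed.

```python
class ValidationError(Exception):
    """Stand-in error type (assumption)."""


def check_format(row):
    """Format layer: is the field well-formed at all?"""
    try:
        price = float(row["price"])
    except (KeyError, TypeError, ValueError):
        raise ValidationError("price must be a number")
    return {"sku": row.get("sku", ""), "price": price}


def check_logic(parsed):
    """Logic layer: does the well-formed value make sense for a product?"""
    if parsed["price"] <= 0:
        raise ValidationError("price must be greater than zero")


def validate_row(row):
    """Run the format layer first, then the logic layer on the parsed values."""
    try:
        parsed = check_format(row)
        check_logic(parsed)
    except ValidationError as exc:
        return False, str(exc)
    return True, None
```

With this shape, `validate_row({"sku": "A2", "price": "0"})` fails in the logic layer, while `{"price": "abc"}` never gets that far.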

Still find weird edge cases, honestly. Log everything so I can add checks when something stupid gets through.
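
One way the logging could look, as a rough sketch using the standard `logging` module. The logger name and the `log_import_summary` function are made up for the example. The idea is to log a per-import summary so a suspicious cluster, like hundreds of rows sharing one price, shows up even when every row passed.

```python
import logging
from collections import Counter

logger = logging.getLogger("feed_import")  # hypothetical logger name


def log_import_summary(rows):
    """Log the most common price in an import so odd clusters stand out."""
    if not rows:
        logger.info("imported 0 rows")
        return None

    prices = Counter(row["price"] for row in rows)
    value, count = prices.most_common(1)[0]
    logger.info(
        "imported %d rows; most common price %r appears %d times",
        len(rows), value, count,
    )
    return value, count
```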
