The hidden cost of unique_for_date

#django #python #webdev #codequality

Can you spot the problem with this Django code?

from django.db import models

class ExampleModel(models.Model):
    date = models.DateField()
    name = models.CharField(max_length=100, unique_for_date='date')

It's unique_for_date, which is meant to ensure that name is unique for a given date. So far so good, but unique_for_date (and its siblings unique_for_month and unique_for_year) has a lot of gotchas and is very brittle:

The constraint is not enforced by the database.
It is checked only during Model.validate_unique(), so the check never runs if Model.save() is called without first calling Model.validate_unique() (or Model.full_clean(), which calls it).
Even when Model.validate_unique() is called through a ModelForm, the check is skipped if the form excludes any of the fields involved in the check.
Only the date portion of the field will be considered, even when validating a DateTimeField.

That's a lot of "buuuuut" for a developer to keep in their minds when building a mental model. Lots of room there for the unexpected to creep in:

A developer runs ExampleModel.objects.create(...) ad hoc in the shell and forgets to call validate_unique() first. Sure, we should not SSH into production and create records ad hoc, but let he who is without sin, etc.
A view or serializer calls ExampleModel.objects.create(...) or Model.save() without explicitly calling validate_unique(). Yes, code review should catch it, but if code reviewers could catch 100% of mistakes 100% of the time with 100% consistency, we would not need code review at all, because such an Übermensch would not create bugs in the first place.

So the unique_for_* options have many traps that can be triggered by human error. When using them, a developer may conclude that these problems don't apply to the specific problem they're solving, and trust themselves and their current and future teammates not to make mistakes. However, requirements change over time. Things tend to get more different, not more similar. Code entropy is real. As the situation on the ground changes, can we be sure that none of those problems will be hit? What's your risk appetite?

Avoiding the problem

Instead of the terse but brittle:

from django.db import models

class ExampleModel(models.Model):
    date = models.DateField()
    name = models.CharField(max_length=100, unique_for_date='date')

We can write something more verbose and less DRY, but also more explicit and more future-proof:

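from django.core.exceptions import ValidationError
from django.db import models

class MyModel(models.Model):
    date = models.DateField()
    name = models.CharField(max_length=100)

    def save(self, *args, **kwargs):
        # Change the specific filter depending on need.
        clashes = MyModel.objects.filter(date=self.date, name=self.name)
        if self.pk is not None:
            # Exclude this row so that re-saving an existing record passes.
            clashes = clashes.exclude(pk=self.pk)
        if clashes.exists():
            raise ValidationError({'name': 'Nein!'})
        return super().save(*args, **kwargs)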

This validation runs whenever Model.save() is called, including through Model.objects.create(). Unfortunately it does not run for QuerySet.update() or other bulk operations that bypass save(), such as bulk_create(). Still, this is about harm reduction rather than perfection.