Discussion on: Database constraints are (often) harmful

View post

I see your points, and even agree in principle, but disagree with most of them in the real world.

A lot of assumptions people have about anything are wrong. For instance, a lot of services use a unique email as an identifier, but there's no one-to-one mapping between people and email addresses. I have lots (ben@my.work.address, ben@my.play.address...), some people share one (support@my.work.address, thesmithfamily@example.com) and enforcing uniqueness then becomes a business problem instead or you'll get a buttload of users who're breaking your Ts&Cs.

In the real world, code gets deployed which may or may not have been tested well, and tests may or may not cover every eventuality. The QA user buys negative PI beers.

Without constraints, and especially without transactions, data will get corrupted. It's like using C to manage strings as character arrays. You'll screw it up at least once while you're writing your program and see some sort of delicious segfault. If you have the chance to use a string type which handles all that for you, why not do it?

The difference between my string example and the RDBMS of the question is that the database persists longer than your new code. Errors in it go unnoticed, code crashes part-way through processing a "DROP user" and you have dangling subscriptions. Then you need to write a new function called removeOrphanSubscriptions which you call on cron every night at the software witching hour.

And the problem with optimisations is that choosing to not use a constraint because it migh degrade performance is the optimisation, not the other way around.

Jon Lauridsen • Apr 14 '22

I hadn't considered transactions as part of my argument, we're definitely in agreement that transactions form a backbone of reliable writes. There should never be a dangling Subscription because a service crashed halfway through, and if what I wrote implies I'm against transactions then please point out the offending text so I can fix it :)

I think we both agree that the correct place to prevent buying negative beer should be in the code, right? Maybe we'd argue exactly where and how to do it, but from what you write I think we'd both argue the code should be responsible. The code is the responsible system. You describe the case where a mistake in the code would let an error through that'd get caught in the database, and yes you're right constraints might catch such errors, but it's kind of based on luck: Because business requirements can't be (or certainly shouldn't be) modeled entirely in the database, there are no guarantees that errors in the code layer will be caught in the database. It is possible to get lucky and catch an error, but in my opinion & experience it can just as easily be a regex-constraint that's gone out of sync with the code and now produces hard-to-verify errors because of some edge-case where a certain combination of characters result in a database error. Hope is not a strategy, and if the code should be responsible I'm all for focusing all efforts on that layer instead of adding partial safety nets where they don't really belong.

Perhaps constraints as you describe it made a lot more sense when code was managed by different people than the database, because those constraints allow the database-context to establish rules to make sense of its own data. But if one team manages both code and DB? I'm really suspicious of the value a constraint adds in that context, other than taking up even more minutes to writing and verifying a migration.

That said I'm not here to tell anyone how to work, you make good points and as long as we both can work on systems we enjoy I'm all the happier :)