Discussion on: DO NOT trust your frontend validators

Don't do your validation in the API/middleware either! To be truly robust all constraints should be built into the database and APIs will call stored procedures

I love the idea of moving validation logic as close to my data storage as possible. However, I also don't like putting business logic directly into my database. Yes, it's an oxymoron, I know :D

But this is a matter of taste I guess. I see your point here, especially when you've got multiple APIs accessing the same database - However, I suspect it's difficult to prevent users from using raw insert and update statements anyways, which of course would bypass the stored procedure inserts and updates ...

However, I think this is a matter of taste tbh with you, and you're definitely "closer" to my personal opinion than the guys simply adding frontend RegEx validators to the mix ... ;)

Jack • Aug 19 '22

This is kinda my take too. I know the most robust way is to build validators and constraints directly into the database. But in reality, you should only need to validate data at its contact point.

Once I've validated the request payload, I (as the developer) should know that my data is "safe" and the only person who can screw it up is me 😅

danjelo • Aug 19 '22

I usually use constraints for at least primary and foreign keys. I have gotten into a serious mess a few times when there were none, and data migrations and faulty logic put wrong ids in as keys :)
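
A minimal sketch of what a foreign key constraint buys you, using Python's built-in `sqlite3` as a stand-in for whatever RDBMS is actually in use. The `customers` and `orders` tables and the `insert_order` helper are hypothetical names made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")


def insert_order(customer_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        # A faulty migration or buggy code path trying to write an unknown id ends up here.
        return False
```

With this in place, `insert_order(1)` is accepted while `insert_order(999)` is refused by the database itself, no matter which code path attempted the write.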

As a side note, some ORMs such as EF Core have nice code-first functionality where validation in the models is reflected in the DB as constraints.

Thomas Hansen AINIRO.IO • Aug 19 '22

As a side note, some ORMs such as EF Core have nice code-first functionality where validation in the models is reflected in the DB as constraints

The problem I've got with EF is the disparity between the RDBMS and its "OOP circus". For instance, it's very tempting to just do myObject.Save(). This model of using a database increases bandwidth consumption (passing in the whole object during updates, for instance), it increases chatter towards the DB, and it makes it harder to synchronise access, resulting in the need for "locking records" either logically or physically somehow ...

danjelo • Aug 19 '22

Yes, agreed. I have to say I am not really a fan of ORMs in general, for the object-relational impedance mismatch for one thing, and for their tendency to generate hellish SQL :) I recently troubleshot a slow EF Core query. Could not find the issue; likely some sort of "parameter sniffing" problem where the query plan was not used.

Aaron Reese • Aug 19 '22

@jack:

But in reality, you should only need to validate data at its contact point.

Getting a bit OT here, but I absolutely disagree. You are about to 'POST' a customer order. How do you know that, between the time the customer started the order on the app/website and the time they submitted it, the finance team has not put the customer account on hold for non-payment? This can only be checked on the back end. On a really busy system (e.g. Amazon on Black Friday) this order request may even go into a message queue and may not get processed for several minutes. By the time it gets loaded into the system, the stock may be gone or the account may be suspended.
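
The queue scenario can be sketched in a few lines of Python. Everything here is hypothetical (the in-memory `accounts` and `stock` dictionaries, the `submit_order` and `process_next` functions); the point is only that a check made at submission time says nothing about the state at processing time.

```python
from collections import deque

# Hypothetical in-memory state standing in for the real back end.
accounts = {"cust-1": {"on_hold": False}}
stock = {"widget": 1}
queue = deque()


def submit_order(customer_id, item):
    """Contact-point validation: only sees the state at submission time."""
    if accounts[customer_id]["on_hold"] or stock.get(item, 0) < 1:
        return False
    queue.append((customer_id, item))
    return True


def process_next():
    """Re-validate against the current state, which may have changed while queued."""
    customer_id, item = queue.popleft()
    if accounts[customer_id]["on_hold"]:
        return "rejected: account on hold"
    if stock.get(item, 0) < 1:
        return "rejected: out of stock"
    stock[item] -= 1
    return "accepted"
```

If finance sets `accounts["cust-1"]["on_hold"] = True` after `submit_order` has returned `True`, only the second check in `process_next` catches it.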

Thomas Hansen AINIRO.IO • Aug 19 '22

These are problems 90 percent of us never face …

Jack • Aug 19 '22

You've quoted me but without the italics which totally changes the tone of my statement 😆

I don't work at Amazon or anything close to that kind of scale, and the chances of something going wrong between contact point and database are virtually (virtually) 0.

Thomas Hansen AINIRO.IO • Aug 19 '22

Hehe 😅

You wish I was sorry. Sorry, but I’m not 🤪😉

Aaron Reese • Aug 19 '22

I also don't like putting business logic directly into my database

Why not? I can think of a few reasons but I would love to hear yours. To a certain extent I was being controversial with my original reply. Perhaps there should be a distinction between 'business logic' and 'data integrity'. Entering a telephone number and postal address in different countries doesn't break data integrity but it could be against business rules.
Ultimately someone/something has to be responsible for the validity of the data, and the data store is the one constant. You mentioned that [backend] users could do direct INSERT statements: well, put the logic in a trigger.
In case you can't guess, I am a database guy. When the FE or API developers screw up the logic, guess who has to sort out the mess :)
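
As a rough sketch of the trigger idea, again using Python's `sqlite3` as a stand-in: the `contacts` table, the trigger name, and the rule itself (phone and address must be in the same country) are assumptions made up for this example, and trigger syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY,
    phone_country TEXT NOT NULL,
    address_country TEXT NOT NULL
);

-- Runs for every INSERT, including raw statements that bypass the API.
CREATE TRIGGER contacts_country_match
BEFORE INSERT ON contacts
WHEN NEW.phone_country <> NEW.address_country
BEGIN
    SELECT RAISE(ABORT, 'phone and address countries differ');
END;
""")


def raw_insert(phone_country, address_country):
    """Simulate a direct INSERT; return 'ok' or the database's error message."""
    try:
        conn.execute(
            "INSERT INTO contacts (phone_country, address_country) VALUES (?, ?)",
            (phone_country, address_country),
        )
        return "ok"
    except sqlite3.DatabaseError as exc:
        return str(exc)
```

Here `raw_insert("GB", "GB")` goes through, while `raw_insert("GB", "US")` is aborted by the trigger even though no API or stored procedure was involved.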

Thomas Hansen AINIRO.IO • Aug 19 '22

Why not? I can think of a few reasons but I would love to hear yours

First of all, I find it incredibly hard to write validation logic in SQL. For instance, how do you validate an email address in a stored procedure? I'm sure it can be done, I'm just not entirely sure I want to see the code ... ;)

Perhaps there should be a distinction between 'business logic' and 'data integrity'

100% agree! Everything you can make the database take care of, you should make the database take care of, such as referential integrity, null versus not null, field length, etc. However, in my video I illustrate a case where the validator semantically communicates back to the client that a field is not long enough. Validating things like this in a stored procedure would be hard, and would probably result in an exception that can't be returned to the user for security reasons. Not to mention that the database is typically deployed on a different machine, possibly on a different network, than the backend API, so validating there costs one additional network request, which makes it faster to validate in the API backend.
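
One way to picture this split, as a sketch only: the API layer produces the human-readable message, and the database constraint stays in place as a backstop. The `users` table, the `create_user` function, the 5-character minimum, and the status codes are all assumptions for illustration, with `sqlite3` standing in for the real database.

```python
import sqlite3

MIN_USERNAME_LENGTH = 5  # hypothetical rule for this example

conn = sqlite3.connect(":memory:")
# The database enforces the same rule, but can only fail with a generic error.
conn.execute(
    "CREATE TABLE users ("
    " username TEXT NOT NULL CHECK (length(username) >= 5))"
)


def create_user(payload):
    """Validate in the API first so the client gets a meaningful message."""
    username = payload.get("username")
    if not isinstance(username, str) or len(username) < MIN_USERNAME_LENGTH:
        return 400, f"username must be at least {MIN_USERNAME_LENGTH} characters"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        # Backstop: never forward the raw database error to the client.
        return 400, "invalid data"
    return 201, "created"
```

The early return also avoids a round trip to the database for payloads that are obviously invalid.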

In case you can't guess, I am a database guy

Ahh, makes sense :)

By all means, apply as much data validation as you can in the database. I guess I just have a somewhat similar opinion of database validation as I have of frontend validation: "It's cool, nice to have, but don't exclusively do it" ... ;)

(For different reasons though)

guess who has to sort out the mess

I feel your pain ... :/

András Tóth • Aug 31 '22

However, I also don't like putting business logic directly into my database.

And why is that? Because the database is a really clunky coding experience.

I came to the conclusion that it is time for reimagining SQL. The language and connection must be modernized:

gain the ability to easily integrate with source control tools like git
modern programming language features: move away from thinking "it's a language to query the database" to have packages, code modules, unit testing/mocking capabilities

If this sounds ridiculous, how does it sound to do n non-transactional round trips to the DB just because the team can only use ORMs and they don't know how to write the one action as one database transaction...
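
For contrast, here is a minimal sketch of "one action as one database transaction", using Python's `sqlite3`. The `accounts` table and the `transfer` and `balances` helpers are hypothetical; the relevant part is that both updates commit together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move money in a single transaction; roll everything back on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused an overdraft; the credit above is undone too.
        return False


def balances():
    """Return {account_id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Had the two updates been sent as separate, independently committed statements, a failed debit would have left the credit in place.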

Thomas Hansen AINIRO.IO • Aug 31 '22

how does it sound to do n non-transactional round trips to the DB just because the team can only use ORMs

I've already covered ORMs ... ;)