
Does Phoenix make database CRUD too easy?

Michael Herold on August 14, 2018

In 2014, the Phoenix framework emerged onto the web development scene as a fast, productive, and concurrent solution for the modern web. [...]
 

I'm neither a Ruby nor an Erlang/Elixir developer so I can't speak to most of this, but I am going to go ahead and challenge the idea that 'business logic is happier without knowing it works with a database' :)

There are arguments for ensuring that the logic is portable without regard for a particular flavor of (relational, since if you're going NoSQL you're already locked in) database, most commonly that you're building an enterprise product which will be deployed in environments that already run aSQL with admins who don't want to have to learn and manage bSQL. But it's not a foregone conclusion.

Databases and SQL are better at many things than higher-level general purpose programming languages. When user input isn't required, it makes no sense to pull information out of a database, operate on it in application code, and put it back when SQL could accomplish the same goals in a fraction of the time. Database constraints too are the only way to truly enforce data validity and integrity, so you're already performing some business logic functions in the database. This stuff never really happens in a vacuum, and coming to grips with that is a lot more sustainable than trying to pretend otherwise.
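To make the round-trip point concrete, here is a minimal sketch in Python with the stdlib sqlite3 module (a stand-in for Postgres, since the idea is the same in any relational database). The "products" table and the 10% price bump are hypothetical; the point is that the set-based UPDATE does in one statement what the application-code version does in N round trips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price REAL)")
seed = [("book", 10.0), ("book", 20.0), ("toy", 5.0)]
conn.executemany("INSERT INTO products (category, price) VALUES (?, ?)", seed)

# Application-code version: pull rows out, compute in Python, write back.
rows = conn.execute("SELECT id, price FROM products WHERE category = 'book'").fetchall()
for row_id, price in rows:
    conn.execute("UPDATE products SET price = ? WHERE id = ?", (price * 1.1, row_id))
loop_result = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()

# Reset the prices, then do the same work as one set-based statement.
conn.executemany("UPDATE products SET price = ? WHERE id = ?",
                 [(price, i + 1) for i, (_, price) in enumerate(seed)])
conn.execute("UPDATE products SET price = price * 1.1 WHERE category = 'book'")
set_result = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()

assert loop_result == set_result  # same outcome, one statement instead of N
```

On a real network-attached database the difference is even starker, because each statement in the loop pays a full client/server round trip.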

 

I've thought about your comment for a few days now. Here are my thoughts on the topic, not in the order in which you presented them since I think my thoughts will be more coherent that way. Hopefully, that's okay! :)

When user input isn't required, it makes no sense to pull information out of a database, operate on it in application code, and put it back when SQL could accomplish the same goals in a fraction of the time.
[...]
admins who don't want to have to learn and manage bSQL.

I'm curious about this. It sounds to me like you're suggesting you use things like triggers or PL/pgSQL (in the Postgres world, of course!) to directly update data within the database? Is that something you do? How does it work out? Have you run into any maintenance issues there?

I've often thought that triggers would be a nice thing to use, but splitting your logic between the database and the application feels like it would lead to a maintenance headache in the form of increased cognitive load. When there is a bug in the application, you then have to look in multiple places for the source of the bug. In addition, there's more "surface area" for onboarding new developers onto the application.

Maybe I misunderstood and you're not suggesting that. If I did misunderstand, I'd love to hear what else you meant!

I am going to go ahead and challenge the idea that 'business logic is happier without knowing it works with a database' :)
[...]
Database constraints too are the only way to truly enforce data validity and integrity, so you're already performing some business logic functions in the database.

Between these two statements, I think that you and I mean two different things when we say "business logic". To me, business logic consists of all the things that can't (easily, at least ... I know you can do some magic things in procedural RDBMS languages) be expressed within the confines of SQL and/or relational algebra.

Business logic is why we have applications instead of spreadsheets.

Data integrity is a separate but related set of rules about the data that humans and other systems put into your application. Constraints help protect you from errors. They help you keep your data clean as it changes over time (for instance, foreign key constraints with ON DELETE CASCADE).
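The ON DELETE CASCADE example can be sketched like so, again with Python's sqlite3 as a stand-in and hypothetical "authors"/"posts" tables (note that SQLite only enforces foreign keys once the pragma is turned on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
    title TEXT)""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'Hello'), (1, 'World')")

# The constraint rejects orphaned rows outright ...
orphan_rejected = False
try:
    conn.execute("INSERT INTO posts (author_id, title) VALUES (99, 'Orphan')")
except sqlite3.IntegrityError:
    orphan_rejected = True

# ... and deleting the author cleans up the dependent rows automatically,
# so no application code path can forget to do it.
conn.execute("DELETE FROM authors WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
assert orphan_rejected and remaining == 0
```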

Data integrity is there to help you reduce the cognitive load in your business logic and, hopefully, make it so you can write your business logic how a human thinks about it, not how you have to tell a computer to do it.

Does that make sense? Thank you for the thoughtful comment!

 

I make extensive use of views, triggers, functions, and stored SQL scripts (the last of which my data access framework treats almost identically to functions; to bring this back to Elixir for a moment, Moebius was built toward similar goals by Massive's original developer). Lately I've applied tobyhede's Postgres event sourcing model in a couple of cases, and of course trying to do that outside SQL would be impractical at absolute best.

I definitely agree that splitting business logic can lead to headaches -- which is why I try to put as much of it as possible in the database and write fairly minimal higher-level APIs on top! All you need for that is a simple web framework, while migration tools like sqitch make it practical to iterate on the database the same way you do on the higher-level code. Of course this all does require some facility with SQL, or the time and willingness to learn.

Testing is also a concern. pgTAP exists, but so far, ensuring that API routes exercise the database fully and maintaining good test coverage for those routes has served me well enough.

Data integrity is there to help you reduce the cognitive load in your business logic and, hopefully, make it so you can write your business logic how a human thinks about it, not how you have to tell a computer to do it.

If you ride that train of thought a little further, the boundary will start to get really, really blurry. It's a false dichotomy: business logic as you presently conceive of it is a set of rules which describe things happening to information. So are integrity constraints: even your example of CASCADE on a foreign key is a rule which declares that when a tuple is deleted, tuples in another table with an attribute referencing it must also be deleted. That's low-level business logic!

To pontificate a little: web developers often have a weakness for synecdoche, taking 'application' to mean strictly a subset of data processing and user-facing components written in higher-level languages. When you think instead of a complete system assembled to achieve a purpose, the database is more than simply a storage device: it's an organizing principle. The database describes how information is structured and what can or can't be stored. Data storage considerations shape the architecture of higher-level components (how many XControllers have you run into where X is a table?). It's a mistake to think of the database as a wholly separate concern. If something would be faster or simpler as a trigger and you don't have a pressing reason to minimize database usage, make it a trigger! That's why the database developers gave you the option!
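As a small illustration of the "make it a trigger" advice, here is a sketch of an audit trail maintained entirely inside the database, using Python's sqlite3 and hypothetical "accounts"/"account_audit" tables (in Postgres you would write the trigger body in PL/pgSQL instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
CREATE TABLE account_audit (account_id INTEGER, old_balance REAL, new_balance REAL);

-- The audit rule lives next to the data it governs.
CREATE TRIGGER accounts_audit AFTER UPDATE OF balance ON accounts
BEGIN
    INSERT INTO account_audit VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100.0)")

# Every code path that touches the balance gets audited, no matter which
# higher-level component issued the UPDATE.
conn.execute("UPDATE accounts SET balance = 75.0 WHERE id = 1")
audit = conn.execute("SELECT * FROM account_audit").fetchall()
assert audit == [(1, 100.0, 75.0)]
```

The design point is that the rule is enforced once, at the organizing layer, rather than reimplemented in every application that writes to the table.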

 

I think there's a misconception here. Ecto itself does not need to be tied to your database.

I actually use Ecto a lot to validate data and enforce a schema, without even mentioning the database.

There's even an open issue on the Ecto repository about splitting the ecto package into ecto and ecto_sql, with the latter adding database access functionality on top of the schema and changeset management in the former. To use Ecto the way it works today, you will need to include both packages, and if you don't need the database access functionality you can just ditch the second package. All of this lands in the next version (3.0).

So no, I don't think Ecto is a bad choice for linking the domains, as it is positioning itself as a data validation library that can be enhanced with database access. :)

 

That's really interesting, I hadn't seen that issue yet.

Do you use an intermediate layer of Ecto changesets to map between contexts? Using embedded schemas and other such patterns has led to some headaches for me. I posted a question on the Elixir Forum that never got a reply (sadly, I can't link to it because the site appears to be down right now) around this very problem.

I'd love to hear how you approach the problem of decoupling from the database while still using Ecto! It sounds like Ecto is moving in an interesting direction.

 

Well, I usually don't :D

But I use Ecto to normalize some parameters and returns that have nothing to do with the database (like returns from external APIs). This gives me confidence that I could move my persistence layer without that much change in my domain's boundaries.

I'm interested to hear how using embedded schemas backfired for you, though. :)

Elixir Forum is back up, so I can link you directly to the issue.

Basically, I'm presenting a schema that embeds two other schemas to use in a single form within Phoenix.HTML.Form.form_for/4. This schema is then translated into an Ecto.Multi for committing. However, if there's an error, I lose some of the context when translating back from the Ecto.Multi to the schema-of-schemas.

This is all to work around what I see as a deficiency in the Ecto.Changeset implementation of Phoenix.HTML.FormData.form_for_errors/1. It doesn't play nicely with the error state of my construct.

There is likely a better way to do what I'm trying to do, but I'm still wrapping my head around this. :)
