Dave Cross

Posted on Jun 28 • Originally published at perlhacks.com on Jun 28

Choosing the Right Database Abstraction

#database #dbi #dbixclass

A question came up recently in the Perl community asking whether, in a Mojolicious application, it’s better to use DBIx::Class or a Mojolicious-specific module like Mojo::Pg.

It’s an interesting question, but I think it’s asking the wrong thing.

I’ve spent almost forty years moving up the stack of database abstractions. Every few years, someone invents a new layer, and every few years, experienced programmers explain that they don’t need it because they’re perfectly capable of working at the layer below. I’ve watched exactly the same argument play out over CGI, web frameworks, ORMs, containers and now AI-assisted coding.

Sometimes the new abstraction turns out to be a dead end. More often, it’s simply another step that lets us spend less time on plumbing and more time solving the problems our users actually care about.

I think that’s what’s really going on here.

You’re comparing different layers

The first thing that struck me about the discussion was that it jumped straight from DBIx::Class to Mojo::Pg, as though they were equivalent choices. I don’t think they are.

If you zoom out a little, there are several layers involved in database access:

Application
DBIx::Class / DBIO (ORM)
DBI
DBD::Pg / DBD::SQLite / DBD::MariaDB
Database

Each layer builds on the one below it. DBIx::Class sits on top of DBI, DBI sits on top of a database driver, and the driver talks to the database itself. Every layer gives you a slightly richer abstraction whilst hiding a little more of the mechanics.

Mojo::Pg occupies a rather different place. It’s a PostgreSQL library designed specifically for Mojolicious. That’s a perfectly sensible design if you’ve already decided you’re using both PostgreSQL and Mojolicious, but it couples together decisions that I’d normally prefer to keep separate.

Personally, I like choosing my web framework independently of my database, and my database independently of my data access layer. The more independent those decisions are, the easier it is to change one without affecting the others.

Why I moved up the stack

I started writing embedded SQL in C in the late 1980s. When I moved to Perl in 1996, I used Sybperl, which was a thin layer over the proprietary Sybase database API. When I discovered DBI, I adopted it enthusiastically because it removed an enormous amount of repetitive code whilst remaining database-independent. Later, when DBIx::Class became mature enough for production systems, I happily moved another level up the stack.

None of those decisions happened because I couldn’t write SQL. I’ve been writing SQL since 1988 and I’m perfectly happy doing so.

The point was that I no longer wanted to spend my time writing the same joins, the same foreign key lookups and the same bits of boilerplate over and over again. Those aren’t the interesting parts of my applications. They’re necessary, but they’re plumbing. Good abstractions let me write that plumbing once and then think about the domain I’m modelling instead.

This isn’t just about databases

A few weeks ago, I wrote about The Long Road from CGI to Containers. The point of that article was that the history of web development is largely the history of building better abstractions.

We moved from CGI scripts to web frameworks. From hand-written deployment procedures to containers. Every step let us express our intent at a higher level whilst worrying less about the implementation details.

Database programming has followed exactly the same path.

But I like writing SQL

Whenever I recommend an ORM, someone inevitably assumes it’s because I don’t know SQL.

Nothing could be further from the truth.

SQL is probably the programming language I’ve been writing the longest. I enjoy writing SQL. But I don’t enjoy writing the same SQL repeatedly.

I don’t want every controller containing another join between the same tables. I don’t want to duplicate the same WHERE clause in half a dozen places. I don’t want to remember every foreign key relationship every time I touch the code.

Those things aren’t business logic; they’re plumbing.

An ORM lets me move that plumbing into one place and give it names.

What DBIx::Class gives me

For me, the biggest advantage of DBIx::Class isn’t that it generates SQL.

It’s that it models my application.

A Person has many Titles.
A Title belongs to a Person.
A User> has many Orders.

Those relationships become part of the vocabulary of the application. Queries become reusable methods. Business rules become object methods instead of comments beside SQL statements.

Instead of thinking about joins, I’m thinking about people, titles and orders.

That’s a much nicer level of abstraction.

Isn’t it slower?

One criticism often levelled at DBIx::Class is performance.

That’s not entirely unfair. It’s perfectly possible to write inefficient DBIx::Class code. I know because I’ve done it.

While recently carrying out some optimisation on my Line of Succession application, I discovered one page was generating hundreds of unnecessary database queries. The solution wasn’t to abandon DBIx::Class and replace it with hand-written SQL. The solution was to understand the ORM better.

Adding a couple of well-placed prefetch clauses reduced the number of queries dramatically with almost no change to the surrounding code.

The ORM wasn’t the problem; my use of the ORM was.

That’s a useful lesson. Every abstraction has a learning curve. If you don’t understand how it’s working underneath, you’ll eventually hit performance problems. But once you do understand it, you can often achieve dramatic improvements without sacrificing the higher-level API that made you choose the abstraction in the first place.

What if I just want to write SQL?

One thing that struck me about the original discussion is that it jumped straight from DBIx::Class to Mojo::Pg, as though those were the only two choices.

I don’t think they are.

If all I wanted was to execute SQL from a Mojolicious application, I’d probably start by looking for a DBI-based solution rather than one that’s tied specifically to PostgreSQL. Something like Mojolicious::Plugin::Database seems like a more natural fit, as it lets me use the standard Perl database abstraction whilst integrating cleanly with Mojolicious.

That still leaves me free to choose PostgreSQL, SQLite, MariaDB or whatever other database makes sense for the project.

By contrast, choosing Mojo::Pg bakes the decision to use PostgreSQL into your application at a deeper level than I’d be comfortable with.

Good abstractions should reduce coupling, not increase it.

The model layer

Of course, in larger applications, there’s usually another abstraction on top of the ORM itself. My controllers don’t generally talk directly to DBIx::Class resultsets; they ask domain-level questions such as “Who was sovereign on this date?” The fact that the answer currently comes from DBIx::Class is an implementation detail. That’s another useful abstraction—but it’s probably a topic for another article.

Aside: What happened to DBIx::Class?

For a long time, this was the awkward question.

Around a decade ago, active development effectively stopped. This wasn’t because the technology had failed or because the community lost interest. The maintainer had explicitly stopped development, and measures such as the CPAN NOXFER co-maintainer flag prevented the normal open source succession process from happening.

For years, that left DBIx::Class in an uncomfortable position. It remained stable and widely used, but there was understandable uncertainty about its long-term future.

Fortunately, that situation finally appears to be changing. DBIO appeared on CPAN just a few days ago. It has taken the hard-fork route, preserving the ideas that made DBIx::Class successful whilst allowing active development to resume.

Whether DBIO ultimately becomes the successor to DBIx::Class isn’t really the important point.

The important point is that these ideas are still valuable enough that people are prepared to invest significant effort in carrying them forward.

Choosing your abstraction

I don’t think this is fundamentally a question about Mojolicious.

I’d happily use DBIx::Class with Mojolicious, Dancer2 or any other web framework. I’d happily use DBI with all of them too. The web framework and the data access layer are largely independent architectural decisions, and I generally prefer to keep them that way.

The important decision isn’t the framework.

It isn’t even the database.

It’s deciding where you want your application to live.

Do you want to think in terms of SQL?

Do you want to think in terms of tables?

Or do you want to think in terms of the concepts that make up your application’s domain?

After nearly forty years of writing SQL, I know which one I’d rather spend my day thinking about.

The best abstraction isn’t the one that hides the most detail.

It’s the one that lets you spend most of your time thinking about the problem you’re actually trying to solve.

DEV Community