Allow me to kick off with a bold statement: ORMs —like Hibernate (JVM), Entity Framework (.NET), ActiveRecord (Ruby), Django ORM (Python), Eloquent (PHP/Laravel), Doctrine (PHP), GORM (Go), etc.— are never a good idea. Period.
Even when using object oriented (OO) languages.
Why? The short (TLDR) version of this argument is:
They make the simple queries slightly simpler, but they do not help you for the harder queries. — me
For the long version continue reading the rest of this article...
Simple gets simpler with an ORM!
Instead of the potentially dangerous:
db.execute("select * from User u where u.id = $userId")
You can use an ORM and simply write:
Users.getById(userId)
With the following benefits:
- Harder to write code vulnerable to SQL injections.
- The result of the ORM call is usually mapped to some
User
object, where in case of the SQL you merely get the raw result. - Less to type and easier to read! (code should be optimized for readability)
But there's more! Following associations is really easy when using an ORM. Let's say we want to fetch the user's permissions from the database. Using SQL this would look like:
db.execute("select * from Permission p where p.user_id = $userId")
When using an ORM this could be as simple as:
val user = Users.getById(id)
user.permissions()
Finally, I think ORMs make a programmers life significantly easier when it comes to saving and deleting data in the database. Let's add a permission for the user and then delete the user altogether:
db.execute("insert into Permission (user_id, value) values ($userId, $permissionEnum)")
db.execute("delete from User u where u.id = $userId")
When using an ORM this would be:
Permission(userId, permissionEnum).save()
Users.getById(userId).delete()
With all these benefits, why do I argue against using an ORM?
Disadvantages of ORMs
Every database-backed application at some point needs more complex queries: you will need a query that cannot be expressed using the ORM (like complex aggregations, recursive queries or bulk inserts/updates). And then? You could just retrieve much more data and do the processing in the application layer, which is very slow. Usually in those cases the ORM is bypassed by writing SQL-in-strings, after which the code base contains two ways to query the db: (1) through the ORM and (2) using SQL-in-strings.
We can conclude ORMs are a very leaky abstraction (thanks Joey). In other words: you cannot merely learn how to use an ORM, you still need to learn SQL as well!
The whole reason to learn an ORM and the myriad concepts you need to master in order to use that ORM effectively, are just to make the simple queries slightly simpler.
What are these concepts common to most ORMs? Well...
-
Entities or Models – For mapping database table rows to objects. This may require quite a bit of coding: all
CREATE TABLE
statements now also require mapping code. - Associations – Like one-to-one, one-to-many, and many-to-many. These need to be specified on the models.
- Lazy/eager loading of associations – To controlling when related data is fetched.
-
Transactions – Ensuring data integrity with commit and rollback operations. Most ORMs have a specific way of achieving this (by abstracting over the db's
TRANSACTION
s). - Caching – Optimizing performance by avoiding redundant queries. Not all ORMs have caching layers, but many do. This also ties into the way your application is clustered.
- Concurrency – Managing conflicts when multiple users update data. Also know as optimistic locking.
- Identity mapping – To ensure the same object instance represents the same database row within a session.
- Sessions or Persistence contexts – To manage object life-cycles and database commits efficiently.
- Life-cycles – ORM objects usually life-cycles that tie in with translations, sessions and clean/dirty tracking.
- Clean/dirty tracking – data that was not changed, does not need to be updated in the database.
- Cascading of operations – Describes if the associated ORM object should be persisted. This ties in to the ORM,s notion of clean/dirty.
Not all these concepts apply to all ORMs. This list is merely to illustrate what concepts you may have to learn when using an ORM.
Code that uses an ORM —as the name implies— looks just like any object oriented code. For example, when we want to have retrieve the ISO code of the country that a user's company is registered in, the ORM-based solution would look like:
Users.getById(userId).organization().registeredAddress().country().isoCode
In this example it is not obvious when the query/queries to the database are made, how many queries are made, and what data is retrieved. With an ORM's lazy/eager settings it could be that fetching a "user" also fetches it's organization. The lazy/eager settings are usually set once: and thus cannot be tweaked per query.
While the following SQL-in-string alternative is longer, it is also very clear in what value is being retrieved (only the "iso code"):
db.execute("""
select c.isoCode
from User u
join Organization o on u.organizationId = o.id
join Address a on o.registrationAddressId = a.id
join Country c on a.countryId = c.id
where u.id = $userId
""")
Since ORMs tend to hide queries, analyzing performance bottlenecks of ORM-based code is much harder compared to "just SQL".
When the performance of a relational database backed application is critical, the business logic is usually for some part described in the database queries. In these kind of applications the library used to interact with the database has many call-sites in the application code, and often becomes tightly coupled with the business logic. Changing the library that is used to interact with the database for a large code base quickly becomes prohibitively difficult. Therefor, it is very important to carefully consider your options before making the investment.
So, when dealing with your data base, would you rather learn both...
...and...
...or just the latter?
If not an ORM, then what?
There are other ways to make integrating SQL queries in your language of choice easier. Usually these approaches do not alleviate you from learning SQL, but may provide some of the other benefits ORMs are known for, namely:
- Improved type safety.
- Some checking of SQL queries (so you get compile errors or failing tests when you change the name of a table but forget to update the query accordingly).
- Make it harder to write code vulnerable to SQL injections.
- Reduce boilerplate.
These non-ORM options that improve embedding SQL in general purpose languages generally come in two flavors:
- Improving on SQL-as-strings.
- As an embedded domain specific language (eDSL).
Compared to ORMs both approached are much thinner abstractions. In both cases it is very easy to "see the SQL". The solutions in category two have the potential to provide more features. The solutions in category usually perform slightly better (the eDSL solution are basically query builders: the queries are built at runtime).
Examples of libraries in the first category are:
Some of these can tie into your build process with a task that generates code based on the database schema (for this to work a database needs to be available at build time). Depending on the chosen library the generated code may contain:
- Tables names, column names, columns types and sometimes also index names; to add some type safety and IDE code completion.
- A "record" DTO definition for each (main) table to represents a row in that table.
- Methods to reduce the boilerplate code needed to follow an association.
Examples of solutions in this category are:
- jOOQ (JVM),
- LINQ's method syntax (.NET),
- Kysley and Drizzle with SQL-like syntax (TypeScript),
Conclusion
Using an ORM you will:
- Find it hard or impossible to write efficient queries:
- easy to write queries in loops,
- often way too much data is being fetched (not easy to reduce this), and
- only a limited set of the database's features are available.
- Become locked-in to your ORM library: the API surface is big and often called directly from the business logic code.
It is said that ORMs are a good fit for very simple CRUD applications. Many programmers learned about ORMs because they "came with the framework": Hibernate in Java EE frameworks like Spring, Entity Framework in (ASP.NET), ActiveRecord (Ruby on Rails), Django ORM in (Django) and Doctrine (Symfony). When the application is small and gets barely any traffic, the ORM does not hurt yet. But as the application grows the ORM becomes a liability.
It seems to me we have optimized these web frameworks for super clean code "Getting Started" guides. Maybe that's the only thing ORMs are good at: super clean code "Getting Started" guides.
Top comments (2)
For simple CRUD, why not? Use an ORM. 🧠 But it gets tricky with complex queries, so I’d argue for having a solid grasp of writing SQL queries. 🐱🏍
No application only does simple CRUD, so even when using an ORM for simple CRUD you need to know SQL for those queries that are not "simple CRUD". Now you must know SQL and an ORM.
You have produced a great series on Hibernate: a testimony to it not being simple technology. An ORM introduces a lot of new concepts, a lot of things you have to be mindful of.
I argue: it is never worth it to use an ORM. Even a simple "CRUD app on top of an SQL db" becomes more complex by introducing an ORM.
It's bad tech, and only became popular by virtue of web frameworks bundling ORMs so they could show super simple OO-looking code for CRUD examples. When actually using those frameworks you quickly needed to drop down to SQL, and had to learn all the ORMs concepts -- thereby totally negating the advantages of the ORM.