After a very long time working with Node.js, and having tried several databases, modules and Object Relational Mappers (ORMs), I think sharing my views may save many of you from jumping through, or stumbling over, a few hoops. It may also prevent the kind of misinformed decision that entrenches you once your application grows too big.
First off, there is no "one-size-fits-all" when it comes to databases and modules/ORMs. You have many choices out there: MongoDB, MySQL, PostgreSQL ("Postgres" or "PG") and so on. In my opinion, most of what you will find and read on the subject is heavily polarised, so take it all with a pinch of salt.
In my five-part series on building Instagram, I reiterated this point (very abruptly) and opined that PG is more suitable as the main database. That was said for the sake of brevity. In truth, data in a social app like Instagram can scale beyond what PG handles best. The question is not which database is better, but which one is suited to what.
In this article, I will cover only MongoDB (Part 1) and PostgreSQL (Part 2). In my view, there is pretty much no reason to use MySQL when you can use PostgreSQL, which I find more powerful and faster.
To understand the hype (or mis-hype) about MongoDB, I need to cover some history. This is not a post to flame or support MongoDB; I just want to list its pros and cons.
- Many Node modules out there ship with it readily pre-configured.
- Setting up is extremely simple.
- It is built to scale, and supports many scaling features out-of-the-box. Believe me, the simplicity of the setup here is amazing compared to other databases.
- Very readable syntax, again, right out-of-the-box.
But the problem is that many articles and modules would have you believe that it is the right choice for everything. In truth, its use cases are quite limited.
In its early days, everyone was rushing to get on the bandwagon and wanted to look very modern. Almost every new piece of Node technology or boilerplate shipped with MongoDB.
This can be misleading. Many of these new toys, Reaction Commerce among them, ditched a lot of fundamental database features.
For a significant part of Reaction Commerce's history, it could have been running into serious data consistency issues, because MongoDB lacked transactions (multi-document transactions only arrived in MongoDB 4.0 in June 2018, and were extended to sharded clusters in v4.2, August 2019). Transactions are an important feature available in almost all relational databases. A transaction ensures that updates across several tables are either completed in full, or rolled back entirely if some of them fail, perhaps due to a validation failure, data corruption or a temporary hardware failure. This is fundamental in many applications, and probably more so in ecommerce.
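To make the "completed in full, or rolled back" idea concrete, here is a minimal in-memory sketch. It is not a real database API: the `Tables` shape, `runInTransaction` and `placeOrder` are names I made up for illustration, and the "rollback" is simply throwing away a working copy.

```typescript
// In-memory sketch of all-or-nothing semantics; not a real database API.
type Tables = {
  stock: Map<string, number>; // product id -> units available
  orders: string[];           // product ids that have been ordered
};

// Run `work` against a copy of the data; commit only if it does not throw.
function runInTransaction(db: Tables, work: (draft: Tables) => void): boolean {
  const draft: Tables = {
    stock: new Map(db.stock),
    orders: [...db.orders],
  };
  try {
    work(draft);
  } catch {
    return false; // roll back: the draft is discarded, `db` is untouched
  }
  db.stock = draft.stock; // commit: both changes become visible together
  db.orders = draft.orders;
  return true;
}

// Hypothetical order flow that touches two "tables".
function placeOrder(db: Tables, productId: string): boolean {
  return runInTransaction(db, (draft) => {
    draft.orders.push(productId); // first update
    const left = draft.stock.get(productId) ?? 0;
    if (left < 1) throw new Error("out of stock"); // validation failure
    draft.stock.set(productId, left - 1); // second update
  });
}

const db: Tables = { stock: new Map<string, number>([["shoe", 1]]), orders: [] };
const first = placeOrder(db, "shoe");  // stock is available, so this commits
const second = placeOrder(db, "shoe"); // validation fails, so nothing is kept
```

Without the wrapper, the second call would have left an order recorded for a product that is no longer in stock. That half-finished state is exactly what a transaction is there to prevent.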
A 2017 article by Brent Hoover (who, according to the article, "manages Reaction's community/client team") said that a NoSQL database is perfect, but was vague about how ecommerce data is best modelled by NoSQL. Separately on Gitter, he responded to a query on how Reaction overcomes the lack of transactions, saying that, "In a document database, this sort of referential integrity is not required because the entire object is stored in one record."
How can everything you ever need be stored in a single document? Just how big would that single document be, and how much data duplication would you need to achieve this? There must be some level of normalisation (breaking into smaller documents) at some point.
I don't know how Reaction Commerce deals with such issues, but I can imagine that, in extending it with new features, you have to either give up some data consistency or write your own transaction-equivalent logic.
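A small sketch of the trade-off, using made-up shapes (`Product`, `EmbeddedCart`, `ReferencedCart`) that are purely illustrative and not Reaction Commerce's actual schema. It counts how many writes a single price change needs under each model.

```typescript
// Hypothetical shapes for illustration only.
type Product = { id: string; name: string; price: number };

// Embedded: each cart carries its own full copy of the product.
type EmbeddedCart = { cartId: string; product: Product };

// Referenced (normalised): the cart stores only the product id.
type ReferencedCart = { cartId: string; productId: string };

const shoe: Product = { id: "p1", name: "Shoe", price: 50 };

const embeddedCarts: EmbeddedCart[] = [
  { cartId: "c1", product: { ...shoe } },
  { cartId: "c2", product: { ...shoe } },
];

const catalog = new Map<string, Product>([[shoe.id, { ...shoe }]]);
const referencedCarts: ReferencedCart[] = [
  { cartId: "c1", productId: "p1" },
  { cartId: "c2", productId: "p1" },
];

// Changing the price when it is embedded: every copy must be rewritten.
function repriceEmbedded(carts: EmbeddedCart[], id: string, price: number): number {
  let writes = 0;
  for (const cart of carts) {
    if (cart.product.id === id) {
      cart.product.price = price;
      writes += 1;
    }
  }
  return writes;
}

// Changing it when referenced: one write, but carts now depend on a second collection.
function repriceReferenced(products: Map<string, Product>, id: string, price: number): number {
  const product = products.get(id);
  if (!product) return 0;
  product.price = price;
  return 1;
}

const embeddedWrites = repriceEmbedded(embeddedCarts, "p1", 45);
const referencedWrites = repriceReferenced(catalog, "p1", 45);
```

With two carts, the embedded model needs two writes and the referenced one needs a single write. If the embedded update stops halfway, the copies disagree with each other. If you reference instead, related data lives in more than one document, which is where transactions start to matter again.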
Don't get me wrong. Reaction Commerce is a great framework. But my point is that it was a fast and sexy car that shipped with no airbags. And they didn't tell you it came without. If you asked, they'd say it's not required -- that's simply untrue; a compromise was made silently. So please don't be lured in by fast cars with no airbags.
However, if anyone knows how this is dealt with in the context of Reaction Commerce now, I would love to hear your thoughts in the comments section below.
The way I see databases, the question is not whether to use this one or that one. Rather, in a large application, you are best off using a few databases together to optimise performance.
If your application is small, and you want simplicity since a lot of Node modules ship with MongoDB, you can just go ahead with it.
However, in my experience, almost every app I have built has very quickly scaled beyond the kind of "small" I initially imagined.
Where MongoDB really shines is with large volumes of data that may not have a fixed structure. You have to dissect your application data and see which parts fit that description. While many relational databases such as PG also support storing schemaless JSON data, querying it can be very clumsy, even with the use of modules or ORMs.
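As a rough picture of what "no fixed structure" means, here is a sketch with hypothetical activity events. The field names and the `getPath` helper are my own assumptions for illustration. The helper only loosely imitates the dot-path style of lookup that document databases offer; it is not MongoDB's query engine.

```typescript
// Hypothetical activity events; every record shares only `userId` and `type`.
type ActivityEvent = { userId: string; type: string; [field: string]: unknown };

const events: ActivityEvent[] = [
  { userId: "u1", type: "login", device: { os: "iOS", app: "2.3.1" } },
  { userId: "u1", type: "view", postId: "p9", secondsOnPost: 12 },
  { userId: "u2", type: "login", device: { os: "Android" }, referrer: "email" },
];

// Read a nested field by a dot-separated path; undefined if any step is missing.
function getPath(doc: unknown, path: string): unknown {
  let current: unknown = doc;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Records without a `device` field are simply skipped; no schema change is needed.
const iosLogins = events.filter((e) => getPath(e, "device.os") === "iOS");
const iosUserIds = iosLogins.map((e) => e.userId);
```

Each event carries whatever fields make sense for it, and new kinds of events can be added later without migrating the old ones. Data like this is awkward to force into fixed columns.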
So, I'll conclude this point about dissecting data by referring back to the example of building Instagram (the tutorial I mentioned earlier) and how I would do it:
- User's posts
- Interactions (likes, comments, shares) between Users and Posts.
- User activities (logins, time on site, behavioural patterns)
- Post performance and statistics (how many click-throughs, trending data)