DEV Community

MongoDB Atlas & Azure - a forced marriage?

Deyan Petrov on August 09, 2021

TLDR; MongoDB Atlas on Azure (smaller instances with smaller storage) works but comes with a number of pitfalls you should be aware of. You'd save ...
 
Sergiy Kyrylkov • Edited

Does your app require MongoDB (or NoSQL in general), or could you use PostgreSQL instead?

Deyan Petrov

Could probably make it work with PostgreSQL as well, however MongoDB does have some goodies like:

  • No need for an ORM or manual mapping of 1 domain object to N rows in N tables (in the case of a normalized relational model) ... where you then have to wrap everything in db transactions on write, and use join queries to put the object back together on read ...
  • MongoDB performs very well on very low hardware specs - e.g. with 2 burstable vCores and 4 GB RAM you are good to go in production
  • MongoDB clusters are highly available by default, with 3 nodes (primary-secondary-secondary), automatic re-election, etc.
 
Sergiy Kyrylkov • Edited
  • Could you provide a specific example?
  • In my research PostgreSQL has better performance than MongoDB for most use cases. That being said, we had serious performance issues renaming one document field, even on M40 clusters.
  • We never had issues with even a single node running on mLab, before it was shut down by MongoDB. So it looks like high availability is oversold for most use cases.
 
Deyan Petrov
  • Sure, imagine you have a domain object/aggregate root/whatever called AccountingTransaction, which has some properties like Id, State, CreatedOn, and a list of AccountingEntries, where each Entry has Type (Debit/Credit), AccountId (reference), Amount, Currency. In a standard relational database I would by default need to map this entity to more than 1 table - probably I would create AccountingTransactions and AccountingEntries tables with a 1:N relationship between them, then insert 1 row in the first table and maybe 2-3 rows in the second for a single domain object. Moreover, the inserts must happen in a db transaction. In contrast, in a NoSQL db like MongoDB I just insert a single document, which has a nested "entries" array inside, without a db transaction ...

  • Really, PostgreSQL is faster than MongoDB on the same basic hardware (e.g. M20, which is 2 Standard_B2s cores + 4 GB RAM)? That is really surprising to me, I thought PostgreSQL needs more to run normally ... Is there something on the Internet which can give me more info on that (testing it myself is quite time-consuming) ...

  • Can't say the same about MongoDB Atlas on Azure. As mentioned in the post, we had quite a few random node failovers, and anyway, every scheduled cluster maintenance requires a failover too ... I would not pick a db which does not have a well-tested failover capability, as in the cloud anything can happen at any time ...

What about pricing PostgreSQL vs MongoDB Atlas on Azure? Not that the latter is cheap, but I expect the former to be more expensive ..
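
To make the AccountingTransaction point from the first bullet concrete, here is a minimal sketch of the two shapes. It uses plain Python with no database driver. The class and field names follow the description above, while the sample values and the helper names `to_document` and `to_rows` are made up for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class AccountingEntry:
    type: str        # "Debit" or "Credit"
    account_id: str  # reference to an account
    amount: str      # kept as a string so the decimal value stays exact
    currency: str


@dataclass
class AccountingTransaction:
    id: str
    state: str
    created_on: str
    entries: list


def to_document(trx):
    """Document shape: one nested dict, written with a single insert."""
    return asdict(trx)


def to_rows(trx):
    """Normalized shape: 1 parent row plus N child rows.

    In a relational database these rows would be written together
    inside one db transaction and joined back on read.
    """
    parent = {"id": trx.id, "state": trx.state, "created_on": trx.created_on}
    children = [{"transaction_id": trx.id, **asdict(e)} for e in trx.entries]
    return parent, children


# Hypothetical sample data, for illustration only.
trx = AccountingTransaction(
    id="trx-1",
    state="Posted",
    created_on="2021-08-09T10:00:00Z",
    entries=[
        AccountingEntry("Debit", "acc-100", "25.00", "EUR"),
        AccountingEntry("Credit", "acc-200", "25.00", "EUR"),
    ],
)

document = to_document(trx)
parent_row, entry_rows = to_rows(trx)
print(document)
print(parent_row, entry_rows)
```

The document variant is one write; the normalized variant is 1 + N writes that have to succeed or fail together.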

Sergiy Kyrylkov • Edited

Regarding AccountingTransaction, you can use PostgreSQL to store and query JSON, and from what I read it's also faster than MongoDB at that: postgresqltutorial.com/postgresql-...

And especially for accounting data updates and queries, breaking AccountingTransaction into relational tables will provide huge benefits down the road. But even if you don't want to do that, you get the best of both worlds with PostgreSQL.

That being said, accounting and banking are probably the golden use cases for relational databases.

Deyan Petrov

Not really, I used to think the same, but when it comes to concepts like Aggregate Root (and even Event Sourcing) storing the whole aggregate root (or event) as 1 single document (with 1 single db insert, no db trx) in the Write Model (CQRS) is unbeatable, IMHO.

Deyan Petrov

@kyrylkov , you seem to know PostgreSQL. I just took a look at the Azure hosting for PostgreSQL and stumbled upon some big limitations - e.g. are the connections really that limited (a 2 vCore instance on Azure allows only 100-145 user connections max??), so that I would need something like pgBouncer in between? Asking because one of my M20 instances currently has 625 open connections out of 3000 possible, and it does not even sweat ...
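
For readers unfamiliar with what a pooler in between does: the general idea is that many clients share a small, fixed number of server connections. Below is a toy, in-process sketch of that idea only. It is not how pgBouncer is implemented or configured; the class `TinyPool` and the function `simulate` are hypothetical, and the numbers 625 and 100 are simply taken from the comment above.

```python
import queue
import threading


class TinyPool:
    """Toy pool: hands out a fixed set of connection slots (stand-ins, not real connections)."""

    def __init__(self, size):
        self._slots = queue.Queue()
        for slot_id in range(size):
            self._slots.put(slot_id)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0

    def run(self, work):
        slot = self._slots.get()  # blocks until some slot is free
        with self._lock:
            self.in_use += 1
            self.peak_in_use = max(self.peak_in_use, self.in_use)
        try:
            return work(slot)
        finally:
            with self._lock:
                self.in_use -= 1
            self._slots.put(slot)  # hand the slot back to the pool


def simulate(clients, pool_size):
    """Run `clients` threads that each borrow a slot once; return (completed, peak slots in use)."""
    pool = TinyPool(pool_size)
    done = []
    done_lock = threading.Lock()

    def client(n):
        pool.run(lambda slot: None)  # pretend to run one short query
        with done_lock:
            done.append(n)

    threads = [threading.Thread(target=client, args=(n,)) for n in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done), pool.peak_in_use


completed, peak = simulate(clients=625, pool_size=100)
print(completed, peak)
```

All 625 clients get served, but the number of slots in use at once never exceeds the pool size; clients that find the pool empty simply wait.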

Deyan Petrov • Edited

And one more: does PostgreSQL have something like MongoDB Change Streams (which we use heavily)?

Sergiy Kyrylkov • Edited

Regarding hosting options, look at the difference in pricing between PostgreSQL and MongoDB: scalegrid.io/pricing.html

  1. MongoDB is more expensive. This is due to MongoDB licensing, which puts any 3rd party provider at a disadvantage compared with MongoDB's own Atlas.
  2. You will end up with a much larger data size in MongoDB than in PostgreSQL. So in the end you may pay several times more for MongoDB and still get slightly worse performance, even for JSON storage.
  3. With MongoDB Atlas you're locked in to the more expensive cloud providers (AWS, GCP, Azure). If you look at ScaleGrid, you'll see that DigitalOcean and Linode are significantly more cost effective, but they are not available for Atlas.

So the question you have to answer is whether actual or perceived benefits you get with MongoDB are really worth it.

Deyan Petrov • Edited

Don't believe the ScaleGrid website. Not only is the pricing off by 2x - e.g. I am paying around $250 for an M20 cluster on Azure (2 cores, 4 GB RAM) with 128 GB storage (500 IOPS provisioned, burstable up to 3500) vs. $486 on the ScaleGrid website - but the IOPS are also incorrect ... And their competitor comparison page is completely outdated (from 2018) ...

Sergiy Kyrylkov

They provide their services on top of multiple cloud providers, so of course it's not what the cloud providers offer directly. But if you check the cloud providers' prices directly, they are correlated. For instance, check what DigitalOcean charges for hosted MongoDB vs hosted PostgreSQL.

Sergiy Kyrylkov • Edited

ScaleGrid provides their services on top of multiple cloud providers so you can easily migrate between them. Naturally ScaleGrid pricing is higher than what the cloud providers offer directly.

Check what DigitalOcean charges for hosted MongoDB vs hosted PostgreSQL. MongoDB is still more expensive, and MongoDB data takes more space than PostgreSQL data. There is no way around the restrictive MongoDB licensing that stifles competition. Before MongoDB changed their license there was mLab, and many people used it. After the licensing change, mLab's business went under and MongoDB acqui-hired it.

Deyan Petrov • Edited

Sorry, I was comparing my MongoDB Atlas charges vs what ScaleGrid would charge for PostgreSQL on Azure.

If I compare PostgreSQL vs MongoDB via ScaleGrid on DigitalOcean (a 3 node cluster in both cases, 2 cores, 4 GB RAM) I see $140 vs $160, which does not sound like a big difference to me (also not a big difference compared with the $250 on MongoDB Atlas/Azure with more storage).
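
A quick sketch that puts the prices quoted in this thread side by side. The amounts are exactly the ones mentioned above (the thread does not state the billing period, and the setups differ in storage and provider); the constant and function names are illustrative only.

```python
# Figures quoted in the thread, in USD. Billing period is not stated there.
ATLAS_M20_AZURE = 250           # MongoDB Atlas M20 on Azure, 128 GB storage
SCALEGRID_POSTGRES_AZURE = 486  # ScaleGrid PostgreSQL on Azure (the earlier comparison)
SCALEGRID_POSTGRES_DO = 140     # 3 node PostgreSQL via ScaleGrid on DigitalOcean
SCALEGRID_MONGO_DO = 160        # 3 node MongoDB via ScaleGrid on DigitalOcean


def pct_more(a, b):
    """Return how much more `a` costs than `b`, in percent."""
    return (a - b) / b * 100


mongo_vs_pg_on_do = pct_more(SCALEGRID_MONGO_DO, SCALEGRID_POSTGRES_DO)
atlas_vs_mongo_on_do = pct_more(ATLAS_M20_AZURE, SCALEGRID_MONGO_DO)
scalegrid_azure_vs_atlas = SCALEGRID_POSTGRES_AZURE / ATLAS_M20_AZURE

print(f"MongoDB vs PostgreSQL on DigitalOcean: {mongo_vs_pg_on_do:.1f}% more")
print(f"Atlas on Azure vs MongoDB on DigitalOcean: {atlas_vs_mongo_on_do:.1f}% more")
print(f"ScaleGrid PostgreSQL on Azure vs Atlas on Azure: {scalegrid_azure_vs_atlas:.2f}x")
```

So the MongoDB premium on DigitalOcean is roughly 14%, while the earlier "2x" gap came from comparing different products on Azure.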

What I am not sure about with PostgreSQL, though, are the connection limits I asked you about above, and how well it handles failovers from one node to another in case of a VM failure, or simply maintenance, etc.

Deyan Petrov • Edited

Of course, I am not aware of DigitalOcean's VM instance sizes, and whether their 2 core / 4 GB is fully dedicated/100% provisioned or burstable like in the case of Azure ... If it is fully dedicated/not burstable, then that would be a good cost advantage in favor of DigitalOcean. It is, however, not a hosting provider I am looking at anyway, due to the many other PaaS services we are after ...

Sergiy Kyrylkov • Edited

Azure seems to suffer from huge issues with disk performance (post from Nov 10, 2020): bunnyshell.com/blog/aws-google-clo...

DigitalOcean trumps all of the big 3, but Azure stands out in a bad way.

Atlas IOPS stats for Azure confirm this.