DEV Community

loading...

Discussion on: PostgreSQL vs MongoDB

Collapse
chiefnoah profile image
Noah Pederson • Edited

I've worked with both MongoDB and PostgreSQL quite extensively (it's been awhile since I've used MongoDB seriously though). There are almost no cases where you should use a non-relational datastore over a relational store. You will regret it later if you chose the former. It's a little known fact that one of the most important parts of designing a system is fleshing out your data model. That doesn't mean knowing every detail, but thinking through your data enough to define a well-structured model. Software 'joins' are also significantly worse than JOIN, both in performance and brittleness. I'd really like to hear if anyone can legitimately think up a case where a non-relational datastore is better in some way than a relational, because I've thought about it on multiple occasions and have never come up with anything.

Collapse
greenroommate profile image
Haris Secic

Agree that data structure is important. However dissagree with almost no case. My data structure is represnted as relationship but actual relationship in Neo4j where it works much faster for searching and doing actual related stuff. SQLs are called relational but are not so much that. Most bottlenecks I faced were due to having many to many combined with other joins as "relational" dbs didn't offer arrays or embedded objects and so. Now that might be just 10% of project structure but unavoidable 10% being used all the time. Rest of the structure made way more sense in relational ones but at the cost of having loading icons in many places once user reaches most usefull features of the project like "I click here and it updates 15 different data types"

Collapse
chiefnoah profile image
Noah Pederson

I would consider graph databases their own thing, separate from traditional document (non-relational) and relational stores. But only very specific types of data fit properly into a graph database (GIS data, for example). If you're choosing between a document store and a relational store, there's a clear winner, which is the point I'm trying to make.

Also you are incorrect in saying that relational DBs don't support embedded objects, PostgreSQL does support embedded objects, either in the form of K/V, JSON (natively supported), and BLOBs. At least in the case of JSON, you can index on the objects too.

Thread Thread
greenroommate profile image
Haris Secic

Not read properly. They didn't, they do now and it makes no sense to store data in it which might varry in structure just becuse they did some support for it. Why use C for web API? Or even Rust. I don't care if I can it's much easier with others. Same goes for DB. Each DB has some purpose and forcing one type doesn't cut it for me. There's no clear winner. Just improper usage. You wanna store site clicks, and search requests body? Why the hell would I bother with structure, I just need it to push data so someone plugs a tool in it and do they're casual analitics which they probably delete on monthly basis or so.

Collapse
achix profile image
Achilleas Mantzios • Edited

Hi Haris, PostgreSQL had arrays like forever, some 20 yrs ago I built a genealogical-based indexing (tree-like), based on arrays and int_array contrib (which also was there since 2000's) - and besides ltree being in the core distro already.
In my first emails to the pgsql-sql list , I was almost advised to stay away from arrays since they were designed for internal relations and structures, but I didn't listened to them.. still happily running the app based on GIN indexes .

Thread Thread
greenroommate profile image
Haris Secic

That's the difference between you and me. I only use whats actually considered standard and what's considered ok or good to use. I honestly don't think people get how easy it is to setup MongoDB with all that working and no customisation required. When I picked MongoDB for Geo stuff first I found benchmarks old a year which showed faster MongoDB geo search on non indexed data in comparison to PostgreSQL and it was least old benchmark. Others were couple of years old. Second, no extra config was required to use geo tools while PostgreSQL demanded some plugin enabling or whatever. Couldn't care less if it's doable out of the box because I automated mongo in 5 minutes and it gets the job done. Works with +20k polygons quite fast. Storing sensor values which could also contain objects or array is straightforward. Why would I bother with schema? I have no benefit of it nor do I care if this DB has also support for this or that. MS SQL and Oracle if I'm not mistaking have graph like support. Probably so does PostgreSQL or will have. Why would I use it instead of Neo4j? Because some enterprise dudes told me so? I store measurements in there which have different fields/properties. Then I can aggregate it regardless of different structure. Why would I use Postgres' JSON? I mean it's by default json-like why others have need to enforce using relational stuff?

I'm not fully against using some stuff out of standard but SQL-s great for some stuff and I would like to use others that are great for those other things. Basically MongoDB has been serving me quite well and it's well hated just like Java. And frankly people still use it

Thread Thread
achix profile image
Achilleas Mantzios • Edited

Ok. arrays might not be advertised by some emails I got in 2001, but they were standard, documented and fully supported already by the same year 2001, so it was pretty legit to use, maybe the guys who answered were relational purists, I cannot know, which year are you referring to in your comment about "relational" dbs not having arrays?
What's actually considered "standard" for most people is running Oracle or MS SQL Server. You want a second definition of "standard"? It is standard to spend SUBSTANTIAL time resources in advance for design and planning , every time my programmers talk to me about "5 minutes" I get terrified about what happens in 1 year or so. Dont get me wrong, bad code and bad practices are also super possible with (postgre|my|ms|DB2)sql and so forth, heck even hibernate terrifies me as I see those awfull queries in the pgbadger reports. But at least by a good design (and LOTS OF DB CONSTRAINTS / triggers/automations/integrity checks) you are doing your best about the durability, reliability and credibility of your system and happiness of your bosses. I am not familiar neither with mongodb geospatial nor with PostGIS, but I'd say here that nothing is free in life, if mongo makes it appear so appealing against a system which has been so heavily looked after and developed then I'd place my bets that there would be surprises sometime in the road ahead.
PS
Does mongo give you some way e.g. to disable ovelapping polygons? Like it is forbidden that two different non-affiliated owners have pieces of land that overlap one another?

Thread Thread
greenroommate profile image
Haris Secic

I don't know nor did require such feature. I only seek where do given coordinates fit in. But good thing to investigate thanks fot the question.

5 minutes for dev purpose in the cloud. Developers were not born yesterday neither was I. No need to point the obvious that thing can go wrong and nothing is free. It's as simple as it works now and there's no proof of it falling later. I know some people would love it to fail to prove their point but I'm not like that, I'm not toxic person from stack overflow. I actually use stuff if it works and don't care about the negativity because in the end of the day it all breaks and people make it work with a lot of weird things so only way to proove something is bad is when no one made it work.

Collapse
elmuerte profile image
Michiel Hendriks

Not to forget that PostgreSQL supports JSON and XML as a datatype. So where suitable you can just store JSON or XML in a table.
You can even create indexes on directly on the JSON data. (Or you can use index functions like you can do with XML data.)

Collapse
sirseanofloxley profile image
Sean Allin Newell

Splunk logs - logs and events are inherently non relational and perform much faster.

This does not generally apply to software, and in general, I would agree that non relational is almost never a good long term option for an app with any reasonable domain it is trying to model.

Collapse
rhymes profile image
rhymes • Edited

I would argue that a time series DB like InfluxDB is a better storage for timestamped logs and events than a document DB is.

Collapse
elmuerte profile image
Michiel Hendriks • Edited

A big part of logs are relational. It has a timestamp, host, application, log level, a message, and commonly a "category". The other data is less well structured between the various log event producers, lets call this "meta".
This meta data you could store in a less explicitly structures relation. It's usually a simple key->value structure anyway. For that you could use PostgreSQL's hstore.

But... I do not know if I would use PostgreSQL to store log events. Although PostgreSQL has native sharding these days. Setting up a distributed PostgreSQL farm is way more complicated. Log events are a good candidate for eventually consistent. Or even, never consistent is also acceptable. It is mostly appending entries, pruning old records, and occasionally performing a query. ACID is also no strong requirement.

Thread Thread
sirseanofloxley profile image
Sean Allin Newell • Edited

Nah, none of those fields are relational imo. Relational fields are like UserId, EventId, ParentId, etc. Pointers to other things.

If you want to build a relational logging platform be my guest, but I suspect there's a good reason why most of the big players in logging use NoSql or InfluxDB.

Collapse
chiefnoah profile image
Noah Pederson

Logs are indeed a case where I think this applies. MongoDB is absolutely the wrong tool to use for that though, there are much better data stores out there that properly handle timeseries data well.

Collapse
marcosvafilho profile image
Marcos Filho

You will regret it later if you chose the former.

Quite literally my very short experience with NoSQL. It was awesome at first, honestly, but quickly turned out to be a huge mess for the 2 (two) projects I got involved.

I'm sure I could have been involved in better NoSQL situations but so far I just didn't have that luck.

Collapse
cjbrooks12 profile image
Casey Brooks • Edited

100% this.

So many devs (including myself) got burned when MongoDB came out because it was touted as a silver-bullet data store, when in reality it was created for an entirely different problem than what most projects needed. By the time devs realized MongoDB was causing pain like major performance bottlenecks and difficulties accessing the data they needed, they were already in too deep to fix the issue.

Real-world data is relational, pretty much always. While it's pretty common for single records to have hierarchical data (which is what makes non-relational DBs sound appealing), it's very uncommon for entire datasets to be limited to a bunch of singular records of purely hierarchical data. You will end up making multiple collections and trying to reference them just like relational models, except that you'll have zero help from the database in querying them efficiently.

I'm a strong advocate that you should never start your project with a non-relational data store. It's much easier to add one later on (which will probably never happen) than it is to move from non-relational to relational.

Forem Open with the Forem app