re: MongoDB has no use case

re: You mean, from the point of view of scaling? Otherwise, I guess MongoDB too would need some tuning that only an experienced DBA can do? Largely t...

Sorry for the late reply!

or you could use a document store: leave the objects in the database as they are, and in your model class in your code you just add the birthday field with NULL as the default value.

I don't really see how missing fields are better than fields with NULL values. Ultimately, in the code, they'll both respond to logical comparisons equally nicely. 👀

If you make a script that runs through all the entries in your DB and does that in code, then you have much more control over the rate and other such details.

Again, I see no reason why we can't do this in a relational database.
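For what it's worth, the kind of backfill script I have in mind would look roughly like this against MySQL (the connector, table, and column names are just placeholders I'm assuming for the sketch):

```python
import time
import pymysql  # assumption: any MySQL client library would do here

# Hypothetical "users" table that just gained a nullable "birthday" column.
conn = pymysql.connect(host="localhost", user="app", password="secret", database="app")

BATCH_SIZE = 1000

with conn.cursor() as cur:
    while True:
        # Backfill in small batches; LIMIT keeps each transaction short so
        # replication and normal traffic aren't starved.
        affected = cur.execute(
            "UPDATE users SET birthday = %s WHERE birthday IS NULL LIMIT %s",
            ("1970-01-01", BATCH_SIZE),
        )
        conn.commit()
        if affected == 0:
            break
        time.sleep(0.5)  # this is the "control over the rate" part

conn.close()
```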

Your distributed lock example, though, hits close to home. I don't think this is something we'll ever manage in something like MySQL. Even read locks are tricky, from what I gather.

Personally I find that the DB choice matters fairly little in the end [...]

Honestly, I'm beginning to think that way, and the more I dive into discussions by senior developers, the more I see that foreign keys and other "nice" constraints are actually a stumbling block in the long run.

In the end, though, I have a nightmare scenario that keeps me from embracing document databases. Let's take the classic example of a blog: you store the posts and users' comments as an aggregate, and it makes a lot of sense. Imagine, though, that the management wants to know which comments were posted between 4 pm and 8 pm on a certain date. If it's a document store, I'll have to scan through all the aggregate fields. 😭

BUT, I'm also not sure if a SQL BETWEEN is going to be very useful as the number of rows hits millions.
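To make the worry concrete, the query I'm picturing on the SQL side is something like this (table, column names, and dates are invented for the sketch); I just don't know how well a BETWEEN over created_at holds up at that scale without careful indexing:

```python
import pymysql  # placeholder connector again

conn = pymysql.connect(host="localhost", user="app", password="secret", database="blog")

with conn.cursor() as cur:
    # Without an index on comments.created_at this is a full table scan;
    # with one, MySQL can do a range scan over just the 4 pm to 8 pm window.
    cur.execute(
        "SELECT id, post_id, body FROM comments "
        "WHERE created_at BETWEEN %s AND %s",
        ("2019-03-01 16:00:00", "2019-03-01 20:00:00"),
    )
    for row in cur.fetchall():
        print(row)

conn.close()
```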

Thoughts? :-)

I don't really see how missing fields are better than fields with NULL values. Ultimately, in the code, they'll both respond to logical comparisons equally nicely. 👀
If you make a script that runs through all the entries in your DB and does that in code, then you have much more control over the rate and other such details.

A missing value does not always equal NULL, and you might want your default to be something other than NULL while still keeping the field nullable. There are lots of such cases. Also, it's not so much about "not being able to" (I believe MOST SQL databases can nowadays do live schema changes without freezing the whole database for long), but more about habit: RDBMS users tend to just think differently.
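To illustrate the missing-vs-NULL distinction, here's a pymongo-style sketch (collection and field names are made up):

```python
from pymongo import MongoClient

# Hypothetical "users" collection: old documents never had a "birthday"
# field, newer ones may carry an explicit null.
users = MongoClient()["app"]["users"]

# Matches documents where birthday is explicitly null OR missing entirely.
null_or_missing = users.count_documents({"birthday": None})

# Matches only documents where the field is absent, i.e. "never set",
# as opposed to "deliberately set to null".
missing_only = users.count_documents({"birthday": {"$exists": False}})

# And the application-level default doesn't have to be NULL at all:
doc = users.find_one() or {}
birthday = doc.get("birthday", "unknown")
```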

Your distributed lock example, though, hits close to home. I don't think this is something we'll ever manage in something like MySQL. Even read locks are tricky, from what I gather.

MySQL can function as a nice host for that to some extent, with GET_LOCK(), but it can't scale past one server easily. Really you want something more like etcd.
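A minimal sketch of the GET_LOCK() pattern, with a placeholder connector and lock name:

```python
import pymysql  # assumption: any MySQL client library works the same way

conn = pymysql.connect(host="localhost", user="app", password="secret", database="app")

with conn.cursor() as cur:
    # Try to acquire a named lock, waiting up to 10 seconds.
    # GET_LOCK returns 1 on success, 0 on timeout, NULL on error.
    cur.execute("SELECT GET_LOCK(%s, 10)", ("nightly-report",))
    (acquired,) = cur.fetchone()

    if acquired == 1:
        try:
            pass  # ... do the work only one process should do at a time ...
        finally:
            # The lock is tied to this connection; release it explicitly
            # (it also disappears if the connection drops).
            cur.execute("SELECT RELEASE_LOCK(%s)", ("nightly-report",))
    else:
        print("someone else holds the lock")

conn.close()
```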

Let's take the classic example of a blog: you store the posts and users' comments as an aggregate, and it makes a lot of sense. Imagine, though, that the management wants to know which comments were posted between 4 pm and 8 pm on a certain date. If it's a document store, I'll have to scan through all the aggregate fields.

There is such a thing as the management being wrong, and you, as the expert, being required to tell them they're wrong.

Also, you can design your schema wrong in any database, so why is it the fault of the database that you made a mistake? And if this is a change in requirements, why is it out of the question to spend development effort on changing the data structure?

Lastly, you should pick the right tools for the job instead of trying to do everything in the one tool you happened to choose. In the case of a database, if you have read-only replicas for analytics and the like, you can pretty safely run even heavy queries there (MongoDB can probably do more complicated queries than MySQL with less effort thanks to its aggregation and map-reduce systems). But when you start doing genuinely complicated searches, you should use a system designed for complicated searches, such as Elasticsearch.
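For the comment-timestamps example above, the aggregation route would look roughly like this (assuming posts embed a comments array with a created_at date; all names and dates are made up):

```python
from datetime import datetime
from pymongo import MongoClient

posts = MongoClient()["blog"]["posts"]

pipeline = [
    # Flatten the embedded comments array into one document per comment.
    {"$unwind": "$comments"},
    # Keep only comments in the requested window.
    {"$match": {"comments.created_at": {
        "$gte": datetime(2019, 3, 1, 16, 0),
        "$lt": datetime(2019, 3, 1, 20, 0),
    }}},
    # Return just the fields the report needs.
    {"$project": {"_id": 0, "post_id": "$_id", "comment": "$comments"}},
]

for row in posts.aggregate(pipeline):
    print(row)
```

Without an index this still touches every post, which is exactly the worry above; putting an extra $match on comments.created_at before the $unwind lets a multikey index narrow down which posts get read at all, and a dedicated search system takes it further.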

There's also denormalization for lazy (and smart) people: just save the data twice in different structures so it can be fetched optimally in your different use cases.
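A sketch of that double write, sticking with the pymongo-style names from above:

```python
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient()["blog"]

def add_comment(post_id, author, body):
    comment = {
        "post_id": post_id,
        "author": author,
        "body": body,
        "created_at": datetime.now(timezone.utc),
    }
    # Write 1: embedded in the post document, optimal for rendering the post page.
    db.posts.update_one({"_id": post_id}, {"$push": {"comments": comment}})
    # Write 2: a flat comments collection, optimal for time-range reports.
    db.comments.insert_one(comment)
```

The price is keeping the two copies in sync, so it pays off mostly for data that's written once and then read in very different ways.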
