DEV Community

Discussion on: Things I Learned Building an Analytics Engine

Collapse
 
dougblackjr profile image
Doug Black

Thank you! I went with Percona off the bat for this one...it's just soooooo fast.

Tell me more about the key-value solutions! This may be just what I'm looking for!

Collapse
 
bgadrian profile image
Adrian B.G.

At an abstract level:

Getting rid of the relationships, and using simple documents, you can shard better, with specific Storages like Cassandra.

Sharding an SQL, most of the times, it requires to get rid of the relationships and Joins. Even if it does not, it will add an overhead because it will query and group data from different shards, in a cascading effect.

If the "sharding" algorithm has to take into consideration data relationships, and wants to keep data as local as possible, then you will have "Hot" spots and unbalanced shards.

I don't say it is impossible to scale SQL, I say that it will be harder and more expensive, if you can afford Spanner from Google or a big setup of Vintess, or 5-8 big servers behind a Galera go ahead!

Bottom line, if you want to go beyond a few TBs of data, I would suggest rethink your structure in a Columnar way, and less SQLish.