How massive? Why not a relational DB like PostgreSQL? I'm not saying it's the right choice, but what was the process of elimination?
You can store huge quantities of data, you can run data analysis (on the live database or, even better, on a read replica), and full-text search is supported; if that turns out to be too limited, you can still use Elasticsearch just for search.
It's definitely easier to set up and handle than Cassandra...
How massive? There is no specific number, but from what I can tell we may be asked to collect everything we can from major news sites like BBC, CNN, etc., plus some other blogs and news sites.
Why not PostgreSQL (or a relational DB)? Actually, there is no reason, and I am currently looking at CitusData as an option. Another option is PostgreSQL-XL.
The concerns with relational databases are the size of the data, how easy it is to scale and add new nodes, and high availability, which NoSQL databases provide by default. That is why we give NoSQL DBs a higher priority.
How massive? There is no specific number, but from what I can tell we may be asked to collect everything we can from major news sites like BBC, CNN, etc., plus some other blogs and news sites.
I would only consider an alternative to PostgreSQL once you are in the hundreds of millions, and even then it depends on what you do with the data :D
Why not PostgreSQL (or relational DB)? Actually, there is no reason and I am currently looking at CitusData as an option.
I've heard about it from some colleagues. Check what limitations it has, because it's not exactly like PostgreSQL. Just checked the website: they have been acquired by Microsoft, eheh
The concerns with relational databases are the size of the data, how easy it is to scale and add new nodes, and high availability, which NoSQL databases provide by default. That is why we give NoSQL DBs a higher priority.
Gotcha. Obviously, keep the tradeoffs in mind.
In any case, because of the size requirements, I would separate search from the "single source of truth" DB.
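To make the "separate the search from the source of truth" idea a bit more concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: `ArticleStore` stands in for the primary database (PostgreSQL, say), `SearchIndex` stands in for a separate search service (Elasticsearch, say), and `rebuild_index` is a made-up helper. It only shows the shape of the design and is not how either product works internally.

```python
from collections import defaultdict


class ArticleStore:
    """Hypothetical stand-in for the source-of-truth database."""

    def __init__(self):
        self._rows = {}

    def save(self, article_id, title, body):
        self._rows[article_id] = {"title": title, "body": body}

    def get(self, article_id):
        return self._rows[article_id]

    def all_items(self):
        return self._rows.items()


class SearchIndex:
    """Hypothetical stand-in for a separate search service."""

    def __init__(self):
        # Maps each lowercase token to the ids of articles containing it.
        self._postings = defaultdict(set)

    def index(self, article_id, text):
        for token in text.lower().split():
            self._postings[token].add(article_id)

    def search(self, query):
        """Return the ids of articles that contain every word in the query."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


def rebuild_index(store):
    """The index holds no unique data, so it can be rebuilt from the store."""
    index = SearchIndex()
    for article_id, row in store.all_items():
        index.index(article_id, row["title"] + " " + row["body"])
    return index


store = ArticleStore()
store.save(1, "Election results", "Votes were counted overnight")
store.save(2, "Weather update", "Storm results in flooding")

index = rebuild_index(store)
hits = index.search("results")  # ids only; full rows come from the store
titles = sorted(store.get(i)["title"] for i in hits)
print(titles)
```

The point of the split is that the search side only ever returns ids, and the full records are always read from the primary store. If the index gets lost or outgrows its node, you can drop it and rebuild it without touching the data that matters.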