DEV Community

Discussion on: What are the most suitable datastores for storing a huge number of articles and news?

FaN000s

How massive? There is no specific number, but we may be asked to collect everything we can from major news sites such as BBC and CNN, plus some other blogs and news sites.

Why not PostgreSQL (or relational DB)? Actually, there is no particular reason, and I am currently looking at CitusData as an option. Another option is Postgres-XL.

The concerns with a relational DB are the size of the data, how easy it is to scale and add new nodes, and high availability, all of which NoSQL databases provide by default. That is why we give NoSQL DBs a higher priority.

rhymes

How massive? There is no specific number, but we may be asked to collect everything we can from major news sites such as BBC and CNN, plus some other blogs and news sites.

I would only consider an alternative to PostgreSQL once you are in the hundreds of millions, and even then it depends on what you do with the data :D

Why not PostgreSQL (or relational DB)? Actually, there is no particular reason, and I am currently looking at CitusData as an option.

I've heard about it from some colleagues. Check what limitations it has, because it's not exactly like PostgreSQL. I just checked the website: they have been acquired by Microsoft, eheh

The concerns with a relational DB are the size of the data, how easy it is to scale and add new nodes, and high availability, all of which NoSQL databases provide by default. That is why we give NoSQL DBs a higher priority.

Gotcha. Obviously, keep the tradeoffs in mind.

In any case, because of the size requirements, I would separate search from the "single source of truth" DB.
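
To make the split concrete, here is a minimal sketch of what I mean, not a recommendation for specific products. Everything in it is an assumption for illustration: `ArticleStore`, `SearchIndex` and `ingest` are hypothetical names, SQLite stands in for whatever primary database you pick, and a toy in-memory inverted index stands in for a real search engine. The point is only that the full record lives in one place and the search side holds derived data keyed by article id.

```python
import sqlite3
from collections import defaultdict


class ArticleStore:
    """Source of truth: holds the full article records (SQLite as a stand-in)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE articles ("
            "id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT, body TEXT)"
        )

    def save(self, url, title, body):
        cur = self.db.execute(
            "INSERT INTO articles (url, title, body) VALUES (?, ?, ?)",
            (url, title, body),
        )
        self.db.commit()
        return cur.lastrowid

    def get(self, article_id):
        return self.db.execute(
            "SELECT id, url, title, body FROM articles WHERE id = ?",
            (article_id,),
        ).fetchone()


class SearchIndex:
    """Derived data: maps tokens to article ids (toy stand-in for a search engine)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, article_id, text):
        for token in text.lower().split():
            self.postings[token].add(article_id)

    def search(self, query):
        # Return ids of articles containing every token in the query.
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


def ingest(store, index, url, title, body):
    """Write to the source of truth first, then update the search index."""
    article_id = store.save(url, title, body)
    index.index(article_id, f"{title} {body}")
    return article_id


store, index = ArticleStore(), SearchIndex()
first = ingest(store, index, "https://example.com/a",
               "Postgres scaling", "sharding and replication")
for article_id in sorted(index.search("postgres sharding")):
    print(store.get(article_id)[2])  # look the full record up in the primary store
```

Because the index only holds ids and tokens, it can be rebuilt from the primary store at any time, and the two sides can be sized and scaled independently.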