re: What are the most suitable datastores for storing a huge number of articles and news?


IF you have a good team of experienced SysAdmins/Data Engineers to maintain the clusters:

To do it right and for the long term, I would choose 3 solutions for 3 problems.
Long-term storage: something horizontally scalable with replication (Cassandra, or maybe Kafka with Streams?).
A nice alternative would be storing the documents in S3.

From this "source of truth" you can move data with NiFi or similar tools to other platforms, and those platforms can change over time. This is the trick.
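To make the trick concrete, here is a minimal in-memory sketch in Python. Everything in it is hypothetical: the `SourceOfTruth` class merely stands in for Cassandra, Kafka, or S3, and the two sinks stand in for whatever downstream platforms you pick. It only shows the idea that a platform added later can be filled by replaying the same log.

```python
from typing import Callable, Dict, List, NamedTuple


class Article(NamedTuple):
    id: str
    title: str
    body: str


class SourceOfTruth:
    """Append-only log; a stand-in for Cassandra, Kafka, or S3 (assumption)."""

    def __init__(self) -> None:
        self._log: List[Article] = []

    def append(self, article: Article) -> None:
        self._log.append(article)

    def replay(self, sink: Callable[[Article], None]) -> int:
        """Feed every stored article to a downstream sink, oldest first."""
        for article in self._log:
            sink(article)
        return len(self._log)


store = SourceOfTruth()
store.append(Article("a1", "Kafka basics", "Streams and topics"))
store.append(Article("a2", "Cassandra basics", "Rings and replicas"))

# Downstream platform 1: a title lookup (stand-in for a search index).
titles: Dict[str, str] = {}


def index_title(article: Article) -> None:
    titles[article.id] = article.title


store.replay(index_title)

# Downstream platform 2, added later: word counts (stand-in for analytics).
word_counts: Dict[str, int] = {}


def count_words(article: Article) -> None:
    word_counts[article.id] = len(article.body.split())


store.replay(count_words)
```

Because both sinks are built only from `replay`, either one can be thrown away and rebuilt, or swapped for a different tool, without touching the stored articles.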

Elasticsearch is one option for text search.

Real-time analytics/aggregation: Apache Beam/Spark/Flink.
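For a feel of what "real-time aggregation" means here, this is a toy sketch of a fixed (tumbling) window count in plain Python. It is not Beam, Spark, or Flink code; those engines do the same kind of grouping continuously, across many machines, and with handling for late data. The event shape `(timestamp_seconds, topic)` and the 60-second window are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 60  # assumed window size, chosen only for illustration


def count_per_window(events: Iterable[Tuple[int, str]]) -> Dict[int, Counter]:
    """Count topics per fixed window, keyed by the window's start time."""
    windows: Dict[int, Counter] = {}
    for ts, topic in events:
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % WINDOW_SECONDS)
        windows.setdefault(window_start, Counter())[topic] += 1
    return windows


# Hypothetical stream of "article published" events.
events = [(0, "sports"), (15, "politics"), (59, "sports"), (60, "sports"), (125, "tech")]
result = count_per_window(events)
```

In a real pipeline the events would arrive from the source of truth (for example a Kafka topic) instead of a list, and the counts would be written to a dashboard or another store.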

Once-a-month heavy-duty analytics and discovery: a dedicated database where you can load tons of data, extract the report, and shut it down (BigQuery, AWS Athena, Aurora...)

ELSE (you do not have a big team of SREs and DevOps engineers):
use managed solutions; I would suggest Google Cloud.


I think we are in a trap :) :) :)

We do not have a team of experienced SysAdmins/Data Engineers, and I do not think storing the data outside our data center would be an acceptable choice :) :) :) :).


I would suggest looking for another job, but hey, that's just my trivial opinion 😀

This situation is usually a sign of many more company-wide issues.
