If you don't benefit from a cluster for the transformation (which is definitely worth investigating first), you could write an application based on Akka Streams.
doc.akka.io/docs/akka/2.5.4/scala/...
It offers multiple APIs for building computation streams and graphs, providing many transformation operations at different levels of abstraction. If you need even more flexibility, you can fall back on actors as a last resort.
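As a minimal sketch of the linear (Source/Flow/Sink) API, assuming an Akka 2.5.x dependency ("com.typesafe.akka" %% "akka-stream") on the classpath:

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

object StreamSketch extends App {
  implicit val system: ActorSystem = ActorSystem("transform")
  implicit val mat: ActorMaterializer = ActorMaterializer()

  // A reusable transformation stage: keep even numbers and double them.
  val doubleEvens: Flow[Int, Int, _] =
    Flow[Int].filter(_ % 2 == 0).map(_ * 2)

  // Wire source -> flow -> sink and run the stream.
  Source(1 to 10)
    .via(doubleEvens)
    .runWith(Sink.foreach(println)) // prints 4, 8, 12, 16, 20
    .onComplete(_ => system.terminate())(system.dispatcher)
}
```

For non-linear topologies (fan-out, fan-in), the GraphDSL API lets you wire the same kinds of stages into arbitrary graphs.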
Many connectors are available via Alpakka, so there's a good chance that integrating with your sources and targets is quite easy.
developer.lightbend.com/docs/alpak...
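As an illustration of what such integrations look like, here is a sketch using `FileIO`, which ships with akka-stream itself; the Alpakka connectors (Kafka, S3, JMS, ...) expose the same `Source`/`Sink` shapes, so they plug into a stream the same way. The file name is a hypothetical placeholder:

```scala
import java.nio.file.Paths
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{FileIO, Framing, Sink}
import akka.util.ByteString

object FileSketch extends App {
  implicit val system: ActorSystem = ActorSystem("io")
  implicit val mat: ActorMaterializer = ActorMaterializer()

  FileIO.fromPath(Paths.get("input.csv")) // hypothetical input file
    // Split the raw byte stream into lines.
    .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
    .map(_.utf8String.trim)
    .runWith(Sink.foreach(println))
    .onComplete(_ => system.terminate())(system.dispatcher)
}
```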
If you can justify running your solution on a cluster, Apache Spark might be what you're looking for. Once you have access to your data in the form of an RDD, DataFrame, or Dataset, you can treat it almost like a collection or a SQL table.
You have a multitude of functional operations available, some of which are specifically designed to run on a cluster and minimize shuffling (transferring large amounts of data between nodes).
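A minimal sketch of both styles, assuming a Spark 2.x dependency ("org.apache.spark" %% "spark-sql"); `local[*]` is a stand-in for your actual cluster's master URL:

```scala
import org.apache.spark.sql.SparkSession

object SparkSketch extends App {
  val spark = SparkSession.builder()
    .appName("transform")
    .master("local[*]") // swap for your cluster's master URL
    .getOrCreate()
  import spark.implicits._

  // A Dataset can be transformed with collection-like operations...
  val squares = spark.range(1, 6).map(n => n * n)
  squares.show()

  // ...or registered and queried like a SQL table.
  squares.toDF("value").createOrReplaceTempView("squares")
  spark.sql("SELECT sum(value) AS total FROM squares").show() // 1+4+9+16+25 = 55

  spark.stop()
}
```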
spark.apache.org/