DEV Community

Michel's fanboi

Path to become a junior+ data engineer?

Henlo!

I'm an IT student and I'd like to work as a data engineer, but I feel like a fish lost in an ocean of big data tools.

First off, I've got a strong web background, mainly back-end work such as building and deploying microservice-style applications on the web. But what I like most is working with data, Big Data.

But I don't know where to start. Today I'm quite confident with Apache Beam, SQL/NoSQL, message queues, and cloud solutions, but that feels like nothing compared to the sheer diversity of Big Data tools.

Should I go for open-source tools such as Kafka, Cassandra, HDFS, etc., or should I focus on the cloud side (Cloud Dataflow, AWS EMR, Pub/Sub, Kinesis...)?

I'd appreciate any help ;)

Discussion (1)

Dmitry

Try setting up your first Hadoop cluster (hosted on Azure or AWS), then use a clustered database (Hive or another) for your regular tasks. That way you'll get the basics of big data tools.
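
To get a feel for what such a cluster does before setting one up, here is a minimal single-machine sketch of the map/shuffle/reduce pattern that Hadoop MapReduce is built around. It is plain Python, not Hadoop API code, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, the way a word-count mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key. On a real cluster the framework does this
    between the map and reduce steps; here it is a plain dict."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Chain the three phases on a single machine (illustrative only)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data tools", "big data big cluster"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on different nodes and the input comes from HDFS rather than a Python list, but the shape of the computation is the same.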