DEV Community 👩‍💻👨‍💻

Michel's fanboi

Path to become a junior+ data engineer?


I'm an IT student and I'd like to work as a data engineer, but I feel like a fish lost in an ocean of big data tools.

First off, I've got a strong web background, mainly back-end work such as building and deploying microservice-style applications. But what I like most is working with data, especially Big Data.

But I don't know where to start. Today I'm fairly comfortable with Apache Beam, SQL/NoSQL databases, message queues, and cloud solutions, but that feels like nothing compared to the sheer variety of Big Data tools.

Should I go for open-source tools such as Kafka, Cassandra, and HDFS, or should I focus on the cloud side (Cloud Dataflow, AWS EMR, Pub/Sub, Kinesis...)?

I'd appreciate any help ;)

Top comments (1)

kerriop

Try setting up your first Hadoop cluster (hosted on Azure or AWS), then use a clustered database (Hive or another) for your regular tasks. That will give you the basics of Big Data tools.
