DEV Community

Bobby Iliev

Posted on • Originally published at devdojo.com

The Data Engineer Roadmap 🗺

Introduction

This has been inspired by the Full Stack Developer's Roadmap post written by @ender_minyard 🙌

With ever-growing data volumes and demands, data engineering has been one of the fastest-growing careers of the past few years.

According to the 2021 Stack Overflow survey, data engineers are among the top 5 highest-paid professionals, right after SREs and DevOps engineers.

If you are looking to become a data engineer, here are some data engineering resources that you can save for later.

Table Of Contents

  • 💻 Fundamentals
  • 👩‍💻 Programming basics
  • 🧪 Testing
  • 📊 Database Fundamentals
  • 🏠 Data warehouses
  • 📦 Object storage
  • ⚡ Data processing
  • 📩 Messaging
  • 💽 Cluster computing
  • ⏲ Workflow Scheduling
  • 📺 Monitoring data pipelines
  • 👨‍💻 Infrastructure as Code
  • 🛫 CI/CD

💻 Fundamentals

Having a solid understanding of the Linux operating system is often a must in IT-related roles. You will benefit a lot from knowing the basics of the following:

👩‍💻 Programming basics

As with any IT-related role, it is essential to have a fundamental knowledge of programming in general. The programming language itself does not matter that much, but you need a good understanding of programming paradigms and best practices.

🧪 Testing

📊 Database Fundamentals

Having a solid understanding of SQL, data normalization and ACID transactions is a must for all data engineers.
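To make these terms a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the sample names and the `make_db`, `transfer` and `balances` helpers are all made up for illustration; they are not taken from any of the resources in this post. Customer names are stored once in their own table (a simple form of normalization), and a transfer either applies fully or not at all (the atomicity part of ACID).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small, normalized, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Customer details live in one place; accounts only reference them.
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE accounts (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            balance INTEGER NOT NULL CHECK (balance >= 0)
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO accounts VALUES (10, 1, 100), (20, 2, 50);
    """)
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst`; return True if it was committed."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # This debit violates the CHECK constraint when funds are insufficient.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The credit above was rolled back together with the failed debit.
        return False


def balances(conn):
    """Return {customer name: balance} by joining the two tables."""
    rows = conn.execute(
        "SELECT c.name, a.balance FROM accounts a "
        "JOIN customers c ON c.id = a.customer_id ORDER BY a.id"
    )
    return dict(rows.fetchall())


conn = make_db()
transfer(conn, 10, 20, 30)    # succeeds and is committed
transfer(conn, 10, 20, 500)   # fails and leaves both balances unchanged
print(balances(conn))
```

The credit is deliberately executed before the debit so that a failed transfer has something to undo: without the transaction, the second call would leave money credited to one account but never debited from the other.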

Relational Databases

Non-relational databases

🏠 Data warehouses

📦 Object storage

⚡ Data processing

Batch

Hybrid

Streaming

📩 Messaging

💽 Cluster computing

⏲ Workflow Scheduling

📺 Monitoring data pipelines

👨‍💻 Infrastructure as Code

🛫 CI/CD

Conclusion

This was inspired by the Data Engineer Roadmap open source repository here:

https://github.com/datastacktv/data-engineer-roadmap

I wanted to build upon the roadmap and provide a list of resources for each topic.

Let me know if I've missed anything! Hope you find this useful and make sure to keep learning 🙌

You can follow me on Twitter at: @bobbyiliev_
