Muhammed Jimoh

Data Engineering

Week 5 of the Data Engineering Zoomcamp, featuring Sejal Vaidya, Ankush Khanna, and Alexey Grigorev, by #DataTalksClub.

Week 5 repo ➡️ GitHub

🛠 Tools
🧵 Apache Spark
🧵 Apache Hadoop
🧵 Google VMs

Apache Airflow

🏹Week 5 (Batch Processing) Summary:

🎯 Streaming vs. Batch Processing: Advantages and Disadvantages.
🎯 Theoretical and Practical understanding of Batch Processing.
🎯 Bash scripting
🎯 Anatomy of Spark
🎯 RDDs
🎯 Connecting Spark to BigQuery
🎯 Setting up Dataproc cluster

Apache Spark

What a beautiful Monday to begin the week.
