Week 5 of the Data Engineering Zoomcamp, featuring Sejal Vaidya, Ankush Khanna, and Alexey Grigorev, by #DataTalksClub.
Week 5 repo ➡️ GitHub
🛠 Tools
🧵 Apache Spark
🧵 Apache Hadoop
🧵 Google VMs
🏹 Week 5 (Batch Processing) Summary:
🎯 Streaming vs. Batch Processing: Advantages and Disadvantages.
🎯 Theoretical and Practical understanding of Batch Processing.
🎯 Bash scripting
🎯 Anatomy of Spark
🎯 RDDs (Resilient Distributed Datasets)
🎯 Connecting Spark to BigQuery
🎯 Setting up a Dataproc cluster
What a beautiful Monday to begin the week.