Week 5 of the Data Engineering Zoomcamp, featuring Sejal Vaidya, Ankush Khanna, and Alexey Grigorev, by #DataTalksClub.
Week 5 repo ➡️ GitHub
🛠 Tools
🧵 Apache Spark
🧵 Apache Hadoop
🧵 Google VMs
🔹 Week 5 (Batch Processing) Summary:
🎯 Streaming vs. batch processing: advantages and disadvantages
🎯 Theoretical and practical understanding of batch processing
🎯 Bash scripting
🎯 Anatomy of Spark
🎯 RDDs
🎯 Connecting Spark to BigQuery
🎯 Setting up a Dataproc cluster
What a beautiful Monday to begin the week.

