Week 5 of the Data Engineering Zoomcamp, featuring Sejal Vaidya, Ankush Khanna, and Alexey Grigorev, by #DataTalksClub.
Week 5 repo ➡️ GitHub
Tools
🧵 Apache Spark
🧵 Apache Hadoop
🧵 Google VMs
Week 5 (Batch Processing) Summary:
🎯 Streaming vs. Batch Processing: Advantages and Disadvantages
🎯 Theoretical and Practical Understanding of Batch Processing
🎯 Bash scripting
🎯 Anatomy of Spark
🎯 RDDs
🎯 Connecting Spark to BigQuery
🎯 Setting up a Dataproc cluster
What a beautiful Monday to begin the week.