DEV Community

Pizofreude

Notes on Data Engineering Zoomcamp 2025 - Launch Stream

Overview:

  • Course Edition: Fourth edition of the Data Engineering Zoomcamp.
  • Purpose of Stream: Introduction to the course, syllabus, logistics, team members, and Q&A session.
  • Key Topics Covered:
    • Course structure and syllabus.
    • Introduction to team members.
    • Tools and platforms for learning and communication.
    • Logistics and homework submissions.
    • Importance of community learning.

Course Team:

  1. Victoria:
    • Works at DT Hub and was part of the first Data Engineering Zoomcamp cohort.
    • Covers Analytics Engineering in the course.
  2. Alexey:
    • Founder of DataTalks Club.
    • Previously a data scientist with significant experience in data engineering tools.
    • Covers Docker and Spark modules.
  3. Michael:
    • Senior Data Analyst and teaching assistant for the past two years.
    • Content creator and runs a YouTube channel called "Data Slinger."
    • Assists with content and troubleshooting.
  4. Bruno:
    • Senior Data Engineer at Intuition Machines.
    • Extensive experience in data engineering.
    • Provides guidance and support to participants.
  5. Will and Anna:
    • Work at Kestra.
    • Cover Workflow Orchestration (Module 2).
  6. Zach:
    • Staff Data Engineer and instructor in another data engineering bootcamp.
    • Focuses on advanced topics like Flink.
    • Founder of dataexpert.io
  7. Ankush and Sejal:
    • Part of the original team that launched the Zoomcamp initiative.
    • Ankush remains active in supporting the community.

Course Syllabus:

  1. Module 1: Introduction to Docker and Google Cloud setup.
    • Google Cloud offers $300 in free credits for first-time users.
    • AFAIK, GCP offers two types of free trial for its cloud services:
      • Free Trial with $300 credit, which requires billing details (all features available)
      • Sandbox option, which does not require billing details (limited features)
    • Since the Sandbox option allows users to use the services required by this course, I will start with the Sandbox and will consider the Free Trial if its limited features do not cover the course's full requirements.
  2. Module 2: Workflow Orchestration.
    • Covers orchestration tools such as Prefect and Airflow.
  3. Module 3: Data Warehousing.
    • Emphasis on BigQuery and PostgreSQL.
  4. Module 4: Analytics Engineering.
    • Introduction to dbt (data build tool) for SQL transformations.
  5. Module 5: Spark.
    • Focus on distributed data processing.
  6. Module 6: Stream Processing.
    • Includes tools like Kafka and Flink (TBA by Zach).
  7. Workshop: Data Ingestion with dlt (data load tool).

Logistics:

  • Content Delivery:
  • Homework:
  • Community Support:
    • Slack is the primary communication platform.
    • Participants are encouraged to use threads for organized discussions.
    • Learning in public (e.g., posting progress on LinkedIn) is recommended.

Learning in Public:

  • Benefits:
    • Helps participants build their personal brand.
    • Encourages networking and community engagement.
    • Demonstrates growth and dedication.
  • Examples:
    • Sharing project updates or lessons learned on LinkedIn.

Tools and Recommendations:

  • Google Cloud Platform (GCP):
    • Recommended for the course due to its ease of use and free credits.
    • AWS and Azure are also options, but GCP is more straightforward.
  • Additional Tools:
    • Participants are encouraged to explore other platforms and tools beyond the syllabus, such as data governance and scripting with Makefiles and Bash.

Q&A Highlights:

  1. Career Preparation:
    • Prepares participants for roles in data engineering and analytics engineering.
    • Emphasizes project-based learning.
  2. AI and Data Engineering:
    • AI is unlikely to replace data engineers but may enhance productivity.
    • The LLM Zoomcamp is pretty much AI for data engineering and is highly recommended after completing the DE Zoomcamp.
  3. Key Advice for Success:
    • Consistency in learning and building projects.
    • Active participation in the community.
    • Sharing work publicly to stand out.
  4. Beginner-Friendly:
    • Suitable for those new to data engineering, even without prior software engineering experience. Chicken and egg problem 😉

Final Notes:

  • Contributions:
    • Participants are encouraged to contribute to open-source projects via the "Open Source Spotlight" on the YouTube channel.
  • Focus on Projects:
    • For certification, priority is given to delivering projects rather than homework.
  • Future Learning Opportunities:
    • Check out related courses like the LLM Zoomcamp for AI-focused topics.

Motivational Message:

  • Stay consistent, actively participate, and leverage the community for support.
