DEV Community

# dataengineering

Posts

How Pandas Simplifies ETL Data Cleaning

Comments
6 min read
Git Branching and Merging: A Data Engineer's Guide to Efficient Code Management

Comments
5 min read
Stop Using CSVs in Big Data: Here's Why You Should Learn Apache Iceberg

Comments
1 min read
5 Beginner-Friendly Projects to Learn Data Engineering (Using Free Tools)

Comments
1 min read
What is Exploratory Data Analysis?

Comments
1 min read
Number Non-Null Values in Order within the Group - From SQL to SPL #17

6
Comments 1
2 min read
🕸️ Why Web Scraping in Python Is a Must-Have Skill in 2025 🐍

Comments
1 min read
🚿 Data Cleaning in Python, 2025 Edition

🚿 Data Cleaning in Python, 2025 Edition

Comments
1 min read
What is Data Scraping?

Comments
2 min read
InsightFlow Part 1: Building an Integrated Retail & Economic Data Pipeline - Project Introduction

Comments
4 min read
🚀 Lakehouses Demystified: The Future of Data is Here!

1
Comments 1
3 min read
Understanding Data Pipelines: The Backbone of Modern Data Systems

1
Comments
3 min read
Time of YAML/JSON for data engineer

1
Comments
2 min read
Distributed Model Serving Patterns

Comments
4 min read
Building Automated Data Reports from Supabase with GitHub Actions and R Markdown

Comments
12 min read
Matplotlib For Data Visualization

Comments
1 min read
Getting Values from Multiple Format Strings to Multiple Records - From SQL to SPL #16

6
Comments 1
2 min read
How to treat secure data on lakehouse

1
Comments
3 min read
Data Engineering: The Hero Behind Smart Data Decisions

Comments
1 min read
The Ethics of Data Science: Balancing Innovation and Privacy

1
Comments
4 min read
Building a Self-Optimizing Data Pipeline

Comments
3 min read
Automating Cryptocurrency Data with Python, Apache Airflow and PostgreSQL

Comments
3 min read
Architecting High-Performance Data Pipelines with Modern ETL | Spiral Mantra

Comments
1 min read
Creating a new Airbyte connector from scratch

Comments
4 min read
How to Optimize SQL Queries for Speed and Efficiency

2
Comments
5 min read