DEV Community

# dataengineering

Posts

Reading CSVs with varying column counts that pandas cannot read using DuckDB

1
Comments
3 min read
Working with Apache to automate collection of Weather data for Kenya’s major Agricultural Areas

Comments
5 min read
Building a Data Career: The Skills That Truly Matter

10
Comments
5 min read
You Can't Trust COUNT and SUM: Scalable Data Validation with Merkle Trees

2
Comments 1
8 min read
Unable to emit metadata to DataHub GMS with Airflow - a solution

Comments
4 min read
Snowflake RBAC 101 – Episode 2: Role Hierarchies & Least Privilege

Comments
1 min read
Lightweight ETL with AWS Glue Python Shell, DuckDB, and PyIceberg

5
Comments 1
7 min read
PyIceberg on AWS Lambda: Comparing GlueCatalog and REST Catalog Access Methods

2
Comments
3 min read
The Rise of Real-Time Data: Why Batch Might Be Fading

10
Comments
3 min read
📚 A Complete Guide to Data Science Courses: How to Choose, What to Learn, and Where to Begin

Comments
5 min read
Engineering with SOLID, DRY, KISS, YAGNI and GRASP

1
Comments
16 min read
Three Formats Walk into a Lakehouse: Iceberg, Delta and Hudi in a Local Setup You Can Run on Your Laptop

10
Comments 4
16 min read
Which is Best for Real Time Dashboards: Airbyte, Fivetran, or Estuary

1
Comments
6 min read
Personal Picks: Data Product News (July 9, 2025)

Comments
6 min read
Key Concepts Every Data Engineer Should Master

4
Comments
6 min read