DEV Community

# dataengineering

Posts

SQL: Doing GROUP BY in CsvPath

5 min read
🔥 Day 3: RDDs - The Foundation of Spark

2 min read
🔥 Day 4: RDD Internals - Partitions, Shuffles & Repartitioning Demystified

2 min read
The Developer's Guide to Normalizing Historical Airline Flight Data for Machine Learning

6 min read
Overview of Real-Time Data Synchronization from MySQL to VeloDB

5 min read
Stop Writing df.describe(): Automate EDA with D-Tale (The Lazy Engineer's Way)

3 min read
CHW Monthly Activity Aggregation: Turning Visit Logs into Insight

5 min read
🔥 Day 2: Understanding Spark Architecture - How Spark Executes Your Code Internally

2 min read
When models suggest deprecated Pandas APIs: a small mistake that cascades

3 min read
Marmot: Data catalog without the complex infrastructure

3 min read
TDD for dbt: unit testing the way it should be

12 min read
When code-gen suggests deprecated Pandas APIs: a case study in subtle breakage

3 min read
Schema, COPY, MERGE, and Immutability — A First-Principles Guide for Data Engineers

5 min read
HackerRank 'The Pads' MySQL

3 min read
🔥 Day 5: Introduction to DataFrames - The Most Important Spark API

2 min read