DEV Community

# dataengineering

Posts

Building a Robust Data Observability Framework to Ensure Data Quality and Integrity

1
Comments 1
7 min read
Benchmarking Multimodal AI Workloads: Daft vs Spark vs Ray Data

10
Comments
1 min read
All About Change Data Capture CDC

1
Comments
6 min read
🚀 Day 17 of My Python Learning Journey

Comments
1 min read
Sagas vs ACID Transactions: Ensuring Reliability in Distributed Architectures

1
Comments
11 min read
JOIN the data analytics race: Apache Doris vs. ClickHouse, Databricks, and Snowflake

Comments 1
6 min read
A Beginner’s Journey with PostgreSQL

2
Comments
3 min read
Break Through Data Silos: Practices of Multi-cloud Observability Integration Based on Object Storage Service (OSS)

Comments
12 min read
Real-Time CDC with Debezium and Kafka for Sharded PostgreSQL Integration

1
Comments
9 min read
Comprehensive Guide: kwargs vs XCom in Python & Airflow

Comments
4 min read
Precise Data Extraction: Pattern-Based Partitioning for Structured Extraction

1
Comments
3 min read
Apache Gravitino 1.0.0 — From Metadata Management to Contextual Engineering

1
Comments
7 min read
Apache Kafka in Data engineering

6
Comments 1
1 min read
How I Built a MongoDB Archiving System for Crawled Data

1
Comments 2
7 min read
🧭System Design Roadmap for Data Engineers

4
Comments
3 min read
Orchestrating and Observing Data Pipelines with Airflow, PostgreSQL, and Polar

2
Comments
3 min read
💥 Polars vs. Pandas: Why Your Next ETL Pipeline Should Run on Rust (Part 1/5)

2
Comments
2 min read
(Ⅱ) A Complete Guide to Core Data Warehouse Design Standards: From Layers, Types to Lifecycle

Comments
6 min read
Building Distributed Systems with Ray—Just Like Running a Restaurant

1
Comments
7 min read
The State of Apache Iceberg v4 - October 2025 Edition

3
Comments
6 min read
ACID, Isolation Levels, and MVCC: Architecture and Execution in Relational Databases

2
Comments
10 min read
Data Automation: A Deep Dive

1
Comments
5 min read
Why Data Partitioning Is Harder Than It Looks

1
Comments
2 min read
Part 2: Snowflake's Autonomous Future

Comments
8 min read
Collecting Africa’s Energy Insights:

3
Comments
4 min read