DEV Community

# dataengineering

Posts

Part 2: Snowflake's Autonomous Future

Comments
8 min read
Collecting Africa’s Energy Insights:

3
Comments
4 min read
Real-Time Fraud Detection Using Apache Flink

Comments
1 min read
Making JSON Compression Searchable — SEE (Schema-Aware Encoding)

1
Comments
2 min read
Containerization for Data Engineering: A Practical Guide with Docker and Docker Compose

Comments
5 min read
Apache Iceberg Dev List Digest (Sept 15–19, 2025)

Comments
3 min read
Data Engineering with Docker: A Hands-On Guide to Containerization

7
Comments 2
3 min read
Mastering MLflow: Managing the Full ML Lifecycle

2
Comments
9 min read
From Kafka to Clean Tables: Building a Confluent Snowflake Pipeline with Streams & Tasks

3
Comments 1
10 min read
Understanding the Basics of Linux Operating System

Comments
1 min read
Why you need to learn Apache Airflow - right now

Comments
3 min read
Building a True Dual-Destination Analytics Pipeline: Real-Time Streaming with S3 Backup and Recovery

1
Comments
8 min read
Apache Kafka Deep Dive: Concepts, Applications, and Production

Comments
4 min read
Automating NASA’s Astronomy Picture of the Day with Airflow

Comments
6 min read
Building Modern Data Systems: Event-Driven Architecture, Messaging Queues, Batch Processing, ETL & ELT

2
Comments
11 min read
A Dive into Apache Iceberg™'s Metadata

Comments
4 min read
Building an Automated YouTube Analytics Dashboard with Airflow, PySpark, MinIO, PostgreSQL & Grafana

7
Comments
5 min read
Composable Analytics with Agents: Leveraging Virtual Datasets and the Semantic Layer

1
Comments
3 min read
When to Choose Scala Over Python for Apache Spark: A Performance-Driven Analysis

1
Comments
4 min read
⚽ The Data XI: Building a Modern Football Data Platform — Chapter 1: Taming the Data Beast

2
Comments 1
3 min read
📊 Understanding 6 Common Data Formats in Data Analytics

Comments
4 min read
Introduction to Apache Kafka for Beginners

1
Comments
5 min read
Apache Kafka — Deep Dive: Core Concepts, Data-Engineering Applications, and Real-World Production Practices

1
Comments
4 min read
Apache Iceberg Dev List Digest August 25-29

Comments
5 min read
The Blueprint of a Data Team: Roles, Responsibilities, and Specializations

2
Comments
10 min read