DEV Community

# dataengineering

Posts

Apache Kafka: ZooKeeper vs. KRaft — A Complete Comparison of Approaches

Comments
6 min read
Introduction to Apache Airflow

1
Comments
4 min read
Building a Production-Ready Data Lake: PostgreSQL to S3 with AWS DMS, Glue, and Athena using CDK

2
Comments
8 min read
Synthetic Data for RAG: Safe Generation, Deduplication, and Drift-Aware Curation in 2025

2
Comments
10 min read
Personal Picks: Data Product News (October 1, 2025)

Comments
7 min read
TikTok Data Engineer Full 3-Round Interview

3
Comments
4 min read
Building Real-Time Data Pipelines from PostgreSQL Using Flink CDC

Comments
5 min read
How to Convert Excel to CSV in Python using Spire.XLS for Python

Comments
4 min read
Building a Sales Database in PostgreSQL — Schema, Data & JOIN Examples

3
Comments
6 min read
Git Integration in Microsoft Fabric

3
Comments
3 min read
Get Started with Fastest SQL Query Engine - Presto C++ (Prestissimo): Beginner Friendly Setup Guide with Docker.

Comments
5 min read
10 Best Platforms to Learn Data Analytics in 2026

1
Comments
4 min read
Apache Zookeeper: The Coordinator of Distributed Systems

Comments
8 min read
Data Ingestion Types Explained: Finding the Right Model for Your Data Pipeline

2
Comments
3 min read
Debezium: Capturing Data Changes in Real Time

Comments
3 min read
Change Data Capture (CDC): Capturing Changes in Real Time

Comments
4 min read
Data Streams: Real-Time Information Processing

Comments
3 min read
Designing Data-Intensive Applications — Chapter 1: Reliable, Scalable, and Maintainable Applications

6
Comments
4 min read
Big Data Analytics with PySpark: A Beginner-Friendly Guide

1
Comments
4 min read
Using Higher-Order Functions (HOFs)

Comments
4 min read
Change Data Capture (CDC) in Data Engineering: Concepts, Tools, and Real-World Implementation Strategies

1
Comments
5 min read
A Beginner’s Guide to Big Data Analytics with Apache Spark and PySpark

Comments
4 min read
Azure Data Factory — The Conveyor Belt of Data in the Cloud

Comments 1
5 min read
Apache Kafka Deep Dive: Core Concepts, Data Engineering Applications, and Real-World Production Practices

Comments
4 min read
Self-Adapting Data Pipelines: The Intelligent Future of Data Engineering

5
Comments
17 min read