DEV Community

# dataengineering

Posts

The State of Apache Iceberg, Polaris, and Arrow: November 5-11

Comments
5 min read
Understanding Kafka Lag: Why It Happens and How to Fix It

Comments
4 min read
Right Approach to JSON Log Analysis: A Hands-on Guide to Efficient Practices with Alibaba Cloud SLS

Comments
7 min read
Understanding reasons behind Kafka lag and how to minimize it.

Comments
3 min read
Why Idempotence Is So Important in Data Engineering

Comments
6 min read
Reducing Consumer Lag in Apache Kafka

5
Comments
3 min read
🚀 Day 1: Introduction to Apache Spark

1
Comments
2 min read
The Future of Data Pipelines: How AI Is Redefining ETL Forever

1
Comments
4 min read
Shine in Your Next Data Engineering Interview with Pandas

Comments
10 min read
Writes, 3 ways: Postgres, Apache Kafka® and Apache Iceberg™

1
Comments
10 min read
Building Custom MCP Servers with Python: A Data Engineer's Guide 🛠️

Comments 1
5 min read
The Next Era of Databases: When Queries Write Themselves

Comments
4 min read
Why 71,000 Data Engineers Read My Article: What I Learned About Technical Writing

4
Comments 1
6 min read
Transforming Tableau Performance: How Optimized Data Logic Cut Dashboard Load Time by 98.9%

5
Comments
8 min read
rec

Comments
2 min read
How We Cut LLM Batch Inference Time in Half with Dynamic Prefix Bucketing

5
Comments
13 min read
Automating EL pipeline using Azure Functions(Python)

1
Comments
4 min read
Top Open-Source Data Engineering Tools- Unravelling the Best in 2026

1
Comments
10 min read
Let's say you have a data lake

Comments
3 min read
Azure Data Factory for ETL

Comments
5 min read
FlightPath Server Has Landed

Comments
1 min read
How to Avoid Common Data Management Pitfalls in Enterprise RAG Systems: A Guide to Effective Governance and Observability

Comments
5 min read
🔥 Day 6: Essential PySpark DataFrame Transformations

Comments
2 min read
Training Prize (Part 2): Creating a 5,500-Line Dataset for $0.08 USD

Comments
5 min read
Building Self-Healing, Reliable Data Pipelines That Think

Comments
4 min read