DEV Community

# dataengineering

Posts

Building and Managing Production-Ready Apache Airflow: From Setup to Troubleshooting
2 min read
🐚 My Pacific Dataviz Challenge 2024 submission : violence & graphdatascience
2 min read
Useful Python Libraries for AI/ML
1 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub
16 min read
Understanding Your Data: The Essentials of Exploratory Data Analysis (EDA)
3 min read
Uploading Files Using Pre-Signed URLs to a Specific Storage Class
2 min read
Data Lakehouse 101: The Who, What and Why of Data Lakehouses
7 min read
Elasticsearch: Finding Missing Documents between 2 indices
3 min read
Breaking Into Data Science: A Comprehensive Guide for Aspiring Data Scientists
5 min read
"Data Engineering 101: A Beginner's Guide"
3 min read
Understanding the Polaris Iceberg Catalog and Its Architecture
8 min read
Automatically Update BigQuery View Schema Changes
5 min read
How I contributed my first data pipeline to the open source.
3 min read
On Orchestrators: You Are All Right, But You Are All Wrong Too
10 min read
Data Engineer and Databricks
3 min read
What is the REST API Source toolkit?
7 min read
Working with Parquet files in Java using Carpet
6 min read
HNG STAGE ZERO: ANALYZING RETAIL SALES DATA AT FIRST GLANCE
3 min read
🪄 Debezium: the magic behind data capture & async replication (for free)
2 min read
Ways to load data in DW from External Data Source
6 min read
Apache Doris Job Scheduler for Task Automation
6 min read
Tracking Health with Data Engineering - Chapter 1: Meal Optimization
6 min read
Software OR Hardware Raid: What's Better In 2024?
7 min read
Aggregation in GROUP BY vs. Window Functions Using OVER()
3 min read
Azure Synapse Analytics Security: Access Control
7 min read