DEV Community

# dataengineering

Posts

Clustering vs Partitioning your Apache Iceberg Tables

7
Comments
10 min read
From Messy Data to Super Mario Pipeline: My First Adventure in Data Engineering

1
Comments
12 min read
Database generated events: LiveSync’s database connector vs CDC

4
Comments
5 min read
The Data Professions

1
Comments
3 min read
MySQL: Using and Enhancing `DATETIME` and `TIMESTAMP`

1
Comments
3 min read
Working with Parquet files in Java using Carpet

1
Comments 1
6 min read
Analyzing Svenskalag Data using DBT and DuckDB

1
Comments
4 min read
How I've implemented the Medallion architecture using Apache Spark and Apache Hadoop

12
Comments
6 min read
Working with Dates and Times in SQL: Tips and Tricks

Comments
3 min read
FastAPI for Data Applications: From Concept to Creation. Part I

4
Comments
5 min read
Bridging Backend and Data Engineering: Communicating Through Events

Comments
1 min read
Using Elasticsearch Percolate Queries, Netflix Efficiently Improves Reverse Searches

1
Comments
3 min read
How to setup resources for k8s pod

2
Comments
3 min read
Multi-tenant workload isolation in Apache Doris: a better balance between isolation and utilization

3
Comments
9 min read
Data Mesh: An Executive Guide to Modern Data Architecture in Manufacturing

1
Comments
13 min read
Difference between Data Analysts, Data Scientists, and Data Engineers

Comments 1
1 min read
What is Data Ethics?

Comments
8 min read
Converting .shp files to CSV with GeoPandas

17
Comments 1
2 min read
Apache Iceberg and Data Lakehouse Partitioning

8
Comments 1
7 min read
Data warehouse vs data lake

1
Comments
8 min read
SQL Convertor for Easy Migration from Presto, Trino, ClickHouse, and Hive to Apache Doris

Comments
4 min read
Python Projects with SQL: Strategies for Effective Query Management

16
Comments 2
9 min read
Apache Spark 101

2
Comments
7 min read
PySpark: missing value

Comments
2 min read
How I Try To Keep Up With The Data Tech World (A List of Data Blogs)

1
Comments 1
5 min read