DEV Community

# dataengineering

Posts

Leveraging Python's Pattern Matching and Comprehensions for Data Analytics

1
Comments
12 min read
Why Apache Spark RDD is immutable?

Comments
3 min read
Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes

1
Comments
19 min read
Data Modeling - Entities and Events

2
Comments
6 min read
Data Engineering in Observability: The Backbone of Modern Monitoring

1
Comments
5 min read
Real-Time Air Traffic Data Analysis with Spark Structured Streaming and Apache Kafka

6
Comments
8 min read
Oracle to Snowflake Migration: Steps, Challenges & Best Practices

2
Comments
3 min read
Data Engineering in 2024: Innovations and Trends Shaping the Future

8
Comments 2
13 min read
My journey learning Apache Spark

1
Comments
2 min read
AWS DATA ENGINEER - 101

3
Comments
2 min read
The Journey From a CSV File to Apache Hive Table

3
Comments
6 min read
Chapter 2 - Data Models and Query Languages

2
Comments
7 min read
All About Parquet Part 05 - Compression Techniques in Parquet

16
Comments
5 min read
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet

17
Comments
6 min read
All About Parquet Part 02 - Parquet's Columnar Storage Model

2
Comments
4 min read
All About Parquet Part 09 - Parquet in Data Lake Architectures

1
Comments
5 min read
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage

3
Comments
6 min read
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency

5
Comments
5 min read
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns

4
Comments
5 min read
All About Parquet Part 01 - An Introduction

2
Comments
4 min read
All About Parquet Part 08 - Reading and Writing Parquet Files in Python

34
Comments
5 min read
All About Parquet Part 04 - Schema Evolution in Parquet

7
Comments 1
5 min read
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog

3
Comments
5 min read
Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration

1
Comments
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub

8
Comments
15 min read