DEV Community

# dataengineering

Posts

All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage

3
Comments
6 min read
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency

5
Comments
5 min read
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog

3
Comments
5 min read
Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration

1
Comments
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub

8
Comments
15 min read
Building a Big Data Playground Sandbox for Learning

9
Comments
5 min read
Still Using SQL, Python, & Excel for Data Deduplication? Here's Why You Need Better Tools.

6
Comments
4 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files

Comments
1 min read
The True Cost of Poor Data Quality: Why It Matters and How to Improve It

3
Comments
6 min read
From ETL and ELT to Reverse ETL

Comments 1
4 min read
Explaining the History of Data Lakehouse

1
Comments
2 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud

Comments
1 min read
What Is Data Engineering?

3
Comments
1 min read
How SQL Spatial Data Solves Real-World Problems

Comments
6 min read
Explaining CDC (Change Data Capture)

Comments
1 min read
Handling Outliers 101: Why the IQR Method is Your Go-To Tool

2
Comments
3 min read
Go vs Python for File Processing: A Performance and Architecture Perspective

2
Comments 2
5 min read
Exploring Data Operations with PySpark, Pandas, DuckDB, Polars, and DataFusion in a Python Notebook

6
Comments
13 min read
Data Analysis: The Unsung Hero of Modern Business

Comments
2 min read
Analyzing Airbnb Listings in Chicago: A Power BI Dashboard Project

1
Comments
4 min read
Python 101: Introduction to Python as a Data Analytics Tool

Comments
3 min read
Ultimate Directory of Apache Iceberg Resources

1
Comments
14 min read
Change Data Capture (CDC) when there is no CDC

3
Comments
11 min read
Understanding OLTP and Choosing the Right Database

2
Comments
6 min read
The Ultimate Guide to Data Engineering

Comments
2 min read