DEV Community

# dataengineering

Posts

The Simplest Data Architecture

1
Comments
21 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1)

Comments
2 min read
ETL Real Estate Data Engineering with Redfin: From Extraction to Visualization

Comments
3 min read
End-to-End AWS KMS Encryption and Decryption Tutorial

2
Comments
3 min read
Working with Gigantic Google BigQuery Partitioned Tables in DBT

2
Comments
3 min read
Magic Mushrooms: Exploring and Handling Null Data with Mage

Comments
6 min read
Apache Airflow

2
Comments
4 min read
Building and Managing Production-Ready Apache Airflow: From Setup to Troubleshooting

Comments
2 min read
The Must-Have Features of Modern Data Transformation Tools

Comments
6 min read
An End-to-End Guide to dbt (Data Build Tool) with a Use Case Example

4
Comments
4 min read
Data Pipeline Techniques in Action

1
Comments
1 min read
From Data Lakes to Data Mesh: The Emerging Trends of Data Management and Analytics

1
Comments
8 min read
Building Powerful Social Media APIs for Twitter and Telegram: A Developer's Journey

1
Comments 1
3 min read
One Minute: DatAasee

1
Comments
1 min read
Data Security Strategy Beyond Access Control: Data Encryption

2
Comments
5 min read
The Power of Data Analytics – Transforming Businesses with Insights

Comments
5 min read
A beginner's guide to data engineering concepts, tools, and responsibilities.

Comments
1 min read
Ensuring Data Integrity: Comparing Soda and Great Expectations for Quality Assurance

4
Comments
4 min read
A Beginner's Guide To Data Engineering Concepts, Tools, And Responsibilities.

Comments
1 min read
Snowflake vs. BigQuery: Choosing the Right Cloud Platform for Your Data

Comments
2 min read
Building a data science career as a beginner. How can you do it?

Comments
4 min read
Hiring Alert!

Comments
1 min read
Understanding Apache Iceberg Delete Files

4
Comments
4 min read
Top 5 Things You Should Know About Spark

1
Comments
3 min read
Uploading Files Using Pre-Signed URLs to a Specific Storage Class

Comments
2 min read
PySpark optimization techniques

1
Comments
4 min read
Avoid These Top 10 Mistakes When Using Apache Spark

3
Comments
8 min read
Getting Started with Apache Kafka: A Beginner's Guide to Distributed Event Streaming

1
Comments
5 min read
Understanding the Apache Iceberg Manifest File

3
Comments
7 min read
RoadMap to Data-Analytics 2024!

3
Comments
2 min read
DBT and Software Engineering

4
Comments
7 min read
Effective Techniques for Handling Imbalanced Datasets: My Proven Approach

Comments
3 min read
The Developer’s Guide to Real-Time Data Platforms!

9
Comments
6 min read
Understanding Apache Iceberg's metadata.json file

4
Comments
7 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1)

5
Comments
1 min read
🌐 Get started: What is MongoDB operational data layer? (Part 2) 🌐

5
Comments
2 min read
Mastering SQL Joins and Unions: Integrate Data for Incredible Insights

Comments
6 min read
Feature Engineering: The Ultimate Guide

1
Comments
2 min read
🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together

2
Comments 2
3 min read
What Apache Iceberg REST Catalog is and isn't

10
Comments
3 min read
Transforming Data Engineering: A Business Domain Approach with Data Mesh

Comments
5 min read
Speeding Up Data on AWS: From Ingestion to Insights

4
Comments
11 min read
Importing Data from a CSV File into Postgres: A Basic Skill for Data Engineers

Comments
1 min read
The Ultimate Guide to Data Analytics: Techniques and Tools.

Comments
3 min read
Building an Agnostic Data Pipeline: Pros and Cons

1
Comments
4 min read
🐚 My Pacific Dataviz Challenge 2024 submission : violence & graphdatascience

3
Comments 10
2 min read
Useful Python Libraries for AI/ML

2
Comments
1 min read
Understanding RAID Levels: A Comprehensive Guide to RAID 0, 1, 5, 6, 10, and Beyond

6
Comments
9 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub

11
Comments
16 min read
Understanding Your Data: The Essentials of Exploratory Data Analysis (EDA)

1
Comments
3 min read
Data Lakehouse 101: The Who, What and Why of Data Lakehouses

Comments
7 min read
Elasticsearch: Finding Missing Documents between 2 indices

3
Comments
3 min read
Breaking Into Data Science: A Comprehensive Guide for Aspiring Data Scientists

Comments
5 min read
"Data Engineering 101: A Beginner's Guide"

3
Comments
3 min read
Understanding the Polaris Iceberg Catalog and Its Architecture

2
Comments
8 min read
Automatically Update BigQuery View Schema Changes

3
Comments
5 min read
How I contributed my first data pipeline to the open source.

1
Comments
3 min read
On Orchestrators: You Are All Right, But You Are All Wrong Too

1
Comments
10 min read
From Messy Data to Super Mario Pipeline: My First Adventure in Data Engineering

Comments
12 min read
Data Engineer and Databricks

1
Comments
3 min read