DEV Community

# dataengineering

Posts

An End-to-End Guide to dbt (Data Build Tool) with a Use Case Example

Comments
4 min read
Apache Airflow

1
Comments
4 min read
Clear Link Between DevSecOps and Data Engineering

Comments
1 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files

Comments
1 min read
One Minute: DatAasee

Comments
1 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud

Comments
1 min read
Ensuring Data Integrity: Comparing Soda and Great Expectations for Quality Assurance

Comments
4 min read
Building Powerful Social Media APIs for Twitter and Telegram: A Developer's Journey

1
Comments
3 min read
How SQL Spatial Data Solves Real-World Problems

Comments
6 min read
Secure Data Stack: Navigating Adoption Challenges of Data Encryption

1
Comments
5 min read
The Ultimate Guide to Data Engineering

Comments
2 min read
Understanding the Apache Iceberg Manifest File

Comments
7 min read
Evolution of Data Sharding Towards Automation and Flexibility

Comments
15 min read
The Power of Data Analytics – Transforming Businesses with Insights

Comments
5 min read
Top 5 Things You Should Know About Spark

1
Comments
3 min read
Understanding Apache Iceberg's metadata.json file

1
Comments
7 min read
Strategy Recall Platform

Comments
13 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1)

Comments
2 min read
ETL Real Estate Data Engineering with Redfin: From Extraction to Visualization

Comments
3 min read
End-to-End AWS KMS Encryption and Decryption Tutorial

2
Comments
3 min read
Magic Mushrooms: Exploring and Handling Null Data with Mage

Comments
6 min read
Understanding Apache Iceberg Delete Files

1
Comments
4 min read
Building and Managing Production-Ready Apache Airflow: From Setup to Troubleshooting

Comments
2 min read
The Must-Have Features of Modern Data Transformation Tools

Comments
6 min read
Data Pipeline Techniques in Action

1
Comments
1 min read
From Data Lakes to Data Mesh: The Emerging Trends of Data Management and Analytics

1
Comments
8 min read
Useful Python Libraries for AI/ML

Comments
1 min read
🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together

Comments 2
3 min read
Data Security Strategy Beyond Access Control: Data Encryption

2
Comments
5 min read
A beginner's guide to data engineering concepts, tools, and responsibilities.

Comments
1 min read
Avoid These Top 10 Mistakes When Using Apache Spark

4
Comments
8 min read
A Beginner's Guide To Data Engineering Concepts, Tools, And Responsibilities.

Comments
1 min read
Snowflake vs. BigQuery: Choosing the Right Cloud Platform for Your Data

Comments
2 min read
Building a data science career as a beginner. How can you do it?

Comments
4 min read
Hiring Alert!

Comments
1 min read
Uploading Files Using Pre-Signed URLs to a Specific Storage Class

Comments
2 min read
PySpark optimization techniques

1
Comments
4 min read
Data Engineer and Databricks

Comments
3 min read
Getting Started with Apache Kafka: A Beginner's Guide to Distributed Event Streaming

1
Comments
5 min read
Unlocking the Potential of Data with Azure Data Engineers

1
Comments
3 min read
RoadMap to Data-Analytics 2024!

3
Comments
2 min read
DBT and Software Engineering

3
Comments
7 min read
Effective Techniques for Handling Imbalanced Datasets: My Proven Approach

Comments
3 min read
The Developer’s Guide to Real-Time Data Platforms!

7
Comments
6 min read
🌐 Get started: What is MongoDB operational data layer? (Part 2) 🌐

5
Comments
2 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1, Chinese version)

5
Comments
1 min read
Mastering SQL Joins and Unions: Integrate Data for Incredible Insights

Comments
6 min read
Feature Engineering: The Ultimate Guide

1
Comments
2 min read
What Apache Iceberg REST Catalog is and isn't

9
Comments
3 min read
Transforming Data Engineering: A Business Domain Approach with Data Mesh

Comments
5 min read
Speeding Up Data on AWS: From Ingestion to Insights

4
Comments
11 min read
Importing Data from CSV Files into Postgres: A Basic Skill for Data Engineers

Comments
1 min read
The Ultimate Guide to Data Analytics: Techniques and Tools.

Comments
3 min read
Building an Agnostic Data Pipeline: Pros and Cons

Comments
4 min read
🐚 My Pacific Dataviz Challenge 2024 submission : violence & graphdatascience

3
Comments 9
2 min read
Apache Doris for log and time series data analysis in NetEase, why not Elasticsearch and InfluxDB?

Comments
9 min read
Understanding RAID Levels: A Comprehensive Guide to RAID 0, 1, 5, 6, 10, and Beyond

6
Comments
9 min read
Understanding Your Data: The Essentials of Exploratory Data Analysis (EDA)

1
Comments
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub

5
Comments
16 min read
Data Lakehouse 101: The Who, What and Why of Data Lakehouses

Comments
7 min read