DEV Community

# dataengineering

Posts

Introduction to Python for Data Engineering

4
Comments
5 min read
Kubernetes Was Never Designed for Batch Jobs

3
Comments 2
17 min read
Data Engineering 102: Introduction to Python for Data Engineering.

6
Comments
10 min read
Introduction to Python for Data Engineering

4
Comments
7 min read
INTRODUCTION TO PYTHON FOR DATA ENGINEERING

Comments
4 min read
DATA ENGINEERING 101: INTRODUCTION TO DATA ENGINEERING.

5
Comments
2 min read
Fundamentals of Data Engineering

6
Comments
9 min read
Data Engineering 101: Introduction to Data Engineering

5
Comments
2 min read
Online SQL Client for low code data management

5
Comments 1
5 min read
Data Engineering 101: Introduction to Data Engineering.

7
Comments 1
6 min read
Introduction to data engineering

5
Comments
4 min read
Create Jira Ticket on Prefect Task Failure

1
Comments
2 min read
Hash Personal Identifiable Information (PII) in your ELT pipelines

3
Comments
3 min read
Difference Between Data Engineer and Data Scientist?

7
Comments
3 min read
Learning Workflow Schedulers (Oozie)

2
Comments
5 min read
Solving AttributeError: 'float' object has no attribute 'rint'

5
Comments
2 min read
[Spark-k8s] — Getting started # Part 1

3
Comments
4 min read
Websites to find Dataset for your Data Engineering projects.

5
Comments
1 min read
Data engineers must-see: The future trend of big data cloud services

8
Comments 1
8 min read
Data Engineering Projects for Beginners

24
Comments 2
2 min read
Data Pipelines with Apache Airflow - Book Review

8
Comments
2 min read
ETL vs Interactive Queries: The Case for Both

6
Comments
8 min read
Data Engineering - Creating a Streaming Data Pipeline for a Real-Time Dashboard with Dataflow

10
Comments
4 min read
Parsing logs from multiple data sources with Ahana and Cube

14
Comments
24 min read
Solved a practical business problem when using Hudi: LakeSoul supports null field non-override semantics

7
Comments
3 min read
What is the Lakehouse, the latest Direction of Big Data Architecture?

9
Comments
10 min read
Making Data Engineering Easier: Operational Analytics With Event Streaming and Reverse ETL

7
Comments
6 min read
Using dbt for Transformation Tasks on BigQuery

10
Comments 1
4 min read
Docker and Kubernetes

6
Comments
3 min read
How to Use Apache Airflow to Get 1000+ Files From a Public Dataset

8
Comments
10 min read
ELT Data Pipeline with Kubernetes CronJob, Azure Data Lake, Azure Databricks (Part 1)

9
Comments
5 min read
What is Azure Synapse Analytics?

4
Comments
7 min read
Design concept of a best opensource project about big data and data lakehouse

9
Comments
9 min read
When To Build vs. Buy Data Pipelines

3
Comments
6 min read
How to prepare for the GCP Professional Data Engineer certification

31
Comments 4
8 min read
Details of 4 best opensource projects about big data you should try out(Ⅰ)

8
Comments
5 min read
Debezium Change Data Capture without Kafka Connect

10
Comments 1
8 min read
Building GCS Buckets and BigQuery Tables with Terraform

4
Comments
4 min read
Released yuniql v1.2.25. Multi-tenant support, Oracle and largest set of bug fixes

5
Comments
4 min read
Quick use of CDC: A new demo from lakesoul makes it easier to set up the environment

8
Comments
5 min read
Considerations when performing ETL

4
Comments
3 min read
4 best opensource projects about big data you should try out

16
Comments 3
3 min read
Preparing for Professional Cloud Data Engineer Certification (March 2022)

3
Comments 2
12 min read
A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions

8
Comments
2 min read
[OPINION] Building a Career as a Data Engineer

2
Comments
2 min read
What is Data Profiling?

2
Comments
1 min read
[TIP] Enter the world of Data Engineering with Brazilian professionals who have become references in the field!

6
Comments
2 min read
Data architecture models

3
Comments
6 min read
[ARTICLE] Data Warehouse, Data Lake, and Data Lakehouse: Concepts and Differences

6
Comments
3 min read
Enabling the Customer Data Stack: RudderStack Series B Funding

2
Comments
1 min read
Kestra, infinitely scalable open source orchestration and scheduling platform.

4
Comments
6 min read
Modern data warehouse patterns: ELT with Snowflake variants

9
Comments
6 min read
Standing on the shoulders of giants. Part one: Airflow

7
Comments
5 min read
Data Engineering in Julia

4
Comments 1
1 min read
How Engineering Teams Use RudderStack to Support Marketing

6
Comments
7 min read
Why It’s Hard for Engineering to Support Marketing

2
Comments
3 min read
Introduction to Data Engineering

2
Comments
5 min read
Extract csv data and load it to PostgreSQL using Meltano ELT

9
Comments
6 min read
Data Engineering Pipeline with AWS Step Functions, CodeBuild and Dagster

9
Comments 4
10 min read
What Is Event-Driven Machine Learning?

6
Comments
4 min read