DEV Community

# dataengineering

Posts

Docker and Kubernetes

6
Comments
3 min read
How to Use Apache Airflow to Get 1000+ Files From a Public Dataset

8
Comments
10 min read
ELT Data Pipeline with Kubernetes CronJob, Azure Data Lake, Azure Databricks (Part 1)

10
Comments
5 min read
What is Azure Synapse Analytics?

4
Comments
7 min read
Design concept of a best opensource project about big data and data lakehouse

9
Comments
9 min read
When To Build vs. Buy Data Pipelines

3
Comments
6 min read
Details of 4 best opensource projects about big data you should try out(Ⅰ)

8
Comments
5 min read
Debezium Change Data Capture without Kafka Connect

10
Comments 1
8 min read
Building GCS Buckets and BigQuery Tables with Terraform

4
Comments
4 min read
Released yuniql v1.2.25. Multi-tenant support, Oracle and largest set of bug fixes

5
Comments
4 min read
Quick use of CDC: A new demo from lakesoul makes it easier to set up the environment

8
Comments
5 min read
Considerations when performing ETL

4
Comments
3 min read
4 best opensource projects about big data you should try out

16
Comments 3
3 min read
A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions

8
Comments
2 min read
Preparing for Professional Cloud Data Engineer Certification (March 2022)

3
Comments 4
12 min read
[OPINION] Building a Career as a Data Engineer

2
Comments
2 min read
What is Data Profiling?

2
Comments
1 min read
How to prepare for the GCP Professional Data Engineer certification

35
Comments 7
8 min read
[TIP] Enter the world of Data Engineering with Brazilian professionals who have become leaders in the field!

6
Comments
2 min read
Data architecture models

4
Comments
6 min read
[ARTICLE] Data Warehouse, Data Lake, and Data Lakehouse: Concepts and Differences

6
Comments
3 min read
Enabling the Customer Data Stack: RudderStack Series B Funding

2
Comments
1 min read
Kestra, infinitely scalable open source orchestration and scheduling platform.

5
Comments
6 min read
Modern data warehouse patterns: ELT with Snowflake variants

9
Comments
6 min read
Standing on the shoulders of giants. Part one: Airflow

7
Comments
5 min read