DEV Community
#dataengineering
Posts
[Spark-k8s] — Getting started # Part 1
Tiago Xavier
Jul 19 '22
#spark #kubernetes #dataengineering
2 reactions
4 min read
Websites to find Dataset for your Data Engineering projects.
SAIFULLAH🇮🇳
Jul 17 '22
#datascience #dataengineering #database #dataset
5 reactions
1 min read
Data engineers must-see: The future trend of big data cloud services
DMetaSoul
Jun 26 '22
#database #dataengineering #bigdata #opensource
8 reactions
1 comment
8 min read
Data Engineering Projects for Beginners
Ramses Alexander Coraspe
Jun 15 '22
#database #dataengineering #python
24 reactions
2 comments
2 min read
Data Pipelines with Apache Airflow - Book Review
Albert Ulysses
Jun 13 '22
#python #dataengineering #books #bigdata
8 reactions
2 min read
ETL vs Interactive Queries: The Case for Both
Monica Miller
Jun 13 '22
#database #dataengineering #datascience #etl
6 reactions
8 min read
Data Engineering - Creating a Streaming Data Pipeline for a Real-Time Dashboard with Dataflow
Nkwam Philip
Jun 9 '22
#database #googlecloud #dataengineering #cloud
10 reactions
4 min read
Parsing logs from multiple data sources with Ahana and Cube
Bartosz Mikulski for Cube
Jun 8 '22
#ahana #presto #dataengineering
14 reactions
24 min read
Solved a practical business problem when using Hudi: LakeSoul supports null field non-override semantics
DMetaSoul
May 29 '22
#opensource #database #dataengineering #bigdata
7 reactions
3 min read
What is the Lakehouse, the latest Direction of Big Data Architecture?
DMetaSoul
May 14 '22
#opensource #dataengineering #bigdata #database
9 reactions
10 min read
Making Data Engineering Easier: Operational Analytics With Event Streaming and Reverse ETL
Team RudderStack for RudderStack
Apr 4 '22
#datascience #eventstreaming #dataengineering #customerdataplatform
7 reactions
6 min read
Using dbt for Transformation Tasks on BigQuery
Cem Keskin
May 2 '22
#dezoomcamp #dataengineering
10 reactions
1 comment
4 min read
Docker and Kubernetes
Abdullah Paracha for AWS Community Builders
Apr 27 '22
#devops #dataengineering #softwareengineer #programming
6 reactions
3 min read
How to Use Apache Airflow to Get 1000+ Files From a Public Dataset
Cem Keskin
Apr 24 '22
#dezoomcamp #dataengineering #airflow #publicdatasets
8 reactions
10 min read
ELT Data Pipeline with Kubernetes CronJob, Azure Data Lake, Azure Databricks (Part 1)
Rubens Barbosa
Apr 24 '22
#etl #dataengineering #python #azure
8 reactions
5 min read
What is Azure Synapse Analytics?
Matt Eland
Apr 22 '22
#data #dataengineering #azure #cloud
4 reactions
7 min read
Design concept of a best opensource project about big data and data lakehouse
DMetaSoul
Apr 16 '22
#opensource #dataengineering #bigdata #datascience
9 reactions
9 min read
When To Build vs. Buy Data Pipelines
Team RudderStack
Apr 11 '22
#cdp #datascience #datapipeline #dataengineering
3 reactions
6 min read
How to prepare for the GCP Professional Data Engineer certification
Gabriel Luz
May 2 '22
#googlecloud #dataengineering #gcp #bigdata
28 reactions
4 comments
8 min read
Details of 4 best opensource projects about big data you should try out(Ⅰ)
DMetaSoul
Apr 7 '22
#opensource #dataengineering #bigdata #spark
8 reactions
5 min read
Debezium Change Data Capture without Kafka Connect
Ludovic DEHON for Kestra
Apr 5 '22
#database #dataengineering #opensource #etl
10 reactions
1 comment
8 min read
Building GCS Buckets and BigQuery Tables with Terraform
Cem Keskin
Apr 3 '22
#dezoomcamp #dataengineering #bigquery #gcp
3 reactions
4 min read
Released yuniql v1.2.25. Multi-tenant support, Oracle and largest set of bug fixes
Rodel E. Dagumampan
Feb 22 '22
#yuniql #database #devops #dataengineering
5 reactions
4 min read
Quick use of CDC: A new demo from lakesoul makes it easier to set up the environment
DMetaSoul
Mar 25 '22
#opensource #dataengineering #bigdata #spark
8 reactions
5 min read
Considerations when performing ETL
Raphael Gutierrez
Mar 25 '22
#dataengineering #etl
4 reactions
3 min read
4 best opensource projects about big data you should try out
DMetaSoul
Mar 24 '22
#opensource #dataengineering #bigdata #spark
16 reactions
3 comments
3 min read
Preparing for Professional Cloud Data Engineer Certification (March 2022)
David Cox
Mar 23 '22
#googlecloud #cloud #dataengineering
3 reactions
2 comments
12 min read
A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions
DMetaSoul
Mar 17 '22
#dataengineering #opensource #bigdata #programming
8 reactions
2 min read
[OPINIÃO] Construindo uma Carreira como Data Engineer
Lis R. Barreto
Mar 9 '22
#bigdata #dataengineering #tips
2 reactions
2 min read
What is Data Profiling?
Preeti Hemant
Mar 5 '22
#data #dataengineering #shortread #analyticsengineering
2 reactions
1 min read
[DICA] Adentre o universo da Engenharia de Dados com profissionais brasileiros que se tornaram referência na área!
Lis R. Barreto
Feb 24 '22
#bigdata #dados #dataengineering #career
6 reactions
2 min read
Data architecture models
Preeti Hemant
Feb 23 '22
#datamesh #data #dataengineering #etl
3 reactions
6 min read
[ARTIGO] Data Warehouse, Data Lake e Data Lakehouse: Conceitos e Diferenças
Lis R. Barreto
Feb 18 '22
#bigdata #data #dataengineering
6 reactions
3 min read
Enabling the Customer Data Stack: RudderStack Series B Funding
Team RudderStack for RudderStack
Feb 14 '22
#seriesb #cdp #rudderstack #dataengineering
2 reactions
1 min read
Kestra, infinitely scalable open source orchestration and scheduling platform.
Ludovic DEHON for Kestra
Feb 2 '22
#dataengineering #data #etl #opensource
4 reactions
6 min read
Modern data warehouse patterns: ELT with Snowflake variants
biellls
Jan 29 '22
#snowflake #dataengineering #etl #elt
9 reactions
6 min read
Standing on the shoulders of giants. Part one: Airflow
biellls
Jan 21 '22
#airflow #dataengineering #python #data
7 reactions
5 min read
Data Engineering in Julia
Logan Kilpatrick
Jan 29 '22
#machinelearning #julia #dataengineering #julialang
4 reactions
1 comment
1 min read
How Engineering Teams Use RudderStack to Support Marketing
Team RudderStack for RudderStack
Jan 3 '22
#dataengineering #marketing #dataanalytics #datawarehouse
6 reactions
7 min read
Data Engineering Pipeline with AWS Step Functions, CodeBuild and Dagster
Tomas
Dec 30 '21
#aws #dagster #dataengineering #python
9 reactions
4 comments
10 min read
Why It’s Hard for Engineering to Support Marketing
Team RudderStack for RudderStack
Dec 28 '21
#dataengineering #etl #marketing #dataanlytics
2 reactions
3 min read
Introduction to Data Engineering
OLABAYO BALOGUN
Dec 23 '21
#dataengineering #programming #datascience #database
2 reactions
5 min read
Extract csv data and load it to PostgreSQL using Meltano ELT
Jorge PM
Dec 22 '21
#python #meltano #dataengineering #elt
9 reactions
6 min read
What Is Event-Driven Machine Learning?
Team RudderStack for RudderStack
Dec 17 '21
#eventdriven #machinelearning #cdp #dataengineering
6 reactions
4 min read
Host a fully persisted Apache NiFi service with docker
Cribber
Nov 10 '21
#docker #apache #nifi #dataengineering
3 reactions
1 min read
Relational data models
Barbara
Nov 29 '21
#sql #dataengineering #learn #sketchnote
5 reactions
2 min read
Implementing Graceful Shutdown in Go
Team RudderStack for RudderStack
Nov 26 '21
#go #dataengineering #sre #database
15 reactions
5 comments
14 min read
Open Source Analytics Stack: Bringing Control, Flexibility, and Data-Privacy to Your Analytics
Team RudderStack for RudderStack
Nov 25 '21
#privacy #analytics #dataanalytics #dataengineering
4 reactions
1 comment
10 min read
RudderStack + Blendo: Better Together
Team RudderStack for RudderStack
Nov 25 '21
#blendo #dataengineering #etl #cdp
2 reactions
7 min read
Web Scraping Sprott U Fund with BS4 in 10 Lines of Code
CincyBC
Nov 24 '21
#python #beautifulsoup #dataengineering
30 reactions
3 min read
RudderStack’s Licensing Explained
Team RudderStack for RudderStack
Nov 24 '21
#cdp #dataengineering #rudderstack #privacy
3 reactions
4 min read
Introducing RudderStack's New, High-performance JavaScript SDK
Team RudderStack for RudderStack
Nov 22 '21
#javascript #sdk #rudderstack #dataengineering
2 reactions
3 min read
The Open Source Story - Open Sourcing RudderStack Blog and Docs
Team RudderStack for RudderStack
Nov 19 '21
#opensource #gatsby #cdp #dataengineering
3 reactions
5 min read
4 Reasons Why Data Engineers Hate Google Tag Manager
Team RudderStack for RudderStack
Nov 15 '21
#dataengineering #googletagmanager #dataanlytics
3 reactions
6 min read
Overcoming the Limitations of Client-Side Form Tracking With Webhooks
Team RudderStack for RudderStack
Nov 2 '21
#webhook #javascript #dataengineering #jamstack
4 reactions
6 min read
The Data Engineering Megatrend: A Brief History
Team RudderStack for RudderStack
Oct 28 '21
#datascience #dataengineering #database #rudderstack
2 reactions
7 min read
RudderStack Product News Vol. #013 - Destinations Re-design and New Integrations
Team RudderStack for RudderStack
Oct 19 '21
#ui #dataengineering #privacy
2 reactions
2 min read
Data Engineering:Extract, Transform,and Load Using Talend Open Studio.
WanjohiChristopher
Oct 29 '21
#dataengineering #etl #python #talend
21 reactions
1 comment
3 min read
Stream Your Database Changes with Change Data Capture: Part Two
Taron Foxworth
Sep 2 '21
#database #cdc #changedatacapture #dataengineering
6 reactions
10 min read
Why the Cloud SaaS Tools Used by Marketing, Sales, and Product Teams Create Data Silos
Team RudderStack for RudderStack
Aug 30 '21
#cloudsaas #dataengineering #datascience #cdp
3 reactions
5 min read