DEV Community
#dataengineering
🌍 Automating Africa’s Energy Data Collection Using Python, Playwright(+Why Playwright ?), and MongoDB (2000–2024)
John Wakaba, Nov 4 '25
#dataengineering #mongodb #python #automation
5 reactions, 5 min read

From 8 Minutes to 40 Seconds: Solving Data Pipeline Deployment Bottlenecks with Git Sparse Checkout
Byron Hsieh, Nov 6 '25
#git #devops #dataengineering #azure
5 min read

Create a Microsoft Fabric Lakehouse
lotanna obianefo, Oct 2 '25
#database #dataengineering #cloudcomputing #datascience
5 reactions, 6 min read

Core Concepts of Kafka
Farhan Khan, Oct 2 '25
#architecture #beginners #dataengineering
8 min read

From Kafka to Clean Tables: Building a Confluent Snowflake Pipeline with Streams & Tasks.
elisha lukalia, Oct 2 '25
#dataengineering #automation #tutorial #sql
9 min read

Apache Kafka: ZooKeeper vs. KRaft — A Complete Comparison of Approaches
Leanid Herasimau, Oct 3 '25
#architecture #backend #dataengineering
6 min read

Introduction to Apache Airflow
John Kioko, Oct 6 '25
#dataengineering #beginners #learning #python
1 reaction, 4 min read

Building a Production-Ready Data Lake: PostgreSQL to S3 with AWS DMS, Glue, and Athena using CDK
André Paris, Oct 14 '25
#aws #dataengineering #typescript #postgres
2 reactions, 8 min read

From Postgres to Iceberg
Brian Misachi, Nov 5 '25
#database #postgres #dataengineering
1 reaction, 11 min read

Real-Time Cryptocurrency Data Pipeline
Lagat Josiah, Nov 4 '25
#cryptocurrency #dataengineering #python #monitoring
12 min read

Synthetic Data for RAG: Safe Generation, Deduplication, and Drift-Aware Curation in 2025
Kuldeep Paul, Oct 14 '25
#dataengineering #rag #testing #llm
2 reactions, 10 min read

Personal Picks: Data Product News (October 1, 2025)
Sagara, Oct 1 '25
#dataengineering #snowflake #databricks
7 min read

1 billion JSON records, 1-second query response: Apache Doris vs. ClickHouse, Elasticsearch, and PostgreSQL
Apache Doris, Nov 4 '25
#bigdata #database #olap #dataengineering
6 reactions, 7 min read

SQL: is there a better way to code this?
David Kershaw, Nov 4 '25
#sql #csv #database #dataengineering
1 comment, 1 min read

Building Real-Time Data Pipelines from PostgreSQL Using Flink CDC
Lakshmi Narayana Rasalay, Oct 4 '25
#architecture #dataengineering #postgres
5 min read