DEV Community
#bigdata
Posts
From Raw Claims and Clinical Data to PCORnet CDM: End-to-End ETL on Snowflake
SciForce
Dec 4 '25
#ai #healthcare #datascience #bigdata
7 min read
GSoC Student Crushes It! The Inside Story Behind the OIDC Upgrade for Apache DolphinScheduler
Chen Debra
Dec 4 '25
#apachedolphinscheduler #opensource #google #bigdata
10 min read
🔥 Day 4: RDD Internals - Partitions, Shuffles & Repartitioning Demystified
Sandeep
Dec 4 '25
#spark #dataengineering #bigdata #python
2 min read
🔥 Day 2: Understanding Spark Architecture - How Spark Executes Your Code Internally
Sandeep
Dec 2 '25
#spark #python #dataengineering #bigdata
2 min read
🔥 Day 5: Introduction to DataFrames - The Most Important Spark API
Sandeep
Dec 5 '25
#dataengineering #python #spark #bigdata
2 min read
Day 8: Accelerating Spark Joins - Broadcast, Shuffle Optimization & Skew Handling
Sandeep
Dec 9 '25
#dataengineering #python #spark #bigdata
2 min read
From Raw to Refined: Data Pipeline Architecture at Scale
Pradeep Kalluri
Nov 22 '25
#dataengineering #bigdata #python #dataquality
12 min read
Data Quality at Scale: Why Your Pipeline Needs More Than Green Checkmarks
Pradeep Kalluri
Nov 24 '25
#dataengineering #dataquality #bigdata #python
8 min read
Building Real-Time Lakehouse with S3 Tables, AWS Glue, and Apache Doris
Apache Doris
Nov 21 '25
#bigdata #lakehouse #database #apachedoris
3 min read
The Big Data Showdown: Apache Spark vs. Hadoop in 2026
Tech Croc
Dec 25 '25
#kafka #bigdata #googlecloud #datascience
5 reactions
4 min read
Starting My Dev.to Journey: Learning, Building & Sharing
Vishnu Garuda
Nov 21 '25
#introduction #coding #datascience #bigdata
1 min read
10x Query Performance Improvement: The Design and Implementation of the New Unique Key
Apache Doris
Nov 20 '25
#bigdata #olap #database #apachedoris
30 min read
How Does Apache SeaTunnel Convert CDC Streams to Append-Only Mode?
Apache SeaTunnel
Nov 20 '25
#apacheseatunnel #bigdata #opensource #developer
4 min read
6 Essential Data Formats in Cloud Analytics: A Complete Guide with Examples
Raj Shriwastava
Nov 18 '25
#cloud #bigdata #analytics #database
5 min read
Final Project Report 2| Apache SeaTunnel Adds Metalake Support
Apache SeaTunnel
Nov 14 '25
#apacheseatunnel #opensource #webdev #bigdata
4 min read