DEV Community

# bigdata

Posts

Data Quality at Scale: Why Your Pipeline Needs More Than Green Checkmarks

Comments
8 min read
From Raw to Refined: Data Pipeline Architecture at Scale

Comments
12 min read
Building Real-Time Lakehouse with S3 Tables, AWS Glue, and Apache Doris

Comments
3 min read
Starting My Dev.to Journey: Learning, Building & Sharing

Comments
1 min read
10x Query Performance Improvement: The Design and Implementation of the New Unique Key

Comments
30 min read
How Does Apache SeaTunnel Convert CDC Streams to Append-Only Mode?

Comments
4 min read
6 Essential Data Formats in Cloud Analytics: A Complete Guide with Examples

Comments
5 min read
Why Parquet Is Everywhere - And What Makes It Actually Fast?

2
Comments
3 min read
Final Project Report 2| Apache SeaTunnel Adds Metalake Support

Comments
4 min read
Final Project Report 1: Schema Evolution Support on Apache SeaTunnel Flink Engine

Comments
4 min read
Enabling Continuous Deployment with Amazon Elastic Container Service and Infrastructure as Code

Comments
6 min read
From DataWareHouses to BigData Systems: What and Why - Questions that nobody asks, but you should!

Comments
6 min read
Migration Case: From Azkaban to DolphinScheduler

Comments
4 min read
1 billion JSON records, 1-second query response: Apache Doris vs. ClickHouse, Elasticsearch, and PostgreSQL

5
Comments
7 min read
The data lakehouse evolution

Comments
11 min read
How to build real-time user-facing analytics with Kafka + Flink + Doris

4
Comments
9 min read
Apache DolphinScheduler 3.3.2 Released! Major Updates in Performance and Stability

Comments
3 min read
📊🔍 OpenSearch Dashboards: Optimizing Massive Data Queries (Big Data) with Asynchronous Search

3
Comments
2 min read
🐝 Why Hive Exists - And Why Its Complexity Is Actually Necessary

2
Comments
3 min read
Building a Universal Lakehouse Catalog: Beyond Iceberg Tables

Comments
10 min read
(1) Emerging Data Lakehouse Handbook (2025): Concepts and Design of Data Warehouse Layering

Comments
5 min read
Tutorial: Intro to Apache Iceberg with Apache Polaris and Apache Spark

Comments
20 min read
Understanding Apache Spark the Easy Way

1
Comments
1 min read
Fueling the Future: How Big Data and AI are Unlocking Green Hydrogen's Potential

5
Comments
6 min read
Code Green: How Big Data and AI are Engineering a Sustainable Planet

Comments
8 min read