DEV Community

# bigdata

Posts

How to fuzzy-match 1M rows with dbt in under 10 minutes (2026 guide)

Comments 1
4 min read
How to Choose Between Serverless and Dedicated Compute in Databricks

3
Comments
3 min read
Orchestrating Our Way Out of Chaos: How I Compared Airflow, Prefect, and Dagster (and Picked What to Ship)

2
Comments
6 min read
Part 3 | How Does Scheduling Actually “Start Running”?

4
Comments
5 min read
How to Implement Data Modelling in Power BI

2
Comments
2 min read
The future of Data Engineering in Databricks - From Pipelines to Intent

2
Comments
2 min read
Designing a Cross-Cloud Data Plane with Apache Iceberg

2
Comments
5 min read
How to Size a Spark Cluster. And How Not To.

2
Comments
6 min read
How I Built a Big Data Survival Guide - Because My Semester Was Not Surviving Me

2
Comments 1
3 min read
build-my-own-datalake: Improve metadata with caching

4
Comments
19 min read
From TB-Scale MongoDB to Doris: 5 Critical Challenges and Fixes with Apache SeaTunnel

4
Comments
9 min read
(I) An Overview of Data Warehouses and Data Lakes

3
Comments
4 min read
Fuzzy-match millions of rows in Databricks (2026)

9
Comments
5 min read
Overview of Real-Time Data Synchronization from PostgreSQL to VeloDB

Comments
5 min read
Bigtable vs BigQuery: What’s the difference? (2026 Guide)

5
Comments
4 min read