DEV Community

# bigdata

Posts

How I Built a Big Data Survival Guide - Because My Semester Was Not Surviving Me

2
Comments 1
3 min read
build-my-own-datalake: Improve metadata with caching

4
Comments
19 min read
From TB-Scale MongoDB to Doris: 5 Critical Challenges and Fixes with Apache SeaTunnel

4
Comments
9 min read
(I) An Overview of Data Warehouses and Data Lakes

3
Comments
4 min read
Fuzzy-match millions of rows in Databricks (2026)

9
Comments
5 min read
Overview of Real-Time Data Synchronization from PostgreSQL to VeloDB

Comments
5 min read
Bigtable vs BigQuery: What’s the difference? (2026 Guide)

5
Comments
4 min read
SeaTunnel CDC Explained: A Layman’s Guide

Comments
7 min read
Deep Dive into SeaTunnel Metadata Caching: The Underlying Logic Supporting Tens of Thousands of Concurrent Tasks

Comments
5 min read
Schemas and Data Modelling in Power BI

6
Comments
7 min read
How to Auto-Label your Segmentation Dataset with SAM3

1
Comments
10 min read
A Real-World Approach to Splitting Analytics Workloads Between Databricks and Trino

1
Comments
2 min read
BigQuery Sharing: An Underrated Data Exchange Platform You Should Know

Comments 1
4 min read
Why Apache Ozone is the Preferred Object Store for Big Data

Comments
3 min read
Part 1 | A Scheduler Is More Than Just a “Timer”

1
Comments
4 min read