DEV Community

# bigdata

Posts

SQL Filtering and Sorting with Real-life Examples

Comments
4 min read
Building an Application with Change Data Capture (CDC) Using Debezium, Kafka, and NiFi

Comments
3 min read
5 Effective Methods for Extracting Images from Web Pages

Comments
3 min read
Goodbye Kafka: Build a Low-Cost User Analysis System

Comments
5 min read
Query 1B Rows in PostgreSQL >25x Faster with Squirrels!

Comments 8
5 min read
Introduction to Hadoop:)

6
Comments
10 min read
The Heart of DolphinScheduler: In-Depth Analysis of the Quartz Scheduling Framework

8
Comments
3 min read
Big Data

Comments
1 min read
Introduction to Data lakes: The future of big data storage

10
Comments
2 min read
The Apache Iceberg™ Small File Problem

5
Comments
3 min read
System Design 09 - Data Partitioning: Dividing to Conquer Big Data

Comments
2 min read
Understanding Star Schema vs. Snowflake Schema

Comments
1 min read
Introduction to Messaging Systems with Kafka

Comments
16 min read
Best Practices for Data Security in Big Data Projects

Comments
6 min read
🚀 Unlock the Power of ORC File Format 📊

5
Comments
1 min read
SeaTunnel-Powered Data Integration: How 58 Group Handles Over 500 Billion+ Data Points Daily

5
Comments 2
5 min read
5 Big Data Use Cases that Retailers Fail to Use for Actionable Insights

Comments
3 min read
From ETL and ELT to Reverse ETL

Comments
4 min read
Introduction to Big Data Analysis

8
Comments
13 min read
How Big Data is Powering the Internet of Things (IoT) Revolution - MasTech InfoTrellis

Comments
4 min read
Why Scala is the Best Choice for Big Data Applications: Advantages Over Java and Python

Comments
6 min read
Processing 20 Million Records in Under 5 Seconds with Apache Hive

10
Comments
8 min read
SeaTunnel Community Monthly Report For September

Comments
14 min read
Efficient Scraping of JavaScript Web Pages

Comments
3 min read
Tracking Data Over Time: Slowly Changing Dimensions (SCD)

Comments
6 min read
Big Data Challenges and Solutions: Navigating the Complex Landscape

Comments
7 min read
Five Steps to Scraping Multiple Images with Python

Comments
2 min read
Simplifying Real-Time Data Ingestion with Apache NiFi

1
Comments
3 min read
Introduction to Big Data

5
Comments 2
2 min read
Why Apache Spark RDD is immutable?

Comments
3 min read
Reducing Delivery Times and Costs: How Machine Learning Optimizes Delivery Routes Efficiently

1
Comments 1
3 min read
Hands-on introduction to Apache Iceberg

8
Comments 2
8 min read
Embarking on the Big Query Quest: Exploring the Depths of its Inner Workings

Comments
5 min read
The Journey From a CSV File to Apache Hive Table

5
Comments
6 min read
How to Become an Apache SeaTunnel Committer?

1
Comments
4 min read
Building a Big Data Playground Sandbox for Learning

5
Comments
5 min read
Big Data Storage Trends and Insights

Comments
7 min read
Data Analysis: The Power of Big Data and Analytics in Decision Making 📊

Comments
3 min read
Cassandra vs. MongoDB: Choosing the Right NoSQL Database

Comments
3 min read
Which Data Synchronization Method is More Senior?

1
Comments
8 min read
Journey Through Spark SQL

Comments
11 min read
Scala vs. Java: The Superior Choice for Big Data and Machine Learning

1
Comments 1
11 min read
Understanding Data Schemas

Comments
5 min read
The Ultimate Guide to Data Analytics: Unlocking the Power of Data

Comments
3 min read
Data Showdown: OLAP vs. OLTP – The Battle of Real-Time and Analytics Titans

Comments
5 min read
Optimize ETL Processes with Apache Iceberg: A Game Changer

Comments
4 min read
The Must-Have Features of Modern Data Transformation Tools

Comments
6 min read
An End-to-End Guide to dbt (Data Build Tool) with a Use Case Example

4
Comments
4 min read
To Index Data is To Sort Data

8
Comments
5 min read
How to install Apache Kafka on Ubuntu with KRaft Mode (without Zookeeper): A Step-by-Step Guide

Comments
10 min read
Using ReAct Agents LLMs to Draw Insights from Tabular Data

8
Comments
7 min read
Data Driven Dreams: Building My Data Science Career

Comments
4 min read
A Beginner's Guide To Data Engineering Concepts, Tools, And Responsibilities.

Comments
1 min read
Optimizing Transformations in Pentaho: Case Study

Comments
3 min read
Loading data to Google Big Query using Dataproc workflow templates and cloud Schedule

2
Comments
12 min read
Data Visualisation Basics

9
Comments
7 min read
Connecting AI with Excel - Talk to Your Spreadsheets

2
Comments
6 min read
Demystifying Data Science: A Beginner’s Guide!

Comments
3 min read
Data Lakes vs. Data Warehouses: Choosing the Right Big Data Architecture

1
Comments
4 min read
How to Install Hadoop on Ubuntu: A Step-by-Step Guide

1
Comments
10 min read