DEV Community

# bigdata

Posts

Reducing Delivery Times and Costs: How Machine Learning Optimizes Delivery Routes Efficiently
3 min read
Best Practices for Data Security in Big Data Projects
6 min read
Big Data Storage Trends and Insights
7 min read
5 Big Data Use Cases that Retailers Fail to Use for Actionable Insights
3 min read
Hands-on introduction to Apache Iceberg
8 min read
From ETL and ELT to Reverse ETL
4 min read
How Big Data is Powering the Internet of Things (IoT) Revolution - MasTech InfoTrellis
4 min read
Building a Big Data Playground Sandbox for Learning
5 min read
Why Scala is the Best Choice for Big Data Applications: Advantages Over Java and Python
6 min read
SeaTunnel Community Monthly Report For September
14 min read
Tracking Data Over Time: Slowly Changing Dimensions (SCD)
6 min read
Big Data Challenges and Solutions: Navigating the Complex Landscape
7 min read
The Journey From a CSV File to Apache Hive Table
6 min read
Why Apache Spark RDD is immutable?
3 min read
Embarking on the Big Query Quest: Exploring the Depths of its Inner Workings
5 min read
How to Become an Apache SeaTunnel Committer?
4 min read
Data Analysis: The Power of Big Data and Analytics in Decision Making 📊
3 min read
Cassandra vs. MongoDB: Choosing the Right NoSQL Database
3 min read
Which Data Synchronization Method is More Senior?
8 min read
Journey Through Spark SQL
11 min read
Connecting AI with Excel - Talk to Your Spreadsheets
6 min read
Scala vs. Java: The Superior Choice for Big Data and Machine Learning
11 min read
Understanding Data Schemas
5 min read
The Ultimate Guide to Data Analytics: Unlocking the Power of Data
3 min read
Data Showdown: OLAP vs. OLTP – The Battle of Real-Time and Analytics Titans
5 min read
Optimize ETL Processes with Apache Iceberg: A Game Changer
4 min read
Data Visualisation Basics
7 min read
The Must-Have Features of Modern Data Transformation Tools
6 min read
An End-to-End Guide to dbt (Data Build Tool) with a Use Case Example
4 min read
To Index Data is To Sort Data
5 min read
How to install Apache Kafka on Ubuntu with KRaft Mode (without Zookeeper): A Step-by-Step Guide
10 min read
Using ReAct Agents LLMs to Draw Insights from Tabular Data
7 min read
Data Driven Dreams: Building My Data Science Career
4 min read
A Beginner's Guide To Data Engineering Concepts, Tools, And Responsibilities.
1 min read
Optimizing Transformations in Pentaho: Case Study
3 min read
Loading data to Google Big Query using Dataproc workflow templates and cloud Schedule
12 min read
Demystifying Data Science: A Beginner’s Guide!
3 min read
Data Lakes vs. Data Warehouses: Choosing the Right Big Data Architecture
4 min read
How to Install Hadoop on Ubuntu: A Step-by-Step Guide
10 min read
🤔 Is It Possible to Achieve 100% Test Automation?
2 min read
Data ingestion – definition, types and best practices
8 min read
How to Handle Databases with Billions of Records
1 min read
Effective Strategies for Scaling Databases: Enhancing Performance for Growing Data Needs
5 min read
Databricks - Variant Type Analysis
7 min read
Working with Parquet files in Java using Carpet
6 min read
Optimizing ETL Processes for Efficient Data Loading in EDWs
4 min read
Patient-Centered Care and Data Integration in Population Health Management
4 min read
The Basics of Big Data: What You Need to Know
3 min read
Why Apache Doris is the Best Open Source Alternative to Rockset
3 min read
Introduction to Apache Hadoop & MapReduce
3 min read
Blazingly-Fast Serialization: Apache Fury 0.5.1 released
3 min read
Metadata for win — Apache Parquet

5 min read
Comprehensive Guide to Schema Inference with MongoDB Spark Connector in PySpark
3 min read
Advanced Insights into Automated Data Processing Tools
4 min read
Real-Time Sentiment Analysis using PySpark and FastAPI
1 min read
How to Build an API with Strong Security Measures
4 min read
Documenting Rate Limits and Throttling in REST APIs
5 min read
GraphQL API Design Best Practices for Efficient Data Management
5 min read
The current Lakehouse is like a false proposition
10 min read
Is distributed technology the panacea for big data processing?
10 min read