DEV Community

# bigdata

Posts

🏆How to master 📊 Big Data pipelines with Taipy and PySpark 🐍

218
Comments 8
9 min read
Big data models 📊 vs. Computer memory 💾

186
Comments 3
11 min read
Event Tracking and Analytics via Ruby on Rails, DynamoDB (with Streams), Kinesis Firehose and Athena and CloudWatch Dashboard!

88
Comments
13 min read
Here is why you need a message broker

57
Comments 4
7 min read
Data-Powered Accessibility: How to Build Inclusive Product for Any User Need

48
Comments
7 min read
Exploratory Data Analysis Using Python

44
Comments 1
5 min read
Machine Learning Lifecycle Process

43
Comments
4 min read
Starting your Journey with Big Data Analytics

37
Comments
4 min read
The Unbiased Guide to Choosing the Right BI Tool

37
Comments 1
5 min read
Creating a Spark Standalone Cluster with Docker and docker-compose(2021 update)

35
Comments 4
7 min read
Apache Spark, Hive, and Spring Boot — Testing Guide

35
Comments 4
18 min read
Best Online Courses for Data Engineers In 2021

32
Comments
7 min read
Data Engineering — Complete Reference Guide From A-Z [2019]

30
Comments
16 min read
What Are ETLs And Why We Use Them

30
Comments 2
14 min read
Automation and Machine Learning: A Match Made In Heaven

30
Comments 3
5 min read
AWS Data Lake with Terraform - Part 1 of 6

28
Comments
4 min read
Guide - AWS Glue and PySpark

26
Comments
14 min read
UPSERTS and DELETES using AWS Glue and Delta Lake

25
Comments 4
10 min read
Extending Business Intelligence Features of Kibana

22
Comments 1
4 min read
SQL-based INSERTS, DELETES and UPSERTS in S3 using AWS Glue 3.0 and Delta Lake

21
Comments 8
8 min read
Event Streaming and AWS Kinesis

21
Comments
4 min read
7 Real-Time Data Streaming Tools You Should Consider On Your Next Project

21
Comments 1
9 min read
The Big Data Bravura: Introducing Apache Spark

21
Comments 2
3 min read
Data Analyst vs Business Analyst

20
Comments 5
4 min read
Hadoop Installation on Windows 10 using WSL

20
Comments
7 min read
Life Beyond Kafka with Apache Pulsar

19
Comments
4 min read
How to prepare for the GCP Professional Data Engineer certification

19
Comments 1
8 min read
Elasticsearch as a primary database?

19
Comments
2 min read
Simplifying ETL Pipelines with SQL: Three Tips for Data Processing

18
Comments
3 min read
Performance capabilities of data warehouses and how Cube can help

18
Comments
18 min read
Unboxing a Database-How Databases Work Internally

17
Comments 4
11 min read
Deep Data Dive with Kusto for Azure Data Explorer and Log Analytics

17
Comments 1
7 min read
AWS Data Lake with Terraform - Part 2 of 6

17
Comments
2 min read
There will be 175 Zettabytes of data in the world by 2025. Where will we store it?

17
Comments 2
1 min read
3 Ways To Improve Your Data Science Teams Efficiency

17
Comments
7 min read
Data lakes are hard

17
Comments
4 min read
Top Technology Telegram Channels for IT professionals

16
Comments
2 min read
4 best opensource projects about big data you should try out

16
Comments 3
3 min read
AWS Certified Big Data: Specialty study blueprint

16
Comments
18 min read
Cube Cloud Deep Dive: Starting a New Cube App

16
Comments
9 min read
Dynamic way doing ETL through Pyspark

16
Comments 2
4 min read
What Is Trino And Why Is It Great At Processing Big Data

15
Comments
7 min read
Spark. Anatomy of Spark application

15
Comments
6 min read
Simulate IoT sensor, use Kafka to process data in real-time, save to Elasticsearch

15
Comments
4 min read
Kafka Getting Started - Kafka Series - Part 2

15
Comments
4 min read
ETLs vs ELTs: Why are ELTs Disrupting the Data Market?

15
Comments
8 min read
Big Data in Cloud Computing - AWS

14
Comments
2 min read
Cleaning And Normalizing Data Using AWS Glue DataBrew

14
Comments 2
9 min read
PySpark and Parquet - Analysis

14
Comments
3 min read
Basic introduction to Big data

14
Comments
3 min read
Building an Apache ECharts dashboard with React and Cube

14
Comments
11 min read
Auto discovering and auto actions in data monitoring or How to drink coffee instead of routine tasks

13
Comments
9 min read
Building Hadoop native libraries on Mac in 2019

13
Comments 18
5 min read
How discord manage 300M socket connection

13
Comments
2 min read
Tutorial: Getting started with Azure Data Explorer using the Go SDK

13
Comments
9 min read
AWS: Redshift – quick start and SQL-workbench connection configuration

13
Comments
4 min read
Data Engineering skills

13
Comments 1
3 min read
Azure Blob Storage with Pyspark

12
Comments 1
2 min read
Using PySpark and AWS Glue to analyze multi-line log files

12
Comments 1
5 min read
Getting started with Spark

12
Comments 2
6 min read