DEV Community

# bigdata

Posts

A Look at the Long-Lasting Java and Big Data Relationship (With a List of Resources Data Scientists Can Use for Java Learning)

5
Comments 1
8 min read
BIG DATA COURSE

3
Comments
3 min read
What In The World Is Dremio And Why Is It Valued At 1 Billion Dollars?

5
Comments
7 min read
The ugly truth of the CDP

4
Comments
1 min read
Spark MLlib for Big data and Machine learning

8
Comments
4 min read
The Unbiased Guide to Choosing the Right BI Tool

37
Comments 1
5 min read
Optimize Data Lake layout using Clustering in Apache Hudi

2
Comments
6 min read
Learning Spark: #1 Introduction

11
Comments
3 min read
Using Your Own Apache Spark/Hudi Versions With AWS EMR

4
Comments
2 min read
What is Chaos Engineering: Theory, Principles & Benefits

3
Comments
6 min read
Kinesis Data Streams vs. Kinesis Firehose Delivery Streams

7
Comments
3 min read
5 Best Hadoop Tutorials to Start in 2024

9
Comments
7 min read
A SQL Tutorial Written for My Girlfriend: Data Models

2
Comments
1 min read
Right Sizing Snowflake Warehouses / Compute

2
Comments
3 min read
What Is Big Data?

3
Comments
6 min read
Hadoop Installation on Windows 10 using WSL

29
Comments 1
7 min read
Here is a Python ORM/Driver for InfluxDB: Influxable

8
Comments
2 min read
Spark on Kubernetes Made Easy - How Data Mechanics Improves on the Open-Source version

7
Comments
5 min read
Data Analytics on AWS — What, Why & How

11
Comments
13 min read
Obstacles on the road to automation: Why self-driving cars still need to overcome the big data hurdle

2
Comments
5 min read
Event Driven Data Pipelines in AWS

5
Comments
9 min read
Data Analyst vs Business Analyst

21
Comments 6
4 min read
5 Reasons Why Big Data Analytics is the Best Career Move

2
Comments
4 min read
Automation and Machine Learning: A Match Made In Heaven

30
Comments 3
5 min read
Trying to grow an open-source ETL project with PHP

4
Comments
1 min read
3 Ways To Improve Your Data Science Team's Efficiency

17
Comments
7 min read
Apache Spark Java Tutorial: Simplest Guide to Get Started

10
Comments
3 min read
Simulate IoT sensor, use Kafka to process data in real-time, save to Elasticsearch

15
Comments
4 min read
Change Data Capture from PostgreSQL to Azure Data Explorer using Kafka Connect

8
Comments
17 min read
S3 vs HDFS

4
Comments 3
1 min read
Top Hadoop Interview Questions

5
Comments
2 min read
What Are ETLs And Why We Use Them

30
Comments 2
14 min read
Predicting machine failures with distributed computing (Spark, AWS EMR, and DL)

9
Comments
10 min read
Introduction to Data Pipelines

2
Comments 1
4 min read
Enterprise Digital Transformation Guide in the Post Covid World

2
Comments 1
4 min read
Dark Data and why it matters in Big Data

2
Comments
3 min read
Please ELI5 big data and privacy concerns, and possible black hacks

2
Comments 3
1 min read
Demystify Apache Spark with Azure Synapse Analytics

6
Comments
1 min read
MLOps

6
Comments
2 min read
Spark Journey begins...

8
Comments
3 min read
Data Ingestion into Azure Data Explorer using Kafka Connect on Kubernetes

7
Comments 1
12 min read
Data Scraping and Data Crawling, what are they for?

6
Comments 1
5 min read
Transform AWS CloudTrail data using AWS Data Wrangler

3
Comments
8 min read
Working with nested structures in Spark

7
Comments 1
3 min read
Guide - AWS Glue and PySpark

27
Comments
14 min read
Introduction to Apache Spark

10
Comments
6 min read
Kafka Connect in 60 seconds

4
Comments
2 min read
Data Governance 101

7
Comments
4 min read
Big Data - Testing Strategy

2
Comments
1 min read
Supply Chain Risk Management with Data Analytics

2
Comments
2 min read
Tutorial: How to Ingest data from Kafka into Azure Data Explorer

12
Comments
10 min read
Streaming data into Kafka S01/E02 - Loading XML file

3
Comments 2
10 min read
Unit Testing Apache Spark Structured Streaming using MemoryStream

7
Comments
4 min read
Exploiting Schema Inference in Apache Spark

2
Comments
3 min read
Apache Kafka WebSocket data ingestion using Spring Cloud Stream

2
Comments
6 min read
Data & Information

5
Comments
4 min read
How to use Azure Go SDK to manage Azure Data Explorer clusters

6
Comments
9 min read
Tutorial: Getting started with Azure Data Explorer using the Go SDK

13
Comments
9 min read
How to create a low-cost Apache Spark cluster on Microsoft Azure

7
Comments
4 min read
Hadoop vs Spark: Which is a better framework to select for processing Big Data?

6
Comments
5 min read