DEV Community

# bigdata

Posts

My Databricks article compilation of 2019

6
Comments
2 min read
Converting CSV to ORC/Parquet fast without a cluster!

7
Comments
6 min read
Cloud Data Fusion, a game-changer for GCP

12
Comments 7
4 min read
6 big data trends and forecasts worthy of attention in 2020

5
Comments
3 min read
Multi-Class Image Classification With Transfer Learning In PySpark

11
Comments
9 min read
Working with BigQuery Analytic Functions

6
Comments
5 min read
Building a Successful Modern Data Analytics Platform in the Cloud

8
Comments
11 min read
AWS: Redshift – quick start and SQL-workbench connection configuration

13
Comments
4 min read
Data Lake vs Data Warehouse

10
Comments
2 min read
Life Beyond Kafka with Apache Pulsar

19
Comments
4 min read
Explain MapReduce Like I'm Five

8
Comments
5 min read
Toward GCP Data Engineer certification

9
Comments
1 min read
Azure Blob Storage with Pyspark

12
Comments 1
2 min read
Building simple data pipelines in Azure using Cosmos DB, Databricks and Blob Storage

5
Comments
15 min read
How to handle BigData?

4
Comments 4
2 min read
Big Data file formats explained

10
Comments
7 min read
Spark. Anatomy of Spark application

17
Comments
6 min read
Categorical Variables and Cardinality

5
Comments
1 min read
Event Tracking and Analytics via Ruby on Rails, DynamoDB (with Streams), Kinesis Firehose and Athena and CloudWatch Dashboard!

88
Comments
13 min read
Book on Advanced Data Structures and Algorithms for Big Data Applications

9
Comments
3 min read
Data Engineering — Complete Reference Guide From A-Z [2019]

31
Comments
16 min read
MongoDB Atlas Data Lake

10
Comments
5 min read
How we built a highly scalable distributed state machine

9
Comments
16 min read
PySpark and Parquet - Analysis

14
Comments 1
3 min read
Creating a proof of concept for Spatial Joins

4
Comments
4 min read