DEV Community

# bigdata

Posts

Basic introduction to Big data

14
Comments
3 min read
5 Best Practices for Setting Up Your Data Warehouse in the Cloud

6
Comments
6 min read
Building Hadoop native libraries on Mac in 2019

14
Comments 18
5 min read
Kafka Monitoring in Production - eBook

10
Comments
1 min read
Data lakes are hard

17
Comments
4 min read
Become a Pro at Pandas, Python’s data manipulation Library

10
Comments
6 min read
Kafka Getting Started - Kafka Series - Part 2

15
Comments
4 min read
How Apache Kafka works? Kafka Series - Part 1

18
Comments 5
3 min read
[Cheat Sheet] Apache Spark: Structure of a Spark Application

6
Comments
2 min read
Learn BigData from Google Cloud Platform.

11
Comments
2 min read
Installing, Configuring and Using the Azure Databricks CLI

8
Comments
3 min read
Different ways to word count in apache spark

10
Comments
2 min read
How to Deal with Big Data Analytics Easily?

8
Comments
9 min read
What is the Future of Big Data Analytics and Hadoop?

8
Comments
2 min read
How to Process Epic Amounts of Data in NodeJS

108
Comments 1
6 min read
Google BigQuery's Python SDK: Creating Tables Programmatically

6
Comments
7 min read
Free Sources for Learning Big Data, Block Chain and IoT

6
Comments
4 min read
From CSVs to Tables: Infer Schema Data Types From Raw Spreadsheets

7
Comments
8 min read
Wielding the power of web transparency

15
Comments 1
9 min read
[Video] Visualizing data at scale with Google Data Studio

7
Comments
1 min read
Big Data Analysis with Hadoop, Spark, and R Shiny

31
Comments 1
12 min read
Apache Hadoop - TLS and SSL Notes

10
Comments
4 min read
Processing Streaming Twitter Data using Kafka and Spark - Part 2: Creating Kafka Twitter producer

21
Comments 5
7 min read
Processing Streaming Twitter Data using Kafka and Spark — Part 1: Setting Up Kafka Cluster

18
Comments
4 min read
Processing Streaming Twitter Data using Kafka and Spark — The Plan

11
Comments
2 min read
Streams For the Win: A Performance Comparison of Node.js Methods for Reading Large Datasets (Pt 2)

5
Comments
9 min read
Window Functions in Stream Analytics

30
Comments 5
9 min read
What makes code slow to execute

14
Comments
1 min read
Blockchain: What Is It, How It Works, And What It Means For Big Data

8
Comments
4 min read
Amazon Athena vs AWS Lambda: Comparing two solutions for Big Data Analysis

22
Comments 5
8 min read
Super simple and fast delimited CSV data normalization with AWK

10
Comments
2 min read
Managing and Configuring Clusters within Azure Databricks

11
Comments
9 min read
Streaming Data in Databricks Delta Tables

14
Comments 3
3 min read
Databases and Tables in Azure Databricks

13
Comments
5 min read
What Is MapReduce?

47
Comments 3
7 min read
Biomedical Big Data: From Centralized Governance to Public Participation

18
Comments
2 min read
Expertise and context-based answer rating system for Q&A websites.

7
Comments
1 min read
Local hadoop on laptop for practice

20
Comments
4 min read
Apache Livy - Apache Spark, HDFS, and Kerberos

14
Comments
2 min read
Using Hadoop in Azure HDInsight to process Big Data

13
Comments
6 min read
Apache HBase - REST API - Atomic Operations

9
Comments
6 min read
Apache Storm - Topology Permissions

6
Comments
2 min read
Apache Hadoop S3A With Hitachi Content Platform (HCP)

7
Comments
4 min read
Apache Livy - Simplified Apache Spark Integration

11
Comments
2 min read
Apache Ranger - Hive over HDFS Audit Logs

8
Comments
3 min read
Apache Ambari - Custom Alert Dispatch Script

8
Comments
2 min read
Oracle JDK - Missing Ciphers - libsunec.so

6
Comments
3 min read
Apache Knox - Improved Group Support

8
Comments
3 min read
Apache Knox - Proxying Apache NiFi

7
Comments
13 min read
HDF - Apache NiFi - Kerberos Errors and useSubjectCredsOnly

2
Comments
3 min read
Learning about the Druid Architecture

10
Comments
6 min read
NLP Terminology in 5 Minutes

30
Comments
2 min read
Looking at Challenges of Big Data Testing with Hadoop

16
Comments
4 min read
Will Hadoop-based Recommendation Engines Make Search Obsolete?

11
Comments 1
4 min read
3 Challenges of Building a Big Data Backend for Your Enterprise Mobile App

22
Comments 2
5 min read
Big Data and NoSQL: A Great Coupling

18
Comments 2
5 min read
Getting started with stream processing using Apache Flink

26
Comments 1
8 min read
Getting started with batch processing using Apache Flink

15
Comments
9 min read
Big data applications in the Java application development environment

12
Comments 2
4 min read
Apache Spark vs. Apache Flink

34
Comments 3
6 min read