DEV Community

# bigdata

Posts

A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions

8
Comments
2 min read
[OPINION] Building a Career as a Data Engineer

2
Comments
2 min read
Characteristics of Big Data

4
Comments
8 min read
Apache Spark Unit Testing Strategies

9
Comments
1 min read
NodeJS - Get data from Redash v6 API

6
Comments
2 min read
Building an Apache ECharts dashboard with React and Cube

14
Comments
11 min read
[TIP] Enter the world of Data Engineering with Brazilian professionals who have become leaders in the field!

6
Comments
2 min read
What are the best practices while using BigQuery?

11
Comments
2 min read
Building a Bubble Dashboard with Cube

9
Comments
14 min read
[ARTICLE] Data Warehouse, Data Lake, and Data Lakehouse: Concepts and Differences

6
Comments
3 min read
Fast Multivalue Look-ups For Huge Data Sets

6
Comments
6 min read
Dagster: The Best Free and Open-Source Alternative to Airflow With Python!

4
Comments
1 min read
What is the SingleStore and why should we use it?

11
Comments 2
3 min read
How to handle nested JSON with Apache Spark

3
Comments
3 min read
Machine Learning Lifecycle Process

45
Comments
4 min read
Quill- Most efficient Scala driver for Apache Cassandra and Spark

2
Comments
4 min read
Presenting ML-based COVID-19 Risk Assessment App Pandemonium

4
Comments
3 min read
Cleaning And Normalizing Data Using AWS Glue DataBrew

14
Comments 3
9 min read
Introduction to Apache Spark, Spark SQL, and Spark MLlib

12
Comments
15 min read
Data Lake explained

6
Comments
4 min read
Introduction to Hive(A SQL layer above Hadoop)

8
Comments
9 min read
Build a small TA-Lib container image

3
Comments
2 min read
SPOTLIGHT: A GENTLE INTRODUCTION TO MACHINE LEARNING CONCEPTS IN PYTHON

5
Comments
5 min read
How to choose a MongoDB shard key

8
Comments 1
3 min read
Big Data Open Source Frameworks

3
Comments
5 min read
Scala Vs Python Syntax Cheat Sheet

4
Comments
5 min read
Scala For Beginners - Crash Course - Part 2

3
Comments
6 min read
Scala For Beginners - Crash Course - Part 5

4
Comments
6 min read
Scala For Beginners - Crash Course - Part 3

3
Comments
6 min read
Scala For Beginners - Crash Course - Part 4

3
Comments
4 min read
Django + Mongodb works slowly

1
Comments
1 min read
Creating and running Spark Jobs in Scala on Cloud Dataproc !!!

7
Comments
3 min read
Getting started with Spark

12
Comments 2
6 min read
The World Beyond the Docker! $$ :)

5
Comments
2 min read
Airbyte: Data Integration / CDC Solution for Modern Data Teams!

6
Comments
12 min read
Best extensions for JupyterLab!!

6
Comments
3 min read
Vitess: Easy database deployment, clustering, and scaling!

5
Comments
5 min read
Zero to Deployment and Evolution Data Catalog!

4
Comments
6 min read
Build an analytics app with React and Cube.js

8
Comments
9 min read
Cardinality Counting in Redis

2
Comments
4 min read
Cube Cloud Deep Dive: Mastering Pre-Aggregations

6
Comments
11 min read
BigQuery SQL Tip: QUALIFY clause

5
Comments
1 min read
What Is Trino And Why Is It Great At Processing Big Data

19
Comments
7 min read
Using PySpark and AWS Glue to analyze multi-line log files

12
Comments 1
5 min read
ETLs vs ELTs: Why are ELTs Disrupting the Data Market?

15
Comments
8 min read
How IoT integration with ERP system can bring business benefits

2
Comments
4 min read
Bigdata: A problem and a solution

1
Comments
4 min read
Cube Cloud Deep Dive: Starting a New Cube App

16
Comments
9 min read
How Zero-Code Data Preparations Tools Enable Better, Faster IT Performance in the Age of Big Data

2
Comments
6 min read
Build your own data quality rules with AWS Glue DataBrew

12
Comments
6 min read
Identifying and handling personally identifiable information (PII) with AWS Glue DataBrew

6
Comments
4 min read
Understanding Apache Hive LLAP

3
Comments
7 min read
What Is Crypto and How Does It Work ?

9
Comments 7
3 min read
Data lakes: building a serverless data pipeline

3
Comments
6 min read
Installing Hadoop on the new M1 Pro and M1 Max MacBook Pro

9
Comments
8 min read
Meet the Innovators with Krzysztof Nowocin

2
Comments
6 min read
The Important SQL Queries for Beginners

13
Comments
8 min read
A first update on our AI/ML/Big Data salary survey

2
Comments
2 min read
Data Engineering Introduction

7
Comments
2 min read
Performance capabilities of data warehouses and how Cube can help

18
Comments
18 min read