DEV Community

# bigdata

Posts

Usage Guide: Quickly deploy an intelligent data platform with the one-stop AI development and production platform, AlphaIDE

8
Comments
3 min read
Data Pipelines with Apache Airflow - Book Review

6
Comments
2 min read
What is big data analytics?

7
Comments
7 min read
Why Big Data Analytics Is In The Big Picture in Banking Market?

8
Comments 2
4 min read
Solved a practical business problem when using Hudi: LakeSoul supports null field non-override semantics

7
Comments
3 min read
What is the Lakehouse, the latest Direction of Big Data Architecture?

9
Comments
10 min read
BigQuery transactions over multiple queries, with sessions

12
Comments 2
3 min read
Dynamic way doing ETL through Pyspark

16
Comments 2
4 min read
Auto discovering and auto actions in data monitoring or How to drink coffee instead of routine tasks

13
Comments
9 min read
May 9th in Streaming

6
Comments
1 min read
Build a real-time machine learning sample library using the best open-source project about big data and data lakehouse, LakeSoul

11
Comments
7 min read
Leveraging Change Data Capture for Fraud Detection using Arcion Cloud

8
Comments
9 min read
How to prepare for the GCP Professional Data Engineer certification

19
Comments 1
8 min read
Apache Spark, Hive, and Spring Boot — Testing Guide

35
Comments 4
18 min read
Design concept of a best opensource project about big data and data lakehouse

9
Comments
9 min read
Details of 4 best opensource projects about big data you should try out (Ⅰ)

8
Comments
5 min read
Create a Hadoop playground with Docker Desktop on Windows in minutes

6
Comments
4 min read
HIVE installation on WSL

6
Comments
3 min read
How to create a DIY Inexpensive Cloud Data Lake

8
Comments
3 min read
Quick use of CDC: A new demo from lakesoul makes it easier to set up the environment

8
Comments
5 min read
Big Data in Cloud Computing - AWS

14
Comments
2 min read
4 best opensource projects about big data you should try out

16
Comments 3
3 min read
A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions

8
Comments
2 min read
[OPINION] Building a Career as a Data Engineer

2
Comments
2 min read
Characteristics of Big Data

4
Comments
8 min read
Apache Spark Unit Testing Strategies

7
Comments
3 min read
NodeJS - Get data from Redash v6 API

6
Comments
2 min read
Building an Apache ECharts dashboard with React and Cube

14
Comments
11 min read
[TIP] Enter the world of Data Engineering with Brazilian professionals who have become references in the field!

5
Comments
2 min read
What are the best practices while using BigQuery?

11
Comments
2 min read
Building a Bubble Dashboard with Cube

9
Comments
14 min read
[ARTICLE] Data Warehouse, Data Lake, and Data Lakehouse: Concepts and Differences

6
Comments
3 min read
Fast Multivalue Look-ups For Huge Data Sets

5
Comments
6 min read
Dagster: The Best Free and Open-Source Alternative to Airflow With Python!

3
Comments
1 min read
What is the SingleStore and why should we use it?

9
Comments 2
3 min read
How to handle nested JSON with Apache Spark

3
Comments
3 min read
Machine Learning Lifecycle Process

43
Comments
4 min read
Quill- Most efficient Scala driver for Apache Cassandra and Spark

2
Comments
4 min read
Presenting ML-based COVID-19 Risk Assessment App Pandemonium

3
Comments
3 min read
Cleaning And Normalizing Data Using AWS Glue DataBrew

14
Comments 2
9 min read
Introduction to Apache Spark, Spark SQL, and Spark MLlib

11
Comments
15 min read
Data Lake explained

6
Comments
4 min read
Introduction to Hive (A SQL layer above Hadoop)

6
Comments
9 min read
Build a small TA-Lib container image

3
Comments
2 min read
SPOTLIGHT: A GENTLE INTRODUCTION TO MACHINE LEARNING CONCEPTS IN PYTHON

5
Comments
5 min read
How to choose a MongoDB shard key

8
Comments 1
3 min read
Scala Vs Python Syntax Cheat Sheet

3
Comments
5 min read
Big Data Open Source Frameworks

3
Comments
5 min read
Scala For Beginners - Crash Course - Part 5

4
Comments
6 min read
Scala For Beginners - Crash Course - Part 2

3
Comments
6 min read
Scala For Beginners - Crash Course - Part 3

3
Comments
6 min read
Scala For Beginners - Crash Course - Part 4

3
Comments
4 min read
Creating and running Spark Jobs in Scala on Cloud Dataproc !!!

6
Comments
3 min read
Getting started with Spark

12
Comments 2
6 min read
The World Beyond the Docker! $$ :)

5
Comments
2 min read
Airbyte: Data Integration / CDC Solution for Modern Data Teams!

6
Comments
12 min read
Best extensions for JupyterLab!!

5
Comments
3 min read
Vitess: Easy database deployment, clustering, and scaling!

5
Comments
5 min read
Zero to Deployment and Evolution Data Catalog!

4
Comments
6 min read
Build an analytics app with React and Cube.js

8
Comments
9 min read