DEV Community

# bigdata

Posts

Optimizing ETL Processes for Efficient Data Loading in EDWs

Comments
4 min read
Patient-Centered Care and Data Integration in Population Health Management

Comments
4 min read
The Basics of Big Data: What You Need to Know

Comments
3 min read
Why Apache Doris is the Best Open Source Alternative to Rockset

3
Comments
3 min read
Introduction to Apache Hadoop & MapReduce

5
Comments
3 min read
Blazingly-Fast Serialization: Apache Fury 0.5.1 released

Comments
3 min read
Databricks - Variant Type Analysis

2
Comments
7 min read
Metadata for win — Apache Parquet

Comments
5 min read
Comprehensive Guide to Schema Inference with MongoDB Spark Connector in PySpark

Comments
3 min read
Advanced Insights into Automated Data Processing Tools

1
Comments
4 min read
How to Build an API with Strong Security Measures

Comments
4 min read
Documenting Rate Limits and Throttling in REST APIs

Comments
5 min read
GraphQL API Design Best Practices for Efficient Data Management

1
Comments
5 min read
The current Lakehouse is like a false proposition

6
Comments 1
10 min read
Is distributed technology the panacea for big data processing?

7
Comments 1
10 min read
Big Data: the tool we need.

Comments
2 min read
PySpark: missing value

Comments
2 min read
Cross-cluster replication for read-write separation

2
Comments
4 min read
Stream Data at scale from millions of sources with Amazon Kinesis (Serverless)

13
Comments
7 min read
Trino & Iceberg Made Easy: A Ready-to-Use Playground

18
Comments
3 min read
The Role of Data Integration in Healthcare Research and Precision Medicine

Comments 1
4 min read
Automating Data Processes for Efficiency and Accuracy

Comments
5 min read
Auto-increment columns in Apache Doris

Comments
11 min read
What to use parquet or CSV?

21
Comments
3 min read
Accelerating ETL Processes for Timely Business Intelligence

Comments
4 min read