#bigdata
Posts
Simplest pyspark tutorial
muriuki muriungi erick
Apr 19 '23
#spark #bigdata #machinelearning #sql
2 reactions, 7 min read
Making Debezium 2.x Support Confluent Schema Registry
ChunTing Wu
Apr 17 '23
#bigdata #architecture #docker #tutorial
2 reactions, 3 comments, 3 min read
Performance Enhancement: Conversion Funnel Analysis
jbx1279
Apr 2 '23
#bigdata #sql #database #programming
9 min read
Boost Your Testing Strategy: The Coolest Methods to Prioritize A/B Tests Like a Pro! 🎲📊😎
Olga R
Apr 11 '23
#analytics #bigdata #productivity
3 reactions, 4 min read
A Comprehensive Comparison of JuiceFS and HDFS for Cloud-Based Big Data Storage
tonybarber2
Apr 7 '23
#bigdata #opensource #cloud
1 reaction, 11 min read
Apache Doris be common problem positioning and processing
Lemon
Feb 24 '23
#apachedoris #doris #olap #bigdata
1 reaction, 3 min read
How to use docker to compile Apache Doris
Lemon
Feb 24 '23
#apachedoris #doris #olap #bigdata
2 reactions, 3 min read
The Secret to Rapid Scaling: How Scraping Helped These Startups Go From Zero to $1.2+ Trillion
Tomas Laurinavicius
Mar 28 '23
#startup #bigdata #scraping #datascience
6 reactions, 1 comment, 6 min read
Mastering Large-Scale Data Processing: Building a Data Pipeline with ApacheAGE for Efficient Ingestion, Processing, and Analysis
Humza Tareen
Mar 25 '23
#apacheage #postgres #datascience #bigdata
2 reactions, 2 min read
How we mastered dbt: A true story
Olga Braginskaya
Mar 22 '23
#bigdata #dataengineering #dbt #tutorial
7 reactions, 14 min read
Exploration of Spark Executor Memory
Lorenzo Lou
Mar 21 '23
#spark #programming #bigdata
9 min read
GETTING STARTED WITH SENTIMENT ANALYSIS.
BRENDA ATIENO ODHIAMBO
Mar 17 '23
#python #datascience #dataanalysis #bigdata
2 reactions, 4 min read
Lightweight HTTP API for Big Data on S3
Paulius for Exacaster
Mar 15 '23
#deltalake #bigdata #opensource #s3
3 reactions, 3 min read
How to cope with high-concurrency account query?
jbx1279
Mar 12 '23
#database #bigdata #performance #programming
6 min read
Don't Break the Bank on SQL Queries: BigQuery On-Demand vs Flat-Rate prices. Which Saves You More? 💰😎
Olga R
Mar 12 '23
#productivity #database #sql #bigdata
5 reactions, 3 comments, 5 min read
Read before-The Ultimate Guide to AWS IoT Core: What it is, How it helps, and Real-World use Cases. Mini-Project-Intro
Augusto Valdivia for AWS Community Builders
Mar 12 '23
#awsiotcore #terraform #iot #bigdata
7 reactions, 3 min read
"Features of Data Lake Federated Analysis"_Apache Doris Summit 2022
31:03
SelectDB
Feb 7 '23
#database #bigdata #datalake #opensource
2 reactions, 1 min read
Tencent Data Engineer: Why We Go from ClickHouse to Apache Doris?
Apache Doris
Mar 7 '23
#database #datascience #bigdata
1 reaction, 11 min read
ClickHouse is fast, esProc SPL is faster
jbx1279
Feb 27 '23
#bigdata #database #sql #programming
1 reaction, 10 min read
EXPLORATORY DATA ANALYSIS ULTIMATE GUIDE.
BRENDA ATIENO ODHIAMBO
Feb 24 '23
#python #datascience #dataanalysis #bigdata
1 reaction, 3 min read
Importando Funções Python do Repos para o Notebook do Databricks
romerito
Feb 10 '23
#spark #bigdata #programming #python
3 min read
How To Deal With a Database With Billions of Records
DbVisualizer
Feb 20 '23
#bigdata
2 reactions, 6 min read
Amazon Redshift: What, Why, and How
Vikas Solegaonkar for AWS Community Builders
Feb 20 '23
#redshift #aws #bigdata #database
2 reactions, 1 comment, 5 min read
Hadoop/Spark is too heavy, esProc SPL is light
jbx1279
Feb 6 '23
#bigdata #database #programming
12 min read
What Is Deep Learning? Deep Learning Algorithms Take Center Stage
Kate Baker
Feb 15 '23
#deeplearning #machinelearning #bigdata #ai
1 reaction, 4 min read
How working/install Pig with Notebooks?
Lucas M. Ríos
Jan 27 '23
#bigdata #datascience #opensource #analytics
1 reaction, 4 min read
Why we use Terraform for BigQuery
Nelis Goeminne for Lighthouse
Jan 24 '23
#terraform #googlecloud #bigdata
5 reactions, 6 min read
#011 Databricks explained for busy engineers | Databricks quick start | Databricks Data Security
Kemal Cholovich
Jan 23 '23
#databricks #bigdata
2 reactions, 2 min read
Apache Kafka — The Big Data Messaging tool
Gursimar Singh
Jan 22 '23
#bigdata #beginners #programming #tutorial
11 reactions, 1 comment, 10 min read
DataWarehouse and BigQuery
Ruma Sinha
Jan 16 '23
#bigquery #datawarehouse #bigdata #distributedcomputing
1 reaction, 4 min read
How working/install Spark with Notebooks?
Lucas M. Ríos
Jan 16 '23
#python #datascience #bigdata #cloud
3 reactions, 3 min read
Nesting Columns like a Pro: A Guide to Mastering Nested Structs in PySpark
PaulOpu
Jan 14 '23
#python #pyspark #bigdata #dataengineering
2 reactions, 4 min read
Type of data in hadoop
shubham mishra
Jan 14 '23
#bigdata #hadoop #datascience
2 reactions, 2 min read
The impasse of SQL performance optimizing
jbx1279
Jan 13 '23
#database #bigdata #sql #programming
1 reaction, 9 min read
Data Pipeline: From ETL to EL plus T
ChunTing Wu
Jan 9 '23
#bigdata #tutorial #architecture #datascience
4 min read
How working/install Hadoop with Notebooks?
Lucas M. Ríos
Jan 7 '23
#python #datascience #bigdata #cloud
4 reactions, 4 min read
SeaTunnel Zeta engine, the first choice for massive data synchronization, is officially released!
Apache SeaTunnel
Jan 6 '23
#seatunnel #opensource #bigdata
2 reactions, 8 min read
Design considerations for large data import
Angha Ramdohokar
Jan 2 '23
#bigdata #design #database #architecture
1 reaction, 3 min read
Playing Window Function in Postgres
ChunTing Wu
Dec 26 '22
#sql #tutorial #bigdata #datascience
4 min read
Read Hierarchical Data Format file
masoomjethwa
Dec 25 '22
#python #bigdata #hdf
1 min read
Working with large CSV files in Python from Scratch
Ramses Alexander Coraspe
Dec 21 '22
#datascience #dataengineering #bigdata #python
6 reactions, 1 min read
Explaining Pagination in ElasticSearch
ChunTing Wu
Dec 19 '22
#tutorial #programming #database #bigdata
3 reactions, 5 min read
Technology will be the star of the World Cup
Albérico Junior
Nov 24 '22
#bigdata #qata #inteligênciaartificial #tecnologia
4 reactions, 1 comment, 2 min read
Java serialization with Avro
Jerónimo López
Dec 5 '22
#avro #java #serialization #bigdata
6 reactions, 10 min read
Real Time Data Infra Stack
ChunTing Wu
Dec 5 '22
#eventdriven #architecture #tutorial #bigdata
4 reactions, 6 min read
Example of applying CDC to JSON files with PySpark
romerito
Nov 30 '22
#cdc #spark #bigdata #deltalake
2 reactions, 1 comment, 7 min read
To study Apache Kafka Architecture in details, and how to install, deploy configure Apache kafka.
Ashwin Telmore
Nov 17 '22
#bigdata #apache #kafka #manual
4 reactions, 3 min read
Azure Data Factory - Incrementally load data from Azure SQL to Azure Data Lake using Watermark
Balram Prasad
Nov 13 '22
#azure #bigdata #azuredatafactory #dataengineering
4 reactions, 1 min read
How to create Stored Procedure in MySQL
The Dream Coding
Nov 13 '22
#mysql #sql #bigdata #database
2 reactions, 1 min read
How to use delimiter in MySQL
The Dream Coding
Nov 12 '22
#mysql #sql #database #bigdata
2 reactions, 1 min read
Playing PyFlink from Scratch
ChunTing Wu
Oct 17 '22
#bigdata #tutorial #eventdriven #programming
1 reaction, 4 min read
Apache Spark with java
J S SUNIL
Oct 29 '22
#apachespark #java #bigdata #spark
5 reactions, 5 min read
Playing PyFlink in a Nutshell
ChunTing Wu
Oct 24 '22
#bigdata #eventdriven #python #tutorial
7 reactions, 5 min read
Podcast with Josh Long on Apache Pulsar and Spring
Timothy Spann. 🇺🇦
Sep 16 '22
#apachepulsar #spring #java #bigdata
3 reactions, 1 min read
Optimizing massive MongoDB inserts, load 50 million records faster by 33%!
Dmtro Harazdovskiy
Oct 16 '22
#mongodb #bigdata #node #performance
10 reactions, 1 comment, 12 min read
Docker Alternatives That Can Boost Your Productivity
James Wilson
Sep 22 '22
#cloud #docker #devops #bigdata
1 reaction, 4 min read
Building Apache Pinot and Presto
ChunTing Wu
Oct 10 '22
#bigdata #eventdriven #tutorial #programming
2 reactions, 4 min read
O que é dark data?
Rita Carolina for Feministech
Oct 6 '22
#bigdata #braziliandevs #darkdata
10 reactions, 1 min read
Apache-Spark introduction for SQL developers
Cesar Mostacero
Sep 29 '22
#apachespark #dataengineering #beginners #bigdata
2 reactions, 7 min read
Design Pattern of Streaming Enrichment
ChunTing Wu
Aug 29 '22
#eventdriven #bigdata #architecture #programming
6 min read