#
bigdata
Data ingestion – definition, types and best practices
DBSync
Jul 23
#
cloud
#
data
#
bigdata
Comments
Add Comment
8 min read
How to Handle Databases with Billions of Records
DbVisualizer
Aug 12
#
bigdata
3
 reactions
Comments
Add Comment
1 min read
Effective Strategies for Scaling Databases: Enhancing Performance for Growing Data Needs
Nguyen Gia Huy
Aug 6
#
database
#
learning
#
webdev
#
bigdata
4
 reactions
Comments
Add Comment
5 min read
Databricks - Variant Type Analysis
Debashis Adak
Jun 29
#
databricks
#
spark
#
bigdata
#
datalake
Comments
Add Comment
7 min read
Working with Parquet files in Java using Carpet
Jerónimo López
Jun 19
#
parquet
#
java
#
bigdata
#
dataengineering
1
 reaction
Comments
Add Comment
6 min read
Optimizing ETL Processes for Efficient Data Loading in EDWs
Ovais
Jul 12
#
emterprisedatawarehouse
#
etl
#
datascience
#
bigdata
Comments
Add Comment
4 min read
Patient-Centered Care and Data Integration in Population Health Management
Ovais
Jul 12
#
powerapps
#
healthcare
#
datascience
#
bigdata
Comments
Add Comment
4 min read
The Basics of Big Data: What You Need to Know
bvanderbilt0033
Jun 7
#
dataprotection
#
dataanalytics
#
dataprivacy
#
bigdata
Comments
Add Comment
3 min read
Why Apache Doris is the Best Open Source Alternative to Rockset
Apache Doris
Jul 1
#
database
#
bigdata
#
dataengineering
#
openai
3
 reactions
Comments
Add Comment
3 min read
Introduction to Apache Hadoop & MapReduce
Shivansh Yadav
Jun 30
#
hadoop
#
dataengineering
#
bigdata
#
datascience
5
 reactions
Comments
Add Comment
3 min read
Blazingly-Fast Serialization: Apache Fury 0.5.1 released
Shawn
May 31
#
rpc
#
bigdata
#
microservices
#
distributedsystems
Comments
Add Comment
3 min read
Metadata for win — Apache Parquet
Rahul Dubey
May 25
#
python
#
bigdata
#
datascience
#
dataengineering
Comments
Add Comment
5 min read
Comprehensive Guide to Schema Inference with MongoDB Spark Connector in PySpark
Chetan Gupta
Jun 27
#
pyspark
#
bigdata
#
mongodb
#
spark
Comments
Add Comment
3 min read
Advanced Insights into Automated Data Processing Tools
Data Expertise
Jun 16
#
automateddataprocessing
#
machinelearning
#
bigdata
#
datascience
1
 reaction
Comments
Add Comment
4 min read
Real-Time Sentiment Analysis using PySpark and FastAPI
raghavtwenty
Jun 14
#
bigdata
#
spark
#
python
#
fastapi
2
 reactions
Comments
Add Comment
1 min read
Documenting Rate Limits and Throttling in REST APIs
Ovais
Jun 12
#
api
#
bigdata
#
datamanagement
#
datascience
Comments
Add Comment
5 min read
How to Build an API with Strong Security Measures
Ovais
Jun 12
#
api
#
bigdata
#
datascience
#
datamanagement
Comments
Add Comment
4 min read
GraphQL API Design Best Practices for Efficient Data Management
Ovais
Jun 12
#
api
#
datamanagement
#
bigdata
#
graphql
Comments
Add Comment
5 min read
The current Lakehouse is like a false proposition
Judy
Jun 12
#
lackhouse
#
bigdata
#
development
#
programming
6
 reactions
Comments
1
 comment
10 min read
Is distributed technology the panacea for big data processing?
Judy
Jun 6
#
bigdata
#
processing
#
development
#
lauguage
7
 reactions
Comments
1
 comment
10 min read
What Should Be Followed While Scraping Data From Local Citations?
Momenul Ahmad
May 10
#
citation
#
scraping
#
data
#
bigdata
Comments
Add Comment
1 min read
Big Data: the tool we need.
Delmiro Ribeiro
May 26
#
bigdata
#
database
#
datascience
#
backend
Comments
Add Comment
2 min read
PySpark: missing value
ChelseaLiu0822
Apr 18
#
pyspark
#
python
#
dataengineering
#
bigdata
Comments
Add Comment
2 min read
Cross-cluster replication for read-write separation
Apache Doris
May 21
#
database
#
bigdata
#
dataengineering
#
tutorial
2
 reactions
Comments
Add Comment
4 min read
Stream Data at scale from millions of sources with Amazon Kinesis (Serverless)
Asanka Boteju
May 20
#
bigdata
#
kinesis
#
aws
#
strems
12
 reactions
Comments
Add Comment
7 min read
Trino & Iceberg Made Easy: A Ready-to-Use Playground
ChunTing Wu
May 20
#
bigdata
#
datascience
#
tutorial
#
dataengineering
15
 reactions
Comments
Add Comment
3 min read
The Role of Data Integration in Healthcare Research and Precision Medicine
Ovais
May 13
#
dataintegration
#
healthcare
#
datascience
#
bigdata
Comments
Add Comment
4 min read
Automating Data Processes for Efficiency and Accuracy
Ovais
May 8
#
dataextraction
#
bigdata
#
datamanagement
#
datascience
Comments
Add Comment
5 min read
Auto-increment columns in Apache Doris
Apache Doris
May 8
#
database
#
dataegnineering
#
tutorial
#
bigdata
Comments
Add Comment
11 min read
What to use parquet or CSV?
Hitesh
May 7
#
datascience
#
database
#
python
#
bigdata
17
 reactions
Comments
Add Comment
3 min read
Accelerating ETL Processes for Timely Business Intelligence
Ovais
May 7
#
changedatacapture
#
bigdata
#
datamanagement
#
datascience
Comments
Add Comment
4 min read
Are There “Queries over Trillion-Row Tables in Seconds”? Is “N-Times Faster Than ORACLE” an Exaggeration?
jbx1279
Apr 13
#
sql
#
performance
#
bigdata
#
database
Comments
Add Comment
4 min read
A glimpse into the future of data processing infrastructure.
Kostas Pardalis
May 2
#
database
#
bigdata
#
snowflake
#
spark
Comments
Add Comment
9 min read
Safeguarding Data Quality By Addressing Data Privacy and Security Concerns
Ovais
Apr 30
#
datascience
#
bigdata
#
datamanagement
#
datamigration
1
 reaction
Comments
1
 comment
4 min read
Best Practices for Designing an Efficient ETL Pipeline
Ovais
Apr 30
#
etl
#
datascience
#
bigdata
#
datamanagement
4
 reactions
Comments
Add Comment
4 min read
The Role of Big Data Analytics in BFSI: Leveraging Data for Competitive Advantage
Ajay
Mar 27
#
bigdata
#
bfsi
#
data
#
analytics
Comments
Add Comment
4 min read
LLMs, DevOps, and Big Data Musings
bfuller
Apr 25
#
devops
#
llm
#
ai
#
bigdata
Comments
Add Comment
3 min read
Understanding and Mitigating Message Loss in Apache Kafka
Yusen Meng
Apr 25
#
bigdata
#
datareliability
#
streamprocessing
#
distributed
11
 reactions
Comments
Add Comment
9 min read
Snowflake 101: A Comprehensive Guide to the Data Cloud
Suyash Salvi
Apr 23
#
virtualdatawarehouse
#
snowflake
#
bigdata
#
datacloud
2
 reactions
Comments
Add Comment
4 min read
Blockchain Technology and Data Governance: Enhancing Security and Trust
Ovais
Apr 30
#
blockchain
#
datamanagement
#
datascience
#
bigdata
1
 reaction
Comments
1
 comment
4 min read
SQL Pro Tips : industrial AWS Athena SQL using WITH
hexfloor
Mar 28
#
aws
#
database
#
bigdata
#
sql
3
 reactions
Comments
Add Comment
4 min read
SQL Pro Tips : industrial GCP BigQuery SQL using WITH
hexfloor
Mar 28
#
gcp
#
sql
#
database
#
bigdata
3
 reactions
Comments
Add Comment
5 min read
Tools Every Data Scientist Should Know
Shaheryar
Mar 14
#
datascience
#
python
#
machinelearning
#
bigdata
Comments
Add Comment
2 min read
AI enthusiasm #3 - AlphaFold2, a game-changer🧬
Astra Bertelli
Apr 12
#
opensource
#
learning
#
bigdata
#
ai
Comments
Add Comment
2 min read
Redis License Change: A Look at the Competitive Game between OSS and Cloud Computing Giants
AutoMQ
Apr 10
#
bsl
#
opensource
#
automq
#
bigdata
Comments
Add Comment
5 min read
MWAA Plugins and Dependency Survival Guide
elliott cordo for AWS Heroes
Apr 5
#
airflow
#
bigdata
#
dataengineering
#
aws
5
 reactions
Comments
Add Comment
3 min read
GenAI Model Optimization: Guide to Fine-Tuning and Quantization
Farrruh
Apr 3
#
ai
#
aiops
#
cloud
#
bigdata
2
 reactions
Comments
Add Comment
4 min read
What is Surrogate Key in SQL?
Sandeep
Apr 2
#
sql
#
database
#
bigdata
Comments
Add Comment
2 min read
SQL Pro Tips : industrial Oracle SQL using WITH
hexfloor
Mar 28
#
sql
#
oracle
#
bigdata
#
database
3
 reactions
Comments
Add Comment
4 min read
How come there are tens of thousands of tables in a database
jbx1279
Mar 23
#
database
#
bigdata
#
sql
2
 reactions
Comments
1
 comment
5 min read
Data Streaming Architecture
Jose Luis Sastoque Rey for AWS Community Builders
Mar 27
#
aws
#
bigdata
#
architecture
4
 reactions
Comments
Add Comment
4 min read
Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis
Apache Doris
Mar 27
#
database
#
dataengineering
#
bigdata
#
logging
Comments
Add Comment
12 min read
Amazon EMR deployment on EKS
vivekpophale
Mar 23
#
emr
#
eks
#
bigdata
#
aws
2
 reactions
Comments
Add Comment
7 min read
Understanding the Battle of Database Storage: Row-Oriented vs. Columnar
Sunny Srinidhi
Mar 8
#
database
#
bigdata
#
storage
#
datascience
1
 reaction
Comments
1
 comment
6 min read
The Role of AI in Enhancing Data Governance Strategies
Ovais
Mar 12
#
datascience
#
bigdata
#
ai
#
webdev
2
 reactions
Comments
Add Comment
5 min read
Why Python and SQL are Must-Have Skills for Marketing Analysts in the Age of Big Data
Scofield Idehen
Feb 23
#
bigdata
#
python
#
datascience
#
sql
10
 reactions
Comments
Add Comment
6 min read
Big data with Software Systems
Ravikanth Kowdeed
Feb 14
#
softwareengineering
#
bigdata
1
 reaction
Comments
Add Comment
1 min read
BigQuery Machine Learning
Cris Crawford
Feb 10
#
bigdata
#
machinelearning
#
googlecloud
#
sql
2
 reactions
Comments
Add Comment
5 min read
Understanding Elasticsearch. A Guide for Beginners
nivelepsilon
Feb 10
#
elasticsearch
#
devops
#
bigdata
#
beginners
1
 reaction
Comments
Add Comment
4 min read
BigQuery best practices
Cris Crawford
Feb 10
#
dataengineering
#
bigdata
4
 reactions
Comments
Add Comment
2 min read