DEV Community

# dataengineering

Posts

The 800 Million Weekly ChatGPT Users Who Are Just Getting Started

Comments
5 min read
The Data Refinery: Why Apache Spark is the Engine Behind Real-World Big Data Use Cases

Comments
2 min read
Scaling data systems: How we process millions of records with Python

Comments
3 min read
Data Management Systems: Transactional to Analytical Architectures

Comments
7 min read
OLAP vs OLTP: A Deep Dive into Database Processing Systems

Comments
3 min read
Why ClickHouse Loves Append-Heavy Workloads

3
Comments
4 min read
Apache Kafka: A Beginner's Guide to Key Concepts

5
Comments
5 min read
OLAP vs OLTP: Understanding the Backbone of Modern Data Systems

1
Comments
2 min read
Apache Kafka and the Rise of Real-Time Data Streaming

1
Comments
4 min read
How Databricks Genie Turns Plain English Into SQL Code

5
Comments
11 min read
I built a DuckDB extension to handle chemistry data without pandas or RDKit

Comments
5 min read
Training Data Provenance: The Manifest Diff That Explains the Hash

Comments
8 min read
I Analyzed 10 Million Records in 47 Seconds Using Python + DuckDB (No Spark, No Cloud)

2
Comments 1
3 min read
Performance and Apache Iceberg's Metadata

Comments
7 min read
Stop Using Subqueries: 3 Advanced SQL CTE Patterns That Saved My Production Database

Comments 1
3 min read