DEV Community

# dataengineering

Posts

From Script to Spreadsheet: Building a Self-Serve Etsy Competitor Tracker

2
Comments
5 min read
Building a 'Data-on-Demand' Microservice: Wrapping Alibaba Scrapers for Internal Tools

2
Comments
5 min read
Part 2: dbt Project Structure & Building Models 📁

Comments
4 min read
# Module 4 Summary - Analytics Engineering with dbt

Comments
2 min read
Part 3: Testing, Documentation & Deployment 🚀

Comments
5 min read
Machine Learning Starts With a WHERE Clause

1
Comments
2 min read
Pandas 3.0's PyArrow String Revolution: A Deep Dive into Memory and Performance

2
Comments
6 min read
How We Built a Deterministic File Import Pipeline in TypeScript (CSV, XLSX, ZIP)

Comments
2 min read
AWS Data Engineer Associate (DEA-C01): What Each Domain Actually Tests (From Someone Who Just Passed)

Comments
2 min read
Why Most Data Governance Tools Miss the Real Relationships — and What to Do About It

Comments
2 min read
11 Compaction Optimizations for Iceberg Data Lakes

1
Comments
25 min read
Under the Hood of Arisyn: How Statistical Field Fingerprinting Enables Deterministic Data Linking

Comments
2 min read
ELI25: Apache Kafka Quick Notes for Interviews

Comments
4 min read
Postmortem: Eliminating OOM Failures in Spark on Kubernetes (Azure) After Cloud Migration

Comments
5 min read
We All Accepted the "Python Tax.", Pandas 3.0 Just Reduced It.

2
Comments
2 min read