DEV Community

# dataengineering

Posts

The Data Engineer’s Codex: From First Principles to the Modern Lakehouse

6
Comments
10 min read
Breaking Into Gaming Analytics: From 1 Billion Mobile Users to 5B Daily Events

Comments 1
6 min read
Building a Real-Time Data Lake on AWS: S3, Glue, and Athena in Production

1
Comments
5 min read
Embeddings and Vector Similarity: How Machines Understand Meaning

1
Comments
19 min read
Containerization for Data Engineering: A Practical Guide with Docker and Docker Compose

Comments
2 min read
Join OSA CON 2025: Two Days of Open‑Source Analytics and AI (Nov. 4–5)

Comments
3 min read
AWS Glue for ETL

Comments
5 min read
What to use for data preparation in report, query or analysis business?

5
Comments
10 min read
Optimizing Data Processing on AWS with Data Compaction

4
Comments
7 min read
Real-Time Earthquake CDC Pipeline

Comments
5 min read
Designing a Cost-Efficient Parallel Data Pipeline on AWS Using Lambda and SQS

3
Comments
6 min read
The Offline Data Engineer: Building Resilient API Pipelines that Work on an Airplane

4
Comments
5 min read
Understanding Kafka Architecture, Schema Registry, ksqlDB, PostgreSQL, Couchbase, and Microservices

2
Comments
3 min read
Building a 75,000-Product Image Feature Dataset for the Amazon ML Challenge 2025

1
Comments
4 min read
An Exploration of the Commercial Iceberg Catalog Ecosystem

Comments
14 min read