DEV Community

# dataengineering

Posts

👋 Sign in for the ability to sort posts by relevant, latest, or top.
CHW Monthly Activity Aggregation: Turning Visit Logs into Insight

CHW Monthly Activity Aggregation: Turning Visit Logs into Insight

Comments
5 min read
🔥 Day 2: Understanding Spark Architecture - How Spark Executes Your Code Internally

🔥 Day 2: Understanding Spark Architecture - How Spark Executes Your Code Internally

Comments
2 min read
Marmot: Data catalog without the complex infrastructure

Marmot: Data catalog without the complex infrastructure

1
Comments
3 min read
TDD for dbt: unit testing the way it should be

TDD for dbt: unit testing the way it should be

1
Comments
12 min read
Schema, COPY, MERGE, and Immutability — A First-Principles Guide for Data Engineers

Schema, COPY, MERGE, and Immutability — A First-Principles Guide for Data Engineers

Comments
5 min read
HackerRank 'The Pads' MySQL

HackerRank 'The Pads' MySQL

Comments
3 min read
🔥 Day 5: Introduction to DataFrames - The Most Importantce of Spark API

🔥 Day 5: Introduction to DataFrames - The Most Importantce of Spark API

Comments
2 min read
Generating Table Schema from AWS Glue Table

Generating Table Schema from AWS Glue Table

2
Comments
1 min read
Comparing Great Expectations and CsvPath Framework

Comparing Great Expectations and CsvPath Framework

Comments
8 min read
Financial Transaction Data Reconciler PayPal

Financial Transaction Data Reconciler PayPal

Comments
5 min read
Streamlit desde cero: cómo crear una app para explorar y visualizar datos

Streamlit desde cero: cómo crear una app para explorar y visualizar datos

1
Comments
4 min read
Introducing dremioframe - A Pythonic DataFrame Interface for Dremio

Introducing dremioframe - A Pythonic DataFrame Interface for Dremio

Comments
9 min read
Stifel Modern Data Platform

Stifel Modern Data Platform

Comments
4 min read
Core Microsoft Fabric Concepts

Core Microsoft Fabric Concepts

1
Comments
3 min read
Implementing a CDC pipeline with Debezium

Implementing a CDC pipeline with Debezium

Comments
8 min read
Building Bulletproof Data Pipelines: Orchestration, Testing, and Monitoring (Part 3 of 3)

Building Bulletproof Data Pipelines: Orchestration, Testing, and Monitoring (Part 3 of 3)

1
Comments
9 min read
LogInSight: A Lightweight CloudWatch Log Analytics Tool for Faster Debugging and Real-Time Insights

LogInSight: A Lightweight CloudWatch Log Analytics Tool for Faster Debugging and Real-Time Insights

2
Comments
3 min read
The Proxy Economy: Residential, Datacenter, and ISP Rotation

The Proxy Economy: Residential, Datacenter, and ISP Rotation

Comments 2
5 min read
The Database Query That Could Cost a Company Millions(And Why Data Engineers Exist)

The Database Query That Could Cost a Company Millions(And Why Data Engineers Exist)

Comments
5 min read
RAG Evaluation Metrics: Measuring What Actually Matters

RAG Evaluation Metrics: Measuring What Actually Matters

1
Comments
10 min read
Building Streaming Iceberg Tables for Real-Time Logistics Analytics

Building Streaming Iceberg Tables for Real-Time Logistics Analytics

Comments
4 min read
Building a Scalable Community Health Worker Analytics Platform: My Journey with dbt and Snowflake

Building a Scalable Community Health Worker Analytics Platform: My Journey with dbt and Snowflake

Comments
4 min read
The Great Table Format Debate: A Deep Dive into Apache Iceberg, Delta Lake, and Apache Hudi

The Great Table Format Debate: A Deep Dive into Apache Iceberg, Delta Lake, and Apache Hudi

Comments
18 min read
KISSing the Data Architecture

KISSing the Data Architecture

Comments
20 min read
Amazon Kinesis vs Amazon MSK: The Complete Guide for Stream Processing on AWS

Amazon Kinesis vs Amazon MSK: The Complete Guide for Stream Processing on AWS

Comments
29 min read
loading...