DEV Community

# dataengineering

Posts

Lessons Learned from Building Product Dashboards That Drive Real Decisions

5
Comments 1
4 min read
Data Collection and Preparation for Machine Learning

6
Comments 1
4 min read
Containerization for Data Engineering: A practical Guide with Docker and Docker Compose

Comments
3 min read
Understanding Data Formats in Cloud & Data Analytics

Comments
3 min read
Stop Copy-Pasting Between Excel and Code: Automate Your Data Workflows with GridScript

Comments
2 min read
SQL: Summing categories

Comments
2 min read
Fix Slow Query: A Developer's Guide to Data Warehouse Performance

1
Comments
14 min read
Data in Cloud

Comments
4 min read
🧠 Real-Time Comment Ranking with Kafka and Sentiment Analysis

Comments
3 min read
How I Used AWS Glue and Athena for Serverless Data Analytics

Comments
2 min read
Comparing CsvPath and CSV Schema

Comments
4 min read
A Deep Dive into Apache Spark Architecture

1
Comments
4 min read
# Data Ingestion & Vector Store #llmszoomcamp

Comments
2 min read
Database Fundamentals

Comments
3 min read
Distributed Media Inferencing with Kafka

Comments 1
5 min read
🧑‍💻 Apache Kafka CLI – Detailed Course

Comments
2 min read
🌍 Automating Africa’s Energy Data Collection Using Python, Playwright(+Why Playwright ?), and MongoDB (2000–2024)

5
Comments
5 min read
Apache Doris 4.0: One Engine for Analytics, Full-Text Search, and Vector Search

4
Comments
22 min read
From 8 Minutes to 40 Seconds: Solving Data Pipeline Deployment Bottlenecks with Git Sparse Checkout

Comments
5 min read
Create a Microsoft Fabric Lakehouse

5
Comments
6 min read
Core Concepts of Kafka

Comments
8 min read
From Kafka to Clean Tables: Building a Confluent Snowflake Pipeline with Streams & Tasks.

Comments
9 min read
Apache Kafka: ZooKeeper vs. KRaft — A Complete Comparison of Approaches

Comments
6 min read
Introduction to Apache Airflow

1
Comments
4 min read
Building a Production-Ready Data Lake: PostgreSQL to S3 with AWS DMS, Glue, and Athena using CDK

2
Comments
8 min read