DEV Community

# dataengineering

Posts

Apache Doris 2.1.0: TPC-DS, Parallel Adaptive Scan, Local Shuffle, Arrow Flight-based HTTP Data API

Comments
29 min read
RisingWave workshop

2
Comments
5 min read
Production and CI/CD in dbt

2
Comments
3 min read
My Experience with Apache Airflow

9
Comments
3 min read
"Day 42 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -21)

1
Comments
1 min read
Different file formats, a benchmark doing basic operations

10
Comments 2
9 min read
5 reasons Dremio is the ideal Apache Iceberg Lakehouse Platform

Comments
5 min read
When Metrics Go Awry: Analyzing KPIs using machine learning, regression analysis, and Shapley values

Comments
5 min read
How to manage tags for objects in Snowflake

Comments
6 min read
“Data has a Dream” — A Short comic about data mesh and how it can transform your company

1
Comments 1
2 min read
The Apache Iceberg Lakehouse: The Great Data Equalizer (disrupting the Snowflake/Databricks status quo)

2
Comments
7 min read
How moving from Pandas to Polars made me write better code without writing better code

40
Comments 4
14 min read
📢 About job offers, innovation & data strategy 🔭

Comments 3
3 min read
"Day 39 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -18)

1
Comments
2 min read
GroupBy and Join in Spark

3
Comments
2 min read
How Tables and Indexes Are Stored on Disk

2
Comments
2 min read
"Day 38 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -17)

1
Comments
3 min read
4 numeric distribution metrics to track in Snowflake (and how to track them)

Comments
9 min read
10 Reasons to Make Apache Iceberg and Dremio Part of your Data Lakehouse Strategy

Comments
9 min read
Learn Python

Comments
1 min read
A deep dive into the concept and world of Apache Iceberg Catalogs

5
Comments
8 min read
Exploring Feature Stores: Personal Insights and Notes on Hopsworks pt.2

1
Comments
1 min read
Build a Real-time Materialized View from Postgres Changes using Confluent’s ksqlDB

Comments
11 min read
"Day 35 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -14)

1
Comments
2 min read
The Role of Ontologies in Data Management

1
Comments
6 min read