DEV Community

# dataengineering

Posts

CSV Processing Gotchas: Don’t Let Invalid Data Slip Through the Cracks!!!

1 min read
Streaming SQL Engine: Lightweight Cross-Data Source Integration for Resource-Constrained Environments.

1 min read
Stop Manually Tracing Azure Synapse Dependencies

1 min read
Building Pangolin: My Holiday Break, an AI IDE, and a Lakehouse Catalog for the Curious

6 min read
Part 8: Databricks Pipeline & Dashboard

2 min read
Part 4: Building the Bronze Layer with Auto Loader and Delta Lake

2 min read
Part 5: Building a ZIP Code Dimension Table

2 min read
Part 2: Project Architecture

2 min read
Part 1: Creating Databricks Workspace and Enabling Unity Catalog

2 min read
Part 3: Simulating Real-Time Streaming Data Using Databricks Sample Datasets

1 min read
Automating Serverless Data Ingestion: How to Connect External APIs to BigQuery using Python and Cloud Functions

12 min read
The Data Liberation: Amazon Athena and the Architecting of a Serverless Future

3 min read
Why Apache Ozone is the Preferred Object Store for Big Data

3 min read
Be Essential or Be Optional: A Reality Check for Data Teams

1 min read
Markdown Is Not The Future of LLM Data Infrastructure

7 min read