DEV Community

# dataengineering

Posts

Kafka Connect JDBC Sink deep-dive: Working with Primary Keys

3
Comments
28 min read
Quick profiling of data in Apache Kafka using kafkacat and visidata

2
Comments 1
2 min read
📼 ksqlDB HOWTO - A mini video series 📼

10
Comments
4 min read
Tech Exceptions Show: Accelerating Data Engineering with Azure

12
Comments
2 min read
Apache Spark Ecosystem, Jan 2021 Highlights

11
Comments
4 min read
Running a self-managed Kafka Connect worker for Confluent Cloud

8
Comments
11 min read
ETL with Apache Airflow, Web Scraping, AWS S3, Apache Spark and Redshift | Part 1

21
Comments 2
7 min read
Kafka Connect - Deep Dive into Single Message Transforms

4
Comments
3 min read
First Look: AWS Glue DataBrew

10
Comments
7 min read
My favourite re:Invent data announcements

8
Comments
5 min read
🎄 Twelve Days of SMT 🎄 - Day 6: InsertField II

6
Comments
3 min read
New Features in Amazon DynamoDB - PartiQL, Export to S3, Integration with Kinesis Data Streams

11
Comments
12 min read
🎄 Twelve Days of SMT 🎄 - Day 1: InsertField (timestamp)

5
Comments
3 min read
Datetimes Are Hard: Part 1 - Incoming data and formats

4
Comments 1
4 min read
Tidying up Pipelines with DataClasses

5
Comments
5 min read
Uniform Data Distribution Among Kinesis Data Stream Shards

2
Comments 2
3 min read
Cut data warehouse costs with run caching

5
Comments
3 min read
Introduction to Data Pipelines

2
Comments 1
4 min read
Dagster with User Code Deployments (gRPC)

21
Comments 2
6 min read
12 Ways of Applying a Function to Python Pandas DataFrame

3
Comments
1 min read
Data engineering essentials

4
Comments 1
1 min read
Some of my favourite public data sets

8
Comments 3
2 min read
Becoming a Data Engineer

64
Comments 2
1 min read
Transform AWS CloudTrail data using AWS Data Wrangler

3
Comments
8 min read
5 Essential skills for becoming a Data Engineer

8
Comments
6 min read