#dataengineering
Posts
First Look: AWS Glue DataBrew
Rich Dudley for AWS Community Builders
Dec 29 '20
#glue #databrew #dataengineering #etl
10 reactions
7 min read
My favourite re:Invent data announcements
Peter Hanssens #BlackLivesMatter for AWS Heroes
Dec 17 '20
#aws #dataengineering #redshift #reinvent2020
8 reactions
5 min read
🎄 Twelve Days of SMT 🎄 - Day 6: InsertField II
Robin Moffatt
Dec 15 '20
#apachekafka #kafkaconnect #dataengineering #twelvedaysofsmt
6 reactions
3 min read
New Features in Amazon DynamoDB - PartiQL, Export to S3, Integration with Kinesis Data Streams
Anand
Dec 15 '20
#aws #dynamodb #database #dataengineering
11 reactions
12 min read
🎄 Twelve Days of SMT 🎄 - Day 1: InsertField (timestamp)
Robin Moffatt
Dec 8 '20
#apachekafka #kafkaconnect #dataengineering #twelvedaysofsmt
5 reactions
3 min read
Datetimes Are Hard: Part 1 - Incoming data and formats
Anniina Sallinen for Ompeluseura LevelUP Koodarit
Nov 22 '20
#dataengineering #data #series
4 reactions
1 comment
4 min read
Tidying up Pipelines with DataClasses
Sephi Berry
Nov 16 '20
#pipeline #dataengineering #python #scikit
5 reactions
5 min read
Uniform Data Distribution Among Kinesis Data Stream Shards
Irtiza Ali
Nov 12 '20
#kinesis #dataengineering #python3 #aws
2 reactions
2 comments
3 min read
Cut data warehouse costs with run caching
BenBirt for Dataform
Sep 24 '20
#elt #dataengineering #pipeline #etl
5 reactions
3 min read
Introduction to Data Pipelines
Eshban Suleman for Traindex
Oct 26 '20
#datascience #bigdata #dataengineering #pipelines
2 reactions
1 comment
4 min read
Dagster with User Code Deployments (gRPC)
Michiel Ghyselinck
Oct 15 '20
#dataengineering #etl #dagster #kubernetes
20 reactions
2 comments
6 min read
12 Ways of Applying a Function to Python Pandas DataFrame
Satish Chandra Gupta
Oct 10 '20
#python #datascience #machinelearning #dataengineering
3 reactions
1 min read
Data engineering essentials
Aman Ranjan Verma
Oct 3 '20
#hacktoberfest #python #devops #dataengineering
4 reactions
1 comment
1 min read
Some of my favourite public data sets
Robin Moffatt
Sep 25 '20
#opendata #data #dataengineering #datascience
8 reactions
3 comments
2 min read
Becoming a Data Engineer
Adi Polak
Sep 21 '20
#database #career #beginners #dataengineering
64 reactions
2 comments
1 min read
Transform AWS CloudTrail data using AWS Data Wrangler
Anand
Sep 20 '20
#aws #bigdata #cloud #dataengineering
3 reactions
8 min read
5 Essential skills for becoming a Data Engineer
Bartosz Gajda
Sep 19 '20
#beginners #datascience #career #dataengineering
8 reactions
6 min read
The Most Popular Data Science Newsletters
Greg
Sep 17 '20
#datascience #machinelearning #dataengineering
11 reactions
9 min read
Build a monitored code-based pipeline to move data from Postgres to Snowflake
Simon Yu
Sep 17 '20
#aws #postgres #snowflake #dataengineering
7 reactions
9 min read
Handling upstream data changes via Change Data Capture
Wai Yan
Sep 14 '20
#dataengineering #database #datapipeline
8 reactions
8 min read
Intoduction to Apache Spark
maninekkalapudi
Sep 14 '20
#dataengineering #apachespark #bigdata #spark
10 reactions
6 min read
Kafka Connect in 60 seconds
01:00
Robin Moffatt
Sep 11 '20
#apachekafka #dataengineering #bigdata #dataintegration
4 reactions
2 min read
Deploying data pipelines to AWS Fargate - with monitoring and alerts built-in
Simon Yu
Aug 25 '20
#aws #dataengineering #serverless #docker
6 reactions
3 min read
Windowing in Streaming Data: Theory and a Scikit-Multiflow Example
Nazli Ander
Aug 14 '20
#python #datascience #dataengineering #datastreams
2 reactions
4 min read
Data Warehouse - The Minimal Architectural Approach
Darsh Shukla
Aug 11 '20
#dataengineering #datascience #datawarehouse #cloud
3 reactions
1 comment
2 min read
Data Lake - 5 Major Principles
Darsh Shukla
Aug 11 '20
#dataengineering #datascience #datalake #cloud
2 reactions
2 min read
Scrape Structured Data with Python and Extruct
Todd Birchard for Hackers And Slackers
Aug 8 '20
#python #scraping #dataengineering #scraper
10 reactions
16 min read
How To Run Airflow on Windows (with Docker)
Josh Holbrook
Jul 12 '20
#dataengineering #etl #airflow
25 reactions
3 comments
8 min read
Implementing a graph network pipeline with Dagster
Sephi Berry
Jul 9 '20
#dagster #dataengineering #pipeline #graph
22 reactions
1 comment
12 min read
What differentiates schema on read from schema on write?
Krithika
Jun 21 '20
#database #data #dataengineering #schema
3 reactions
2 comments
3 min read
Loading CSV data into Kafka - video walkthrough
Robin Moffatt for Confluent
Jun 23 '20
#apachekafka #tutorial #csv #dataengineering
5 reactions
10 min read
Scraping Data on the Web with BeautifulSoup
Todd Birchard for Hackers And Slackers
Jun 11 '20
#python #scrapers #data #dataengineering
33 reactions
12 min read
CI/CD for ETL/ELT pipelines
BenBirt for Dataform
Jun 8 '20
#cicd #etl #elt #dataengineering
19 reactions
3 min read
A proven approach to land a Data Engineering job
Joseph
Jun 3 '20
#database #beginners #career #dataengineering
6 reactions
5 min read
Data Engineering Project for Beginners - Batch edition
Joseph
May 28 '20
#dataengineering #tutorial #beginners #aws
26 reactions
19 min read
10 Key skills, to help you become a data engineer
Joseph
May 11 '20
#dataengineering #beginners #database #etl
9 reactions
3 min read
Airflow UI with Role-Based Access Control
CitizenK
Apr 23 '20
#python #apacheairflow #dataengineering
5 reactions
1 min read
Apache Airflow Installation - mysql+celery
CitizenK
Apr 22 '20
#python #apacheairflow #dataengineering
7 reactions
1 min read
Extract Nested Data From Complex JSON
Todd Birchard for Hackers And Slackers
Mar 26 '20
#python #restapis #dataengineering
10 reactions
6 min read
DataOps - A Made-Up Term or Actual Practice
JoLo
Mar 3 '20
#dataops #dataengineering #datascience #devops
13 reactions
7 min read
🛢Create New Kedro Pipeline (kedro new)
Waylon Walker
Mar 2 '20
#data #dataengineering #kedro #datascience
5 reactions
4 min read
🤷♀️ What is Kedro (The Parts)
Waylon Walker
Feb 24 '20
#data #dataengineering #kedro #datascience
18 reactions
4 comments
3 min read
Data engineering portfolio projects?
Josh Yap
Feb 12 '20
#dataengineering #data #portfolio
29 reactions
1 comment
1 min read
Apache Airflow Core Concepts
Zahidul Islam
Jan 3 '20
#airflow #workflow #dataengineering
26 reactions
4 min read
Coding MapReduce in C from Scratch using Threads: Map
Luciano Strika
Oct 19 '19
#programming #beginners #c #dataengineering
9 reactions
9 min read
I am a junior data engineer without a senior engineer. What should I do?
Caleb Ariel
Sep 30 '19
#startup #dataengineering
7 reactions
1 comment
1 min read
Toward GCP Data Engineer certification
Cong
Sep 26 '19
#gcp #machinelearning #bigdata #dataengineering
9 reactions
1 min read
Data Engineering Skills
00:31
Yan Parker
Sep 14 '19
#dataengineering #datascience
14 reactions
1 min read
Intro to Data Ingestion and Data Lakes
Flo Comuzzi
Aug 9 '19
#dataengineering #datalake #dataingestion
8 reactions
1 comment
3 min read
Data Engineering — Complete Reference Guide From A-Z [2019]
Yan Parker
Aug 7 '19
#dataengineering #datascience #bigdata
31 reactions
16 min read
Overview of the different approaches to putting Machine Learning (ML) models in production
Julien Kervizic
Aug 6 '19
#data #dataengineering #machinelearning #software
9 reactions
14 min read
ON the evolution of Data Engineering
Julien Kervizic
Jul 30 '19
#data #dataengineering #etl #sql
15 reactions
4 min read
Understanding and Optimizing Throughput in Azure Cosmos DB
Will Velida
Jul 15 '19
#database #dataengineering #bigdata #azure
4 reactions
2 comments
8 min read
10 Days to Become a Google Cloud Certified Professional Data Engineer
Jeff Hale
Jun 22 '19
#cloud #database #dataengineering #google
26 reactions
2 comments
11 min read
Manage Data Pipelines with Apache Airflow
Todd Birchard for Hackers And Slackers
Jun 25 '19
#apache #python #dataengineering #etl
76 reactions
13 min read
Data Analyst vs Data Engineer vs Data Scientist: Skills, Responsibilities, Salary
aayushi94
Jun 3 '19
#dataanalyst #dataengineering #datascientist
23 reactions
4 min read
How to collect the data you need to bootstrap your digital marketing analytics
Julien Kervizic
Oct 19 '19
#dataengineering #data #marketing #analytics
14 reactions
12 min read
5 Considerations to have when using Airflow
Julien Kervizic
Oct 18 '19
#analytics #machinelearning #data #dataengineering
13 reactions
6 min read
Structured Streaming in PySpark
Todd Birchard for Hackers And Slackers
Oct 10 '19
#spark #apache #python #dataengineering
13 reactions
9 min read
How to Run Parallel Data Analysis in Python using Dask Dataframes
Luciano Strika
Apr 14 '19
#dataanalysis #programming #dataengineering #parallel
7 reactions
6 min read