#dataengineering
Integrating a Web API with Datastore Emulator
Geazi Anc
Feb 21 '23
#python #gcp #dataengineering #webdev
1 reaction
4 min read
Python functions and lambda functions in data engineering.
muriuki muriungi erick
Feb 20 '23
#dataengineering #python #lambda #functions
7 reactions
3 min read
Creating Data Pipelines as DAGs in Apache Airflow (Part 1)
franklinobasy
Feb 20 '23
#dataengineering #airflow #dags #python
1 reaction
6 min read
Using python dictionary in data engineering.
muriuki muriungi erick
Feb 19 '23
#dataengineering #python #tutorial #100daysofcode
6 reactions
2 comments
2 min read
prefect vs apache airflow
James
Feb 17 '23
#dataengineering #workfloworchestration #prefect #airflow
4 reactions
4 min read
SQL101: Introduction to SQL
Marriane Akeyo
Feb 17 '23
#sql #datascience #database #dataengineering
2 comments
14 min read
22 Best DataOps Tools To Optimize Your Data Management and Observability In 2023
Pramit Marattha for Chaos Genius
Feb 2 '23
#dataops #data #beginners #dataengineering
16 reactions
2 comments
30 min read
Data Pipelines with Great Expectations | Introduction
Samuel Earl
Jan 31 '23
#datascience #dataengineering #datavalidation #greatexpectations
4 reactions
2 min read
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset
Mike Houngbadji
Jan 20 '23
#dataengineering #datascience #python #database
16 reactions
2 comments
8 min read
Nesting Columns like a Pro: A Guide to Mastering Nested Structs in PySpark
PaulOpu
Jan 14 '23
#python #pyspark #bigdata #dataengineering
3 reactions
4 min read
PySpark: A brief analysis to the most common words in Dracula, by Bram Stoker
Geazi Anc
Jan 11 '23
#python #dataengineering #spark #datascience
18 reactions
5 min read
AWS Data Engineering Services: Everything you need to know
Parth soni
Jan 12 '23
#aws #cloudskills #dataengineering #awscommunity
5 reactions
1 comment
9 min read
Working with Map() function in Python, Pyspark and Apache Beam
Ruma Sinha
Dec 25 '22
#python #pyspark #distributedcomputing #dataengineering
1 reaction
3 min read
Time Series Database and Analytics using Azure Data Explorer
Yogesh Dipankar for OCP
Dec 15 '22
#dataengineering #datascience #timeseries #azure
1 reaction
4 min read
How I built a real-time Machine Learning system with Kafka, Elasticsearch, Kibana, and Docker
Dipankar Medhi
Dec 28 '22
#streaming #machinelearning #dataengineering #datascience
1 reaction
4 min read
Handling schema changes in snowflake
Aparna Aravind
Nov 25 '22
#snowflake #dataengineering #spark #schemaevolution
3 reactions
5 min read
Redshift Deep Dive
Sanjay Krishna
Oct 25 '22
#aws #redshift #dataengineering #dataanalytics
1 reaction
5 min read
Data Engineering Trends for 2023
avital trifsik for Memphis.dev
Nov 9 '22
#dataengineering #datacontracts #eventsourcing #streamingdata
3 reactions
4 min read
The Changing Face Of ETL
avital trifsik for Memphis.dev
Nov 2 '22
#etl #dataengineering #datapipelines #elt
3 reactions
1 comment
12 min read
Ultimate guide to becoming a Data Analyst/Data Scientist
James Oyanna
Oct 31 '22
#datascience #dataanalyst #dataengineering #machinelearning
5 reactions
4 min read
Amazon SQS and serverless DataEngineering workloads
prasanth mathesh for AWS Community Builders
Oct 25 '22
#sqs #amazon #dataengineering #serverless
2 reactions
3 min read
2022 Beginner Friendly Modern Data Engineering Career path With Learning Resources.
Mwenda Harun Mbaabu
Oct 10 '22
#python #datascience #dataengineering #machinelearning
20 reactions
2 comments
2 min read
Learn Ansible and how to Install it in Ubuntu 22.04.
Kinyungu Denis
Oct 5 '22
#tutorial #install #dataengineering
3 min read
A brief introduction to real-time data processing with Spark Structured Streaming and Apache Kafka
Geazi Anc
Sep 29 '22
#python #dataengineering #braziliandevs #spark
5 reactions
8 min read
Apache-Spark introduction for SQL developers
Cesar Mostacero
Sep 29 '22
#apachespark #dataengineering #beginners #bigdata
2 reactions
7 min read
PySpark: a brief analysis of the most common words in Dracula, by Bram Stoker
Geazi Anc
Sep 24 '22
#python #dataengineering #spark #braziliandevs
9 reactions
6 comments
6 min read
Introduction to data analysis with PySpark using League of Legends champion data
Geazi Anc
Sep 15 '22
#pyspark #python #dataanalysis #dataengineering
3 reactions
8 min read
Pokemons Flow: developing a data pipeline with Apache Airflow to extract Pokemon via API
Geazi Anc
Sep 13 '22
#python #dataengineering #braziliandevs #airflow
10 reactions
6 min read
Apache PySpark for Data Engineering
Kinyungu Denis
Sep 9 '22
#beginners #python #dataengineering #sql
13 reactions
4 comments
9 min read
Introduction to Python for Data Engineering
Haji Rufai
Sep 1 '22
#python #dataengineering #pythonfordataengineering #beginners
4 reactions
5 min read
Kubernetes Was Never Designed for Batch Jobs
Hyunho Richard Lee for Meadowrun
Sep 1 '22
#airflow #mlops #dataengineering #kubernetes
3 reactions
2 comments
17 min read
Data Engineering 102: Introduction to Python for Data Engineering.
Kinyungu Denis
Aug 31 '22
#python #dataengineering #beginners
6 reactions
10 min read
Introduction to Python for Data Engineering
muriuki muriungi erick
Aug 31 '22
#python #dataengineering #aws #iot
4 reactions
7 min read
INTRODUCTION TO PYTHON FOR DATA ENGINEERING
fatumakaliku
Aug 31 '22
#python #data #dataengineering #beginners
4 min read
DATA ENGINEERING 101: INTRODUCTION TO DATA ENGINEERING.
viola kinya kithinji
Aug 24 '22
#codenewbie #dataengineering #beginners #programming
5 reactions
2 min read
Fundamentals of Data Engineering
Armando Tadeu
Aug 24 '22
#dataengineering #database
6 reactions
9 min read
Data Engineering 101: Introduction to Data Engineering
Emmanuel Kariithi
Aug 23 '22
#newbie #dataengineering
5 reactions
2 min read
Online SQL Client for low code data management
Pavlo Paska
Aug 21 '22
#sql #database #dataengineering #lowcode
5 reactions
1 comment
5 min read
Data Engineering 101: Introduction to Data Engineering.
Kinyungu Denis
Aug 19 '22
#dataengineering #beginners
8 reactions
1 comment
6 min read
Introduction to data engineering
muriuki muriungi erick
Aug 19 '22
#datascience #dataengineering #python #sql
5 reactions
4 min read
Create Jira Ticket on Prefect Task Failure
Falk
Aug 16 '22
#python #dataengineering
1 reaction
2 min read
Hash Personal Identifiable Information (PII) in your ELT pipelines
Falk
Aug 15 '22
#python #dataengineering #database
3 reactions
3 min read
Difference Between Data Engineer and Data Scientist?
Muhammad Rameez
Aug 4 '22
#difference #pipeline #dataengineering #datascientist
7 reactions
3 min read
Learning Workflow Schedulers (Oozie)
Ruikai Li
Jul 29 '22
#bigdata #datascience #dataengineering
2 reactions
5 min read
Solving AttributeError: 'float' object has no attribute 'rint'
Olakusibe Aremu-Oluwole
Jul 26 '22
#python #etl #dataengineering #pandas
5 reactions
2 min read
[Spark-k8s] — Getting started # Part 1
Tiago Xavier
Jul 19 '22
#spark #kubernetes #dataengineering
3 reactions
4 min read
Websites to find Dataset for your Data Engineering projects.
SAIFULLAH🇮🇳
Jul 17 '22
#datascience #dataengineering #database #dataset
5 reactions
1 min read
Data engineers must-see: The future trend of big data cloud services
DMetaSoul
Jun 26 '22
#database #dataengineering #bigdata #opensource
8 reactions
1 comment
8 min read
Data Engineering Projects for Beginners
Ramses Alexander Coraspe
Jun 15 '22
#database #dataengineering #python
24 reactions
2 comments
2 min read
Data Pipelines with Apache Airflow - Book Review
Albert Ulysses
Jun 13 '22
#python #dataengineering #books #bigdata
8 reactions
2 min read
ETL vs Interactive Queries: The Case for Both
Monica Miller
Jun 13 '22
#database #dataengineering #datascience #etl
6 reactions
8 min read
Data Engineering - Creating a Streaming Data Pipeline for a Real-Time Dashboard with Dataflow
Nkwam Philip
Jun 9 '22
#database #googlecloud #dataengineering #cloud
10 reactions
4 min read
Parsing logs from multiple data sources with Ahana and Cube
Bartosz Mikulski for Cube
Jun 8 '22
#ahana #presto #dataengineering
14 reactions
24 min read
Solved a practical business problem when using Hudi: LakeSoul supports null field non-override semantics
DMetaSoul
May 29 '22
#opensource #database #dataengineering #bigdata
7 reactions
3 min read
What is the Lakehouse, the latest Direction of Big Data Architecture?
DMetaSoul
May 14 '22
#opensource #dataengineering #bigdata #database
9 reactions
10 min read
Making Data Engineering Easier: Operational Analytics With Event Streaming and Reverse ETL
Team RudderStack for RudderStack
Apr 4 '22
#datascience #eventstreaming #dataengineering #customerdataplatform
7 reactions
6 min read
Using dbt for Transformation Tasks on BigQuery
Cem Keskin
May 2 '22
#dezoomcamp #dataengineering
10 reactions
1 comment
4 min read
Docker and Kubernetes
Abdullah Paracha for AWS Community Builders
Apr 27 '22
#devops #dataengineering #softwareengineer #programming
6 reactions
3 min read
How to Use Apache Airflow to Get 1000+ Files From a Public Dataset
Cem Keskin
Apr 24 '22
#dezoomcamp #dataengineering #airflow #publicdatasets
8 reactions
10 min read
ELT Data Pipeline with Kubernetes CronJob, Azure Data Lake, Azure Databricks (Part 1)
Rubens Barbosa
Apr 24 '22
#etl #dataengineering #python #azure
9 reactions
5 min read