
Kimeu

Introduction to Data Engineering

Data engineering is a discipline that entails collecting, translating, and validating data for analysis. A good data engineer makes quality data available for analysis and data-driven decision making. There are four areas a data engineer should be well versed in:
Data. Data comes in different file formats, for example CSV, TSV, and JSON.
Data stores and repositories. These include relational and non-relational databases, data lakes, and data warehouses.
Data pipelines. These entail collecting data from different sources.
Analytics and data-driven decision making.
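
To make the file formats mentioned above concrete, here is a minimal sketch that parses the same records from CSV, TSV, and JSON using only Python's standard library. The sample records and the `read_delimited` helper are made up for illustration; real data would usually come from files or an API.

```python
import csv
import io
import json

# Hypothetical sample records, invented for illustration only.
CSV_TEXT = "id,name\n1,Ada\n2,Linus\n"
TSV_TEXT = "id\tname\n1\tAda\n2\tLinus\n"
JSON_TEXT = '[{"id": "1", "name": "Ada"}, {"id": "2", "name": "Linus"}]'


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# CSV and TSV differ only in the delimiter character.
csv_rows = read_delimited(CSV_TEXT, ",")
tsv_rows = read_delimited(TSV_TEXT, "\t")

# JSON already carries its own structure, so it maps straight to Python objects.
json_rows = json.loads(JSON_TEXT)

print(csv_rows == tsv_rows == json_rows)
```

Note that the `csv` module reads every value as a string, which is why the JSON sample also stores `id` as a string here. Converting types is typically part of a later validation step.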
Python is a preferred programming language for data engineering because it has a wide variety of packages that are easy to import and that help with data wrangling, ETL (Extract, Transform, Load), and feature engineering.
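
As a rough illustration of the ETL idea, the sketch below extracts rows from CSV text, transforms them by dropping incomplete rows and converting types, and loads them into an in-memory SQLite database. It uses only the standard library. The column names, sample values, and the `readings` table are assumptions made up for this example, not part of any particular tool or dataset.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the columns and values are invented for this sketch.
RAW_CSV = "city,temp_c\nNairobi,24.5\nMombasa,\nKisumu,29.0\n"


def extract(text):
    """Extract: read raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with a missing temperature and cast the rest to float."""
    clean = []
    for row in rows:
        if not row["temp_c"]:
            continue  # validation step: drop incomplete records
        clean.append((row["city"], float(row["temp_c"])))
    return clean


def load(records, conn):
    """Load: write the cleaned records into a relational table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

loaded = conn.execute("SELECT city, temp_c FROM readings ORDER BY city").fetchall()
print(loaded)
```

Keeping extract, transform, and load as separate functions makes each step easy to test on its own. In practice, a library such as pandas is often used for the transform step, but the overall shape stays the same.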
