DEV Community

Gathuru_M


Data Engineering 102: Introduction to Python for Data Engineering

Python is one of today's most popular programming languages, with applications across many fields. Its flexible, dynamic nature makes it well suited to deployment, analysis, and maintenance work.
It is a core skill in data engineering, used to build data pipelines, set up statistical models, and analyze their results.
To start using Python, you need it installed on your operating system, whether that is Linux, macOS, or Windows.
Python offers a large ecosystem of libraries and packages. Five that are commonly used for data engineering are:

-Pandas
-pygrametl
-petl
-Beautiful Soup
-SciPy

Before turning to these packages, here are some Python topics to start with:

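1. Math Expressions

| Syntax | Math | Meaning |
| --- | --- | --- |
| a+b | a + b | addition |
| a-b | a − b | subtraction |
| a*b | a × b | multiplication |
| a/b | a ÷ b | true division (returns a float in Python 3) |
| a//b | ⌊a ÷ b⌋ | floor division (Python 2.2 and above) |
| a%b | a mod b | modulo |
| -a | −a | negation |
| abs(a) | \|a\| | absolute value |
| a**b | a^b | exponent |
| math.sqrt(a) | √a | square root (requires import math) |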
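
The operators in the table can be tried out directly. The snippet below is a minimal sketch; the values 7 and 2 and the variable names are chosen only for illustration.

```python
import math

# Illustrative operands (not from the article)
a, b = 7, 2

total = a + b             # addition
difference = a - b        # subtraction
product = a * b           # multiplication
quotient = a / b          # true division: always a float in Python 3
floor_quotient = a // b   # floor division: rounds down to a whole number
remainder = a % b         # modulo
negated = -a              # negation
magnitude = abs(negated)  # absolute value
power = a ** b            # exponent
root = math.sqrt(49)      # square root, from the math module

print(total, difference, product, quotient, floor_quotient)
print(remainder, negated, magnitude, power, root)
```

Note the difference between / and //: the first keeps the fractional part, the second drops it.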

2. Strings
Strings in Python are surrounded by either single or double quotation marks.
For example, 'hello' is the same as "hello".

You can display a string literal with the print() function.
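
As a small sketch of both points, the snippet below compares a single-quoted and a double-quoted string and prints them. The variable names are illustrative only.

```python
# The same text written with single and with double quotes
single = 'hello'
double = "hello"

# Both forms produce an identical string value
same = single == double

print(single)
print(same)

# Double quotes are handy when the text itself contains an apostrophe
print("It's a string")
```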

3. Variables
Variables are containers for storing data values.
A variable is created the moment you first assign a value to it.
x = 5
y = "John"
print(x)
print(y)

4. Loops
Python has two loop statements, while and for, and either one can be nested inside another loop.

(a) While Loop:
In Python, a while loop executes a block of statements repeatedly as long as a given condition is true. When the condition becomes false, execution continues with the line immediately after the loop.

while expression:
    statement(s)
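
A concrete version of this syntax might look like the sketch below. The counter and the list are hypothetical names used only to make the loop's behavior visible.

```python
count = 0
visited = []

# The body runs as long as the condition holds
while count < 3:
    visited.append(count)
    count += 1  # without this update the loop would never end

# Execution continues here once count < 3 becomes false
print(visited)
print("Loop finished, count =", count)
```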

(b) For in Loop:
For loops are used for sequential traversal, for example over a list, a string, or an array. Python has no C-style for loop such as for (i=0; i<n; i++). Instead it has a "for in" loop, which is similar to the for-each loop in other languages. Its syntax for sequential traversal is:

for iterator_var in sequence:
    statement(s)
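
The sketch below shows the "for in" loop over a list, a string, and a range. The data values and names are made up for illustration.

```python
# Traverse a list and accumulate a total
readings = [20, 22, 24]
total = 0
for reading in readings:
    total += reading

# Traverse a string one character at a time
letters = []
for ch in "abc":
    letters.append(ch)

# range(n) covers the common "count from 0 to n-1" case of a C-style loop
indices = []
for i in range(3):
    indices.append(i)

print(total, letters, indices)
```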

(c) Nested Loops:
Python allows one loop to be placed inside another. The syntax below shows one while loop nested inside another.

Syntax:

while expression:
    while expression:
        statement(s)
    statement(s)
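
As an illustration of nesting, the sketch below builds a small multiplication grid with two while loops. The names rows, i, and j are hypothetical.

```python
rows = []
i = 1
while i <= 2:            # outer loop: one pass per row
    j = 1
    row = []
    while j <= 3:        # inner loop: runs fully for every outer pass
        row.append(i * j)
        j += 1
    rows.append(row)     # belongs to the outer loop only
    i += 1

print(rows)
```

The inner loop restarts from j = 1 on each pass of the outer loop, which is why j is reset inside the outer body.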

5. Functions
A function is a block of code that only runs when it is called.

The basic syntax is:
def my_function():
    print("Hello from a function")
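
Defining a function does not run it; the body executes only when the function is called. The sketch below calls my_function and adds a second, hypothetical function (fahrenheit_to_celsius, not from the article) to show a parameter and a return value.

```python
def my_function():
    print("Hello from a function")


def fahrenheit_to_celsius(degrees_f):
    # Hypothetical helper: takes an argument and returns a result
    return (degrees_f - 32) * 5 / 9


# Nothing has been printed yet; the calls below run the function bodies
my_function()
boiling = fahrenheit_to_celsius(212)
print(boiling)
```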

6. Lists, Tuples, Dictionaries, and Sets
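
As a quick illustration of these four built-in collection types, here is a minimal sketch. The sample values and variable names are assumptions made for the example.

```python
# List: ordered and changeable
columns = ["id", "name", "city"]
columns.append("country")

# Tuple: ordered and unchangeable once created
point = (3, 4)

# Dictionary: maps keys to values
record = {"id": 1, "name": "John"}
record["city"] = "Nairobi"

# Set: unordered, keeps only unique items
cities = {"Nairobi", "Mombasa", "Nairobi"}

print(columns)
print(point[0], point[1])
print(record)
print(len(cities))
```

The duplicate "Nairobi" is stored only once in the set, which makes sets useful for de-duplicating values.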

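It's also important to learn how to connect to databases and cloud services with:
1. Boto3 (the AWS SDK for Python)
2. Psycopg2 (a PostgreSQL adapter)
3. MySQL (through a driver such as MySQL Connector/Python)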

So, as long as there is data to process, data engineers will be in demand. I wish you all the best as you choose to pursue this journey.

Thanks for reading!
Any questions? Leave your comment below to start fantastic discussions!
