DEV Community

petercour

Python Pandas, data processing

The pandas module lets you load and process tabular data. For instance, you may have Excel data that you want to read.

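You can load an Excel file with the function pd.read_excel(filename), where filename may include a path. It can read both xls and xlsx files, provided a matching reader package is installed (for example xlrd for xls and openpyxl for xlsx).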

The data is returned as a data frame (DataFrame). The data frame is the two-dimensional table structure in pandas, with labeled columns and an index for the rows, which you can edit or plot.

#!/usr/bin/python3
# coding: utf-8
import pandas as pd

# Read the spreadsheet into a data frame
df = pd.read_excel('example.xls')
data1 = df.head(7)  # first seven rows
data2 = df.values   # all rows as a NumPy array
print("A \n{0}".format(data1))
print("B \n{0}".format(data2))

What we are doing here is very simple. I'll describe the steps.

Load the Excel data. The file has to be in the directory you run the script from; otherwise a path must be specified.

df = pd.read_excel('example.xls')
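If the file lives somewhere else, a small wrapper can make the path handling explicit. This is only a minimal sketch: the helper name load_excel and the data folder are hypothetical and not part of the original script, and reading an xls file assumes the xlrd package is installed.

```python
from pathlib import Path

import pandas as pd


def load_excel(filename, folder="."):
    """Hypothetical helper: read an Excel file from `folder` into a DataFrame."""
    path = Path(folder) / filename
    if not path.is_file():
        # Fail early and show the resolved location, which helps when the
        # script is started from a different working directory.
        raise FileNotFoundError(f"Excel file not found: {path.resolve()}")
    return pd.read_excel(path)


if __name__ == "__main__":
    try:
        # "data" is an assumed folder name, used only for illustration.
        df = load_excel("example.xls", folder="data")
        print(df.head())
    except FileNotFoundError as err:
        print(err)
```

Checking the path first is a design choice, not a requirement: read_excel raises its own error for a missing file, but the message here includes the full resolved path.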

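Store data from the data frame in variables. head(7) returns the first seven rows as a new data frame, and values returns all rows as a NumPy array.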

data1 = df.head(7)
data2 = df.values
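To see the difference between the two results without needing the spreadsheet, here is a small sketch that builds the data frame in memory. The rows are copied from the example output in this article; constructing the frame by hand is an assumption made so the snippet runs on its own.

```python
import pandas as pd

# Rows copied from the example output in this article; the real script
# reads them from example.xls instead.
df = pd.DataFrame({
    "id": [201901, 201902, 201903, 201904, 201905, 201906, 201907, 201908],
    "name": ["Aaron", "Arthur", "Angus", "Albert",
             "Adrian", "Adam", "Andres", "Alex"],
    "class": [1, 1, 1, 2, 2, 3, 3, 3],
    "date": pd.date_range("2019-01-01", periods=8, freq="D"),
    "stature": [1, 1, 1, 2, 2, 1, 1, 1],
})

data1 = df.head(7)  # first seven rows, still a DataFrame with column labels
data2 = df.values   # every row as a NumPy array, column labels are dropped

print(type(data1).__name__, data1.shape)  # DataFrame (7, 5)
print(type(data2).__name__, data2.shape)  # ndarray (8, 5)
```

The pandas documentation recommends df.to_numpy() over df.values for new code; both return the same kind of array here.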

Output those variables. Each one is a multi-line table rather than a single value, so we print a label, a newline, and then the formatted data.

print("A \n{0}".format(data1))
print("B \n{0}".format(data2))
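The same output can also be written with an f-string, which some readers find easier to scan. A minimal sketch, using a two-row stand-in frame (an assumption, so that no Excel file is needed):

```python
import pandas as pd

# Tiny stand-in frame; the two rows are taken from the example output.
df = pd.DataFrame({"id": [201901, 201902], "name": ["Aaron", "Arthur"]})

with_format = "A \n{0}".format(df)  # style used in the article
with_fstring = f"A \n{df}"          # same text built with an f-string

print(with_fstring)
print(with_format == with_fstring)  # True
```

Both forms convert the data frame to its string representation, so the printed text is identical.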

Run it in a terminal (or an IDE if you prefer):

python3 example.py  

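This prints the data from the Excel file. A shows only the first seven rows, while B shows every row as an array, so the eighth row (Alex) appears only in B: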

A 
       id    name  class       date  stature
0  201901   Aaron      1 2019-01-01        1
1  201902  Arthur      1 2019-01-02        1
2  201903   Angus      1 2019-01-03        1
3  201904  Albert      2 2019-01-04        2
4  201905  Adrian      2 2019-01-05        2
5  201906    Adam      3 2019-01-06        1
6  201907  Andres      3 2019-01-07        1
B 
[[201901 'Aaron' 1 Timestamp('2019-01-01 00:00:00') 1]
 [201902 'Arthur' 1 Timestamp('2019-01-02 00:00:00') 1]
 [201903 'Angus' 1 Timestamp('2019-01-03 00:00:00') 1]
 [201904 'Albert' 2 Timestamp('2019-01-04 00:00:00') 2]
 [201905 'Adrian' 2 Timestamp('2019-01-05 00:00:00') 2]
 [201906 'Adam' 3 Timestamp('2019-01-06 00:00:00') 1]
 [201907 'Andres' 3 Timestamp('2019-01-07 00:00:00') 1]
 [201908 'Alex' 3 Timestamp('2019-01-08 00:00:00') 1]]
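As mentioned earlier, a data frame can be edited once it is loaded. The sketch below filters rows and adds a column. It uses a few rows copied from the output above in place of example.xls, and the column name name_length is made up for this example.

```python
import pandas as pd

# A few rows copied from the output above, standing in for example.xls.
df = pd.DataFrame({
    "id": [201901, 201904, 201906, 201908],
    "name": ["Aaron", "Albert", "Adam", "Alex"],
    "class": [1, 2, 3, 3],
})

# "class" is a Python keyword, so use bracket access rather than df.class.
class_three = df[df["class"] == 3]

# Edit: add a derived column (hypothetical name, for illustration only).
df["name_length"] = df["name"].str.len()

print(class_three)
print(df)
```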
