Muhammet Tan

Most used Pandas features — DataFrame [PART II]

Hello, today I will talk about the pandas library. A pandas DataFrame can be initialized like this.

pandas.DataFrame(data, index, columns)
#data     -> the input data: a list, dict, or another Python object
#index    -> optional row labels; by default they start from 0
#columns  -> optional; lets us specify the column names
import pandas as pd

# A plain list becomes a single column
df = pd.DataFrame(["Paris", "Berlin", "Roma", "Ankara"])
print(df)

Output: 
        0
0   Paris
1  Berlin
2    Roma
3  Ankara
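
To make the index and columns parameters concrete, here is a minimal sketch that builds the same one-column frame with explicit labels. The row labels a to d and the column name capitals are illustrative choices, not something pandas requires.

```python
import pandas as pd

cities = ["Paris", "Berlin", "Roma", "Ankara"]

# index supplies the row labels, columns supplies the column names
# (both label sets are arbitrary, chosen only for illustration)
df_labeled = pd.DataFrame(cities, index=["a", "b", "c", "d"], columns=["capitals"])

print(df_labeled)
print(df_labeled.index.tolist())
print(df_labeled.columns.tolist())
```

The number of labels has to match the shape of the data: four row labels for four values, one column name for one column.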

As you can see in the code above, the index and the column label are generated by default. Of course, we can specify these values ourselves. Let's learn how to create a pandas DataFrame from a dictionary with user-defined column names and index labels.

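import pandas as pd

# Dictionary keys become the column names
# (named data so the built-in dict is not shadowed)
data = {'capitals': ["Paris", "Berlin", "Roma", "Ankara"],
        'population': [15, 12, 18, 9]}

# One label per row
row_label = ["a", "b", "c", "d"]

df = pd.DataFrame(data, index=row_label)

print(df)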

Output:
  capitals  population
a    Paris          15
b   Berlin          12
c     Roma          18
d   Ankara           9
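
Once the rows carry custom labels, those labels can be used for lookups. The sketch below rebuilds the frame from the example above and shows label-based access with .loc, together with the columns parameter, which selects and orders the dictionary keys. The variable names df_custom and row_b are illustrative.

```python
import pandas as pd

data = {'capitals': ["Paris", "Berlin", "Roma", "Ankara"],
        'population': [15, 12, 18, 9]}

# columns picks which dictionary keys to use and in what order
df_custom = pd.DataFrame(data, index=["a", "b", "c", "d"],
                         columns=['population', 'capitals'])

# .loc looks a row up by its label rather than by its position
row_b = df_custom.loc["b"]

print(df_custom)
print(row_b['capitals'], row_b['population'])
```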

We can also create a DataFrame from Series. The concat() function lets us merge different Series into a new DataFrame object. This process is highly customizable through its parameters.

import pandas as pd

person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])
department = pd.Series(['Accountant', 'Manager'])

# axis=1 joins the Series as columns; keys supplies the column labels
df = pd.concat([person, fees, department], axis=1,
               keys=['Person', 'Salary', 'Department'])

print(df)

Output:
  Person  Salary  Department
0   Jack   20000  Accountant
1  Linda   25000     Manager

axis=1 tells concat() to join the Series side by side as columns instead of stacking them as rows.

With the keys parameter, we added the column labels.
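
For contrast, here is a small sketch of what happens without axis=1: concat() falls back to its default axis=0 and stacks the Series end to end, giving one longer Series instead of a DataFrame. The ignore_index=True argument, which renumbers the result, is an optional extra that the example above does not use.

```python
import pandas as pd

person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])

# Default axis=0: values are stacked as rows and the original labels repeat
stacked = pd.concat([person, fees])
print(type(stacked).__name__)
print(stacked.index.tolist())

# ignore_index=True replaces the repeated labels with 0..n-1
renumbered = pd.concat([person, fees], ignore_index=True)
print(renumbered.index.tolist())
```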

Of course, we could get the same result by passing a dictionary, whose keys become the column labels:

person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])
department = pd.Series(['Accountant', 'Manager'])

# The dictionary keys take the place of the keys parameter
df = pd.concat({"Person": person, "Salary": fees, "Department": department}, axis=1)
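
Since the Series here share the same index, a dictionary of Series can also be handed straight to the DataFrame constructor. The sketch below compares both routes; it assumes every Series keeps the default 0, 1 index, and the names df_concat and df_ctor are illustrative.

```python
import pandas as pd

person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])
department = pd.Series(['Accountant', 'Manager'])

columns = {"Person": person, "Salary": fees, "Department": department}

# Route 1: concat() with a dictionary and axis=1
df_concat = pd.concat(columns, axis=1)

# Route 2: the DataFrame constructor aligns the Series on their index
df_ctor = pd.DataFrame(columns)

# equals() checks that both frames hold the same elements
print(df_concat.equals(df_ctor))
```

If the Series had different index labels, both routes would align on the labels and fill the gaps with NaN, so it is worth checking the indexes first.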

We can also create a DataFrame using the zip() function. It pairs up the items of several lists into tuples, and each tuple becomes one row.

person = ["Jack", "Linda"]
fees = [20000, 25000]
department = ['Accountant', 'Manager']

# Each tuple produced by zip() is one row: (person, fee, department)
tuples_list = list(zip(person, fees, department))
df = pd.DataFrame(tuples_list, columns=['Person', 'Salary', 'Department'])

print(df)

Output:
  Person  Salary  Department
0   Jack   20000  Accountant
1  Linda   25000     Manager
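
One thing to watch with zip() is that it stops at the shortest list, so a missing value silently drops a whole row. The sketch below illustrates this with lists of unequal length; the third name and the same_length check are made up for the illustration.

```python
import pandas as pd

person = ["Jack", "Linda", "Omar"]   # three names ("Omar" is a made-up extra)
fees = [20000, 25000]                # only two salaries

# zip() stops at the shortest input, so "Omar" is dropped without an error
rows = list(zip(person, fees))
df_short = pd.DataFrame(rows, columns=['Person', 'Salary'])

print(len(rows))
print(df_short)

# A simple guard: compare the lengths before zipping
same_length = len(person) == len(fees)
print(same_length)
```

In Python 3.10 and later, zip(person, fees, strict=True) raises a ValueError instead of truncating, which may be preferable when every list is expected to have the same length.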
