Hello, today I will talk about the pandas library. A pandas DataFrame can be initialized like this:
pandas.DataFrame(data, index, columns)
# data -> the values to store; it can be a list, a dict, or another Python object
# index -> optional row labels; by default they start from 0
# columns -> optional parameter for specifying the column names
import pandas as pd
df = pd.DataFrame(["Paris", "Berlin", "Roma", "Ankara"])
print(df)
Output:
        0
0   Paris
1  Berlin
2    Roma
3  Ankara
As you can see in the code above, the index and the column title are generated by default. Of course, we can specify these values ourselves. Let's learn how to create a pandas DataFrame from a dictionary with user-defined column names and index labels.
import pandas as pd
# Named "data" rather than "dict" so the built-in dict type is not shadowed
data = {'capitals': ["Paris", "Berlin", "Roma", "Ankara"],
        'population': [15, 12, 18, 9]}
row_label = ["a", "b", "c", "d"]
df = pd.DataFrame(data, index=row_label)
print(df)
Output:
   capitals  population
a     Paris          15
b    Berlin          12
c      Roma          18
d    Ankara           9
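The index and columns parameters are not limited to dictionaries. As a minimal sketch, here is the plain list from the first example again, this time with labels passed explicitly. The label values are illustrative choices, and the df.loc lookup at the end is only there to show that the labels took effect.

```python
import pandas as pd

# Same capital names as the first example, passed as a plain list.
capitals = ["Paris", "Berlin", "Roma", "Ankara"]

# index supplies the row labels, columns supplies the single column name.
# Both label lists are illustrative choices, not required values.
df = pd.DataFrame(capitals, index=["a", "b", "c", "d"], columns=["capitals"])

print(df)
print(df.loc["b", "capitals"])  # look up one value by row and column label
```

Note that the lengths have to match: four values need four index labels, and one column of data needs exactly one column name.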
We can also create a DataFrame from Series objects. The concat() function lets us merge different Series into a new DataFrame object. This process is highly customizable through its parameters.
import pandas as pd
person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])
department = pd.Series(['Accountant', 'Manager'])
df = pd.concat([person, fees, department], axis=1,
               keys=['Person', 'Salary', 'Department'])
print(df)
Output:
   Person  Salary  Department
0    Jack   20000  Accountant
1   Linda   25000     Manager
Here axis=1 tells concat() to place the Series side by side as columns instead of stacking them as rows.
The keys parameter supplies the column labels.
Of course, we could get the same result by passing a dictionary, whose keys become the column labels:
person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])
department = pd.Series(['Accountant', 'Manager'])
df = pd.concat({"Person": person, "Salary": fees, "Department": department}, axis=1)
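To see what the axis argument changes, here is a small sketch that runs concat() on two of the Series in both directions. The variable names side_by_side and stacked are just illustrative.

```python
import pandas as pd

person = pd.Series(["Jack", "Linda"])
fees = pd.Series([20000, 25000])

# axis=1: each Series becomes a column, rows are aligned on the shared index.
side_by_side = pd.concat([person, fees], axis=1, keys=["Person", "Salary"])
print(side_by_side.shape)  # (2, 2)

# axis=0 (the default): the Series are stacked into one longer Series,
# and keys becomes an outer index level instead of column labels.
stacked = pd.concat([person, fees], axis=0, keys=["Person", "Salary"])
print(stacked.shape)  # (4,)
```

With axis=0 the result is still a Series, not a DataFrame, which is why axis=1 is the one to use when each Series should be its own column.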
We can also create a DataFrame using the zip() function, which combines several lists into a list of row tuples:
person = ["Jack", "Linda"]
fees = [20000, 25000]
department = ['Accountant', 'Manager']
# zip() pairs up the i-th items of each list, so every tuple is one row
tuples_list = list(zip(person, fees, department))
df = pd.DataFrame(tuples_list, columns=['Person', 'Salary', 'Department'])
print(df)
Output:
   Person  Salary  Department
0    Jack   20000  Accountant
1   Linda   25000     Manager
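One thing to watch with zip() is that it stops at the shortest input, so lists of different lengths lose rows silently. The sketch below shows this with a made-up third name that has no matching salary.

```python
import pandas as pd

person = ["Jack", "Linda", "Maria"]  # three names (the third is made up for illustration)
fees = [20000, 25000]                # only two salaries

# zip() yields only as many tuples as the shortest list has items.
rows = list(zip(person, fees))
df = pd.DataFrame(rows, columns=["Person", "Salary"])

print(len(df))  # 2
```

Checking that all lists have the same length before zipping them is a simple way to avoid dropping data without noticing.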