Exploring Data in Pandas Using GroupBy

Importing and Parsing

import pandas as pd
df = pd.read_csv('file_name.csv')

Checking the mean of all attributes (columns) in a data set
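
The snippet for this step is not shown, so here is a minimal sketch of one way to do it. The small DataFrame is made up for illustration and stands in for the CSV loaded above; its column names are taken from the later examples, and the values are assumptions. Passing numeric_only=True is my choice here so that the text column 'color' is skipped rather than raising an error.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv('file_name.csv'); values are invented.
df = pd.DataFrame({
    "petal_length": [1.4, 1.4, 4.7, 4.7],
    "petal_width": [0.2, 0.4, 1.4, 1.6],
    "color": ["red", "blue", "red", "red"],
})

# Mean of every numeric column; the text column 'color' is left out.
column_means = df.mean(numeric_only=True)
print(column_means)
```

The result is a Series with one entry per numeric column.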


Finding the mean of attributes when holding one attribute as an index

For example, here I want to find the mean of all the other attributes for each distinct value of the petal length attribute.
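
The code for this step is missing, so the following is only a sketch of how it could look, using the same made-up DataFrame in place of the real CSV. In recent pandas versions, mean() raises an error if a text column such as 'color' is still present, so I pass numeric_only=True here as an assumption about the data.

```python
import pandas as pd

# Hypothetical stand-in for the loaded CSV; values are invented.
df = pd.DataFrame({
    "petal_length": [1.4, 1.4, 4.7, 4.7],
    "petal_width": [0.2, 0.4, 1.4, 1.6],
    "color": ["red", "blue", "red", "red"],
})

# One row per distinct petal_length; petal_length becomes the index.
grouped = df.groupby("petal_length").mean(numeric_only=True)
print(grouped)
```

Each row of the result holds the means of the remaining numeric columns for one petal length.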


We can also group by multiple attributes, which together form the index:

df.groupby(['petal_length', 'color']).mean()

If we don't want the grouping attributes to become the index, we can pass as_index=False:

df.groupby(['petal_length', 'color'], as_index=False).mean()
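
To see what as_index changes, here is a small side-by-side sketch on made-up data (the values are assumptions, and the names keys, indexed, and flat are mine). With the default, the grouping attributes form a MultiIndex; with as_index=False they stay as ordinary columns and the rows get a plain integer index.

```python
import pandas as pd

# Hypothetical stand-in for the loaded CSV; values are invented.
df = pd.DataFrame({
    "petal_length": [1.4, 1.4, 4.7, 4.7],
    "petal_width": [0.2, 0.4, 1.4, 1.6],
    "color": ["red", "blue", "red", "red"],
})

keys = ["petal_length", "color"]

# Default: the grouping keys become a two-level index.
indexed = df.groupby(keys).mean()

# as_index=False: the grouping keys remain regular columns.
flat = df.groupby(keys, as_index=False).mean()

print(list(indexed.index.names), list(indexed.columns))
print(list(flat.columns))
```

The flat form is usually easier to work with if the result is going to be merged, plotted, or written back to a CSV.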

Finally, if we are interested in only one attribute (column), we can select it after grouping:

df.groupby(['petal_length', 'color'], as_index=False)['petal_width'].mean()
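
As a quick check of what this returns, here is a sketch on made-up data (values and variable names are assumptions). With as_index=False the result is a DataFrame containing the grouping columns plus the selected column; with the default, the same selection gives a Series indexed by the grouping attributes.

```python
import pandas as pd

# Hypothetical stand-in for the loaded CSV; values are invented.
df = pd.DataFrame({
    "petal_length": [1.4, 1.4, 4.7, 4.7],
    "petal_width": [0.2, 0.4, 1.4, 1.6],
    "color": ["red", "blue", "red", "red"],
})

keys = ["petal_length", "color"]

# DataFrame with columns petal_length, color, petal_width.
as_frame = df.groupby(keys, as_index=False)["petal_width"].mean()

# Series of petal_width means, indexed by (petal_length, color).
as_series = df.groupby(keys)["petal_width"].mean()

print(type(as_frame).__name__, list(as_frame.columns))
print(type(as_series).__name__, as_series.name)
```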

