
GharamElhendy

Using GroupBy to Investigate Data from a Certain Scope According to One or More Specific Attributes

In this scenario, we have a dataframe made up of multiple attributes, and we want to find the means of some of those attributes within groups defined by one or two main attributes.

For example, we might want to find the mean height in a population that consists of males and females in different age groups.

Bear in mind that my dataframe is called population_df and has attributes such as weight, height, and BMI, plus age and gender, which we will use to split the data during analysis.


Importing and Parsing

import pandas as pd
population_df = pd.read_csv('investigation_data.csv')

To view means relative to the age of the person:

# numeric_only=True skips text columns such as gender, which have no mean
population_df.groupby('age').mean(numeric_only=True)

This returns the mean of every numeric attribute for all samples that share the same age. Each distinct age becomes a row label in the result's index, which is displayed as the first column of the output.
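As a minimal sketch of what that looks like, the snippet below builds a small made-up dataframe in place of investigation_data.csv (the values and the exact column set are assumptions for illustration, not the real data) and groups it by age:

```python
import pandas as pd

# Hypothetical sample data standing in for investigation_data.csv
population_df = pd.DataFrame({
    "age": [25, 25, 30, 30],
    "gender": ["F", "M", "F", "M"],
    "height": [160.0, 180.0, 165.0, 175.0],
    "weight": [55.0, 80.0, 60.0, 78.0],
})

# One row per distinct age; numeric_only=True leaves out the text column 'gender'
means_by_age = population_df.groupby("age").mean(numeric_only=True)
print(means_by_age)
```

Here the ages 25 and 30 end up in the index, and each remaining numeric column holds the mean for that age.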

To view means relative to age and then gender, pass a list of column names to groupby:

population_df.groupby(['age', 'gender']).mean()

This shows the mean of each remaining attribute for every age, broken down by gender within each age.
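To answer the mean height question from the introduction, one option is to select just the height column after grouping. The sketch below again uses made-up values as a stand-in for the real file, so the numbers are purely illustrative:

```python
import pandas as pd

# Hypothetical sample data standing in for investigation_data.csv
population_df = pd.DataFrame({
    "age": [25, 25, 25, 30, 30, 30],
    "gender": ["F", "F", "M", "F", "M", "M"],
    "height": [160.0, 164.0, 180.0, 165.0, 175.0, 177.0],
})

# Mean height for each (age, gender) pair; the result is a Series
# indexed by a two-level (age, gender) MultiIndex
mean_height = population_df.groupby(["age", "gender"])["height"].mean()
print(mean_height)

# reset_index() turns the age and gender index levels back into ordinary columns
mean_height_table = mean_height.reset_index()
print(mean_height_table)
```

Selecting the column before calling mean() keeps the output focused on one attribute, and reset_index() is handy if a flat table is easier to read or export.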
