Let's say I have a DataFrame like the one below:
import pandas as pd

years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedez", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]
df = pd.DataFrame({'year': years,
                   'vehicle_type': vehicle_types,
                   'company': companies
                   })
df.head()
And I want to plot the distribution of vehicle types per year as a grouped bar chart: one group of bars per year, one bar per vehicle type.
Turns out, this can easily be done in one line with pandas:
df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()
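It's amazing how a single statement takes care of:
- Missing counts (a vehicle type that never appears in a given year, such as Bike in 2014, becomes NaN after unstack, and no bar is drawn for it)
- Plotting the bars for each vehicle type side by side within each year
- And aesthetics like labels, legends, etc.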
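To make the first point concrete, here is a minimal sketch that rebuilds the sample data and inspects the table that plot.bar() receives. The variable names (counts, counts_filled, missing_2014_bike) are only illustrative, and the fill_value=0 variant is an optional variation, not part of the one-liner above, for cases where explicit zeros are preferred over NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# The table the one-liner hands to plot.bar():
# one row per year, one column per vehicle type.
counts = df.groupby(['year'])['vehicle_type'].value_counts().unstack()

# Bike never appears in 2014, so that cell is NaN rather than 0.
missing_2014_bike = pd.isna(counts.loc[2014, 'Bike'])

# Optional variation (an assumption, not in the original one-liner):
# fill the gaps with explicit zeros while unstacking.
counts_filled = (
    df.groupby(['year'])['vehicle_type']
      .value_counts()
      .unstack(fill_value=0)
)

print(counts)
print(counts_filled)
```

Calling .plot.bar() on either table should give the same grouped layout; the only difference is whether an absent combination is stored as NaN or as 0.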
The critical part here was the unstack function and how well it fits with the multi-index created by value_counts().
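A short sketch of that interplay, using the same sample data: value_counts() on the grouped column returns a Series indexed by (year, vehicle_type), and unstack() moves the inner level into columns, which is the shape plot.bar() turns into grouped bars. The names stacked and wide are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Long form: a Series whose index is a MultiIndex of (year, vehicle_type).
stacked = df.groupby(['year'])['vehicle_type'].value_counts()
print(list(stacked.index.names))

# Wide form: unstack() pivots the innermost level (vehicle_type) into columns,
# leaving year as the row index.
wide = stacked.unstack()
print(wide.index.name, wide.columns.name)
```

In the wide form, each row becomes one group on the x-axis and each column becomes one legend entry.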