Plotting a group-level comparison histogram in pandas

Let's say I have a DataFrame like the one below:

import pandas as pd

years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedez",  "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]

df = pd.DataFrame({'year': years,
                    'vehicle_type': vehicle_types,
                    'company': companies
                   })

df.head()

And I want to plot the distribution of vehicle types per year as a grouped bar chart, with one bar per vehicle type within each year.

Turns out, this can easily be done in one line with pandas:

df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()

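It's amazing how a single statement takes care of:

Missing counts (a vehicle type that never appears in a given year, such as Bike in 2014, becomes NaN and simply gets no bar)
Plotting the histogram bars side by side
Aesthetics like labels, legends, etc.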
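
On the first point, here is a minimal sketch of what happens to a combination that is absent from the data. It reuses the DataFrame from above; the names `wide` and `wide_zero` are only illustrative, and passing `fill_value=0` to `unstack` is an optional variation, not something the one-liner needs.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
    'company': ["Mercedez", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"],
})

# Same chain as the one-liner, stopping before the plot.
wide = df.groupby(['year'])['vehicle_type'].value_counts().unstack()

# There is no Bike row for 2014, so that cell is NaN after unstack.
bike_2014_missing = pd.isna(wide.loc[2014, 'Bike'])

# Optional variation: ask unstack to write an explicit 0 instead of NaN.
wide_zero = (
    df.groupby(['year'])['vehicle_type']
      .value_counts()
      .unstack(fill_value=0)
)
print(wide_zero)
```

Calling `.plot.bar()` on either table should give the same picture, since a NaN and a 0 both show up as an empty slot; the zero-filled version is mostly handy if you also want to print or export the counts.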

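The critical part here is the unstack method and how well it fits the MultiIndex created by value_counts(): it pivots the inner level (vehicle_type) into columns, so each vehicle type becomes its own set of bars, grouped by year.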
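
To see why that works, it can help to split the chain into its intermediate steps. This is just a sketch with the same data; the variable names `counts` and `table` are mine.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Step 1: count vehicle types within each year.
# The result is a Series indexed by (year, vehicle_type), i.e. a MultiIndex.
counts = df.groupby(['year'])['vehicle_type'].value_counts()
print(counts.index.names)

# Step 2: move the inner index level (vehicle_type) into columns.
# Rows are now years and columns are vehicle types.
table = counts.unstack()
print(table)

# Step 3 (needs matplotlib): one group of bars per row, one bar per column.
# table.plot.bar()
```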
