DEV Community

Satwik Kansal

Plotting a group level comparison histogram in pandas

Let's say I have a DataFrame like the one below:

import pandas as pd

# Sample data: one row per vehicle
years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedes", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]

df = pd.DataFrame({'year': years,
                    'vehicle_type': vehicle_types,
                    'company': companies
                   })

df.head()

And I want to plot the distribution of vehicle types per year as a grouped bar chart: one cluster of bars per year, with one bar per vehicle type showing how many times it occurs.

It turns out this can be done in one line with pandas:

df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()

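It's amazing how a single statement takes care of:

  • Missing combinations (there are no bikes in 2014, for example), which show up as NaN after unstack() and are plotted as zero-height bars
  • Plotting the bars for each vehicle type side by side within each year
  • And aesthetics like axis labels, the legend, etc.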
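To see what the missing-value handling amounts to, here is a minimal sketch that rebuilds the relevant columns of the DataFrame above and inspects the table that gets plotted. The variable names `counts` and `filled` are mine, purely for illustration. Passing `fill_value=0` to `unstack()` is optional; I use it here only to show how to get explicit zero counts if you want to work with the table yourself.

```python
import pandas as pd

# Same data as in the post (the company column is not needed for the counts)
df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# The table that .plot.bar() receives: rows are years, columns are vehicle types
counts = df.groupby(['year'])['vehicle_type'].value_counts().unstack()
print(counts)  # the (2014, Bike) cell is NaN because no bike was seen in 2014

# Optional: replace the missing combination with an explicit zero count
filled = df.groupby(['year'])['vehicle_type'].value_counts().unstack(fill_value=0)
print(filled)
```

Either table can be passed to `.plot.bar()`; the difference only matters if you also want to read or export the counts.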

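The critical part here is the unstack() function and how well it fits the multi-index created by value_counts(). Grouping by year and calling value_counts() returns a Series indexed by (year, vehicle_type); unstack() then pivots the inner vehicle_type level into columns, giving one row per year and one column per vehicle type, which is exactly the shape plot.bar() needs to draw grouped bars.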
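If it helps, the one-liner can be split into its intermediate steps. This is only a sketch for inspection, with the hypothetical names `counts` and `table` introduced by me; the plotting call is left as a comment so the snippet runs without matplotlib installed.

```python
import pandas as pd

# Same data as in the post (the company column is not needed here)
df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Step 1: a Series of counts with a two-level index (year, vehicle_type)
counts = df.groupby(['year'])['vehicle_type'].value_counts()
print(counts.index.names)

# Step 2: move the inner level (vehicle_type) into the columns
table = counts.unstack()
print(table)

# Step 3: each row becomes a cluster of bars, each column a legend entry
# table.plot.bar()
```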
