Satwik Kansal
Plotting a group level comparison histogram in pandas

Let's say I have a DataFrame like the one below:

import pandas as pd

years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedez",  "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]

df = pd.DataFrame({'year': years,
                    'vehicle_type': vehicle_types,
                    'company': companies
                   })

df.head()

And I want to plot the distribution of vehicle types per year as a grouped bar chart: one cluster of bars per year, with one bar per vehicle type.

It turns out this can be done in one line with pandas:

df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()

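It's amazing how a single statement takes care of:

  • Null counts: a vehicle type that never appears in a given year becomes NaN after unstack() and is drawn with no bar
  • Plotting the bars side by side, clustered by year
  • Aesthetics like axis labels and the legend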
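
To make the first point concrete, here is a small sketch on the same data (the company column is left out because it isn't needed here). The `fill_value=0` variant is my own addition, not part of the one-liner above; it is just one way to get explicit zeros instead of NaN if you want them.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Same chain as the one-liner, minus the plotting call.
counts = df.groupby(['year'])['vehicle_type'].value_counts().unstack()

# There is no Bike row for 2014, so that cell is NaN after unstack().
print(counts)

# Optional variant (an assumption about what you might want, not the
# original approach): fill missing combinations with 0.
filled = df.groupby(['year'])['vehicle_type'].value_counts().unstack(fill_value=0)
print(filled)
```

Either table can be passed to `.plot.bar()`; the chart should look the same, since a NaN cell and a zero cell both end up with no visible bar.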

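The critical part here is the unstack() method and how well it fits with the MultiIndex created by value_counts(). The grouped value_counts() returns a Series indexed by (year, vehicle_type), and unstack() pivots the inner vehicle_type level into columns, so plot.bar() gets one row per year and one column per vehicle type.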
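
If it helps, the chain can be split into its two steps to see what each one produces. This is only an illustrative breakdown of the same statement; the variable names `counts` and `wide` are mine.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Step 1: count vehicle types within each year.
# The result is a Series with a two-level index: (year, vehicle_type).
counts = df.groupby(['year'])['vehicle_type'].value_counts()
print(counts.index.nlevels)         # 2
print(counts.loc[(2014, 'Truck')])  # 2

# Step 2: pivot the inner index level (vehicle_type) into columns.
# The result is a DataFrame with one row per year, one column per vehicle type.
wide = counts.unstack()
print(wide.shape)                   # (2, 3)
```

`plot.bar()` draws one cluster per row and one bar per column, which is why the wide layout from step 2 maps directly onto the grouped chart.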
