DEV Community

Mac McCarthy
Mac McCarthy

Posted on • Edited on

Pandas and Beer

For the first time since its inception in 1982, the Great American Beer Festival has been canceled, in turn, replaced by an opportunity for a lonely pub crawl. Digital passports were handed out, and "attendees" got their first looks at the available breweries in their area:

Alt Text

Now, that view is all well and good. Provides the needed information, along with directions or a website a user could peruse to find beers that can quench their unshakeable thirst during this pandemic.

What is it missing? Python. And Pandas.

After scraping (may write a future post) the data into a CSV, I had a collection separated by three columns: Brewery name, Location (city/state), and Deal.

As always, the first step is to import all of that and turn it into a DataFrame:

import pandas as pd
df = pd.read_csv("breweries.csv")

pd.set_option("max_colwidth", None)
Enter fullscreen mode Exit fullscreen mode

The set_option was an important addition. Otherwise, crucial information would be cut off in the limited column width.

1) Find all breweries in a city

This required a simple column lookup via location:

denver_breweries = df.loc[df["Location"] == "Denver, Colorado"]

This got me a better view of the possibilities around me:

Alt Text

2) Find breweries according to deals

Now, what if I wasn't alone on my boozy adventures? What if I actually had friends that wanted to see me? This is when a Buy One - Get One Free deal would come in handy. I could search by substrings within the Deal column.

def find_bogo_deals(brew_list):    
    new_df = brew_list[brew_list['Deal'].str.contains("BOGO") | brew_list['Deal'].str.contains("buy one, get one", case=False)]
    df = pd.DataFrame(new_df)
    return df

find_bogo_deals(denver_breweries)
Enter fullscreen mode Exit fullscreen mode

I declared the variable denver_breweries at the beginning of the notebook

Now, we are cooking with fire!

Alt Text

But what if we wanted the good stuff? Those special releases that have been brewed and barrel-aged since last year, in anticipation for the fall of 2020. Then we would just change up our substring filter using regex.

import re

def find_specials(brew_list):

    special_list = ["special", "exclusive", "special release"]

    esc_lst = [re.escape(s) for s in special_list]   

    spec_pattern = '|'.join(esc_lst)

    new_df = brew_list[brew_list['Deal'].str.contains(spec_pattern, case=False)]    

    df = pd.DataFrame(new_df)

    return df


find_specials(denver_breweries)
Enter fullscreen mode Exit fullscreen mode

Alt Text

Excellent! This list should ensure that I won't be driving.

Obviously, we could do a lot more manipulation here: splitting up location to be City - State; cleaning up the text in the Deal column to remove newlines; even exporting that CSV into a SQL database to play around some more.

In any case, I hope this post inspired you to break out your notebooks and apply Python & Pandas to your everyday life. GABF deals end October 17th (this Saturday), so don your mask and get out there to support your local breweries in these trying times.

API Trace View

How I Cut 22.3 Seconds Off an API Call with Sentry

Struggling with slow API calls? Dan Mindru walks through how he used Sentry's new Trace View feature to shave off 22.3 seconds from an API call.

Get a practical walkthrough of how to identify bottlenecks, split tasks into multiple parallel tasks, identify slow AI model calls, and more.

Read more →

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more