DEV Community

Muhammad

Originally published at muhammadraza.me

My Favorite Pandas Tricks

In this post I'll share my favorite pandas tricks, the ones I reach for when doing data analysis.

  • Finding unique values
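import pandas as pd
# Load the movies dataset straight from the gist URL
data = pd.read_csv('https://gist.githubusercontent.com/tiangechen/b68782efa49a16edaf07dc2cdaa855ea/raw/0c794a9717f18b094eabab2cd6a6b9a226903577/movies.csv')
# Distinct values of the Film column
data.Film.unique()

This returns only the unique values in the Film column of the CSV (in an interactive session they are printed right away).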
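If you want more than the list of distinct values, two related methods are handy. Here is a minimal sketch on a small made-up frame (the rows below are hypothetical stand-ins, not the real movies CSV):

```python
import pandas as pd

# Hypothetical sample rows standing in for the movies CSV.
films = pd.DataFrame({
    "Film": ["Film A", "Film B", "Film A", "Film C"],
    "Genre": ["Comedy", "Drama", "Comedy", "Comedy"],
})

# Distinct values, in order of first appearance.
unique_films = films["Film"].unique()

# Number of distinct values.
n_unique = films["Film"].nunique()

# How many times each value occurs.
genre_counts = films["Genre"].value_counts()

print(unique_films)
print(n_unique)
print(genre_counts)
```

unique() answers "which values exist", nunique() answers "how many", and value_counts() answers "how often".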

  • Filtering Data

Let's say you are only looking for comedies in the dataset that had an audience score above 50. Filtering with a boolean condition is really useful in this case.

new_data = data[(data['Audience score %'] > 50) & (data['Genre'] == 'Comedy')]
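Because the column name Audience score % contains spaces, it cannot be reached with dot access and needs bracket indexing. A self-contained sketch of the same filter on made-up rows (the sample data and the name audience_score are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample rows; only the column names mirror the dataset.
movies = pd.DataFrame({
    "Film": ["Film A", "Film B", "Film C", "Film D"],
    "Genre": ["Comedy", "Drama", "Comedy", "Comedy"],
    "Audience score %": [70, 81, 45, 52],
})

# Bracket access works for any column name, including ones with spaces.
mask = (movies["Audience score %"] > 50) & (movies["Genre"] == "Comedy")
popular_comedies = movies[mask]

# Optional: rename the awkward column so dot access works as well.
renamed = movies.rename(columns={"Audience score %": "audience_score"})
same_result = renamed[(renamed.audience_score > 50) & (renamed.Genre == "Comedy")]

print(popular_comedies)
```

Each condition needs its own parentheses, since & binds more tightly than the comparison operators.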

  • Saving to CSV

Pandas has a function that lets you save data to a CSV file. For instance, to save all the unique movie names, we first have to convert them to a DataFrame, because unique() returns a plain array rather than a DataFrame.

uniq = data.Film.unique()
out = pd.DataFrame(uniq)
out.to_csv('uniq.csv')

This will create a CSV file, uniq.csv, containing the unique names (along with the default integer index, unless you pass index=False).
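If you would rather have a named column and no index column in the file, one possible variant looks like this. It is only a sketch: the sample rows are made up, and the file is written to a temporary directory so nothing in your working directory is touched.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for the movies CSV.
films = pd.DataFrame({"Film": ["Film A", "Film B", "Film A"]})

# Wrap the unique values in a DataFrame with an explicit column name.
unique_films = pd.DataFrame({"Film": films["Film"].unique()})

# Write without the integer index, then read the file back to check it.
path = os.path.join(tempfile.mkdtemp(), "uniq.csv")
unique_films.to_csv(path, index=False)
round_trip = pd.read_csv(path)

print(round_trip)
```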

  • Groupby

This lets us split the data into groups. For instance, if we want to count the movies in each genre, we can use groupby.

data.groupby('Genre').Film.agg(['count'])

This will output the total number of movies for each genre. You can also pass other aggregation functions such as sum, mean and median, which are most useful on numeric columns.
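To show several aggregations at once, here is a small sketch on a numeric column. The rows are made up; only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical sample rows; only the column names mirror the dataset.
movies = pd.DataFrame({
    "Film": ["Film A", "Film B", "Film C", "Film D"],
    "Genre": ["Comedy", "Drama", "Comedy", "Comedy"],
    "Audience score %": [70, 81, 40, 52],
})

# One row per genre, one column per aggregation.
summary = movies.groupby("Genre")["Audience score %"].agg(["count", "mean", "median"])

print(summary)
```

Passing a list to agg gives one output column per function, which makes it easy to compare genres side by side.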

  • String Operations

You can also use string operations when working with text data.

Lowercase a specific column:

data['Genre'] = data['Genre'].str.lower()

This will lowercase the Genre column in the data. You can also use str.upper() for uppercase, and you can apply your own regex with str.replace().

Anyway, these were my favorite things about pandas, and I hope you enjoyed reading about them. Let me know in the comments what your favorite thing about pandas is.
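These string methods chain nicely when cleaning up a messy text column. A minimal sketch, assuming some hypothetical genre labels with inconsistent case, stray whitespace and a typo:

```python
import pandas as pd

# Hypothetical messy labels, made up for illustration.
genres = pd.Series(["Comedy", "comedy ", "Comdy", "DRAMA"])

# Trim whitespace and normalize case.
cleaned = genres.str.strip().str.lower()

# Fix a misspelling with a regex; regex=True must be passed explicitly.
cleaned = cleaned.str.replace(r"^comdy$", "comedy", regex=True)

# Uppercase works the same way as lowercase.
shouted = genres.str.strip().str.upper()

print(cleaned.tolist())
```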

Discussion (4)

Mpho Mphego

Nice post.
I created a pandas utility package that's available on PyPI.
Contributions are very welcome: github.com/mmphego/pandas_utility

Check it out and contribute where you can.

Happy coding.

Muhammad Author

Nice !!

Matt Ellen

I haven't used pandas, so I apologise if I'm jumping the gun, but your filtering example doesn't look like valid python.

Does pandas do something to allow python syntax to change?

Muhammad Author

No, it's not changing the syntax. Audience score % is actually the column name, which is really weird, but that is what the dataset uses. In this case you might need to change the column name, since it has a lot of spaces.