DEV Community

Mostafa Gazar

Some handy Pandas snippets for Data Scientists

Iterate through rows

import pandas as pd

for index, row in df.iterrows():
    pass
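
As a minimal sketch of what the loop body might do, assuming a small made-up DataFrame with hypothetical `price` and `quantity` columns: each `row` is a Series that can be indexed by column name. For larger frames, `itertuples()` is usually the faster option.

```python
import pandas as pd

# Hypothetical data, made up for illustration only
df = pd.DataFrame({"price": [2.0, 3.5, 1.0], "quantity": [3, 2, 10]})

totals = []
for index, row in df.iterrows():
    # row is a pandas Series; values are accessed by column name
    totals.append(row["price"] * row["quantity"])

# itertuples() yields namedtuples and avoids building a Series per row
totals_fast = [t.price * t.quantity for t in df.itertuples(index=False)]
```

Both lists hold the same per-row products; the second form tends to matter only once the frame has many rows.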

Count occurrences of each unique value in a column

df.labels.value_counts()
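
A short sketch of how this might look on real data, assuming a DataFrame with a `labels` column (the values here are invented for illustration). It also shows two related calls, `normalize=True` and `nunique()`, which are not part of the original snippet.

```python
import pandas as pd

# Hypothetical DataFrame with a `labels` column
df = pd.DataFrame({"labels": ["cat", "dog", "cat", "bird", "cat"]})

# Occurrences of each value, most frequent first
counts = df["labels"].value_counts()

# Relative frequencies instead of raw counts
shares = df["labels"].value_counts(normalize=True)

# Number of distinct values in the column
n_unique = df["labels"].nunique()
```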

Style max value in a row or column

import numpy as np  # needed for np.where below

# Inspired by https://stackoverflow.com/a/45606572/2874139
def highlight_max(data, color='yellow', isBold=True):
    # Styling
    attrs = []
    if color is not None:
        attrs.append(f'background-color: {color}')
    if isBold:
        attrs.append('font-weight: bold')
    attrs = '; '.join(attrs)

    if data.ndim == 1:
        is_max = data == data.max()
        return [attrs if value else '' for value in is_max]
    else:
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attrs, ''), index=data.index, columns=data.columns)

df.style.apply(highlight_max, axis=1) # Max in each row
df.style.apply(highlight_max, axis=0) # Max in each column
df.style.apply(highlight_max, axis=None) # Max in the whole table (uses the 2-D branch)

Display 1000 rows and columns

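# display() is available by default in Jupyter; the explicit import makes it work elsewhere too
from IPython.display import display

# source: fast.ai material
def display_all(df):
    # The options apply only inside the with block and are restored afterwards
    with pd.option_context("display.max_rows", 1000, "display.max_columns", 1000):
        display(df)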

display_all(df)
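
To illustrate why `option_context` is used here rather than `pd.set_option`, the following sketch (variable names are made up) checks that the display limit is raised only inside the `with` block and reverts afterwards.

```python
import pandas as pd

# Remember whatever limit is currently configured
default_rows = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", 1000, "display.max_columns", 1000):
    # Inside the block the temporary limits are active
    inside_rows = pd.get_option("display.max_rows")

# After the block the previous setting is restored
after_rows = pd.get_option("display.max_rows")
```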

Save dataframe as CSV file

# index=False leaves the DataFrame's index (row labels) out of the saved file
df.to_csv(csv_path, index=False)
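
A small sketch of the effect of `index`, assuming a made-up two-column DataFrame. It relies on `to_csv` returning the CSV text as a string when no path is given, so no file is written.

```python
import io
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"question": ["q1", "q2"], "score": [0.5, 0.9]})

# With the default index=True, the row labels become an unnamed first column
with_index = df.to_csv()

# With index=False, only the data columns are written
without_index = df.to_csv(index=False)

# Reading the index-free text back yields the original columns
restored = pd.read_csv(io.StringIO(without_index))
```

Leaving the index in and then reading the file back is a common source of a stray `Unnamed: 0` column, which `index=False` avoids.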

Create dataframe from Python dictionary

all_questions = []  # values for the 'question' column
all_good_answers = []  # values for the 'good_answer' column
all_bad_answers = []  # values for the 'bad_answer' column

qa_dict = {'question': all_questions, 'good_answer': all_good_answers, 'bad_answer': all_bad_answers}

# Create a dataframe with 3 columns: question, good_answer and bad_answer
df = pd.DataFrame(data=qa_dict)
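
The same pattern with the lists filled in might look like the sketch below; the questions and answers are invented placeholders. Each dictionary key becomes a column, and all lists need the same length, otherwise pandas raises a `ValueError`.

```python
import pandas as pd

# Placeholder content, made up for illustration
all_questions = ["What is pandas?", "What is a DataFrame?"]
all_good_answers = ["A data analysis library", "A labeled 2-D table"]
all_bad_answers = ["A bear", "A picture frame"]

qa_dict = {
    "question": all_questions,
    "good_answer": all_good_answers,
    "bad_answer": all_bad_answers,
}

# One row per list position, one column per dictionary key
df = pd.DataFrame(data=qa_dict)
```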

Parse dates in dataframe

df = pd.read_csv("train.csv", low_memory=False, parse_dates=["createddate"])
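
As a rough sketch of what `parse_dates` buys you, assuming a CSV with a `createddate` column (the in-memory CSV text below is invented so the example runs without a file): the column is read as datetimes, so the `.dt` accessor becomes available.

```python
import io
import pandas as pd

# Stand-in for train.csv, made up for illustration
csv_text = "id,createddate\n1,2019-01-15\n2,2019-03-02\n"

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["createddate"])

# The parsed column supports datetime accessors
years = df["createddate"].dt.year.tolist()
months = df["createddate"].dt.month.tolist()
```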

I am working on a project called ML Studio. Want early access and product updates? Subscribe here or follow me on Twitter.
