Jeff Hale

Posted on Jul 19, 2019 • Edited on Mar 26, 2020

How to Remember Pandas Index Methods

#python #tutorial #machinelearning #pandas

When method names are similar, it's hard to keep them apart in your mind, and that makes them harder to remember.

Pandas has a slew of methods for creating and adjusting a DataFrame index.
This is a brief guide to help you create a little mental space between methods for easier memorization.

The Jupyter Notebook is on Kaggle here.

import pandas as pd
import numpy as np

Make a DataFrame without specifying an index (you get a default index).

df = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]))
df

	a	b
0	1	2
1	2	5
2	3	6
3	4	4

Make a DataFrame with an index by using the index keyword argument.

df2 = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]), index = [1,2,5,6])
df2

	a	b
1	1	2
2	2	5
5	3	6
6	4	4

Move a column to be the index with .set_index()

df3 = df2.set_index("a")
df3

	b
a
1	2
2	5
3	6
4	4
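As a minimal sketch, assuming the same data as df2 above: set_index() returns a new DataFrame and leaves the original alone, and the column's name becomes the name of the index (that is the "a" sitting on its own line in the output).

```python
import pandas as pd

df2 = pd.DataFrame(dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4]), index=[1, 2, 5, 6])

# set_index() returns a new DataFrame; df2 itself is not modified
df3 = df2.set_index("a")

print(list(df2.columns))  # the original still has both columns
print(list(df3.columns))  # "a" has moved out of the columns
print(df3.index.name)     # the index is now named after the column
```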

Rename the index values from scratch with .index

df3.index = [2,3,4,5]
df3

	b
2	2
3	5
4	6
5	4

Note that index is a property of the DataFrame, not a method, so you assign to it rather than call it.
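Here is a small sketch of what assigning to the property looks like in practice. The DataFrame is a stand-in built just for this example. One thing to watch: the new labels have to match the number of rows, otherwise pandas raises a ValueError.

```python
import pandas as pd

df = pd.DataFrame(dict(b=[2, 5, 6, 4]))

# .index is an attribute, so you assign to it instead of calling it
df.index = [2, 3, 4, 5]
labels = df.index.tolist()
print(labels)

# The new labels must match the number of rows (4 here)
raised = False
try:
    df.index = [0, 1]
except ValueError as err:
    raised = True
    print("ValueError:", err)
```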

Nuke the index values and start over from 0 with .reset_index()

df4 = df3.reset_index()
df4

	index	b
0	2	2
1	3	5
2	4	6
3	5	4

If you don't want the index to become a column, pass drop=True to reset_index().

df5 = df3.reset_index(drop=True)
df5

	b
0	2
1	5
2	6
3	4
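The new column above is called "index" because df3's index had no name. As a hedged sketch using the df2 data from earlier: when the index does have a name, reset_index() uses that name for the column, so set_index() followed by reset_index() puts the column back.

```python
import pandas as pd

df2 = pd.DataFrame(dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4]), index=[1, 2, 5, 6])

# After set_index("a") the index is named "a", so reset_index()
# restores it as a column called "a" rather than "index"
round_trip = df2.set_index("a").reset_index()

print(list(round_trip.columns))
print(list(round_trip.index))
```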

Reorder the rows with .reindex()

df6 = df5.reindex([2,3,1,0])
df6

	b
2	6
3	4
1	5
0	2

Passing a label that isn't in the index produces a row of NaN values. Because NaN is a float, the integer column is converted to float.

df7 = df5.reindex([2,3,1,0,6])
df7

	b
2	6.0
3	4.0
1	5.0
0	2.0
6	NaN
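If you would rather not get NaN for missing labels, reindex() accepts a fill_value argument. A quick sketch, assuming the same data as df5 and picking 0 as an arbitrary fill value:

```python
import pandas as pd

df5 = pd.DataFrame(dict(b=[2, 5, 6, 4]))

# Without fill_value, the missing label 6 gets NaN and the column becomes float
with_nan = df5.reindex([2, 3, 1, 0, 6])

# With fill_value, the missing row gets 0 and the column stays integer
filled = df5.reindex([2, 3, 1, 0, 6], fill_value=0)

print(with_nan["b"].dtype)
print(filled["b"].tolist())
```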

Advice

Ideally, set the index when you create your DataFrame, using the index keyword argument.

If you're reading from a .csv file, you can set the index column by passing its zero-based position (or its name) to index_col.

For example:

df = pd.read_csv(my_csv, index_col=3)

Or pass index_col=False to force pandas not to use the first column as the index.
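To try index_col without a file on disk, here is a small sketch that reads from an in-memory string. The CSV contents and column names are made up for the illustration.

```python
import io
import pandas as pd

# Hypothetical CSV contents, used only for this illustration
csv_text = "id,name,score\n10,ann,3\n20,bob,5\n"

# index_col accepts a zero-based column position...
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# ...or a column name
by_name = pd.read_csv(io.StringIO(csv_text), index_col="id")

print(by_position.index.tolist())
print(by_name.columns.tolist())
```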

How to set or change the index:

df.set_index() - move a column to the index
df.index - add an index manually
df.reset_index() - reset the index to 0, 1, 2 ...
df.reindex() - reorder the rows

Word associations to remember:

set_index() - move column
index - manual
reset_index() - reset
reindex() - reorder

Wrap

I hope this article helped you create a little mental space to keep Pandas index methods straight. If it did, please give it some love so other people can find it, too.

I write about Data Science, DevOps, Python and other stuff. Check out my other articles if any of that sounds interesting.

Follow me and connect:
Medium
Dev.to
Twitter
LinkedIn
Kaggle
GitHub

Happy indexing!
