Applying a function to rows of a Pandas DataFrame is one of the most common operations during data wrangling. There are many ways of doing it.
I plotted the performance of various ways of applying a function to each row of a Pandas DataFrame, for up to a million rows.
I was surprised to see itertuples() beating apply(), and a humble list comprehension beating them both.
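For readers who have not seen these options side by side, here is a minimal sketch of what the three approaches look like, with a vectorized version for reference. The function `row_func`, the column names `a` and `b`, and the random data are all made up for illustration; they are not the function or data behind the plot.

```python
import numpy as np
import pandas as pd


def row_func(a, b):
    # Hypothetical per-row function, used only for illustration.
    return a * 2 + b


# Small made-up DataFrame with two integer columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.integers(0, 100, size=1_000),
    "b": rng.integers(0, 100, size=1_000),
})

# 1. apply() with axis=1: builds a Series object for every row.
via_apply = df.apply(lambda row: row_func(row["a"], row["b"]), axis=1)

# 2. itertuples(): yields a lightweight namedtuple per row.
via_itertuples = [row_func(row.a, row.b) for row in df.itertuples(index=False)]

# 3. List comprehension over the zipped columns.
via_listcomp = [row_func(a, b) for a, b in zip(df["a"], df["b"])]

# Reference: the vectorized form, possible here because row_func is simple arithmetic.
via_vectorized = df["a"] * 2 + df["b"]
```

All four produce the same values; they differ in how much per-row overhead they pay, which is what the comparison above is measuring.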
Until now, I had been using apply() whenever I found vectorization difficult. Somehow I had assumed it was the second-best option.
I have been using "%timeit" often. In this exercise, I also learned how to do line-level profiling in Python and how to plot performance against input size.
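As a rough sketch of how timings over input size can be collected, the snippet below uses the standard-library `timeit` module rather than the "%timeit" magic, so it also runs outside a notebook. The function `row_func`, the helper `make_df`, the column names, and the sizes are assumptions for illustration; this is not the exact benchmark behind the plot, and the sizes are kept small so it finishes quickly.

```python
import timeit

import numpy as np
import pandas as pd


def row_func(a, b):
    # Hypothetical per-row function, used only for illustration.
    return a * 2 + b


def make_df(n):
    # Hypothetical helper: a DataFrame with n rows of random integers.
    rng = np.random.default_rng(0)
    return pd.DataFrame({
        "a": rng.integers(0, 100, size=n),
        "b": rng.integers(0, 100, size=n),
    })


# The three row-wise approaches being compared.
methods = {
    "apply": lambda df: df.apply(lambda r: row_func(r["a"], r["b"]), axis=1),
    "itertuples": lambda df: [row_func(r.a, r.b) for r in df.itertuples(index=False)],
    "list comprehension": lambda df: [row_func(a, b) for a, b in zip(df["a"], df["b"])],
}

sizes = [100, 1_000, 10_000]  # increase toward 1_000_000 for a fuller picture
timings = {name: [] for name in methods}

for n in sizes:
    df = make_df(n)
    for name, fn in methods.items():
        # Best of three single runs, in seconds.
        best = min(timeit.repeat(lambda: fn(df), number=1, repeat=3))
        timings[name].append(best)

for name, times in timings.items():
    print(name, [f"{t:.4f}s" for t in times])
```

The `sizes` list and each entry in `timings` can then be handed to any plotting library, ideally with logarithmic axes, to see how each approach scales.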