Adefemi Adeyanju

PANDAS VS POLARS: THE PYTHON DATA MANIPULATION WAR.

INTRODUCTION
For those accustomed to using Python for data manipulation, Pandas is a household name. It can be used to manipulate a
particular set of data until it is clean and ready for use.

A newer data manipulation library, Polars, has been introduced, and it might just be a saving grace for Python users. Read on and you might find yourself switching from Pandas to Polars.

WHY SWITCH TO POLARS?
Pandas is an essential library in the field of Data Science, used primarily for data manipulation. Although Pandas is a great library, it comes with a notable drawback: it can be very slow when processing large datasets. Polars was designed to process data much faster, which makes it an alternative to Pandas.

Let's take a look at some of the similarities and differences between the Pandas and Polars code.

  • Importing the Library

Pandas

import pandas as pd

Polars

import polars as pl
  • Reading CSV file

Pandas

df = pd.read_csv(file)

Polars

df = pl.read_csv(file)
  • Memory Usage

Pandas

df.memory_usage()

Polars

df.estimate_size()
  • Delete Column

Pandas

df.drop(columns=["columns"])

Polars

df.drop(["columns"])
  • Sort

Pandas

df.sort_values("column")

Polars

df.sort("column")
  • Unique values

Pandas

df.column.unique()

Polars

df["column"].unique()
  • Lazy Execution

Pandas

Not Supported

Polars

df.lazy()
  • Filter Data

Pandas

df[df["column"] > 10]

Polars

df.filter(pl.col("column") > 10)

CONCLUSION
Both are great libraries to use, but Polars might just have the advantage when it comes to speed. Even so, most Pandas users might be a little reluctant to switch: they are well accustomed to Pandas, and moving to Polars means adjusting to some of the code differences.
