DEV Community

Dang Hoang Nhu Nguyen

[BTY] Day 2: Fancy packages for working with DataFrames

The two packages I want to mention are pandas-profiling and Mito.

pandas-profiling

pandas-profiling generates profile reports from a pandas DataFrame. It goes well beyond the default df.describe(), which only gives a basic numeric summary. The interactive HTML report presents the following for each column:

  • Type inference: detect the types of columns in a dataframe.
  • Essentials: type, unique values, missing values
  • Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent values
  • Histogram
  • Correlations: highlighting of highly correlated variables, plus Spearman, Pearson and Kendall matrices
  • Missing values: matrix, count, heatmap and dendrogram of missing values
  • Text analysis: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
  • File and Image analysis: extract file sizes, creation dates, and dimensions, and scan for truncated images or those containing EXIF information
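
To make the comparison with df.describe() concrete, here is a minimal sketch. The sample data, the helper name make_report and the output filename are made up for illustration. It assumes the package is installed; newer releases are published under the name ydata-profiling, so the helper tries that import first.

```python
import pandas as pd

# Small made-up DataFrame, purely for illustration.
df = pd.DataFrame({
    "age": [23, 35, 31, None, 52],
    "city": ["Hanoi", "Hue", "Hanoi", "Da Nang", None],
})

# Baseline: by default describe() summarizes only the numeric columns
# (count, mean, std, min, quartiles, max).
summary = df.describe()
print(summary)


def make_report(frame, title="Profiling Report"):
    """Build a profile report for `frame` (hypothetical helper).

    The import is done lazily so the rest of the script runs even when
    the profiling package is not installed.
    """
    try:
        from ydata_profiling import ProfileReport   # newer package name
    except ImportError:
        from pandas_profiling import ProfileReport  # name used in this post
    return ProfileReport(frame, title=title)
```

Calling make_report(df).to_file("report.html") should write the interactive HTML report to disk. In a notebook, the report object also has a to_notebook_iframe() method to display it inline.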

Mito

Mito lets you explore, transform, and present your data with the ease of Excel, all without leaving Jupyter (see the video demo). It's easy to use, and it really does make "advanced data analysis accessible to all."
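
As a rough sketch of how a DataFrame is opened in Mito: the sample data and the helper name open_in_mito are made up for illustration. It assumes the mitosheet package is installed and that the code runs inside a Jupyter notebook, since the spreadsheet is rendered as a notebook widget.

```python
import pandas as pd

# Made-up sales data, purely for illustration.
df = pd.DataFrame({
    "product": ["pen", "notebook", "eraser"],
    "units": [120, 45, 80],
})


def open_in_mito(frame):
    """Open `frame` in a Mito spreadsheet (hypothetical helper).

    Assumes `mitosheet` is installed and this runs in Jupyter;
    the import is lazy so the script still loads without it.
    """
    import mitosheet
    mitosheet.sheet(frame)
```

Running open_in_mito(df) in a notebook cell should display the DataFrame as an editable spreadsheet.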

Check it out!
