Awesome Data Science with Python


I have created a list of useful Python packages for data science.

r0f1 / datascience

Curated list of Python packages and tutorials for data science.

Data Science Awesome


pandas | Data structures built on top of numpy.
scikit-learn | Core ML library.
matplotlib | Plotting library.
seaborn | Python data visualization library based on matplotlib.
pandas_summary | Basic statistics using DataFrameSummary(df).summary().
pandas_profiling | Descriptive statistics using ProfileReport.
sklearn_pandas | Helpful DataFrameMapper class.
janitor | Clean messy column names.
missingno | Missing data visualization.
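To give a feel for what the entries above are for, here is a minimal sketch that uses only pandas and numpy. It does not call pandas_summary, pandas_profiling, or missingno; it only shows the plain-pandas basics those packages build on (descriptive statistics and missing-value counts). The toy data and column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; values and column names are made up for this sketch.
df = pd.DataFrame(
    {
        "age": [23.0, 35.0, np.nan, 41.0],
        "city": ["Vienna", None, "Graz", "Vienna"],
        "income": [2100.0, 3400.0, 2900.0, np.nan],
    }
)

# pandas data structures sit on top of numpy: a column converts to an ndarray.
age_values = df["age"].to_numpy()

# Descriptive statistics for the numeric columns (count, mean, std, ...).
stats = df.describe()

# Number of missing values per column, the raw information that
# missing-data visualizations are drawn from.
missing = df.isna().sum()

print(type(age_values))
print(stats)
print(missing)
```

The dedicated packages in the list add nicer reports and plots on top of this kind of output.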

Pandas and Jupyter

General tricks: link
nteract | Open Jupyter Notebooks with a double-click.
modin | Parallelization library for faster pandas DataFrame.
xarray | Extends pandas to n-dimensional arrays.
blackcellmagic | Code formatting for Jupyter notebooks.
pivottablejs | Drag-and-drop pivot tables and charts for Jupyter notebooks.
qgrid | Pandas DataFrame sorting.
nbdime | Diff two notebook files. Alternative GitHub app: ReviewNB.
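For comparison, sorting and pivoting can also be done non-interactively in plain pandas. The sketch below is not the qgrid or pivottablejs API; it only shows the static pandas equivalents (`sort_values` and `pivot_table`) of what those notebook widgets let you do with the mouse. The sales data is hypothetical.

```python
import pandas as pd

# Hypothetical sales records; names and numbers are invented for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "product": ["a", "b", "a", "b"],
        "units": [10, 4, 7, 12],
    }
)

# Sort rows by a column, largest value first.
by_units = sales.sort_values("units", ascending=False)

# Static pivot table: one row per region, one column per product.
pivot = sales.pivot_table(
    index="region", columns="product", values="units", aggfunc="sum"
)

print(by_units)
print(pivot)
```

The interactive tools are handy for exploration; the pandas calls are easier to keep in a script or rerun later.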


textract | Extract text from any document.

Big Data

spark | DataFrame for big data.
spark cheatsheet
dask | Pandas DataFrame for big data…

In some places I have also linked to YouTube talks and to other GitHub repos that contain short examples.

Want to contribute? Let me know.


Short examples are great in this space. Appreciate the list.


Florian Rohrer
such software.. much wow!