Awesome Data Science with Python


I have created a list of useful Python packages for data science.

r0f1 / datascience

Curated list of Python packages and tutorials for data science.

Data Science Awesome


pandas | Data structures built on top of numpy.
scikit-learn | Core ML library.
matplotlib | Plotting library.
seaborn | Python data visualization library based on matplotlib.
pandas_summary | Basic statistics using DataFrameSummary(df).summary().
pandas_profiling | Descriptive statistics using ProfileReport.
sklearn_pandas | Helpful DataFrameMapper class.
janitor | Clean messy column names.
missingno | Missing data visualization.
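
Two of the ideas in the list above, cleaning messy column names and counting missing values, are simple enough to sketch with the standard library alone. The snippet below is only an illustration of the concept, not how janitor or missingno are implemented; the helper names `clean_column_name` and `count_missing` and the sample data are made up for this example.

```python
import re


def clean_column_name(name):
    """Lowercase a column name and collapse non-alphanumeric runs into '_'."""
    name = name.strip().lower()
    name = re.sub(r"[^0-9a-z]+", "_", name)
    return name.strip("_")


def count_missing(rows, columns):
    """Count None values per column in a list of dict rows."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


# Hypothetical messy headers, as they might come from a spreadsheet export.
raw_columns = ["First Name", " Age (years)", "E-Mail"]
columns = [clean_column_name(c) for c in raw_columns]

# Hypothetical records keyed by the cleaned names; None marks a missing value.
rows = [
    {"first_name": "Ada", "age_years": 36, "e_mail": None},
    {"first_name": "Alan", "age_years": None, "e_mail": None},
]
missing = count_missing(rows, columns)

print(columns)
print(missing)
```

The packages listed above do the same kind of work directly on pandas DataFrames and add reporting or plots on top.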

Pandas and Jupyter

General tricks: link
nteract | Open Jupyter Notebooks with a double-click.
modin | Parallelization library for faster pandas DataFrames.
xarray | Extends pandas to n-dimensional arrays.
blackcellmagic | Code formatting for jupyter notebooks.
pivottablejs | Drag n drop Pivot Tables and Charts for jupyter notebooks.
qgrid | Pandas DataFrame sorting.
nbdime | Diff two notebook files. Alternative GitHub app: ReviewNB.


textract | Extract text from any document.

Big Data

spark | DataFrame for big data.
spark cheatsheet
dask | Pandas DataFrame for big data…

In some places I have also linked to YouTube talks, other GitHub repos that contain short examples, etc.

Want to contribute? Let me know.


Short examples are great in this space. Appreciate the list.
